Large-scale repositories of genomic data give researchers the opportunity to answer biological questions at unprecedented resolution. Uncovering the structure underlying these datasets is a fundamental task: the structure can correspond to biological signals of interest or to confounders, such as ancestry and batch effects, that must be accounted for to prevent spurious findings. Discovering structure is already a challenging problem, and the growing size of genomic datasets creates computational bottlenecks that further complicate the analysis. Here, we propose three scalable approaches for detecting structure in genomic data. We present ProPCA, a probabilistic principal component analysis method for large-scale genomic data. We...
MOTIVATION: Population stratification is one of the major sources of confoundi...
High-dimensional genomic data can provide deep insight into biological processes. However, conventio...
Genotype data, consisting of large numbers of markers, is used in demographic and association studies t...
Inferring the structure of human populations from genetic variation data is a key task in population...
The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in un...
Current methods for inferring population structure from genetic data do not provide formal significa...
The ever more complex and larger datasets that statisticians can routinely access have prompted the ...
Existing methods to ascertain small sets of markers for the identification of human population struc...
Studying genomic patterns of human population structure provides important insights into human evolu...
During the past decades, population structure analysis has been playing an important role for strati...
With the advancements in DNA sequencing technology and the decreasing cost of sequencing, there has ...