The use of machine learning techniques, in particular unsupervised clustering and dimensionality reduction algorithms, is quickly becoming a standard workflow for identifying and visualizing biological populations within high-dimensional data. These methods allow researchers to approach data analysis without the bias and subjectivity that have traditionally been standard in the field. Each algorithm has context-dependent strengths and weaknesses, but an inability to scale computation to large datasets is a common theme across them. Most algorithms are designed and distributed to run on individual computers, where memory and CPU are quickly exhausted by large datasets. Even when high-performance compute resources are available, algorithms...
Background: Recent biological discoveries have shown that clustering large datasets is essential for...
Kary Ocaña,1 Daniel de Oliveira2 1National Laboratory of Scientific Computing, Petrópo...
The computational demands of multivariate clustering grow rapidly, and therefore processing large da...
Abstract Background In recent years, the demand for computational power in computational biology has...
Background: The amount of data generated in large clinical and phenotyping studies that use single-c...
Currently, clustering applications use classical methods to partition a set of data (or objects) in ...
t-Distributed Stochastic Neighbor Embedding (t-SNE or viSNE) is a dimensionality reduction algorithm...
As DNA sequencing outpaces improvements in computer speed, there is a critical need to accelerate ta...
Motivation: Single cell data measures multiple cellular markers at the single-cell level for thousan...
Recent technological developments in high-dimensional flow cytometry and mass cytometry (CyTOF) have...
Clustering in bioinformatics is a fundamental process involving computational issues that are far fr...
Our work aims to explore interoperability of large-scale cloud data processing software in HPC envir...
Flow/Mass cytometry data analysis is essential in the study of diverse phenotypes and functions at t...
© 2020, The Author(s), under exclusive licence to Springer Nature America, Inc. Massively parallel s...
Modern biology is largely shaped by the development of next generation sequencing (NGS) technology. ...