Background: Data transformations are commonly used in bioinformatics data processing in the context of data projection and clustering. The most commonly used metric, the Euclidean distance, is not scale invariant. It is therefore occasionally inappropriate for complex variables, e.g., those with multimodal distributions, and may negatively affect the results of cluster analysis. Specifically, the Euclidean distance is defined as the square root of the sum of squared differences between data points, and the squaring has the consequence that the value 1 implicitly defines a limit between distances within clusters and distances between clusters (inter-cluster distances). Methods: The Euclidean distances within a standard normal distribution (N(0,1)) follow a N(0,√2) distribution. The E...
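The two statements in this abstract can be checked numerically. The following is a minimal sketch, not the authors' method: it assumes that "N(0,√2)" gives the standard deviation as the second parameter and that the statement refers to signed differences between independent draws. The seed, the sample size, and all variable names are illustrative choices.

```python
import math
import random

rng = random.Random(0)   # arbitrary fixed seed, for reproducibility only
n_pairs = 50_000         # illustrative sample size, not taken from the paper

# Signed differences between two independent N(0,1) draws.
diffs = [rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0) for _ in range(n_pairs)]

# Empirical mean and standard deviation of the differences.
mean_diffs = math.fsum(diffs) / n_pairs
var_diffs = math.fsum((d - mean_diffs) ** 2 for d in diffs) / n_pairs
sd_diffs = math.sqrt(var_diffs)

print(f"empirical SD of differences: {sd_diffs:.3f}")
print(f"sqrt(2):                     {math.sqrt(2):.3f}")

# Effect of squaring: differences below 1 shrink, differences above 1 grow,
# so 1 acts as an implicit boundary between "small" and "large" distances.
squared = {d: d ** 2 for d in (0.5, 1.0, 2.0)}
for d, d2 in squared.items():
    print(f"difference {d} -> squared {d2}")
```

With enough pairs, the empirical standard deviation should lie close to √2, and the last loop shows why rescaling a variable changes which differences are amplified and which are damped by the squaring.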
Clustering is a long-standing problem in computer science and is applied in virtually any scientific...
The traditional approach to clustering is to fit a model (partition or prototypes) for t...
Background: Clustering is crucial for gene expression data analysis. As an unsupervised explorator...
MOTIVATION: Many popular clustering methods are not scale-invariant because they are based on Euclid...
There are many distance-based methods for classification and clustering, and for data with a high n...
Clustering is basically one of the major sources of primary data mining tools. It makes researchers ...
The grouping of clusters is an important task to perform for the initial stage of clinical implicati...
Clustering of patients makes it possible to find groups of subjects with similar characteristics. This categori...
Normalization before clustering is often needed for proximity indices, such as Euclidean distance, w...
The purpose of this thesis is to propose new methodology for data normalization and cluster predicti...
The K-means clustering algorithm is an old algorithm that has been intensely researched owing to its...
This paper reports the results of a study of the partitioning around medoids (PAM) cluste...
Background: Clustering methods are becoming widely utilized in biomedical research where the...
Background: Clustering is crucial for gene expression data analysis. As an unsupervised exploratory ...
Abstract. Clustering algorithms are employed in many bioinformatics tasks, including classification ...