Abstract. Cluster discovery is an essential part of many data mining applications. While the cluster discovery process is mainly unsupervised in nature, it can often be aided by a small amount of labeled data. A probabilistic model of the clustering structure is adopted, and a novel unified energy equation for clustering that incorporates both labeled and unlabeled data is introduced. This formulation is inspired by a force-field model that integrates the labeling constraint on labeled data and similarity information on unlabeled data for joint estimation. Experimental results show that good clusters can be identified using a small amount of labeled data.
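The abstract does not state the energy equation itself, so the following is only a minimal sketch of the general idea, not the paper's formulation. It assumes a quadratic energy over soft cluster memberships Q: a label-fit term on the labeled points plus a similarity-weighted smoothness term over all points. The function names (similarity, energy, minimize_energy), the Gaussian similarity, and the parameters sigma and lam are all hypothetical choices made for illustration.

```python
import numpy as np

def similarity(X, sigma=1.0):
    """Gaussian similarity between all pairs of points (an assumed choice)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-similarity
    return W

def energy(Q, W, labeled_idx, Y, lam=10.0):
    """Hypothetical energy: label-fit penalty plus similarity-weighted smoothness."""
    # Labeling constraint: memberships of labeled points should match their labels.
    fit = lam * np.sum((Q[labeled_idx] - Y) ** 2)
    # Similarity term: points with large W[i, j] should have similar memberships.
    diff = Q[:, None, :] - Q[None, :, :]
    smooth = 0.5 * np.sum(W * np.sum(diff ** 2, axis=2))
    return fit + smooth

def minimize_energy(W, labeled_idx, Y, lam=10.0, iters=200):
    """Jacobi-style fixed-point iteration on the stationarity condition of energy()."""
    n, k = W.shape[0], Y.shape[1]
    mask = np.zeros(n)
    mask[labeled_idx] = 1.0          # 1 for labeled points, 0 otherwise
    target = np.zeros((n, k))
    target[labeled_idx] = Y          # one-hot labels on labeled rows
    deg = W.sum(axis=1)
    Q = np.full((n, k), 1.0 / k)     # start from uniform memberships
    for _ in range(iters):
        # Each row becomes a weighted average of its neighbours and its label (if any).
        Q = (W @ Q + lam * mask[:, None] * target) / (deg + lam * mask)[:, None]
    return Q

# Usage: two small groups of points, with one labeled point in each group.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labeled_idx = np.array([0, 3])
Y = np.eye(2)[[0, 1]]                # one-hot labels for points 0 and 3
W = similarity(X)
Q = minimize_energy(W, labeled_idx, Y)
clusters = Q.argmax(axis=1)          # hard assignment from soft memberships
```

In this sketch the two labeled points act as anchors, and the similarity term spreads their labels to nearby unlabeled points. The trade-off between the two terms is controlled by lam, which is an assumed parameter and not one taken from the paper.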
Traditional data mining methods for clustering only use unlabeled data objects as input. The aim of ...
This paper discusses a probabilistic model-based approach to clustering sequences, using hidden Mar...
In many machine learning domains (e.g. text processing, bioinformatics), there is a large supply of ...
We consider the problem of clustering in its most basic form where only a local metric on the data s...
Unsupervised clustering can be significantly improved using supervision in the form of pairwise cons...
A new approach to clustering multivariate data, based on a multilevel linear mixed model, is propose...
One of the common problems with clustering is that the generated clusters often do not match user ex...
We derive a new clustering algorithm based on information theory and statistical mechanics, which is...
We introduce a new, non-parametric and principled, distance based clustering method. This method com...
The thesis tackles the problem of uncovering hidden structures in high-dimensional data in the prese...
Finding clusters in data is a challenging problem. Given a dataset, we usually do not know the numbe...
We present a discriminative clustering approach in which the feature representation can be learned f...
The paper deals with a model-theoretic approach to clustering. The approach can be used to generate ...
Supervised clustering is an emerging area of machine learning, where the goal is to find class-unifo...
This paper introduces an approach for clustering/classification which is based on the use of local, ...