Abstract: Data is growing at an unprecedented rate in commercial and scientific areas. Under these circumstances, clustering algorithms for large data that are scalable and consume little memory become increasingly important. In this paper, we propose a new clustering approach called stochastic gradient based fuzzy clustering (SGFC), which performs the optimization by stochastic approximation in order to handle such large data. We derive an adaptive learning rate that can be updated incrementally and maintained automatically within the gradient descent approach employed in SGFC. Moreover, SGFC is extended to a mini-batch SGFC to reduce the stochastic noise. Additionally, a multi-pass SGFC is also proposed to improve the clustering perfor...
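To make the idea of stochastic-gradient fuzzy clustering concrete, the following is a minimal sketch and not the SGFC algorithm of the paper. It assumes the standard fuzzy c-means objective with fuzzifier `m` and updates the cluster centers one sample at a time. The per-cluster step size used here (the current weight divided by the accumulated weight) is an illustrative assumption and is not the adaptive learning rate derived by the authors. The names `fuzzy_memberships` and `sgd_fuzzy_cmeans` are hypothetical.

```python
import numpy as np


def fuzzy_memberships(x, centers, m=2.0, eps=1e-12):
    """Standard fuzzy c-means memberships of one sample x to all centers."""
    d2 = np.sum((centers - x) ** 2, axis=1) + eps  # squared distances, kept > 0
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum()  # non-negative, sums to 1


def sgd_fuzzy_cmeans(X, n_clusters, m=2.0, seed=0):
    """Single pass over X, one stochastic update of the centers per sample.

    Assumption (not from the paper): the step size of cluster k is
    u_k^m / (sum of u_k^m seen so far), a simple incremental choice.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Initialize the centers with randomly chosen samples.
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    weight = np.full(n_clusters, 1e-12)  # accumulated u^m per cluster
    for i in rng.permutation(len(X)):
        u_m = fuzzy_memberships(X[i], centers, m) ** m
        weight += u_m
        rate = u_m / weight  # always in [0, 1]
        # Move each center toward the sample, scaled by its own rate.
        centers += rate[:, None] * (X[i] - centers)
    return centers
```

Only the centers and one accumulated weight per cluster are kept in memory, so the cost of each update does not depend on the number of samples already seen. A mini-batch variant would average the weighted differences over several samples before applying an update.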
Unsupervised learning based clustering methods are gaining importance in the field of data analytics...
Clustering algorithms that minimize an objective function share a clear drawback of having to set th...
Clustering algorithms summarize datasets into a small number of data points such as centroids or medoid...
Huge datasets consist of millions of observations and hundreds or thousands of features, whi...
A massive volume of digital data holding valuable information, called Big Data, is produced each gen...
Clustering large data sets has become very important as the amount of available unlabeled data incre...
Cluster analysis has been widely applied in many areas such as data mining, geographical data proces...
Online clustering is of significant interest for real-time data analysis. Generic offline clustering...
Clustering algorithms are a primary tool in data analysis, facilitating the discovery of groups and ...
Alex N, Hammer B, Klawonn F. Single pass clustering for large data sets. In: Proceedings of 6th Int...
Finding an efficient data reduction method for large-scale problems is an imperative task. In this p...
This paper proposes two new incremental fuzzy c medoids clustering algorithms ...
Abstract: Data mining is the process used to analyze a large quantity of heterogeneous data to extr...
Virtually every sector of business and industry that uses computing, including financial analysis, se...
The application of fuzzy cluster analysis to larger data sets can cause runtime and memory overflow ...