To detect and remove noise data in datasets, this article proposes a new approach to noise data detection based on fast search and find of density peaks (FSFDP) and information entropy (IE). In the proposed method, FSFDP is first used to cluster the original dataset and remove the outliers. A rectangular pane is then constructed and meshed for each class according to the clustering results. After all samples are projected onto the mesh, the IE of each class is calculated, and the samples with lower local density in the class are removed. If the IE value changes markedly after a sample is removed from its class, that sample is marked as noise. Finally, the experimental results show that the presented approach is eff...
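The entropy test described in the abstract can be sketched roughly as follows for a single, already clustered class. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`local_density`, `mesh_entropy`, `flag_noise`), the cutoff distance `d_c`, the mesh resolution `bins`, the entropy-change `threshold`, and the number of low-density samples examined (`n_candidates`) are all hypothetical choices, and the FSFDP clustering step itself is assumed to have been done beforehand.

```python
import numpy as np


def local_density(points, d_c):
    """Cutoff-kernel local density: neighbours closer than d_c (self excluded)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return (dist < d_c).sum(axis=1) - 1


def mesh_entropy(points, edges):
    """Shannon entropy (in bits) of how the samples occupy a fixed mesh."""
    counts, _ = np.histogramdd(points, bins=edges)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


def flag_noise(points, d_c, bins=4, threshold=0.05, n_candidates=3):
    """Return indices of samples whose removal changes the class entropy markedly.

    Assumption: `points` holds the samples of ONE class, shape (n, d), n > n_candidates.
    """
    points = np.asarray(points, dtype=float)

    # Rectangular pane = bounding box of the class, split into a regular mesh.
    lo, hi = points.min(axis=0), points.max(axis=0)
    hi = np.where(hi > lo, hi, lo + 1.0)  # guard against zero-width dimensions
    edges = [np.linspace(a, b, bins + 1) for a, b in zip(lo, hi)]

    # Examine the lowest-density samples first.
    rho = local_density(points, d_c)
    candidates = np.argsort(rho, kind="stable")[:n_candidates]

    keep = np.ones(len(points), dtype=bool)
    noise = []
    for i in candidates:
        before = mesh_entropy(points[keep], edges)
        keep[i] = False
        after = mesh_entropy(points[keep], edges)
        if abs(before - after) > threshold:
            noise.append(int(i))   # entropy changed markedly: mark as noise
        else:
            keep[i] = True         # entropy barely moved: put the sample back
    return sorted(noise)
```

The mesh is kept fixed while samples are removed, so that a change in entropy reflects a change in cell occupancy rather than a change in the mesh itself. On a toy class made of a tight 4 x 4 block of points plus one distant point, only the distant point is expected to be flagged; a suitable `threshold` for real data would have to be tuned.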
Specific to data mining or data analysis in general, noise raises the difficulty for many convention...
Clustering analysis is an unsupervised learning technique for partitioning objects into severa...
The data in industrial informatics may be high-dimensional and mislabeled. Irrelevant or noisy featu...
To improve the quality of real datasets by removing noise data, a new method for noise data detection ...
Processing noise data is one of the most important fields in mining data streams. To address this pr...
The rapid expansion of the Internet has made the Web a popular place for disseminating and collecting info...
Removing objects that are noise is an important goal of data cleaning as noise hinders most types of...
In this paper, an automatic adaptive method for identification and separation ...
Real data may have a considerable amount of noise produced by error in data collection, transmission...
In recent years, clustering methods have attracted more attention in analysing and monitoring data s...
Clustering of data with high dimension and variable densities poses a remarkable challenge to the tr...
Noise filtering is most frequently used in data preprocessing to improve the accuracy of induced cla...