Missing data handling is an important preparation step for most data discrimination or mining tasks. Inappropriate treatment of missing data may cause large errors or false results. In this paper, we study the effect of a missing data recovery method, namely the pseudo- nearest neighbor substitution approach, on Gaussian distributed data sets that represent typical cases in data discrimination and data mining applications. The error rate of the proposed recovery method is evaluated by comparing the clustering results of the recovered data sets to the clustering results obtained on the originally complete data sets. The results are also compared with that obtained by applying two other missing data handling methods, the constant default valu...
Missing data continues to be an issue not only the field of statistics but in any field, that deals ...
Learning from data that contain missing values represents a common phenomenon in many domains. Relat...
Random forest (RF) missing data algorithms are an attractive approach for imputing missing data. The...
In fact, raw data in the real world is dirty. Each large data repository contains various types of a...
Data mining requires a pre-processing task in which the data are prepared,cleaned,integrated,transfo...
Missing data is a widely recognized problem affecting large database in data mining. The substitutio...
Abstract. This work proposes and evaluates a Nearest-Neighbor Method to substitute missing values in...
Many real-world applications encountered a common issue in data analysis is the presence of missing ...
Missing data is an issue in many real-world datasets yet robust methods for dealing with missing dat...
The real-world data analysis and processing using data mining techniques often are facing observatio...
K-nearest neighbors (KNN) has been extensively used as imputation algorithm to substitute missing da...
Abstract Machine learning has been the corner stone in analysing and extracting information from dat...
Missing value imputation is an actual yet challenging issue confronted by machine learning and data ...
Missing data is a common concern in health datasets, and its impact on good decision-making processe...
Missing data is an important issue in almost all fields of quantitative research. A nonparametric pr...
Missing data continues to be an issue not only the field of statistics but in any field, that deals ...
Learning from data that contain missing values represents a common phenomenon in many domains. Relat...
Random forest (RF) missing data algorithms are an attractive approach for imputing missing data. The...
In fact, raw data in the real world is dirty. Each large data repository contains various types of a...
Data mining requires a pre-processing task in which the data are prepared,cleaned,integrated,transfo...
Missing data is a widely recognized problem affecting large database in data mining. The substitutio...
Abstract. This work proposes and evaluates a Nearest-Neighbor Method to substitute missing values in...
Many real-world applications encountered a common issue in data analysis is the presence of missing ...
Missing data is an issue in many real-world datasets yet robust methods for dealing with missing dat...
The real-world data analysis and processing using data mining techniques often are facing observatio...
K-nearest neighbors (KNN) has been extensively used as imputation algorithm to substitute missing da...
Abstract Machine learning has been the corner stone in analysing and extracting information from dat...
Missing value imputation is an actual yet challenging issue confronted by machine learning and data ...
Missing data is a common concern in health datasets, and its impact on good decision-making processe...
Missing data is an important issue in almost all fields of quantitative research. A nonparametric pr...
Missing data continues to be an issue not only the field of statistics but in any field, that deals ...
Learning from data that contain missing values represents a common phenomenon in many domains. Relat...
Random forest (RF) missing data algorithms are an attractive approach for imputing missing data. The...