This paper discusses the so-called missing data problem, i.e. the problem of imputing missing values in information systems. A new algorithm, called the ARSI algorithm, is proposed to address the imputation problem of missing values on categorical databases using the framework of rough set theory. This algorithm can be seen as a refinement of the ROUSTIDA algorithm and combines the approach of a generalized non-symmetric similarity relation with a generalized discernibility matrix to predict the missing values on incomplete information systems. Computational experiments show that the proposed algorithm is as efficient and competitive as other imputation algorithms
The evolution of big data analytics through machine learning and artificial intelligence techniq...
International audienceThe authors analyze the efficiency of six missing data techniques for categori...
Many datasets include missing values in their attributes. Data mining techniques are not applicable ...
In fact, raw data in the real world is dirty. Each large data repository contains various types of a...
Missing values exist in many generated datasets in science. Therefore, utilizing missing data imputa...
Many existing, industrial, and research data sets contain missing values (MVs). There are various re...
The rough set theory, based on the original definition of the indiscernibility relation, is not usef...
Missing value imputation is an actual yet challenging issue confronted by machine learning and data ...
Rough set theory is an effective approach to imprecision, vagueness, and uncertainty. This theory ov...
Existence of missing values creates a big problem in real world data. Unless those values are missi...
The subject of missing values in databases and how to handle them has received very little attention...
The original rough set theory deals with precise and complete data, while real applications frequent...
Statistical Imputation Techniques have been proposed mainly with the aim of predicting the missing v...
The key point of the tolerance relation or similarity relation presented in the literature is to ass...
The paper presents rough fuzzy subspace clustering algorithm and experimental results of clustering....
The evolution of big data analytics through machine learning and artificial intelligence techniq...
International audienceThe authors analyze the efficiency of six missing data techniques for categori...
Many datasets include missing values in their attributes. Data mining techniques are not applicable ...
In fact, raw data in the real world is dirty. Each large data repository contains various types of a...
Missing values exist in many generated datasets in science. Therefore, utilizing missing data imputa...
Many existing, industrial, and research data sets contain missing values (MVs). There are various re...
The rough set theory, based on the original definition of the indiscernibility relation, is not usef...
Missing value imputation is an actual yet challenging issue confronted by machine learning and data ...
Rough set theory is an effective approach to imprecision, vagueness, and uncertainty. This theory ov...
Existence of missing values creates a big problem in real world data. Unless those values are missi...
The subject of missing values in databases and how to handle them has received very little attention...
The original rough set theory deals with precise and complete data, while real applications frequent...
Statistical Imputation Techniques have been proposed mainly with the aim of predicting the missing v...
The key point of the tolerance relation or similarity relation presented in the literature is to ass...
The paper presents rough fuzzy subspace clustering algorithm and experimental results of clustering....
The evolution of big data analytics through machine learning and artificial intelligence techniq...
International audienceThe authors analyze the efficiency of six missing data techniques for categori...
Many datasets include missing values in their attributes. Data mining techniques are not applicable ...