A high level of data quality has always been a concern for many applications based on machine learning, including clinical decision support systems, weather forecasting, traffic predictions, and many others. A very limited amount of work is devoted to exploiting the missing values for effective imputation and better prediction. This paper introduces a unique approach to predicting and imputing missing data fields in the multivariate dataset such as numerical, categorical, and unstructured. The proposed imputation method is a multi-model scheme based on the joint approach of natural language processing (NLP) encoders, machine learning-driven feature extractors, and a sequential regression imputation technique to predict missing values. The p...
[[abstract]]While there is an ample amount of medical information available for data mining, many of...
In this paper we propose a new method to deal with missingness in categorical data. The new proposal...
Missing values are common in real-world datasets and pose a significant challenge to the performance...
The incomplete dataset is an unescapable problem in data preprocessing that primarily machine learni...
Many real-world datasets suffer from missing data, which can introduce uncertainty into ensuing anal...
The analysis of digital health data with machine learning models can be used in clinical application...
Clinical decision support using data mining techniques offers more intelligent way to reduce the dec...
International audienceBACKGROUND: As databases grow larger, it becomes harder to fully control their...
Pristine and trustworthy data are required for efficient computer modelling for medical decision-mak...
Abstract Background Incomplete categorical variables with more than two categories are common in pub...
One important characteristic of good data is completeness. Missing data is a major problem in the cl...
: The missing data mechanism is a relevant problem in Machine Learning (ML) and biomedical informati...
Abstract Missing data is a common problem in longitudinal datasets which include mult...
This research paper explores a variety of strategies for performing classification with missing feat...
Prediction and learning in the presence of missing data are pervasive problems in data analysis by m...
[[abstract]]While there is an ample amount of medical information available for data mining, many of...
In this paper we propose a new method to deal with missingness in categorical data. The new proposal...
Missing values are common in real-world datasets and pose a significant challenge to the performance...
The incomplete dataset is an unescapable problem in data preprocessing that primarily machine learni...
Many real-world datasets suffer from missing data, which can introduce uncertainty into ensuing anal...
The analysis of digital health data with machine learning models can be used in clinical application...
Clinical decision support using data mining techniques offers more intelligent way to reduce the dec...
International audienceBACKGROUND: As databases grow larger, it becomes harder to fully control their...
Pristine and trustworthy data are required for efficient computer modelling for medical decision-mak...
Abstract Background Incomplete categorical variables with more than two categories are common in pub...
One important characteristic of good data is completeness. Missing data is a major problem in the cl...
: The missing data mechanism is a relevant problem in Machine Learning (ML) and biomedical informati...
Abstract Missing data is a common problem in longitudinal datasets which include mult...
This research paper explores a variety of strategies for performing classification with missing feat...
Prediction and learning in the presence of missing data are pervasive problems in data analysis by m...
[[abstract]]While there is an ample amount of medical information available for data mining, many of...
In this paper we propose a new method to deal with missingness in categorical data. The new proposal...
Missing values are common in real-world datasets and pose a significant challenge to the performance...