Missing data is an intrinsic problem across science and engineering. In the emerging era of big data and machine learning (ML), missing data may substantially damage the reliability and accuracy of ML predictions and statistical inference. Researchers are often uncertain about how much incomplete data degrades the final ML and statistical analyses, and about how to handle large, complex incomplete data. Existing data-curing (imputation) methods are difficult for general researchers to use and are often unsuitable for large, complex data. To resolve these challenges, this project’s goal is to develop a new community-level data-curing service running on the NSF cyberinfrastructure and local high-performance computing (HPC) faciliti...
2018-01-18 This is the era of big data, where both challenges and opportunities lie ahead for the mac...
Large-scale computing and machine learning (ML) have been spurred in recent years by the availability...
Philosophiae Doctor - PhD (Statistics and Population Studies). The aim of this study is to look at the...
UP-FHDI is the first ultra data-oriented imputation software in the world, capable of curing data up...
For reliable machine learning and statistical inference with large/big data, curing incomplete data ...
This electronic version was submitted by the student author. The certified thesis is available in th...
There emerges a strong need for a large/big data-oriented imputation method for accelerating data-dr...
Prediction and learning in the presence of missing data are pervasive problems in data analysis by m...
The purpose of this article is to propose a methodology involving various methods that can be used t...
Machine learning (ML) research often operates within silos, separate from the people who created the...
This paper discusses a novel algorithm for solving a missing data problem in the machine learning pr...
Massive high-dimensional data sets are ubiquitous in all scientific disciplines. Extracting meaningf...
Solving the missing-value (MV) problem with small estimation errors in large-scale data environment...
Some of the most challenging issues in big data are size, scalability and reliability. Big data, su...
Solving the missing-value (MV) problem with small estimation errors in large-scale data environments...