In the new era of big data and powerful computing, there is a strong need for an imputation method designed for large/big data to accelerate data-driven scientific discovery. Imputation is a statistical procedure for filling in missing data, and a wide spectrum of methods exists. However, these methods are often not applicable to large/big incomplete data, and they rely on statistical assumptions that are hard to meet. With support from NSF (OAC-1931380), we developed ultra data-oriented parallel fractional hot-deck imputation (UP-FHDI [1,2]), a general-purpose, assumption-free software package that handles item nonresponse in big incomplete data by leveraging the theory of FHDI and parallel computing. Here, “ultra” data means a data set with high dimension...
We present a single imputation method for missing values that borrows the idea of data depth, a measur...
We present the Parallel, Forward–Backward with Pruning (PFBP) algorithm for fe...
Missing values are very common in real-world datasets for a variety of reasons. Deleting data points...
For reliable machine learning and statistical inference with large/big data, curing incomplete data ...
Solving the missing-value (MV) problem with small estimation errors in big data environments is a no...
Solving the missing-value (MV) problem with small estimation errors in large-scale data environment...
Solving the missing-value (MV) problem with small estimation errors in large-scale data environments...
Missing data is an intrinsic problem of broad science and engineering. In the emerging era of big da...
Fractional hot deck imputation, considered in Fuller and Kim (2005), is extended to multivariate mis...
The presented methodology for single imputation of missing values borrows the ...
The growth in the use of computationally intensive statistical procedures, especially with big data,...
UP-FHDI is the first ultra data-oriented imputation software in the world, capable of curing data up...
Philosophiae Doctor - PhD (Statistics and Population Studies). The aim of this study is to look at the...
Imputation of missing data is important in many areas, such as reducing non-response bias in surveys...