Extraordinary amounts of data are being produced in many branches of science. Proven statistical methods are no longer applicable to extraordinarily large data sets because of computational limitations. A critical step in Big Data analysis is data reduction. In this presentation, I will review some existing approaches to data reduction and introduce a new strategy called information-based optimal subdata selection (IBOSS). Under both linear and nonlinear model setups, theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to other approaches in terms of parameter estimation and predictive performance. The tradeoff between accuracy and computational cost is also investigated. When models are mis-specified, the per...
With electronic data increasing dramatically in almost all areas of research, a plethora of new tech...
In this paper we describe and evaluate several popular techniques for data reduction. Historically, the...
This article proposes a new information-based subdata selection (IBOSS) algorithm, Squared...
The demand for computational resources for the modeling process increases as the scale of the dataset...
Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science,...
The availability of big data sets in research, industry and society in general has opened up many po...
A massive bulk of data is being created due to digitalisation in various industries, including medic...
This thesis is focused on the development of computationally efficient procedures for regression mod...
In many research areas, such as health science, environmental sciences, agricultural sciences, etc.,...
Analytics of Big data research has been entering the latest processes of "fast-data", in which every...