Rapidly evolving Big Data technology has attracted increasing attention and has been widely used in many industries. It not only benefits our lives dramatically but also poses new challenges. In many situations, dealing with such large and complex data can be extremely difficult. However, do we really always need big data? This thesis investigated whether a large dataset is needed to build a model with acceptable accuracy and how the number of observations affects the performance of statistical predictive methods, and used learning curves to describe this relationship. Some popular statistical learning methods were considered and applied to three large datasets. An efficient parallel co...
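The idea of a learning curve mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the model is a simple least-squares line, and the names `fit_line`, `make_data`, and `learning_curve` are hypothetical. It does not reproduce the thesis's methods, datasets, or parallel implementation; it only shows how test error can be tracked as the number of training observations grows.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def mse(a, b, xs, ys):
    """Mean squared error of the line a + b*x on the points (xs, ys)."""
    return sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, rng):
    """Synthetic data (an assumption): y = 2 + 0.5*x + Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


def learning_curve(sizes, n_test=2000, repeats=200, seed=0):
    """Average test MSE for each training-set size in `sizes`."""
    rng = random.Random(seed)
    x_test, y_test = make_data(n_test, rng)  # one fixed held-out set
    curve = []
    for n in sizes:
        errs = []
        for _ in range(repeats):
            # Draw a fresh training sample of size n and score it
            # on the same held-out set.
            x_tr, y_tr = make_data(n, rng)
            a, b = fit_line(x_tr, y_tr)
            errs.append(mse(a, b, x_test, y_test))
        curve.append((n, sum(errs) / repeats))
    return curve


curve = learning_curve([5, 20, 80, 320])
for n, err in curve:
    print(f"n={n:4d}  mean test MSE={err:.3f}")
```

In a sketch like this, the averaged test error typically drops quickly at small sample sizes and then flattens near the noise level, which is the kind of pattern a learning curve is meant to expose when asking how much data a model actually needs.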
The dissertation focuses on two separate problems. Each is informed by real-world applications. The ...
In this thesis we explore a wide range of statistical learning algorithms and evaluate their abiliti...
Many traditional and newly-developed causal inference approaches require imposing strong data assump...
This paper presents a learning machine overview for Big Data Predictive Analytic. Produced data, in ...
One of the fundamental machine learning tasks is that of predictive classification. Given that organ...
The realm of big data is a very wide and varied one. We discuss old, new, small and big data, with s...
Data mining techniques allow the extraction of valuable information from heterogeneous and possibly ...
The Big Data Era creates a lot of exciting opportunities for new developments in economics and econo...
Big Data refers to data sets of much larger size, higher frequency, and often more personalized info...
Editorial Statistics and computer science have grown as separate disciplines with little interactio...
In an era with remarkable advancements in computer engineering, computational algorithms, and mathem...
As the amount of information available for data mining grows larger, the amount of time needed to tr...
Data analysis is changing fast. Driven by a vast range of application domains and affordable tools, ...
The increasing automation in data collection, either in structured or unstructured formats...
Data analysis in this chapter mainly means descriptive and exploratory methods...