The amount of data available for data mining and knowledge discovery continues to grow rapidly in the era of Big Data. Genetic Programming (GP) algorithms, which are efficient machine learning techniques, face a new challenge: dealing with the mass of data provided. Active Sampling, already used for Active Learning, might be a good way to improve the training of Evolutionary Algorithms (EA) on very large data sets. This paper investigates the adaptation of Topology Based Selection (TBS) to massive learning datasets by means of Hierarchical Sampling. We propose combining Random Subset Selection (RSS) with TBS to create the RSS-TBS method. Two variants are implemented, applied to solv...
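The two-level idea described in the abstract can be illustrated with a minimal sketch. It assumes only what the abstract states: a random subset is drawn first, and a second selector then works on that smaller subset. The function names, the subset sizes, and the `placeholder_refine` selector are hypothetical; in particular, the second level here is a plain random draw standing in for TBS, whose details the abstract does not give.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")

# Hypothetical type for a second-level selector: (cases, size, rng) -> subset.
Refine = Callable[[Sequence[T], int, random.Random], List[T]]


def random_subset(data: Sequence[T], size: int, rng: random.Random) -> List[T]:
    """Level 1: uniform random subset without replacement (RSS-style)."""
    return rng.sample(list(data), min(size, len(data)))


def hierarchical_sample(
    data: Sequence[T],
    level1_size: int,
    level2_size: int,
    refine: Refine,
    rng: random.Random,
) -> List[T]:
    """Draw a coarse random subset, then let `refine` pick from it."""
    coarse = random_subset(data, level1_size, rng)
    return refine(coarse, min(level2_size, len(coarse)), rng)


def placeholder_refine(cases: Sequence[T], size: int, rng: random.Random) -> List[T]:
    """Assumed stand-in for the second level. This is NOT TBS: it only
    marks where a topology-based selector would be plugged in."""
    return rng.sample(list(cases), size)


rng = random.Random(0)            # fixed seed so runs are repeatable
data = list(range(10_000))        # toy "training set" of case indices
sample = hierarchical_sample(data, 1_000, 100, placeholder_refine, rng)
```

Keeping the second level as a function argument means the expensive selector only ever sees `level1_size` cases, which is the point of sampling hierarchically on a very large data set.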
[Motivation] Detecting positive selection in genomic regions is a recurrent topic in natural populat...
Over recent decades, database sizes have grown considerably. Larger sizes present new challe...
This thesis investigates the problem of high-dimensional data classification using evolutionary rule...
In Springer series Advances in Intelligent Systems and Computing, vol. 529. The ...
Genetic programming (GP) has the potential to provide unique solutions to a wide range of supervise...
Big data processing is the new challenge for analytical, machine learning techniques. Many efforts a...
Extracting knowledge and significant information from very large data sets is a main topic in Data Scie...
In this thesis, we investigate the adaptation of GP to overcome the data Volume hurdle in Big Data p...
This paper proposes and surveys genetic implementations of algorithms for selection and partitioning...
Active learning is a kind of semi-supervised learning method in which learning algorith...
One of the biggest research challenges in KDD and Data Mining is to develop methods that scale up w...
Scalability is a key requirement for any KDD and data mining algorithm, and one of the biggest resea...
In this thesis, we study the adaptation of Genetic Programs (GP) to overcome the obstacl...
Performance of evolutionary algorithms depends on many factors such as population size, number of ge...
© 2014 IEEE. In this paper we propose a deterministic method to obtain subsets from big data which a...