Machine learning is a rapidly advancing field. While increasingly sophisticated statistical methods are being developed, their use for concrete applications is not necessarily clear-cut. This thesis explores techniques to handle some issues which arise when applying machine learning algorithms to practical data sets. The focus is on two particular problems: how to effectively make use of incomplete data sets without having to discard samples with missing values, and how to select an appropriately representative set of variables for a given task. For tasks with missing values, distance estimation is presented as a new approach which would directly enable a large class of machine learning methods to be used. It is shown that the distance can...
The objective of this thesis is to consider some challenges that arise when conducting causal infere...
OptoNova is a world leading producer of inspection systems for quality control of surfaces and edges...
The statistical analysis of growing masses of data represents a real added value for numerous and va...
Machine learning is a rapidly advancing field. While increasingly sophisticated statistical methods ...
International audienceThe majority of all commonly used machine learning methods can not be applied ...
The importance of variable selection procedures in non-linear regression analysis is becoming increa...
International audienceMany data sets have missing values, however the majority of statistical method...
Modern data sets often suffer from the problem of having measurements from very few samples. The sm...
Modeling with mixtures is a powerful method in the statistical toolkit that can be used for represen...
Training of machine learning models often require sampling when the dataset is large. The manner in ...
In the field of machine learning, representation learning is a collection of techniques that transfo...
Machine learning has becoming a trending topic in the last years, being now one of the most demandin...
Despite the rapid progress in the field of machine learning and artificial neural networks, many hur...
Single-Gaussian and Gaussian-Mixture Models are utilized in various pattern recognition tasks. The m...
Most diseases have different heterogeneous effects on patients. Broadly, one may conclude what manif...
The objective of this thesis is to consider some challenges that arise when conducting causal infere...
OptoNova is a world leading producer of inspection systems for quality control of surfaces and edges...
The statistical analysis of growing masses of data represents a real added value for numerous and va...
Machine learning is a rapidly advancing field. While increasingly sophisticated statistical methods ...
International audienceThe majority of all commonly used machine learning methods can not be applied ...
The importance of variable selection procedures in non-linear regression analysis is becoming increa...
International audienceMany data sets have missing values, however the majority of statistical method...
Modern data sets often suffer from the problem of having measurements from very few samples. The sm...
Modeling with mixtures is a powerful method in the statistical toolkit that can be used for represen...
Training of machine learning models often require sampling when the dataset is large. The manner in ...
In the field of machine learning, representation learning is a collection of techniques that transfo...
Machine learning has becoming a trending topic in the last years, being now one of the most demandin...
Despite the rapid progress in the field of machine learning and artificial neural networks, many hur...
Single-Gaussian and Gaussian-Mixture Models are utilized in various pattern recognition tasks. The m...
Most diseases have different heterogeneous effects on patients. Broadly, one may conclude what manif...
The objective of this thesis is to consider some challenges that arise when conducting causal infere...
OptoNova is a world leading producer of inspection systems for quality control of surfaces and edges...
The statistical analysis of growing masses of data represents a real added value for numerous and va...