Analyzing large volumes of high-dimensional data is an issue of fundamental importance in data science, molecular simulations and beyond. Several approaches work on the assumption that the important content of a dataset belongs to a manifold whose Intrinsic Dimension (ID) is much lower than the raw number of coordinates. Such a manifold is generally twisted and curved; in addition, points on it are non-uniformly distributed. These two factors make identifying the ID and exploiting it difficult. Here we propose a new ID estimator that uses only the distances to the first and second nearest neighbors of each point in the sample. This extreme minimality enables us to reduce the effects of curvature, of density variation, a...
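To make the idea of an estimator built only on the first and second nearest-neighbor distances concrete, here is a minimal sketch. It rests on assumptions not stated in the truncated abstract above: that the ratio mu = r2/r1 of the two distances follows a Pareto law with exponent equal to the ID when the density is locally uniform, and that the ID is obtained by maximum likelihood as N / sum(log mu). The function name `two_nn_id` and the brute-force neighbor search are illustrative choices, not the authors' implementation, which may fit the distribution differently.

```python
import numpy as np


def two_nn_id(points):
    """Sketch of an ID estimate from first/second nearest-neighbor distances.

    Assumption (not from the abstract): mu = r2 / r1 is Pareto-distributed
    with exponent d, so the maximum-likelihood estimate is N / sum(log mu).
    """
    x = np.asarray(points, dtype=float)
    # Brute-force pairwise distances; adequate for a few thousand points.
    diff = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbors
    # The two smallest distances per row, in ascending order.
    nearest = np.partition(dist, 1, axis=1)[:, :2]
    r1, r2 = nearest[:, 0], nearest[:, 1]
    keep = r1 > 0  # drop exact duplicates, for which the ratio is undefined
    mu = r2[keep] / r1[keep]
    return keep.sum() / np.sum(np.log(mu))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A flat 2-dimensional sheet embedded in 5 coordinates.
    sheet = np.zeros((1000, 5))
    sheet[:, :2] = rng.random((1000, 2))
    print(two_nn_id(sheet))
```

Because only the two closest neighbors of each point enter the estimate, the locally uniform density assumption has to hold only over very small neighborhoods, which is the property the abstract refers to as minimality.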
Intrinsic dimension estimation is a fundamental problem in manifold learning. In applica...
While analyzing multidimensional data, we often have to reduce their dimensionality so as to prese...
The high dimensionality of some real life signals makes the usage of the most common signal processi...
Analyzing large volumes of high-dimensional data is an issue of fundamental importance in science a...
Modern datasets are characterized by numerous features related by complex dependency structures. To ...
Identifying the minimal number of parameters needed to describe a dataset is a challenging problem k...
One of the founding paradigms of machine learning is that a small number of variables is often suffi...
When dealing with datasets comprising high-dimensional points, it is usually advantageous to discove...
Accurate estimation of Intrinsic Dimensionality (ID) is of crucial importance ...
Dimensionality reduction is a very important tool in data mining. An intrinsic dimensionality of a d...
We propose a novel method for linear dimensionality reduction of manifold modeled data. First, we sh...
This thesis concerns the problem of dimensionality reduction through information geometric methods o...
Let $X_1,...,X_N$, $X_i\in \mathbb{R}^D$ be a uniform sample drawn on a compact $d$-dimensional manifold $...
Dimensionality reduction methods are preprocessing techniques used for coping with high dimensionali...