We analyze an algorithm based on principal component analysis (PCA) for detecting the dimension k of a smooth manifold M ⊂ ℝ<sup>d</sup> from a set P of point samples. The best running time so far is O(d2<sup>O(k<sup>7</sup> log k)</sup>) by Giesen and Wagner after the adaptive neighborhood graph is constructed. Given the adaptive neighborhood graph, the PCA-based algorithm outputs the true dimension in O(d2<sup>O(k)</sup>) time, provided that P satisfies a standard sampling condition as in previous results. Our experimental results validate the effectiveness of the approach. A further advantage is that both the algorithm and its analysis can be generalized to the noisy case, in which small perturbations of the samples and a small portion of outliers are allowed.
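To make the idea of PCA-based dimension detection concrete, the following is a minimal sketch of a generic local-PCA heuristic. It is not the algorithm analyzed in the abstract: it replaces the adaptive neighborhood graph with plain nearest neighbors, and the function name `local_pca_dimension`, the neighbor count, and the 0.95 energy threshold are all assumptions chosen for illustration.

```python
import numpy as np


def local_pca_dimension(points, center_index, num_neighbors=20, energy_threshold=0.95):
    """Estimate the intrinsic dimension near one sample with local PCA.

    Hypothetical helper: the nearest-neighbor selection and the energy
    threshold are illustrative assumptions, not the paper's procedure.
    """
    center = points[center_index]
    # Pick the nearest neighbors of the chosen sample (index 0 is the sample itself).
    distances = np.linalg.norm(points - center, axis=1)
    neighbor_ids = np.argsort(distances)[1:num_neighbors + 1]
    # Offsets from the sample approximately span the tangent space when sampling is dense.
    offsets = points[neighbor_ids] - center
    # Squared singular values measure how much the neighbors spread along each principal direction.
    singular_values = np.linalg.svd(offsets, compute_uv=False)
    energy = np.cumsum(singular_values ** 2) / np.sum(singular_values ** 2)
    # Smallest number of principal directions that captures the requested share of the spread.
    return int(np.searchsorted(energy, energy_threshold) + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Samples of a unit 2-sphere, embedded in R^5 by zero padding.
    sphere = rng.normal(size=(2000, 3))
    sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)
    samples = np.hstack([sphere, np.zeros((2000, 2))])
    print(local_pca_dimension(samples, center_index=0))
```

The threshold trades off sensitivity to curvature against sensitivity to noise: in this sketch, denser sampling shrinks the neighborhood and makes the gap between tangent and normal directions clearer.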
A new method for analyzing the intrinsic dimensionality (ID) of low dimensional manifolds in high di...
Most high-dimensional real-life data exhibit some dependencies such that data points do not populate...
Constructing an efficient parametrization of a large, noisy data set of points lying close to a smoo...
We present simple algorithms for detecting the dimension k of a smooth manifold M ⊂ ℝ<sup>d</sup> fr...
Estimating intrinsic dimension of data is an important problem in feature extraction and feature sel...
The problem of approximating multidimensional data with objects of lower dimension is a classical pr...
We introduce the adaptive neighborhood graph as a data structure for modeling a smooth manifold M em...
Constructing an efficient parametrization of a large, noisy data set of points lying close to a smoo...
We propose an automated way of determining the optimal number of low-rank components in dimension re...
We present a method to estimate the manifold dimension by analyzing the shape of simplices formed by...
The identification of a reduced dimensional representation of the data is among the main issues of e...
We study the performance of principal component analysis (PCA). In particular, we consider the probl...
Intuitively, learning should be easier when the data points lie on a low-dimensional submanifold of ...
In 1901, Karl Pearson invented Principal Component Analysis (PCA). Since then, PCA serves as a proto...
The analysis of high-dimensional data often begins with the identification of lower dimensional subs...