Latent semantic analysis (LSA), one of the most popular unsupervised dimension reduction tools, has a wide range of applications in text mining and information retrieval. The key idea of LSA is to learn a projection matrix that maps the high-dimensional vector space representations of documents to a lower-dimensional latent space, i.e., the so-called latent topic space. In this paper, we propose a new model called Sparse LSA, which produces a sparse projection matrix via ℓ1 regularization. Compared to traditional LSA, Sparse LSA selects only a small number of relevant words for each topic and hence provides a compact representation of topic-word relationships. Moreover, Sparse LSA is computationally very efficient with much less ...
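The idea of a sparse projection matrix can be illustrated with a minimal sketch. The abstract above is truncated and does not state the optimization problem, so everything below is an assumption made for illustration: the objective 1/2 ||X - U A||_F^2 + lam ||A||_1 with orthonormal columns in U, the alternating update scheme, and the names `soft_threshold`, `sparse_lsa_sketch`, `n_topics`, and `lam` are all hypothetical and should not be read as the authors' implementation.

```python
import numpy as np


def soft_threshold(M, lam):
    """Element-wise soft-thresholding, the proximal operator of lam * ||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - lam, 0.0)


def sparse_lsa_sketch(X, n_topics, lam, n_iter=50, seed=0):
    """Alternating minimization of the ASSUMED objective
        1/2 * ||X - U A||_F^2 + lam * ||A||_1   subject to  U^T U = I.

    X : (n_docs, n_words) document-term matrix, with n_docs >= n_topics.
    Returns U (n_docs, n_topics) and a sparse A (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, _ = X.shape
    # Start from a random matrix with orthonormal columns.
    U, _ = np.linalg.qr(rng.standard_normal((n_docs, n_topics)))
    for _ in range(n_iter):
        # With U^T U = I, the A-step has a closed form:
        # soft-threshold the least-squares solution U^T X.
        A = soft_threshold(U.T @ X, lam)
        # The U-step is an orthogonal Procrustes problem, solved by an SVD.
        P, _, Qt = np.linalg.svd(X @ A.T, full_matrices=False)
        U = P @ Qt
    # Final A-step so that A matches the returned U.
    A = soft_threshold(U.T @ X, lam)
    return U, A
```

Under these assumptions, each row of `A` plays the role of one topic, and its nonzero entries mark the words selected for that topic. A larger `lam` drives more entries to exactly zero, while `lam = 0` leaves `A` dense, as in an unregularized low-rank factorization.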
High dimensional data are rapidly growing in many different disciplines, particularly in natural lan...
A data sparseness problem for modeling a language often occurs in many language models (L...
Keyword matching information retrieval systems are plagued with problems of noise in the document col...
Topic modeling is a powerful tool for uncovering latent structure in many domains, including medicin...
Low-dimensional topic models have been proven very useful for modeling a large corpus of documents ...
Latent Semantic Analysis (LSA) is a vector space technique for representing word meaning. Traditiona...
Learning low dimensional representations from a large number of short corpora has a profound practic...
Latent semantic analysis (LSA) is a technique that analyzes relationships between documents and its ...
Classification We propose a new algorithm for dimensionality reduction and unsupervised text classif...
Latent Semantic Analysis (LSA) is a technique that analyzes relationships between documents and its ...
Statistical topic models such as the Latent Dirichlet Allocation (LDA) have emerged as an attractive...
Document clustering is a popular tool for automatically organizing a large collection of texts. Clus...
The task in text retrieval is to find the subset of a collection of documents relevant to a user's ...
Latent semantic analysis (LSA) is a statistical technique for representing word meaning that has bee...
The latent semantic analysis (LSA) is a mathematical/statistical way of discovering hidden concepts ...