With large sets of text documents growing rapidly, making efficient use of this vast volume of new information and service resources presents challenges to computational scientists. Text documents are usually modeled as a term-document matrix whose vectors are high-dimensional and sparse. To reduce the high dimensionality, researchers have developed various dimensionality reduction methods, one of which is concept decomposition. This method uses document clustering techniques and least-squares matrix approximation to approximate the matrix of vectors. However, the numerical computation is expensive, as it requires the inverse of a dense matrix formed from the concept vector matrix. In this paper we presented a class of mu...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...
As an academic field of study, information retrieval is defined as an activity of finding useful inf...
Constrained tensor and matrix factorization models make it possible to extract interpreta...
Abstract. Unlabeled document collections are becoming increasingly common and available; mining such...
We evaluate and compare the storage efficiency of different sparse matrix storage formats as index s...
In this paper we deal with the problem of addition of new documents in collection when documents are...
Document clustering is the problem of measuring similarity between documents and grouping similar docum...
In recent years, we have seen a tremendous growth in the volume of text documents available on the I...
The task in text retrieval is to find the subset of a collection of documents relevant to a user's ...
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Comput...
In recent years, we have seen a tremendous growth in the volume of online text documents available o...
In many applications—latent semantic indexing, for example—it is required to obtain a reduced rank a...
Latent semantic analysis (LSA) is a mathematical/statistical way of discovering hidden concepts ...
The vast amount of textual information available today is useless unless it can be effectively and ec...
The computational cost of many signal processing and machine learning techniqu...