In recent years, statistical language models have been proposed as an alternative to the vector space model. Viewing documents as language samples raises the issue of defining a joint probability distribution over the terms. This paper models a document as the result of a Markov process. It argues that this process is ergodic, which is theoretically plausible and easy to verify in practice. The theoretical result is that the joint distribution can be obtained easily. The approach can also be applied at search resolutions other than the document level. We verified this in a query expansion experiment that demonstrates both the validity and the practicability of the method. This holds promise for general language models.
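To make the idea of an ergodic term process concrete, the following minimal Python sketch estimates a first-order term transition matrix from a token sequence, checks ergodicity, and approximates the stationary distribution. It is an illustration under stated assumptions, not the paper's actual procedure: the function names (`transition_matrix`, `is_ergodic`, `stationary`), the circular wrap-around of the text, and the use of power iteration are all choices made for this example.

```python
def transition_matrix(tokens):
    """Estimate first-order term-to-term transition probabilities."""
    vocab = sorted(set(tokens))
    index = {t: i for i, t in enumerate(vocab)}
    n = len(vocab)
    counts = [[0.0] * n for _ in range(n)]
    # Assumption: treat the text as circular so the last term has a successor.
    for a, b in zip(tokens, tokens[1:] + tokens[:1]):
        counts[index[a]][index[b]] += 1.0
    P = [[c / sum(row) for c in row] for row in counts]
    return vocab, P


def bool_mult(A, B):
    """Boolean matrix product: which states reach which in one more step."""
    n = len(A)
    return [[any(A[i][k] and B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_ergodic(P):
    """A finite chain is irreducible and aperiodic iff some power of P is
    strictly positive; Wielandt's bound limits the exponent to (n-1)^2 + 1."""
    n = len(P)
    pattern = [[p > 0 for p in row] for row in P]
    power = pattern
    for _ in range((n - 1) ** 2 + 1):
        if all(all(row) for row in power):
            return True
        power = bool_mult(power, pattern)
    return False


def stationary(P, iterations=200):
    """Approximate the stationary distribution by repeated pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


vocab, P = transition_matrix("a b a a b".split())
print(vocab, is_ergodic(P), [round(x, 3) for x in stationary(P)])
```

With the circular wrap-around assumed here, the stationary distribution of the estimated chain coincides with the relative term frequencies of the text, which gives a simple sanity check on the estimate.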
This paper faces a central theme in applied statistics and information science, which is the assessm...
Written text is one of the fundamental manifestations of human language, and the study of its univer...
Intuitively, any 'bag of words' approach in IR should benefit from taking term dependencies into acc...
This paper proposes a novel statistical approach to intelligent document retrieval. It seeks to off...
We show how Recursive Markov Chains (RMCs) and their restrictions can define probabilistic distribut...
In this paper, we give an overview of the different results existing on the...
A central idea of Language Models is that documents (and perhaps queries) are random variables, gene...
The question of authorship of texts has already been investigated by several scientists. For example...
This paper presents the process of refining documents and their terms in Information Retrieval. I...