Every information retrieval (IR) model embeds some form of term frequency (TF) quantification in its scoring function. The contribution of the term frequency is determined by the properties of the chosen TF quantification function and by its TF normalization. The former defines how independent the occurrences of multiple terms are, while the latter mitigates the a priori probability of a high term frequency in a document (an estimate usually based on document length). New test collections from different domains (e.g. medical, legal) give evidence that not only document length but also the verboseness of documents should be explicitly considered. Therefore we propose and investigate a systematic combinati...
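To make the distinction between TF quantification and TF normalization concrete, the following is a minimal sketch and not the method proposed in the abstract. The function names, the parameters `b` and `k`, and in particular the definition of `verboseness` (document length divided by the number of distinct terms) are assumptions introduced here for illustration only. The normalization shown is the familiar pivoted, BM25-style length normalization.

```python
import math


def normalize_tf(tf, doc_len, avg_doc_len, b=0.75):
    """Length-normalize a raw term frequency (pivoted, BM25-style).

    b = 0 disables normalization; b = 1 normalizes fully by relative length.
    """
    return tf / (1.0 - b + b * doc_len / avg_doc_len)


def quantify_total(tf):
    """Total TF: every occurrence counts fully (occurrences treated as independent)."""
    return tf


def quantify_log(tf):
    """Logarithmic TF: each further occurrence contributes less."""
    return math.log(1.0 + tf)


def quantify_saturating(tf, k=1.2):
    """Saturating TF: bounded above by 1, strongly dependent occurrences."""
    return tf / (tf + k)


def verboseness(doc_len, n_distinct_terms):
    """ASSUMPTION (not from the source): average occurrences per distinct term."""
    return doc_len / n_distinct_terms


if __name__ == "__main__":
    # Hypothetical document: 200 tokens, 80 distinct terms, collection average 100 tokens.
    raw_tf = 3
    tfn = normalize_tf(raw_tf, doc_len=200, avg_doc_len=100)
    print("normalized tf:", tfn)
    print("total:", quantify_total(tfn))
    print("log:", quantify_log(tfn))
    print("saturating:", quantify_saturating(tfn))
    print("verboseness:", verboseness(200, 80))
```

The three quantification functions differ only in how quickly repeated occurrences stop adding to the score, while the normalization step is applied to the raw count beforehand. Whether verboseness enters as a second normalization factor, and how, is left open here because the abstract is truncated.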
This thesis devises a novel methodology based on probability theory, suitable for the construction o...
The present study deals with word frequencies distributions and their relation to probabilistic Info...
Automatic information retrieval systems have to deal with documents of varying lengths in a text col...
Open access funding provided by Austrian Science Fund (FWF). This research was partly supported by t...
We introduce and create a framework for deriving probabilistic models of Information Retrieval. The ...
Term weighting is an essential part of the modern information retrieval systems. Out of the three ma...
Document fields, such as the title or the headings of a document, offer a way to consider the struct...
Abstract. Term frequency normalization is a serious issue since document lengths vary. G...
Document length normalization is an important aspect of term weight assignment in an automatic infor...
This paper presents a new probabilistic model of information retrieval. The most important modeling ...