We study the relationship between vocabulary size and text length in a corpus of 75 literary works in English, authored by six writers, distinguishing between the contributions of three grammatical classes (or 'tags,' namely, nouns, verbs and others), and analyse the progressive appearance of new words of each tag along each individual text. We find that, as prescribed by Heaps' Law, vocabulary sizes and text lengths follow a well-defined power-law relation. Meanwhile, the appearance of new words in each text does not obey a power law, and is on the whole well described by the average of random shufflings of the text. Deviations from this average, however, are statistically significant and show systematic trends across the corpus. Specifica...
We investigate the predictive capability of mathematical models of the type-token relationship appli...
Quantitative linguistics has provided us with a number of empirical laws that characterise the evolu...
We analyze the occurrence frequencies of over 15 million words recorded in millions of books publish...
Heaps' law is an empirical relation in text analysis that predicts vocabulary growth as a function o...
We analyze the frequency–rank relationship in sub-vocabularies corresponding to three different gram...
This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Go...
Written text is one of the fundamental manifestations of human language, and the study of its univer...
The dependence with text length of the statistical properties of word occurrences has long been cons...
With Zipf’s law being originally and most famously observed for word frequency, it is surprisingly l...
The dependence on text length of the statistical properties of word occurrences has long been consid...
Natural language is a remarkable example of a complex dynamical system which combines variation and ...