We develop a discrete model of type-token dynamics based on random type selection from the Zipf-Mandelbrot probability distribution, with a view to examining the relationships between the constants of Zipf’s and Heaps’ laws. Analysis of items randomly selected from the Standardised Project Gutenberg Corpus (SPGC) reveals a significant low-frequency “droop” in the β-slope of the types vs. frequency distribution, which is inconsistent with the model when vocabulary is unlimited; when a finite vocabulary limit is imposed, optimal parameter selection allows the droop to be reproduced. We adjust the parameters of both the limited and unlimited vocabulary models to obtain optimal agreement with the vocabulary growth curves: the limited vocabulary mod...
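The kind of model this abstract describes can be illustrated with a minimal sketch: tokens are drawn at random from a Zipf-Mandelbrot distribution over a finite vocabulary, and the number of distinct types is recorded after each draw. This is not the authors' implementation. The function names, the exponent `alpha`, the shift `q`, and their default values are assumptions chosen for illustration only.

```python
import random
from itertools import accumulate


def zipf_mandelbrot_weights(vocab_size, alpha=1.0, q=2.7):
    """Unnormalised Zipf-Mandelbrot weights (r + q)^(-alpha) for ranks 1..vocab_size.

    alpha and q are illustrative defaults, not values taken from the paper.
    """
    return [(r + q) ** (-alpha) for r in range(1, vocab_size + 1)]


def vocabulary_growth(n_tokens, vocab_size, alpha=1.0, q=2.7, seed=0):
    """Draw n_tokens ranks at random and return the type-token curve.

    Element i of the result is the number of distinct types seen
    after i + 1 tokens. vocab_size is the finite vocabulary limit.
    """
    rng = random.Random(seed)
    cum = list(accumulate(zipf_mandelbrot_weights(vocab_size, alpha, q)))
    ranks = rng.choices(range(1, vocab_size + 1), cum_weights=cum, k=n_tokens)

    seen = set()
    growth = []
    for rank in ranks:
        seen.add(rank)
        growth.append(len(seen))
    return growth


if __name__ == "__main__":
    curve = vocabulary_growth(n_tokens=5000, vocab_size=1000)
    # Print a few points of the simulated vocabulary growth curve.
    for n in (10, 100, 1000, 5000):
        print(n, curve[n - 1])
```

With a small `vocab_size` the curve flattens once every type has been seen, which is the qualitative effect of imposing a vocabulary limit. A large `vocab_size` relative to `n_tokens` approximates the unlimited case.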
The dependence on text length of the statistical properties of word occurrences has long been consid...
Written text is one of the fundamental manifestations of human language, and the study of its univer...
Human language evolved by natural mechanisms into an efficient system capable of coding and transmit...
We investigate the predictive capability of mathematical models of the type-token relationship appli...
This paper describes a population model for word frequency distributions based on the Zipf-Mandelbro...
We propose a stochastic model for the number of different words in a given database which incorporat...
We investigate the origin of Zipf's law for words in written texts by means of a stochastic dynamic ...
Natural language is a remarkable example of a complex dynamical system which combines variation and ...
This paper studies the limits of language models' statistical learning in the context of Zipf's law....
This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Go...
In this paper the Zipf–Mandelbrot law is revisited in the context of linguistics. Despite its widesp...
The dependence on text length of the statistical properties of word occurrences has long been cons...