N-grams are generalized words consisting of N consecutive symbols, as they are used in a text. This paper determines the rank-frequency distribution for redundant N-grams. For entire texts this is known to be Zipf's law (i.e., an inverse power law). For N-grams, however, we show that the rank-frequency distribution is P_N(r) = C / ψ_N(r)^β, where ψ_N is the inverse function of f_N(x) = x ln^(N-1) x. Here we assume that the rank-frequency distribution of the symbols follows Zipf's law with exponent β.
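The inverse function ψ_N in the abstract above has no closed form, but since f_N(x) = x ln^(N-1) x is increasing for x > 1 it can be inverted numerically. A minimal sketch (function names, bisection bounds, and default parameters are illustrative assumptions, not the paper's code) of evaluating the resulting N-gram rank-frequency curve:

```python
import math

def f_N(x, N):
    """f_N(x) = x * (ln x)^(N-1), defined and increasing for x > 1."""
    return x * math.log(x) ** (N - 1)

def psi_N(y, N, lo=1.000001, hi=1e12):
    """Invert f_N by bisection (valid because f_N is increasing for x > 1).

    The bracket [lo, hi] is an assumed search range, not from the paper.
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if f_N(mid, N) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ngram_rank_frequency(r, N, beta=1.0, C=1.0):
    """The paper's result: P_N(r) = C / psi_N(r)^beta."""
    return C / psi_N(r, N) ** beta
```

Note that for N = 1 we have f_1(x) = x, so ψ_1(r) = r and the formula reduces to ordinary Zipf's law C / r^β, a useful sanity check on the numerics.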
Here, assuming a general communication model where objects map to signals, a power function for the ...
Zipf's law is a fundamental paradigm in the statistics of written and spoken natural language as wel...
Despite being a paradigm of quantitative linguistics, Zipf's law for words suffers from three ma...
The frequency of words and letters in bodies of text has been heavily studied for several purposes, ...
We examine the relationship between two different types of ranked data, frequencies and magnitudes. ...
Human language evolved by natural mechanisms into an efficient system capable of coding and transmit...
Zipf's law states that the frequency of a word is a power function of its rank. The exponent of the ...
Some authors have recently argued that a finite-size scaling law for the text-length dependence of w...
The dependence on text length of the statistical properties of word occurrences has long been cons...
This paper establishes the general relation between the distribution of N-tuples of letters (e.g. N-...
Zipf's law has intrigued people for a long time. This distribution models a ce...
In his pioneering research, G.K. Zipf observed that more frequent words tend to have more meanings, ...