Master’s degree in Physics of Complex Systems at the Universitat de les Illes Balears, academic year 2015/16. The large amount of digitized linguistic data opens up the unique possibility of using the methodology of complex systems to understand high-level human cognitive processes. Two such issues are i) the way we categorize the continuous space of real-world features into discrete concepts, and ii) the way we use language to copy a line of thought from one brain to another. In this work I address both questions by formulating a simple text generation model which reproduces the three major characteristic large-scale statistical laws of human language streams, namely Zipf’s law, Heaps’ law and burstiness. Furthermore, the generation ...
Quantitative linguistics has provided us with a number of empirical laws that characterise the evolu...
Large-scale linguistic data is nowadays available in abundance. Using this source of data, previous ...
We study the relationship between vocabulary size and text length in a corpus of 75 literary works i...
Written text is one of the fundamental manifestations of human language, and the study of its univer...
In this paper we extract the topology of the semantic space in its encyclopedic acception, measuring...
Written language is a complex communication signal capable of conveying information encoded in the f...
We present statistical analyses of the large-scale structure of three types of semantic networks: wo...
In recent years, graph theory has been widely employed to probe several language properties....
Natural language is a remarkable example of a complex dynamical system which combines variation and ...
The word-space model is a computational model of word meaning that utilizes the distributional patte...
The hypothesis that word co-occurrence statistics extracted from text corpora can provide a basis fo...
Zipf’s law has intrigued people for a long time. This distribution models a ce...