Since its foundation in May 2009, the médialab Sciences Po has worked to foster the use of digital methods and tools in the social sciences. Using existing tools and methods, we experimented with web mining techniques to extract data on collective phenomena. We also attended the symposia organised by the two institutions responsible for web archiving in France, BnF and INA, where we learnt about the difficulties that web archives pose to social scientists. Our own experience in mining the live web was no easier. Such difficulties, we believe, can be explained by the lack of tools allowing scholars to build for themselves the highly specialized corpora they need from the wide heterogeneity of the web. The web isn't a we...
The World Wide Web contains an enormous amount of information, but it can be exceedingly difficult f...
The general scope of our research is to present an apparatus for the treatment of a large corpus of t...
This paper presents an overview of the linguists' use of the Web as a corpus. ...
Since its foundation in May 2009, Sciences Po’s médialab has worked to enhance the use of digital me...
In this article the web’s controversial nature as a corpus is explored on both theoretical and appli...
We investigate the potential of using the web as a huge corpus for language studies. We test the hyp...
The emergence and success of web platforms raised a gimmick into social studies: “Hyperlink is dead!...
From the beginning of the twenty-first century on, the use of the World Wide Web has become a current t...
The most widespread access form to archived web is through the Wayback Machine where the archived we...
The web is a field of investigation for social sciences, and platform-based studies have long proven...
Research in Natural Language Processing (NLP) has in recent years benefited from the enormous amount...
The paper explores some of the issues raised by the notion of the web-as-corpus (Kilgarriff-Greffens...
The aim of this paper is to relate traces of the fundamental cultural characteristics of intellectua...
In this thesis I want to discuss a relatively new area of Corpus Linguistics, namely the use of the...