This paper describes two very large (> 1 billion words) Web-derived “reference” corpora of English and French, called ukWaC and frWaC, and reports on a pilot study in which these resources are applied to a bilingual lexicography task focusing on collocation extraction and translation. The two corpora were assembled through automated procedures, and little is known of their actual contents. The study therefore aimed to provide a mainly qualitative evaluation of the corpora by applying them to a practical task, i.e. ascertaining whether resources built automatically from the Web can be profitably applied to lexicographic work, on a par with more costly and carefully built resources such as the British National Corpus (for English). ...
This paper describes and discusses some theoretical and practical problems arising from developing a...
This research is located in the natural language processing (NLP) domain, at the intersection of com...
This article introduces ukWaC, deWaC and itWaC, three very large corpora of English, German, and Ita...
This paper describes two very large (> 1 billion words) Web-derived “reference” corpora of Engl...
The present contribution examines the potential of web corpora in bilingual education
Collocations are notoriously difficult for non-native speakers to translate, primarily because they ...
Corpus analysis has become an essential part of unilingual lexicography and its usefulness has been ...
Collocations are notoriously difficult for non-native speakers to translate, primarily because they ...
Collocations are notoriously difficult for non-native speakers to translate, primarily because they ...
The objectives of this paper are twofold: the first is to illustrate the potential of the World Wide...
Collocations are notoriously difficult for non-native speakers to translate, primarily because they ...
Research on the representation of word-formation in dictionaries is scarce and tends to be restricte...
Presentation for the 5th International Conference on Corpus Linguistics (CILC 2013), V Congreso Inte...
This paper aims to consider the impact corpora have made on language studies and to touch upon the i...
We present a system for collocation extraction, using both monolingual and bilingua...