This paper describes a method of comparing corpora that uses frequency profiling. The method can be used to discover key words that differentiate one corpus from another. With annotated corpora, it can also be applied to discover key grammatical or word-sense categories. It offers a quick way in to the differences between corpora, and is shown to have applications in the study of social differentiation in the use of English vocabulary, the profiling of learner English, and document analysis in the software engineering process.
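As an illustration of the frequency-profiling comparison described above, the following is a minimal sketch, not the paper's implementation. The abstract does not name the statistic used to rank words, so the log-likelihood ratio, a common choice in keyword analysis, is assumed here. The function names `log_likelihood` and `key_words` and the toy token lists are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood ratio for one item (assumed statistic, see lead-in).

    a, b: frequency of the item in corpus 1 and corpus 2
    c, d: total number of tokens in corpus 1 and corpus 2
    """
    # Expected frequencies if the item were spread evenly across both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def key_words(tokens1, tokens2, top=10):
    """Rank items by how strongly their frequencies differ between two corpora."""
    f1, f2 = Counter(tokens1), Counter(tokens2)
    n1, n2 = len(tokens1), len(tokens2)
    scored = []
    for item in set(f1) | set(f2):
        a, b = f1[item], f2[item]
        scored.append((item, a, b, log_likelihood(a, b, n1, n2)))
    # Highest score first: these items differentiate the corpora most.
    scored.sort(key=lambda row: row[3], reverse=True)
    return scored[:top]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    corpus1 = "the cat sat on the mat the cat".split()
    corpus2 = "the dog sat on the log the dog".split()
    for item, a, b, score in key_words(corpus1, corpus2, top=4):
        print(f"{item}\t{a}\t{b}\t{score:.2f}")
```

The same functions work on annotated corpora if the token lists hold part-of-speech or word-sense tags instead of word forms, which corresponds to the key-category use mentioned in the abstract.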
The characteristics of different corpora influence the success of Information Retrieval and NLP meth...
Frequency lists are useful in their own right for assisting a linguist, lexicographer, language teac...
This paper describes a method of comparing routine language use in different corpora, and p...
Corpus linguistics lacks strategies for describing and comparing corpora. Currently most descriptio...
In this article, I review some of the features available for language analysis in the KeyWords tool ...
In this paper, the prototype system Vis-À-Vis to support linguists in their comparison of regional...
This paper discusses a method to detect statistically significant linguistic d...
We develop an aggregate measure of syntactic difference for automatically finding common syntactic d...
Recently, textual characteristics, i.e. certain language statistics, have been proposed to compare c...
In this work, we discuss the benefits of using automatically parsed corpora to study language variat...
This paper addresses comparability between learner and native corpora. The obj...
This thesis applies the word embedding mapping approach to make a lexical comparison from academic w...
The selection and assessment of ELT materials involve multiple criteria. The use of frequency word l...
An article on learner corpora has just been published in the journal Language Learning. ...
This chapter introduces some basic techniques for analysing corpora. It shows how we can use both qu...