Abstract

Recurring sequences of words have long been regarded by corpus linguists as markers of different genres and registers. Previous research has focused mainly on lexical n-grams, whereas n-grams of other linguistic features, such as part-of-speech tags, have received less attention. The current study examines whether n-grams of part-of-speech tags extracted from a large corpus can discriminate between genres. The results show a strong correlation between part-of-speech n-gram information and the genre of a text.
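To make the idea of part-of-speech n-grams concrete, the following is a minimal sketch, not the study's actual pipeline. It assumes the text has already been tagged and is available as lists of (word, tag) pairs. The function names `pos_ngrams` and `ngram_profile`, the tag labels, and the sample sentences are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def pos_ngrams(tags, n):
    """Return all contiguous n-grams (as tuples) from a list of POS tags."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def ngram_profile(tagged_sentences, n=2):
    """Relative frequency of each POS n-gram across pre-tagged sentences.

    N-grams are counted within sentences only, never across boundaries.
    """
    counts = Counter()
    for sentence in tagged_sentences:
        tags = [tag for _word, tag in sentence]
        counts.update(pos_ngrams(tags, n))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical hand-tagged sample; a real study would tag a large corpus.
news = [
    [("The", "DET"), ("minister", "NOUN"), ("resigned", "VERB")],
    [("The", "DET"), ("market", "NOUN"), ("fell", "VERB")],
]
profile = ngram_profile(news, n=2)
```

Profiles like this, computed per text or per genre, could then serve as feature vectors for a comparison or classification step. Which step the study uses is not stated in the abstract.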
T-Scan is a tool for the automatic analysis of Dutch text. This paper presents the first large-scale...
This thesis is concerned with text typology. In this thesis, the written part of the British Nationa...
Genre classification has been found to improve performance in many applications of statistical NLP, ...
Human communicative practices are organized in terms of genres, and people are highly skilled at rec...
Genre characterizes text differently than the usual subject or propositional content that has been t...
Until now, it is still unclear which set of features produces the best result in automatic genre cl...
This paper examines automated genre classification of text documents and its role in enabling the ef...
Text genre classification is the process of identifying functional characteristics of text documents...
The article discusses the theoretical and practical problems related to the study of speech genres o...
In this paper, we study the effect of using n-grams (sequences of words of length n) for text catego...
This paper describes a method of comparing routine language use in different corpora, and p...
In this chapter, it is shown how we can develop a new type of learner’s or stu...
Web pages are discriminated based on their topic and genre. Web page genres are capable to improve t...
Recently, textual characteristics, i.e. certain language statistics, have been proposed to compare c...