In this paper we present the first freely available corpus of Dutch text messages containing data originating from the Netherlands and Flanders. The corpus was collected in the framework of the SoNaR project and constitutes a viable part of that 500-million-word corpus. About 53,000 text messages were collected on a large scale through voluntary donations. These messages will be distributed as such. We focus on the data collection processes involved and, after studying the effect of media coverage, show that free publicity in newspapers and on social media networks in particular results in more contributions. All SMS messages are provided with metadata. Looking at the composition of the corpus, it becomes visible that...
The construction of a large and richly annotated corpus of written Dutch was identified as one of th...
In this paper we report on the experiences gained in the recent construction of the SoNaR corpus, a ...
This article highlights an approach based on authentic data, by focusing on recent research related ...
In this paper a collection of chats and tweets from the Netherlands and Flanders is described. The c...
The development of communication technologies has contributed to the appearance of new forms in the ...
Although in recent years numerous forms of Internet communication – such as e-mail, blogs, chat room...
In this article we introduce a new corpus of computer-mediated communication in Dutch by Moroccan-Du...
The SoNaR Nieuwe Media Corpus 1.0 contains new-media texts that were collected within the STEVIN-pr...
The Spoken Dutch Corpus that is currently under construction will constitute a 10-million-word corpu...
The use of text corpora has increased considerably in the past few years, not only in the field of l...
Virtual textual communication relies on digital media as carrier and mediator. SMS language is...