Janes-Kratko is a corpus of Slovene tweets manually annotated with shortening phenomena according to the supplied typology covering different types of spelling, lexical and syntactic shortenings. The corpus was sampled from the Janes-Norm corpus (http://hdl.handle.net/11356/1084), which was manually annotated for tokenisation, sentence segmentation and word normalisation of non-standard Slovene and automatically annotated with morphosyntactic descriptions and lemmas. The corpus is further described in: GOLI, Teja, OSRAJNIK, Eneja, FIŠER, Darja. Analiza krajšanja slovenskih sporočil na družbenem omrežju Twitter. Proceedings of the Conference on Language Technologies & Digital Humanities, Ljubljana, Slovenia. 2016, pp. 77-82. http://www.s...
The main objective of this article is to assess the value of the Janes corpus for research in the fi...
ReLDI-NormTag-hr 1.1 is a manually annotated corpus of Croatian tweets. It is meant as a gold-standa...
ReLDI-NormTag-sr 1.1 is a manually annotated corpus of Serbian tweets. It is meant as a gold-standar...
Janes-Syn is a syntactically annotated corpus of Slovene tweets and is meant as a gold-standard trai...
Janes-Norm is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC) consistin...
Janes-Tag is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is mea...
Janes-Norm is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is me...
Janes-Norm is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is me...
Janes-Tag is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is mea...
Janes-Norm is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is me...
Janes-Vejica is a corpus of Slovene tweets where commas are annotated with the reason for their (in)...
Janes-Tag is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is mea...
Janes-Tag is a manually annotated corpus of Slovene Computer-Mediated Communication (CMC). It is mea...
Janes-Tweet is an annotated corpus of almost 10 million tweets posted from 2013-06 to 2017-06 by app...
Janes-Preklop is a corpus of Slovene tweets that is manually annotated for code-switching (the use o...