Gigafida 2.0, with about 1.1 billion words, is a reference corpus of written Slovene text published in the period 1990-2018. It comprises daily news, magazines, a selection of web texts (a portion of which also covers news texts), and various types of publications (fiction, school books, and non-fiction). The texts were selected and automatically processed with the aim of creating a corpus that represents a sample of modern standard Slovene and can be used for research in linguistics and other branches of the humanities, for compiling modern dictionaries, grammars, and learning materials, as well as for developing language technologies for Slovene. Gigafida 2.0 is an upgraded version of the Gigafida corpus (Logar ...
A comprehensive corpus of news articles on the topic of language, published in major daily newspaper...
A comprehensive corpus of news articles on the topic of language, published in major Slovenian daily...
Frequency lists of words were extracted from the Gigafida 2.0 Corpus of Written Standard Slovene (ht...
The availability of large collections of text (language corpora) is crucial for empirically supporte...
The Trendi corpus is a monitor corpus of Slovene. It contains news from 107 different media websites...
The Trendi corpus is a monitor corpus of Slovene. It contains news from 106 different media websites...
In this paper we present Trendi, a monitor corpus of written Slovene, which has been compiled recent...
The Trendi corpus is a monitor corpus of Slovene. It contains news from 106 different media websites...
Corpus ccGigafida consists of paragraph samples from 31,722 documents, each containing information a...
The paper discusses the expansion of the Gigafida corpus, a Slovenian reference corpus, to includ...
MAKS (MlAdinski KorpuS, i.e. the Youth Corpus) includes texts from literature, newspapers, and, to a...
The corpus of Slovene as a foreign language KOST (Korpus slovenščine kot tujega jezika) contains 8,3...
The corpus of Slovene as a foreign language KOST (Korpus slovenščine kot tujega jezika) contains 6,3...
In the last decade, corpus linguistics has finally established itself as a separate research startin...
Natural language processing is an important and extensive branch of computer science, but most...