ORAL2013 is designed as a representation of authentic spoken Czech used in informal situations (private environment, spontaneity, unpreparedness, etc.) across the whole Czech Republic. The corpus comprises 835 recordings from 2008–2011 that contain 2 785 189 words (i.e. 3 285 508 tokens including punctuation) uttered by 2 544 speakers, of whom 1 297 are unique. ORAL2013 is balanced across the main sociolinguistic categories of speakers (gender, age group, education, region of childhood residence). The corpus is provided in a (semi-XML) vertical format used as input to the Manatee query engine. The data thus correspond to the corpus available via the KonText query engine to registered users of the CNC at http://www.kor...
The corpus contains speech data of 2 Czech native speakers, male and female. The speech is very prec...
The corpus consists of recordings from the Chamber of Deputies of the Parliament of the Czech Republ...
PDTSC 1.0 is a multi-purpose corpus of spoken language. 768,888 tokens, 73,374 sentences and 7,324 m...
ORAL2013 is designed as a representation of authentic spoken Czech used in informal situations (priv...
ORAL2013 is designed as a representation of authentic spoken Czech used in informal situations (priv...
ORAL2013 is designed as a representation of authentic spoken Czech used in informal situations (priv...
Balanced corpus of informal spoken Czech sized 1 MW. It contains transcriptions of 297 recordings ma...
The paper presents a corpus of spontaneous spoken Czech called ORAL2013, its design principles and p...
Corpus of informal spoken Czech sized 1 MW. It contains transcriptions of 221 recordings made in 200...
Corpus of informal spoken Czech sized 1 MW. It contains transcriptions of 221 recordings made in 200...
ORTOFON v1 is designed as a representation of authentic spoken Czech used in informal situations (pr...
ORTOFON v1 is designed as a representation of authentic spoken Czech used in informal situations (pr...
ORTOFON v1 is designed as a representation of authentic spoken Czech used in informal situations (pr...
This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which cont...
This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which cont...