The paper presents the Bulgarian National Corpus (BulNC) project: a large-scale, representative corpus of Bulgarian that is available online. The BulNC is a monolingual general corpus, fully morpho-syntactically (and partially semantically) annotated, and manually provided with detailed metadata descriptions. At present the Bulgarian National Corpus consists of about 320 000 000 graphical words and includes more than 10 000 samples. The corpus structure and the criteria adopted for representativeness and balance are briefly presented. The query language for advanced search of collocations and concordances is demonstrated with some examples; it allows the retrieval of word combinations, ordered queries, inflexionally and semantically related wor...
The paper presents our considerations related to the creation of a digital corpus of Bulgarian diale...
The paper presents the history, structure and ongoing activities of the Institute for Bulgarian Lang...
The Bulgarian-English parallel corpus MaCoCu-bg-en 1.0 was built by crawling the ".bg" and ".бг" int...
The paper discusses several key concepts related to the development of corpora and reconsiders them ...
The paper presents the methodology and the outcome of the compilation and the processing of the Bulg...
The Bulgarian-Polish-Russian parallel corpus. The Semantics Laboratory Team of Institute of Slavic S...
Multilingual digital resources with Bulgarian language. The paper presents in brief Bulgarian language...
Contemporary information technologies and mathematical modelling have made creating corpora of natura...
This paper presents a linguistic processing pipeline for Bulgarian including morphological analysis,...
The article briefly reviews the bilingual Slovak-Bulgarian/Bulgarian-Slovak parallel and aligned corpus....
HPSG-based annotation including: constituent structure, dependency relations, named entities (classi...
Bulgarian sense-annotated corpus – between the tradition and novelty. The Bulgarian Sense-annotated ...
In this paper we report on the progress in the creation of an Ontology-based lexicon for Bulgarian. ...
The paper introduces the Political Speech Corpus of Bulgarian. First, its current state has been dis...