A comprehensive corpus of news articles on the topic of language, published in major daily newspapers and news portals in Bosnia and Herzegovina over the five-year period from January 1, 2015 to January 1, 2020. The corpus is designed to facilitate research on metalanguage (‘language about language’), linguistic ideologies, and language policy and planning, as well as on the specific contemporary debates about defining, naming, and standardising language that are ongoing in post-Yugoslav societies. The corpus is available as plain text and as XML with full metadata. MetaLangNEWS-Bs is complemented by a separate corpus of citizen metalanguage comments, i.e. online comments on the news articles, available as MetaLangNEWS-COMMENTS-Bs (http://hdl.handle.net...
This white paper is part of a series that promotes knowledge about language technology and its poten...
The Trendi corpus is a monitor corpus of Slovene. It contains news from 106 different media websites...
The Trendi corpus is a monitor corpus of Slovene. It contains news from 106 different media websites...
A comprehensive corpus of user comments on online news articles on the topic of language from major ...
A comprehensive corpus of news articles on the topic of language, published in major Montenegrin dai...
A comprehensive corpus of user comments on online news articles on the topic of language from major ...
A comprehensive corpus of news articles on the topic of language, published in major Serbian daily n...
A comprehensive corpus of news articles on the topic of language, published in major Slovenian daily...
A comprehensive corpus of user comments on online news articles on the topic of language from major ...
A comprehensive corpus of news articles on the topic of language, published in major Macedonian dail...
A comprehensive corpus of user comments on online news articles on the topic of language from major ...
Growing interest in meta-language, in linguistics and other disciplines, has highlighted a gap in me...
Gigafida 2.0, with about 1.1 billion words, is a reference corpus of written Slovene text published ...
The Trendi corpus is a monitor corpus of Slovene. It contains news from 107 different media websites...
The Bosnian web corpus bsWaC was built by crawling the .ba top-level domain in 2014. The corpus was ...