Word embeddings map words to a high-dimensional vector space in which words with similar meanings have similar vectors. We analyzed the problem of automatically identifying verbal idioms in Slovene using features built from the embeddings of single words and of word groups. For this purpose, we built two data sets containing verbal idioms and random word groups, each described by the corresponding features. Using these data sets, we evaluated the classification of verbal idioms with support vector machines, random forests, and logistic regression. All three methods were successful, with random forests performing best. Because of its long computation time and its restriction to word groups with precomputed word embeddings, the approach requires furth...
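As a rough illustration of what "similar vectors" and "features built from embeddings of groups of words" could mean, the sketch below computes cosine similarity between toy vectors and derives a simple feature vector for a word group. The three-dimensional vectors, the example words, and the feature set (group mean plus average pairwise similarity) are assumptions made for illustration only; the abstract does not specify the features actually used.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


# Hypothetical toy embeddings; real embeddings typically have hundreds of dimensions.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}


def group_features(words, embeddings):
    """Assumed feature set for a group of at least two words:
    the mean of the word vectors, followed by the average pairwise cosine similarity."""
    vecs = [embeddings[w] for w in words]
    pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
    cohesion = sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)
    return mean_vector(vecs) + [cohesion]
```

Feature vectors of this kind, one per candidate word group, could then be passed to any standard classifier such as the three named in the abstract.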
A comparison of how several dictionaries treat verbs establishes that specific verbal...
Natural language processing is an important area of computational linguistics and artificial intelli...
Genre analysis, supported by textual examples from a morphosyntactically annotated corpus, explains the complications...
Sloleks is a lexicon of Slovene word forms which contains - in a structured database - Slovene words...
We aim to learn comma placing using machine learning. Our approach is based on adding new attributes...
There is no simple algorithm for stress assignment of Slovene words. Speakers of Slovene are usually...
Natural language processing greatly depends on a sufficient amount of training data. When handling ...
The goal of the thesis is to develop a classifier for recognizing antonyms. To build the solution, ...
Manual transcription of speech is slow and is being replaced by automatic speech recognition systems...
The thesis deals with part-of-speech tagging of the Slovene language. Part-of-speech tagging is a proces...
The goal of this thesis is to create a sentiment dictionary for the Slovenian language which can be ...
The aim of the thesis is to add the rules for comma usage to the LanguageTool program. Using the Lek...
Text summarization addresses the problem of the growing amount of textual data in which we want to discover...
Slovenian dialect words are covered in various books and publications that have been published over ...