Developing high-quality transcription systems for very large vocabulary corpora is a challenging task. Proper names are usually key to understanding the information contained in a document. To increase vocabulary coverage, a huge amount of text data should be used. In this paper, we extend previously proposed neural network models for word embedding: the word vector representation proposed by Mikolov is enriched with an additional non-linear transformation. This model better accounts for lexical and semantic word relationships. In the context of broadcast news transcription, experimental results show that, in terms of recall, the proposed model is well able to select new relevant proper names.
Named entity recognition (NER) remains a very challenging problem essentially ...
Recent advances in neural language models have contributed new methods for learning distributed vect...
In this bachelor thesis, I first introduce the machine learning methodology of text classification w...
Developing high-quality transcription systems for very large vocabulary corpor...
Proper names are usually key to understanding the information contained in a d...
The problem of out-of-vocabulary words, more precisely proper names retrieval ...
This paper deals with the problem of high-quality transcription systems for ve...
Many Proper Names (PNs) are Out-Of-Vocabulary (OOV) words for speech recogniti...
Proper name recognition is a challenging task in information retrieval from la...
Despite recent progress in developing Large Vocabulary Continuous Speech Recog...
The diachronic nature of broadcast news causes frequent variations in the linguistic content and voca...
The diachronic nature of broadcast news data leads to the problem of Out-Of-Vo...
Proper names are usually keys to understand the information contained in a doc...
Recognition of Proper Names (PNs) in speech is important for content based ind...
Proper name recognition is a challenging task in information retrieval in larg...