This paper describes the method employed to build a machine-readable pronunciation dictionary for Brazilian Portuguese. The dictionary uses a hybrid approach to grapheme-to-phoneme conversion, combining manual transcription rules with machine learning algorithms. Its word list was compiled from the Portuguese Wikipedia dump: Wikipedia articles were converted to plain text and tokenized, and word types were extracted. A language identification tool was developed to detect loanwords in the data. Syllable boundaries and stress were identified for each word. The transcription task was carried out in a two-step process: i) words are submitted to a set of transcription rules, in which predictable graphemes (mostly consonants) ...
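As a rough illustration of the pipeline described above, the following Python sketch extracts word types from plain text and applies a first rule-based pass that transcribes only graphemes treated as predictable. The function names `word_types` and `first_pass`, and the small rule table `PREDICTABLE`, are hypothetical and chosen for this example only; they are not the authors' rules or code. Graphemes the rules do not cover are left unresolved for a later step, which the truncated abstract does not detail.

```python
import re
from collections import Counter

# Hypothetical rule table (an assumption for illustration, not the paper's rules):
# a few consonant graphemes mapped one-to-one to a phoneme symbol.
PREDICTABLE = {"p": "p", "b": "b", "f": "f", "v": "v"}


def word_types(text):
    """Lowercase the text, tokenize on runs of letters, and count word types."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(tokens)


def first_pass(word):
    """Pair each grapheme with its phoneme if a rule covers it, else with None.

    None marks a grapheme that is left unresolved for a later step.
    """
    return [(ch, PREDICTABLE.get(ch)) for ch in word]


if __name__ == "__main__":
    types = word_types("A fava e a Fava")
    for word in sorted(types):
        print(word, types[word], first_pass(word))
```

Keeping unresolved graphemes explicit (as `None`) rather than guessing makes it easy to see which positions a second, data-driven step would still have to fill.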
This paper shows that web pronunciations such as those available in the Wiktio...
Abstract. In this paper we describe how Dicionário-Aberto, an online dictionary for the Portuguese l...
Abstract An automatic speech recognition system has modules that depend on the language and, while t...
This thesis presents tools and resources for the development of applications in Natural Language Pro...
Portuguese is the official language of eight countries, including Brazil. Since Brazil started to be...
Abstract. This paper describes one aspect of an ongoing work to incorporate pronunciation variabilit...
Speech processing has become a data-driven technology. Hence, the success of research in this area i...
Natural languages are oral communication vehicles that eventually are written. It is very important ...
Abstract. We investigate a machine learning approach to Portuguese pronoun resolution. We presently ...
In this paper, a linguistically rule-based grapheme-to-phone (G2P) transcription algorithm is descri...
Automatic Phonetic Transcription is a crucial task for many applications of different areas. Besides...
The Historical Dictionary of Brazilian Portuguese (HDBP), the first of its kind, is based on a corpu...
This bachelor thesis deals with the issue of creating a translation dictionary, which is exemplified...