We present a first attempt to perform attentional word segmentation directly from the speech signal, with the ultimate goal of automatically identifying lexical units in a low-resource, unwritten language (UL). Our methodology assumes that recordings in the UL are paired with translations in a well-resourced language. It uses Acoustic Unit Discovery (AUD) to convert speech into a sequence of pseudo-phones, which is then segmented using the soft alignments produced by a neural machine translation model. Evaluation uses an actual Bantu UL, Mboshi; comparisons to monolingual and bilingual baselines illustrate the potential of attentional word segmentation for language documentation.
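As a rough illustration of how soft alignments can yield a segmentation, the sketch below assigns each pseudo-phone to the translation word that receives its highest attention weight and starts a new segment whenever that word changes. The abstract does not spell out the exact decision rule, so the argmax rule, the function name `segment_by_attention`, and the toy attention matrix are all assumptions made for this example, not the authors' implementation.

```python
def segment_by_attention(units, attention):
    """Group pseudo-phones into word-like segments (illustrative sketch).

    units:     list of pseudo-phone labels, e.g. output of an AUD system.
    attention: attention[i][j] is the soft-alignment weight between
               pseudo-phone i and translation word j (hypothetical layout).
    """
    if len(units) != len(attention):
        raise ValueError("need one attention row per pseudo-phone")
    if not units:
        return []

    # Hard-align each pseudo-phone to its most-attended translation word
    # (assumed rule: plain argmax over the row).
    aligned = [max(range(len(row)), key=row.__getitem__) for row in attention]

    # Start a new segment whenever the aligned word changes.
    segments = [[units[0]]]
    for prev, cur, unit in zip(aligned, aligned[1:], units[1:]):
        if cur != prev:
            segments.append([unit])
        else:
            segments[-1].append(unit)
    return segments


# Toy input: five pseudo-phones, two translation words.
toy_units = ["a", "b", "c", "d", "e"]
toy_attention = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.3, 0.7],
    [0.4, 0.6],
    [0.1, 0.9],
]
toy_segments = segment_by_attention(toy_units, toy_attention)
```

With this rule a boundary is placed between "b" and "c", where the most-attended translation word switches. A real system would obtain the matrix from a trained translation model and might smooth or post-process the alignments before cutting.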
For endangered languages, data collection campaigns have to accommodate the challenge that many of t...
In this paper we show that recently developed algorithms for unsupervised word segmentation can be a...
One of the basic tasks of computational language documentation (CLD) is to ide...
Natural language processing systems such as speech recognition and machine translation conventional...
Documenting languages helps to prevent the extinction of endangered dialects, many of which are othe...
Language diversity is under considerable pressure: half of the world’s languages could disappear by ...
Attention-based sequence-to-sequence neural machine translation systems have b...
In recent years, there has been renewed interest in unsupervised sub-lexi...
Automatic speech processing technologies hold great potential to facilitate th...
Accepted to ICASSP 2018. Developing speech technologies for low-resource languag...
Word discovery is the task of extracting words from unsegmented text. In this...
The attention mechanism in Neural Machine Translation (NMT) models added flexi...
Computational Language Documentation (CLD) is a research field interested in proposing methodologies...