This paper is intended for an audience of speech technology specialists who believe that "automatic processing of under-resourced languages is a way to study language diversity with a multi-disciplinary view" (L. Besacier, keynote speech at this conference). It aims (i) to provide an illustration of the way in which data are collected in fieldwork on endangered languages, bringing attention to the quality of the transcriptions and annotations created by linguists; (ii) to present the contents and format of a set of endangered-language documents synchronizing sound and text, which are currently available online; and (iii) to sketch out some of the research purposes and applications to which these documents lend themselv...
Haig G, Schnell S, Wegener C. Comparing corpora from endangered language projects: Explorations in l...
Documenting endangered languages has emerged in the past decade as a specialised subdiscipline of li...
Many endangered languages have little documentation, and that which does exist is often in a format ...
Haig G, Nau N, Schnell S, Wegener C, eds. Documenting Endangered Languages: Achievements and Perspec...
This volume represents part of an unprecedented and still growing effort to advance, coordinate and ...
It is generally agreed that about 7,000 languages are spoken across the world today and at least hal...
New technologies are seen as an opportunity to 'save' endangered languages. But is this the real cha...
In the past 10 or so years, intensive documentation activities, i.e. compilations of large, multimed...
In the last three decades the field of endangered and minority languages has evolved rapidly, moving...
The Pangloss Collection is a language archive developed since 1994 at the Lang...
Generating accurate word-level transcripts of recorded speech for language documentation is difficul...