With the growing availability of spoken language corpora, more and more data-driven research in phonetics is possible. The downside of having huge speech corpora is that they have to be segmented and labeled before they can be exploited. As labeling and annotation are time-consuming and costly, there is an interest in standardization, which would support the exchange and reuse of labeled data. The MATE project proposes standards for an integrated and consistent multi-level annotation of speech and especially dialogue corpora. These proposals are based on the existing Text Encoding Initiative (TEI) standard. All label information is represented in XML, so the different linguistic levels of description share a uniform representation. T...
Māori speech data collection and analysis is an ongoing process, as new and existing data sets are c...
Interoperable annotation formats are fundamental to the utility, expansion, and sustainability of co...
Researchers in various fields, from acoustic phonetics to child language development, rely on digiti...
Annotated speech corpora are databases consisting of signal data along with time-aligned symbolic `t...
In speech technology, more and more databases of spoken language are becoming available. For researc...
In this paper, we address two problems in indexing and querying spoken language corpora with overlap...
The goal of the LACITO linguistic archive project is to conserve and to make available for research ...
The paper describes a method of collecting phonetic and linguistic data while maximizing the efficie...
Over the past several decades, research and development of human language technology has be...
This paper describes the setting up of a resource database for research and evaluation in the domain...
Representing annotated spoken corpora. The annotation of linguistic resources has long-standing tradi...
This paper proposes a methodology for querying linguistic data represented in different corpus forma...
Large and open multiparallel corpora are a valuable resource for contrastive corpus linguists if the...
Although automatic analysis and computer-aided annotation tools are being deve...
This thesis presents a novel model for analyzing queries of the users of spoken language systems in ...