Spoken corpora have traditionally been assembled through careful recording and transcription of discourse events, a process that is both labour-intensive and often restrictive in the breadth of recording contexts available. To overcome these challenges in spoken corpus compilation, we explore crowdsourcing language samples reported by participants. We investigate the precision and recall of the ‘crowd’ when reporting language they have heard in certain contexts, alongside the use of a crowdsourcing toolkit to facilitate this task. As a focussing device for the selection of reported language samples, we draw on the use of formulaic phrases as an area that has received considerable a...
A corpus is a collection of authentic, non-elicited texts selected and assembled to study lan...
This talk reports on the compilation of the new London–Lund Corpus (LLC–2) – a corpus of contemporary...
Augmented and alternative communication (AAC) devices enable users with certain communication disabi...
Corpora have revolutionised the way we describe and analyse language in use. The sheer scale of coll...
Text corpora represent the foundation on which most natural language processin...
This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthog...
We explore the use of crowdsourcing to generate natural language in spoken dialogue systems. We int...
Crowdsourcing can be defined as the purchase of data (labels, speech recordings, etc.), usually on l...
We investigate algorithms and tools for the semi-automatic authoring of grammars for ...
Most previous work on trainable language generation has focused on two paradigms: (a) using a statis...
This paper introduces the Spoken British National Corpus 2014, an 11-million-word corpus of orthogra...
This paper demonstrates the use of crowdsourcing to accumulate ratings from naïve listeners as a m...
Statistical language modelling may not only be used to uncover the patterns which underlie the compo...