We present MarsaTag, a new system for segmenting, tagging and chunking French input. Beyond its efficiency, the originality of the tool lies in its ability to process both written texts and speech transcriptions. The tagger performs three operations. First, a rule-based tokenizer splits the raw textual input into a sequence of tokens. Second, a broad-coverage morphosyntactic lexicon associates each token form with a tag distribution. Finally, the tagging is disambiguated by selecting the POS tag sequence with the highest probability. The probability of a tag sequence is computed by a stochastic model using the Hidden Markov Model machiner...
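The disambiguation step described above (choosing the most probable tag sequence under a Hidden Markov Model) is commonly done with the Viterbi algorithm. The following is a minimal sketch under stated assumptions, not MarsaTag's actual implementation: the function name `viterbi`, the bigram transition model, the tag set, and every number in `LEXICON` and `TRANSITION` are hypothetical and chosen only for illustration. The lexicon's per-token tag distribution is used here as the per-token weight of each candidate tag.

```python
import math

# Hypothetical toy lexicon: each token form maps to a distribution over tags.
LEXICON = {
    "la": {"DET": 0.9, "PRO": 0.1},
    "porte": {"NOUN": 0.6, "VERB": 0.4},
}

# Hypothetical bigram transition probabilities P(tag | previous tag).
TRANSITION = {
    "<s>": {"DET": 0.6, "PRO": 0.4},
    "DET": {"NOUN": 0.9, "VERB": 0.1},
    "PRO": {"NOUN": 0.1, "VERB": 0.9},
}


def viterbi(tokens, lexicon, transition, start="<s>"):
    """Return the highest-scoring tag sequence for a non-empty token list."""
    # best[i][tag] = (log score of the best path ending in tag at i, previous tag)
    best = [{
        tag: (math.log(transition[start][tag]) + math.log(weight), None)
        for tag, weight in lexicon[tokens[0]].items()
    }]
    for i in range(1, len(tokens)):
        column = {}
        for tag, weight in lexicon[tokens[i]].items():
            # Pick the predecessor tag that maximizes the path score into `tag`.
            score, prev = max(
                (best[i - 1][p][0] + math.log(transition[p][tag]), p)
                for p in best[i - 1]
            )
            column[tag] = (score + math.log(weight), prev)
        best.append(column)

    # Backtrace from the best final tag to recover the full sequence.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(tokens) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


tags = viterbi(["la", "porte"], LEXICON, TRANSITION)
print(tags)
```

With these toy numbers, "la porte" is tagged as a determiner followed by a noun, because the DET to NOUN transition outweighs the alternatives. Working in log space avoids numerical underflow on long inputs; a real system would also need smoothing for unseen tokens and transitions, which this sketch omits.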
Annotated corpora are widely employed in a variety of fields such as linguistics, translation studie...
We present DisMo, a multi-level annotator for spoken language corpora that int...
The aim of our paper is to study the interest of part of speech (POS) tagging to improve s...
Tools for textual data enrichment (written text and speech transcription): tokenizer, morphosyntact...
The explicit introduction of morphosyntactic information into statistical machine translation approa...
This paper describes the process and the resources used to automatically annotate a French corpus of...
This paper presents a new part-of-speech tagger that takes into account both linguistic knowledge and...