A Perl programme to tokenise texts in Occitan. It is adapted from the Perl programme for tokenising French texts by Tanguy and Hathout (2007), in its extended version (that is, with a list of exceptions). To launch the programme, run the following command:

perl segmenteur_occitan.pl exceptions_occitan.txt output

This tool was developed in the context of the RESTAURE project, funded by the French ANR.
Part-Of-Speech (POS) tagging, including tokenization and sentence splitting, is the first step in al...
The RESTAURE project (2015-2018) aimed at providing computational resources a...
This paper presents recent contributions to the creation of NLP tools and resources for Occitan. Sev...
A python module to tokenise texts in the Alsatian dialects. See the module header for help on how to...
Companion site: http://perl.linguistes.free.fr Aimed at linguists who wish to work...
These guidelines were produced in the context of the RESTAURE project, funded by the French ANR. Th...
This software is developed for the tokenisation of Picard texts, e.g. splitting sentences into words...
This corpus contains a collection of texts in Occitan which were manually annotated with parts-of-sp...
The French Perl Workshop (Journées Francophones de Perl - FPW2006). Oral presentation. As part of ...
Proceedings to appear. Language documentation, as defined by Himmelmann (2006), ...
The RESTAURE project (2015-2018) aimed at providing digital resources and natu...
http://www.univ-nancy2.fr/pers/namer/Telecharger_Flemm.htm FLEMMv3.1 is a set of Perl5 modules...
The Pôle d’Elaboration de Ressources Linguistiques (PERL) is a shared service...
DLC audiovisual corpus (Diversité Culturelle et Linguistique) 1) Xavier BACH - Author, lecturer...