Procode is a free web tool for automatic coding of occupational data (free texts) that implements Complement Naïve Bayes (CNB) as its machine-learning technique. The paper describes the algorithm, its performance evaluation, and future goals for the tool’s development. Almost 30 000 free texts with manually assigned codes from the French classification of occupations (PCS) and the French classification of activities (NAF) were used to train CNB. A 5-fold cross-validation found that Procode predicts the correct classification code in 57–81% of cases for PCS and 63–83% of cases for NAF. Procode also integrates recoding between two classifications. In the first version of Procode, this operat...
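The abstract above names Complement Naïve Bayes as the classifier behind Procode. The following is a minimal sketch of that technique only, not the tool's actual implementation: the whitespace tokenizer, the smoothing value `alpha`, the class name `ComplementNB`, and the toy job titles with made-up codes `X1` to `X3` are all assumptions introduced for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokens; the tool's real
    # preprocessing is not described in the abstract.
    return text.lower().split()


class ComplementNB:
    """Minimal Complement Naive Bayes for short free-text labels."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing, an assumed default

    def fit(self, texts, codes):
        per_class = defaultdict(Counter)  # term counts inside each class
        total = Counter()                 # term counts over all classes
        for text, code in zip(texts, codes):
            counts = Counter(tokenize(text))
            per_class[code].update(counts)
            total.update(counts)

        vocab = sorted(total)
        grand_total = sum(total.values())
        self.weights = {}
        for code, counts in per_class.items():
            # Complement statistics: everything that is NOT in this class.
            denom = grand_total - sum(counts.values()) + self.alpha * len(vocab)
            self.weights[code] = {
                term: math.log((total[term] - counts[term] + self.alpha) / denom)
                for term in vocab
            }
        return self

    def predict(self, text):
        counts = Counter(tokenize(text))
        scores = {}
        for code, w in self.weights.items():
            # Terms never seen in training are ignored.
            scores[code] = sum(n * w[t] for t, n in counts.items() if t in w)
        # The best class is the one whose complement fits the text worst.
        return min(sorted(scores), key=scores.get)


# Toy data with made-up codes (not real PCS or NAF codes).
texts = ["hospital nurse", "night nurse", "truck driver",
         "delivery driver", "math teacher", "school teacher"]
codes = ["X1", "X1", "X2", "X2", "X3", "X3"]

model = ComplementNB().fit(texts, codes)
print(model.predict("nurse"))
print(model.predict("truck driver"))
```

Scoring against the complement of each class, rather than the class itself, is what distinguishes CNB from standard multinomial Naïve Bayes; it tends to be more stable when some codes have far fewer training texts than others.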
We participated in task 2 of the CLEF eHealth 2016 challenge. Two subtasks were address...
Most surveys use an open-ended question to measure occupation, followed by office coding. This is ex...
Pro-TEXT is a corpus of keystroke logs written in French. Keystroke logs are r...
External cause of injury codes (E-codes) and the Occupational Injury and Illness Classification syst...
Machine learning approaches achieve high accuracy for text recognition and are therefore increasingl...
Currently, most surveys ask for occupation with open-ended questions. The verbatim responses are cod...
This paper describes the Automated Industry and Occupation Coding System (AIOCS). The main fun...
The increasing availability of digitised registration records presents a significant opportunity for...
The encoding of Electronic Medical Records is a complex and time-consuming task. We report on a mach...
We develop a new automatic coding system with a three-grade confidence level corresponding to each o...
This paper describes the architecture of an encoding system whose aim is to be implemented as a codi...
The importance of computer learner corpora for research in both second lang...
In the area of large French speech corpora, there is a demonstrated need for a...