During the last decade, the availability of scientific papers in full text and in machine-readable formats has become increasingly widespread, thanks to the growing number of publications on online platforms such as ArXiv, CiteSeer and PLoS. At the same time, research in natural language processing and computational linguistics has provided a number of open source tools for versatile text processing (e.g. NLTK, Mallet, OpenNLP, CoreNLP, Gate, CiteSpace). The rise of Open Access publishing and the standardized formats for the representation of scientific papers (such as NLM-JATS, TEI, DocBook), and the availability of full-text datasets for research experiments and information retrieval corp...