This dissertation proposes a set of procedures for the computational processing of Portuguese. Five tasks are covered: Sentence Segmentation, Tokenization, Part-of-Speech Tagging, Nominal Featurization and Nominal Lemmatization. These are some of the initial steps producing linguistic information (such as POS categories or lemmas) that is important to most subsequent processing (e.g. syntactic and semantic analysis). I follow a shallow processing approach, where linguistic information is associated with text based on local information (i.e. using the word itself or perhaps a limited window of context containing just a few words). I begin by identifying and describing the key problems raised by each task, with special focus on the problems t...
The paper discusses, on the lexical level, the integration of heuristic solutions into a lexicon bas...
To produce fast, reasonably intelligible and easily correctable translations between related languag...
This document presents the implementation of LXGram, in its version A.4.1. LXGram is a grammar for t...
Master's thesis in Informatics, presented to the Universidade de Lisboa, through the Faculdade de Ciênc...
This paper presents the TagShare project and the linguistic resources and tools for the shallow proc...
Although lemmatization is a very common subtask in many natural language processing tasks, there is ...
As in many other natural language processing (NLP) fields, the use of statistical methods is now par...
Python has a growing community of users, especially in the AI and ML fields. Yet, Computational Proc...
The paper describes Portuguese large-scale linguistic resources, mainly computational lexicons and g...
In the implementation of a surface realisation engine, many of the computational techniques seen in ...
This paper presents on-going research on the building of an electronic dictionary of frozen sentence...
Using standard methods and formats established at LADL, and adopted by several European research tea...
To produce fast, reasonably intelligible and easily corrected translations between related languages...