Structured knowledge representations are becoming central to information science. Search engine companies have said that constructing an entity graph is the key to classifying their enormous corpus of documents in order to provide more relevant results to their users. Our work presents WikiLabel, a novel approach to text classification using ontological knowledge. We match a document's terms to Wikipedia entities and use, among other measures, the shortest-path distance from each entity to a given Wikipedia category to determine which label should be associated with the document. In the second part of our work, we use the obtained labels to train a supervised machine learning text classification algorithm, an approach ...
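The shortest-distance labelling step described in the WikiLabel abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy graph, the names `TOY_GRAPH`, `shortest_distance` and `label_document`, and the fixed penalty for unreachable categories are hypothetical and are not taken from the paper, which also combines the distance with other measures not shown here.

```python
from collections import deque

# Hypothetical toy graph: node -> parent categories (not real Wikipedia data).
TOY_GRAPH = {
    "Guitar": ["String instruments"],
    "String instruments": ["Musical instruments"],
    "Musical instruments": ["Music"],
    "Concert": ["Music"],
    "Goalkeeper": ["Football positions"],
    "Football positions": ["Football"],
    "Football": ["Sports"],
}


def shortest_distance(graph, source, target):
    """Breadth-first search over parent links.

    Returns the number of edges on a shortest path from source to target,
    or None if target cannot be reached.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for parent in graph.get(node, []):
            if parent == target:
                return dist + 1
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, dist + 1))
    return None


def label_document(graph, entities, candidate_labels, unreachable_penalty=10):
    """Return the candidate label with the smallest total distance.

    Entities that cannot reach a label contribute unreachable_penalty,
    an assumed constant chosen only for this sketch.
    """
    def total_distance(label):
        total = 0
        for entity in entities:
            dist = shortest_distance(graph, entity, label)
            total += unreachable_penalty if dist is None else dist
        return total

    return min(candidate_labels, key=total_distance)


# Entities assumed to have been matched from a document's terms already.
doc_entities = ["Guitar", "Concert", "Goalkeeper"]
chosen = label_document(TOY_GRAPH, doc_entities, ["Music", "Sports"])
print(chosen)
```

In this toy setting two of the three entities lie close to the "Music" category, so that label receives the lowest total distance. Labels obtained this way could then serve as training targets for a supervised classifier, as the abstract outlines for the second part of the work.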
This thesis focuses on the design of algorithms for the extraction of knowledge (in terms of entitie...
Extracting and disambiguating entities and concepts is a crucial step toward understanding natural l...
This paper describes the usage of machine learning techniques to assign keywords to documents. The l...
When humans approach the task of text categorization, they interpret the specific wording of the doc...
Document classification is a key task for many text mining applications. However, traditional text ...
Wikipedia is a goldmine of information. Each article describes a single concept, and together they c...
The exponential growth of text documents available on the Internet has created an urgent need for ac...
The Web has evolved into a huge mine of knowledge carved in different forms, the predominant one sti...
Supervised learning is a popular approach to text classification among the research community as wel...
Due to the long duration required to perform manual knowledge entry by human knowledge engineers it ...
Because of the explosion of digital and online text information, automatic organization of documents...
The quality and maintainability of a knowledge graph are determined by the process in which it is cr...
Text categorisation is challenging, due to the complex structure with heterogeneous, changing topics...
© 2017 Elsevier Inc. A traditional classification approach based on keyword matching represents each...
As free online encyclopedias with massive volumes of content, Wikipedia and Wikidata are key to many...