We introduce several novel word features for keyword extraction and headline generation. These new word features are derived according to the background knowledge of a document as supplied by Wikipedia. Given a document, to acquire its background knowledge from Wikipedia, we first generate a query for searching the Wikipedia corpus based on the key facts present in the document. We then use the query to find articles in the Wikipedia corpus that are closely related to the contents of the document. With the Wikipedia search result article set, we extract the inlink, outlink, category and infobox information in each article to derive a set of novel word features which reflect the document's background knowledge. These newly introduced word fe...
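The pipeline described in this abstract (generate a query, retrieve related Wikipedia articles, derive word features from their inlinks, outlinks, categories and infoboxes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' method: a tiny in-memory dictionary (`TOY_CORPUS`) stands in for the Wikipedia corpus, the query is simply the most frequent non-stopword terms, retrieval ranks articles by term overlap, and each feature is a count of retrieved articles whose field mentions the word. The helper names `make_query`, `search` and `word_features` are hypothetical.

```python
import re
from collections import Counter

# Assumption: a tiny stopword list, enough for the toy example only.
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "for", "on", "by"}
FIELDS = ("inlinks", "outlinks", "categories", "infobox")

# Assumption: a hand-made stand-in for the Wikipedia corpus.
TOY_CORPUS = {
    "Jupiter": {
        "text": "Jupiter is the largest planet in the solar system",
        "inlinks": ["Solar System", "Gas giant"],
        "outlinks": ["Planet", "Galilean moons"],
        "categories": ["Planets of the Solar System"],
        "infobox": ["Gas giant"],
    },
    "Planet": {
        "text": "A planet is a large body orbiting a star",
        "inlinks": ["Jupiter", "Earth"],
        "outlinks": ["Star", "Orbit"],
        "categories": ["Planets"],
        "infobox": [],
    },
    "Jazz": {
        "text": "Jazz is a music genre",
        "inlinks": ["Blues"],
        "outlinks": ["Music"],
        "categories": ["Music genres"],
        "infobox": ["Music genre"],
    },
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def make_query(document, size=2):
    """Hypothetical query generator: the `size` most frequent terms."""
    counts = Counter(tokenize(document))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:size]]


def search(corpus, query, top_k=2):
    """Rank articles by how many query terms occur in their text."""
    scored = []
    for title, article in corpus.items():
        score = len(set(query) & set(tokenize(article["text"])))
        if score > 0:
            scored.append((score, title))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


def word_features(document, corpus, titles):
    """For each document word, count retrieved articles whose field mentions it."""
    features = {}
    for word in set(tokenize(document)):
        features[word] = {
            field: sum(
                1
                for title in titles
                if any(word in tokenize(entry) for entry in corpus[title][field])
            )
            for field in FIELDS
        }
    return features


doc = "Jupiter is a planet. The planet Jupiter has many moons."
query = make_query(doc)
titles = search(TOY_CORPUS, query)
features = word_features(doc, TOY_CORPUS, titles)
```

In this toy run, `features["jupiter"]` has a nonzero `inlinks` count because the retrieved "Planet" article lists "Jupiter" among its inlinks, while a word such as "many" gets zeros in every field. A real system would replace the dictionary with an actual Wikipedia index and would likely use weighted rather than raw counts.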
Keyword extraction is vital for Knowledge Management Systems, Information Retrieval Systems, and D...
A journal article is often accompanied by a list of keyphrases, composed of about five to fifteen im...
This research addresses the problem of automatic keyphrase extraction from large documents and back ...
In this paper we present three new methods to extract keywords from web pages using Wikipedia as an...
Automatic keyword extraction is an important subfield of the information extraction process. I...
Keywords have become integral to many Knowledge Management Systems, Information Retrieval Systems, a...
This thesis deals with automatic type extraction in English Wikipedia articles and their attributes....
Extraction of keyphrases from individual documents is a research area in which one tries to extract a ...
The process whereby inferences are made from textual data is broadly referred to as text mining. In ...
Key terms are important terms in the document, which can give a high-level description of c...
Keyphrases describe a document in a coherent and simple way, giving the prospective reader a way to ...
Keyphrases that efficiently summarize a document’s content are used in various document processing a...
Summarization and Keyword Selection are two important tasks in the NLP community. Although both aim to s...
It is a fundamental and important task to extract key phrases from documents. Generally, phrases in ...