We introduce WikiDoMiner, a tool for automatically generating domain-specific corpora by crawling Wikipedia. WikiDoMiner helps requirements engineers create an external knowledge resource that is specific to the underlying domain of a given requirements specification (RS). Being able to build such a resource is important because domain-specific datasets are scarce. WikiDoMiner generates a corpus by first extracting a set of domain-specific keywords from a given RS and then querying Wikipedia for these keywords. The output of WikiDoMiner is a set of Wikipedia articles relevant to the domain of the input RS. Mining Wikipedia for domain-specific knowledge can benefit multiple requirements engineering tasks, e.g., ambiguity handling, ...
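The two steps described above (keyword extraction from an RS, then querying Wikipedia) can be illustrated with a minimal sketch. This is not WikiDoMiner's implementation: the frequency-based ranking, the tiny stopword list, and the function names `extract_keywords` and `wikipedia_search_url` are assumptions made for illustration only. The sketch builds search URLs for the public MediaWiki API but sends no requests.

```python
import re
from collections import Counter
from urllib.parse import urlencode

# Tiny illustrative stopword list (assumption; a real pipeline would use a fuller one).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "shall", "be", "is",
             "in", "for", "with", "that", "each", "on", "by", "from"}

def extract_keywords(rs_text, top_k=5):
    """Rank words by raw frequency; a simple stand-in for real keyword extraction."""
    tokens = re.findall(r"[a-z][a-z-]+", rs_text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]

def wikipedia_search_url(keyword, limit=5):
    """Build a MediaWiki search-API URL for one keyword (no request is sent)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": keyword,
        "srlimit": limit,
        "format": "json",
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)

# Hypothetical two-sentence requirements specification.
rs = ("The satellite shall transmit telemetry to the ground station. "
      "The ground station shall log each telemetry frame received from the satellite.")

keywords = extract_keywords(rs, top_k=3)
urls = [wikipedia_search_url(k) for k in keywords]
```

Fetching each URL and collecting the returned article titles would yield a candidate corpus. How the actual tool ranks keywords or selects articles is not stated in the abstract, so those choices are left open here.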
Domain terms are a useful resource for tuning both resources and NLP processors to domain specific t...
Domain-specific thesauri are high-cost, high-maintenance, high-value knowledge structures. We show h...
There are many opportunities to improve the interactivity of information retrieval systems beyond th...
The online encyclopedia Wikipedia is a vast, constantly evolving tapestry of interlinked articles. F...
Wikipedia is a goldmine of information; not just for its many readers, but also for the growing comm...
We present a simple but effective method of automatically extracting domain-specific terms using Wik...
Wikipedia is not only a large encyclopedia, but lately also a source of linguistic data for various ...
This paper focuses on the central role played by lexical information in the task of Recognizing Text...