Biomedical terms extracted with Word2vec, one of the most widely used word embedding models in recent years, serve as a foundation for natural language processing (NLP) applications such as biomedical information retrieval, relation extraction, and recommendation systems. This study examines how the ratio of biomedical-domain to general-domain data in the corpus affects the extraction of similar biomedical terms with Word2vec. We downloaded the abstracts of 214,892 articles from PubMed Central (PMC) and the 3.9 GB Billion Word (BW) benchmark corpus from the computer science community. The datasets were preprocessed and grouped into 11 corpora by the ratio of BW to PMC, ranging from 0:10 to 10:0, and th...
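The grouping into 11 corpora by BW:PMC ratio can be illustrated with a minimal sketch. The abstract does not say how the mixing was done, so the details here are assumptions: the function name `build_ratio_corpora`, mixing at the sentence level, a fixed total corpus size, and the toy sentences are all hypothetical and are not taken from the study.

```python
import random


def build_ratio_corpora(general, biomedical, size, parts=10, seed=0):
    """Return {(g, b): sentences} for every ratio g:b with g + b == parts.

    Assumption: mixing happens per sentence and every corpus has the same
    total size, so only the general:biomedical proportion varies.
    """
    if size % parts != 0:
        raise ValueError("size must be divisible by parts")
    if len(general) < size or len(biomedical) < size:
        raise ValueError("each source needs at least `size` sentences")

    unit = size // parts
    rng = random.Random(seed)

    # Shuffle copies once so every ratio draws from the same ordering.
    g_pool = list(general)
    b_pool = list(biomedical)
    rng.shuffle(g_pool)
    rng.shuffle(b_pool)

    corpora = {}
    for g in range(parts + 1):
        b = parts - g
        mixed = g_pool[: g * unit] + b_pool[: b * unit]
        rng.shuffle(mixed)  # interleave the two sources
        corpora[(g, b)] = mixed
    return corpora


# Toy stand-ins for the BW (general) and PMC (biomedical) sentences.
bw = [f"general sentence {i}" for i in range(20)]
pmc = [f"biomedical sentence {i}" for i in range(20)]

# Keys are (BW share, PMC share), from (0, 10) to (10, 0).
corpora = build_ratio_corpora(bw, pmc, size=20)
```

Each of the 11 resulting corpora could then be passed to a Word2vec trainer, so that the similar terms retrieved for the same biomedical query word can be compared across ratios.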
The Big Data paradigm is driving both research and industry efforts, calling for new approaches in many co...
BACKGROUND: Word representations support a variety of Natural Language Processing (NLP) task...
BACKGROUND: Understanding semantic relatedness and similarity between biomedical terms has a...
Due to the recent advances in unsupervised language processing methods, it’s now possible to use lar...
Biomedical and life science literature is an essential way to publish experimental results. With the...
Discovering links and relationships is one of the main challenges in biomedical research, as scienti...
The massive growth of biomedical text makes it very challenging for researchers to review all releva...
Motivation: The sheer volume of textually described biomedical knowledge exerts the need for natural...
Evaluation Dataset: Samples/words in Bio-SimVerb (verbs) and Bio-SimLex (nouns) are collected from a...
BACKGROUND: Word representations support a variety of Natural Language Processing (NLP) tasks. The q...
BACKGROUND: We introduce the linguistic annotation of a corpus of 97 full-text biomedical publicatio...
Comprehensive terminology is essential for a community to describe, exchange, and retrieve data. In ...
Biomedical data exists in the form of journal articles, research studies, electronic health records,...
BACKGROUND: Due to the rapidly expanding body of biomedical literature, biologists require i...