To automate the identification and analysis of software use in biomedical research, we developed a SciBERT-based machine learning model that extracts software mentions from scientific articles. The model takes the full text of a scientific article as input and outputs a list of the software mentioned in it. We applied this model to the CORD-19 full-text articles and stored the output in this dataset, which includes metadata for over 77,000 COVID-19 and coronavirus-related papers and a list of the software tools mentioned in each. Notes: Not all papers in the CORD-19 dataset mention software. We include here only the subset of articles for which full text was available and which also had at least one detect...
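The inclusion rule described in the notes (full text available and at least one detected mention) can be illustrated with a minimal sketch. The record layout, the field names `has_full_text` and `software`, and the sample values are assumptions made for illustration; they are not the dataset's actual schema or contents.

```python
from typing import Any, Dict, List

# Hypothetical record layout: the field names below are assumptions for
# illustration, not the dataset's actual column names.
Record = Dict[str, Any]


def select_included(records: List[Record]) -> List[Record]:
    """Keep papers that have full text and at least one detected software mention."""
    return [
        rec for rec in records
        if rec.get("has_full_text") and len(rec.get("software", [])) > 0
    ]


# Made-up sample records, not taken from the dataset.
sample = [
    {"paper_id": "a", "has_full_text": True, "software": ["ImageJ", "R"]},
    {"paper_id": "b", "has_full_text": True, "software": []},
    {"paper_id": "c", "has_full_text": False, "software": ["SPSS"]},
]

included = select_included(sample)
print([rec["paper_id"] for rec in included])
```

Only the first sample record passes both conditions, which mirrors why the released subset is smaller than the full CORD-19 collection.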
Target identification and prioritisation are prominent first steps in modern drug discovery. Traditi...
The aim of this work is to accelerate scientific discovery by advancing machine reading approaches d...
Science is progressive, and every discovery, set of data, and publication builds on previous work. T...
In this paper, we investigate progress toward improved software citation by examining current softwa...
The code accompanying our new dataset of software mentions in biomedical papers (dataset, preprint)....
Softcite software mention extraction from the CORD-19 publications. This dataset is the result of t...
Software and data have become major components of modern research, which is also reflected by an inc...
This is a compressed .sql.gz file of a MySQL database dump. The table contains the automatically ...
The COVID-19 pandemic has resulted in an unprecedented acceleration in scientific production across ...
Despite the popularity of data-driven research in scientific fields, we are intrigued by the combine...
Semantic text annotations have been a key factor for supporting computer applications ranging from k...
Background: Existing biomedical literature repositories do not commonly prov...
Text analysis can help to identify named entities (NEs) of small molecules, proteins, and genes. Suc...
Researchers worldwide are seeking to repurpose existing drugs or discover new drugs to counter the d...
The Softcite dataset is a gold-standard dataset of software mentions in research publications, a fre...