The extraction of chemical information from documents is a demanding task in cheminformatics due to the variety of text and image-based representations of chemistry. The present work describes the extraction of chemical compounds with unique chemical structures from the open access CORE (COnnecting REpositories) and Google Patents full text document repositories. The importance of structure normalization is demonstrated using three open access cheminformatics toolkits: the Chemistry Development Kit (CDK), RDKit and OpenChemLib (OCL). Each toolkit was used for structure parsing, normalization and subsequent substructure searching, using SMILES as structure representations of chemical molecules and International Chemical Identifiers (InChIs) ...
Chemists not only produce a significant amount of data-rich scholarly communication artifacts, but h...
Background: In commercial research and development projects, public disclosure of new chemical compou...
ABSTRACT: The availability of structures and linked bioactivity data in databases is powerfully enab...
Peer reviewed. Extracting PFAS with open source cheminformatics toolkits reveals ~1.78 million PFAS in...
Presentation in the "Poly- and perfluoroalkyl substances (PFASs): Addressing Urgent Questions in th...
A great wealth of chemical information is to be found in the literature. For example, PubMed contain...
The SureChEMBL database provides open access to 17 million chemical entities mentioned in 14 million...
The communication of chemistry-related information occurs both via print and electronic media and...
The automated extraction of semantic chemical data from the existing literature is demonstrated. For...
Patent specifications are one of many information sources needed to progress drug discovery projects...
Background: Wikipedia, the world's largest and most popular encyclopedia, is an indispensable source ...
The discovery of new chemical compounds and their synthesis process is of great importance to the ch...
Abstract: Linked Open Data presents an opportunity to vastly improve the quality of science in all fi...
Efficient access to chemical information contained in scientific literature, patents, technical repo...
In-depth analysis of non-patent literature prior art is a crucial step in checking patentability of ...