Methods to automatically extract Open Data from the chemical literature, validate it, and use it to validate theory are examined. Chemical identifiers which assist the automatic location of chemical structures using commercial Web search engines are investigated. The IUPAC International Chemical Identifier (InChI) gives almost 100% recall and precision, though it is shown to be too long for present search engines. A combination of InChI and InChIKey, a shorter, fixed-length hash of the InChI string, is concluded to be the best current method of identifying structures. The proportion of published, Open Crystallographic Information Files (CIFs) that are valid with respect to the specification is shown to be improving, and is around 99% in 2007. T...
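The idea of pairing a long InChI with its fixed-length InChIKey when searching can be illustrated with a minimal Python sketch. It is not the method used in the thesis. It assumes the current standard InChIKey layout (27 characters: 14 letters, a hyphen, 10 letters, a hyphen, 1 letter), which may differ from the key format available in 2007. The helper name `search_terms` and the `max_len` cutoff are hypothetical and stand in for whatever query-length limit a given search engine imposes.

```python
import re

# Assumed layout of a standard InChIKey: 14 letters, hyphen, 10 letters, hyphen, 1 letter.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def is_inchikey(text):
    """Return True if text has the shape of a standard InChIKey (format check only)."""
    return INCHIKEY_RE.fullmatch(text) is not None


def search_terms(inchi, inchikey, max_len=128):
    """Hypothetical helper: build quoted search terms for a structure.

    The fixed-length key is always included; the full InChI is added only
    when it is no longer than max_len (an illustrative cutoff, not a value
    taken from the text).
    """
    if not is_inchikey(inchikey):
        raise ValueError("not a well-formed InChIKey: " + repr(inchikey))
    terms = ['"' + inchikey + '"']
    if len(inchi) <= max_len:
        terms.append('"' + inchi + '"')
    return terms


# Ethanol as a small worked example.
ethanol_inchi = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"
ethanol_key = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"
print(search_terms(ethanol_inchi, ethanol_key))
```

The format check only confirms that a string looks like a key; it does not verify that the key was actually derived from the given InChI, which would require the InChI software itself.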
Computer descriptions of chemical molecular connectivity are necessary for searching chemic...
Linked Open Data presents an opportunity to vastly improve the quality of science in all fi...
Computers are increasingly being used to manage the deluge of experimental and computational data th...
Thesis: Methods to automatically extract and validate data from the chemical literature in legacy form...
The communication of chemistry-related information occurs both via print and electronic media and...
As the major medium, and often the only source, for chemical information, the Internet provides both challenge...
Cheminformatics methods form an essential basis for providing analytical scientists with access t...
A freely available small-molecule structure database, the Crystallography Open Database (COD), is us...
The problem: Vast quantities of chemical data (e.g. crystal structures, NMR spectra, experimental re...
Crystallographic information provides the fundamental basis for understanding the properties and beh...
The task of finding chemical information online can be daunting, since even the most rudimentary que...
This dissertation describes fully automated means to extract geometric information – interatomic bon...
The automated extraction of semantic chemical data from the existing literature is demonstrated. For...
The prediction of macroscopic chemical properties is recognised as a valuable ability. Statistical ...