Thesis: Methods to automatically extract data from legacy formats in the chemical literature, validate it, and convert it to machine-understandable forms are examined. The work focuses on three types of data: analytical data reported in articles, computational chemistry output files, and crystallographic information files (CIFs). It is shown that machines are capable of reading and extracting analytical data from the current legacy formats with high recall and precision. Regular expressions cannot identify chemical names with high precision or recall, but non-deterministic methods perform significantly better. The lack of machine-understandable connection tables in the literature has been identified as the major issue preventing molecule-based data-driven scie...
The automated extraction of semantic chemical data from the existing literature is demonstrated. For...
Single crystal X-ray crystallography has developed into a unique, highly automated and accessible to...
Storing chemical information in a computer is not a trivial task. Many different approaches and form...
Methods to automatically extract Open Data from the chemical literature, validate it, and use it to ...
A freely available small-molecule structure database, the Crystallography Open Database (COD), is us...
This dissertation describes fully automated means to extract geometric information – interatomic bon...
Knowledge about the 3-dimensional structure, orientation and interaction of chemical compou...
Background: To search for chemical structures in research articles, diagrams or text representing mo...
The communication of chemistry-related information occurs both via print and electronic media and...
The field of Chemoinformatics has enabled QSAR/QSPR predictive models useful for the rapid virtual a...
Chemists not only produce a significant amount of data-rich scholarly communication artifacts, but h...
The crystallographically determined bond length, valence angle, and torsion angle information in the...
Recently the computer graphics systems and memory capabilities necessary to perform detailed chemica...
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Compute...