Supplementary information for the preprint (Struct2IUPAC: A transformer-based model for chemical names generation), version 2.0.
opsin_mistakes.csv -- structures that OPSIN processed incorrectly.
IUPAC2Struct_mistakes.csv -- structures that our IUPAC2Smiles model processed incorrectly.
test_100000_uniform.csv -- 100 000 structures from the test set.
smiles_lens_1000.zip -- sets of SMILES of different lengths.
IUPAC2Struct-main.zip -- the code from GitHub (the model is in the separate file iupac2smiles_model.pt; place it in the "models" folder).
iupac2smiles_model.pt -- binary dump of the PyTorch IUPAC2Struct model.
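The instruction to place iupac2smiles_model.pt in the "models" folder can be sketched as follows. This is a minimal illustration, not the authors' setup script: the helper name install_model is hypothetical, and it assumes that the archive unpacks into a top-level IUPAC2Struct-main directory and that the "models" folder sits at the root of that directory.

```python
import shutil
import zipfile
from pathlib import Path


def install_model(archive: Path, model_file: Path, dest: Path) -> Path:
    """Unpack the code archive into dest and copy the model file into its models folder.

    install_model is a hypothetical helper written for this example.
    """
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    # Assumption: the archive unpacks into "IUPAC2Struct-main" and the
    # "models" folder belongs at the top level of that directory.
    models_dir = dest / "IUPAC2Struct-main" / "models"
    models_dir.mkdir(parents=True, exist_ok=True)
    target = models_dir / model_file.name
    shutil.copy2(model_file, target)
    return target
```

Calling install_model(Path("IUPAC2Struct-main.zip"), Path("iupac2smiles_model.pt"), Path(".")) would unpack the code into the current directory and return the path of the copied model file. If the repository layout differs from the assumption above, adjust models_dir accordingly.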
Genome sequencing projects have resulted in a rapid increase in the number of known protein sequence...
The structural validation problem using quantum chemistry approaches (confirm or reject a candidate ...
The defined secondary structure of proteins method is often considered the gold standard for assignm...
Supplementary information for the paper (Struct2IUPAC: A transformer-based model for chemical names ...
Supplementary information for the paper: Image2SMILES: Transformer-based Molecular Optical Recogniti...
This is the supplementary data for the manuscript: Image2SMILES: Transformer-based Molecular Optical...
Abstract We developed a Transformer-based artificial neural approach to translate between SMILES and...
Table S1. The 25 sequences of the G Switch Proteins dataset (GSW25). The 12 G A sequences and 13 G B...
Circular dichroism spectroscopy is a quick method for determining the average secondary structures o...
Circular dichroism (CD) spectroscopy is a quick method for measuring data that can be used to determ...
Prediction of protein secondary structures is one of the oldest problems in Bioinformatics. Although...
Motivation: Accurately predicting protein side-chain conformations is an important subproblem of the...
This repository contains a pre-processed dataset derived from the GNPS public repository of natural ...
Dataset 1: List of 5422 native protein structures. The positive dataset (PSN-QA_positive) used ...
Data for the paper "Physics-enhanced neural networks for equation-of-state calculations" There are ...