This article describes an unsupervised machine learning method for computing distributed vector representations of molecular fragments. These vectors encode fragment features in a continuous high-dimensional space and enable similarity computation between individual fragments, even for small fragments with only two heavy atoms. The method is based on a word embedding algorithm borrowed from the field of natural language processing, and approximately 6 million unlabeled PubChem chemicals were used for training. The resulting dense fragment vectors contrast with the traditional sparse “one-hot” fragment representation and capture rich relational structure in the fragment space. The vectors of small linear fragments were averaged to yield distrib...
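The two operations the abstract mentions, comparing individual fragment vectors and averaging the vectors of small fragments, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the fragment keys, the three-dimensional toy vectors, and the helper names `cosine_similarity` and `average_vector` are hypothetical and do not come from the article. Cosine similarity is also an assumed choice of measure. Vectors from a trained embedding model would have many more dimensions.

```python
import math

# Hypothetical toy embeddings: made-up 3-D vectors keyed by fragment strings.
# These are NOT trained values; they only illustrate the arithmetic.
FRAGMENT_VECTORS = {
    "CC": [0.9, 0.1, 0.0],
    "CO": [0.2, 0.8, 0.1],
    "CN": [0.3, 0.7, 0.2],
    "C=O": [0.1, 0.6, 0.7],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def average_vector(fragments, table):
    """Element-wise mean of the vectors of the given fragments.

    Fragments missing from the table are skipped; an error is raised
    if none of them has a vector.
    """
    vectors = [table[f] for f in fragments if f in table]
    if not vectors:
        raise ValueError("no known fragments to average")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


if __name__ == "__main__":
    # Similarity between two individual fragments.
    print(cosine_similarity(FRAGMENT_VECTORS["CO"], FRAGMENT_VECTORS["CN"]))
    # Averaged vector for a bag of small fragments.
    print(average_vector(["CC", "CO", "C=O"], FRAGMENT_VECTORS))
```

Unlike a sparse one-hot encoding, where any two distinct fragments are orthogonal and therefore have zero cosine similarity, dense vectors allow graded similarity scores between fragments.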
The main task of drug discovery is to find novel bioactive molecules, i.e., chemical compounds that,...
Molecular dynamics (MD) simulations present a data-mining challenge, given that they can generate a ...
Structural libraries of fragments are commonly used to model or design the 3D ...
1. Approximately 3.7 million (3,700,103) PubChem compounds were used in training the fragment ...
The chosen molecular representation is one of the key parameters of virtual screening campaigns where on...
The ensemble of conceivable molecules is referred to as the Chemical Space. In this article we descr...
In structural biology, many fragment-based 3D modeling methods require fragmen...
This paper discusses methods to discover frequent, discriminative connected subgraphs (fra...
Molecule generation is a challenging open problem in cheminformatics. Currently, deep generative app...
We present an algorithm to find fragments in a set of molecules that help to discriminate between di...
In real-world applications sequential algorithms of data mining and data exploration are often unsu...
In real world applications sequential algorithms of data mining and data exploration are often unsui...
A new algorithm for divisive hierarchical clustering of chemical compounds based on 2D structural fr...