
Molecular representation

Based on: Chapter 3 of An Introduction to Chemoinformatics, by A. Leach & V. Gillet, and
“Molecular Descriptors”, by V. Consonni & R. Todeschini
Brandon Meza-González

September 4, 2018

To obtain and manipulate information about molecules, chemists have developed the concept of
the molecular descriptor. Descriptors are numerical values used to characterize specific
properties of molecules, and they can be obtained by different methods. They can be simple (e.g.
a calculation of the molecular mass) or more elaborate, based on quantum-mechanical approximations.
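
As an illustration of the simplest kind of descriptor, the molecular mass can be sketched as a sum of atomic masses. The mass table below is deliberately tiny, and the names `ATOMIC_MASS` and `molecular_mass`, as well as the dictionary format for the composition, are assumptions made for this example only.

```python
# Average atomic masses (g/mol), rounded; only a few elements are listed here.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_mass(composition):
    """Sum atomic masses over an element -> atom-count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# Ethanol, C2H6O, written as a hypothetical composition dictionary.
ethanol = {"C": 2, "H": 6, "O": 1}
ethanol_mass = molecular_mass(ethanol)
print(f"{ethanol_mass:.3f}")  # 46.069
```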

Descriptors based on 2D representations. There are elementary descriptors such as the number
of hydrogen-bond donors/acceptors, rotatable bonds and so on, which are often combined with
other descriptors.
Hydrophobicity, solubility and molar refractivity are physicochemical descriptors that can be
obtained through atomic contributions. There are also structural descriptors such as topological
indices, which characterize structures by branching, size and overall shape. Some of the most
widely used are the χ molecular connectivity indices, the κ shape indices and the
electrotopological state indices. Furthermore, descriptors such as atom pairs and topological
torsions record the presence or absence of a particular atom pair or topological torsion in the
molecule, whereas BCUT descriptors are used to capture atomic properties relevant to
intermolecular interactions.
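
To make the idea of a topological index concrete, here is a minimal sketch of a first-order connectivity index computed on a hydrogen-suppressed graph. It uses plain vertex degrees (the form usually attributed to Randić), not the valence-corrected variants; the bond-list input format and the function name `connectivity_index` are assumptions for this example.

```python
from math import sqrt


def connectivity_index(bonds):
    """First-order connectivity index: sum over bonds of 1/sqrt(d_i * d_j)."""
    degree = {}
    for a, b in bonds:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    return sum(1.0 / sqrt(degree[a] * degree[b]) for a, b in bonds)


# Hydrogen-suppressed carbon skeletons, atoms numbered arbitrarily.
n_butane = [(0, 1), (1, 2), (2, 3)]
isobutane = [(0, 1), (0, 2), (0, 3)]

print(round(connectivity_index(n_butane), 3))   # 1.914
print(round(connectivity_index(isobutane), 3))  # 1.732
```

The two isomers have the same size but different indices, which shows how the index responds to branching.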
On the other hand, 2D fingerprints are descriptors developed to improve the performance of
substructure searching algorithms, but they are also used because of their relationship with
other molecular properties.
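
The way a fingerprint speeds up substructure searching can be sketched with a bit-string screen: a molecule can contain the query only if every bit set in the query is also set in the molecule. The fragment dictionary `FRAGMENT_BITS` and the hand-assigned fragment lists are hypothetical; a real system would derive the bits from the structure itself.

```python
# Hypothetical fragment dictionary: each structural key owns one bit position.
FRAGMENT_BITS = {"C=O": 0, "O-H": 1, "aromatic ring": 2, "N-H": 3, "C-Cl": 4}


def fingerprint(fragments):
    """Encode a collection of fragment names as an integer bit string."""
    bits = 0
    for name in fragments:
        bits |= 1 << FRAGMENT_BITS[name]
    return bits


def passes_screen(query_fp, molecule_fp):
    """True if every bit set in the query is also set in the molecule."""
    return (query_fp & molecule_fp) == query_fp


# Fragment lists assigned by hand for illustration only.
query = fingerprint(["C=O", "O-H"])
benzoic_acid = fingerprint(["C=O", "O-H", "aromatic ring"])
aniline = fingerprint(["aromatic ring", "N-H"])

print(passes_screen(query, benzoic_acid))  # True
print(passes_screen(query, aniline))       # False
```

The screen only eliminates molecules; those that pass still need a full atom-by-atom match to confirm the substructure.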

Descriptors based on 3D representations. Although generating 3D structures can be
computationally time-consuming, it is important to develop 3D descriptors because real molecular
conformations are three-dimensional. For this reason, 3D fragment screens are useful for 3D
substructure searching. Pharmacophore keys are an extension of these screens and are crucial for
understanding receptor binding. They are widely used with multiple feature types: donor,
acceptor, acid, base, aromatic centre and hydrophobic group. Moreover, many 3D descriptors have
2D counterparts.
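
A rough sketch of how three-point pharmacophore keys might be built from one conformation: every triplet of features is described by the feature types at its corners and the binned distances between them. The bin width, the `(type, coordinates)` input format and the coordinates themselves are assumptions for illustration, not values from the sources.

```python
from itertools import combinations
from math import dist

BIN_WIDTH = 2.0  # assumed distance bin width, in angstroms


def pharmacophore_keys(features):
    """Return the set of three-point keys found in one conformation.

    Each key is the sorted list of triangle edges, where an edge is
    (sorted pair of feature types, distance bin).
    """
    keys = set()
    for triplet in combinations(features, 3):
        edges = []
        for (type_a, xyz_a), (type_b, xyz_b) in combinations(triplet, 2):
            distance_bin = int(dist(xyz_a, xyz_b) // BIN_WIDTH)
            edges.append((tuple(sorted((type_a, type_b))), distance_bin))
        keys.add(tuple(sorted(edges)))
    return keys


# Hypothetical feature points (type, xyz in angstroms) for a single conformation.
features = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("aromatic", (0.0, 4.5, 0.0)),
    ("hydrophobic", (0.0, 0.0, 5.0)),
]
keys = pharmacophore_keys(features)
print(len(keys))  # 4 triplets from 4 features
```

Because the keys depend only on distances, they do not change when the conformation is translated or rotated, but they do change from one conformation to another.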

Data verification and manipulation. It is also important to make sure that the descriptors to
be used are good enough for a molecular analysis. Several techniques help with this, for
instance examining the data distribution, scaling (so that descriptors can be compared on an
equal footing) and checking correlations. Another important method, used to reduce the number of
variables that describe the objects, is Principal Components Analysis (PCA). This method
introduces new variables (principal components) that are useful for describing the variation in
the data. Principal components are linear combinations of the original variables.
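
The scaling, correlation and PCA steps can be sketched together in plain Python. The descriptor values are made-up toy numbers, the function names are hypothetical, and power iteration is used here only as a simple way to obtain the first principal component; it is not necessarily the procedure described in the sources.

```python
from math import sqrt
from statistics import mean, stdev


def autoscale(column):
    """Scale one descriptor to zero mean and unit standard deviation."""
    mu, sigma = mean(column), stdev(column)
    return [(value - mu) / sigma for value in column]


def correlation_matrix(scaled_columns):
    """Pearson correlations; for autoscaled data this is also the covariance matrix."""
    n = len(scaled_columns[0])
    return [[sum(a * b for a, b in zip(col_i, col_j)) / (n - 1)
             for col_j in scaled_columns] for col_i in scaled_columns]


def first_principal_component(matrix, iterations=200):
    """Power iteration: dominant eigenvector (loadings) and eigenvalue (variance)."""
    # Asymmetric start vector, to avoid starting orthogonal to the answer.
    vector = [1.0 / (i + 1) for i in range(len(matrix))]
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
        norm = sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    eigenvalue = sum(v * sum(m * w for m, w in zip(row, vector))
                     for v, row in zip(vector, matrix))
    return vector, eigenvalue


# Toy values of three descriptors for five molecules (made up for illustration).
descriptors = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 7.8, 10.1],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
scaled = [autoscale(column) for column in descriptors]
corr = correlation_matrix(scaled)
loadings, variance = first_principal_component(corr)

# Scores: each molecule's value on the first principal component,
# i.e. a linear combination of its scaled descriptors.
scores = [sum(w * column[i] for w, column in zip(loadings, scaled))
          for i in range(len(scaled[0]))]
print(loadings, variance / len(corr))  # loadings and fraction of variance explained
```

The first two toy descriptors are strongly correlated, so a single component captures most of the variance, which is the sense in which PCA reduces the number of variables.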
