Stereocentres with only three ligands: case study
It is almost a rule to ignore hydrogen atoms when drawing the structural
diagrams of compounds (and later when storing them in a supported format). The
hydrogen atoms are said to be implicit for such compounds. If the atom with
implicit hydrogen(s) is an asymmetric centre (which is very frequently the
case), additional complexity results for models based on the "fixed
Z-coordinate". Assignment of ligands for such centres can be a source of
additional errors. The situation gets even more complicated if one has to handle
trivalent nitrogen whose asymmetry is caused by the lone pair of electrons.
None of these problems affects the determinant algorithm. The method is not
based on pre-determined fixed values of the Z-coordinate, but strictly analyses
the signs of the summands in (3). For the implicit hydrogen and nitrogen cases
it is assumed that the coordinates of the missing ligand(s) are identical with
those of the central stereocentre atom, and the virtual ligand automatically
gets the lowest d rank. It can be proven mathematically that such a virtual
hydrogen can change the absolute value of the determinant (3) but does not
influence its sign, and thus has no bearing on the stereodescriptor assignment.
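The sign argument can be illustrated with a small numerical sketch. Equation (3)
is not reproduced in this section, so the code below is only an assumption-laden
stand-in: it uses the generic 4x4 determinant of rows (x, y, z, 1) on explicit
3D coordinates of an ideal tetrahedron, and the names det3 and signed_volume are
illustrative rather than taken from the paper.

```python
def det3(r0, r1, r2):
    """Determinant of the 3x3 matrix with rows r0, r1, r2."""
    return (r0[0] * (r1[1] * r2[2] - r1[2] * r2[1])
            - r0[1] * (r1[0] * r2[2] - r1[2] * r2[0])
            + r0[2] * (r1[0] * r2[1] - r1[1] * r2[0]))


def sub(p, q):
    """Component-wise difference p - q."""
    return tuple(pi - qi for pi, qi in zip(p, q))


def signed_volume(a, b, c, d):
    """4x4 determinant of the rows (x, y, z, 1) for a, b, c, d.

    Subtracting the last row from the others reduces it to a 3x3
    determinant; its absolute value is six times the tetrahedron volume.
    """
    return det3(sub(a, d), sub(b, d), sub(c, d))


# Assumed example geometry: three explicit ligands of an ideal tetrahedron
# centred at the origin, ranked a > b > c.
a, b, c = (1, -1, -1), (-1, 1, -1), (-1, -1, 1)
centre = (0, 0, 0)
real_h = (1, 1, 1)          # where the fourth ligand would actually sit

with_real_h = signed_volume(a, b, c, real_h)
with_virtual_h = signed_volume(a, b, c, centre)  # virtual ligand at the centre

print(with_real_h, with_virtual_h)  # -16 -4
```

In this example the absolute value changes from 16 to 4 when the hydrogen is
moved onto the central atom, while the sign stays negative, because the centre
and the real fourth ligand lie on the same side of the plane through a, b and c.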
Stereocentres with four ligands: case study
A central atom with 4 ligands corresponds to 5 points in space, while only 4
points are necessary to define a 3D space. The additional information can be
used for testing the unambiguity of the coding of the stereocentre. For a
central atom within an environment of ligands corresponding to a parallel
projection of a tetrahedral system, the result of the determinant algorithm is
independent of the lengths of the bonds (those drawn in the structural
diagram!). However, for sketches of stereocentres (and their neighbouring atoms)
which correspond to a pyramidal tetrahedron, the deformation of the graphic
representation can be so extensive that the central atom may be determined to
lie outside the walls of the tetrahedron delimited by the ligands. In such cases
the structure can be marked as ambiguous or, which should be tried first, the
lengths of the bonds should be "normalized".
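The text does not specify how the bond lengths are normalized. One plausible
reading, sketched below purely as an assumption, is to rescale every drawn bond
of the stereocentre to a common length while keeping its direction in the 2D
diagram. The function name normalize_bonds and its parameters are hypothetical.

```python
import math


def normalize_bonds(centre, ligands, length=1.0):
    """Rescale each drawn bond to the same length, keeping its direction.

    centre  -- (x, y) of the central atom in the structural diagram
    ligands -- list of (x, y) ligand positions
    """
    result = []
    for x, y in ligands:
        dx, dy = x - centre[0], y - centre[1]
        r = math.hypot(dx, dy)
        if r == 0.0:
            # A ligand drawn on top of the centre (e.g. a virtual hydrogen)
            # has no direction, so it is left where it is.
            result.append((x, y))
        else:
            result.append((centre[0] + length * dx / r,
                           centre[1] + length * dy / r))
    return result


# A distorted sketch: one long bond, one short bond, one virtual ligand.
print(normalize_bonds((0.0, 0.0), [(3.0, 4.0), (0.0, -2.0), (0.0, 0.0)]))
```

After this step all real ligands lie on a circle around the central atom, so an
exaggerated bond can no longer push the centre outside the figure spanned by
the ligands merely through its drawn length.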
Geometrical (cis-trans) stereoisomers
Geometrical stereoisomers (cis-trans isomers, according to the newest
terminology recommendations of IUPAC) are, by definition, flat objects, which
makes the algorithmic analysis much easier. The crucial danger when processing
geometrical stereoisomers, however, comes from the unavoidable uncertainty
whether the asymmetric (if perceived as such) placement of substituents on both
sides of the plane set by the unsaturated bond is intentional or simply
accidental. Another issue, for rings containing unsaturated bonds, concerns the
size of the smallest ring closures within a ring system. Rings that are
supposed to display geometrical isomerism have to exceed a certain minimum size.
For alkenes the terms cis and trans may be ambiguous, particularly where the
main chain in substituted sequences cannot be uniquely determined, and have
therefore been largely replaced by the E,Z convention. According to the latest
recommendations of IUPAC for the nomenclature of organic compounds, the use of
the traditional cis and trans is strongly discouraged [25]; this concerns all
classes of compounds, not only alkenes.
The algorithms used in software analysis of geometrical isomers have to assign
E,Z descriptors, which is a trivial task for mono-substituted atoms connected by
a double bond (a single non-hydrogen ligand on each side). In more complex cases
of multi-substituted atoms, the ligands on one (or, if necessary, both) atoms of
the double bond are ranked using the usual CIP rules, and if CIP cannot provide
an ultimate unambiguous ranking, the parity attribute and canonical numbering of
atoms are used.
Having determined which two ligands attached to the atoms connected by a double
bond are to be compared, one has to calculate how they are located relative to
each other. The double bond is described as E if the two ligands lie on opposite
sides of the plane of the double bond, or Z if the ligands are on the same side
of the plane. The determinant form that best fits such calculations is given in
equation (6). One can easily see that the two triangles which share the double
bond as an edge, each with its third vertex at the atom of a ligand, have the
same sign [according to definition (4)] only if the ligands are on the same side
of the double bond plane, and different signs if they lie on opposite sides. If
the value of (4) is zero or close to zero (it measures the area of the
triangle), then the analysed ligand is collinear (or almost collinear) with the
double bond.
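The comparison of triangle signs can be sketched as follows. Equations (4) and
(6) are not reproduced in this section, so the code assumes the standard 2D
signed-area determinant as a stand-in for definition (4). The function names,
the tolerance eps and the example coordinates are illustrative assumptions, and
the ligands passed in are taken to be the already selected (highest-ranked)
ones.

```python
def signed_area2(p, q, r):
    """Twice the signed area of triangle p, q, r in the drawing plane.

    Positive if r lies to the left of the directed edge p -> q,
    negative if it lies to the right, zero if the points are collinear.
    """
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def ez_descriptor(atom_a, atom_b, ligand_a, ligand_b, eps=1e-6):
    """Return 'E', 'Z', or None for the double bond atom_a=atom_b.

    Both triangles use the same directed edge atom_a -> atom_b, so equal
    signs mean the two ligands lie on the same side of the bond.
    eps is an assumed tolerance; it should scale with the drawing units.
    """
    s_a = signed_area2(atom_a, atom_b, ligand_a)
    s_b = signed_area2(atom_a, atom_b, ligand_b)
    if abs(s_a) < eps or abs(s_b) < eps:
        return None  # ligand (almost) collinear with the double bond
    return "Z" if (s_a > 0) == (s_b > 0) else "E"


c1, c2 = (0.0, 0.0), (1.0, 0.0)
print(ez_descriptor(c1, c2, (-0.5, 0.87), (1.5, 0.87)))   # Z
print(ez_descriptor(c1, c2, (-0.5, 0.87), (1.5, -0.87)))  # E
print(ez_descriptor(c1, c2, (-0.5, 0.87), (2.0, 0.0)))    # None
```

Returning None for the near-zero case leaves it to the caller to decide whether
such a drawing is rejected or flagged as carrying no stereo information.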
Summary
The determinant algorithm described in this paper was invented, designed,
programmed and implemented during the Registry II project launched by the
Beilstein Institute in Frankfurt/Main. The purpose of this project was to
replace the existing unreliable compound registration software with a better
one. Most of the advantages of the new software were expected in the area of
stereoisomer registration. The algorithm was successfully tested on a large
sample of over 3.1 million compounds from the Beilstein File. The data resulting
from these tests will be presented and discussed in detail in a future paper.