[BlueObelisk-discuss] another CIP question

Discussion:

Robert Hanson

2017-04-19 04:31:04 UTC

OK, on to Rule 4. I am reading Chapter 9 and its associated papers very
closely now. For anyone of the opinion that "this has been done many times
before" I point out that between 1966 and 1982 and 1993 and 2004 and 2013
and between all the different suggestions for revision, it's pretty hard to
believe that anyone (except of course ACD/Labs) has fully implemented and
fully validated code following the IUPAC 2013 Sequence Rules. I'd sure like
to see it if they do.

Here is my esoteric question for the day. I was under the impression that
two branches could be distinguished by direct comparison. That's certainly
true for, say, linear glycerols. But reading Mata (Tetrahedron: Asymmetry
vol 4, 1993, p. 657) and more specifically the IUPAC 2013 examples on pages
1193-1205, I am surprised to see that in order to make a characterization
in certain cases, up to three distinct subbranches need to be analyzed in
parallel.

The first example of this I see is on page 1201, Example 5. One cannot
start the analysis without first identifying that there are three branches
for comparison, not two. This is not a problem, as my algorithm will
clearly identify that after Rule 3 we have three same-priority branches.
But it does introduce an entirely new concept. Up to this point branches
are compared two at a time. Now all three subbranches have to be scanned in
parallel prior to determining the "reference" stereochemistry to be R or S.
The example there goes through the process we need to use to sort the
subbranches such as:

    C--*S*--S
   /
--C--C--S--S
   \
    C--R--R

OK, I can do that, but it is quite a different concept than comparing in
pairs.

Odder still (shown in Example 4) is that in the case where we have
subbranches of differing priorities, of completely different types (one
starting with O, one with C), now we don't just use the higher-priority
branch as the determinant. We do use it for the reference determination,
but after that we still go through the parallel processing as before, rank
by rank. But since these are now distinctly different structural branches,
the various stereocenters. For example, we could have:

    O--*S*--S
   /
--C--C--S--S
   \
    C--R--R

That's fine, too. But these examples are too easy! What if we have:

     O--C--C--C--C--*S*--S--X
    /
Y--C--C--S--S--S
    \
     C--R--R--S

There could be anything on that oxygen -- X could be a cholesteryl ether
for all we know. Y has to have some symmetry here, I think, because the
goal is still to rank this entire ligand relative to some other evil
diastereomeric twin somewhere in "Y". (It's not to give this particular
ligand a descriptor at the carbon attached to Y here. That would be trivial
in this case.) Our goal is to analyze a sequence of "like" and "unlike"
pairs within this unit and compare that to a similar set in another unit
somewhere within Y for the purpose of establishing the descriptor for some
*other* atom in Y.

Wolf, Tim, Mikko? Have I got this right so far?

Bob

My reading of the discussion within these elaborations of Rule 4 suggests:

a) The first S on the highest-ranking branch (O...X) sets the reference
atom for this unit, regardless of how far down the line it is.
b) The other two subbranches must be geometrically identical, or we
wouldn't be here. All the stereochemistry would have been set in a Rule 1
step long ago (just like we already know that we have S and R along the
branch). Someone has already gone through at least Steps 1-3 to do that,
possibly 1-5.
--
Robert M. Hanson
Larson-Anderson Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr

If nature does not answer first what we want,
it is better to take what answer we get.

-- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900

John Mayfield

2017-04-19 10:52:11 UTC

Post by Robert Hanson
a) The first S on the highest-ranking branch (O...X) sets the reference
atom for this unit, regardless of how far down the line it is.

Nope, because that's not the highest ranked node. Remember it's all about
spheres :-), nodes are ranked hierarchically.

The rule for pairing stereodescriptors is as follows: "A reference
descriptor for chirality centers, identified as 'R' or 'S' (not associated
with any node of the digraph), is chosen in each ligand and is: (a) the one
associated with the *highest rank node* corresponding to a chiral unit in
the ligand; (b) the one that occurs the most in the set of equivalent
highest rank nodes; or (c) sequentially both descriptors ('R' and 'S'), if
these occur in the same number in the set of equivalent highest ranked
nodes."
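
A minimal Python sketch of how cases (a) to (c) of the quoted rule might be
read, assuming the descriptors on the set of equivalent highest-rank nodes
have already been collected. The function name and the list-of-strings input
are hypothetical and are not taken from any CIP implementation; building and
ranking the digraph is not shown.

```python
from collections import Counter

def reference_descriptors(top_rank_descriptors):
    """Return the reference descriptor(s) for one ligand.

    top_rank_descriptors: 'R'/'S' labels found on the set of equivalent
    highest-rank nodes that correspond to chirality centers (assumed input).
    """
    counts = Counter(top_rank_descriptors)
    if not counts:
        return []  # no chirality centers in this ligand
    most = max(counts.values())
    # A single winner covers cases (a) and (b); a tie is case (c),
    # where both descriptors are used in turn.
    return sorted(d for d, n in counts.items() if n == most)

print(reference_descriptors(["S"]))            # ['S']
print(reference_descriptors(["R", "S", "S"]))  # ['S']
print(reference_descriptors(["R", "S"]))       # ['R', 'S']
```

The point of the sketch is only that the reference comes from the highest
ranked node set, not from the first stereocenter met along the senior branch.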

An easier example: by your scheme this would be R. ChemSketch (and centres,
shown) say S:

[image: Inline images 1]
It's been a few years but IIRC this rule is nasty and effectively requires
a power set <https://en.wikipedia.org/wiki/Power_set> in some cases.

John

Robert Hanson

2017-04-19 17:00:29 UTC

OK, that helps. Sure. Ranking Rule #1. I don't know what ChemSketch is
doing here, but it is missing two trivial centers in Sphere I. Although you
have not indicated them, they have to be something simple, at least for
Jmol! But let's say that is still possible to not have them defined. Our
job is to create a ranked list of centers. Would you agree with this?

left: RSS
right: RRS

Different at the second point, so right beats left: The center is S.

So what's the power thing? These are auxiliary designators, not necessarily
"real" ones. They don't create a set that intercepts itself. The pairing
rules are pretty exacting.

Bob

Robert Hanson

2017-04-19 18:18:09 UTC

ps, before John jumps on me, I had better add the intermediate l/u business:

left: RSS (uu)
right: RRS (lu)

l > u, so right wins.
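
A small Python sketch of the pairing just described, assuming each side has
already been reduced to a ranked list of descriptors and that the first
descriptor in each list serves as its reference. Both function names are
made up for illustration; this is not how Jmol or CDK does it.

```python
def like_unlike(descriptors):
    """Pair each later descriptor with the first: 'l' if same, 'u' if not."""
    ref, rest = descriptors[0], descriptors[1:]
    return "".join("l" if d == ref else "u" for d in rest)

def compare(left, right):
    """Return 'left', 'right' or 'tie'; the first difference decides, l > u."""
    for a, b in zip(like_unlike(left), like_unlike(right)):
        if a != b:
            return "left" if a == "l" else "right"
    return "tie"

print(like_unlike("RSS"), like_unlike("RRS"))  # uu lu
print(compare("RSS", "RRS"))                   # right
```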

Since it comes down to ordering of the RSS... list, that's what I need
clarification on. We have Ranking Rule #1, which is great and helps in this
case (if John agrees).

John, quick question: Rule 5 is the place - the only place -- that assigns
"r" and "s". Right?

Bob

John Mayfield

2017-04-19 19:50:44 UTC

from Mata 2005...

If both descriptors, R and S, are used as first descriptors (see Fig. 5),
they should be independently and sequentially used to form pairs of
descriptors. Then all pairs situated at the same rank level are compared.
The first difference encountered is used to rank ligands. This means that
in the examples in this figure, pairs 1,2 and 2,1 (the order is not
important) should be simultaneously compared in both ligands and the number
of l and u pairs evaluated

[image: Inline images 1]

Post by Robert Hanson
I don't know what ChemSketch is doing here, but it is missing two trivial
centers in Sphere I. Although you have not indicated them, they have to be
something simple, at least for Jmol! But let's say that is still possible
to not have them defined.

That's CDK, not ChemSketch (although ChemSketch labels it the same, as I said).
I deliberately contrived the example and left those off on purpose to
simplify it - replace with an N if you like. In practice such situations are
very rare.

It's been a long time but IIRC the power set/permutations is when
everything is tied for something like this. Maybe they've replaced this
  S - R     OH     R - S
       \    |     /
R - S - N - CH - N - S - R
       /          \
  S - R            S - R
the branches are equal, so the middle is unspecified. To prove it you need to
generate all possible sequences of like/unlike descriptors. Even if they're
not equal... I can't remember exactly if the right here the reference is
now always S but even so we still need to generate all the ways they can be
  S - R     OH     R - S
       \    |     /
R - S - N - CH - N - S - S
       /          \
  S - R            S - R
SRSSSR -> lulllu
SSRRSS -> lluull (best)
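
The two sequences above can be reproduced with a short Python sketch,
assuming the reference descriptor is S and that every descriptor in the
sequence (including the first) is paired against it. The helper names are
hypothetical, and how the candidate orderings are generated in the first
place is not shown; the two candidates are simply the ones listed above.

```python
def lu_string(descriptors, reference):
    """Map each descriptor to 'l' (like the reference) or 'u' (unlike)."""
    return "".join("l" if d == reference else "u" for d in descriptors)

def best_ordering(candidates, reference):
    """Pick the candidate whose l/u string puts 'l' first at the first difference."""
    # 'l' sorts before 'u' alphabetically, so min() prefers like over unlike.
    return min(candidates, key=lambda seq: lu_string(seq, reference))

candidates = ["SRSSSR", "SSRRSS"]
for seq in candidates:
    print(seq, "->", lu_string(seq, "S"))
print("best:", best_ordering(candidates, "S"))  # best: SSRRSS
```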
Post by Robert Hanson
Rule 5 is the place - the only place -- that assigns "r" and "s". Right?

Yes