Extended SMILES, SMARTS
Codename: cxsmiles,cxsmarts
Contents:
Extended SMILES, SMARTS format
ChemAxon Extended SMILES/SMARTS stores special features of a molecule
after the SMILES string.
Any information can follow the SMILES string
if it is separated from it by space or tab characters, because SMILES parsers
either ignore such trailing text or treat it as a comment.
The extended features are stored in the following format:
SMILES_String |<feature1>,<feature2>,...|
The extended feature description is compact.
If a feature is absent from the molecule, the corresponding special
characters are not written.
(E.g., if none of the atoms of the molecule has an alias string,
no "$" and ";" characters are written.)
Moreover, if the molecule has no feature to write,
the extended feature field is omitted entirely.
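As a rough illustration of the layout above, the following sketch separates a line into its SMILES string and the text between the "|" marks. The function name split_cxsmiles is hypothetical and is not part of any ChemAxon API; the sketch also assumes that the feature field itself contains no "|" character.

```python
def split_cxsmiles(line):
    """Return (smiles, features); features is the text between '|' marks, or None."""
    fields = line.strip().split(None, 1)  # split at the first run of spaces/tabs
    if not fields:
        return "", None
    smiles = fields[0]
    trailer = fields[1].strip() if len(fields) > 1 else ""
    if trailer.startswith("|"):
        end = trailer.find("|", 1)  # closing '|' (assumed not to occur inside the field)
        if end != -1:
            return smiles, trailer[1:end]
    # No extended feature field: any trailing text is an ordinary comment.
    return smiles, None
```

For a line such as "CC(O)CC |r|" this yields the SMILES string and the feature text "r"; a line without a feature field yields None for the features.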
Extended SMILES export writes the following additional features:
- Molecule absolute stereoconfiguration
The relative stereoconfiguration is stored as "r".
The absolute stereoconfiguration is the default and is not marked.
(The absolute stereoconfiguration is also known as the "Chiral flag" in MDL molfiles.)
- Enhanced stereochemical representation
The following stereochemical group types are stored:
- Absolute stereo group type.
a:<atomindex>,<atomindex>...
- OR stereo group type.
o<group>:<atomindex>,<atomindex>...
- AND stereo group type.
&<group>:<atomindex>,<atomindex>...
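The three group notations can be read back with a small regular expression. The sketch below is only an illustration: parse_stereo_groups is a hypothetical helper, and it assumes that the feature text passed in contains no atom labels that happen to look like a stereo group.

```python
import re

# A group tag ("a", "o<group>" or "&<group>") at the start of the text or after a comma,
# followed by ":" and a comma-separated list of atom indexes.
_GROUP = re.compile(r"(?:^|,)(a|o\d+|&\d+):(\d+(?:,\d+)*)")
_KINDS = {"a": "absolute", "o": "or", "&": "and"}

def parse_stereo_groups(features):
    """Return a list of (kind, group_number_or_None, atom_indexes)."""
    groups = []
    for match in _GROUP.finditer(features):
        tag, atoms = match.groups()
        kind = _KINDS[tag[0]]
        number = int(tag[1:]) if len(tag) > 1 else None  # "a" carries no group number
        groups.append((kind, number, [int(a) for a in atoms.split(",")]))
    return groups
```

The list of atom indexes of one group ends where the next comma-separated item no longer starts with a digit, which is how the sketch tells "o1:3,4" apart from the feature that follows it.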
Atom labels
Atom labels are written between "$" characters; the labels are
separated by ";" characters.
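A minimal sketch of reading the label list back is shown below. The helper name parse_atom_labels is hypothetical. Empty entries are kept as they are; how they map to atoms is not stated above and is assumed here to be positional.

```python
import re

def parse_atom_labels(features):
    """Return the ';'-separated labels found between the first pair of '$' characters."""
    match = re.search(r"\$([^$]*)\$", features)
    if match is None:
        return []  # no labels were written for this molecule
    return match.group(1).split(";")
```

With the feature text "$R1;;;A$" this gives four entries, two of them empty.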
Single "Up or Down" (Wiggly) bonds
Atom indexes relating to wiggly bonds are written after "w:",
each followed by the index of the wiggly bond; entries are separated by commas.
CIS, TRANS, UNSPEC bond info for double bonds in rings
The bond indexes of the double bonds in the SSSR (smallest set of smallest rings) are written.
The bond stereo information is generated as follows:
the double bond is represented as a1-a2=a3-a4, where
- a1 is the smallest atom index in the generated SMILES that is connected to a2
- a2 is the smaller atom index of the double bond in the generated SMILES
- a3 is the larger atom index of the double bond in the generated SMILES
- a4 is the smallest atom index in the generated SMILES that is connected to a3
The CIS double bond indexes are written after "c:",
the TRANS double bond indexes are written after "t:",
the double bond indexes with UNSPEC flag are written after "u:".
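The choice of the four reference atoms can be sketched as follows. This is an illustration under stated assumptions, not ChemAxon's implementation: stereo_reference_atoms is a hypothetical helper, the adjacency map uses atom indexes in generated-SMILES order, and a1 and a4 are assumed to exclude the double bond partner itself.

```python
def stereo_reference_atoms(adjacency, bond):
    """Return (a1, a2, a3, a4) for a double bond given as a pair of atom indexes."""
    a2, a3 = sorted(bond)  # a2 is the smaller, a3 the larger index of the double bond
    # Assumption: the double bond partner is not a candidate for a1 or a4.
    a1 = min(n for n in adjacency[a2] if n != a3)
    a4 = min(n for n in adjacency[a3] if n != a2)
    return a1, a2, a3, a4

# Cyclooctene written as C1CCC=CCCC1: ring atoms 0..7, double bond between atoms 3 and 4.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
reference = stereo_reference_atoms(ring, (4, 3))
```

Because the double bond is in a ring, each of its atoms has at least one other neighbor, so the two min() calls always have a candidate.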
Fragment level grouping of reactant, agent and product fragments
Grouped fragment indexes are written after "f:" in the following
format:
- Connected groups are separated by ",".
- A connected group is a "." separated list of fragment indices.
Example: "f:0.1,5.6"
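The grouping notation can be decoded with a short helper. The sketch below uses a hypothetical function name, parse_fragment_groups, and assumes the "f:" feature appears at most once in the feature text.

```python
import re

# "f:" followed by comma-separated groups, each a "."-separated list of fragment indexes.
_FRAGMENTS = re.compile(r"(?:^|,)f:(\d+(?:\.\d+)*(?:,\d+(?:\.\d+)*)*)")

def parse_fragment_groups(features):
    """Return a list of groups, each a list of fragment indexes."""
    match = _FRAGMENTS.search(features)
    if match is None:
        return []
    return [[int(i) for i in group.split(".")] for group in match.group(1).split(",")]
```

For the example "f:0.1,5.6" the result has two groups: fragments 0 and 1 belong together, and so do fragments 5 and 6.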
Local parity information
Atom indexes with local ODD parity are written after "@:",
and atom indexes with local EVEN parity are written
after "@@:"; the indexes are separated by commas.
Radical numbers
Atom indexes, separated by commas, are written after a prefix that identifies the radical type:
- "^1:" for a monovalent radical center,
- "^2:" for a divalent radical center,
- "^3:" for a divalent singlet radical center,
- "^4:" for a divalent triplet radical center,
- "^5:" for a trivalent radical center.
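The radical codes listed above can be mapped back to atoms as in the following sketch. The names RADICAL_TYPES and parse_radicals are hypothetical and only serve the illustration.

```python
import re

# Radical code after "^" and the radical type it stands for, as listed above.
RADICAL_TYPES = {
    1: "monovalent",
    2: "divalent",
    3: "divalent singlet",
    4: "divalent triplet",
    5: "trivalent",
}

def parse_radicals(features):
    """Return a dict mapping atom index to radical type."""
    radicals = {}
    for code, atoms in re.findall(r"\^([1-5]):(\d+(?:,\d+)*)", features):
        for atom in atoms.split(","):
            radicals[int(atom)] = RADICAL_TYPES[int(code)]
    return radicals
```

With the feature text "^1:0,3,^4:5", atoms 0 and 3 are reported as monovalent radical centers and atom 5 as a divalent triplet radical center.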
Lone electron pairs
The indexes of the atoms having lone electron pairs are written after
"LP:".
Import options
See SMILES import options
Export options
Export options can be specified in the format string. The format descriptor
and the options are separated by a colon.
If no ChemAxon Extended SMILES/SMARTS specific export option (detailed below)
is given, all of them are used.
Examples: "cxsmiles:lw" writes only atom labels and
wiggly bond indexes;
"cxsmiles:" writes all features (absolute stereoconfiguration, enhanced
stereo features, atom labels, wiggly bond indexes, ring stereo bond info and
reaction fragment level grouping).
Option | Description
e | Write relative stereo configuration and enhanced stereo features.
l | Write atom aliases.
w | Write wiggly bond indexes.
d | Write CIS, TRANS bond indexes.
f | Reaction fragment level grouping.
p | Write local parities.
R | Write radical numbers.
L | Write lone electron pairs.
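The rule that all extended features are written when no specific option is given can be sketched as below. CX_OPTIONS and selected_cx_options are hypothetical names, not ChemAxon API. The sketch assumes every option is a single letter and that other SMILES export options in the same format string do not reuse these letters.

```python
# Option letters and their meaning, taken from the option list above.
CX_OPTIONS = {
    "e": "relative stereo configuration and enhanced stereo features",
    "l": "atom aliases",
    "w": "wiggly bond indexes",
    "d": "CIS, TRANS bond indexes",
    "f": "reaction fragment level grouping",
    "p": "local parities",
    "R": "radical numbers",
    "L": "lone electron pairs",
}

def selected_cx_options(format_string):
    """Return the extended-feature option letters that a format string selects."""
    _, _, options = format_string.partition(":")  # text after the format descriptor
    chosen = [c for c in options if c in CX_OPTIONS]
    # With no extended-feature option given, every option applies.
    return chosen or list(CX_OPTIONS)
```

Letters that are not in the list are skipped instead of rejected, since the format string may also carry general SMILES export options.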
See also SMILES export options
and basic export options.