QSAR WORLD

The IUPAC International Chemical Identifier

Both parts of the InChIKey are based on a truncated SHA-256 hash [17] of the corresponding InChI layers. Only uppercase ASCII letters are used to encode the data, which ensures that indexing engines will not split the key and also avoids case-sensitivity problems. There is a finite, but extremely small, probability that two different structures will have the same InChIKey.
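To give a feel for how small that probability is, the following is a rough sketch based on the standard birthday-problem approximation. It assumes that first blocks are spread uniformly over all 26^14 possible strings of 14 uppercase letters; that figure is an assumption taken from the key's length and is only an upper bound on the space, so the real collision probability may be somewhat higher. The function name and parameters are illustrative only.

```python
import math

# Assumption: every string of 14 uppercase letters is an equally likely
# first block. The real encoding may use less of this space.
FIRST_BLOCK_SPACE = 26 ** 14


def collision_probability(n_structures: int, space: int = FIRST_BLOCK_SPACE) -> float:
    """Approximate chance that at least two of n_structures share a first block.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2N)).
    """
    exponent = n_structures * (n_structures - 1) / (2 * space)
    # -expm1(-x) computes 1 - exp(-x) accurately when x is tiny.
    return -math.expm1(-exponent)


for n in (10**6, 10**8, 10**9):
    print(f"{n:>13,} structures: p = {collision_probability(n):.3e}")
```

The probability grows roughly with the square of the collection size, which is why a collision stays unlikely for individual lookups but becomes worth considering for very large databases.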


An example will make the structure of the key clearer. The “standard InChI” and InChIKey for caffeine are shown below. The first block of 14 letters (RYYVLZVUVIJVGH) encodes the molecular skeleton (connectivity). The first eight letters of the second block (UHFFFAOY) encode stereochemistry and isotopes. After that, “S” indicates that the key was produced from a standard InChI, and “A” indicates that version 1 of InChI was used. The final character, “N”, gives the protonation state; here it means “neutral”.

InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
InChIKey=RYYVLZVUVIJVGH-UHFFFAOYSA-N
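The layout described above can be sketched as a small parser. This is a minimal illustration, not part of any official InChI software: the names `InChIKeyParts` and `split_inchikey` are hypothetical, and the checks only cover the block lengths and the uppercase-letters rule mentioned in the text.

```python
from typing import NamedTuple


class InChIKeyParts(NamedTuple):
    connectivity: str      # first block, 14 letters: molecular skeleton
    other_layers: str      # first 8 letters of second block: stereo, isotopes
    standard_flag: str     # "S" for a key made from a standard InChI
    version_flag: str      # "A" for InChI version 1
    protonation_flag: str  # "N" for neutral


def split_inchikey(key: str) -> InChIKeyParts:
    """Split an InChIKey of the form XXXXXXXXXXXXXX-YYYYYYYYFV-P into its parts."""
    key = key.strip()
    # Tolerate the conventional "InChIKey=" prefix if it is present.
    if key.startswith("InChIKey="):
        key = key[len("InChIKey="):]
    blocks = key.split("-")
    if [len(b) for b in blocks] != [14, 10, 1]:
        raise ValueError(f"unexpected InChIKey layout: {key!r}")
    if not all(b.isascii() and b.isalpha() and b.isupper() for b in blocks):
        raise ValueError(f"InChIKey must use uppercase ASCII letters only: {key!r}")
    first, second, third = blocks
    return InChIKeyParts(first, second[:8], second[8], second[9], third)


parts = split_inchikey("InChIKey=RYYVLZVUVIJVGH-UHFFFAOYSA-N")
print(parts)
```

For the caffeine key this separates the skeleton block RYYVLZVUVIJVGH from the UHFFFAOY block and the three single-letter flags S, A, and N.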

Use of InChIKey allows searches based solely on atom connectivity (the first 14 characters). For example, the stereoisomers D-fructose and L-fructose both have the same first block of 14 characters, BJHIKXHVCXFQLS.
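A connectivity-only search of this kind can be sketched by grouping keys on their first block. The helper names below are hypothetical, and the second blocks of the two BJHIKXHVCXFQLS keys are made-up placeholders used only to show the grouping; they are not the real InChIKeys of D- or L-fructose.

```python
from collections import defaultdict


def connectivity_block(inchikey: str) -> str:
    """Return the first block (14 letters) of an InChIKey."""
    return inchikey.strip().split("-", 1)[0]


def group_by_connectivity(inchikeys):
    """Map each first block to the list of full keys that share it."""
    groups = defaultdict(list)
    for key in inchikeys:
        groups[connectivity_block(key)].append(key)
    return dict(groups)


keys = [
    "RYYVLZVUVIJVGH-UHFFFAOYSA-N",  # caffeine, as given in the text
    "BJHIKXHVCXFQLS-AAAAAAAASA-N",  # placeholder second block, NOT a real key
    "BJHIKXHVCXFQLS-BBBBBBBBSA-N",  # placeholder second block, NOT a real key
]

groups = group_by_connectivity(keys)
for block, members in groups.items():
    print(block, len(members))
```

Keys that differ only in stereochemistry or isotopes fall into the same group, so a lookup on the first block retrieves all of them at once.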

Generating InChI

The PubChem Server Side Structure Editor v1.8 includes a facility for generating InChIs as the user draws the structure [18]. ACD/Labs’ freely available structure-drawing program ChemSketch [19] can also generate InChIs from drawn structures. Other structure-drawing packages (MDL Draw, BKChem, ChemDraw, and Marvin) allow an input chemical structure to be cut and pasted into the InChI Generator. ChemSpider provides methods to manipulate InChI strings and InChIKeys, including conversion to and from the molfile format, checking the validity of InChI identifiers, and searching ChemSpider using an input InChI [20].

Some Other Identifiers

Readers will no doubt be familiar with CAS Registry Numbers [21]. InChI is not a registry system: it does not depend on a database of unique substance records to establish the next available sequence number when a new chemical substance is assigned an InChI. Registry systems that index the literature are complementary to any InChI databases that anyone creates.

The Simplified Molecular Input Line Entry System (SMILES) language [22] is another well-known way of representing a chemical structure as a string of characters. Like InChI, SMILES allows canonicalization of a structure, but SMILES is proprietary and not an open project. This has led to the use of different generation algorithms, and thus different SMILES strings for the same compound have been found [10].



