Structure Checker : structurecheck_cline

Structurechecker - The Command-line Tool of Structure Checker

Structurecheck Command-line
Structure Checker is a chemical validation tool that detects and fixes common structural errors and special features that can be potential sources of problems. Structurecheck is the command-line tool of Structure Checker.

Options

 

StructureCheck 5.6, (C) 1999-2011 ChemAxon Ltd.
Licenses of additionally used third party programs can be found in license.html
Online version: http://www.chemaxon.com/marvin/license.html
Molecule checker.
Usage:
  structurecheck [input file(s)/string(s)] -c <config file/string> [options]

General options:
  -m, --mode <operationmode>          mode of the operation: fix or check
                                      default mode is check 
      <operationmode> = [fix|check]   
      check - only check is executed, does not modify molecules
      fix   - fixes molecules containing structure errors whenever possible
  -x                                  fix mode (deprecated, use --mode fix)
Input options:
  -c, --config <filepath|string>      action string configuration
                                      actions separated by "..",
                                      valid checker actions are:
                                      - 3d
                                        (detect atoms with 3D coordinates)
                                      - abbrevgroup
                                        (detect abbreviated groups)
                                      - abbrevgroup:expanded=true
                                        (detect expanded abbreviated groups)
                                      - abbrevgroup:contracted=true
                                        (detect contracted abbreviated groups)
                                      - alias
                                        (detect atoms with alias)
                                      - aromaticity
                                        (detect aromaticity errors)
                                      - aromaticity:type=[basic,loose,general]
                                        (detect aromaticity errors
                                        with the given aromatization type)
                                      - atommap
                                        (detect atoms with map number)
                                      - atomqueryproperty
                                        (detect atom query properties)
                                      - atomqueryproperty:H=[true,false]:
                                        X=[true,false]:
                                        D=[true,false]:
                                        R=[true,false]:
                                        h=[true,false]:
                                        r=[true,false]:
                                        a=[true,false]:
                                        s=[true,false]:
                                        u=[true,false]:
                                        rb=[true,false]:
                                        (detect hydrogen count/
                                        connection count/explicit connection
                                        count/ring count/implicit hydrogen
                                        count/smallest ring count/aromaticity/
                                        substitution count/unsaturation/ring
                                        bond count atom query properties)
                                      - atomvalue
                                        (detect atoms with atom value)
                                      - attacheddata
                                        (detect atoms with attached data)
                                      - bondangle
                                        (detect unpreferred bond angles
                                        in 2d)
                                      - bondlength
                                        (detect bonds that are too long
                                        or too short)
                                      - chiralflag
                                        (detect non-chiral molecules with
                                        chiral flag)
                                      - coordsystem
                                        (detect invalid coordination systems)
                                      - covalentcounterion
                                        (detect covalent counterions)
                                      - crosseddoublebond
                                        (detect crossed double bonds)
                                      - empty
                                        (detect items without atoms)
                                      - explicith
                                        (detect explicit hydrogens)
                                      - explicith:lonely=[true,false]:
                                        mapped=[true,false]:
                                        charged=[true,false]:
                                        isotopic=[true,false]:
                                        radical=[true,false]:
                                        wedged=[true,false]
                                        (detect lonely/charged/mapped/isotopic/
                                        radical/wedged explicit hydrogens)
                                      - explicitlp
                                        (detect explicit lone pairs)
                                      - isotope
                                        (detect isotopes)
                                      - metallocene
                                        (detect incorrect metallocene
                                        representations)
                                      - missingatommap
                                        (detect atoms without map numbers)
                                      - multicenter
                                        (detect multicenters)
                                      - multicomponent
                                        (detect molecules containing
                                        disconnected parts)
                                      - moleculecharge
                                        (detect non-neutral molecules)
                                      - ocr
                                        (detect structures that are probably
                                        not chemical structures but originated
                                        from other drawings, usually results of
                                        incorrect optical structure
                                        recognition)
                                      - overlappingAtoms
                                        (detect atoms that are too close to
                                        each other)
                                      - overlappingBonds
                                        (detect bonds that are too close to
                                        each other)
                                      - pseudoatom
                                        (detect pseudo atoms)
                                      - queryatom
                                        (detect query atoms)
                                      - querybond
                                        (detect query bonds)
                                      - racemate
                                        (detect asymmetric tetrahedral atoms
                                        without specific stereo configuration)
                                      - radical
                                        (detect radical atoms)
                                      - ratom:all=[true,false]:
                                        disconnected=[true,false]:
                                        generic=[true,false]:
                                        linker=[true,false]:
                                        nested=[true,false]:
                                        (detect all/disconnected/generic/
                                        linker/nested R-atoms)
                                      - rare
                                        (detect rare elements)
                                      - reactionmap
                                        (detect reactions with invalid
                                        atom mapping)
                                      - rgroupattachmenterror
                                        (detect R-group attachment errors)
                                      - rgroupreferenceerror:
                                        missingratom=[true,false]:
                                        missingrgroup=[true,false]:
                                        selfreference=[true,false]:
                                        (detect missing R-atom/
                                        missing R-group/invalid attachment/
                                        self reference errors in R-group
                                        definitions)
                                      - ringstrainerror
                                        (detect small rings with 
                                        trans or cumulative double bonds,
                                        or triple bond)
                                      - solvent
                                        (detect common solvents appearing
                                        by a main component)
                                      - staratom
                                        (detect star atoms)
                                      - substructure:reactionSmarts=[smarts]
                                        (detect whether the given SMARTS
                                        structure can be found as
                                        a substructure of the
                                        original molecule)
                                      - unbalancedreaction
                                        (detect reactions with orphan atoms)
                                      - valence
                                        (detect valence errors)
                                      - valenceproperty
                                        (detect atoms with valence properties)
                                      - valenceproperty:defaultvalence=true
                                        (detect atoms with default valence
                                        properties)
                                      - valenceproperty:nondefaultvalence=true
                                        (detect atoms with non-default valence
                                        properties)
                                      - wedge
                                        (detect incorrect wedge bonds)
                                      - wigglydoublebond
                                        (detect non-stereo double bonds
                                        represented by a wiggly bond
                                        connected to a double bond)

                                      valid fixer actions are:
                                      - aliastoatom
                                        (remove aliases from atoms)
                                      - aliastogroup
                                        (convert atoms with aliases to
                                        abbreviated groups if the alias
                                        is recognized)
                                      - aliastocarbon
                                        (remove alias values from atoms and 
                                        convert the atom to a carbon)
                                      - clearabsstereo
                                        (remove the chiral flag)
                                      - clean
                                        (calculate 2D coordinates)
                                      - contractgroup
                                        (contract all abbreviated groups)
                                      - converttoelementalform
                                        (convert isotopes into elemental atoms)
                                      - converttoionicform
                                        (convert covalent counterions
                                        to ionic form)
                                      - converttometalloceneform
                                        (convert non-standard metallocene
                                        representations into coordinated
                                        multicenter representation)
                                      - crosseddoublebond
                                        (convert non-stereo double bond
                                        represented by wiggly ligand
                                        to crossed double bond representation)
                                      - crossedtowiggly
                                        (convert non-stereo double bond
                                        represented by crossed double bond
                                        to wiggly ligand representation)
                                      - dearomatize
                                        (convert aromatic rings into Kekule
                                        form)
                                      - expandgroup
                                        (expand all abbreviated groups)
                                      - fixrgroupattachment
                                        (add missing attachments to members
                                        with single location)
                                      - fixvalence
                                        (correct valence problem by removing
                                        hydrogens or setting charges)
                                      - mapmolecule
                                        (add atom maps to each atom 
                                        of the molecule)
                                      - mapreaction
                                        (add atom maps to the reaction)
                                      - neutralize
                                        (remove charges from the molecule)
                                      - pseudotogroup
                                        (convert pseudo atoms to 
                                        abbreviated groups if pseudo label
                                        is a known abbreviated group)
                                      - removeexplicith
                                        (remove explicit hydrogens)
                                      - rearomatize
                                        (dearomatize the molecule and
                                        aromatize it again)
                                      - removealias
                                        (remove alias values from atoms)
                                      - removeatom
                                        (remove the problematic atoms
                                        from the molecule)
                                      - removeatommap
                                        (remove atom map numbers)
                                      - removeatomqueryproperty
                                        (remove atom query properties)
                                      - removeatomvalue
                                        (remove atom values)
                                      - removeattacheddata
                                        (remove data attached to atoms)
                                      - removebond
                                        (remove problematic bonds
                                        from the molecule)
                                      - removeradical
                                        (convert radicals to non-radical atoms)
                                      - removevalenceproperty
                                        (remove valence properties from atoms)
                                      - removezcoordinate
                                        (set the z-coordinates of atoms to 
                                        zero)
                                      - ungroup
                                        (ungroup all abbreviated groups)
                                      - wedgeclean
                                        (recalculate the orientation of the
                                        wedge bonds in the molecule)

Output options:
  -t, --output-type <output type>     output type (default: single)
      <output type> = [single|separated|accepted|discarded]
      single    - both accepted and discarded structures are written to the
                  <output path>
      separated - accepted structures are written to the <output path>,
                  discarded structures are written to the <discarded path>
      accepted  - only accepted structures are written to the <output path>
      discarded - only discarded structures are written to the <discarded path>
  -o, --output <output path>          output file (default: standard output)
  -d, --discarded <discarded path>    writes molecules with structure
                                      error to a separate file (default:
                                      standard output)
  -f, --format <format>               output file format (default: smiles)
  -rf, --report-file <filepath>       writes report to a file
  -rp, --report-property <propname>   writes report to the property of the
                                      output, with the specified property name
  -l, --log <filepath>                writes software-error log messages
                                      to file
  -ocr, --discard-scan-errors         discard incorrectly scanned molecules
  
  Examples:
  structurecheck -c config.xml -t separated -o out.smiles -d discarded.smiles in.smiles
  structurecheck -c config.xml -m fix -t separated -d discarded.smiles in.smiles
  structurecheck -c config.xml -m fix -t discarded in.sdf
  structurecheck -c "bondLength" in.sdf
  structurecheck -c "isotope->converttoelementalform" in.sdf
  structurecheck -c "aromaticity..valence" -m fix -f sdf -o out.sdf in.sdf
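
The action string syntax shown in the help text (actions joined by "..", options appended with ":", and a checker paired with a fixer via "->" as in the examples) can also be assembled programmatically. The following is a minimal Python sketch; the helper names action and config_string are hypothetical and not part of Structure Checker, and placing the "->fixer" part after any options is an assumption.

```python
def action(name, fixer=None, **options):
    """Build one action in the form name[:key=value...][->fixer]."""
    parts = [name]
    for key, value in options.items():
        # Booleans are written in lowercase (true/false), as in the help text.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        parts.append(f"{key}={text}")
    result = ":".join(parts)
    if fixer is not None:
        # Assumption: the fixer follows the checker and its options.
        result = f"{result}->{fixer}"
    return result


def config_string(*actions):
    """Join actions with "..", the separator named in the help text."""
    return "..".join(actions)


print(config_string(action("aromaticity", type="basic"), action("valence")))
print(action("isotope", fixer="converttoelementalform"))
```

The resulting string is what would be passed to the -c option in place of a configuration file path.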


Usage

structurecheck -c <config file> -m [mode] [<options>] [input list]
  • The command line parameter -c or --config is mandatory. This parameter specifies the configuration file path or a simple action string.
  • The optional parameter -m or --mode specifies the operation mode. The following operation modes are available:
    • check (default): searches for errors;
    • fix: fixes automatically fixable errors.

Note: When a molecule import/export error occurs, the program continues to run. The error is written to the console, and the molecule is discarded from the results (i.e., the resulting output file contains fewer molecules than the input file).
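
When structurecheck is called from a script, the argument list can be built in one place. The sketch below only assembles the command line from options listed in the help text (-c, -m, -t, -o, -d, -f); the function name build_command and its parameter names are hypothetical, and the snippet does not run the tool.

```python
import shlex


def build_command(config, inputs=(), mode=None, output_type=None,
                  output=None, discarded=None, fmt=None):
    """Return the structurecheck argument list for the given options."""
    if mode is not None and mode not in ("check", "fix"):
        raise ValueError("mode must be 'check' or 'fix'")
    cmd = ["structurecheck", "-c", config]  # -c is the only mandatory option
    optional = [("-m", mode), ("-t", output_type), ("-o", output),
                ("-d", discarded), ("-f", fmt)]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, value]
    cmd += list(inputs)  # input files or strings come last
    return cmd


cmd = build_command("aromaticity..valence", ["in.sdf"],
                    mode="fix", output="out.sdf", fmt="sdf")
print(shlex.join(cmd))  # shell-quoted form for logging or copy-paste
```

Passing the list to a process launcher rather than a shell avoids quoting problems with action strings that contain characters such as ">".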

Input

Structurecheck accepts most molecular file formats as input (Marvin Documents (MRV), MDL molfile, SDfile, RXNfile, RDfile, SMILES, etc.). The input can be specified as:

  • input file(s),
  • input string(s), or
  • SMILES (default).

Note: If neither the input file nor the input string is specified, the standard input (console) will be read.
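
Reading from the standard input makes it possible to pipe structures into the tool from another program. The following Python sketch illustrates this under two assumptions that are not stated above: that SMILES strings are accepted one per line on the standard input, and that the executable is available on PATH under the name structurecheck. The function name check_smiles is hypothetical.

```python
import shutil
import subprocess


def check_smiles(smiles, config, executable="structurecheck"):
    """Pipe SMILES strings to structurecheck on standard input.

    Returns the tool's standard output as text, or None when the
    executable cannot be found on PATH.
    """
    if shutil.which(executable) is None:
        return None
    # Assumption: one SMILES string per line is accepted on standard input.
    completed = subprocess.run(
        [executable, "-c", config],
        input="\n".join(smiles) + "\n",
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout


result = check_smiles(["c1ccccc1", "CC(=O)O"], "valence")
print(result if result is not None else "structurecheck not found on PATH")
```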


Output

Structurecheck's output contains the file(s) of the checked/fixed molecules and optionally a report of the results. The molecules are written to the output file(s). The format of the output file(s) can be specified by the -f or --format option (default format is: "smiles"). The type of output is defined by the -t or --output-type parameter. The possible values of the output type are the following:

  • single (default): all molecules are written to the file defined by the --output parameter. If the --output parameter is omitted, the result is written to the standard output (console). (The --discarded parameter is ignored in this case.)
  • separated: valid and invalid molecules are written to two different files. The --output parameter defines the output file of molecules with valid structures, and the --discarded parameter defines the output file of molecules with invalid structures (or, in fix mode, those that cannot be fixed automatically).
    • If the --discarded parameter is omitted, molecules with invalid structures are written to the standard output;
    • If the --output parameter is omitted, molecules with valid structures are written to the standard output.
    Note: At least one of the --output and --discarded parameters must be specified. If neither is defined, the program stops.
  • accepted: only molecules with valid structures are written to the file defined by the --output parameter. If the --output parameter is omitted, molecules with valid structures are written to the standard output. (The --discarded parameter is ignored in this case.)
  • discarded: only molecules with invalid structures are written to the file defined by the --discarded parameter. If the --discarded parameter is omitted, molecules with invalid structures are written to the standard output. (The --output parameter is ignored in this case.)

The report of the structure check can be written either to a separate file, defined by the --report-file parameter, or to the output file(s) as additional molecule property. The name of the property can be defined by the --report-property parameter.

Note: Not all molecules with structure errors are discarded. In fix mode, only molecules with errors that cannot be fixed automatically are discarded.
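
The routing rules for the four output types can be summarized as a small decision function. This Python sketch is a model of the behavior described in this section, not code from Structure Checker; the name route and the "<stdout>" placeholder are hypothetical, and it assumes that with the discarded type and no --discarded path it is the invalid molecules that go to the standard output.

```python
STDOUT = "<stdout>"  # placeholder for the standard output (console)


def route(output_type="single", output=None, discarded=None):
    """Return where accepted and discarded molecules are written.

    A destination of None means that group is not written at all.
    """
    if output_type == "single":
        # Everything goes to one place; --discarded is ignored.
        dest = output or STDOUT
        return {"accepted": dest, "discarded": dest}
    if output_type == "separated":
        if output is None and discarded is None:
            raise ValueError("separated needs --output or --discarded")
        return {"accepted": output or STDOUT,
                "discarded": discarded or STDOUT}
    if output_type == "accepted":
        # --discarded is ignored.
        return {"accepted": output or STDOUT, "discarded": None}
    if output_type == "discarded":
        # --output is ignored.
        return {"accepted": None, "discarded": discarded or STDOUT}
    raise ValueError(f"unknown output type: {output_type!r}")


print(route("separated", discarded="discarded.sdf"))
```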


Usage examples

Below you can find the short descriptions of some examples.

  1. structurecheck -c "metallocene"

    Executes a check with configuration metallocene on the molecule(s) defined in the standard input, and writes the result to the standard output (console);

  2. structurecheck -c "bondLength" in.sdf

    Executes a check with configuration bondLength on the molecule(s) defined in the in.sdf file, and writes the result to the standard output (console);

  3. structurecheck -c "isotope->converttoelementalform" in.sdf

    Executes a check with configuration isotope->converttoelementalform on the molecule(s) defined in the in.sdf file, and writes the result to the standard output (console);

  4. structurecheck -c "aromaticity..valence" -m fix -f sdf -o out.sdf in.sdf

    Executes a fix with configuration aromaticity and valence on the molecule(s) defined in the in.sdf file, and writes the molecules with valid structures (including automatically fixed molecules) in sdf format to the out.sdf output file;

  5. structurecheck -c config.xml -t separated -o out.sdf -d discarded.sdf

    Executes a check with configuration contained by the config.xml, and writes the molecules with valid structures to out.sdf, and writes the molecules with invalid structures to discarded.sdf.
    Note (warning): The format of both outputs is SMILES, because --format (-f) is not specified;

  6. structurecheck -c config.xml -m fix -t separated -d discarded.sdf

    Executes a fix with configuration contained by the config.xml, and writes the molecules with invalid structures to discarded.sdf, and writes molecules with valid structures to the standard output (console);

  7. structurecheck -c config.xml -m fix -t discarded in.sdf

    Executes a fix with the configuration contained in config.xml on the molecule(s) defined in the in.sdf file, writes the molecules with invalid structures to the standard output (console) because no --discarded path is given, and omits molecules with valid structures.

Copyright © 1999-2010 ChemAxon Ltd. All rights reserved.