Stanford University Libraries

Chemical Literature (Chem 184/284)
University of California at Santa Barbara

Lecture 16: Specialized Techniques and Files of Interest to Chemists

Searching for Special Substances in the Registry File

  • Certain classes of compounds pose special challenges for Registry File searching — they don't have well-defined stoichiometry, or may be defined in unusual ways, or are too large for standard structure searching.
  • CAS has implemented special techniques for some of these classes.

Alloys in the Registry File

  • Since alloys are usually described by percentages rather than conventional molecular formulas, CAS has created special displays and search fields to accommodate them.
  • Note that CAS did not assign Registry Numbers to alloys before 1972.
  • Alloy Record Example
    RN      39391-70-3
    CN      Nickel alloy, base, Ni 75, Cr 20, Ti
                    2.5, Al 1.5, Y2O3 1.3 (IN 853) (9CI)
    CN      IN 853
    CN      Inconel MA753
    CN      MA753
    DR      58719-29-2, 54259-74-4
    MF      Al . Cr . Ni . O3 Y2 . Ti
    CI      AYS
    LC      CA, IFICDB, IFIPAT, IFIUDB
    STE     8:AY, IN*853
    Component               Component               Component
                            Percent                 Registry #
    ===============================================
    Ni                      75                      7440-02-0
    Cr                      20                      7440-47-3
    Ti                       2.5                    7440-32-6
    Al                       1.5                    7429-90-5
    Y2O3                     1.3                    1214-36-9
    
  • Alloys may have ranges instead of fixed percentages of composition.
  • Alloy Search Fields
    • Material Composition Field (/mac)
      Range searchable, numeric field for percent composition. Examples
      => s (iron 70-74 and cr>10)/mac
       
    • Relative Composition Field (/rc)
      Allows searching by component in decreasing order of percentage.
      => s fe.cr.ni.?/rc
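
To make the two alloy fields concrete, here is a minimal Python sketch that mimics their matching logic locally, using the Inconel MA753 record shown above. It is only an illustration, not how STN implements /mac or /rc. The function names `matches_mac` and `relative_composition` are hypothetical, and the dotted form of the ordered term is an assumption based on the search example.

```python
# Component percentages copied from the Inconel MA753 record above.
ALLOY = {"Ni": 75.0, "Cr": 20.0, "Ti": 2.5, "Al": 1.5, "Y2O3": 1.3}


def matches_mac(alloy, constraints):
    """Mimic a /mac-style range search (hypothetical helper).

    `constraints` maps a component symbol to an inclusive (low, high) percent
    range; every listed component must be present and fall within its range.
    """
    for component, (low, high) in constraints.items():
        percent = alloy.get(component)
        if percent is None or not (low <= percent <= high):
            return False
    return True


def relative_composition(alloy):
    """Mimic an /rc-style term: components in decreasing order of percentage."""
    ordered = sorted(alloy, key=lambda c: alloy[c], reverse=True)
    return ".".join(ordered)


# A nickel 70-80 and chromium 10-100 query, expressed as inclusive ranges.
print(matches_mac(ALLOY, {"Ni": (70, 80), "Cr": (10, 100)}))  # True
# An iron-based query does not match this nickel-based alloy.
print(matches_mac(ALLOY, {"Fe": (70, 74)}))                   # False
print(relative_composition(ALLOY))                            # Ni.Cr.Ti.Al.Y2O3
```

The ordered string shows why a truncated query such as fe.cr.ni.? retrieves only alloys whose three largest components are iron, chromium and nickel, in that order.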
      

Nucleic Acid and Peptide Sequences

  • All registered sequences of four or more nucleotides or amino acid residues are sequence searchable.
  • Amino acids may be written as either one-letter (ACDEF) or three-letter ('Ala-Cys-Asp-Glu-Phe') codes.
  • Nucleic acids use the familiar one-letter codes (A,C,G,T,U).
  • You may search for either exact sequence matches (/sqep, /sqen) or subsequence matches (/sqsp, /sqsn).
  • You may specify gaps, wildcards, and repeated sequences, or whether a fragment must be located at the beginning or end of a chain.
  • For more information on biosequence searching, see Biosequences: You Need Them! STN Has Them!, a workshop in Adobe PDF format at http://www.cas.org/training/present97/bioseq97.pdf
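
The difference between exact and subsequence matching, and the equivalence of one-letter and three-letter amino acid codes, can be sketched in a few lines of Python. This is a local illustration under simplifying assumptions (no gaps, wildcards, or repeats), not the STN search engine. The function names are hypothetical.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(sequence):
    """Convert 'Ala-Cys-Asp' style input to 'ACD'.

    Input without hyphens is assumed to be in one-letter form already.
    """
    if "-" not in sequence:
        return sequence.upper()
    return "".join(THREE_TO_ONE[code.capitalize()] for code in sequence.split("-"))


def exact_match(query, registered):
    """Analogue of an exact sequence search: the whole chain must match."""
    return to_one_letter(query) == to_one_letter(registered)


def subsequence_match(query, registered):
    """Analogue of a subsequence search: the query may occur anywhere in the chain."""
    return to_one_letter(query) in to_one_letter(registered)


print(to_one_letter("Ala-Cys-Asp-Glu-Phe"))         # ACDEF
print(exact_match("ACDEF", "Ala-Cys-Asp-Glu-Phe"))  # True
print(exact_match("CDE", "ACDEF"))                  # False
print(subsequence_match("CDE", "ACDEF"))            # True
```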

Polymer Class Searching

  • Chemists interested in polymers frequently want to search for information on broad classes of polymers, such as polyesters.
  • However, the structural features that define most polymer classes are so widespread that they cannot be structure searched in the Registry File.
  • To deal with this problem, CAS has added polymer class terms (/pct) to the Registry File records for polymers.
  • There is a wide range of polymer class terms, including vinyl polymers, polyacetylenes, polyesters, phenol resins, etc.
  • You may add the term “formed” to a search to specify that a feature is formed in the reaction of the monomers.
  • A list of the polymer class terms may be scanned by doing an EXPAND in the PCT field in the Registry file, but it is best to consult the printed search aid “REGISTRY Polymer Class Terms” for definitions of each class if you intend to use them extensively.

Other Files in the CAS Family:

Variations on the CA Theme
  • The CA File is searchable with three different pricing options:
                      CA        HCA       ZCA
    Connect hour      $32       $161      free
    Search terms      $1.25     free      $1.65
    Display costs
      bib             $0.72 for all files
      all             $2.00 for all files
  • Which version is more economical depends on the type of search you are doing.
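
A rough way to see which file is cheapest for a given search is to add up the charges. The sketch below uses the rates quoted in these notes (connect hour: CA $32, HCA $161, ZCA free; search term: CA $1.25, HCA free, ZCA $1.65; display charges identical for all three), which may not reflect current pricing. The function names `session_cost` and `cheapest` are hypothetical.

```python
# Rates as quoted in these notes (US dollars); current prices may differ.
RATES = {
    "CA":  {"connect_hour": 32.00,  "search_term": 1.25},
    "HCA": {"connect_hour": 161.00, "search_term": 0.00},
    "ZCA": {"connect_hour": 0.00,   "search_term": 1.65},
}
DISPLAY = {"bib": 0.72, "all": 2.00}  # same charge in all three files


def session_cost(file_name, hours, terms, displays=0, display_format="bib"):
    """Estimate one session: connect time + search terms + displayed records."""
    rate = RATES[file_name]
    total = (hours * rate["connect_hour"]
             + terms * rate["search_term"]
             + displays * DISPLAY[display_format])
    return round(total, 2)


def cheapest(hours, terms):
    """Return the file with the lowest estimate (display charges cancel out)."""
    return min(RATES, key=lambda name: session_cost(name, hours, terms))


# Half an hour online with 10 search terms: no connect charge favors ZCA.
print(cheapest(hours=0.5, terms=10))   # ZCA
# Six minutes online with 40 search terms: free search terms favor HCA.
print(cheapest(hours=0.1, terms=40))   # HCA
# One hour online with 90 search terms: CA falls between the two extremes.
print(cheapest(hours=1.0, terms=90))   # CA
```

In short, long sessions with few search terms favor ZCA, short sessions with many search terms favor HCA, and CA wins only in a narrow middle range.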
Something Old — CAOLD
  • The CAOLD file (and its counterpart HCAOLD) is an attempt to retroactively provide electronic indexing for pre-1967 documents.
  • For 1957–67 the CAOLD file is searchable by CAS Registry Numbers or CAS Abstract numbers, and only gives the CAS Abstract Number, Registry Numbers associated with the document, and whether or not it is a patent.
  • For 1907–67, you may use the CAS Abstract Number to retrieve a TIFF page image of the printed page. This can be useful if you have a reference to an old CA listing (you may find them in Merck Index, Beilstein, Kirk-Othmer among others) and don’t have the printed volumes readily at hand.
  • It is hoped that this is a first step toward the full electronic searchable version of the 1907–67 abstracts.
Something Newer Yet — the CAplus file
  • CAplus (along with HCAplus and ZCAplus) is the newest addition to the CA family. It combines the CA file with records that have not yet received full indexing, and is updated daily with bibliographic information.
  • Additionally, it has cover-to-cover indexing of about 1,300 key journals (including editorials, letters, book reviews…)
  • CAplus costs 5% more than the CA file and there is no academic discount available.
CASREACT — the CA reactions database
  • CASREACT has detailed reactions indexing — every reactant, reagent and product — for the Organic Chemistry sections of CA, back to 1985.
  • Reactions are structure searchable, and you can label the atoms and bonds involved or uninvolved in the reactions.
  • You may also search functional group terms in reactant and/or product.
  • Other reaction searchable files on STN include: CHEMREACT, CHEMINFORMRRX, DJSMonline
MARPAT, MARPATpreviews
  • MARPAT provides structure searchable records for patents with Markush structures described. MARPATpreviews contains records which have not yet received full subject indexing.
  • Records found are the CA records for the patents, with Markush structure diagrams displayable.

Chemical Data Files

  • STN carries a number of files with numeric and textual chemical data information. In these files you can search both by compound and by data.
    => s 65-75/mp and 240-250/bp
    
  • Messenger has a SET UNITS command which allows you to select SI, metric or English units for either search or display. Units may be searched with tolerance ranges.
  • Most of these files have CAS Registry Numbers, so they will be listed in the Registry Record of a substance, and may be easily searched by Registry crossover.
  • General Chemical Data
    • BEILSTEIN (structure searchable)
    • GMELIN (structure searchable)
    • HODOC
    • MERCK
  • Crystal Structure Data
    • ICSD — Inorganic Crystal Structure Database
  • Thermodynamic Data
    • CHEMSAFE (flammability data)
    • DETHERM
    • DIPPR
    • TRCTHERMO
  • Toxicity Data
    • HSDB
    • MSDS files
    • RTECS
    • MERCK
SPECINFO — Spectra Online
  • SPECINFO is a spectral data file, containing 13C, 17O, 19F, 31P and 15N NMR spectra, as well as some IR and mass spectra.
  • The file is searchable by name, RN, numeric data and chemical structure and has programs for calculating spectra.
  • Peak tables and actual spectra are displayable (the latter with STN Express).
Can I buy it instead of making it? — CSCHEM and CHEMCATS
  • The online version of ChemSources, CSCHEM, combines ChemSources USA and ChemSources International.
  • Searchable by name and Registry Number; structures displayable.
  • Look in the LC field in the Registry Record for a compound; if CSCHEM is listed, you can buy it! CSCORP has address, etc. information.
  • CHEMCATS contains the chemical catalogs of specific manufacturers, including Aldrich, Sigma, Fluka, etc.
  • Searchable like CSCHEM, it includes any chemical data from the catalogs, plus price information.
  • CHEMCATS is also listed in the LC field of the Registry File for easy crossover.
Patent Files
  • STN has a variety of patent index files, some devoted especially to particular countries, some with worldwide coverage.
  • Of particular interest:
    • WPIndex (World Patents Index): Broadest of the files. Noted for descriptive titles, abstracts.
    • IFIPAT (aka CLAIMS): U.S. Patents, detailed indexing, including CAS Registry Numbers.
    • USPATFULL: Full-text US patents, with crossover links to equivalent CAS patent listings; CA subject indexing and RNs are added for chemical patents.
    • EUROPATFULL: Full-text European Patent Office Patents.
    • Other databases for specific patent offices: PATDD, PATDPA, PATOSDE (Germany), JAPIO (Japan), PATOSEP (European Patent Office), PATOSWO (World Intellectual Property Organization)
    • DPCI (Derwent Patent Citation Index): Allows tracing by citations in patents, both by inventors and examiners, both patents and other documents.
    • DGENE (Derwent Geneseq): Sequence searchable file of protein and nucleic acid sequences from patents.
Science Citation Index Online
  • The online version of Science Citation Index (SciSearch) on STN has a number of advantages over its print and CD counterparts, though the Web of Science version is catching up.
  • Abstracts and author's keywords are available for recent years.
  • Database back to 1974 is searchable.
  • You can do the equivalent of a Related Record search over the whole database by using the SELECT command.
  • SELECT pulls information from records and turns it into a search query.
  • So you can take a record, SELECT RE (cited reference), search those references, then SORT OCC (sort by occurrence) to get a Related Record set.
  • SELECT CIT lets you create a cited reference search term from an article record.
  • You can search a topic, then find all the papers that cited that initial set.
  • This lets you search an author, then find who has cited her without worrying about the “first author” problem.
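
The Related Record workflow described above (SELECT RE, search the extracted references, then SORT OCC) amounts to ranking other papers by how many cited references they share with the starting paper. Here is a minimal Python sketch of that idea on made-up data. The record names and the `related_records` function are hypothetical, and this is a model of the logic rather than of the STN commands themselves.

```python
from collections import Counter

# Toy citation data (hypothetical): each record lists the references it cites.
CITED_REFS = {
    "paper_A": {"ref1", "ref2", "ref3"},
    "paper_B": {"ref1", "ref2", "ref9"},
    "paper_C": {"ref3", "ref7"},
    "paper_D": {"ref8"},
}


def related_records(start, cited_refs):
    """Rank other records by how many cited references they share with `start`.

    Step 1 (like SELECT RE): take the starting record's cited references.
    Step 2 (like searching those terms): find records citing any of them.
    Step 3 (like SORT OCC): order the hits by number of shared references.
    """
    selected = cited_refs[start]
    occurrences = Counter()
    for record, refs in cited_refs.items():
        if record == start:
            continue
        shared = len(selected & refs)
        if shared:
            occurrences[record] = shared
    return occurrences.most_common()


print(related_records("paper_A", CITED_REFS))
# [('paper_B', 2), ('paper_C', 1)]
```

Records that share no references with the starting paper (paper_D here) drop out entirely, which is why the top of the sorted list tends to be closely related work.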
Other Files Useful to Chemists
  • Besides most of the files available through the MELVYL system, or UCSB’s Web subscriptions (MEDLINE, INSPEC, COMPENDEX, NTIS, Dissertation Abstracts, the Cambridge Scientific Abstracts databases), there are a few additional bibliographic databases of particular use to chemists:
  • ANABSTR — Analytical Abstracts (see the Important Indexes and Abstracts in Science and Engineering list for details.)
  • NAPRALERT — Natural Products Database: A database of natural products, comprehensive from 1975 to date, selective from 1650 to 1975. Searchable by Registry Number, chemical name, organism.

This page created by Chuck Huber (huber@library.ucsb.edu).