Abstract
This chapter covers the prior information required to set up and test identification hypotheses. By type, this information is divided into meaning-related data and statistical data. Knowledge of the origin, properties, and use of chemical compounds is essential for proposing and rejecting candidate compounds for identification. Prior information about the samples analyzed is important for gathering full evidence of the trueness of an identification result. The plausibility of qualitative analytical results is also taken into account to confirm the conclusions drawn by analysts. Much of this data is extracted from the chemical databases outlined in this chapter. These data sources are also used to calculate statistical rates of occurrence and co-occurrence of chemical compounds in the literature. The occurrence rate is a direct measure of the abundance of a chemical compound and, in turn, of the likelihood that it is present in the samples to be analyzed. Rare compounds are filtered out by means of this rate and excluded from further consideration for identification purposes. Statistical data show that most known compounds are rare. Facts and rates of the co-occurrence of chemical compounds in the literature make it possible to predict a priori which groups of compounds are likely to be present together in the same sample. Different methods of estimating these rates are described, and examples of their use for identification are given.
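The idea of occurrence and co-occurrence rates can be illustrated with a minimal sketch. It is not the estimation procedure described in the chapter: the toy records, the function names, and the threshold of 0.5 are all hypothetical, and each record simply stands for the set of compounds mentioned in one literature entry.

```python
from collections import Counter
from itertools import combinations


def occurrence_rates(records):
    """Fraction of records that mention each compound (illustrative definition)."""
    counts = Counter(c for rec in records for c in set(rec))
    n = len(records)
    return {c: k / n for c, k in counts.items()}


def cooccurrence_counts(records):
    """Number of records in which each unordered pair of compounds appears together."""
    counts = Counter()
    for rec in records:
        for a, b in combinations(sorted(set(rec)), 2):
            counts[(a, b)] += 1
    return counts


def filter_rare(candidates, rates, threshold):
    """Keep only candidates whose occurrence rate reaches the (assumed) threshold."""
    return [c for c in candidates if rates.get(c, 0.0) >= threshold]


# Hypothetical toy data: each set is the compounds mentioned in one record.
records = [
    {"benzene", "toluene"},
    {"benzene", "toluene", "naphthalene"},
    {"benzene"},
    {"naphthalene", "anthracene"},
]

rates = occurrence_rates(records)
pairs = cooccurrence_counts(records)
kept = filter_rare(["benzene", "anthracene", "coronene"], rates, threshold=0.5)
```

In this toy data set, `kept` retains only the frequently mentioned candidate, and `pairs` indicates which compounds tend to be reported together, which is the kind of a priori hint the abstract refers to.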
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Milman, B.L. (2011). Prior Data for Non-target Identification. In: Chemical Identification and its Quality Assurance. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15361-7_6
DOI: https://doi.org/10.1007/978-3-642-15361-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15360-0
Online ISBN: 978-3-642-15361-7
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)