Prior Data for Non-target Identification

Milman, Boris L.

doi:10.1007/978-3-642-15361-7_6

Abstract

This chapter is devoted to the prior information required to set up and test identification hypotheses. By type, the relevant information is divided into semantic (meaning) data and statistical data. Knowledge of the origin, properties, and use of chemical compounds is essential for proposing and rejecting candidate compounds for identification. Prior information about the samples analyzed is important for gathering full evidence of the trueness of an identification result. The plausibility of qualitative analytical results is also taken into account to confirm the conclusions drawn by analysts. Much of this information is extracted from the chemical databases outlined in this chapter. These data sources are also used to calculate statistical rates of occurrence and co-occurrence of chemical compounds in the literature. The occurrence rate is a direct measure of the abundance of a chemical compound and of the related likelihood that it is present in the samples to be analyzed. Rare compounds are filtered out by means of this rate and excluded from further consideration for identification purposes. Most known compounds are rare, as the corresponding statistical data show. Facts and rates of the co-occurrence of chemical compounds in the literature make it possible to predict a priori a group of compounds present in the same analyzed samples. Different methods of estimating these rates are described, and examples of their use for identification are given.
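As a rough illustration of the occurrence and co-occurrence idea described in the abstract, the following minimal Python sketch counts how often compounds are mentioned, alone and together, in a set of literature records. It is not the chapter's method: the function names, the toy records, the placeholder name `compound_x`, and the `min_count` threshold are all assumptions made for illustration, and raw counts stand in for whatever rate estimates the chapter actually uses.

```python
from collections import Counter
from itertools import combinations


def occurrence_counts(records):
    """Count the number of records that mention each compound."""
    counts = Counter()
    for compounds in records:
        counts.update(set(compounds))  # count each compound once per record
    return counts


def cooccurrence_counts(records):
    """Count the number of records that mention each pair of compounds."""
    counts = Counter()
    for compounds in records:
        for pair in combinations(sorted(set(compounds)), 2):
            counts[pair] += 1
    return counts


def filter_rare(candidates, counts, min_count):
    """Keep only candidates mentioned in at least min_count records."""
    return [c for c in candidates if counts[c] >= min_count]


def likely_companions(compound, pair_counts):
    """List compounds co-mentioned with `compound`, most frequent first."""
    companions = {}
    for (a, b), n in pair_counts.items():
        if a == compound:
            companions[b] = n
        elif b == compound:
            companions[a] = n
    return sorted(companions.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy records: each list holds the compounds named in one document.
records = [
    ["naphthalene", "anthracene", "pyrene"],
    ["naphthalene", "anthracene"],
    ["naphthalene", "benzene"],
    ["compound_x"],
]

occ = occurrence_counts(records)
pairs = cooccurrence_counts(records)

# Drop candidates that occur in fewer than two records (assumed threshold).
kept = filter_rare(["naphthalene", "anthracene", "compound_x"], occ, min_count=2)

# Compounds that might accompany naphthalene in a sample, by co-mention count.
companions = likely_companions("naphthalene", pairs)
```

In this sketch a rarely mentioned candidate is removed before any spectral or chromatographic comparison, and the companion list suggests which further compounds to look for once one compound has been identified. Real literature databases would need name normalization and far larger record sets.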

Author information

Authors and Affiliations

D.I. Mendeleyev Inst. for Metrology (VNIIM) and Cent. for Ecol. Saf. of Russ. Acad. of Sciences, 65, 9 Morskaya nab, 199226, St. Petersburg, Russia
Boris L. Milman

About this chapter

Cite this chapter

Milman, B.L. (2011). Prior Data for Non-target Identification. In: Chemical Identification and its Quality Assurance. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15361-7_6

DOI: https://doi.org/10.1007/978-3-642-15361-7_6
Published: 15 October 2010
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15360-0
Online ISBN: 978-3-642-15361-7
eBook Packages: Chemistry and Materials Science; Chemistry and Material Science (R0)
