Bioinformatics in MPM: Using Decision Trees To Predict a Second Tumor Site

  • Alberto Cavallo
  • Concetta Dodaro
Part of the Updates in Surgery book series (UPDATESSURG)

Abstract

The availability of large medical databases has made it possible to apply statistical methods designed for large data sets to medical problems. One of the largest such databases, which includes data on multiple primary malignancies (MPM), is that of the NCI’s Surveillance, Epidemiology and End Results (SEER) program [1]. SEER cases have been collected since 1973, and the program has been continually updated and upgraded in the years since. SEER is thus an appealing source for statistical investigations of MPM.

Keywords

Oral Cavity; Female Genital System; Male Genital System; Cervix Uterus; Standard Incidence Rate
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Overview of the SEER Program, available at http://seer.cancer.gov/about/
  2. Larose DT (2005) Discovering knowledge in data: an introduction to data mining. Wiley-Interscience, Hoboken, NJ
  3. Düntsch I, Gediga G (2000) Rough set data analysis: road to non-invasive knowledge discovery. Methoδos Publishers, Bangor, UK
  4. Zhang H, Liu D (2006) Fuzzy modeling and fuzzy control. Birkhäuser, Boston Basel Berlin
  5. Polkowski L (2001) Rough sets. Physica-Verlag, Heidelberg, New York
  6. Breiman L, Friedman JH, Olshen R, Stone CJ (1993) Classification and regression trees. Chapman & Hall, Boca Raton
  7. Motwani R, Raghavan P (1995) Randomized algorithms. Cambridge University Press, New York
  8. Tsumoto S (2004) Mining diagnostic rules from clinical databases using rough sets and medical diagnostic model. Inf Sci (Ny) 162:65–80
  9. Mugambi EM, Hunter A, Oatley G, Kennedy L (2004) Polynomial-fuzzy decision tree structures for classifying medical data. Knowledge-based Systems 17:81–87
  10. Incidence - SEER 9 Regs Limited-Use, Nov 2006 Sub (1973-2004) - Linked To County Attributes - Total U.S., 1969-2004 Counties, National Cancer Institute, DCCPS, Surveillance Research Program, Cancer Statistics Branch, released April 2007, based on the November 2006 submission
  11. Multiple Primary and Histology Coding Rules, January 01, 2007, National Cancer Institute Surveillance Epidemiology and End Results Program, Bethesda, MD. Available at http://seer.cancer.gov/tools/mphrules/2007_mphrules_manual_04302008.pdf
  12. Localized/Regional/Distant Stage Adjustments, documentation available on the web at http://seer.cancer.gov/seerstat/variables/seer/yr1973_2004/lrd_stage/
  13. Young JL Jr, Roffers SD, Ries LAG et al (eds) (2001) SEER Summary Staging Manual - 2000: Codes and Coding Instructions. National Cancer Institute, Bethesda, NIH Pub 01-4969
  14. Adjadj E, Rubino C, Shamsaldim A et al (2003) The risk of multiple primary breast and thyroid carcinomas: role of radiation dose. Cancer 98:1309–1317
  15. Curtis RE, Freedman DM, Ron E et al (eds) (2006) New malignancies among cancer survivors: SEER cancer registries, 1973-2000. National Cancer Institute, Bethesda, NIH Pub 05-5302
  16. Curtis RE, Ries LAG (2006) Methods. In: Curtis RE, Freedman DM, Ron E et al (eds) New malignancies among cancer survivors: SEER cancer registries, 1973-2000. National Cancer Institute, Bethesda, NIH Pub 05-5302, pp 9–14
  17. Neugut AI, Meadows AT, Robinson E (eds) (1999) Multiple primary cancers. Lippincott Williams & Wilkins, Philadelphia
  18. Begg CB (1999) Methodological and statistical considerations in the study of multiple primary cancers. In: Neugut AI, Meadows AT, Robinson E (eds) Multiple primary cancers. Lippincott Williams & Wilkins, Philadelphia, pp 13–26
  19. Nelles O (2001) Nonlinear system identification. Springer-Verlag, Berlin Heidelberg
  20. MATLAB with Statistics Toolbox, ver. 2007a, The Mathworks. Available at http://www.mathworks.com/products/statistics/
  21. Surveillance Research Program, National Cancer Institute SEER*Stat, version 6.3.6, available at: www.seer.cancer.gov/seerstat
  22. Kleinerman RA, Kosary C, Hildesheim A (2006) New malignancies following cancer of the cervix uteri, vagina, and vulva. In: Curtis RE, Freedman DM, Ron E et al (eds) New malignancies among cancer survivors: SEER cancer registries, 1973-2000. National Cancer Institute, Bethesda, NIH Pub 05-5302, pp 207–229

Copyright information

© Springer-Verlag Italia 2009

Authors and Affiliations

  • Alberto Cavallo (1)
  • Concetta Dodaro (2)
  1. Department of Engineering of Informations, Second University of Naples, Naples, Italy
  2. Surgical, Anesthesiology-rianimative and Emergency Sciences Department, Federico II University, Naples, Italy
