Domain Driven Tree Mining of Semi-structured Mental Health Information

Chapter in Data Mining for Business Applications

The World Health Organization predicted that depression would be the world's leading cause of disability by 2020, which calls for urgent intervention. Most mental illnesses are caused by a combination of genetic and environmental factors, and many different types of mental illness exist. Identifying the precise combination of genetic and environmental causes for each type is therefore crucial to preventing and effectively treating mental illness. Sophisticated data analysis techniques such as data mining can contribute greatly to identifying these patterns and can support prevention and intervention strategies. One complicating factor is that much of the information in this area is not in strictly structured form. In this paper, we demonstrate the application of tree mining algorithms to semi-structured mental health information. The extracted patterns can provide useful information to help prevent mental illness and to assist in the delivery of effective and efficient mental health services.
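
To make the idea of mining patterns from semi-structured records concrete, here is a minimal Python sketch. It is not the algorithm used in the chapter: it counts only root-to-node label paths, which are a very restricted kind of subtree pattern, and it uses transaction-based support (each record contributes at most once per pattern). The record layout, the tag names, the placeholder values such as gene_A and factor_X, and the helper names root_paths and frequent_paths are all assumptions made for illustration and do not represent real findings.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical toy records. Tags and values are placeholders, not real data.
RECORDS = [
    "<case><illness>type_1</illness><cause>"
    "<genetic>gene_A</genetic><environmental>factor_X</environmental>"
    "</cause></case>",
    "<case><illness>type_1</illness><cause>"
    "<genetic>gene_A</genetic><environmental>factor_Y</environmental>"
    "</cause></case>",
    "<case><illness>type_2</illness><cause>"
    "<genetic>gene_B</genetic><environmental>factor_X</environmental>"
    "</cause></case>",
]


def root_paths(xml_text):
    """Return the set of root-to-node label paths found in one record.

    Non-empty element text is appended to the path as a final label.
    """
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        paths.add(path)
        text = (node.text or "").strip()
        if text:
            paths.add(path + (text,))
        for child in node:
            stack.append((child, path + (child.tag,)))
    return paths


def frequent_paths(records, min_support):
    """Keep the paths that occur in at least min_support records."""
    counts = Counter()
    for record in records:
        # root_paths returns a set, so each record is counted once per path.
        counts.update(root_paths(record))
    return {path: n for path, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    for path, n in sorted(frequent_paths(RECORDS, min_support=2).items()):
        print("/".join(path), n)
```

Full frequent subtree miners generalise this idea from single paths to branching subtrees and use candidate generation to avoid enumerating every possible pattern; the sketch only shows how support is counted over tree-shaped records.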

Author information

Corresponding author

Correspondence to Maja Hadzic.

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Hadzic, M., Hadzic, F., Dillon, T.S. (2009). Domain Driven Tree Mining of Semi-structured Mental Health Information. In: Cao, L., Yu, P.S., Zhang, C., Zhang, H. (eds) Data Mining for Business Applications. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-79420-4_9

  • DOI: https://doi.org/10.1007/978-0-387-79420-4_9

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-79419-8

  • Online ISBN: 978-0-387-79420-4

  • eBook Packages: Computer Science, Computer Science (R0)
