Geospatial Data Mining and Knowledge Discovery

Part of the book series: Geotechnologies and the Environment (GEOTECH, volume 12)

Abstract

This chapter surveys three emerging issues in geospatial data mining: the need to extend patient privacy protections beyond HIPAA, the use of geospatial visualization and data mining algorithms in medical geographic research, and the growth of geospatial data mining applications in public health. Geospatial data mining is the process of discovering interesting patterns in large and disparate geographic datasets so that the resulting information is meaningful and useful to decision-makers. It relies on geostatistical algorithms for prediction, for classification, and for finding patterns in the data such as associations, clusters, and subgroups. Public health is a knowledge-intensive domain, so a major challenge for the discipline is extracting knowledge from its growing volume of data. Most health care applications are data-intensive and involve sophisticated data mining techniques. Because health is a geographical phenomenon, geospatial technologies play an important role in strengthening epidemiological surveillance, information management, and analysis.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Blatt, A.J. (2015). Geospatial Data Mining and Knowledge Discovery. In: Health, Science, and Place. Geotechnologies and the Environment, vol 12. Springer, Cham. https://doi.org/10.1007/978-3-319-12003-4_7
