Future trends in data mining

Data Mining and Knowledge Discovery

Abstract

Over recent years, data mining has established itself as one of the major disciplines in computer science, with growing industrial impact. Undoubtedly, research in data mining will continue and even increase over the coming decades. In this article, we sketch our vision of the future of data mining. Starting from the classic definition of “data mining”, we elaborate on topics that, in our opinion, will set trends in data mining.

Author information

Correspondence to Hans-Peter Kriegel.

Additional information

Responsible editor: Geoffrey Webb

About this article

Cite this article

Kriegel, HP., Borgwardt, K.M., Kröger, P. et al. Future trends in data mining. Data Min Knowl Disc 15, 87–97 (2007). https://doi.org/10.1007/s10618-007-0067-9

  • DOI: https://doi.org/10.1007/s10618-007-0067-9
