Introduction: What Is Core

  • Boris Mirkin
Part of the Undergraduate Topics in Computer Science book series (UTICS)


This introductory chapter has three parts. (i) The goals of data analysis, as a tool for enhancing and augmenting knowledge of the domain, are outlined. Since knowledge is represented by concepts and statements of relation between them, the two main pathways for data analysis are summarization, for developing and augmenting concepts, and correlation, for enhancing and establishing relations. (ii) A set of seven cases involving small datasets and related data analysis problems is presented. The datasets are taken from various fields, such as monitoring market towns, computer security protocols, bioinformatics, and cognitive psychology. (iii) An overview of data visualization, its goals, and some techniques is given.


Mathematical Structure Protein Amino Acid Sequence Market Town Produce Classification Rule Computational Data Analysis 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Boris Mirkin (1, 2)
  1. Research University – Higher School of Economics, School of Applied Mathematics and Informatics, Moscow, Russia
  2. Department of Computer Science, Birkbeck University of London, London, UK
