Part of the book series: Data-Centric Systems and Applications (DCSA)

Abstract

How one summarizes a book depends on personal preference. Formal criteria for identifying the most important findings make a summary more objective. In Chap. 8, we introduced text mining, in particular analysis methods based on the term document matrix for detecting structure in text data. It therefore seemed natural to use text mining as the starting point for a summary. Specifically, we used the following procedure to obtain an overview of the contents of the individual chapters:

  • Definition of a corpus containing the eight chapters of the book.

  • Cleaning the documents in the standard way by removal of stop words, punctuation, and numbers.

  • Definition of two document term matrices: one with words and the other with words and bigrams. For both matrices, some stemming was done, mainly to reduce plural forms to the singular. Furthermore, some additional stop words were removed, mainly words occurring in the context of the examples in Chap. 8.

  • For both matrices, term frequency-inverse document frequencies (TF-IDF) were calculated.

  • Definition of comparison clouds, each based on 60 terms.

  • Calculation of topic maps of order 2–8 for the term document matrices with stemmed terms.
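
The first four steps of this procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop word list, the crude plural stripping, the function names, and the three toy "chapters" are all hypothetical, and the TF-IDF weighting shown (relative term frequency times the natural logarithm of N/df) is only one common variant. Comparison clouds and topic maps are not covered here.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list (assumption for illustration).
STOP_WORDS = {"the", "a", "of", "and", "in", "is", "to", "for"}


def tokenize(text):
    """Lowercase, keep alphabetic words only (drops punctuation and numbers),
    remove stop words, and strip a trailing plural 's' as crude stemming."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    return [
        w[:-1] if len(w) > 3 and w.endswith("s") and not w.endswith("ss") else w
        for w in words
    ]


def add_bigrams(tokens):
    """Return the tokens followed by all bigrams of adjacent tokens."""
    return tokens + [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]


def tf_idf(docs):
    """docs maps a document name to its token list.
    Returns a mapping name -> {term: tf-idf weight}."""
    n_docs = len(docs)
    counts = {name: Counter(tokens) for name, tokens in docs.items()}
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())  # each document counts once per term
    weights = {}
    for name, term_counts in counts.items():
        total = sum(term_counts.values())
        weights[name] = {
            term: (k / total) * math.log(n_docs / doc_freq[term])
            for term, k in term_counts.items()
        }
    return weights


def top_terms(term_weights, k):
    """The k highest-weighted terms, ties broken alphabetically."""
    return sorted(term_weights, key=lambda t: (-term_weights[t], t))[:k]


# Toy corpus standing in for the chapters (made-up text, not from the book).
corpus = {
    "chap_a": "Data models and data warehouses for reporting.",
    "chap_b": "Process models and process mining of 3 event logs.",
    "chap_c": "Text mining of documents and the term document matrix.",
}
word_docs = {name: tokenize(text) for name, text in corpus.items()}
bigram_docs = {name: add_bigrams(tokens) for name, tokens in word_docs.items()}
word_weights = tf_idf(word_docs)
bigram_weights = tf_idf(bigram_docs)

for name in corpus:
    print(name, top_terms(word_weights[name], 3))
```

With this weighting, a term that occurs in every document receives weight zero, so the highest-ranked terms per document are those that set it apart from the others. That is the property that makes TF-IDF a reasonable basis for selecting chapter-specific terms.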

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Grossmann, W., Rinderle-Ma, S. (2015). Summary. In: Fundamentals of Business Intelligence. Data-Centric Systems and Applications. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46531-8_9

  • DOI: https://doi.org/10.1007/978-3-662-46531-8_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-46530-1

  • Online ISBN: 978-3-662-46531-8

  • eBook Packages: Computer Science, Computer Science (R0)