In this chapter, we expand our discussion to topics that go beyond information retrieval (IR) but still require the processing of text. The techniques involved serve purposes other than the retrieval of documents, yet they depend on sound basic IR techniques to succeed. The context of these topics can be gleaned from Fig. 1.6, where we move down the funnel from trying to find probably relevant information to finding definitely relevant information and turning it into actionable knowledge. The four topics we will explore are information extraction and text mining, question-answering, text categorization, and document summarization.
9.1 Information Extraction and Text Mining
As we have seen in the first eight chapters of this book, the general goal of IR systems is the retrieval of documents from textual databases, which will then be read and applied to the task for which they were retrieved, such as a search for more information on a disease by a clinician or a patient...