Abstract
So far, the attribute support has been assumed to fit into main memory. This assumption may no longer hold for large dense databases (e.g., census data or corpora of text documents), which may contain attributes with very large support (e.g., the sex attribute in census data). As suggested in [SON95], this difficulty can be addressed by partitioning the database, in addition to partitioning the search space as was done in Chapter 2. The algorithms described in this chapter are not proposed as alternatives to those presented in Chapters 2, 4, and 5; they should instead be regarded as extensions. Data partitioning offers a new source of parallelism that complements search space partition-based parallelism. It can be used within Procedure 2.6.2.2, which computes the starting sets, as well as within Procedure 2.6.3.1, which performs full enumeration. For simplicity, we confine ourselves to developing the new algorithms on top of the basic enumeration procedures of Chapter 2 (Procedures 2.5.1, 2.6.2.2, and 2.6.3.1). Adapting these algorithms to work with the extensions developed in Chapters 4 and 5 is straightforward. This chapter contains two sections: Section 6.1 presents a probabilistic method for data partitioning, and Section 6.2 describes data partition-based algorithms for rule mining.
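The idea of partitioning the database can be illustrated with a minimal sketch. It is not one of the chapter's procedures: the function names (`partition`, `tid_sets`, `itemset_count`), the contiguous chunking scheme, and the representation of an attribute's support as a set of transaction ids are all assumptions made for illustration. The sketch only shows that the number of transactions containing an itemset can be accumulated chunk by chunk, so that only one chunk's attribute supports need to be held in memory at a time.

```python
from collections import defaultdict


def partition(transactions, n_parts):
    """Split a list of transactions into at most n_parts contiguous chunks (hypothetical scheme)."""
    size = max(1, -(-len(transactions) // n_parts))  # ceiling division, at least 1
    return [transactions[i:i + size] for i in range(0, len(transactions), size)]


def tid_sets(chunk, offset):
    """Map each attribute to the set of ids of the transactions in this chunk that contain it."""
    support = defaultdict(set)
    for tid, items in enumerate(chunk, start=offset):
        for item in items:
            support[item].add(tid)
    return support


def itemset_count(transactions, itemset, n_parts):
    """Count transactions containing every attribute of itemset, one chunk at a time."""
    total = 0
    offset = 0
    for chunk in partition(transactions, n_parts):
        support = tid_sets(chunk, offset)  # only this chunk's supports are in memory
        sets = [support.get(item, set()) for item in itemset]
        # An empty itemset is contained in every transaction of the chunk.
        total += len(set.intersection(*sets)) if sets else len(chunk)
        offset += len(chunk)
    return total


transactions = [{"a", "b"}, {"a"}, {"a", "b", "c"}, {"b"}, {"a", "b"}]
counts = [itemset_count(transactions, {"a", "b"}, k) for k in (1, 2, 3)]
```

Because the chunks are disjoint, the per-chunk counts simply add up, so `counts` holds the same value whatever the number of chunks. Each chunk could also be processed by a separate worker, which is the kind of additional parallelism the abstract refers to.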
Copyright information
© 2001 Springer Science+Business Media New York
About this chapter
Cite this chapter
Adamo, JM. (2001). Data Partition-Based Rule Mining. In: Data Mining for Association Rules and Sequential Patterns. Springer, New York, NY. https://doi.org/10.1007/978-1-4613-0085-4_6
DOI: https://doi.org/10.1007/978-1-4613-0085-4_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-6511-5
Online ISBN: 978-1-4613-0085-4
eBook Packages: Springer Book Archive