Parallel Data Placement
Multiprocessor data placement
Parallel data placement refers to the physical placement of data in a multiprocessor computer so as to favor parallel data access and yield high performance. Most of the work on data placement has been done in the context of the shared-nothing architecture. Data placement in a parallel database system resembles data fragmentation in distributed databases, since fragmentation yields parallelism. However, there is no need to maximize local processing at each node, since users are not associated with particular nodes, and load balancing is much more difficult to achieve when the number of nodes is large.
The main solution for parallel data placement is a variation of horizontal fragmentation, called partitioning, which divides database relations into partitions, each stored at a different disk node. There are three basic partitioning strategies: round-robin, hashing, and interval. Furthermore, to improve...
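The three basic partitioning strategies named above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the method of any particular system: the function names, the use of CRC32 as the hash function, and the list of interval bounds are all hypothetical choices made for this example, and in-memory lists stand in for disk nodes.

```python
from bisect import bisect_right
from zlib import crc32


def round_robin_partition(tuples, n):
    """Assign the i-th tuple to partition i mod n (ignores tuple content)."""
    parts = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        parts[i % n].append(t)
    return parts


def hash_partition(tuples, n, key):
    """Assign each tuple by hashing its partitioning attribute.

    CRC32 is an assumed stand-in for the hash function; it is used here
    only because it is deterministic across runs.
    """
    parts = [[] for _ in range(n)]
    for t in tuples:
        h = crc32(repr(key(t)).encode("utf-8"))
        parts[h % n].append(t)
    return parts


def interval_partition(tuples, bounds, key):
    """Assign each tuple by the value interval of its partitioning attribute.

    `bounds` is a sorted list of split points (hypothetical values).
    Partition 0 holds keys below bounds[0], partition i holds keys in
    [bounds[i-1], bounds[i]), and the last partition holds the rest.
    """
    parts = [[] for _ in range(len(bounds) + 1)]
    for t in tuples:
        parts[bisect_right(bounds, key(t))].append(t)
    return parts


# Toy relation: (employee number, name).
employees = [(i, f"emp{i}") for i in range(10)]

rr = round_robin_partition(employees, 3)
hp = hash_partition(employees, 3, key=lambda t: t[0])
ip = interval_partition(employees, [3, 7], key=lambda t: t[0])
```

In this sketch, round-robin gives near-equal partition sizes regardless of the data, hashing sends all tuples with the same attribute value to the same partition, and interval partitioning keeps tuples with neighboring attribute values together.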