Inducing Load Balancing and Efficient Data Distribution Prior to Association Rule Discovery in a Parallel Environment

  • Anna M. Manning
  • John A. Keane
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1685)


Many association rule algorithms operate in a parallel environment where the database is divided up among a number of processors, a procedure which is usually carried out indiscriminately. The nature of the database partitioning can affect both the number of candidate sets produced and the workload at each processor. This paper demonstrates that Principal Component Analysis can be used successfully to help arrange the records of a database among processors so that efficient load balancing is enabled and candidate set duplication minimised.
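The abstract does not say how the principal component scores are turned into a partitioning, so the following is only a minimal sketch of one possible reading, not the authors' procedure. It assumes the database is a 0/1 transaction matrix, orders records by their score on the first principal component, and cuts that ordering into near-equal contiguous blocks, so that similar records tend to land on the same processor while record counts stay balanced. The function name `pca_partition` and the sample data are hypothetical.

```python
import numpy as np

def pca_partition(records, n_processors):
    """Assign each row of a 0/1 transaction matrix to a processor.

    Hypothetical scheme (an assumption, not taken from the paper):
    order records by their score on the first principal component,
    then cut the ordering into near-equal contiguous blocks.
    """
    X = np.asarray(records, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    order = np.argsort(scores, kind="stable")
    # array_split yields blocks whose sizes differ by at most one record.
    return [block.tolist() for block in np.array_split(order, n_processors)]

# Illustrative data: six transactions over four items.
transactions = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
]
partitions = pca_partition(transactions, 2)
```

Each entry of `partitions` is the list of record indices for one processor. Balancing by record count is the simplest proxy for workload; a fuller treatment would presumably weight records by the number of candidate sets they are likely to generate.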



Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Anna M. Manning (1)
  • John A. Keane (1)
  1. Department of Computation, UMIST, Manchester, UK
