BiAS: A Theme Metric to Model Mutual Association

  • Ramkishore Bhattacharyya
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8251)

Abstract

Identifying likeness between events is a fundamental requirement in machine learning and data mining. Events are usually grouped by their proximity in Euclidean space, their degree of similarity, or the extent of their linear dependence. Certain applications, however, such as keyword and document clustering, phylogenetic profiling, and feature selection, tend to yield better results when events are grouped by their mutual association. This paper presents a metric, the Bidirectional Association Similarity (BiAS), to quantify the degree of mutual association between a pair of events. We put forward a generalized formulation to compute BiAS and establish its unidirectional correspondence with the Jaccard and cosine similarities. The measure can be incorporated into clustering algorithms to group mutually associative events, adding precision to the discovered knowledge.
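
For orientation, the two baseline measures named in the abstract can be sketched for events represented as sets of occurrences. This is only an illustrative sketch under assumptions: the set representation, the function names, and the sample data are hypothetical. The BiAS formulation is not given in this abstract, so it is not reproduced here.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Jaccard index: |A ∩ B| / |A ∪ B| over the occurrence sets of two events."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a: set, b: set) -> float:
    """Cosine similarity of binary occurrence vectors: |A ∩ B| / sqrt(|A| * |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical data: IDs of the documents in which two keywords occur.
docs_x = {1, 2, 3, 4}
docs_y = {3, 4, 5, 6}

print(jaccard(docs_x, docs_y))  # 2 shared documents out of 6 distinct ones
print(cosine(docs_x, docs_y))   # 2 shared documents over sqrt(4 * 4)
```

Both measures are symmetric in their arguments, which may be why they serve as natural reference points for a measure of mutual association.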

Keywords

Bi-directional association similarity, BiAS, clustering, cosine similarity, Jaccard index, mutual association

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ramkishore Bhattacharyya
    • Microsoft Corporation, Redmond, USA
