Abstract
Partial rankings are totally ordered subsets of a set of items. For example, the sequence in which a user browses through different parts of a website is a partial ranking. We consider the following problem: given a set D of partial rankings, find items that have strongly different status in different parts of D. To do this, we first compute a clustering of D and then look for items whose average rank in a cluster deviates substantially from their average rank in D. Such items can be seen as those that contribute the most to the differences between the clusters. To test the statistical significance of the items found, we propose a method based on an MCMC algorithm for sampling random sets of partial rankings that have exactly the same statistics as D. We also demonstrate the method on movie rankings and gene expression data.
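The comparison of cluster-level and overall average ranks can be illustrated with a minimal sketch. The abstract does not say how the rank of an item within a partial ranking is defined, so the normalized position used here (0 for the first item, 1 for the last) is an assumption, as are the function names `average_ranks` and `rank_deviations`, the toy data, and the cluster labels, which are taken as given rather than computed. This is not the authors' implementation, and it leaves out the MCMC significance test.

```python
from collections import defaultdict


def average_ranks(rankings):
    """Mean normalized position of each item over the rankings that contain it.

    Assumption: the rank of an item is its index divided by (length - 1),
    so every partial ranking is mapped onto the interval [0, 1].
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for pos, item in enumerate(ranking):
            # A single-item ranking carries no order information; use 0.0.
            totals[item] += pos / (n - 1) if n > 1 else 0.0
            counts[item] += 1
    return {item: totals[item] / counts[item] for item in totals}


def rank_deviations(rankings, labels, cluster):
    """Average rank within one cluster minus average rank in the whole set."""
    overall = average_ranks(rankings)
    members = [r for r, lab in zip(rankings, labels) if lab == cluster]
    within = average_ranks(members)
    return {item: within[item] - overall[item] for item in within}


# Hypothetical toy data: four partial rankings and a given two-cluster labelling.
D = [["a", "b", "c"], ["a", "c", "b"], ["c", "b", "a"], ["c", "a", "b"]]
labels = [0, 0, 1, 1]

dev = rank_deviations(D, labels, cluster=0)
# Items sorted by how far their cluster rank lies from their overall rank.
outliers = sorted(dev, key=lambda item: abs(dev[item]), reverse=True)
print(dev, outliers)
```

In this toy example, items `a` and `c` deviate by the same amount in opposite directions within cluster 0, while `b` does not deviate at all. Whether a deviation of that size is noteworthy is the question the significance test in the paper is meant to answer.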
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ukkonen, A., Mannila, H. (2007). Finding Outlying Items in Sets of Partial Rankings. In: Kok, J.N., Koronacki, J., Lopez de Mantaras, R., Matwin, S., Mladenič, D., Skowron, A. (eds) Knowledge Discovery in Databases: PKDD 2007. Lecture Notes in Computer Science, vol 4702. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74976-9_26
DOI: https://doi.org/10.1007/978-3-540-74976-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74975-2
Online ISBN: 978-3-540-74976-9
eBook Packages: Computer Science, Computer Science (R0)