Abstract
In this paper, we initiate the study of the problem of ordering objects from their pairwise comparison results when allowed to discard up to a certain number of objects as outliers. More specifically, we seek to find an ordering under the popular Kendall tau distance measure, i.e., minimizing the number of pairwise comparison results that are inconsistent with the ordering, with some outliers removed. The presence of outliers challenges the assumption that a global consistent ordering exists and obscures the measure. This problem does not admit a polynomial time algorithm unless NP \( \subseteq \) BPP, and therefore, we develop approximation algorithms with provable guarantees for all inputs. Our algorithms have running time and memory usage that are almost linear in the input size. Further, they are readily adaptable to run on massively parallel platforms such as MapReduce or Spark.
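To make the objective concrete, the following minimal sketch counts the pairwise comparison results that disagree with a given ordering once a chosen set of outliers is discarded. It only evaluates the cost of one candidate solution; it is not the paper's approximation algorithm. The function name `inconsistent_pairs`, the `beats` set representation, and the toy data are assumptions made for illustration.

```python
from itertools import combinations


def inconsistent_pairs(ordering, beats, outliers=frozenset()):
    """Count comparisons that contradict `ordering`, ignoring outliers.

    ordering: list of objects, highest-ranked first (hypothetical input format).
    beats:    set of (winner, loser) pairs, one per compared pair.
    outliers: objects discarded before counting.
    """
    kept = [v for v in ordering if v not in outliers]
    # combinations() preserves list order, so `earlier` is ranked above `later`.
    # A pair is inconsistent when the lower-ranked object won the comparison.
    return sum(
        1 for earlier, later in combinations(kept, 2) if (later, earlier) in beats
    )


# Toy data: "d" beats "a" but loses to "b" and "c".
beats = {("a", "b"), ("b", "c"), ("a", "c"), ("d", "a"), ("b", "d"), ("c", "d")}
order = ["a", "b", "c", "d"]

cost_all = inconsistent_pairs(order, beats)
cost_without_d = inconsistent_pairs(order, beats, outliers={"d"})
print(cost_all, cost_without_d)
```

In this toy instance, discarding the single object `d` removes the only comparison that conflicts with the ordering, which is the kind of trade-off between inconsistencies and outliers that the problem formalizes.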
Notes
1. A directed graph \(G = (V, E)\) is called a tournament if it is complete and directed. In other words, for any pair \(u \ne v \in V\), either \((u, v) \in E\) or \((v, u) \in E\).
2. More precisely, he considered a slightly more general version where each vertex may have a different cost when removed as an outlier.
3. We show that our algorithm is (180, 180)-approximate, which can be improved arbitrarily close to (60, 60) if one is willing to accept a lower success probability. In contrast, Aboud’s algorithm can be adapted to be (18, 18)-approximate for FASTO; however, as mentioned above, it uses considerably more memory and run time than ours.
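The tournament definition in the first note can be checked mechanically. The sketch below assumes the usual reading that exactly one of \((u, v)\) and \((v, u)\) is present for each pair and that there are no self-loops; the function name `is_tournament` and the edge-set representation are hypothetical.

```python
from itertools import combinations


def is_tournament(vertices, edges):
    """Return True if `edges` orients every pair of distinct vertices exactly once."""
    vertices = list(vertices)
    vertex_set = set(vertices)
    edges = set(edges)
    # Reject self-loops and edges that touch unknown vertices.
    if any(u == v or u not in vertex_set or v not in vertex_set for u, v in edges):
        return False
    # For each unordered pair, exactly one direction must be present.
    return all(
        ((u, v) in edges) != ((v, u) in edges) for u, v in combinations(vertices, 2)
    )


cycle = {("a", "b"), ("b", "c"), ("c", "a")}
print(is_tournament("abc", cycle))
```

A directed 3-cycle passes the check even though no ordering is consistent with all of its comparisons, which is why the objective counts inconsistent pairs rather than requiring zero of them.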
References
Aboud, A.: Correlation clustering with penalties and approximating the reordering buffer management problem. Master’s thesis. The Technion Israel Institute of Technology (2008)
Ailon, N.: Aggregation of partial rankings, p-ratings and top-m lists. Algorithmica 57(2), 284–300 (2010)
Ailon, N., Charikar, M., Newman, A.: Aggregating inconsistent information: ranking and clustering. J. ACM 55(5), 23 (2008)
Altman, A., Tennenholtz, M.: Ranking systems: the PageRank axioms. In: ACM EC (2005)
Ammar, A., Shah, D.: Ranking: Compare, don’t score. In: IEEE Allerton (2011)
Andoni, A., Nikolov, A., Onak, K., Yaroslavtsev, G.: Parallel algorithms for geometric graph problems. In: ACM STOC, pp. 574–583 (2014)
Arora, S., Frieze, A., Kaplan, H.: A new rounding procedure for the assignment problem with applications to dense graph arrangement problems. Math. Program. 92(1), 1–36 (2002)
Bradley, R.A., Terry, M.E.: Rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika 39(3/4), 324–345 (1952)
Charikar, M., Khuller, S., Mount, D.M., Narasimhan, G.: Algorithms for facility location problems with outliers. In: ACM-SIAM SODA (2001)
Chen, K.: A constant factor approximation algorithm for k-median clustering with outliers. In: ACM-SIAM SODA (2008)
Coppersmith, D., Fleischer, L.K., Rudra, A.: Ordering by weighted number of wins gives a good ranking for weighted tournaments. ACM Trans. Algorithms 6(3), 55 (2010)
Duchi, J.C., Mackey, L.W., Jordan, M.I.: On the consistency of ranking algorithms. In: ICML, pp. 327–334 (2010)
Even, G., (Seffi) Naor, J., Schieber, B., Sudan, M.: Approximating minimum feedback sets and multicuts in directed graphs. Algorithmica 20(2), 151–174 (1998)
Frieze, A., Kannan, R.: Quick approximation to matrices and applications. Combinatorica 19(2), 175–220 (1999)
Guha, S., Li, Y., Zhang, Q.: Distributed partial clustering. In: ACM SPAA (2017)
Gupta, S., Kumar, R., Lu, K., Moseley, B., Vassilvitskii, S.: Local search methods for k-means with outliers. PVLDB 10(7), 757–768 (2017)
Kemeny, J.G.: Mathematics without numbers. Daedalus 88(4), 577–591 (1959)
Kenyon-Mathieu, C., Schudy, W.: How to rank with few errors. In: ACM STOC (2007)
Lu, T., Boutilier, C.: Learning Mallows models with pairwise preferences. In: ICML (2011)
Luce, R.D.: Individual Choice Behavior: A Theoretical Analysis. Wiley, Hoboken (1959)
Malkomes, G., Kusner, M.J., Chen, W., Weinberger, K.Q., Moseley, B.: Fast distributed k-center clustering with outliers on massive data. In: NIPS (2015)
Muller, E., Sánchez, P.I., Mulle, Y., Bohm, K.: Ranking outlier nodes in subspaces of attributed graphs. In: IEEE ICDEW (2013)
Negahban, S., Oh, S., Shah, D.: Rank centrality: ranking from pairwise comparisons. Oper. Res. 65(1), 266–287 (2016)
Rajkumar, A., Agarwal, S.: A statistical convergence perspective of algorithms for rank aggregation from pairwise data. In: ICML (2014)
van Zuylen, A., Williamson, D.P.: Deterministic algorithms for rank aggregation and other ranking and clustering problems. In: Kaklamanis, C., Skutella, M. (eds.) WAOA 2007. LNCS, vol. 4927, pp. 260–273. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-77918-6_21
Wauthier, F., Jordan, M., Jojic, N.: Efficient ranking from pairwise comparisons. In: ICML (2013)
Williamson, D.P., Shmoys, D.B.: The Design of Approximation Algorithms. Cambridge University Press, Cambridge (2011)
Acknowledgements
This work was supported in part by NSF grants CCF-1409130 and CCF-1617653.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Im, S., Montazer Qaem, M. (2020). Fast and Parallelizable Ranking with Outliers from Pairwise Comparisons. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Lecture Notes in Computer Science, vol 11906. Springer, Cham. https://doi.org/10.1007/978-3-030-46150-8_11
DOI: https://doi.org/10.1007/978-3-030-46150-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46149-2
Online ISBN: 978-3-030-46150-8
eBook Packages: Computer Science (R0)