In as Few Comparisons as Possible
We review a variety of data ordering problems with the goal of solving them in as few comparisons as possible. En route we highlight a number of open problems: some new, some a couple of decades old, and others open for up to half a century. The first is sorting, where the Ford-Johnson Merge-Insertion algorithm of 1959 remains the “best”, at least for the “best and worst” values of n. Is it optimal, or are its roughly 0.028n extra comparisons beyond the information theoretic lower bound necessary?
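As a rough illustration of the gap in question, the sketch below compares the information theoretic lower bound \(\lceil \lg n! \rceil\) with the worst case comparison count of Merge-Insertion. It assumes the closed form \(F(n) = \sum_{k=1}^{n} \lceil \lg(3k/4) \rceil\) from the standard textbook analysis of the algorithm; that formula and the function names are not part of the text above.

```python
import math


def lower_bound(n: int) -> int:
    """Information theoretic lower bound: ceil(lg(n!)), in exact integer arithmetic."""
    # For m >= 1, ceil(log2(m)) equals (m - 1).bit_length().
    return (math.factorial(n) - 1).bit_length()


def merge_insertion_count(n: int) -> int:
    """Worst case comparisons of Merge-Insertion, assuming
    F(n) = sum_{k=1..n} ceil(lg(3k/4))."""
    total = 0
    for k in range(1, n + 1):
        # ceil(lg(3k/4)) = ceil(lg(3k)) - 2, computed without floating point.
        total += (3 * k - 1).bit_length() - 2
    return total


if __name__ == "__main__":
    for n in (5, 11, 12, 20, 22):
        print(n, lower_bound(n), merge_insertion_count(n))
```

Under this formula the two quantities agree for n up to 11 and first differ at n = 12, where the sketch gives 30 comparisons against a lower bound of 29.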
Moving to selection problems, we first examine a special case. Finding the second largest member of a set is fairly straightforward in the worst case. The best expected case method remains the \(n + \Theta(\lg \lg n)\) method of Matula from 1973 [10], which raises the question of whether the \(\lg \lg n\) term is necessary. The status of median finding has remained unchanged for a couple of decades, since the work of Dor and Zwick [4,5]: \((3 - \delta)n\) comparisons are sufficient, while \((2 + \varepsilon)n\) are necessary. So the constant is not an integer, but is it \(\log_{4/3} 2 \approx 2.41\), as conjectured by Paterson? This worst case behavior is in sharp contrast with the expected case of median finding, where the answer has been known since the mid-1980s [3,6].
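The worst case method for the second largest element is not spelled out above. A minimal sketch of the classical knockout tournament approach follows, assuming at least two elements; the function name and the comparison counter are illustrative only. The tournament uses n − 1 comparisons, and the runner-up must be among the at most \(\lceil \lg n \rceil\) elements that lost directly to the champion, for at most \(n + \lceil \lg n \rceil - 2\) comparisons in total.

```python
def second_largest(items):
    """Return (second largest value, number of comparisons used).

    Knockout tournament: each player records the values it beat directly.
    """
    if len(items) < 2:
        raise ValueError("need at least two elements")
    comparisons = 0
    # Each player is a pair (value, list of values it has beaten directly).
    players = [(x, []) for x in items]
    while len(players) > 1:
        next_round = []
        for i in range(0, len(players) - 1, 2):
            a, b = players[i], players[i + 1]
            comparisons += 1
            winner, loser = (a, b) if a[0] >= b[0] else (b, a)
            winner[1].append(loser[0])
            next_round.append(winner)
        if len(players) % 2 == 1:
            # An unpaired player advances without a comparison.
            next_round.append(players[-1])
        players = next_round
    _, beaten = players[0]
    # The second largest is the maximum of the champion's direct victims.
    runner_up = beaten[0]
    for x in beaten[1:]:
        comparisons += 1
        if x > runner_up:
            runner_up = x
    return runner_up, comparisons
```

For example, `second_largest([5, 3, 8, 1, 9, 2, 7])` finds the value 8 using 8 comparisons, which matches the bound \(7 + \lceil \lg 7 \rceil - 2 = 8\).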
Finally we look at the problems of partial sorting (arranging elements according to a given partial order) and of completing a sort given partially ordered data. The latter problem was posed and solved to within n or so comparisons of optimal by Fredman in 1975 [7]. The method, though, could use exponential time to determine which comparisons to perform. The more recent approaches of Cardinal et al. [2,1] to these problems are based on graph entropy arguments and require only polynomial time to determine the comparisons to be made. Indeed, the solution to the partial ordering problem involves a reduction to multiple selection. In both cases the number of comparisons used differs from the information theoretic lower bound by only a lower order term plus a linear term.
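To make the information theoretic lower bound for completing a sort concrete: if the known comparisons form a partial order P with e(P) linear extensions, then at least \(\lceil \lg e(P) \rceil\) further comparisons are needed in the worst case. The sketch below counts linear extensions by brute force over subsets for small inputs. It is an exponential time illustration only, not the method of any of the cited papers, and all names in it are hypothetical.

```python
from functools import lru_cache


def count_linear_extensions(n, relations):
    """Count total orders of elements 0..n-1 consistent with `relations`.

    `relations` is an iterable of pairs (a, b) meaning "a < b is already known".
    Assumes the relations are acyclic; runs in time exponential in n.
    """
    preds = [0] * n  # preds[b] is a bitmask of elements known to precede b
    for a, b in relations:
        preds[b] |= 1 << a
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def extend(placed):
        # `placed` is the bitmask of elements already put in the order.
        if placed == full:
            return 1
        total = 0
        for v in range(n):
            not_placed = ((placed >> v) & 1) == 0
            ready = (preds[v] & ~placed) == 0  # all predecessors are placed
            if not_placed and ready:
                total += extend(placed | (1 << v))
        return total

    return extend(0)


def comparisons_lower_bound(n, relations):
    """ceil(lg e(P)): comparisons still required in the worst case."""
    e = count_linear_extensions(n, relations)
    return (e - 1).bit_length()
```

For two sorted pairs, `comparisons_lower_bound(4, [(0, 1), (2, 3)])` counts 6 linear extensions and so gives a lower bound of 3 comparisons, which is what merging two 2-element lists costs in the worst case.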
Keywords: Partial Order, Selection Problem, Half Century, Exponential Time, Lower Order Term
- 2. Cardinal, J., Fiorini, S., Joret, G., Jungers, R.M., Munro, J.I.: Sorting under partial information (without the ellipsoid algorithm). In: STOC, pp. 359–368 (2010)
- 7. Fredman, M.L.: Two applications of a probabilistic search technique: Sorting x + y and building balanced search trees. In: STOC, pp. 240–244 (1975)
- 10. Matula, D.W.: Selecting the t-th best in average n + θ(log log n) comparisons. Washington University Report, AMCS-73-9 (1973)