# Building an Optimal Point-Location Structure in $$O(\mathrm{sort}(n))$$ I/Os

## Abstract

We revisit the problem of constructing an external-memory data structure on a planar subdivision formed by n segments so that point location queries can be answered optimally in $$O(\log_B n)$$ I/Os. The objective is to achieve an I/O cost of $$\mathrm{sort}(n) = O(\frac{n}{B} \log_{M/B} \frac{n}{B})$$, where B is the number of words in a disk block and M is the number of words in memory. Previous algorithms achieve this either in expectation or under the tall-cache assumption $$M \ge B^2$$. We present the first algorithm that solves the problem deterministically for all values of M and B satisfying $$M \ge 2B$$.
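To give a feel for the target bound, the following minimal Python sketch evaluates the expression inside $$O(\cdot)$$ for concrete parameters. The function name `sort_io_bound` and the choice to ignore constant factors are assumptions made for illustration only; they are not taken from the paper.

```python
import math


def sort_io_bound(n: int, M: int, B: int) -> float:
    """Evaluate (n/B) * log_{M/B}(n/B), the quantity inside O(sort(n)).

    Constant factors are ignored, so the result is an order-of-magnitude
    indicator rather than an exact I/O count.
    """
    if B < 1 or M < 2 * B:
        raise ValueError("the model requires B >= 1 and M >= 2B")
    blocks = n / B      # number of disk blocks occupied by the input
    fanout = M / B      # number of blocks that fit in memory (at least 2)
    if blocks <= 1:
        return 1.0      # at most one block: a single I/O suffices
    # The logarithm is clamped at 1 so the bound never drops below a scan.
    passes = max(1.0, math.log2(blocks) / math.log2(fanout))
    return blocks * passes
```

For example, `sort_io_bound(2**20, 2**10, 2**5)` evaluates to `98304.0`: the input occupies $$2^{15}$$ blocks, and $$\log_{32} 2^{15} = 3$$.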

1. In the original model formulation in , M can be as small as 2B. However, any algorithm that works with $$M = \mu B$$ for a constant $$\mu > 2$$ can be adapted to work with $$M = 2B$$ at only a constant blowup in space and I/O cost. It suffices to treat each block as $$\mu$$ “micro-blocks” of $$B/\mu$$ words each, so that each “logical I/O” reads or writes one micro-block. A memory of 2B words can accommodate $$\mu$$ micro-blocks (B words in total), plus B more words that are used to perform the “physical I/Os” (which still transfer B words each). Whenever a logical I/O is needed on a micro-block, a physical I/O is performed on the block containing that micro-block. Hence, any algorithm with I/O complexity $$O(\mathrm{sort}(n))$$ under $$M = \mu B$$ now incurs $$O(\frac{\mu n}{B} \log_{\frac{\mu M}{B}} \frac{\mu n}{B}) = O(\mathrm{sort}(n))$$ I/Os under $$M = 2B$$.

2. An $$\Omega(\log_B n)$$ query lower bound can be established via a reduction from predecessor search .

3. In particular, as pointed out in , the algorithm of  incurs $$O(n \log_B n)$$ I/Os on a general $${\mathcal {S}}$$.

4. $$IL^*(B)$$ is the number of times that the $$\log^*$$ operation must be applied repeatedly to B before the value becomes O(1).

5. $$O_\epsilon$$ hides a factor polynomial in $$1/\epsilon$$.

6. The structure solves a more general problem called approximate half-space counting in $$\mathbb {R}^3$$.

7. The algorithm of  is described with a default leaf capacity of B, but one can replace that with any $$\beta \in [B, \sqrt{MB}]$$ without affecting the algorithm’s correctness. In fact, since what we need here is only the leaf level, the algorithm of  can be simplified considerably by ignoring all of its details on producing the non-leaf levels of a persistent B-tree.

8. Suppose that we have a problem $$\Pi$$ on an input set S. $$\Pi$$ is decomposable if we can partition S into $$S_1, S_2, \ldots , S_\gamma$$ such that, once we have the answer on each $$S_i$$ ($$1 \le i \le \gamma$$), we can obtain the answer on S using $$O(\gamma )$$ additional I/Os.

9. A weaker insertion cost of $$O(\log_B^{\alpha +1} n)$$ was claimed in . However, it should be folklore that the cost can easily be improved to the bound stated here (for readers familiar with the technique: by creating structures on $$B, B^{1+\delta /2}, B^{1+\delta }, B^{1+3\delta /2}, \ldots$$ elements, respectively).
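Note 4 defines $$IL^*(B)$$ only up to the constant hidden in O(1). The following is a minimal Python sketch of that definition, assuming base-2 logarithms and a hypothetical stopping threshold of 2; neither choice is fixed by the text, and the function names are illustrative.

```python
import math


def log_star(x: float) -> int:
    """Iterated logarithm: how many times log2 is applied before x <= 1."""
    count = 0
    while x > 1:
        x = math.log2(x)
        count += 1
    return count


def iterated_log_star(b: float, threshold: float = 2) -> int:
    """How many times log_star is applied before the value is <= threshold.

    The threshold stands in for the O(1) of the definition (an assumption).
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    count = 0
    x = b
    while x > threshold:
        x = log_star(x)   # strictly smaller than x whenever x > 1
        count += 1
    return count
```

For instance, `iterated_log_star(65536)` returns 2 under these assumptions, because $$\log^* 65536 = 4$$ and $$\log^* 4 = 2$$.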

## References

1. Achakeev, D., Seeger, B.: Efficient bulk updates on multiversion B-trees. PVLDB 6(14), 1834–1845 (2013)

2. Afshani, P., Chan, T.M.: On approximate range counting and depth. Discrete Comput. Geom. 42(1), 3–21 (2009)

3. Agarwal, P.K., Arge, L., Brodal, G.S., Vitter, J.S.: I/O-efficient dynamic point location in monotone planar subdivisions. In: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 11–20 (1999)

4. Aggarwal, A., Vitter, J.S.: The input/output complexity of sorting and related problems. CACM 31(9), 1116–1127 (1988)

5. Arge, L., Brodal, G.S., Georgiadis, L.: Improved dynamic planar point location. In: Proceedings of Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 305–314 (2006)

6. Arge, L., Brodal, G.S., Rao, S.S.: External memory planar point location with logarithmic updates. Algorithmica 63(1–2), 457–475 (2012)

7. Arge, L., Danner, A., Teh, S.-M.: I/O-efficient point location using persistent B-trees. ACM J. Exp. Algorithmics 8, 1–2 (2003)

8. Arge, L., Vahrenhold, J.: I/O-efficient dynamic planar point location. Comput. Geom. 29(2), 147–162 (2004)

9. Arge, L., Vengroff, D.E., Vitter, J.S.: External-memory algorithms for processing line segments in geographic information systems. Algorithmica 47(1), 1–25 (2007)

10. Aronov, B., Har-Peled, S.: On approximating the depth and related problems. SIAM J. Comput. 38(3), 899–921 (2008)

11. Baumgarten, H., Jung, H., Mehlhorn, K.: Dynamic point location in general subdivisions. J. Algorithms 17(3), 342–380 (1994)

12. Bender, M.A., Cole, R., Raman, R.: Exponential structures for efficient cache-oblivious algorithms. In: International Colloquium on Automata, Languages and Programming (ICALP), pp. 195–207 (2002)

13. Bentley, J.L., Saxe, J.B.: Decomposable searching problems I: static-to-dynamic transformation. J. Algorithms 1(4), 301–358 (1980)

14. Bertino, E., Catania, B., Shidlovsky, B.: Towards optimal indexing for segment databases. In: Proceedings of Extending Database Technology (EDBT), pp. 39–53 (1998)

15. Cheng, S.-W., Janardan, R.: New results on dynamic planar point location. SIAM J. Comput. 21(5), 972–999 (1992)

16. de Berg, M., Haverkort, H.J., Thite, S., Toma, L.: I/O-efficient map overlay and point location in low-density subdivisions. In: International Symposium on Algorithms and Computation (ISAAC), pp. 500–511 (2007)

17. den Bercken, J.V., Seeger, B., Widmayer, P.: A generic approach to bulk loading multidimensional index structures. In: Proceedings of Very Large Data Bases (VLDB), pp. 406–415 (1997)

18. Goodrich, M.T., Tsay, J.-J., Vengroff, D.E., Vitter, J.S.: External-memory computational geometry. In: Proceedings of Annual IEEE Symposium on Foundations of Computer Science (FOCS), pp. 714–723 (1993)

19. Haussler, D., Welzl, E.: Epsilon-nets and simplex range queries. Discrete Comput. Geom. 2, 127–151 (1987)

20. Maheshwari, A., Zeh, N.: I/O-efficient planar separators. SIAM J. Comput. 38(3), 767–801 (2008)

21. Overmars, M.H.: Range searching in a set of line segments. In: Proceedings of Symposium on Computational Geometry (SoCG), pp. 177–185 (1985)

22. Patrascu, M., Thorup, M.: Time–space trade-offs for predecessor search. In: Proceedings of ACM Symposium on Theory of Computing (STOC), pp. 232–240 (2006)

23. Sarnak, N., Tarjan, R.E.: Planar point location using persistent search trees. CACM 29(7), 669–679 (1986)

24. van Walderveen, F., Zeh, N., Arge, L.: Multiway simple cycle separators and I/O-efficient algorithms for planar graphs. In: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 901–918 (2013)

## Author information

### Corresponding author

Correspondence to Yufei Tao.