## Geometric Data Structures

#### Instructor:

Timothy Chan (Siebel 3230, tmc "at" illinois.edu, office hours: Wed 3:30-4:30 or by appointment)

#### Meeting Time and Place:

Wed & Fri 2:00-3:15, Siebel 1109

#### Course Overview:

This course is about data structures in computational geometry. We will discuss fundamental techniques and recent theoretical developments for a variety of basic geometric data structuring problems in a variety of models. Prior knowledge in computational geometry is not required, but students are assumed to have a solid background in algorithm design and analysis.

The following is a list of possible topics.

• Orthogonal range searching: from basic methods (such as range trees) to current best results (such as my paper with Larsen and Patrascu)
• Point location: from standard solutions (segment trees, fractional cascading, persistence, planar separators, ...), to more recent sublogarithmic methods on the word RAM
• Nonorthogonal range searching: partition trees, cutting trees, random sampling, iterative reweighting, shallow cuttings
• Dynamic data structures: dynamic convex hull, dynamic point location, general dynamization techniques
• Approximate nearest neighbor searching: quadtrees, shifting, z-ordering, locality sensitive hashing
• Big data: external-memory geometric data structures, geometric streaming algorithms, ...

#### Course Work:

• 3 or 4 assignments (problem sets), worth 40%:
• presentation, worth 20%;
• a project that involves reading some papers and writing a report, or alternatively doing some original research, worth 40%.

#### Lectures:

[I'll provide links to relevant papers below. There is no required textbook, although de Berg, Cheong, van Kreveld, and Overmars's book is a useful starting point for the less recent topics. See also the newest 2017 edition of the CRC Handbook...]

Jan 18, 20: Introduction. Orthogonal range searching.

• k-d trees (O(n) space, O(n^{1-1/d} + k) time; see Sec 5.2 of BCKO)
• Priority search trees and Cartesian trees for 2D 3-sided emptiness/reporting (O(n) space, O(log n + k) time; see Sec 10.2 of BCKO and wikipedia page)
• Range trees (O(n log^{d-1} n) space, O(log^{d-1} n) time; see Sec 5.3-5.6 of BCKO)

Jan 25:

• Improving space by bit packing: Chazelle'88 for 2D counting (O(n) space, O(log n) time; see Sec 3.1 of paper)
• Improving time: digression on 1D predecessor search
• van Emde Boas trees (O(n) space, O(loglog U) time; see wikipedia, or the Cormen et al. textbook, 3rd ed.)
• fusion trees (O(n) space, O(log_w n) time, but rough sketch only; see Fredman and Willard's paper)

Jan 27, Feb 1:

• Improving time in 2D (O(n log n) space, O(loglog U) time (for emptiness))
• Improving time and space in 2D: Alstrup, Brodal, and Rauhe's grid-based method (O(n loglog U) space, O((loglog U)^2) time; see paper [Sec 3])
• Improving time and space in 2D: C.-Larsen-Patrascu'11 (O(n loglog n) space, O(loglog U) time; see Sec 2 of paper)

Feb 3: Planar point location.

• The naive slab method (O(n^2) space, O(log n) time)
• The segment tree method (O(n log n) space, O(log^2 n) time; see Sec 10.3 of BCKO)
• ... with fractional cascading (O(n log n) space, O(log n) time; see Chazelle and Guibas' two papers)

Feb 8:

• Preparata's trapezoid tree method (also O(n log n) space, O(log n) time; see paper or Preparata and Shamos' old textbook)
• Persistent search tree method (O(n log n) space, O(log n) time by path copying; see Sarnak and Tarjan's paper, which also reduced space to O(n))

Feb 10:

• Planar separator method (O(n) space, O(log n) time; see Lipton and Tarjan's paper)
• Random sampling method (O(n) space, O(log n) time; see Clarkson and Shor's original paper or Mulmuley's book)
• Kirkpatrick's hierarchical triangulation method (see paper, or Preparata and Shamos' book, or Iacono's video)

Feb 15:

• Point location in sublogarithmic time: orthogonal case (O(n) space, O(loglog U) time; see my paper from '11)

Feb 17:

• Point location in sublogarithmic time: general case (O(n) space, O(log n/loglog n) or O(sqrt{log U/loglog U}) time; see my '09 paper with Patrascu)

Feb 22, 24: Nonorthogonal range searching.

• Willard's method (O(n) space, O(n^{0.793}) time in 2D, via the ham-sandwich cut theorem; see paper)
• A dual method (O(n^{4.82}) space, O(log n) time in 2D; for duality, see Sec 8.2 of BCKO)
• Clarkson's cutting tree (O(n^{2+eps}) space, O(log n) time in 2D) via the Cutting Lemma (proof by random sampling; see Clarkson's paper or Mulmuley's book)

Mar 1:

• Matousek's partition tree (O(n) space, O(n^{1/2+eps}) time in 2D) via the Partition Theorem (proof by iterative reweighting; see Matousek's paper or Mulmuley's book)

Mar 3:

• "Applications" to combinatorial geometry (Szemeredi-Trotter theorem and Erdos' unit distance problem; see Ch4 of Matousek's book)
• Nonlinear range searching (via the linearization technique)
• Multi-level versions of partition trees (see Sec 16.2 of BCKO)

Mar 8:

• Halfspace range reporting in 2D and 3D (O(n log n) space, O(log n + k) time; see my paper) via the Shallow Cutting Lemma (see Matousek's paper)

Mar 10: Dynamic geometric data structures. Dynamic 2D convex hull.

• Insert-only (O(log n) update and O(log n) query time, by Preparata'79)
• Fully dynamic: the hull tree method (O(log^2 n) update and O(log n) query time, by Overmars and van Leeuwen'80; see also Preparata and Shamos' book)

Mar 15: General dynamization techniques.

• Static to insert-only: The "logarithmic method" (Bentley and Saxe '80)
• Static to fully dynamic: The "square root method" (same paper)
• Static to offline dynamic, via segment tree

Mar 17: Dynamic 2D point location.

• Dynamic segment tree (O(log^2 n) query and update time)
• ... with dynamic fractional cascading (O(log n loglog n) query and O(log^2 n) update time)
• Sketch of a version of the latest method by C. and Nekrich'15 (O(log n (loglog n)^2) query and O(log n loglog n) update time)

Mar 29, 31: Approximate nearest neighbor search. In constant dimensions.

• Grid method (O(n) space and O(1) time, for fixed radius)
• (Compressed) quadtree method (O(n) space and O(log U) time for decision problem; see Sariel's book)
• ... with shifting (O(n) space and O(log U) time; see Bern'93 and Lemma 3.3 in C.'98)
• Balanced quadtree (O(n) space and O(log n) time; see again the above papers)
• Z-ordering (same result but simplified; see C.'02 and wikipedia)

Apr 5, 7, 12: In high dimensions.

• Dimension reduction via the Johnson-Lindenstrauss Lemma (n^{O((1/eps^2)log(1/eps))} space, O(poly(d)) time for Euclidean distances; see Indyk and Motwani's original paper and also Matousek's survey on metric embedding)
• Locality-sensitive hashing (near O(n^{1+1/c}) space, O(poly(d) n^{1/c}) time for approximation factor c, for Hamming, L_1, and Euclidean distances; see again Indyk and Motwani's original paper and Sec 2.9 of Matousek's notes)
• The L_infinity case: Indyk's search tree method (near O(n^t) space, O(d log n) time for approximation factor O(log_t log d); see paper)
• Offline case: the polynomial method (near n^{2-sqrt{eps}} time via matrix multiplication and Chebyshev polynomials; this paper also has improvement near n^{2-eps^{1/3}})
• Apr 14, 19: Big data. External-memory and cache-oblivious data structures.
• Apr 21: Streaming (insert-only).
• approximate diameter (O(1) space for any eps>0)
• approximate width: eps-kernels/coresets via the logarithmic method/merge-and-reduce (O(log^d n) space, by Agarwal, Har-Peled, and Varadarajan); or via a doubling trick (O(1) space, see paper)
• Apr 26: Streaming (dynamic).
• approximate diameter, by bucketing (O(poly(1/eps,log U)) space)
• witness finding and random sampling
• approximate width by the polynomial method of Andoni and Nguyen'12 (see also this paper)

• Apr 28: Student presentations.
• May 3: Student presentations.