
Search Trees

We have covered:
• interval trees for point queries
• mappings to higher dim. space for interval range queries

Next, an alternate internal memory data structure for the interval queries
Range queries
• Consider a fixed set of intervals in 1-d. Several different query types arise.
For a given interval I:
1. report all intervals that are contained in I.
2. report all intervals that intersect I.
3. report all intervals that contain I.
Mapping intervals to points in 2-d
• Every interval J can naturally be mapped into the 2-d point pj so that the angle shown at pj is a right angle.
transformation step 2
now rotate by 45°
queries
1. report all intervals that are contained in I
queries
2. report all intervals that intersect I.
queries
3. report all intervals that contain I
interpretation
Project on x-y axes;
J contains I iff
J’ has a larger x-coordinate and a larger y-coordinate than I’,
where I’ and J’ denote the points that I and J map to.

In that case we say that J’ dominates I’.
interpretation
J contains I iff
J’ has a larger x-coordinate and a larger y-coordinate than I’

J’ dominates I’
I’ dominates k’
the relation dominance is transitive, so
J’ dominates k’
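The correspondence between containment and dominance can be checked with a small sketch. It assumes that, after the rotation and up to scaling, the interval [a, b] lands on the point (b, -a); the orientation of the rotation and the function names are assumptions for illustration only. The comparison is non-strict here so that intervals sharing an endpoint are handled too.

```python
def to_point(a, b):
    """Map the interval [a, b] to the 2-d point (b, -a) (assumed orientation)."""
    return (b, -a)

def dominates(p, q):
    """True if p != q and p is at least as large as q in both coordinates."""
    return p != q and p[0] >= q[0] and p[1] >= q[1]

def contains(j, i):
    """True if interval j = (a, b) contains a different interval i."""
    return j != i and j[0] <= i[0] and i[1] <= j[1]

J, I, K = (0, 10), (2, 8), (3, 5)       # J contains I, and I contains K
M = (1, 4)                              # M overlaps I but neither contains the other
for first, second in [(J, I), (I, K), (J, K), (I, M), (M, I)]:
    same = contains(first, second) == dominates(to_point(*first), to_point(*second))
    print(first, second, contains(first, second), same)
```

The pair (J, K) shows transitivity: J’ dominates I’ and I’ dominates K’, hence J’ dominates K’.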
direct dominance
We say that J’ directly dominates I’
if J’ dominates I’ and, for every other point k’ that J’ dominates,
either I’ dominates k’ or I’ and k’ are not related via the dominance relation
(that is, no k’ lies strictly between J’ and I’).

The relation direct dominance is not transitive!
Dominance/Direct Dominance
• As a result:
– the number of dominance pairs in direct
dominance is linear
– the number of dominance pairs in relation
dominance can be quadratic
– Dominance is the transitive closure of direct dominance, i.e., each pair in the direct dominance relation is taken to be in dominance and then we build the transitive closure.
determining dominance pairs
1. sort the points by one of the coordinates, say the x-coordinate
2. write the corresponding points as they appear on the y-axis
3. read this as a permutation П of the numbers, where number k maps into Пk

(3,6,2,1,4,5)

[Figure: six points numbered 1 to 6 by x-coordinate; from bottom to top along the y-axis they appear in the order 3, 6, 2, 1, 4, 5.]
Dominance via permutations
We can now easily see that:
i dominates k iff
1. i > k and
2. П = (…, Пk, …, Пi, …),
i.e., Пi appears after Пk in П
determining dominance pairs
(3,6,2,1,4,5)
is a compact representation of dominance!
That is linear space for a possibly quadratic relation.

You can also read off direct dominance: for a given entry, look for the first number appearing after it that is larger than the entry.

[Figure: the six points, numbered 1 to 6 along the x-axis.]
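A minimal sketch of both readings of the permutation, assuming the list holds the point numbers (x-ranks) in increasing y-order, so that "i dominates k" means i > k and i appears after k. The function names are hypothetical. The first function enumerates all dominance pairs by brute force; the second finds, for each entry, the first larger number after it with a stack in linear time.

```python
def dominance_pairs(perm):
    """All pairs (i, k) such that i > k and i appears after k in perm."""
    pairs = []
    for pos, k in enumerate(perm):
        for i in perm[pos + 1:]:
            if i > k:
                pairs.append((i, k))
    return pairs

def first_larger_after(perm):
    """For each entry, the first later entry that is larger (None if there is none)."""
    result = {}
    stack = []                      # entries still waiting for a larger successor
    for value in perm:
        while stack and stack[-1] < value:
            result[stack.pop()] = value
        stack.append(value)
    for value in stack:             # nothing larger follows these entries
        result[value] = None
    return result

perm = (3, 6, 2, 1, 4, 5)
print(dominance_pairs(perm))
print(first_larger_after(perm))
```

Note that the second function returns at most one partner per entry, which is why this reading needs only linear space.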
back to 1-d
1-d range search for points can be answered in O(log n + output size) time with linear space
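As a sketch of the 1-d case: a sorted array stands in for the balanced search tree here (an assumption made to keep the example short), and two binary searches locate the ends of the answer. The function name is hypothetical.

```python
from bisect import bisect_left, bisect_right

def range_query_1d(sorted_keys, lo, hi):
    """Report all keys k with lo <= k <= hi in O(log n + output size) time."""
    left = bisect_left(sorted_keys, lo)     # first index with key >= lo
    right = bisect_right(sorted_keys, hi)   # first index with key > hi
    return sorted_keys[left:right]

keys = sorted([7, 1, 9, 4, 12, 3])          # linear space
print(range_query_1d(keys, 3, 9))
```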
range queries

I. Build a balanced binary search tree on the x-coordinates as keys
half-open range queries
I. balanced binary tree search
II. each internal node has a 1-d range tree of all elements in its subtree
range queries
find the nodes closest to the root that cover all nodes of the x-range for the query;
in each of these perform a range query on the y-coordinates

query time:
O(log n) * O(log n) + O(output) = O(log² n + output)

storage: O(n log n)
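The two-level structure can be sketched as follows. This is a simplified illustration, not a tuned implementation: the primary tree is implicit over the x-sorted array, every node stores its points sorted by y, and each node's list is sorted from scratch at build time (so construction here takes O(n log² n) rather than the O(n log n) possible with merging). Class and method names are assumptions.

```python
from bisect import bisect_left, bisect_right

class RangeTree2D:
    def __init__(self, points):
        self.pts = sorted(points)                 # primary order: by x
        self.xs = [p[0] for p in self.pts]
        self.by_y = {}                            # node id -> [(y, x), ...] sorted by y
        if self.pts:
            self._build(1, 0, len(self.pts))

    def _build(self, node, lo, hi):
        # The node covers pts[lo:hi]; store that subtree's points sorted by y.
        self.by_y[node] = sorted((y, x) for x, y in self.pts[lo:hi])
        if hi - lo > 1:
            mid = (lo + hi) // 2
            self._build(2 * node, lo, mid)
            self._build(2 * node + 1, mid, hi)

    def query(self, x_lo, x_hi, y_lo, y_hi):
        """Report all points with x_lo <= x <= x_hi and y_lo <= y <= y_hi."""
        out = []
        if self.pts:
            l, r = bisect_left(self.xs, x_lo), bisect_right(self.xs, x_hi)
            self._collect(1, 0, len(self.pts), l, r, y_lo, y_hi, out)
        return out

    def _collect(self, node, lo, hi, l, r, y_lo, y_hi, out):
        if r <= lo or hi <= l:                    # node is outside the x-range
            return
        if l <= lo and hi <= r:                   # node closest to the root inside the x-range
            ys = self.by_y[node]
            i = bisect_left(ys, (y_lo, float("-inf")))
            j = bisect_right(ys, (y_hi, float("inf")))
            out.extend((x, y) for y, x in ys[i:j])
            return
        mid = (lo + hi) // 2
        self._collect(2 * node, lo, mid, l, r, y_lo, y_hi, out)
        self._collect(2 * node + 1, mid, hi, l, r, y_lo, y_hi, out)

tree = RangeTree2D([(1, 70), (2, 50), (3, 20), (4, 80), (5, 100), (6, 40), (7, 30), (8, 60)])
print(sorted(tree.query(2, 5, 45, 90)))
```

Each point is stored once per level of the primary tree, which gives the O(n log n) storage bound.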
half-open range queries

report points 2, 4 and 5

[Figure: points 1 to 8 with a half-open query range.]
half-open range queries
I. Build a balanced binary search tree on the x-coordinates as keys

II. build a max heap on the y-coordinates

[Figure: points 1 to 8 plotted against a y-axis with ticks from 20 to 100.]
Priority Search Tree
(McCreight)
Heap on y-values:

              100
        80          40
     70    50    20    30
   60  -  -  -  -  -  -  -

[ y-values 70 50 20 80 100 40 30 60 ]
  x-values x1 x2 x3 x4 x5 x6 x7 x8
half-open range queries can be solved
in linear space and O(log n + output size) time

report points 2, 4 and 5
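A minimal sketch of one common formulation of a priority search tree: each node keeps the point with the largest y among its points (heap order) and an x-value that splits the remaining points (search-tree order). The names, the distinct-x assumption and the query thresholds are illustrative assumptions; the y-values are the ones listed on the slide.

```python
class PSTNode:
    def __init__(self, point, split, left, right):
        self.point, self.split = point, split     # points with x < split go left
        self.left, self.right = left, right

def build_pst(points_by_x):
    """Build from points sorted by x (x-values assumed distinct)."""
    if not points_by_x:
        return None
    top = max(range(len(points_by_x)), key=lambda j: points_by_x[j][1])
    rest = points_by_x[:top] + points_by_x[top + 1:]
    mid = len(rest) // 2
    split = rest[mid][0] if rest else points_by_x[top][0]
    return PSTNode(points_by_x[top], split, build_pst(rest[:mid]), build_pst(rest[mid:]))

def query_pst(node, x_lo, x_hi, y_min, out):
    """Report points with x_lo <= x <= x_hi and y >= y_min (half-open in y)."""
    if node is None or node.point[1] < y_min:     # heap order: nothing below qualifies
        return
    if x_lo <= node.point[0] <= x_hi:
        out.append(node.point)
    if x_lo < node.split:
        query_pst(node.left, x_lo, x_hi, y_min, out)
    if x_hi >= node.split:
        query_pst(node.right, x_lo, x_hi, y_min, out)

points = [(1, 70), (2, 50), (3, 20), (4, 80), (5, 100), (6, 40), (7, 30), (8, 60)]
root = build_pst(sorted(points))
found = []
query_pst(root, 1.5, 5.5, 45, found)              # assumed query: x in [1.5, 5.5], y >= 45
print(sorted(found))
```

Every point is stored exactly once, so the space is linear.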
range queries can be solved through
two half-open range queries

report points 2 and 4
Range queries through half-open range queries
Important note: this requires intersecting the outputs of both half-open range queries, thus it is NOT output sensitive.
+∞ query output: 2, 4, 5
-∞ query output: 2, 3, 4
output: 2 and 4
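The intersection step can be illustrated with a small sketch. The two half-open queries are brute-force stand-ins here (in practice each would be answered by a priority search tree), and the query thresholds are assumptions chosen to reproduce the outputs listed above.

```python
# x-coordinate -> y-coordinate, using the y-values from the priority search tree slide
POINTS = {1: 70, 2: 50, 3: 20, 4: 80, 5: 100, 6: 40, 7: 30, 8: 60}

def open_upward(x_lo, x_hi, y_min):
    """Half-open query unbounded towards +infinity in y."""
    return {x for x, y in POINTS.items() if x_lo <= x <= x_hi and y >= y_min}

def open_downward(x_lo, x_hi, y_max):
    """Half-open query unbounded towards -infinity in y."""
    return {x for x, y in POINTS.items() if x_lo <= x <= x_hi and y <= y_max}

def range_query(x_lo, x_hi, y_lo, y_hi):
    """Return the answer and the total size of the two intermediate outputs."""
    up = open_upward(x_lo, x_hi, y_lo)
    down = open_downward(x_lo, x_hi, y_hi)
    return up & down, len(up) + len(down)

answer, work = range_query(2, 5, 45, 90)
print(sorted(answer), work)
```

The intermediate outputs have total size 6 while the answer has size 2; in general the intermediate outputs can be arbitrarily larger than the final answer.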
Point location problems
cont’d
• We discussed the O(n) point location algorithm based on Jordan’s Theorem.
• We will be discussing three methods which allow preprocessing and then faster point location.
• Note: point location in planar subdivisions or polygons is a key task and a step in many algorithms.
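As a reminder of the O(n) test, here is a sketch of the usual crossing-number formulation: shoot a horizontal ray from the query point and count how many polygon edges it crosses; an odd count means inside. The function name is hypothetical, and points exactly on the boundary are not handled specially.

```python
def point_in_polygon(q, polygon):
    """Crossing-number test for a simple polygon given as a list of (x, y) vertices."""
    qx, qy = q
    inside = False
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > qy) != (y2 > qy):                 # edge straddles the ray's height
            x_cross = x1 + (qy - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > qx:                       # crossing lies to the right of q
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon((2, 2), square), point_in_polygon((5, 2), square))
```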
Point location methods
1. Slab method
2. Chain decomposition
3. Kirkpatrick
Slab Method
• Consider a planar subdivision
• In a preprocessing step we construct a horizontal slab boundary through each of the vertices of the subdivision.
Slab Method

Within a particular slab the endpoints of the segments are sorted by x-coordinate. The endpoints on the upper and lower slab boundaries list the corresponding segments in the same order.
Slab Method

This allows a binary search to be carried out within a slab.
The binary search must be done carefully.
Binary search

Use left-turn/right-turn tests as the comparison function: is the query point a left turn from the segment or not?
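The left-turn test is usually implemented with a cross product. A minimal sketch, with hypothetical function names:

```python
def cross(a, b, q):
    """Twice the signed area of triangle (a, b, q); positive if a -> b -> q turns left."""
    return (b[0] - a[0]) * (q[1] - a[1]) - (b[1] - a[1]) * (q[0] - a[0])

def is_left_turn(a, b, q):
    """True if q lies strictly to the left of the directed line from a to b."""
    return cross(a, b, q) > 0

a, b = (0, 0), (0, 2)                      # a segment directed upward
print(is_left_turn(a, b, (-1, 1)), is_left_turn(a, b, (1, 1)))
```

With integer coordinates the test is exact, which is one reason it is preferred over computing intersection points.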


Binary search within a slab

Notes: I. the number of segments per slab is O(n)
II. thus a binary search inside a slab takes O(log n) time


Binary search on slabs

Note: Identifying the correct slab requires an initial binary search using the y-coordinates of the slab boundaries and of the query point.
Total cost for query

Theorem: Using the slab method, point location queries in a planar subdivision on n edges can be carried out in O(log n) time.
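The whole slab method can be sketched in a few lines. This version uses the brute-force preprocessing and assumes non-crossing input segments; horizontal edges never span a slab and are simply skipped. It answers a query with the slab index and the number of segments to the left of the query point, which identifies the face within the slab. All names are hypothetical.

```python
from bisect import bisect_right

def cross(a, b, q):
    return (b[0] - a[0]) * (q[1] - a[1]) - (b[1] - a[1]) * (q[0] - a[0])

def build_slabs(segments):
    """Brute-force preprocessing: for each slab, the segments spanning it, left to right."""
    ys = sorted({p[1] for seg in segments for p in seg})
    slabs = []
    for y_lo, y_hi in zip(ys, ys[1:]):
        y_mid = (y_lo + y_hi) / 2
        spanning = []
        for a, b in segments:
            lo, hi = (a, b) if a[1] < b[1] else (b, a)      # orient upward
            if lo[1] <= y_lo and hi[1] >= y_hi:
                x_mid = lo[0] + (y_mid - lo[1]) * (hi[0] - lo[0]) / (hi[1] - lo[1])
                spanning.append((x_mid, lo, hi))
        spanning.sort()
        slabs.append([(lo, hi) for _, lo, hi in spanning])
    return ys, slabs

def locate(ys, slabs, q):
    """Return (slab index, number of segments left of q), or None outside all slabs."""
    k = bisect_right(ys, q[1]) - 1                          # binary search on slabs
    if k < 0 or k >= len(slabs):
        return None
    segs = slabs[k]
    lo, hi = 0, len(segs)
    while lo < hi:                                          # binary search within the slab
        mid = (lo + hi) // 2
        if cross(segs[mid][0], segs[mid][1], q) < 0:        # q is to the right of the segment
            lo = mid + 1
        else:
            hi = mid
    return k, lo

segments = [((0, 0), (0, 4)), ((6, 0), (6, 4)), ((3, 0), (2, 2)), ((2, 2), (3, 4))]
ys, slabs = build_slabs(segments)
print(locate(ys, slabs, (1, 1)), locate(ys, slabs, (5, 3)), locate(ys, slabs, (1, 9)))
```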
Total cost for preprocessing
Theorem: The slab method requires O(n²) storage and preprocessing time.

si = # of edges in slab i

∑si = O(n²)
O(n²) preprocessing time
Note: Brute force preprocessing takes O(n²) time.
Alternate method: use the plane sweep method in O(n log n + output) time.
Pros and Cons of Slab Method

+ O(log n) query time, which is optimal
+ simple to implement

- quadratic storage
- quadratic preprocessing time
Chain decomposition
• This method improves on the previous in
terms of storage and preprocessing time.
• However, we pay for this with increased
query time.
Chain decomposition
• The idea is to partition the edges of the planar subdivision into a set of chains.
• These chains will all be monotonic in the
y-direction.
• This allows for binary search between
chains.
2 Chains point location
• Consider two chains that are monotonic and that have a common highest and lowest vertex.
(We actually assume that the chains are going to infinity on both ends.)
In O(log n) time we can determine if a query point falls inside or outside the polygon formed by these two chains.
2 Chains point location
• We do this by performing two separate location tests: one against the left and one against the right chain.
• Each test uses the left-turn/right-turn primitive.
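A sketch of the two tests, assuming each chain is given as a list of vertices with strictly increasing y and that the query point lies within the y-range of the chains. A binary search on y finds the chain edge at the query's height, and the left-turn/right-turn primitive decides the side. Function names are hypothetical.

```python
def cross(a, b, q):
    return (b[0] - a[0]) * (q[1] - a[1]) - (b[1] - a[1]) * (q[0] - a[0])

def side_of_chain(chain, q):
    """> 0 if q is left of the chain, < 0 if right, 0 if on it. Takes O(log n) time."""
    lo, hi = 0, len(chain) - 1          # invariant: chain[lo].y <= q.y <= chain[hi].y
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if chain[mid][1] <= q[1]:
            lo = mid
        else:
            hi = mid
    return cross(chain[lo], chain[hi], q)

def inside_two_chains(left_chain, right_chain, q):
    """True if q is right of the left chain and left of the right chain."""
    return side_of_chain(left_chain, q) < 0 and side_of_chain(right_chain, q) > 0

left_chain = [(2, 0), (0, 2), (2, 4)]   # a diamond split into its two monotone chains
right_chain = [(2, 0), (4, 2), (2, 4)]
print(inside_two_chains(left_chain, right_chain, (2, 2)),
      inside_two_chains(left_chain, right_chain, (0, 1)))
```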
2 Chains point location
• Assume, for a moment, that the planar subdivision is partitioned into monotonic (in y) polygons.
Chain decomposition
We aim for the following properties for that partitioning:

1. each edge is covered by at least one chain
2. each chain is monotonic in y-direction
3. each chain joins the top-most to the bottom-most vertex (if not unique, use the x-coordinate as secondary key to make the decision).
4. the total number of chains is O(n)
5. the total storage is linear.
Chain decomposition:
regular vertices
First assume that each vertex in the subdivision is regular, i.e., it is connected to at least one vertex above it and to one vertex below it.
Later we will see how to guarantee this condition.

[Figure: a regular vertex and a vertex that is not regular.]
Chain decomposition
The idea is to perform two plane sweeps, one top-to-bottom and one bottom-to-top.

Let us sketch only the top-to-bottom one.
(Preparata and Shamos' Computational Geometry book is a good reference.)

1. orient the edges in y-direction

Assume that the numbers of chains (to be constructed) sharing the edges into the vertex are at least 2, 4 and 3. Then there must be at least 9 = 2+4+3 outgoing chains, and we can “arbitrarily” assign those to the outgoing edges, say here: 6 + 3.

[Figure: a vertex with incoming edges labelled 2, 4, 3 and outgoing edges labelled 6, 3.]
Chain decomposition
In the second phase (bottom-up) we make sure that the number of chains on each vertex matches in both directions.
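The two phases can be sketched on an abstract graph. The sketch below is an assumption-laden illustration in the spirit of the weight balancing described in Preparata and Shamos: edges are given as (upper, lower) pairs, vertices are listed top to bottom, every interior vertex is regular, and any surplus is pushed onto the first listed edge (the choice of edge is the "arbitrary" assignment; a real implementation would pick it consistently, e.g. the leftmost). Function and variable names are hypothetical.

```python
from collections import defaultdict

def balance_weights(order, edges):
    """Assign to each edge the number of chains through it so that, at every
    interior vertex, the incoming and outgoing chain counts match."""
    weight = {e: 1 for e in edges}                 # every edge lies on at least one chain
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for upper, lower in edges:
        outgoing[upper].append((upper, lower))
        incoming[lower].append((upper, lower))

    def total(edge_list):
        return sum(weight[e] for e in edge_list)

    interior = order[1:-1]                         # skip top-most and bottom-most vertex
    for v in interior:                             # phase 1: top-to-bottom
        surplus = total(incoming[v]) - total(outgoing[v])
        if surplus > 0:
            weight[outgoing[v][0]] += surplus
    for v in reversed(interior):                   # phase 2: bottom-to-top
        deficit = total(outgoing[v]) - total(incoming[v])
        if deficit > 0:
            weight[incoming[v][0]] += deficit
    return weight

order = [0, 1, 2, 3, 4]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
weights = balance_weights(order, edges)
print(weights)
```

In this example three chains leave the top vertex and three arrive at the bottom vertex.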
Query using chains
Once we have the chain decomposition, we can carry out two nested binary searches:
one over the chains, the other within each chain (left/right of chain).

The query then takes O(log² n) time.
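A sketch of the nested search, assuming the chains are stored explicitly as vertex lists with strictly increasing y, ordered from left to right, and that the query lies within their y-range. Storing each chain in full is done only for clarity; it does not give linear storage. The names are hypothetical.

```python
def cross(a, b, q):
    return (b[0] - a[0]) * (q[1] - a[1]) - (b[1] - a[1]) * (q[0] - a[0])

def side_of_chain(chain, q):
    """Inner binary search: find the chain edge at q's height, then test the side."""
    lo, hi = 0, len(chain) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if chain[mid][1] <= q[1]:
            lo = mid
        else:
            hi = mid
    return cross(chain[lo], chain[hi], q)          # < 0 means q is right of the chain

def chains_left_of(chains, q):
    """Outer binary search: the number of chains lying to the left of q."""
    lo, hi = 0, len(chains)
    while lo < hi:
        mid = (lo + hi) // 2
        if side_of_chain(chains[mid], q) < 0:
            lo = mid + 1
        else:
            hi = mid
    return lo

chains = [
    [(0, 0), (0, 4)],
    [(2, 0), (1, 2), (2, 4)],
    [(4, 0), (4, 4)],
]
print(chains_left_of(chains, (3, 2)), chains_left_of(chains, (0.5, 1)))
```

The outer search makes O(log n) comparisons and each comparison costs O(log n), which gives the O(log² n) bound.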
Regularization
Finally, we need to guarantee that all vertices are regular.
This is done by a procedure called regularization. Regularization is useful for other problems too.

The vertex v is missing an outgoing edge (one connecting v to a lower vertex).

[Figure: a regular vertex and a vertex that is not regular.]
Regularization
Recall our plane sweep technique, similar to the one for line segment intersection.
We know for every event point which edges are immediately to its left and right.
That means that these two edges are visible horizontally from v.
Consider now the higher of these two vertices.
Either it is visible or an event point (a vertex) is visible.
Regularization
Based on these observations, a simple plane sweep method can be designed.

[Figure: the two cases for v: either the higher vertex itself is visible, or an event point (a vertex) is visible.]
Chain method complexities
We obtain:

Theorem: In O(log² n) time, after O(n log n) preprocessing and with linear storage, point location queries in planar subdivisions on n vertices can be answered.

To obtain linear storage one must store the chains in such a manner that an edge shared by multiple chains is not stored more than once.
Kirkpatrick complexity
The best method is due to Kirkpatrick. It is optimal in preprocessing, storage and query.

Theorem: In O(log n) time, after O(n) preprocessing and with linear storage, point location queries in planar subdivisions can be answered.

see class