
2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines

Fast and Flexible Conversion of Geohash Codes to and from Latitude/Longitude Coordinates

Roger Moussalli, Mudhakar Srivatsa, Sameh Asaad


IBM T.J. Watson Research Center
Yorktown Heights, USA
{rmoussal, msrivats, asaad}@us.ibm.com

Abstract: Insights extracted from spatial queries in geodatabase systems introduce significant opportunities for business intelligence. However, geodatabases are unable to keep up with the required performance due to the massive (and sky-rocketing) amounts of data generated from embedded location-enabled devices. In this paper, we focus on geographic information systems that make use of geohash; specifically, we tackle the kernel of converting geohash codes to and from longitude/latitude pairs. We present the first hardware implementation of a geohash conversion engine operating at wire speed. The presented geohash converter is further enhanced with runtime flexibility with respect to characteristics of the data it can process; furthermore, the architecture allows the user to compromise on performance when limited by hardware resources (design time flexibility). Experimental results of the geohash conversion engine on a Xilinx XC7K325T FPGA show >13X (end-to-end) speedup compared to optimized industry-grade software running on 16 CPU hardware threads.

Keywords: geospatial analysis; spatial databases; data conversion; geohash; FPGA; reconfigurable architectures; logic design; accelerator architectures; parallel architectures

978-1-4799-9969-9/15 $31.00 © 2015 IEEE
DOI 10.1109/FCCM.2015.18

I. INTRODUCTION

The ubiquitous availability of location sensing devices embedded in smartphones, cars, taxis, and other devices, combined with the ability to collect data at scale, enables the fine grained monitoring and modeling of human movement, both at the individual level and at the group level. Using trajectory data harvested by GPS, RFID and mobile devices, complex pattern queries can be posed against objects moving in both time and space. Answering these queries introduces opportunities for business intelligence, where prediction regarding future patterns can be used to solve challenging problems such as traffic congestion prediction, crime pattern analysis and prediction, epidemic spread characterization and alerting, insurance pricing, and targeted advertising. This work focuses on the spatial aspect of spatiotemporal queries.

One of the most significant challenges posed by processing spatial queries is the sheer amount of available spatial data; the volume of such data is increasing at an unprecedented rate, especially thanks to the widespread use of GPS-enabled smart-phones. In this context, high-performance techniques are needed to process spatial queries in a reasonable amount of time.

Geohash is a hierarchical geocoding system that is often used for spatial indexing. Geohash presents several advantages over the traditional latitude/longitude geographic coordinate system, such as efficient indexing, support for hierarchical regions, arbitrary precision, and simple proximity estimation. However, geohashes do not replace lat./long. coordinates, primarily due to the format of disseminated data, as well as the dependency of several spatial algorithms on lat./long. coordinates.

This work tackles the (bit-serial) conversion of geohash codes to/from lat./long. coordinate pairs, a frequent problem in geohash-based spatial querying frameworks. Handling geohash codes can be computationally demanding, and entails bit-granular operations, a class of computations where general purpose processors are not known to shine.

We present a novel and parallel hardware bi-directional geohash conversion engine, converting geohash codes at wire speed (no stalls), at the constant throughput rate of one geohash code per hardware cycle. The novel geohash converter is enhanced through the incorporation of runtime flexibility, where the length of the geohash codes can be varying, whilst minimizing misspent input/output bandwidth. Furthermore, if lower performance can be tolerated, the geohash conversion engine can be made to consume fewer hardware resources. The proposed hardware converter is shown to outperform optimized industry-grade software, demonstrating end-to-end throughput of 100 million conversions per second, while occupying 30% of a Xilinx Kintex 7 XC7K325T, a mid- to small-sized FPGA.

The contributions of this work can be summarized as follows:
• The first hardware geohash converter engine (to the best of our knowledge), exploiting hardware pipelining techniques to maximize conversion throughput.
• A user-controlled flexible compute-based bi-directional converter architecture that operates on any size geohashes (with architected max), whilst minimizing wasted bandwidth.
• A memory-bound lookup-based converter architecture for converting geohash codes to lat./long. pairs.
• An extensive design space exploration studying the resulting resource utilization and end-to-end throughput of the hardware converter on a Xilinx XC7K325T FPGA.
• A performance study comparing state-of-the-art CPU-based conversion to the proposed novel hardware conversion engine.

The rest of the paper is organized as follows: the geohash hierarchical geocoding system is reviewed in Section II. The proposed hardware geohash conversion engine is then
detailed in Section III. The performance of the FPGA-based geohash conversion engine is evaluated in Section IV. Prior art is outlined in Section V, while conclusions are presented in Section VI.

Figure 1. (a) Intuition behind the geohash coordinate system, where space is hierarchically divided into grid-shaped buckets. A string denotes a rectangular region (bounding box) in the space. The point of interest (red dot) can be represented by either of 241, 24 or 2 (using the depicted illustrative region naming scheme), depending on the required precision. (b) A sample binary geohash code (representing the red dot in (a)), where the bits respective to the longitude and latitude space divisions are interleaved (read from left to right).

II. GEOHASH CODING OVERVIEW

In this section, an overview of the geohash system is provided, followed by a description of the conversion process to and from latitude/longitude.

A. Geohash Intuition

Geohash is a geographic coordinate system that hierarchically divides space into grid-shaped buckets (see Figure 1(a)). A geohash code, represented as a string, denotes a rectangle (bounding box) on the earth. It provides a spatial hierarchy with arbitrary precision: the precision can be reduced (i.e. represent a bigger rectangle) by removing characters from the end of the string. In other words, the longer the geohash code is, the smaller the bounding box represented by the code is. For example, the point of interest in Figure 1(a) (red dot) can be represented by either of 241, 24 or 2 (using the depicted illustrative region naming scheme), depending on the required precision (tolerated error).

Another property of geohash is that two locations with a long common geohash prefix are close to each other. Similarly, nearby locations usually share a similar prefix. However, it is not always guaranteed that two close locations share a long common prefix (when they are located near the border of two bounding boxes).

B. Algorithm for Converting Geohash Codes to and from Latitude/Longitude

Concretely, a geohash code can be represented as a binary string, where the bits respective to the longitude and latitude space divisions are interleaved (see Figure 1(b)).

1) Converting From Lat./Long. to Geohash Code: The longitude space is bounded by the initial interval {-180, 0, +180} being the min, mid, and max. If the longitude of the point of interest is greater than the mid of the interval (i.e. if the point resides in the upper subinterval), then a geohash bit of 1 is produced, and the new interval becomes $\{mid, \frac{mid+max}{2}, max\}$. On the other hand, if the longitude of the point of interest is smaller than or equal to the mid of the interval, then a geohash bit of 0 is produced, and the new interval becomes $\{min, \frac{min+mid}{2}, mid\}$. This process is repeated up to the desired precision (number of geohash bits), in a bit-serial fashion. The same method is applied to the latitude, where the initial interval is {-90, 0, +90}. Finally, the longitude and latitude bits are interleaved in the geohash code. Figure 1(b) lists the actual binary geohash code of the red dot depicted in Figure 1(a).

2) Converting From Geohash Code to Lat./Long.: Given a set of lat./long. bits, the respective intervals are updated using the same serial process described above, with no geohash bits generated at each step; instead, one geohash bit is consumed at each step. Each input geohash bit is examined to update the interval at hand, starting from the initial {-90, 0, +90} and {-180, 0, +180} for lat./long. respectively. The resulting lat./long. values are the respective mid values of the final lat./long. intervals.

III. HIGH PERFORMANCE HARDWARE GEOHASH CODEC

In this section, an in-depth description of the proposed hardware geohash codec is provided, starting from the basic serial unit for conversion, scaling up to a parallel codec. A mechanism for lookup-based (in contrast to compute-based) conversion is further detailed.

A. Basic Conversion Block

We define a single step in the conversion process as either of:
• The act of producing one geohash bit, when converting from latitude (or longitude) to geohash code.
• The act of updating the latitude (or longitude) value (i.e. interval mid) by processing one geohash bit, when converting from geohash code to latitude (or longitude).

As described in Section II-B, conversion consists of iterating over the single step as many times as dictated by the required precision (number of input/output geohash code bits).

Figure 2 depicts the logic circuit implementing a single step, being designed with simplicity in mind, where the input interface matches the output interface. Data flows from top to bottom, as the input values are (potentially) updated and passed to the output.

Figure 2. Basic conversion block for carrying out a single step: producing one geohash bit when converting from latitude (or longitude), or updating the latitude (or longitude) value (i.e. interval mid) by processing one geohash bit when converting from geohash code to latitude (or longitude).

The logic circuit is also flexible, as it provides conversion in either direction, using the mode bit: mode 0 indicates conversion from geohash code to lat. (or long.), whereas mode 1 enables conversion from a lat. (or long.) value to a geohash code bit. In the remainder of this paper, modes 0 and 1 will also be referred to as geo-to-ll and ll-to-geo, respectively.

The input interface of the logic block consists of the following signals:
• Num bits remaining: the number of bits remaining to be processed (produced or consumed based on the direction/mode). The use of this field will be made clearer when deploying the logic block in a wider context (see Section III-B). If a value of 0 is specified at the input, then no change is applied to any of the input signals (see red dashed lines in Figure 2).
• Mask: when converting from lat. (or long.) to geohash code (i.e. mode = 1), the mask is used to update the appropriate geohash code bit (the latter being a multi-bit signal). As geohash codes are read from left to right (Figure 1(a)), the mask is shifted right by 1 when a conversion step takes place. The initial value of the mask is 1000...0.
• Geohash code: when operating in mode 0 (geo-to-ll), only the most significant geohash code bit is used to update the input interval. The geohash code is also shifted left by 1 to omit the processed bit. When operating in mode 1 (ll-to-geo), only the one bit of the geohash code indicated by the input mask is updated. The value of the updated bit is 1 when the lat. (or long.) value falls in the upper subinterval, and is otherwise 0 (see Section II-B for more detail).
• Lat. (or long.) value: this is the value that is being converted to (the respective lat./long. portion of) a geohash code. This signal is only used in mode 1 (ll-to-geo) when locating the value within the two portions of the input interval. The value of this signal is unchanged at the output.
• Mode: as described earlier, this signal sets the conversion direction, where 0 indicates geo-to-ll, and 1 indicates ll-to-geo. The value of this signal is unchanged at the output.
• Interval {min, mid, max}: when operating in mode 0 (geo-to-ll), the interval is updated based on the most significant geohash bit, whereas in mode 1 (ll-to-geo), the interval is updated based on the comparison between the lat. (or long.) value and the input interval's mid.

Some implementation notes: a full-fledged divider is not deployed in hardware, as the division by the constant 2 can be achieved by simply subtracting 1 from the exponent, or by a mantissa shift, in the case of floating point. Also, pipelining is applied within the conversion step, for performance purposes. Details are omitted for simplicity.

Figure 3. A geohash conversion engine that parallelizes the processing of the lat. and long. portions of a geohash code, while also parallelizing conversion across independent geohash codes. Loopback can be used when the number of deployed stages does not match the maximum geohash code bit size; here, processing a geohash code may require several passes through the pipeline.

B. Parallel Conversion Through Pipelining

Converting an N-bit geohash code (from or to lat./long.) requires N passes through the single step described in Section III-A. Parallelism can be achieved at the level of converting a single geohash code by simultaneously processing the latitude and longitude portions; this entails deploying two single step function blocks. Further parallelism in conversion can be attained when converting several geohash codes in parallel, through additional replication of the single step function block.

Figure 3 illustrates a geohash conversion engine that processes the lat./long. portions of a geohash code in parallel, while providing parallelism across independent geohash codes. Powered by the flexibility of the single step conversion block, the input and output streams can consist of both lat./long. values and geohash codes. Metadata would indicate the conversion mode, as well as the number of bits to process (i.e. geohash precision). In the case of geo-to-ll conversion, bits of the input geohash code are de-interleaved and passed to the latitude and longitude pipelines respectively. Similarly, in the case of ll-to-geo conversion, the geohash codes at the respective output interfaces of the latitude and longitude pipelines are interleaved before being pushed to the output stream.
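As a software point of reference, the bit-serial procedure of Section II-B and the interleaving and de-interleaving described above can be sketched as follows. This is a minimal Python model, not the hardware design: all function names are hypothetical, and it assumes that the leftmost geohash bit belongs to the longitude (the text states only that the bits are interleaved) and that the longitude receives the extra bit when the total bit count is odd.

```python
from typing import Tuple

LAT_INTERVAL = (-90.0, 90.0)
LON_INTERVAL = (-180.0, 180.0)


def encode_axis(value: float, lo: float, hi: float, nbits: int) -> int:
    """ll-to-geo for one axis: produce one bit per interval halving."""
    bits = 0
    for _ in range(nbits):
        mid = (lo + hi) / 2
        if value > mid:            # upper subinterval: produce a 1
            bits = (bits << 1) | 1
            lo = mid
        else:                      # value <= mid: produce a 0
            bits <<= 1
            hi = mid
    return bits


def decode_axis(bits: int, lo: float, hi: float, nbits: int) -> float:
    """geo-to-ll for one axis: consume bits MSB first, return the final mid."""
    for shift in range(nbits - 1, -1, -1):
        mid = (lo + hi) / 2
        if (bits >> shift) & 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def split_bits(total_bits: int) -> Tuple[int, int]:
    """Bits per axis as (longitude, latitude); assumed split, see lead-in."""
    return (total_bits + 1) // 2, total_bits // 2


def interleave(lon_bits: int, lat_bits: int, total_bits: int) -> int:
    """Merge per-axis bits into one code, longitude first (assumption)."""
    n_lon, n_lat = split_bits(total_bits)
    code = 0
    for pos in range(total_bits):  # pos 0 is the leftmost geohash bit
        if pos % 2 == 0:
            n_lon -= 1
            bit = (lon_bits >> n_lon) & 1
        else:
            n_lat -= 1
            bit = (lat_bits >> n_lat) & 1
        code = (code << 1) | bit
    return code


def deinterleave(code: int, total_bits: int) -> Tuple[int, int]:
    """Split a code into (longitude bits, latitude bits)."""
    lon_bits = lat_bits = 0
    for pos in range(total_bits):
        bit = (code >> (total_bits - 1 - pos)) & 1
        if pos % 2 == 0:
            lon_bits = (lon_bits << 1) | bit
        else:
            lat_bits = (lat_bits << 1) | bit
    return lon_bits, lat_bits


def ll_to_geo(lat: float, lon: float, total_bits: int) -> int:
    n_lon, n_lat = split_bits(total_bits)
    return interleave(encode_axis(lon, *LON_INTERVAL, n_lon),
                      encode_axis(lat, *LAT_INTERVAL, n_lat), total_bits)


def geo_to_ll(code: int, total_bits: int) -> Tuple[float, float]:
    n_lon, n_lat = split_bits(total_bits)
    lon_bits, lat_bits = deinterleave(code, total_bits)
    return (decode_axis(lat_bits, *LAT_INTERVAL, n_lat),
            decode_axis(lon_bits, *LON_INTERVAL, n_lon))
```

Under these assumptions, `ll_to_geo(45.0, 90.0, 4)` yields the code `0b1100`, and `geo_to_ll(0b1100, 4)` returns `(22.5, 45.0)`, the mids of the final latitude and longitude intervals. Each loop iteration in `encode_axis` and `decode_axis` corresponds to one single step of Section III-A; the hardware unrolls these iterations into pipeline stages.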
Figure 4. Subset of the pre-computed lookup table respective to the longitude conversion.

In order to increase efficiency in transferring data to the geohash conversion engine, the input and output controllers support batch transfers, where a batch header specifies characteristics (metadata) of the data following it. Information enclosed in the header includes (1) the batch size (number of conversions to be performed), (2) the conversion mode (geo-to-ll, ll-to-geo), (3) the number of bits to process per conversion, and (4) the size of the geohash on the wire, i.e. during transfer to/from the converter (this field will be described below).
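The four header fields can be modeled in software as a small record. The sketch below is illustrative only: the paper does not give field names, widths, or a wire layout, so `BatchHeader` and its members are hypothetical, and only the 0/1 mode values (Section III-A) and the 128-bit size of a double precision lat./long. pair come from the text.

```python
from dataclasses import dataclass

GEO_TO_LL, LL_TO_GEO = 0, 1   # mode bit values from Section III-A
LATLONG_PAIR_BITS = 128       # one pair of double precision values


@dataclass(frozen=True)
class BatchHeader:
    """Hypothetical model of the batch metadata; not the actual wire format."""
    batch_size: int   # (1) number of conversions in the batch
    mode: int         # (2) GEO_TO_LL or LL_TO_GEO
    num_bits: int     # (3) geohash bits to process per conversion
    wire_bits: int    # (4) bits occupied by each geohash during transfer

    def __post_init__(self) -> None:
        if self.batch_size < 0:
            raise ValueError("batch_size must be non-negative")
        if self.mode not in (GEO_TO_LL, LL_TO_GEO):
            raise ValueError("mode must be GEO_TO_LL or LL_TO_GEO")
        if not 0 < self.num_bits <= self.wire_bits:
            raise ValueError("num_bits must fit within wire_bits")

    def input_bits(self) -> int:
        """Payload bits streamed to the converter for this batch."""
        per_item = self.wire_bits if self.mode == GEO_TO_LL else LATLONG_PAIR_BITS
        return self.batch_size * per_item

    def output_bits(self) -> int:
        """Payload bits streamed back from the converter for this batch."""
        per_item = LATLONG_PAIR_BITS if self.mode == GEO_TO_LL else self.wire_bits
        return self.batch_size * per_item
```

Keeping `num_bits` and `wire_bits` as separate fields mirrors items (3) and (4): the former controls how many conversion steps are performed, while the latter only determines how much bandwidth each geohash consumes in transit.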
A geohash converter engine has the following design-time architectural attributes:
• Max geohash size: this field affects the size of the geohash code passed across single step blocks (Section III-A), as well as the size of the mask and num bits remaining signals. The max geohash size is an architected maximum geohash code that can be processed (in either mode). Hence, an N-bit geohash converter pipeline can support geohashes of size N or any smaller size.
• Number of deployed stages: this field refers to the total number of single step blocks deployed. If the number of stages is smaller than the max geohash size, then a loopback connection should be made from the last stage back to the first (see Figure 3). The number of stages determines the number of passes through the pipeline for a given conversion to complete, hence affecting performance. When limited by hardware resources, the converter designer may decide to deploy fewer stages rather than matching the max geohash size, if the performance hit is deemed tolerable.
• The supported transfer-side geohash sizes: given an architected max geohash size N, any geohash of smaller size can be processed through the converter. However, when transferring geohashes of size G to and from the converter pipeline (where G is specified in the metadata using the num bits remaining field, Section III-A), there are two available options: (1) transfer each geohash using N bits; this option wastes N - G bits for each conversion performed, hence is bandwidth wasteful, especially for small geohashes. (2) transfer geohashes contiguously using G bits each. While (2) is more bandwidth-efficient, it requires the I/O interface controllers to implement order of N shifters, which is generally not feasible. A middle-ground between the aforementioned 2 options is supporting a subset of N, such as all powers of two up to N. Here, geohashes of size G would be transferred on the wire using geohashes of the next closest power of two, which may still waste bandwidth, though much less than option (1). For example, a geohash of size 56 bits can be transferred using 64 bits, essentially wasting 8 bits, while a geohash of size 30 bits can be transferred using 32 bits, essentially wasting 2 bits. Furthermore, only $\log_2(N)$ shifters are deployed. In our implemented converter pipelines, geohashes transferred are of size any power of two less than or equal to the max geohash size.

Once the above three attributes are set, the geohash converter pipeline can be developed. We implemented a (C++) utility to generate the HDL of the converter pipeline, using certain parameter inputs. These include the max geohash size, the number of deployed stages, the supported transfer-side geohash sizes, the bit-width of the I/O interfaces, as well as other lower-level options such as extra buffering to meet timing.

C. Lookup-Based Conversion

In this section a method is presented to perform the geohash code to lat./long. conversion using a lookup into a table of conversions pre-computed offline. This approach is in contrast to the online compute method described earlier. Note that converting from lat./long. to a geohash code using lookup is not feasible, except for the caching of common values. The lookup method is practical when faced with limited hardware resources for compute and compromising on performance is not an option.

Figure 4 encapsulates a subset of the pre-computed lookup table with respect to the longitude conversion. Given a longitude geohash code of length L, the address for the conversion is computed as $\text{address} = \text{longitude geohash code} + \mathit{offset}$, where the offset is 0 when L is 1, otherwise $\mathit{offset} = \sum_{i=1}^{L-1} 2^i$. The offset need not be computed at runtime, rather it can be hard-wired in the address computation logic for different values of L (L is generally kept small to limit memory size).

The lookup conversion method can be combined with the compute method in order to eliminate some hardware stages, as depicted by Figure 5. For generality, geohash codes of length 2N bits are drawn, alongside two pre-computed geo-to-ll lookup tables that convert at most X bits each (one for each of the latitude and longitude conversions). The remainder of this discussion applies to either of the latitude or longitude portions of a geohash code. If N is less than X, then the conversion can be achieved fully by lookup (no need to compute). Otherwise, X bits are first converted by lookup, providing an output interval. Then, the remaining N-X bits are converted by compute, using the output of the lookup table as a starting interval. Note that loopback can be applied from the last to first compute stages (omitted for simplicity in Figure 5).

Figure 5. Combining the lookup method with the compute method. Geohash codes of length 2N bits are drawn, alongside two pre-computed geo-to-ll lookup tables that convert at most X bits each (one for each of the latitude and longitude conversions). If N is less than X, then the conversion can be achieved fully by lookup. Otherwise, X bits are first converted by lookup, and the remaining N-X bits are converted by compute, using the output of the lookup table as a starting interval.

Implementation results regarding the lookup method are not offered in Section IV, as the end-to-end performance of the lookup method is the same as that of the compute-based conversion method, and the memory requirements of the lookup method can be trivially derived. Furthermore, a thorough design space exploration regarding resource utilization is provided for the compute method, which helps determine whether the lookup method should be used given specific platform constraints.

IV. EXPERIMENTS AND ANALYSIS

This section presents an extensive experimental evaluation of the proposed geohash conversion engine.

A. Experimental Framework

1) Hardware Framework: Several versions of the proposed hardware geohash conversion engine were implemented on a Pico M-505 board connected to an Intel Xeon processor via 8 lanes of PCI-e Gen. 2 [1]. The M-505 board includes a Xilinx Kintex 7 XC7K325T FPGA [2], a mid- to small-size FPGA by today's standards. Xilinx Vivado 2014.2 is used for synthesis and implementation, with default settings. The PCIe hardware interface and software drivers are provided as part of the Pico framework. The hardware engines communicate with the I/O PCIe interfaces through one stream each way, with dual-clock BRAM FIFOs in between the converter logic and the PCI-e interfaces. The RAM on the FPGA board does not reside in the same virtual address space as the CPU RAM, and data is streamed from the CPU RAM to the FPGA. Since the proposed solution does not require memory offloading, RAM on the FPGA board is not used. All performance numbers are reported end-to-end, including streaming the data from the host CPU RAM to the FPGA and back to the host CPU RAM.

2) Software Framework: We compare our hardware converter engine to the software converter engine developed in [3]. The highly optimized software converter is distributed as part of the IBM Streams [4] and SPSS [5] products, as a component of the spatiotemporal toolkit. While we did not gain access to the source code, the authors provided us with performance measurements. Software experiments were run on a single socket 8-core x2 Hyper-Threads Intel Xeon Processor running at 2.5GHz, with 20MB L3 cache and 32GB RAM.

3) Datasets Description: Synthetic datasets were generated for varying geohash sizes, namely geo 8, geo 16, geo 32, geo 64 and geo 128. Each geohash code file is associated with a lat./long. file represented with double precision floating point. Note that for a given geohash code size, modifying the geohash codes typically has no effect on the performance of either of the software and hardware converters.

B. FPGA Resource Utilization

We first perform a design space exploration regarding the resource utilization of the proposed converter. To that end, several 128-bit converters were developed, while varying the number of deployed stages (single step blocks) from 8 to 128. Each of these converters can process geohash codes of any size up to 128 bits (with varying performance), and runs at 250MHz. Figure 6 summarizes the various resources consumed by all converters. Generally, resource utilization increases linearly with the number of hardware stages. Furthermore, the converters are LUT-dominated, and the largest (128-stage) converter consumes slightly less than 50% of available hard-wired DSPs (3 per stage for the double precision adder), as well as 71% of available LUTs.

The 64-stage converter is relatively light-weight and occupies around 33% of the FPGA; it is able to process 64-bit geohashes through a single pass. Note that 64-bit geohashes are most common as they achieve cm precision on the surface of the earth, and higher precision is rarely required.
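Two of the design-time trade-offs discussed so far reduce to simple arithmetic: the on-the-wire size of a geohash when only powers of two are supported (Section III-B), and the number of passes a conversion needs when fewer stages than geohash bits are deployed. The sketch below uses hypothetical function names. The throughput model is an assumption rather than a measured result: it supposes that every pass occupies one issue slot of the pipeline, so core throughput divides by the number of passes.

```python
import math


def wire_bits(geohash_bits: int, max_bits: int) -> int:
    """Next power of two >= geohash_bits, capped at the architected max."""
    if not 1 <= geohash_bits <= max_bits:
        raise ValueError("geohash size outside the architected range")
    return min(1 << (geohash_bits - 1).bit_length(), max_bits)


def passes(geohash_bits: int, stages: int) -> int:
    """Passes through the pipeline (via loopback) for one conversion."""
    return math.ceil(geohash_bits / stages)


def core_throughput(clock_hz: float, geohash_bits: int, stages: int) -> float:
    """Conversions per second of the isolated core (assumed model)."""
    return clock_hz / passes(geohash_bits, stages)
```

For the examples given in Section III-B, `wire_bits(56, 128)` evaluates to 64 and `wire_bits(30, 128)` to 32. A 64-bit geohash needs `passes(64, 64) == 1` pass on the 64-stage converter and two passes on a 32-stage one.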

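The address computation of the lookup method in Section III-C can likewise be sketched in a few lines. The table contents here are an assumption: each entry holds the {min, mid, max} interval reached after consuming the code, which is consistent with the lookup output serving as a starting interval in Figure 5, but the exact layout of Figure 4 is not reproduced. Function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float, float]   # (min, mid, max)


def lookup_offset(length: int) -> int:
    """Sum of 2**i for i in 1..length-1; 0 when length is 1."""
    return sum(2 ** i for i in range(1, length))


def lookup_address(code: int, length: int) -> int:
    """Table address of a single-axis geohash code of the given length."""
    return code + lookup_offset(length)


def build_table(max_length: int, lo: float = -180.0,
                hi: float = 180.0) -> List[Interval]:
    """Pre-compute intervals for all codes of length 1..max_length."""
    table: List[Interval] = []
    for length in range(1, max_length + 1):
        for code in range(2 ** length):
            a, b = lo, hi
            for shift in range(length - 1, -1, -1):   # MSB first
                mid = (a + b) / 2
                if (code >> shift) & 1:
                    a = mid
                else:
                    b = mid
            table.append((a, (a + b) / 2, b))
    return table
```

Because codes of each length are appended in increasing order, the entry for a code of length L sits at `code + lookup_offset(L)`, which is the address formula given in the text. A table covering lengths up to L holds 2 + 4 + ... + 2^L entries, which illustrates why L is kept small.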
Figure 6. Various resources consumed by several 128-bit converters. The number of deployed stages is varied. Note that all converters can process any geohash size up to 128 bits (with varying performance).

C. Hardware Converter Performance Evaluation

End-to-end (CPU RAM to FPGA and back to CPU RAM) performance is measured by processing batches of data (of varying sizes). A batch either consists of geohash codes, or lat./long. pairs. The reported metric is throughput as measured by the number of sustained (millions of) conversions per second. All experiments in Figure 7 were run on a single hardware converter (single FPGA image), with no compromises on (end-to-end and core converter) performance for any of the test cases. This approach showcases the flexibility of the proposed converter, a critical asset as the performance hit incurred by FPGA reconfiguration time cannot be tolerated.

Figures 7(a) and 7(b) show the throughput achieved by the hardware converter (running at 250MHz) in modes geo-to-ll and ll-to-geo, respectively. As expected, performance for small batches is very low, due to PCI-e transfer overhead. As the batch size increases, the performance exhibits an exponential increase, followed by a slow linear increase and eventual saturation.

Figure 7. Throughput achieved by a 128-bit 128-stage hardware converter (running at 250MHz) with respective modes (a) geo-to-ll and (b) ll-to-geo.

The performance of the converter core on the FPGA is one geohash code per hardware cycle, regardless of the geohash code size (and the conversion mode). End-to-end conversion will, however, vary due to PCIe side-effects. As shown in Figure 7(b), the sustained end-to-end conversion rate is higher for smaller geohash sizes which incur less traffic on the PCI-e bus. Furthermore, while the performance with respect to 128-bit geohashes does not vary depending on the mode (geo-to-ll and ll-to-geo), performance is higher in the ll-to-geo mode for all other geohash sizes. This is likely due to the Pico firmware, and occurs as a side effect of the converter consuming input data more rapidly in the ll-to-geo mode (recall that a lat./long. pair is represented in 128 bits). Generally, performance is in the tens of millions of conversions per second, and up to 160 MConversions/s for 8-bit geohashes.

We show in Figure 8 the performance impact of deploying fewer stages in hardware, using 128-bit converters (running at 250MHz) and a batch of 100M conversions for 64-bit geohashes. The red line represents the deterministic throughput of the isolated converter core on the FPGA, without taking into consideration the PCI-e transfers. Recall that in the case of less than 64 stages, several passes are required through the deployed stages for each conversion. Initially, the end-to-end performance is limited by the converter throughput (4 and 8 stages). However, the performance of the 32-stage and 64-stage converters is comparable, and limited by the available PCI-e bandwidth. Hence, a 32-stage converter can be deployed in place of the full 64 stages, with minimal impact on performance, but considerable savings in resources (16% vs 33%, see Figure 6). Furthermore, note that the end-to-end performance of the converter can be increased (more than doubled) simply by providing higher PCI-e bandwidth (Gen. 3 and/or more lanes), with no changes to the developed converter core.

Figure 8. Performance impact of deploying fewer stages in hardware. Results are shown for 128-bit converters (running at 250MHz) and a batch of 100M conversions of 64-bit geohashes. The red line represents the deterministic throughput of the isolated converter core on the FPGA, without taking into consideration the PCI-e transfers. In the case of less than 64 stages, several passes are required through the deployed stages for each conversion.

D. Hardware Converter vs. Single-Threaded Software

In this section, the performance of the proposed hardware converter is compared to that of a single-threaded version of the optimized software converter. Note that the remainder of this study makes use of a single hardware converter (single FPGA image), being a 128-bit 128-stage converter running at 250MHz.

Figures 9(a) and 9(c) show the speedup of the hardware converter versus the single-threaded software converter, for conversion modes geo-to-ll and ll-to-geo respectively.

Some notes on the performance of the software converter (data omitted for brevity): (1) performance does not vary greatly across batch sizes, as there is no data transfer overhead to pay; that is with the exception of batch size 10, where the overall performance is lower than for other batches, since the conversion of the small batch is affected by software environment overhead. (2) unlike the hardware converter, the performance of the software converter is linearly affected by the geohash code bit size; processing a 32-bit geohash code requires twice the time needed to process a 16-bit geohash code. (3) similar to the hardware converter, the performance of the software converter is higher when converting from lat./long. to geohash codes.

The hardware converter generally performs better than its software counterpart. This improved performance arises mainly because the hardware converter performs bit-level operations very efficiently (de-interleaving and reconstructing the geohash code), and also because it includes double precision floating point units operating in parallel in a very deep pipeline. While the hardware converter provides higher throughput when converting 8-bit geohashes than with 128-bit geohashes, speedup is higher in the case of the latter (see (2) above). When reaching steady-state, the hardware converter achieves up to 250X and 170X speedup for 128-bit geohashes (geo-to-ll and ll-to-geo respectively), as well as 160X and 120X speedup for 64-bit geohashes.
Figures 9(b) and 9(d) provide a zoomed-in view of Figures 9(a) and 9(c) respectively, for batches of size 10, 100 and 1K. Slowdown is represented below the red (horizontal) line, where speedup < 1. The red line denotes the threshold at which hardware acceleration should be used. We can see that generally, for any geohash size, batches larger than 600 should be sent to the hardware converter; when considering large geohash codes (64 and 128), batches of size 100 (down to 30) can be accelerated. Also, note that speedup increases sharply above the red line, and the benefits of hardware acceleration will be directly visible.
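A host application could turn these break-even observations into a simple dispatch rule. The sketch below is illustrative only: `use_hardware` is a hypothetical helper, and the thresholds are the ones read off Figure 9 for this particular platform (the conservative value of 100 is used for large geohashes, although the text notes that batches down to 30 may already benefit). They would need to be re-measured for any other setup.

```python
def use_hardware(batch_size: int, geohash_bits: int) -> bool:
    """Decide whether a batch is worth sending to the FPGA converter."""
    if geohash_bits >= 64:
        # Large geohash codes (64 and 128 bits) break even at smaller batches.
        return batch_size >= 100
    # For any geohash size, batches larger than 600 are accelerated.
    return batch_size > 600
```

With this rule, a batch of 1000 8-bit geohashes and a batch of 100 128-bit geohashes would both be offloaded, while a batch of 100 16-bit geohashes would stay in software.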

E. Hardware Converter vs. Multi-Threaded Software: a Figure 9. Speedup of the hardware converter versus the single-threaded
software converter, for conversions modes (a) geo-to-ll and (c) ll-to-geo.
Socket-to-Socket Comparison Figures (b) and (d) provide a zoomed in view of Figure (a) and (c)
By assigning separate subsets of a batch to different respectively, for batches of size 10, 100 and 1K. Slowdown is represented
below the red (horizontal) line, where speedup < 1.
threads, software conversion can be accelerated (data-level
parallelism). The next set of experiments we describe show-
cases the speedup of the hardware decoder against a multi-
threaded version of the software decoder. We limit the CPU
to a single socket for a fair (CPU)socket-to-(FPGA)socket
comparison. The available CPU socket comprises of 8 cores
with 2-way Hyper-Threading, a total of 16 hardware threads.
Figure 10 depicts the speedup of the hardware converter
against the software converter, while doubling the number
of software threads. Geohashes of size 64 bits are used with
batches containing 100 million conversions. As expected,
the comparative speedup is halved as the number of threads
is doubled, up to the number of available cores (8). When
Hyper-Threading is used (at 16 threads), we see a 30%
reduction in speedup instead of the previous 50%. Increasing
the number of software threads beyond the number of
hardware threads results in a slight deterioration of software
performance. In summary, the hardware converter achieves
16.1X (geo-to-ll) and 13X (ll-to-geo) speedup versus the
best run of the software converter on the CPU socket.
As noted in Section IV-C, the speedup shown here can
be potentially more than doubled simply by attaching the
hardware converter to a higher bandwidth PCI-e platform
(no modifications to the converter core required).

Figure 10. Speedup of the hardware converter against the software
converter, while doubling the number of software threads. Geohashes of
size 64 bits are used with batches containing 100 million conversions.

V. PRIOR ART
The geohash geocode system [6] has been introduced
fairly recently (2008) and is witnessing a rapidly increasing
adoption. Several database management systems and
geographic information systems make use of geohash for
indexing and efficient querying. These include MongoDB
[7], MySQL [8], IBM Streams and SPSS [3][4][5], as well
as Apache Accumulo (through third-party research) [9].
MongoDB uses geohash for indexing where conversion is
a frequent problem, as input queries process spatial data in
a lat./long. format. The work in [3] proposes a lightweight
scalable spatial index based on geohash, and extends their
approach for graph data structures.
There exists an extensive body of work pertaining to
the FPGA acceleration of certain spatial queries (a small
subset is referenced here). Acceleration of point-in-polygon
algorithms is demonstrated in [10][11][12], while speeding
up k-Nearest Neighbors queries is tackled in [13][14].
The works in [15][16][17] describe the hardware acceleration
of spatio-temporal analytics, where queries in the
form of regular expressions are posed on moving objects.
The spatial history of the moving objects is represented
as regions. Region information is derived from lat./long.
coordinates through methods such as point-in-polygon.
To the best of our knowledge, this paper is the first
to describe a hardware architecture for the conversion of
geohash codes to/from lat./long. coordinates.

VI. CONCLUSIONS
We present, to the best of our knowledge, the first
hardware implementation of a geohash conversion engine
operating at wire speed. The proposed geohash converter
is enhanced with runtime flexibility with respect to characteristics
of the data it can process (no restrictions on
geohash sizes, bi-directional conversion, etc.). Moreover, the
architecture allows the user to compromise on performance
when limited by hardware resources (design time flexibility).
A thorough experimental evaluation of the geohash conversion
engine on a Xilinx XC7K325T FPGA shows >13X
(end-to-end) speedup compared to optimized industry-grade
software running on 16 CPU hardware threads. We show
that higher (more than double) speedup can be attained
with no changes to the converter engine, using higher PCI-e
bandwidth (Gen. 3 and/or more lanes).
Future work will focus on the acceleration of more
complex spatial queries in the geohash domain, using the
proposed converter as a constituent.

REFERENCES
[1] "M-505-K325T Embedded," http://picocomputing.com/products/embedded-modules/m-505-k325t-embedded-2/.

[2] "KINTEX-7 FPGAS," http://www.xilinx.com.

[3] K. Lee, R. K. Ganti, M. Srivatsa, and L. Liu, "Efficient Spatial
Query Processing for Big Data," SIGSPATIAL, vol. 7, no. 11,
pp. 4–12, 2014.

[4] "IBM InfoSphere Streams," http://www-03.ibm.com/software/products/en/infosphere-streams.

[5] "IBM SPSS: Predictive Analysis Software and Solutions,"
http://www-01.ibm.com/software/analytics/spss/.

[6] "Geohash," http://www.geohash.org.

[7] "MongoDB Manual 2.6: Geospatial Indexes and Queries,"
http://docs.mongodb.org/manual/core/geospatial-indexes/.

[8] "MySQL 5.7 Reference Manual: Spatial Geohash Functions,"
http://dev.mysql.com/doc/refman/5.7/en/spatial-geohash-functions.html.

[9] A. Fox, C. Eichelberger, J. Hughes, and S. Lyon, "Spatio-Temporal
Indexing in Non-Relational Distributed Databases,"
in Big Data, 2013 IEEE International Conference on. IEEE,
2013, pp. 291–299.

[10] J. Fender and J. Rose, "A High-Speed Ray Tracing Engine
Built on a Field-Programmable System," in Field-Programmable
Technology (FPT), 2003. Proceedings. 2003
IEEE International Conference on. IEEE, 2003, pp. 188–195.

[11] J. Schmittler, S. Woop, D. Wagner, W. J. Paul, and
P. Slusallek, "Realtime Ray Tracing of Dynamic Scenes
on an FPGA Chip," in Proceedings of the ACM SIGGRAPH/EUROGRAPHICS
conference on Graphics hardware. ACM, 2004, pp. 95–106.

[12] M. Woulfe, M. Manzke, and J. L. Dingliana, "Hardware
Accelerated Broad Phase Collision Detection for Realtime
Simulations," 2007.

[13] H. Hussain, K. Benkrid, C. Hong, and H. Seker, "An adaptive
FPGA Implementation of Multi-Core K-Nearest Neighbour
Ensemble Classifier Using Dynamic Partial Reconfiguration,"
in Field Programmable Logic and Applications (FPL), 2012
22nd International Conference on. IEEE, 2012, pp. 627–630.

[14] E. S. Manolakos and I. Stamoulias, "Flexible IP Cores for the
k-NN Classification Problem and Their FPGA Implementation,"
in Parallel & Distributed Processing, Workshops and
Phd Forum (IPDPSW), 2010 IEEE International Symposium
on. IEEE, 2010, pp. 1–4.

[15] L. Woods, J. Teubner, and G. Alonso, "Complex Event
Detection at Wire Speed With FPGAs," Proceedings of the
VLDB Endowment, vol. 3, no. 1-2, pp. 660–669, 2010.

[16] R. Moussalli, M. R. Vieira, W. Najjar, and V. J. Tsotras,
"Stream-Mode FPGA Acceleration of Complex Pattern Trajectory
Querying," in Advances in Spatial and Temporal
Databases. Springer, 2013, pp. 201–222.

[17] R. Moussalli, I. Absalyamov, M. R. Vieira, W. Najjar, and
V. J. Tsotras, "High Performance FPGA and GPU Complex
Pattern Matching Over Spatio-Temporal Streams," GeoInformatica,
pp. 1–30, 2014.
