
IADIS International Conference Informatics 2008

OPTIMISING LARGE HASH TABLES FOR LOOKUP PERFORMANCE

Sándor Juhász
Budapest University of Technology and Economics, Department of Automation and Applied Informatics
1111 Budapest, Goldmann György tér 3., Hungary

Ákos Dudás
Budapest University of Technology and Economics, Department of Automation and Applied Informatics
1111 Budapest, Goldmann György tér 3., Hungary

ABSTRACT
Hash tables can provide fast mapping between keys and values even for voluminous data sets. Our main goal is to find a
suitable implementation having a compact structure and an efficient collision avoidance method. Our attention is focused on
maximizing lookup performance when handling several millions of data items. This paper suggests a new,
memory-consumption-oriented way of comparing significantly different approaches and analyses various types of hash table
implementations in order to answer the question of which structure should be used and how its parameters must be chosen
to achieve maximal lookup performance with the lowest possible memory consumption.

KEYWORDS
bucket hashing, open hashing, lookup performance, data transformation, cache memory

1. INTRODUCTION
Hashing is a well-known approach for realizing code tables and lookup charts. Hash tables are suitable for
data transformations where it is more efficient to store the output values belonging to the repeating input
values than to use a complex or computation-intensive conversion algorithm. Hash tables provide fast
searching capabilities combined with a compact storage structure. The reason for their efficiency lies in the
classification of the items according to a hash function applied to their keys, considerably reducing the
number of items that need to be tested. The result of the hash function becomes the identifier of the group
the item will belong to. Forming these groups helps to minimize the number of items that have to be
compared when searching for a specific key.
The distribution of the items over the available groups is mainly influenced by factors such as the choice
of the hash function, the number of groups created and the uniformity of the distribution of the hashed values.
The more items are mapped to the same group, the slower the search becomes, because of the increased number of
items the search has to examine.
The appropriate choice of the hash function and the number of groups significantly affects the length of the
search path: unbalanced hashing or a low number of groups results in many items being mapped to the same
slot. Besides the reduction of the average search path length, the storage structure itself also has a serious
performance impact. The main factors to be considered here are memory footprint and cache friendliness.
Increasing the number of indirections in the storage structure usually decreases memory consumption, but has a negative
effect on lookup speed. Increased locality of the structure allows faster collision handling because of
better cache usage, but maintaining this compactness usually calls for a heavier administrative load.
Our paper proposes and analyses different storage structures, trying to highlight the balance point between the
above-mentioned basic options. A number of recent papers have considered the effect of the modern computer
memory hierarchy as an influencing factor, and examined how caching affects the performance of algorithms
(Heileman and Luo 2005). We continue along this path, propose different implementation approaches and
tune the free parameters to further increase cache friendliness. The focus is on cases where the number of
items to be stored is in the domain of tens of millions or higher, and the number of search queries to complete
can exceed hundreds of millions.
Our main contribution is introducing a new comparison method for evaluating hash table performance,
and using this method to test and examine different variations of hash tables. Most papers and studies about
hash tables use either the load factor or the number/size of buckets as the variable (Heileman and Luo 2005; Bell
1970), neglecting memory consumption. Our new method compares the execution times as a function of the
measured or calculated amount of reserved memory, allowing a fairer comparison of the algorithms under
the same memory conditions.
The rest of the paper is organized as follows: Section 2 presents the related literature, giving a short
introduction to hash tables and to the aspects of their performance tuning. Section 3 outlines our hash table
implementations and the calculation of their memory consumption. The experimental tests and results are
shown in Section 4, where the outcome of synthetic tests is used to draw conclusions. We conclude in
Section 5 by summarizing the presented methods and results.

2. RELATED WORK
Our work was inspired by a real-life data mining project dealing with the processing of web log data (Juhász and
Iváncsy 2007). The web logs in question record the activity of a few million people surfing the web during a
time interval of several months, providing several terabytes of raw data. Because of the huge size,
compression is applied to the raw logs as a preprocessing step, which includes recoding the original verbose
text field formats to 4-byte integers to save space. Hash tables are used to complete this task, as they
promise a fast, lookup-table-based transformation.
The history of hash tables dates back to the 1960s, when their ancestor, key-to-address transformation,
was studied to find a record in a background file using a key (Lum 1971; Brent 1973). Meanwhile, the capacity
improvement of system memories made it increasingly possible to store the data themselves in the
main memory as well, but the basic principle remains the same: the possible location of an item must be guessed as
precisely as possible using its key.
Hash tables have two basic types differing in the collision handling method. The first one is called open
addressing, where the number of slots is fixed, and one slot contains exactly zero or one item. If more than
one item is mapped to the same slot by the hash function, the algorithm finds another free slot inside the table
with the help of a secondary function. This approach allocates a fixed amount of memory, independently of the
number of items it will eventually store. Open hashing is well studied both analytically and in practice.
After studying some of the many suggested collision resolving methods, Heileman and Luo (2005) claimed
that even though data locality and cache usage were in favor of linear probing, it showed no significant
performance gain due to the long probe lengths. As we will show later, this drawback can be overcome by
increasing the number of slots, which efficiently reduces the long probing paths. To examine this aspect, both
linear double hashing and the so-called quadratic quotient method by Bell (1970) will be tested and compared
to linear probing in Section 4.
Bucket hashing, as opposed to the previous case of open addressing, reserves additional external space
(outside the slots of the main table) for the colliding items, and links them to the corresponding original slots.
As several items can be linked to one single slot, the slots are now called buckets. This solution still leaves other
options to consider, which will be detailed in Section 3.
As our original task consists of speeding up the transformation process, performance becomes our highest
priority. Open hashing is presumed to be the faster solution; although when studying bucket hashing, Lum et
al. (1971) found that with careful design (setting the number of buckets in the order of magnitude of the
number of items) bucket hashing can provide competing performance. As shown in Section 4, according to our
measurements this claim is tenable in some cases, but not as a general statement.
To understand the basic dimensions of the parameter tuning of hash tables, the relevant aspects influencing
the performance must be selected. The execution time is mostly influenced by the number of steps the search
algorithm takes to find an item, which is directly related to the number of items sharing the same hash value.
To reduce the number of colliding items, the hash function is the first place to optimize. Hash functions have
to be fast, and should produce values with a uniform distribution. Lum et al. (1971) analyzed various
commonly used hash functions, such as modulo division, the mid-square method and algebraic coding, and found
that modulo division is one of the best solutions. That is why this hash function was chosen for our studies.
The second parameter having a considerable performance effect is the number of slots. Increasing the
number of slots causes fewer items to be mapped to the same group; however, the memory requirements
grow as well. Probability theory can prove helpful when seeking the optimal size of the buckets.
When the distribution of the hashed keys is uniform, the bucket sizes follow a binomial distribution, thus
the final bucket sizes can be calculated with a particular certainty. Mitzenmacher (2002) gives a formula for
calculating the probability of a bucket ending up with k items; however, it applies only to cases where the
number of buckets is less than the number of items. In this paper, instead of emphasizing the probability of
the individual bucket sizes, the expected bucket size is considered.
The third aspect strongly influencing the performance, marginally referred to before, is cache friendliness. It
is known that the way algorithms utilize memory can seriously impact performance (Pfister 1998). As a
consequence, not only the number of steps (the number of executed CPU instructions) an algorithm takes is
important, but also the memory access pattern and the nature of the memory accesses (Wulf and McKee 1995;
Heileman and Luo 2005; Anderson 2006). Cache utilization strongly determines the performance,
requiring the algorithms and storage structures to be designed accordingly. This aspect acts as the source of
distinction between the several implementations of hash tables suggested in Section 3.

3. HASH TABLE STRUCTURES AND VARIATIONS


This section presents the six hash table structures/variations that will be analyzed in this paper. All types are
given names describing the choices made during their construction. The first part is item-table or pointer-table,
depending on the nature of the data the hash table contains when addressed with the hash of the key. In
the case of item-tables the content is a single item (key-value pair), while pointer-tables introduce a further
indirection, containing just the address of the (first) corresponding item. Item tables are fast and provide
direct access to the values, while pointer tables allocate considerably less memory for empty buckets.
The last part of the name hints at the storage mode or collision resolution method for the items mapped to
the same location. In case of a collision, linear-probing puts the new item into the next free location after the
one originally calculated by the hash function. In case of linear-double-hashing the probe sequence of key k
is calculated with the help of the original hash function h and a secondary hash function g as
h(k) + i·g(k), i = 1, 2, …. As suggested by Heileman and Luo (2005), if h(k) = k mod V (V is the number of
slots) and g(k) = k mod (V − 2), then the probe sequence visits every slot in the table, provided V is chosen as a prime
number. The third secondary hash function is quadratic-quotient, introduced by Bell (1970), which uses
h(k) + a·i + b(k)·i², i = 1, 2, … to calculate the probe sequence, where a is a chosen constant and b(k) = k div V.
This method visits exactly half of the table before returning to the original position.
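
For illustration, the three probe index computations can be sketched as follows (a minimal sketch of the formulas above; the function names, the 64-bit types and the +1 safeguard in the double hashing step are our own choices, not taken from the paper's implementation):

```cpp
#include <cstdint>

// Linear probing: h(k), h(k)+1, h(k)+2, ... (all mod V).
uint64_t linear_probe(uint64_t k, uint64_t i, uint64_t V) {
    return (k % V + i) % V;
}

// Linear double hashing: h(k) + i*g(k) with g(k) = k mod (V - 2);
// visits every slot when V is prime (Heileman and Luo 2005).
// The +1 is our own safeguard against a zero probe step.
uint64_t double_hash_probe(uint64_t k, uint64_t i, uint64_t V) {
    uint64_t g = k % (V - 2) + 1;
    return (k % V + i * g) % V;
}

// Quadratic quotient (Bell 1970): h(k) + a*i + b(k)*i^2 with b(k) = k div V;
// visits exactly half of the table before returning to the start.
uint64_t quad_quot_probe(uint64_t k, uint64_t i, uint64_t V, uint64_t a) {
    uint64_t b = k / V;
    return (k % V + a * i + b * i * i) % V;
}
```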
The suffix with-list indicates that the colliding items are chained after each other in a linked list, where memory
for new items is allocated from the system memory as needed. The suffix with-array means that the colliding
items share a larger common block of memory for each group of items having the same hash value, thus
they are accessible one after the other without having to follow any pointer chains.

3.1 Memory Requirement Analysis


Measuring the execution time shows performance from one point of view, but we should not forget about memory.
Although we limit our studies to voluminous sets of items fitting in the system memory, the required amount
is still important, since during the processing of the data most of the memory may be reserved by concurrent tasks.
Additionally, referring to cache friendliness again, if the data spans a bigger amount of memory, the
chance of cache hits decreases, as will be shown later by our results as well.
To provide a starting point for further analysis, let us consider the distribution of the items according to
the hash values of their keys. Let N be the number of items, V the number of buckets, and let v_i denote the
size of bucket i. The most important quantities for further analysis are the number u of buckets left
empty, and the average size s of the non-empty buckets.
The probability of leaving bucket i empty after the insertion of all N items is:


$$P(v_i = 0) = \left(1 - \frac{1}{V}\right)^{N}. \tag{3.1}$$

The expected value of the number of empty buckets is:

$$u = E\!\left(\sum_{i=1}^{V} \mathbf{1}[v_i = 0]\right) = V\left(1 - \frac{1}{V}\right)^{N} = V\left(\left(1 - \frac{1}{V}\right)^{V}\right)^{N/V} \approx V\left(\frac{1}{e}\right)^{N/V} = V e^{-N/V}. \tag{3.2}$$

As s is the expected value of the number of items per non-empty bucket, we know that

$$E(v_i) = E(v_i \mid v_i \geq 1)\cdot P(v_i \geq 1) + 0 \cdot P(v_i = 0) = s\,P(v_i \geq 1). \tag{3.3}$$

To calculate E(v_i), we use

$$\sum_{i=1}^{V} E(v_i) = E\!\left(\sum_{i=1}^{V} v_i\right) = E(N) = N = V\,E(v_i). \tag{3.4}$$

Combining the results of (3.3) and (3.4) provides

$$s = E(v_i \mid v_i \geq 1) = \frac{E(v_i)}{P(v_i \geq 1)} = \frac{N/V}{1 - P(v_i < 1)} = \frac{N/V}{1 - \left(1 - \frac{1}{V}\right)^{N}}. \tag{3.5}$$
The above formulas for the number u of empty buckets and the average size s of the non-empty buckets will
be used for calculating the expected memory cost in the next subsection.
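
As a quick numerical illustration of (3.2) and (3.5), the expected values can be computed directly (a minimal sketch; the parameter values below are only examples, taken from the test setup of Section 4):

```cpp
#include <cmath>
#include <cstdio>

int main() {
    double N = 10e6;   // number of items (10 million, as in Section 4)
    double V = 20e6;   // number of buckets (example value)

    // (3.2): expected number of empty buckets, u ~ V * e^(-N/V)
    double u = V * std::exp(-N / V);

    // (3.5): average size of the non-empty buckets
    double s = (N / V) / (1.0 - std::pow(1.0 - 1.0 / V, N));

    std::printf("empty buckets u ~ %.0f, avg non-empty size s ~ %.3f\n", u, s);
    return 0;
}
```

With N/V = 0.5 this gives roughly u ≈ 12.1 million empty buckets and s ≈ 1.27 items per non-empty bucket.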

3.2 Six Hash Table Variations


The first structure (see Figure 1) is item-table-linear-probing (or ILinProbe for short), which is the original
open hash table, where each slot contains a key and a value. Collision resolution is performed by linear
probing. This method is very fast and has good memory access characteristics (it is cache friendly), but
needlessly allocates memory for the empty slots. Also, choosing the number of slots fixes the maximum
number of items that can be stored. The memory need of this version is
MILinProbe = V * (ks + ds), (3.7)
where ks is the size of the key and ds is the size of the value (data) belonging to it. This type of structure is
claimed to be good when the saturation level is between 30% and 70% (Pagh et al. 2007), with a significant amount
of memory reserved in excess.
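
A lookup in such a table can be pictured with the following sketch (illustrative only; the occupancy flag and the field layout are our own assumptions, and the paper's implementation may encode empty slots differently):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

const std::size_t KEY_SIZE = 20;   // ks in the formulas above

// One open-table slot: key and value side by side (ks + ds bytes, plus
// our own occupancy flag -- an assumption, not the paper's encoding).
struct Slot {
    uint8_t  key[KEY_SIZE];
    uint32_t value;
    bool     occupied;
};

// Lookup with linear probing: start at the hashed slot and scan forward.
const Slot* lookup(const Slot* table, std::size_t V,
                   const uint8_t* key, std::size_t hash) {
    for (std::size_t i = 0; i < V; ++i) {
        const Slot& s = table[(hash + i) % V];
        if (!s.occupied)
            return 0;                                   // empty slot: key absent
        if (std::memcmp(s.key, key, KEY_SIZE) == 0)
            return &s;                                  // found
    }
    return 0;                                           // table full, key absent
}
```

The sequential scan over adjacent slots is what makes this variant cache friendly: consecutive probes usually fall into the same cache line.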
The second and third variations, item-table-linear-double-hashing (IDoubleHash) and
item-table-quadratic-quotient (IQuadQuot), use the same structure, but they apply the collision resolving methods
appearing in their names instead of linear probing. Both have the same memory requirement as ILinProbe.
The fourth version (see Figure 1), called item-table-with-list (IList), has a structure similar to the former
ones, but instead of probing additional slots inside the table, external space is allocated for the new items if
the corresponding slot is already occupied (according to the definition given in Section 2, this is a
representative of bucket hashing). In other works this approach is also referred to as chaining the items (Munro
and Celis 1986). This modification opens the possibility of decreasing the number of buckets, but iteration
through long chains is expensive. The memory need of this implementation is increased by one pointer per
item, but decreased by the omitted empty buckets. The approximate total amount this version needs is
MIList = (N + u) * (ks + ds + 4), (3.8)
where u is the estimated number of empty buckets. This formula is obtained by adding the memory
requirement of the V fixed in-table slots and the number of extra slots outside the table (N − (V − u)).
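
The per-slot cost in formula (3.8) corresponds to a node shaped roughly like this (a sketch; the field layout is our own assumption):

```cpp
#include <cstdint>

// Sketch of an IList slot/node. The first item of a bucket lives inside the
// table; colliding items are chained through externally allocated nodes of
// the same shape. With 20-byte keys, 4-byte values and 4-byte pointers
// (32-bit build, as in the paper's test setup) this is exactly the
// ks + ds + 4 bytes per slot used in formula (3.8).
struct IListNode {
    uint8_t    key[20];   // ks-byte key
    uint32_t   value;     // ds-byte value
    IListNode* next;      // chain of colliding items, NULL-terminated
};
```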
The fifth version (see Figure 2) is pointer-table-with-list (PList), which seeks to decrease the cost of the
empty buckets by storing pointers in the slots instead of the items, and allocates external space for each item.
Although it introduces an extra pointer indirection, it consumes less memory than an IList with the same
number of slots. The items are still chained to each other using a linked list. This implementation is a
traditional bucket hash table. The memory need in this case is
MPList = V * 4 + N * (ks + ds + 4). (3.9)


[Figure 1 diagram: left, the key | value slots of ILinProbe, IDoubleHash and IQuadQuot (ks + ds bytes each, some empty); right, the key | ptr | value slots of IList (ks + ds + 4 bytes each), with colliding items chained externally.]
Figure 1. The structure of ILinProbe, IDoubleHash, IQuadQuot and IList variations.

[Figure 2 diagram: left, the 4-byte pointer slots of PList with external key | ptr | value nodes (ks + ds + 4 bytes each); right, the 4-byte pointer slots of PArray with external blocks of 4 + length·(ks + ds) bytes holding the length, then the keys, then the values.]
Figure 2. The structure of PList and PArray variations.


The last version (see Figure 2), called pointer-table-with-array (PArray), follows the same path as the
previous one, but tries to get rid of the pointer associated with each item by using arrays instead. These
arrays are fixed in size, thus this version has the overhead of re-allocating the arrays when they become full.
Since this happens in the building phase only, it is not reflected in the lookup performance.
The size of the block is stored at the beginning of each array; this value is followed by the keys placed next
to each other, ending with the values in the same order. Putting the keys side by side increases the cache
friendliness of the search operation. The memory need in this last case is
MPArray = V * 4 + N * (ks + ds) + (V − u) * 4, (3.10)
where the last term is the size information of the non-empty buckets.
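
A bucket lookup over this layout might look like the following sketch (our own illustrative types and sizes; the original implementation is not reproduced here):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// One PArray bucket: a single block holding [length | keys... | values...].
// Keys are packed contiguously, so the scan touches few cache lines.
struct PArrayBucket {
    uint32_t length;   // number of items in this block; keys and values follow
};

const std::size_t KEY_SIZE = 20;  // ks: 20-byte keys (test setup of Section 4)
const std::size_t VAL_SIZE = 4;   // ds: 4-byte values

// Returns a pointer to the value belonging to `key`, or NULL if not found.
const uint8_t* parray_lookup(const PArrayBucket* bucket, const uint8_t* key) {
    // Keys start right after the 4-byte length, values after the keys,
    // matching the 4 + length*(ks + ds) bytes per block of formula (3.10).
    const uint8_t* keys   = reinterpret_cast<const uint8_t*>(bucket + 1);
    const uint8_t* values = keys + bucket->length * KEY_SIZE;
    for (uint32_t i = 0; i < bucket->length; ++i) {
        if (std::memcmp(keys + i * KEY_SIZE, key, KEY_SIZE) == 0)
            return values + i * VAL_SIZE;   // value at the same ordinal position
    }
    return 0;                               // key not present in this bucket
}
```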

4. OPTIMIZING PERFORMANCE
The nature of the project described in Section 2 determines the most decisive use case of the hash tables,
which is efficient searching for known elements. Based on this fact, our attention is focused on searching for
known items, ignoring the behavior in case of unsuccessful lookups and the time taken by inserting the items.
The hash tables were implemented in C++ compiled with the Microsoft Visual Studio 2008 compiler. All
implementations were optimized using a custom memory manager and the best techniques known to us to
reduce the number of instructions. To measure the performance, 10 million random numbers of 20 bytes were
generated and used as keys, and a value of 4 bytes was assigned to each of them. After inserting the
above items into the table, 200 million random lookup operations were performed for the previously
inserted items. The items themselves and the order in which the items are searched for are pseudo-random
values generated with the Mersenne Twister (Matsumoto and Nishimura 1998). The tests were carried out on an
Intel Pentium 4 processor @ 3.2 GHz with 2 MB L2 cache and 4 GB memory, running Windows Server 2003 R2.
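
Roughly, the measurement harness can be pictured as below (a sketch only: `HashTable` and its `insert`/`find` interface are placeholders of ours, and we use the C++11 <random> Mersenne Twister for brevity, whereas the original 2008 code predates it):

```cpp
#include <array>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <random>
#include <vector>

typedef std::array<uint8_t, 20> Key;   // 20-byte keys, as in the test setup

// Generate one pseudo-random 20-byte key with the Mersenne Twister.
static Key random_key(std::mt19937& rng) {
    Key k;
    for (std::size_t i = 0; i < k.size(); ++i)
        k[i] = static_cast<uint8_t>(rng());
    return k;
}

int main() {
    const std::size_t N = 10 * 1000 * 1000;          // 10 million items
    const std::size_t LOOKUPS = 200 * 1000 * 1000;   // 200 million lookups

    std::mt19937 rng(42);
    std::vector<Key> keys(N);
    for (std::size_t i = 0; i < N; ++i)
        keys[i] = random_key(rng);

    // Build phase (not timed): insert (key, value) pairs.
    // HashTable table(num_buckets);                 // placeholder interface
    // for (uint32_t i = 0; i < N; ++i) table.insert(keys[i], i);

    // Lookup phase: search for previously inserted keys in random order.
    std::chrono::steady_clock::time_point t0 = std::chrono::steady_clock::now();
    for (std::size_t q = 0; q < LOOKUPS; ++q) {
        const Key& k = keys[rng() % N];
        (void)k;                                     // table.find(k) goes here
    }
    std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();
    std::printf("lookup time: %.1f s\n",
                std::chrono::duration<double>(t1 - t0).count());
    return 0;
}
```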

4.1 Comparison by the Number of Buckets or by Memory Need


First of all, the behavior of the presented implementations was analyzed as a function of the number of buckets
in Figure 3. We would like to emphasize that this classical representation is misleading: IList seems to have
the best overall performance, and Figure 3 suggests that the more buckets we use, the better the performance
gets. However, as will be shown below, this deduction is deceptive.


[Figures 3 and 4 plot execution time in seconds for ILinProbe, IDoubleHash, IQuadQuot, IList, PList and PArray.]
Figure 3. Comparison of execution time with varying bucket number [million].
Figure 4. Comparison of execution time considering the memory need (reserved memory [MB]).
On the other hand, one conclusion can be safely drawn: IDoubleHash and IQuadQuot are not good
choices, since in all cases they turned out to be worse than ILinProbe. (Since their memory consumption is
equal, the representation of Figure 3 is accurate for them.) This proves that in the case of open addressing, linear probing is
really fast. This result dispels the doubt of Heileman and Luo (2005) about the efficiency of linear probing,
and underlines the importance of its cache friendliness. They stated that even though linear probing gained a
significant advantage over double hashing in some scenarios, this is generally not true. Here we see
that linear probing is a viable choice in the case of voluminous hash tables (not fitting entirely into the caches)
with a uniform distribution of hash values (obtainable with the proper choice of hash function) and item sizes
not considerably bigger than the cache line size. Under these circumstances, linear probing outperforms the
best known secondary hash functions, as the possibly longer search paths are compensated by the lower number
of cache misses. However, it is out of our scope to examine its behavior when these conditions are not met.
Please note that the 10 million slots in Figure 3 are increased to 10.2 million (2 percent higher) for
ILinProbe, IDoubleHash and IQuadQuot, since using exactly 10 million would take significantly more time
because of the 100% load factor applied to the open hash.
As the previous representation approach did not take into consideration the memory consumption of the
algorithms, let us introduce a new dimension used for the performance comparison of hash tables. Our
method tries to even the odds by measuring (or calculating) the memory need of the different variations, and
displays the execution time as a function of this memory. This approach makes it possible to rank the
algorithms fairly by giving them equal amounts of resources (memory). The same measurements following this new
approach are shown in Figure 4. Our way of representation brings an important improvement over the widely
used slot-based representation, as the picture goes through a considerable change compared to Figure 3. The
most compact in memory and the fastest is ILinProbe, while IList, which seemed to be the best, is a little
slower, as its extra pointer of 4 bytes per item increases the memory need.
The two classical bucket versions, PList and PArray, are worse than the previous two. If we compare IList
and PList, where the main difference is how the empty buckets are handled (whether they consume a whole item
space or just a 4-byte pointer), we see that getting rid of a level of indirection makes it worthwhile to waste
some memory on the empty buckets. It is also clearly noticeable in the memory-based representation that
after a certain point, giving more and more memory to any hash table results in a slight increase of execution
time; thus increasing the number of buckets beyond all limits is not just worthless but deteriorates
performance. This phenomenon can be explained by the reduction of data locality, as spanning the same
amount of data over a bigger memory space results in more and more cache misses.
To provide better observability, the region marked in Figure 4 is enlarged in Figure 5, where the curves
are labeled with the bucket numbers (IDoubleHash and IQuadQuot were omitted from this diagram). It is
now clearly visible that the bucket number is best chosen around 20 million, or between 20 and 30
million in the case of ILinProbe and IList. This is twice or three times the number of items. We see no reason
to go beyond this point (below a saturation level of 33-50% for open hashing). The only region where PArray is
competitive is between 1 and 5 million buckets, where it is still not the best, but it is the one that tolerates
changes the best, and does not react to an increase in the number of items with an exponential increase in execution
time. To examine this aspect as well, the next subsection analyses how robust the different variants are.


[Figures 5 and 6 plot execution time in seconds for ILinProbe, IList, PList and PArray; in Figure 5 the curves are labeled with bucket counts from 1 to 100 million, in Figure 6 the horizontal axis is the number of items in millions.]
Figure 5. Focusing on the interesting region of bucket numbers.
Figure 6. Testing the robustness of the hash tables.

4.2 Robustness
So far the figures have only shown how the different variations perform when the number of items is fixed and
the bucket number can thus be chosen accordingly. The number of items, however, may not always be known a
priori. A good hash table is robust, which means it should handle an increase in the number of items with a
graceful degradation of lookup time; otherwise frequent reallocations would become too expensive.
The robustness test is shown in Figure 6. The number of items the hash tables are designed for is 2.5
million, thus the number of buckets was chosen to be 5 million. The number of items is varied from 1
million (40% of the expected amount) to 20 million (8 times as much as expected).
The original ILinProbe reaches its maximum at 5 million items; at this point its execution time becomes
very high compared to the others. IList, on the other hand, behaves very well: its execution time increases
linearly with the number of items, which is acceptable behavior. One interesting case is PArray, which
gains an advantage compared to the others with the increasing number of items per bucket.

4.3 Advantages and Disadvantages


Table 1 below summarizes the main design aspects and their effects. The first aspect deals with the number
of items, that is, how the increasing number of items influences the performance if the number of slots is
kept constant. Sensitivity to the size of the key illustrates the memory requirement growth as the key size
gets bigger. Performance is an overall point of view, which compares the algorithms if the allocated
memory size is fixed. The aspect of cache alignment forces relevant memory blocks to start at cache line
boundaries, lowering the number of cache misses but increasing the memory footprint. A typical cache line size
is 64 bytes in today's CPUs (Intel Corp. 2007). The last aspect is the overall memory consumption and
behavior. The algorithms IDoubleHash and IQuadQuot are not enumerated in Table 1, as they behave exactly
the same way as ILinProbe; the only difference is their lower performance, detailed in Subsection 4.1.
Table 1. Advantages and disadvantages of the outlined hash tables

                                 ILinProbe        IList          PList          PArray
sensitivity to increasing
  number of items                very sensitive   sensitive      sensitive      robust
memory sensitivity to the
  size of the key                very sensitive   sensitive      sensitive      robust
performance with
  appropriate parameters         good             good           medium         low (gains advantage
                                                                                for low memory)
cache alignment                  minor effect     minor effect   no effect      minor effect
memory requirement for the
  same performance               lowest           medium         medium         high in the high
                                                                                performance region

5. CONCLUSION
Various hash tables and implementation considerations were presented and analyzed in this paper. Our
intention was to provide an overview of the different structures and to present ways towards the optimization
of such lookup tables. When analyzing storage structures, we came to the conclusion that traditional open
addressing is a good choice, regardless of the fact that empty slots seemingly waste memory. When using
pointers to decrease the cost of empty buckets, the further indirection causes a performance slowdown that
must be compensated with a higher number of buckets, which ends up consuming even more memory than
the original open hash approach. We also presented a hybrid open-bucket hash, IList, which uses chaining as
the collision resolving method, allowing the hash table to be tolerant towards changes in the number of items
without the frequent need to resize the whole table, at the cost of a slight increase in memory.
Version ILinProbe (open addressing with linear probing) turned out to be the fastest, although linear
probing is presumed to be the best choice only with a near-uniform distribution of hash values. In other cases,
clustering of the items may increase the probe path length, weakening the benefits of cache friendliness.
To overcome this issue and the robustness problem, we suggest the use of an IList-type hash
table, which is able to cope with a wider range of circumstances. The best performance is obtained when the
number of slots (buckets) is twice the number of items, but choosing the number of slots to be equal to
the number of items provides nearly the same performance (10% lower) at 55% of the memory cost.
We may also note that the 30-70% saturation level said to be optimal for open addressing was justified,
although according to our results reducing the saturation below 50% does not increase the performance, and
going below 30% can indeed cause performance loss for large data sets.

ACKNOWLEDGEMENT
This work was supported by the Mobile Innovation Center, Hungary. Their help is kindly acknowledged.

REFERENCES
Anderson, B., 2006. Processor Cache 101: How Cache Works. AMD Developer Central, [Online]
http://developer.amd.com/article_print.jsp?id=84.
Bell, James R., 1970. The quadratic quotient method: A hash code eliminating secondary clustering. Communications of
the ACM, Vol. 13, No. 2, pp 107-109.
Brent, R. P., 1973. Reducing the retrieval time of scatter storage techniques. Communications of the ACM, Vol. 16, No.
2, pp 105-109.
Heileman, G. L. and Luo, W., 2005. How caching affects hashing. In Proceedings of the 7th Workshop on Algorithm
Engineering and Experiments, Vancouver, Canada, pp 141-154.
Intel Corporation, 2007. Intel® 64 and IA-32 Architectures Optimization Reference Manual. Vol. 1, Rev. 2.0, [Online]
http://developer.intel.com/design/processor/manuals/248966.pdf.
Juhász, S. and Iváncsy, R., 2007. Tracking Activity of Real Individuals in Web Logs. International Journal of Computer
Science, Vol. 2, No. 3, pp 172-177.
Lum, V. Y. et al., 1971. Key-to-address transform techniques: A fundamental performance study on large existing
formatted files. Communications of the ACM, Vol. 14, No. 4, pp 228-239.
Matsumoto, M. and Nishimura, T., 1998. Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random
number generator. ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, pp 3-30.
Mitzenmacher, M., 2002. Good Hash Tables & Multiple Hash Functions. Dr. Dobb's Journal, No. 336, pp 28-32.
Munro, J. I. and Celis, P., 1986. Techniques for collision resolution in hash tables with open addressing. In Proceedings
of the 1986 ACM Fall Joint Computer Conference, Dallas, United States, pp 601-610.
Pagh, R. et al., 2007. Linear probing with constant independence. In Proceedings of the Thirty-Ninth Annual ACM
Symposium on Theory of Computing, San Diego, United States, pp 318-327.
Pfister, G. F., 1998. In Search of Clusters (Second Edition). Prentice Hall, Upper Saddle River, New Jersey, USA.
Wulf, W. A. and McKee, S. A., 1995. Hitting the Memory Wall: Implications of the Obvious. Computer Architecture
News, Vol. 23, pp 20-24.

