The following function, taken from [Aho, Sethi, Ullman], computes the hash code of a string, seen as a sequence of bytes in memory.
/* Original hash function */
static unsigned long
compute_hashval (const void *key, size_t keylen)
{
  size_t cnt;
  unsigned long int hval, g;

  cnt = 0;
  hval = keylen;
  while (cnt < keylen)
    {
      hval <<= 4;
      hval += (unsigned long int) *(((char *) key) + cnt++);
      g = hval & ((unsigned long) 0xf << (LONGBITS - 4));
      if (g != 0)
        {
          hval ^= g >> (LONGBITS - 8);
          hval ^= g;
        }
    }
  return hval != 0 ? hval : ~((unsigned long) 0);
}
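The constant LONGBITS is evidently the number of bits in an unsigned long; in the glibc sources it is provided by the surrounding headers. To compile the function stand-alone, one possible definition (my own assumption, not the glibc definition) is:
/* Possible stand-alone definition of LONGBITS (the glibc sources provide
   their own): the number of bits in an unsigned long.  */
#include <limits.h>
#define LONGBITS (CHAR_BIT * sizeof (unsigned long))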
This function is used in the 'localedef' program that belongs to the GNU C Library version 2.2. The command that was benchmarked was the creation of a Unicode locale: "localedef -i de_DE -f UTF-8 de_DE.UTF-8". compute_hashval is used by both the primary and secondary hash functions of two hash tables. The meaning of primary and secondary hash function becomes clear from the following hash table lookup function:
static size_t
lookup (hash_table *htab, const void *key, size_t keylen, unsigned long hval)
{
  unsigned long hash;
  size_t idx;
  /* Lookup call counter L */
  hash_entry *table = (hash_entry *) htab->table;

  /* First hash function: simply take the modulo but prevent zero.  */
  hash = 1 + hval % htab->size;

  idx = hash;
  if (table[idx].used)
    {
      if (table[idx].used == hval && table[idx].keylen == keylen
          && memcmp (key, table[idx].key, keylen) == 0)
        return idx;

      /* Collision counter C */

      /* Second hash function as suggested in [Knuth].  */
      hash = 1 + hval % (htab->size - 2);

      do
        {
          if (idx <= hash)
            idx = htab->size + idx - hash;
          else
            idx -= hash;

          /* If entry is found use it.  */
          if (table[idx].used == hval && table[idx].keylen == keylen
              && memcmp (key, table[idx].key, keylen) == 0)
            return idx;

          /* Round-looping counter R */
        }
      while (table[idx].used);
    }
  return idx;
}
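The secondary hash determines the step width of the probe sequence. To make the probing order easy to experiment with outside localedef, here is a small stand-alone sketch (names and example values are my own, not from the glibc sources); it merely replays the index arithmetic of lookup():
/* Probe-sequence sketch (not part of localedef): reproduces the primary and
   secondary hash computation of lookup () above, so the probing order can be
   inspected in isolation.  The table size is assumed to be prime, as is
   customary for this double-hashing scheme; the key comparison and the
   'used' bookkeeping are omitted.  */
#include <stdio.h>

static void
print_probe_sequence (unsigned long hval, size_t size, size_t nprobes)
{
  size_t idx = 1 + hval % size;          /* first hash function */
  size_t hash = 1 + hval % (size - 2);   /* second hash function */
  size_t i;

  printf ("%lu", (unsigned long) idx);
  for (i = 1; i < nprobes; i++)
    {
      /* Same stepping as in the do/while loop of lookup ().  */
      if (idx <= hash)
        idx = size + idx - hash;
      else
        idx -= hash;
      printf (" -> %lu", (unsigned long) idx);
    }
  printf ("\n");
}

int
main (void)
{
  /* Example values; any hash value and prime table size will do.  */
  print_probe_sequence (123456789UL, 1009, 8);
  return 0;
}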
The two hash tables have keys of the following form:
At the marked positions, three counters were added. The following table shows their values for various resize thresholds:
Resize at % | Lookups L | Original C | Original R | Modified C | Modified R | Best C | Best R |
99 | 3967749 | 3675179 | 81368723 | 3411436 | 38852940 | 2878792 | 21111366 |
98 | 3967749 | 3673939 | 80417219 | 3409981 | 38222653 | 2875586 | 20758471 |
95 | 3967749 | 3672883 | 78027125 | 3405394 | 37278047 | 2864306 | 20284108 |
90 | 3967749 | 3660798 | 75756023 | 3388510 | 35800720 | 2838975 | 19679561 |
85 | 3967749 | 3326789 | 57423689 | 2851028 | 16483724 | 1469015 | 4341997 |
80 | 3967749 | 3319496 | 54165729 | 2834775 | 16047321 | 1439351 | 4094990 |
75 | 3967749 | 3311030 | 50568145 | 2819586 | 15658912 | 1411443 | 3909274 |
70 | 3967749 | 3305998 | 48357944 | 2804174 | 15330091 | 1385422 | 3764736 |
65 | 3967749 | 3307809 | 47748304 | 2790929 | 15169761 | 1365246 | 3652689 |
60 | 3967749 | 3299227 | 47180698 | 2777389 | 14913818 | 1344251 | 3554541 |
55 | 3967749 | 3303693 | 46376191 | 2767100 | 14202239 | 1334525 | 3531360 |
50 | 3967749 | 3306849 | 45578422 | 2756588 | 14105465 | 1326398 | 3490387 |
45 | 3967749 | 3304367 | 44779595 | 2745398 | 13964985 | 1313752 | 3375041 |
40 | 3967749 | 2888722 | 37253590 | 2493803 | 12063451 | 608490 | 1248833 |
35 | 3967749 | 2893043 | 35589986 | 2485646 | 11974145 | 599018 | 1221599 |
30 | 3967749 | 2892003 | 35223235 | 2478289 | 11865595 | 586111 | 1187839 |
25 | 3967749 | 2898534 | 34987248 | 2470076 | 11608237 | 571787 | 1139588 |
20 | 3967749 | 2635739 | 33145946 | 2420612 | 11010101 | 412730 | 711494 |
15 | 3967749 | 2631859 | 32882465 | 2405928 | 10942076 | 402464 | 562713 |
A very small "resize at" percentage means a very large table with many holes. One would expect C and R to shrink considerably towards the bottom of the table, but this is hardly the case: the R/L ratio (the average number of round-loops per lookup call) decreases only from about 20 (top, resize at 99%) to about 8 (bottom, resize at 15%). The reason for this behaviour is the hash function itself: for the bit patterns that occur in the keys, bits of the hash value are cancelled ("killed") during the shift-and-XOR folding, so many distinct keys end up with the same hash value and pile up in long collision lists. As a first improvement, the shift width was increased from 4 to 5 bits, with the mask adjusted accordingly:
/* Modified hash function */
static unsigned long
compute_hashval (const void *key, size_t keylen)
{
  size_t cnt;
  unsigned long int hval, g;

  cnt = 0;
  hval = keylen;
  while (cnt < keylen)
    {
      hval <<= 5;
      hval += (unsigned long int) *(((char *) key) + cnt++);
      g = hval & ((unsigned long) 0x1f << (LONGBITS - 5));
      if (g != 0)
        {
          hval ^= g >> (LONGBITS - 8);
          hval ^= g;
        }
    }
  return hval != 0 ? hval : ~((unsigned long) 0);
}
The fifth and sixth columns of the table show the collision and round-loop counts for this hash function. In this case the R/L ratio decreases from 9.8 to about 2.75, but the gross behaviour remains the same: bits of the hash value are still being killed for the key patterns at hand. The best results were obtained by replacing the shift-and-XOR folding with a plain rotation, which discards no bits at all:
/* Best hash function */
static unsigned long
compute_hashval (const void *key, size_t keylen)
{
  size_t cnt;
  unsigned long int hval;

  cnt = 0;
  hval = keylen;
  while (cnt < keylen)
    {
      hval = (hval << 9) | (hval >> (LONGBITS - 9));
      hval += (unsigned long int) *(((char *) key) + cnt++);
    }
  return hval != 0 ? hval : ~((unsigned long) 0);
}
The results of this function are displayed in the seventh and eighth columns of the table.
Incidentally, this wasn't even the worst possible result: for other key bit patterns, such as 1110xxxx xxxxxxxx xxxx0101, the number of killed bits would have been 8, giving rise to collision lists of length 2^8 = 256.
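To experiment with the collision behaviour outside localedef, one can feed synthetic keys through the original and the best variant and count how many distinct primary slots they occupy. The following harness is only a sketch: the key pattern, the table size and the helper names are made up for illustration, so its counts will not match the numbers in the table above.
/* Collision experiment (sketch, not part of localedef).  Hashes NKEYS
   synthetic 4-byte keys with the original and the best hash function and
   reports how many distinct primary slots (1 + hval % TABLESIZE) each
   produces.  */
#include <stdio.h>
#include <string.h>
#include <limits.h>

#define LONGBITS  (CHAR_BIT * sizeof (unsigned long))
#define NKEYS     4096
#define TABLESIZE 8191

/* Identical to the original compute_hashval above (cast made const-correct).  */
static unsigned long
hash_original (const void *key, size_t keylen)
{
  size_t cnt = 0;
  unsigned long int hval = keylen, g;
  while (cnt < keylen)
    {
      hval <<= 4;
      hval += (unsigned long int) *((const char *) key + cnt++);
      g = hval & ((unsigned long) 0xf << (LONGBITS - 4));
      if (g != 0)
        {
          hval ^= g >> (LONGBITS - 8);
          hval ^= g;
        }
    }
  return hval != 0 ? hval : ~((unsigned long) 0);
}

/* Identical to the best compute_hashval above.  */
static unsigned long
hash_best (const void *key, size_t keylen)
{
  size_t cnt = 0;
  unsigned long int hval = keylen;
  while (cnt < keylen)
    {
      hval = (hval << 9) | (hval >> (LONGBITS - 9));
      hval += (unsigned long int) *((const char *) key + cnt++);
    }
  return hval != 0 ? hval : ~((unsigned long) 0);
}

static size_t
distinct_slots (unsigned long (*hash) (const void *, size_t))
{
  static unsigned char seen[TABLESIZE + 1];
  size_t n = 0;
  unsigned int i;

  memset (seen, 0, sizeof seen);
  for (i = 0; i < NKEYS; i++)
    {
      /* Synthetic 4-byte key: the value i in big-endian byte order.  */
      unsigned char key[4];
      size_t slot;
      key[0] = 0;
      key[1] = 0;
      key[2] = (unsigned char) (i >> 8);
      key[3] = (unsigned char) i;
      slot = 1 + hash (key, sizeof key) % TABLESIZE;
      if (!seen[slot])
        {
          seen[slot] = 1;
          n++;
        }
    }
  return n;
}

int
main (void)
{
  printf ("original: %lu distinct slots for %d keys\n",
          (unsigned long) distinct_slots (hash_original), NKEYS);
  printf ("best:     %lu distinct slots for %d keys\n",
          (unsigned long) distinct_slots (hash_best), NKEYS);
  return 0;
}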
Let's now look at the computation time.
Resize at % | Original | Modified | Best     | Best, Memory |
90          | 19.9 sec | 11.1 sec |  9.4 sec |  76.8 MB |
85          | 19.8 sec |  9.7 sec |  8.0 sec |  98.4 MB |
80          |          |          |  7.9 sec |  98.4 MB |
40          |          | 10.0 sec |  8.5 sec | 141.7 MB |
20          |          | 12.1 sec | 10.8 sec | 239.3 MB |
The computation time also depends on the resize fullness. If this factor is larger than 85%, the number of collisions is quite high (see the first table), and CPU time is eaten up in the round-looping. If it is smaller than 40%, the program uses a whole lot more memory, which slows it down through cache misses and paging. So a good resize fullness lies between 40% and 85%. For the further timings, I chose 75%.
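What the "resize at" percentage means operationally is simply a fill-ratio check before insertion; the following stand-alone sketch (field and function names are my own, not the localedef ones) shows the test:
/* Sketch of the "resize at" test (names are illustrative, not from
   localedef): the table is grown as soon as the fill ratio reaches the
   chosen threshold.  */
#include <stdio.h>

typedef struct
{
  size_t size;    /* number of slots in the table */
  size_t filled;  /* number of slots currently in use */
} hash_table_stub;

/* Return nonzero if the table has reached the given fullness threshold,
   e.g. 75 for "resize at 75%".  */
static int
needs_resize (const hash_table_stub *htab, unsigned int threshold_percent)
{
  return htab->filled * 100 >= htab->size * threshold_percent;
}

int
main (void)
{
  hash_table_stub htab = { 1009, 760 };   /* about 75.3% full */

  printf ("resize at 75%%: %s\n", needs_resize (&htab, 75) ? "yes" : "no");
  printf ("resize at 85%%: %s\n", needs_resize (&htab, 85) ? "yes" : "no");
  return 0;
}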
The final speedup, achieved through the changed hash function and the changed resize threshold, is as follows.
                       | Original | Best    |
Total time             | 19.9 sec | 7.9 sec |
Time spent in lookup() | 72.5%    | 26.9%   |