Fast Address Lookups Using Controlled Prefix Expansion.
V. Srinivasan and George Varghese (WUSTL), 1998
Summary. The authors present three techniques
that, when applied to existing router lookup algorithms, yield lookup
times of 226 nsec (about twice as fast as previously known algorithms) and
worst-case insert/delete times of 2.5 msec.
The Internet Lookup Problem:
Map an incoming IP address to an outgoing interface. This requires more
than a one-to-one mapping because of the millions of IP addresses seen;
instead, the database consists of (prefix, next hop) pairs.
This reduces database size AT THE EXPENSE OF requiring a more complex
lookup: longest matching prefix. Ex:
- USA.* -> Boston, USA.CA.* -> LA
- All packets with address USA forwarded to Boston EXCEPT USA.CA packets
which are forwarded to LA.
- Must match the longest prefix to be correct.
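A naive longest-matching-prefix lookup in Python, for concreteness (the table
layout and names here are my own, not the paper's); the paper's data structures
exist to answer exactly this query in a handful of memory reads:

    def longest_prefix_match(table, addr, width=32):
        # table maps (prefix value, prefix length) -> outgoing interface
        best_len, best_port = -1, None
        for (value, length), port in table.items():
            if length > best_len and (addr >> (width - length)) == value:
                best_len, best_port = length, port
        return best_port

    # Analogue of USA.* -> Boston, USA.CA.* -> LA, using prefixes 1* and 10*:
    table = {(0b1, 1): "Boston", (0b10, 2): "LA"}
    print(longest_prefix_match(table, 0b10 << 30))  # matches both; longest is 10* -> LA
    print(longest_prefix_match(table, 0b11 << 30))  # matches only 1* -> Boston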
Performance Model. Routing Environment: four publicly available routing
databases are used for comparisons. Implementation Model: Software: 300 MHz
Pentium II, Windows NT, 512 KB L2 cache, 32-byte cache lines. Hardware: a cheap
forwarding engine with a 10 nsec clock cycle (?). They map their analyses onto
these models to get average and worst-case lookup/insert/delete times for their
lookup algorithms.
Techniques:
- Controlled Prefix Expansion. Reduce the number of distinct prefix lengths
by expanding shorter prefixes into longer ones. By comparing more bits at a time,
this reduces the height of the trie/tree, and hence the number of memory accesses.
Clearly one wants to keep the number of prefix lengths small; from the paper it
appears it is chosen by targeting a certain forwarding rate/lookup time. (A sketch
of the expansion step follows this list.)
- Picking Optimal Expansion Levels. Clearly want few prefix lengths, but which
lengths should be chosen? The goal here is to minimize the amount of memory
required for the data structure; the paper casts this as a dynamic program over
the levels of a 1-bit trie (sketch after this list).
- Local Restructuring.
- Leaf pushing. Make internal entries pointers only, with data at the leaves.
This saves storage since each entry holds either a pointer or a next hop, never
both. Cost: potentially slow rebuild on insert/delete. (Sketch after this list.)
- Cache alignment. Use (1) packed arrays and (2) semi-perfect hashing: settle
for hash functions with no more than 6 colliding entries per bucket, because 6
entries fit in one 32-byte cache line in the model environment. (Sketch after
this list.)
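A rough sketch of the expansion step in Python (my own code and naming, not the
paper's; assumes prefixes are stored as (value, length) -> next hop and that the
chosen levels include the longest prefix length in use):

    def expand_prefixes(prefixes, levels):
        levels = sorted(levels)
        out = {}
        # Prefixes already at an allowed length are kept verbatim (most specific).
        for (value, length), hop in prefixes.items():
            if length in levels:
                out[(value, length)] = hop
        # Expand the rest to the next allowed length, longest originals first,
        # so a more specific prefix is never clobbered by a less specific one.
        for (value, length), hop in sorted(prefixes.items(), key=lambda kv: -kv[0][1]):
            if length in levels:
                continue
            target = next(L for L in levels if L >= length)
            extra = target - length
            for suffix in range(1 << extra):
                out.setdefault(((value << extra) | suffix, target), hop)
        return out

E.g., with levels = [2], 1* -> Boston expands to 10* and 11*, but the original
10* -> LA keeps its entry, preserving longest-match semantics.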
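The level-picking step, roughly as I read the paper's dynamic program (hedged
sketch: nodes[i], the number of 1-bit-trie nodes at depth i for i = 0..W-1, is
assumed to have been computed from the prefix set already; the real algorithm
also recovers which levels were chosen, which is omitted here):

    def min_expanded_memory(nodes, W, k):
        # T[j][r] = min number of array entries to cover bits 0..j using r
        # expansion levels, with the last level ending at bit j.
        INF = float('inf')
        T = [[INF] * (k + 1) for _ in range(W)]
        for j in range(W):
            T[j][1] = 2 ** (j + 1)           # a single stride of j+1 bits at the root
        for r in range(2, k + 1):
            for j in range(r - 1, W):
                for m in range(r - 2, j):    # m = last bit of the previous level
                    cost = T[m][r - 1] + nodes[m + 1] * 2 ** (j - m)
                    T[j][r] = min(T[j][r], cost)
        return T[W - 1][k]                   # memory for W-bit addresses, k levels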
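Leaf pushing, sketched on a toy multibit-trie node layout of my own (each entry
starts out holding both a next hop and a child pointer; after pushing, it holds
only one of the two):

    class Node:
        def __init__(self, stride):
            self.stride = stride
            self.nexthop = [None] * (1 << stride)   # best matching prefix so far
            self.child   = [None] * (1 << stride)   # pointer to the next trie level

    def leaf_push(node, inherited=None):
        for i in range(1 << node.stride):
            hop = node.nexthop[i] if node.nexthop[i] is not None else inherited
            if node.child[i] is not None:
                leaf_push(node.child[i], hop)       # push the hop down into the subtree
                node.nexthop[i] = None              # internal entry keeps only the pointer
            else:
                node.nexthop[i] = hop               # leaf entry keeps only the next hop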
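And the semi-perfect hashing idea as I understand it (the multiplicative hash
below is a stand-in, not the paper's; the point is only that we retry until
every bucket fits in one cache line, i.e. at most 6 entries):

    import random

    def find_semi_perfect_hash(keys, table_size, max_bucket=6, max_tries=10000):
        for _ in range(max_tries):
            a = random.randrange(1, 1 << 32) | 1           # candidate odd multiplier
            buckets = {}
            ok = True
            for key in keys:
                b = ((a * key) & 0xFFFFFFFF) % table_size  # hypothetical hash function
                buckets[b] = buckets.get(b, 0) + 1
                if buckets[b] > max_bucket:
                    ok = False
                    break
            if ok:
                return a            # no bucket overflows a cache line
        return None                 # give up; caller can grow the table and retry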
Miscellaneous points:
- Increasing the trie stride (# bits compared per READ) decreases the number of
READs; e.g., a 1-bit stride can require 32 READs for a 32-bit address. (A
multibit-trie lookup sketch appears after this list.)
- Lookup cost is often measured in DRAM accesses, since memory is the
bottleneck. The authors try to fit the data structure into cache and perform
cache alignment. Lots of interesting, known techniques touched upon.
- Insert/delete matters because IPMA reports that the number of routing
prefixes announced and withdrawn by BGP in a single day was ~1.8M -> 25/sec.
- Of the 1.8M above, 47k were route changes (not insertions/deletions), so
route changing is somewhat important. If you do the numbers, though, I don't
see what the big point is.
- Binary search on prefix lengths (previous work): keep a hash table of
prefixes for each length and binary-search over the set of lengths. Probe the
middle length: a hit may be a real prefix (record it) and/or a marker indicating
that a longer prefix may exist, which directs the search toward longer lengths;
a miss directs it toward shorter lengths. This was not explained well at all.
(Rough sketch below.)
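A rough reconstruction of the binary-search idea (my reading, not the paper's
code; the real scheme also stores the best matching prefix with each marker to
avoid wrong turns, which is omitted here). tables[L] is assumed to be a hash
table keyed by the first L bits of an address:

    from collections import namedtuple

    Entry = namedtuple("Entry", ["nexthop", "is_marker"])

    def lpm_binary_search(addr, tables, lengths, width=32):
        lo, hi, best = 0, len(lengths) - 1, None
        while lo <= hi:
            mid = (lo + hi) // 2
            L = lengths[mid]
            entry = tables[L].get(addr >> (width - L))
            if entry is None:
                hi = mid - 1                  # miss: only shorter prefixes can match
            else:
                if entry.nexthop is not None:
                    best = entry.nexthop      # a real prefix of length L matches
                if entry.is_marker:
                    lo = mid + 1              # a longer prefix may match: go longer
                else:
                    hi = mid - 1              # no marker: nothing longer to find
        return best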
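Finally, to make the earlier stride point concrete, a multibit-trie lookup
sketch (my own layout, not the paper's): one node per memory read, so 8-bit
strides cost at most 4 reads for a 32-bit address versus 32 reads for a 1-bit
trie.

    class TrieNode:
        def __init__(self, stride):
            self.stride = stride
            self.entries = [(None, None)] * (1 << stride)  # (next hop, child) per entry

    def lookup(root, addr, width=32):
        # assumes strides along any root-to-leaf path sum to at most `width`
        best, node, consumed = None, root, 0
        while node is not None and consumed < width:
            s = node.stride
            index = (addr >> (width - consumed - s)) & ((1 << s) - 1)
            hop, child = node.entries[index]
            if hop is not None:
                best = hop                    # remember the longest match seen so far
            node, consumed = child, consumed + s
        return best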
Criticisms.
Presentation needs work. Many key concepts are vague (binary search on prefix
lengths). A picture of a fixed-stride trie versus a variable-stride trie was
needed (it is not clear whether the stride varies along the horizontal or
vertical axis). Background work is not explained (Patricia tries).