DPF: Fast, Flexible Message Demultiplexing using
Dynamic Code Generation
Dawson R. Engler and M. Frans Kaashoek (MIT), 1996
Summary. Existing packet filters (CSPF, BPF, MPF, Pathfinder)
in the literature define a filter-language (predicates written in
a small, safe language) with which filters can be defined at
the application level, downloaded, and interpreted safely. Because of
the high cost of interpretation, however, the packet-filter model has
been precluded from use with high-performance networks in lieu of
hand-coded demultiplexors. The authors have designed a packet-filter
system that uses a declarative packet-filter language that is
aggressively optimized using dynamic code DCG allows
improves performance by 1) removing the interpretation overhead
and 2) leveraging information available at runtime. DPF
achieves the goal of flexibility without becoming a bottleneck in
network communication.
More detail.
Language. They present a language carefully designed to allow
aggressive dynamic compilation. Their language is declarative; it
allows apps to describe the packets they're looking for. A DPF
filter is composed of a sequence of boolean comparitors called
atoms linked by conjunctions ('ands'). An atom looks like
this:
# check ethernet header for IP datagram
(12:16 == 0x8) &&
# skip ethernet header
(SHIFT(6+6+2)) &&
# check IP header for TCP (type 6)
(9:8 == 6) &&
...
Using a declarative lang rather than an imperative one is key
to merging filters efficiently.
Filter merging.
A new filter is merged in the following way:
-
- Client app writes filter
- Filter preprocessed into low-level representation
-
- Filter parsed into atoms
- Syntax checked
- Semantics checked
-
- New filter merged into existing trie along path
with largest prefix match
- Disjunctive atoms (atoms which share expression
tree on one side of equality but different constant
on other side)
-
- Atoms not merged are connected to trie with an
'or' branch
- Trie branches sorted in descending order by ref count
- If new branch, unlinked code is generated
-
- If code was generated, system retraverses master
filter (trie, I'm assuming) copying code into
linear space
- Cleanup (jump backpatching and instruction cache flushed)
Optimizations.
There are two classes of optimizations: inter-filter (between filters)
and intra-filter (within a filter). In addition to dynamic code
generation, an intra-filter op, they implement all the inter-filter
optimizations described in MPF. Unlike statically compiled code,
dynamic code generation can utilize runtime information.
Their specific dynamic code gen ops:
- Runtime constants. Many constants are known--such as
ports and addresses--so these can be compiled in as immediates
rather than using indirection. Since many constants "are filter-
specific as opposed to protocol-specific (e.g. they are
known only after a connection is established), a statically coded"
demuxer cannot access them directly. The authors state that
the ability to leverage runtime knowledge of constants allows
DPF to be faster than hand-crafted demuxers.
- Fast disjuctions. If the number of disjunctives is small
(<= 3), the hash table is eliminated and the comparison is
done directly with branches. For >= 3 values, since constants
and number of constants known at codegen time, can select an optimal
hash function. If collisions not existent, can eliminate collisio
n
detection.
- Atom coalescing. If atom are adjacent, can coalesce into
a single comparison.
- Alignment estimation. Packet-filter engines treat every
memory op as potentially unaligned. DPF finds the lower
bound of the message pointer. (?) As a side-effect, they
eliminate constant SHIFT operations by adding the operand
to address offset in subsequent atoms.
- Bounds-check aggregation. At most, check once per
indirection. Offending packets "fail" filter and move on.
Fixed cost of starting the packet-filter engine is small: uses
small number of registers thus minimizing the number of saves
and restores whereas an interpreter has a startup high cost.
DPF is so fast that caching of filters not necessary/implemented.
Insertion overhead breaks down roughly to 40% codegen, 30%
merging of filter and 30% translation of filter rep into
DPF representation.
"Packet filters can be viewed as application code that is
downloaded into the kernel. Protocol knowledge is limited
to the application, while the protection checks to determine
packet owndership are couched in
a language understood by the kernel. Fault isolation of the
downloaded code is ensured by careful language design (to
bound run time) and runtime checks (to protect against wild
memory references and unsafe operations). "
On the use of user-level protocols:
Separating protection (demux) from authorization & mgt,
fast multip. possible while still maintaining app-level
flexibility.
One problem with filter: one app can lie and try to
get another app's packets. Security precautions such as only
allowing trusted server to install filters addresses this
problem.
Miscellaneous
- I liked the author's choice of words:
vetted, unfettered, elited, "venerable
tradition".