DPF: Fast, Flexible Message Demultiplexing using Dynamic Code Generation

Dawson R. Engler and M. Frans Kaashoek (MIT), 1996

Summary. Existing packet filters (CSPF, BPF, MPF, Pathfinder) in the literature define a filter-language (predicates written in a small, safe language) with which filters can be defined at the application level, downloaded, and interpreted safely. Because of the high cost of interpretation, however, the packet-filter model has been precluded from use with high-performance networks in lieu of hand-coded demultiplexors. The authors have designed a packet-filter system that uses a declarative packet-filter language that is aggressively optimized using dynamic code DCG allows improves performance by 1) removing the interpretation overhead and 2) leveraging information available at runtime. DPF achieves the goal of flexibility without becoming a bottleneck in network communication.

More detail.

Language. They present a language carefully designed to allow aggressive dynamic compilation. Their language is declarative; it allows apps to describe the packets they're looking for. A DPF filter is composed of a sequence of boolean comparitors called atoms linked by conjunctions ('ands'). An atom looks like this:
	# check ethernet header for IP datagram
	(12:16 == 0x8) &&
	# skip ethernet header
	(SHIFT(6+6+2)) &&
	# check IP header for TCP (type 6)
	(9:8 == 6) &&
	...
	
Using a declarative lang rather than an imperative one is key to merging filters efficiently.

Filter merging. A new filter is merged in the following way:

Optimizations. There are two classes of optimizations: inter-filter (between filters) and intra-filter (within a filter). In addition to dynamic code generation, an intra-filter op, they implement all the inter-filter optimizations described in MPF. Unlike statically compiled code, dynamic code generation can utilize runtime information. Their specific dynamic code gen ops:

Fixed cost of starting the packet-filter engine is small: uses small number of registers thus minimizing the number of saves and restores whereas an interpreter has a startup high cost.

DPF is so fast that caching of filters not necessary/implemented.

Insertion overhead breaks down roughly to 40% codegen, 30% merging of filter and 30% translation of filter rep into DPF representation.

"Packet filters can be viewed as application code that is downloaded into the kernel. Protocol knowledge is limited to the application, while the protection checks to determine packet owndership are couched in a language understood by the kernel. Fault isolation of the downloaded code is ensured by careful language design (to bound run time) and runtime checks (to protect against wild memory references and unsafe operations). "

On the use of user-level protocols: Separating protection (demux) from authorization & mgt, fast multip. possible while still maintaining app-level flexibility.

One problem with filter: one app can lie and try to get another app's packets. Security precautions such as only allowing trusted server to install filters addresses this problem.

Miscellaneous