End-to-End Packet Delay and Loss Behavior in the Internet

Jean-Chrysostome Bolot (INRIA), 1993

Summary. The author models an internet path as a single-server queue with two input streams (one for probe traffic and one for other internet traffic) and uses it to analyze packet delay and loss behavior in the internet. Measurements are taken using the UDP echo tool NetDyn. Results show probe packet compression and correlated losses when the send interval (delta) is small (<= ~20ms), and essentially random loss behavior when the probe traffic comprises less than 10% of the available bandwidth.

More Detail
Packet loss and delay behavior are important characteristics of packet-switched networks because they determine the service that applications such as audio and video can expect and they shape the design of error and flow control mechanisms.

Data generation. Used NetDyn. Studied a couple of different paths, but mainly INRIA to the University of Maryland. Sent out UDP "probe" packets at fixed intervals (delta). Each experiment lasted 10 minutes. Intervals ranged from 8ms to 500ms.
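
The actual tool was NetDyn; below is only a minimal Python sketch of the same measurement idea (sequence-numbered UDP probes sent every delta seconds to an echo server, recording RTTs). The host, port, interval, probe count, and sizes are illustrative assumptions, not values from the paper.

    import socket
    import struct
    import time

    ECHO_HOST = "echo.example.net"   # hypothetical echo server (placeholder)
    ECHO_PORT = 7                    # classic UDP echo service port
    DELTA = 0.050                    # send interval d in seconds (50 ms here)
    N_PROBES = 200                   # probes per run (the paper's runs lasted 10 min)
    PROBE_SIZE = 32                  # pad probes to a fixed size in bytes

    def run_probes():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sent = {}                    # sequence number -> send timestamp
        rtts = {}                    # sequence number -> round-trip time (s)
        for seq in range(N_PROBES):
            payload = struct.pack("!I", seq).ljust(PROBE_SIZE, b"\0")
            sent[seq] = time.monotonic()
            sock.sendto(payload, (ECHO_HOST, ECHO_PORT))
            deadline = sent[seq] + DELTA
            while True:              # collect echoes until the next send time
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                sock.settimeout(remaining)
                try:
                    data, _ = sock.recvfrom(2048)
                except socket.timeout:
                    break
                (echo_seq,) = struct.unpack("!I", data[:4])
                rtts[echo_seq] = time.monotonic() - sent[echo_seq]
        return rtts   # missing sequence numbers correspond to lost probes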

Suggested that time series analysis is often used in similar contexts to attack two problems (general model fitting and prediction), but that time series analysis would not leverage factors known to be present in this study (specifically, knowledge of the connection over which the probe packets are sent: number of hops, expected traffic mix, etc.). Therefore, the approach in this paper is to use the available information to interpret observations and suggest a specific model.

Analysis of Delay. The model is a single-server queue with two inputs: (1) probe traffic and (2) internet traffic.

They use phase plots, plotting (rtt[n], rtt[n+1]). Phase plots for small delta show the presence of probe packet compression, a phenomenon previously seen in protocols with two packet streams, such as TCP, where small packets (such as these probe packets or TCP ACKs) queue up behind larger packets. Larger-delta experiments do not show compression because the queue has an opportunity to empty between probes. Compressed packets lie along the line rtt[n+1] = rtt[n] + P/u - d, where P is the probe packet size, u is the service rate of the queue (so P/u is the service time of a probe packet), and d is delta.
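
A sketch of that computation, assuming rtts is a list of round-trip times indexed by sequence number with no losses; the values of P, u (MU below), delta, and the tolerance are illustrative assumptions, not taken from the paper.

    # Identify probe-compression points in a phase plot.
    P = 32 * 8           # probe packet size in bits (assumed)
    MU = 1.5e6           # assumed bottleneck service rate in bits/s
    DELTA = 0.008        # send interval d in seconds
    TOL = 0.5e-3         # tolerance around the compression line, in seconds

    def phase_points(rtts):
        # The (rtt[n], rtt[n+1]) pairs that make up a phase plot.
        return list(zip(rtts, rtts[1:]))

    def compressed_points(rtts):
        # Points lying near the line rtt[n+1] = rtt[n] + P/MU - DELTA,
        # i.e. probes that were queued directly behind the previous probe.
        offset = P / MU - DELTA
        return [(a, b) for (a, b) in phase_points(rtts)
                if abs((b - a) - offset) < TOL]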

Used Lindley's recurrence equation, where

         w[n] = waiting time of packet n
         y[n] = service time of packet n
         x[n] = interarrival time between packets n and n+1

         w[n+1] = max(0, w[n] + y[n] - x[n])
to derive the probability distribution of the internet traffic workload (see the paper for the derivation; it's not difficult to follow), which turns out to also be the distribution of the interarrival times of the probe packets at the *destination*. The distributions showed peaks for compressed packets, for packets where rtt[n+1] = rtt[n], and for the first packets to queue up behind one larger packet (they calculated the packet size and concluded these were FTP packets), behind two larger packets, etc.
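
A minimal simulation sketch of this two-input queue under assumed traffic parameters (the cross-traffic burst distribution, rates, and packet sizes below are ours, not the paper's), showing where the destination interarrival distribution comes from:

    import random

    MU = 1.5e6             # assumed service rate of the queue in bits/s
    DELTA = 0.020          # probe interarrival time x[n] at the queue (s)
    PROBE_BITS = 32 * 8    # assumed probe packet size in bits

    def simulate_waits(n_probes=10_000, seed=0):
        # Waiting times of successive probes in a single-server FIFO queue
        # shared with bursts of cross traffic, via Lindley's recurrence.
        rng = random.Random(seed)
        w, waits = 0.0, []
        for _ in range(n_probes):
            waits.append(w)
            # y[n]: probe n's service time plus the cross-traffic workload
            # (here an assumed random burst of 0-3 packets of 1500 bytes)
            # arriving between probes n and n+1.
            y = PROBE_BITS / MU + rng.randint(0, 3) * 1500 * 8 / MU
            w = max(0.0, w + y - DELTA)   # Lindley's recurrence
        return waits

    def dest_interarrivals(waits):
        # Interarrival times of the probes at the destination; their
        # histogram is the distribution whose peaks the paper interprets
        # (compression, rtt[n+1] = rtt[n], queueing behind 1, 2, ... packets).
        return [DELTA + w2 - w1 for w1, w2 in zip(waits, waits[1:])]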

Packet loss. They calculated three loss metrics: the unconditional loss probability ULP (the probability that a probe is lost), the conditional loss probability CLP (the probability that probe n+1 is lost given that probe n was lost), and the loss gap (the mean number of consecutively lost probes); see the sketch at the end of this discussion.

They found that ULP increased with a decrease in delta since "if a probe packet is lost at time t because of buffer overflow, then the next probe packet which arrives at time t+d will also be lost if d is less than the service time of the packet in service."

CLP is greater than ULP in all cases: given that probe n was lost, the buffer was full when probe n arrived, so probe n+1 is more likely to also find a full (or nearly full) buffer.

For large values of d, CLP and ULP are almost identical since the buffer states seen by two successive probes become less and less correlated as d increases. Their results show that probe losses are essentially random as long as the probes comprise less than 10% of the available link capacity.

They noted that ULP stabilizes around 10% as d increases. They conjectured that some of this might be due to faulty network interface cards (a previous study showed 3% losses due to such cards on one of the networks traversed in this study).

The loss gap stayed close to 1 even for small values of d. Implication: audio, and perhaps video (though video sources don't necessarily send at regular intervals), can successfully use open-loop error control schemes such as FEC.
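
Here is a sketch of the three metrics computed from a loss trace; `lost` is a list of booleans (True = probe lost), which could be derived from the missing sequence numbers in the measurement sketch above, and the helper names are our own:

    def ulp(lost):
        # Unconditional loss probability: P(probe n lost).
        return sum(lost) / len(lost)

    def clp(lost):
        # Conditional loss probability: P(probe n+1 lost | probe n lost).
        after_loss = [b for a, b in zip(lost, lost[1:]) if a]
        return sum(after_loss) / len(after_loss) if after_loss else 0.0

    def loss_gap(lost):
        # Mean number of consecutively lost probes per loss burst
        # (0.0 if the trace contains no losses).
        gaps, run = [], 0
        for x in lost:
            if x:
                run += 1
            elif run:
                gaps.append(run)
                run = 0
        if run:
            gaps.append(run)
        return sum(gaps) / len(gaps) if gaps else 0.0

Under these definitions, a loss gap near 1 means loss bursts rarely extend past a single packet, which is what makes simple open-loop schemes like FEC workable.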


Kristin Wright
Last modified: Fri Apr 21 17:15:11 MDT 2000