
Re: [Testbed-admins] Minibosses



    From testbed-admins-bounces@flux.utah.edu Tue May  5 14:38:14 2009

    > I would suggest talking to some of the guys from DETER about this - I
    > bet Keith would respond to a post on testbed-admins. They have done
    > something like this, since half their nodes are at ISI and half are at
    > Berkeley. But, I know embarrassingly little about what they've actually
    > done. I think that, at the very least, they are caching Frisbee images
    > on the Berkeley side (their boss is on the ISI side).


Keith has been distracted by some other deadlines this week;
I'll try to do the Reader's Digest version.

We have a two-headed hydra; ucb.deterlab.net is used for development and
is so close to being a firewalled emulab-in-emulab experiment within
isi.deterlab.net that it isn't worth further mention here.  It is not
a miniboss in the sense that Pat Gunn was proposing.

When Berkeley nodes are under ISI control, frisbee images are not cached,
but the MFSes are.

The boot sequence goes:

DHCP response from ISI (in the wide area! but by layer 2 tunnel)
tftp of pxeboot.emu from ISI (400 packets at a 10 millisecond round trip is 4 seconds)
bootinfo tells the client to load 192.168.3.252:/tftpboot/frisbee
    (which is on the Berkeley side).
    [loads os in a few seconds instead of 6 minutes]
everything else proceeds as in the usual emulab case.
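
The 4 second figure for the tftp step follows from tftp being lock-step:
each data packet waits for its acknowledgement, so every packet costs one
round trip.  A minimal sketch of that arithmetic (the function name is
made up for illustration, and the 512-byte block size is an assumption
based on the tftp default, not something measured on DETER):

```python
def lockstep_transfer_seconds(packets: int, rtt_seconds: float) -> float:
    """Time for a stop-and-wait transfer: one round trip per packet."""
    return packets * rtt_seconds


# Figures from the boot sequence above.
PACKETS = 400
RTT_SECONDS = 0.010

# Assumption: default tftp block size of 512 bytes per data packet.
ASSUMED_BLOCK_BYTES = 512

seconds = lockstep_transfer_seconds(PACKETS, RTT_SECONDS)
approx_bytes = PACKETS * ASSUMED_BLOCK_BYTES

print(f"tftp transfer time: about {seconds:.1f} s")
print(f"implied file size:  about {approx_bytes} bytes (if 512-byte blocks)")
```

The transfer time depends only on the packet count and the round trip,
not on the link bandwidth, which is why serving the larger files from
the Berkeley side matters.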

Multicast in the wide area is actually faster than transferring the images
by TCP, because TCP clients do not put bandwidth x delay worth of data in
their socket buffers (which would be on the order of a megabyte or so).
scp takes 8 to 15 minutes to transfer an OS image; frisbee does it
in slightly over a minute.
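
A rough sketch of the bandwidth x delay arithmetic, using the link
figures given later in this message (a gigabit link with 5.5 milliseconds
of delay in each direction).  The 64 KiB window below is purely an
assumed example of a small socket buffer, not a value measured on our
nodes, and the function names are made up for illustration:

```python
def bdp_bytes(bits_per_second: float, rtt_seconds: float) -> float:
    """Bandwidth x delay product: bytes in flight needed to fill the pipe."""
    return bits_per_second * rtt_seconds / 8


def window_limited_bytes_per_second(window_bytes: int, rtt_seconds: float) -> float:
    """Best-case TCP rate when at most one window is sent per round trip."""
    return window_bytes / rtt_seconds


LINK_BPS = 1e9            # gigabit link between ISI and Berkeley
RTT_SECONDS = 0.011       # 5.5 ms in each direction
ASSUMED_WINDOW = 64 * 1024  # assumption: a 64 KiB socket buffer

bdp = bdp_bytes(LINK_BPS, RTT_SECONDS)
rate = window_limited_bytes_per_second(ASSUMED_WINDOW, RTT_SECONDS)

print(f"bandwidth x delay: about {bdp / 1e6:.2f} MB")
print(f"rate with assumed window: about {rate * 8 / 1e6:.0f} Mbit/s")
```

With a window that far below the bandwidth x delay product, a single TCP
stream can use only a small fraction of the link no matter how clean the
link is.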

Reliability of swap-ins and functioning of experiments as a whole took
a quantum jump when we were able to use just slightly larger than
standard ethernet size packets (via jumbograms containing unfragmented
layer-2 tunneled packets).
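
To illustrate why the slightly larger packets matter, here is a sketch
of the fragment count for a tunneled frame.  The 30-byte tunnel header is
a hypothetical number chosen for illustration (the real overhead depends
on the tunnel in use), as are the function name and the 9000-byte jumbo
MTU:

```python
import math

IP_HEADER = 20  # bytes, IPv4 header without options


def fragments_needed(inner_frame: int, tunnel_header: int, path_mtu: int) -> int:
    """Number of IPv4 packets needed to carry one tunneled Ethernet frame."""
    payload = inner_frame + tunnel_header
    if IP_HEADER + payload <= path_mtu:
        return 1  # fits in a single packet, no fragmentation
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (path_mtu - IP_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


FULL_FRAME = 1514            # 1500-byte payload plus 14-byte Ethernet header
ASSUMED_TUNNEL_HEADER = 30   # hypothetical encapsulation overhead

print(fragments_needed(FULL_FRAME, ASSUMED_TUNNEL_HEADER, 1500))
print(fragments_needed(FULL_FRAME, ASSUMED_TUNNEL_HEADER, 9000))
```

With a standard 1500-byte path MTU every full-size frame turns into two
fragments, and losing either one loses the whole frame; with a larger
MTU the tunneled frame travels as a single packet.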

The link between ISI and Berkeley is only a gigabit, but there's
very little loss on it.  For all practical purposes it looks like
a wire that's 400 miles long with a 5.5 millisecond delay in each direction.

It would be worth investigating whether we could build a layer 2
tunnel with a reliable transport that was fast enough to run
at line speed.  That would allow passing of Ethernet frames without
IP fragmentation and reassembly, which really proved problematic for us.

Given such an implementation, we might be able to eat our own dog food
and run a homenet implementation within an emulab-in-emulab, seeing
whether frisbee works as well over a bandwidth-limited but otherwise
reliable layer 2 tunnel as it does in the DETER case.

Keith Sklower