EmuStore: Large Scale Disk Image Storage and Deployment in the Emulab Network Testbed
Master's Thesis, University of Utah. August 2014.
The Emulab network testbed deploys and installs disk images on client nodes upon request. A disk image is a custom representation of a filesystem, typically corresponding to an operating system configuration. Making a large variety of disk images available to Emulab users encourages heterogeneous experimentation, but storing them requires a significant amount of disk space. Data deduplication has the potential to dramatically reduce that space: because most disk images in Emulab are derived by customizing a few golden disk images, there is substantial data redundancy within and among these disk images.
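To illustrate why images derived from a common golden image deduplicate well, here is a minimal sketch of fixed-size chunk deduplication. It is not EmuStore's actual mechanism, which this abstract does not describe. The chunk size, the use of SHA-256, and the names `store_image` and `load_image` are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (assumption)


def store_image(image: bytes, store: dict) -> list:
    """Split an image into fixed-size chunks and keep each unique chunk once.

    Returns the image's recipe: the ordered list of chunk digests
    needed to rebuild it from the shared chunk store.
    """
    recipe = []
    for offset in range(0, len(image), CHUNK_SIZE):
        chunk = image[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only previously unseen chunks use space
        recipe.append(digest)
    return recipe


def load_image(recipe: list, store: dict) -> bytes:
    """Rebuild an image by concatenating its chunks in recipe order."""
    return b"".join(store[digest] for digest in recipe)


# A toy "golden" image of 8 distinct chunks, and a derived image that
# customizes exactly one of those chunks.
golden = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))
derived = golden[:3 * CHUNK_SIZE] + b"\xff" * CHUNK_SIZE + golden[4 * CHUNK_SIZE:]

store = {}
golden_recipe = store_image(golden, store)
derived_recipe = store_image(derived, store)

raw_bytes = len(golden) + len(derived)
stored_bytes = sum(len(chunk) for chunk in store.values())
```

In this toy case the two images share seven of their eight chunks, so the store holds nine unique chunks rather than sixteen. Real disk images would not align this neatly, and the achievable savings depend on the chunking strategy and on how much the derived images diverge from their golden images.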
This work proposes a method of storing disk images in Emulab with three primary goals: maximal storage utilization, minimal impact on performance, and nonintrusiveness. We propose to design, implement, and evaluate EmuStore, a system built on top of a data deduplication infrastructure for efficiently storing disk images in the Emulab network testbed. The goal of this system is to take advantage of duplicate data when storing disk images while remaining unobtrusive to other Emulab components.