Operational Experiences with Disk Imaging in a Multi-Tenant Datacenter

Kevin Atkinson, Gary Wong, and Robert Ricci

Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2014.

Testbeds, Storage


Disk images play a critical role in multi-tenant datacenters. In this paper, the first study of its kind, we analyze operational data from the disk imaging system that is part of the Emulab facility's infrastructure. The dataset spans four years and covers more than a quarter-million disk image loads requested by Emulab's users. From our analysis, we draw observations about the nature of the images themselves (for example, how similar are they to each other?) and about usage patterns (for example, what is the statistical distribution of image popularity?). Many of these observations have implications for the design and operation of disk imaging systems, including how images are stored, how caching is employed, the effectiveness of pre-loading, and strategies for network distribution.