Flux Research Group / School of Computing

ATOM: Efficient Tracking, Monitoring, and Orchestration of Cloud Resources

Min Du and Feifei Li

IEEE Transactions on Parallel and Distributed Systems 28(8), August 2017.

DOI: 10.1109/TPDS.2017.2652467



The emergence of Infrastructure as a Service framework brings new opportunities, which also accompanies with new challenges in auto scaling, resource allocation, and security. A fundamental challenge underpinning these problems is the continuous tracking and monitoring of resource usage in the system. In this paper, we present ATOM, an efficient and effective framework to automatically track, monitor, and orchestrate resource usage in an Infrastructure as a Service (IaaS) system that is widely used in cloud infrastructure. We use novel tracking method to continuously track important system usage metrics with low overhead, and develop a Principal Component Analysis (PCA) based approach to continuously monitor and automatically find anomalies based on the approximated tracking results. We show how to dynamically set the tracking threshold based on the detection results, and further, how to adjust tracking algorithm to ensure its optimality under dynamic workloads. Lastly, when potential anomalies are identified, we use introspection tools to perform memory forensics on VMs guided by analyzed results from tracking and monitoring to identify malicious behavior inside a VM. We demonstrate the extensibility of ATOM through virtual machine (VM) clustering. The performance of our framework is evaluated in an open source IaaS system.