Providing Dependability and Resilience in the Cloud: Challenges and Opportunities

Samuel Kounev, Philipp Reinecke, Fabian Brosig, Jeremy T. Bradley, Kaustubh Joshi, Vlastimil Babka, Stephen T. Gilmore, Anton Stefanek

Book Chapter
Resilience Assessment and Evaluation of Computing Systems
November 2012
Springer Verlag
ISBN 978-3-642-29031-2
DOI 10.1007/978-3-642-29032-9_4

Cloud Computing is a novel paradigm for providing data center resources as on-demand services in a pay-as-you-go manner. It promises significant cost savings by making it possible to consolidate workloads and share infrastructure resources among multiple applications, resulting in higher cost and energy efficiency. However, these benefits come at the cost of increased system complexity and dynamicity, posing new challenges in providing service dependability and resilience for applications running in a Cloud environment. At the same time, the virtualization of physical resources, inherent in Cloud Computing, provides new opportunities for novel dependability and quality-of-service management techniques that can potentially improve system resilience. In this chapter, we first discuss in detail the challenges and opportunities introduced by the Cloud Computing paradigm. We then provide a review of the state of the art in dependability and resilience management in Cloud environments, and conclude with an overview of emerging research directions.

