Curator: Self-Managing Storage for Enterprise Clusters

Ignacio Cano; Srinivas Aiyar; Varun Arora; Manosiz Bhattacharyya; Akhilesh Chaganti; Chern Cheah; Brent Chun; Karan Gupta; Vinayak Khot; Arvind Krishnamurthy

NSDI 2017

Abstract

Modern cluster storage systems perform a variety of background tasks to improve the performance, availability, durability, and cost-efficiency of stored data. For example, cleaners compact fragmented data to generate long sequential runs, tiering services automatically migrate data between solid-state and hard disk drives based on usage, recovery mechanisms replicate data to improve availability and durability in the face of failures, and cost-saving techniques perform data transformations to reduce storage costs. In this work, we present Curator, a background MapReduce-style execution framework for cluster management tasks, in the context of a distributed storage system used in enterprise clusters. We describe Curator’s design and implementation, and evaluate its performance using a handful of relevant metrics. We further report experiences and lessons learned from its five-year construction period, as well as from thousands of customer deployments. Finally, we propose a machine learning-based model to identify an efficient execution policy for Curator’s management tasks that can adapt to varying workload characteristics.
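
To make the idea of a MapReduce-style background task concrete, the following is a minimal sketch of how a usage-based tiering pass could be phrased as a map phase over storage metadata followed by a reduce phase that groups the resulting actions. It is an illustration only, not Curator's actual implementation: the `ExtentMeta` record, the `COLD_AFTER_S` threshold, the action names, and all function names are hypothetical and do not come from the abstract.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class ExtentMeta(NamedTuple):
    """Hypothetical metadata record for one unit of stored data."""
    extent_id: str
    tier: str           # "ssd" or "hdd"
    last_access_s: int  # last access time, in seconds


# Hypothetical policy knob: data idle for longer than this counts as cold.
COLD_AFTER_S = 3600


def map_extent(meta: ExtentMeta, now_s: int) -> Iterator[Tuple[str, str]]:
    """Map phase: emit (action, extent_id) pairs for one metadata record."""
    idle_s = now_s - meta.last_access_s
    if meta.tier == "ssd" and idle_s > COLD_AFTER_S:
        yield ("demote_to_hdd", meta.extent_id)
    elif meta.tier == "hdd" and idle_s <= COLD_AFTER_S:
        yield ("promote_to_ssd", meta.extent_id)


def reduce_actions(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Reduce phase: group extent ids by action into background task lists."""
    tasks: Dict[str, List[str]] = defaultdict(list)
    for action, extent_id in pairs:
        tasks[action].append(extent_id)
    return {action: sorted(ids) for action, ids in tasks.items()}


def plan_tiering(metadata: Iterable[ExtentMeta], now_s: int) -> Dict[str, List[str]]:
    """Run the map phase over all records, then reduce into a task plan."""
    pairs = (pair for meta in metadata for pair in map_extent(meta, now_s))
    return reduce_actions(pairs)


metadata = [
    ExtentMeta("e1", "ssd", 100),    # cold data on SSD
    ExtentMeta("e2", "ssd", 9_900),  # hot data on SSD
    ExtentMeta("e3", "hdd", 9_950),  # hot data on HDD
    ExtentMeta("e4", "hdd", 50),     # cold data on HDD
]
plan = plan_tiering(metadata, now_s=10_000)
print(plan)
```

In this sketch the plan only lists which data units would move. A real system would also need to schedule and throttle the resulting work so that background tasks do not interfere with foreground I/O, which is the kind of execution-policy question the abstract raises.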

Topics

Machine Learning > Optimization & Theory > Optimization
Machine Learning > Application Areas > Efficient Computing

Keywords

machine learning model, distributed storage system, background task management, execution policy optimization, data tiering
