by Shahnawaz Saifi on Tuesday, 19 January 2016

Status: Submitted
Full talk



Modeling a distributed system as a state machine with constraints on states and transitions has the following benefits:

  • Separates cluster management from the core functionality of the system.
  • Allows a quick transformation from a single node system to an operable, distributed system.
  • Increases simplicity: system components do not have to manage a global cluster. This division of labor makes it easier to build, debug, and maintain your system.
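
To make "a state machine with constraints on states and transitions" concrete, here is a minimal sketch in plain Python. It is not the Helix API: the state names, the transition table, the per-state limits, and the apply_transition helper are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical state model for one partition (illustration only, not Helix code).
# Constraint on transitions: only these (from, to) pairs are legal.
TRANSITIONS = {
    ("OFFLINE", "SLAVE"),
    ("SLAVE", "MASTER"),
    ("MASTER", "SLAVE"),
    ("SLAVE", "OFFLINE"),
}

# Constraint on states: assumed upper bound on replicas per state.
STATE_LIMITS = {"MASTER": 1, "SLAVE": 2}


def apply_transition(replicas, node, target):
    """Return a new node -> state map, or raise ValueError if a constraint is violated."""
    current = replicas[node]
    if (current, target) not in TRANSITIONS:
        raise ValueError(f"illegal transition {current} -> {target}")

    # Build the candidate result without mutating the input map.
    updated = {**replicas, node: target}
    limit = STATE_LIMITS.get(target)
    if limit is not None and Counter(updated.values())[target] > limit:
        raise ValueError(f"too many replicas in state {target}")
    return updated
```

With replicas {"n1": "MASTER", "n2": "SLAVE", "n3": "OFFLINE"}, moving n3 to SLAVE is accepted, while promoting n2 to MASTER is rejected because the partition would then have two masters. The node code never has to check those rules itself, which is the separation of concerns the list above describes.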

In this talk, Shahnawaz will introduce Apache Helix, explain its core concepts, and show how those concepts work together in practice.


Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated, and distributed resources hosted on a cluster of nodes. Helix automates reassignment of resources in the face of node failure and recovery, cluster expansion, and reconfiguration. Its main features are:

  • Automatic assignment of resources and partitions to nodes
  • Node failure detection and recovery
  • Dynamic addition of resources
  • Dynamic addition of nodes to the cluster
  • Pluggable distributed state machine to manage the state of a resource via state transitions
  • Automatic load balancing and throttling of transitions
  • Optional pluggable rebalancing for user-defined assignment of resources and partitions
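
The first two features, automatic assignment and recovery from node failure, can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not how Helix computes placements: the round-robin policy, the least-loaded tie-break, and the function names assign and reassign_on_failure are all hypothetical.

```python
def assign(partitions, nodes):
    """Spread partitions over live nodes round-robin; returns partition -> node."""
    if not nodes:
        raise ValueError("no live nodes")
    ordered = sorted(nodes)
    return {p: ordered[i % len(ordered)] for i, p in enumerate(sorted(partitions))}


def reassign_on_failure(assignment, failed, nodes):
    """Move only the failed node's partitions, each to the least-loaded survivor."""
    survivors = sorted(n for n in nodes if n != failed)
    if not survivors:
        raise ValueError("no surviving nodes")

    # Count how many partitions each surviving node already hosts.
    load = {n: 0 for n in survivors}
    for node in assignment.values():
        if node in load:
            load[node] += 1

    result = dict(assignment)
    for partition in sorted(assignment):
        if assignment[partition] == failed:
            # Ties are broken by node name so the outcome is deterministic.
            target = min(survivors, key=lambda n: (load[n], n))
            result[partition] = target
            load[target] += 1
    return result
```

Partitions that were on healthy nodes stay where they are; only the failed node's partitions move. Limiting movement this way is a design choice of the sketch, made because every move implies state transitions that the cluster has to carry out.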


Basic knowledge of distributed systems.

Speaker bio

Shahnawaz is part of Site Reliability Engineering - Distributed Data Systems at LinkedIn. He has 7+ years of experience working in large-scale environments. Before LinkedIn, he was with Clickable and Guavus.