Cosmos - Centralized Monitoring Service @ Flipkart
Submitted by Anand Karthik (@anandkarthikt) on Saturday, 28 March 2015
Gain insights into building large-scale monitoring infrastructure
Measurement precedes awareness, control and improvement.
This talk is about how we measure system and application metrics across tens of thousands of servers at Flipkart, with a focus on handling scale, ensuring ease of use for developers, failure detection and healing, the power of conventions, alerting, and the monitoring tech stack in use at Flipkart.
I am Anand Karthik, part of the Infrastructure Engineering team at Flipkart. I have previously worked with the Supply Chain Payments and Logistics team.
Our team built Cosmos, the monitoring infrastructure that is used extensively at Flipkart, including during events like the Big Billion Day.