by Laxmi Nagarajan (@laxmi777) on Tuesday, 14 February 2017

Status: Confirmed & Scheduled
Section
Crisp talk of 15 mins duration

Technical level
Beginner

Abstract

Understanding the capacity limits of an application is critical to ensuring that SLAs are consistently met.
This how-to talk breaks down the process of capacity planning into three steps that leverage standard, simple tools. It also touches upon how the lessons from capacity planning can be channelled into the setup of AWS autoscaling policies.

Outline

Capacity planning involves three main steps:
a) Coming up with the load pattern for a single host: while it is useful to benchmark key APIs individually and to catch KPI degradations release over release, capacity predictions are more accurate when based on production traffic patterns. New Relic dashboards provide a clear, real-time window into the most heavily used APIs, and this data, combined with Splunk filters, yields the peak incoming request count for each API. Dividing that peak by the total AWS instance count gives the production load per AWS instance, which the performance load scripts then simulate.
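The per-instance load calculation above can be sketched in a few lines. The API names, peak request counts and instance count below are purely illustrative stand-ins for numbers you would pull from New Relic dashboards and Splunk filters:

```python
# Sketch: derive per-instance load from production peaks.
# All APIs and figures below are invented for illustration.

peak_requests_per_min = {          # peak incoming request count per API,
    "GET /accounts": 12000,        # e.g. from New Relic / Splunk filters
    "POST /transactions": 4500,
    "GET /reports": 1500,
}

instance_count = 6                 # total AWS instances serving this traffic

# Production load a single host must sustain, per API.
per_instance_load = {
    api: count / instance_count
    for api, count in peak_requests_per_min.items()
}

for api, rpm in per_instance_load.items():
    print(f"{api}: {rpm:.0f} req/min per instance")
```

These per-API, per-instance rates become the baseline load pattern that the load-test scripts reproduce against a single performance-environment host.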
b) Preparing the load-testing scripts and running the tests in the performance environment: JMeter is the tool of choice for creating and executing the load-test scripts. For the predictions to be reliable, the tests must run in a (scaled-down) performance environment whose server size matches that of the production boxes, and the tests must run from the same subnet. Care must be taken to ensure that dependent downstream environments are also performance environments, and any caching optimisations must be identified and called out. Load tests should start at the current load and scale up incrementally to 5X-10X of the current load.
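The incremental ramp from current load up to 5X can be sketched as a step schedule, which a JMeter test plan would then encode (for example via thread-group ramp settings). The base load, multiplier and step duration here are assumed values, not from the talk:

```python
# Sketch: build an incremental load ramp from 1X up to 5X of current load.
# base_rpm and step_minutes are illustrative assumptions.

base_rpm = 2000          # assumed current peak load per instance (req/min)
max_multiplier = 5       # scale up to 5X of current load
step_minutes = 10        # hold each step long enough for KPIs to stabilise

ramp = [
    {"step": m, "target_rpm": base_rpm * m, "hold_min": step_minutes}
    for m in range(1, max_multiplier + 1)
]

for s in ramp:
    print(f"step {s['step']}: {s['target_rpm']} req/min for {s['hold_min']} min")
```

Holding each step for a fixed window, rather than ramping continuously, makes it easier to attribute KPI readings (TP90, CPU, memory) to a specific load level.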
c) Analysing and extrapolating the results to determine capacity and autoscaling policies: the KPIs for analysis are client- and server-side response times, TP90, CPU and memory consumption, and Apdex scores. This KPI data identifies the load at which the application's SLAs are met and can be extrapolated to determine the loads that can be optimally processed in production. In addition, if peak-traffic analysis shows a recurring, predictable spike in usage within a time window, autoscaling policies can be configured in AWS to provision instances on demand and so optimise operating costs.
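Finding the highest tested load at which the SLA still holds, and sizing the fleet from it, can be sketched as a scan over the measured KPIs. The SLA threshold, TP90 measurements and expected production peak below are invented for illustration:

```python
# Sketch: pick the highest tested load whose measured TP90 meets the SLA,
# then size the fleet for an expected production peak. All numbers invented.
import math

sla_tp90_ms = 500        # assumed SLA: TP90 response time under 500 ms

# (load in req/min, measured TP90 in ms) from successive load-test steps
results = [
    (2000, 180),
    (4000, 240),
    (6000, 390),
    (8000, 620),   # SLA breached here
    (10000, 900),
]

capacity = max(
    (load for load, tp90 in results if tp90 <= sla_tp90_ms),
    default=0,
)
print(f"Max load meeting SLA per instance: {capacity} req/min")

# Given an expected production peak, the required instance count follows:
expected_peak_rpm = 30000
instances_needed = math.ceil(expected_peak_rpm / capacity)
print(f"Instances needed at peak: {instances_needed}")
```

The same per-instance capacity figure feeds the autoscaling setup: a known recurring spike window can be covered by a scheduled scaling action sized with this arithmetic, instead of overprovisioning around the clock.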

Speaker bio

Laxmi Nagarajan is a Staff Software Engineer in Quality at Intuit, Inc. She has helped drive quality upstream in the development cycle for SaaS applications at Adobe, PayPal and Bay Area startups, and more recently at Intuit IDC.

Comments

  • Zainab Bawa (@zainabbawa) Reviewer, 6 months ago

    Thanks for this proposal, Laxmi. To complete the review, we need draft slides and a link to a self-recorded video explaining what this talk is about and why the audience should attend it. Please share this information no later than Wednesday, 22 Feb.

  • Laxmi Nagarajan (@laxmi777) Proposer, 6 months ago

    Thank you for the follow-up, Zainab. I will upload draft slides and a link to a video by 22 Feb at the latest.

  • saurabh hirani (@saurabh-hirani), 5 months ago

    Can you please open up the slides for general public access? None of the Rootconf proposal slides should be closed, as open slides help the audience understand your thought process.

    • Laxmi Nagarajan (@laxmi777) Proposer, 5 months ago

      The slides are set so that anyone with the link can view them. Could you retry, please? Thank you.

      • saurabh hirani (@saurabh-hirani), 5 months ago

        Thanks. I am able to access it.
