Rootconf 2018

On scaling infrastructure and operations

Production Report - Using Apache Flink as a microservice for stateful asynchronous processing

Submitted by Jagadish Bihani on Saturday, 3 March 2018

Technical level

Advanced

Section

Crisp Talk

Status

Submitted

Total votes:  +18

Abstract

This talk explains why we chose Apache Flink as a microservice for stateful asynchronous event processing, the challenges we faced in production and how we solved them, and our recommendations for productionizing applications built on Apache Flink.

Key takeaways:
- Architecture pattern of using Flink, or a similar platform, as a microservice for stateful async event processing
- In-depth understanding of Flink fault tolerance concepts
- Production issues and challenges we faced, with insights on how to solve (and prevent) them

A basic understanding of stream processing will be an advantage.

Outline

  • Brief summary of what Flink is and its important terminology
  • Flink as a microservice for asynchronous stateful event stream processing
    • Challenges in doing it in a conventional way
  • Prerequisite concepts
    • Fault tolerance and checkpointing
    • Scalable partitioned state
    • State Backend - RocksDB
    • Asynchronous checkpointing details
  • Production Experiences
    • Flink TaskManager failover time tuning
    • Failure detection mechanism
    • Tuning Akka DeathWatch
    • How state leaks happen and how to prevent and monitor them
    • How to clear old state (the result of a state leak) from a running system without downtime
    • How state size and checkpointing can cause processing delays and how to tune it
  • Recommendations & Summary

Speaker bio

Software architect at Helpshift. Have worked on stream processing, various backend architectures, and end-to-end data pipelines. Have a good understanding of the systems side of software as well. More details can be found at https://www.linkedin.com/in/jagadish-bihani-1335a04a/

Slides

http://slides.com/jagadishbihani/apache-flink-production-report/fullscreen

Preview video

https://youtu.be/lmUq9fBeJVs
