by BASAVAIAH THAMBARA (@basavaiaht) on Saturday, January 30, 2016

Status: Submitted
Section
Full talk

Technical level
Intermediate

Objective

This talk focuses on how we do data ingestion from MySQL to Hadoop in an incremental fashion to keep the data on HDFS more up to date.

Description

Utilizing a big data processing platform like Hadoop is crucial for any business building good analytical dashboards that provide business insights. Dumping the full database, converting it to Avro, and loading it into HDFS consumes a significant amount of database resources and cannot be done as frequently as needed, which leads to stale data in HDFS. In this talk we give details of an incremental design framework to ingest data from MySQL into Hadoop, which involves capturing change data from the MySQL database and processing the delta capture to finally merge it with the full data set in HDFS.
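The final merge step described above can be sketched in miniature: combine a full snapshot with a batch of change-data-capture rows, keyed by primary key, keeping the most recent version of each row. This is a hedged illustration only; the record shapes, the `ts` version column, and the `merge` helper are hypothetical and not the speaker's actual framework.

```python
# Illustrative sketch of a delta merge (not the talk's actual implementation).
# Each row is a dict keyed by primary key "id", with a version timestamp "ts".

full = {
    1: {"id": 1, "name": "alice", "ts": 100},
    2: {"id": 2, "name": "bob",   "ts": 100},
}

delta = [
    {"id": 2, "name": "bobby", "ts": 200},  # update to an existing row
    {"id": 3, "name": "carol", "ts": 200},  # newly inserted row
]

def merge(full, delta):
    """Apply CDC delta rows over a full snapshot, newest version wins."""
    merged = dict(full)
    for row in delta:
        existing = merged.get(row["id"])
        # Take the delta row only when it is newer than the snapshot row.
        if existing is None or row["ts"] > existing["ts"]:
            merged[row["id"]] = row
    return merged

merged = merge(full, delta)
```

At scale the same newest-version-wins logic would run as a distributed job over the full data set in HDFS rather than over in-memory dicts.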

Requirements

Basics of MySQL replication and SQL are required to understand the technical details of the design.

Speaker bio

Staff Database Engineer at LinkedIn, responsible for maintaining MySQL and Oracle databases.