by BASAVAIAH THAMBARA (@basavaiaht) on Saturday, 30 January 2016
- Full talk
- Technical level
This talk focuses on how we ingest data from MySQL into Hadoop incrementally, so that the data on HDFS stays more up to date.
Utilizing a big data processing platform like Hadoop is crucial for any business that wants to build good analytical dashboards and provide business insights. Dumping the full database, converting it to Avro, and loading it into HDFS consumes a significant amount of database resources and cannot be done as frequently as we need, which leads to stale data in HDFS. In this talk we give details on the incremental design framework for ingesting data from MySQL into Hadoop, which involves capturing change data from the MySQL database and processing the captured delta to finally merge it with the full data set in HDFS.
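As a rough illustration of the final merge step, here is a minimal Python sketch. It is not the framework presented in the talk. It assumes each captured change carries an ordering position, an operation type, the primary key, and the full row image after the change. The names `Change`, `compact`, and `merge` are hypothetical, and plain in-memory dictionaries stand in for the data sets that would actually live in HDFS.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Change(NamedTuple):
    """One captured change event (hypothetical shape, for illustration only)."""
    seq: int               # position in the change stream, used for ordering
    op: str                # "insert", "update", or "delete"
    key: int               # primary key of the affected row
    row: Optional[dict]    # full row image after the change; None for deletes


def compact(changes: Iterable[Change]) -> Dict[int, Change]:
    """Keep only the most recent change for each primary key."""
    latest: Dict[int, Change] = {}
    for change in sorted(changes, key=lambda c: c.seq):
        latest[change.key] = change  # later changes overwrite earlier ones
    return latest


def merge(base: Dict[int, dict], changes: Iterable[Change]) -> Dict[int, dict]:
    """Apply the compacted delta to a full snapshot keyed by primary key."""
    merged = dict(base)  # leave the original snapshot untouched
    for key, change in compact(changes).items():
        if change.op == "delete":
            merged.pop(key, None)
        else:
            merged[key] = change.row
    return merged


if __name__ == "__main__":
    snapshot = {1: {"name": "alice"}, 2: {"name": "bob"}}
    delta = [
        Change(seq=10, op="update", key=1, row={"name": "alice2"}),
        Change(seq=11, op="delete", key=2, row=None),
        Change(seq=12, op="insert", key=3, row={"name": "carol"}),
    ]
    print(merge(snapshot, delta))
```

Compacting the delta first means only the latest state of each row is applied, so the cost of the merge depends on how many rows changed rather than on how many individual changes were captured.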
A basic understanding of MySQL replication and SQL is required to follow the technical details of the design.
Staff Database Engineer at LinkedIn, responsible for maintaining MySQL and Oracle databases.