ETL and Data Pipelines with Shell, Airflow and Kafka
After taking this course, you will be able to describe two different approaches to converting raw data into analytics-ready data. One approach is the Extract, Transform, Load (ETL) process. The other contrasting approach is the Extract, Load, and Transform (ELT) process. ETL processes apply to data warehouses and data marts. ELT processes apply to data lakes, where the data is transformed on demand by the requesting/calling application. Both ETL and ELT extract data from source systems, move the data through the data pipeline, and store the data in destination systems. During this course, you will experience how ELT and ETL processing differ and identify use cases for both.
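To make the contrast concrete, here is a minimal sketch in plain Python; the function and variable names are illustrative and not taken from the course. In ETL the transformation happens before the load into the warehouse, while in ELT the raw data is loaded first and transformed on demand at read time.

```python
# Illustrative sketch of the ETL vs. ELT ordering.
# All names (extract, transform, warehouse, data_lake) are hypothetical.

def extract():
    """Pull raw records from a source system."""
    return [{"id": 1, "amount": "42.50"}, {"id": 2, "amount": "7.00"}]

def transform(records):
    """Make raw records analytics-ready (e.g., cast amounts to numbers)."""
    return [{**r, "amount": float(r["amount"])} for r in records]

def load(records, destination):
    """Store records in the destination system."""
    destination.extend(records)

# ETL: transform first, then load into a warehouse or data mart.
warehouse = []
load(transform(extract()), warehouse)

# ELT: load the raw data into a lake; transform when an application asks for it.
data_lake = []
load(extract(), data_lake)
analytics_view = transform(data_lake)  # transformation happens at read time
```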
You will identify methods and tools used for extracting the data, merging extracted data either logically or physically, and importing data into data repositories. You will also define transformations to apply to source data to make the data credible, contextual, and accessible to data users. You will be able to outline some of the many methods for loading data into the destination system, verifying data quality, monitoring load failures, and using recovery mechanisms in case of failure.
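As a small illustration of those loading concerns, the following sketch (again with hypothetical names, not course code) verifies a basic data-quality rule, logs load failures, and retries the load as a simple recovery mechanism.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("load")

def verify_quality(records):
    """Simple data-quality check: every record must have a non-null id."""
    bad = [r for r in records if r.get("id") is None]
    if bad:
        raise ValueError(f"{len(bad)} records failed the quality check")

def load_with_recovery(records, destination, retries=3):
    """Load records, monitoring for failures and retrying as a recovery step."""
    verify_quality(records)
    for attempt in range(1, retries + 1):
        try:
            destination.extend(records)   # stand-in for a real load step
            log.info("Loaded %d records on attempt %d", len(records), attempt)
            return
        except Exception as exc:          # in practice, catch specific errors
            log.warning("Load attempt %d failed: %s", attempt, exc)
    raise RuntimeError("Load failed after all retries")

target = []
load_with_recovery([{"id": 1}, {"id": 2}], target)
```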
Finally, you will complete a shareable final project that enables you to demonstrate the skills you acquired in each module.
Describe and contrast Extract, Transform, Load (ETL) processes and Extract, Load, Transform (ELT) processes.
Explain batch vs concurrent modes of execution.
Implement an ETL pipeline through shell scripting (see the sketch after these objectives).
Describe data pipeline components, processes, tools, and technologies.
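To preview how the shell-scripting and pipeline-tooling objectives fit together, here is a minimal sketch of an Airflow DAG that runs shell-scripted ETL steps in order. It assumes Airflow 2.x, and the script names (extract.sh, transform.sh, load.sh) are hypothetical placeholders rather than course materials.

```python
# Minimal Airflow 2.x DAG sketch: shell-scripted ETL steps run in sequence.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="shell_etl_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",   # run the pipeline once a day
    catchup=False,
) as dag:
    extract = BashOperator(task_id="extract", bash_command="bash extract.sh")
    transform = BashOperator(task_id="transform", bash_command="bash transform.sh")
    load = BashOperator(task_id="load", bash_command="bash load.sh")

    # Run the stages sequentially: extract, then transform, then load.
    extract >> transform >> load
```

BashOperator is used here only to show how a workflow tool can chain shell steps into a pipeline; the course's own labs may use different operators and scripts.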
Syllabus - What you will learn from this course
Week 1: Data Processing Techniques
Week 2: ETL & Data Pipelines: Tools and Techniques
Week 3: Building Data Pipelines using Airflow
Week 4: Building Streaming Pipelines using Kafka
Week 5: Final Assignment
FAQ
When will I have access to the lectures and assignments?
Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:
The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.
The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Certificate?
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.
Reviews
Perfect environment to experiment in! Very easy and powerful to use.
Excellent introduction to these topics. The labs contain everything you need to start using these technologies. Highly recommended.
Nice intro to ETL and data pipelines. Beginner-level, easy-to-follow, hands-on Airflow and Kafka.
Thanks to the instructors' efforts, this is one of the best data engineering courses; it provides hands-on experience with essential data tools.