Build ETL & streaming pipelines using Shell, Airflow & Kafka. Master data transformation for enterprise solutions. Automate data workflows efficiently!
This course cannot be purchased separately. To access the complete learning experience and graded assignments, and to earn certificates, you'll need to enroll in the full IBM Data Engineering Professional Certificate program. You can audit this course for free, which gives you access to the course materials and lectures so you can learn at your own pace without any financial commitment.
4.5
(335 ratings)
47,963 already enrolled
English
پښتو, বাংলা, اردو, 2 more
What you'll learn
Implement ETL and ELT processes for data warehousing
Build automated data pipelines using Shell scripting
Create and manage DAGs with Apache Airflow
Develop streaming data pipelines using Apache Kafka
Optimize pipeline performance and monitoring
This course includes:
1.8 hours of prerecorded video
11 assignments
Access on mobile, desktop, and tablet
Full-time access
Shareable certificate
Closed captions
Top companies offer this course to their employees
Top companies provide this course to strengthen their employees' skills so that they excel at complex projects and drive organizational success.
There are 5 modules in this course
This comprehensive course covers essential data engineering concepts, focusing on Extract, Transform, Load (ETL) processes and data pipeline development. Students learn to build both batch and streaming data pipelines using industry-standard tools like Shell scripting, Apache Airflow, and Apache Kafka. The curriculum includes hands-on labs and practical exercises in implementing ETL workflows, creating DAGs in Airflow, and developing streaming solutions with Kafka. Special emphasis is placed on understanding the differences between ETL and ELT processes, pipeline monitoring, and optimization techniques.
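As a rough illustration of the ETL pattern described above, here is a minimal sketch in Python. It is not taken from the course labs: the sample data, the table name `temps`, and the function names are all hypothetical, and it uses only the standard library (an in-memory SQLite database stands in for a data warehouse).

```python
import csv
import io
import sqlite3

# Hypothetical sample input; the course's own lab data is not shown here.
RAW_CSV = "city,temp_f\nToronto,68\nOttawa,50\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert Fahrenheit to Celsius, rounded to one decimal."""
    return [
        (row["city"], round((float(row["temp_f"]) - 32) * 5 / 9, 1))
        for row in rows
    ]


def load(rows, conn):
    """Load: write the transformed rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS temps (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO temps VALUES (?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in ETL order: extract, then transform, then load."""
    load(transform(extract(text)), conn)


# An in-memory SQLite database stands in for the warehouse (an assumption).
conn = sqlite3.connect(":memory:")
run_etl(RAW_CSV, conn)
result = conn.execute("SELECT city, temp_c FROM temps ORDER BY city").fetchall()
print(result)
```

In an ELT variant, the raw rows would be loaded first and the conversion would run afterwards inside the target system, for example as a SQL statement over the loaded table.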
Data Processing Techniques
Module 1 · 1 Hour to complete
ETL & Data Pipelines: Tools and Techniques
Module 2 · 2 Hours to complete
Building Data Pipelines using Airflow
Module 3 · 3 Hours to complete
Building Streaming Pipelines using Kafka
Module 4 · 3 Hours to complete
Final Assignment
Module 5 · 6 Hours to complete
Instructors
AI and Machine Learning Expert at IBM Canada
Yan Luo serves as a Data Scientist and Developer at IBM Canada, where he applies his expertise in machine learning and artificial intelligence to develop innovative cognitive applications across diverse domains including software repository mining, personalized health management, wireless networks, and digital banking. After earning his Ph.D. in Machine Learning from the University of Western Ontario, he has contributed significantly to technical education through developing and teaching multiple data science courses, including Applied Data Science Capstone, Machine Learning Capstone, and Introduction to R Programming for Data Science. His work focuses on practical applications of AI and cognitive computing, bridging the gap between theoretical machine learning concepts and real-world business solutions.
Data Engineering and Technology Education Expert
Ramesh Sannareddy is a freelance technology educator and content developer with over two and a half decades of experience in Information Technology Infrastructure Management, Database Administration, and Information Integration. After earning his Bachelor's Degree in Information Systems from Birla Institute of Technology, Pilani, he worked with leading technology companies including Intergraph, Genpact, HCL, and Microsoft. Now focused on teaching, he specializes in developing and delivering courses in Data Science, Machine Learning, Programming, and Databases. His course portfolio includes programs in Data Engineering, Data Warehousing, Linux Commands, Machine Learning with Apache Spark, and Python Programming. His courses have reached over 11,800 learners globally and hold a 4.5 rating.