Master big data platforms and tools like Spark, Hadoop, and Snowflake for building scalable data engineering solutions.
This course cannot be purchased separately. To access the complete learning experience, including graded assignments and certificates, you need to enroll in the full Applied Python Data Engineering Specialization program. You can audit this course for free, which gives you access to the course materials and lectures so you can learn at your own pace without any financial commitment.
Rating: 3.8 (31 ratings) · 7,839 already enrolled
Language: English
What you'll learn
Create scalable data pipelines with modern platforms
Optimize data processing with clustering
Implement ML solutions using PySpark
Apply DataOps and DevOps practices
Manage end-to-end data engineering workflows
This course includes:
10.4 hours of pre-recorded video
21 quizzes
Access on mobile, tablet, and desktop
Full-time access
Shareable certificate
Get a Completion Certificate
Share your certificate with prospective employers and your professional network on LinkedIn.

Top companies offer this course to their employees
Top companies provide this course to enhance their employees' skills, ensuring they excel in handling complex projects and drive organizational success.





There are 4 modules in this course
This course focuses on enterprise-scale data engineering platforms and methodologies. Students learn to build and optimize data pipelines using Hadoop, Spark, and Snowflake, and to use PySpark for data processing. The curriculum also covers Databricks integration, MLflow for machine learning lifecycle management, and DataOps practices. Through hands-on projects and real-world applications, learners develop skills in implementing scalable data solutions and managing end-to-end data engineering workflows.
Overview and Introduction to PySpark
Module 1 · 7 hours to complete
Snowflake
Module 2 · 4 hours to complete
Azure Databricks and MLflow
Module 3 · 5 hours to complete
DataOps and Operations Methodologies
Module 4 · 12 hours to complete
Instructors
Executive in Residence and Founder of Pragmatic AI Labs at Duke University
Noah Gift is the founder of Pragmatic AI Labs and serves as an Executive in Residence at Duke University, where he lectures in the Master of Interdisciplinary Data Science (MIDS) program. He specializes in designing and teaching graduate-level courses on machine learning, MLOps, artificial intelligence, and data science, while also consulting on machine learning and cloud architecture for students and faculty. A recognized expert in the field, Gift is a Python Software Foundation Fellow and an AWS Machine Learning Hero, holding multiple AWS certifications, including AWS Certified Solutions Architect and AWS Certified Machine Learning Specialist. He has authored several influential books, such as Practical MLOps, Python for DevOps, and Pragmatic AI, and has published over 100 technical articles across various platforms, including Forbes and O'Reilly. His extensive industry experience includes roles as CTO and Chief Data Scientist for notable companies like Disney Feature Animation, Sony Imageworks, and AT&T, contributing to major films like Avatar and Spider-Man 3. Gift's work has generated millions in revenue through product development on a global scale. He actively consults startups on machine learning and cloud architecture while leading initiatives to enhance data science education.
Senior Data Engineer and Educator at Duke University
Kennedy Behrman is a Senior Data Engineer at Duke University, where he also serves as an instructor for several online courses focused on data engineering and visualization. With decades of experience in Python and data management across various fields, including film, computing, and machine learning, he has established himself as a leading figure in the industry. Behrman has developed and taught courses such as "Data Visualization with Python" and "Linux and Bash for Data Engineering," equipping students with essential skills for the evolving data landscape. His expertise extends to big data processing technologies, where he covers platforms like Apache Spark and Snowflake. In addition to his teaching roles, Behrman has authored educational materials that contribute to the understanding of data science principles. His commitment to fostering learning and innovation in data engineering makes him a valuable asset to both Duke University and the broader academic community.