Introduction to Big Data with Spark and Hadoop

Master Big Data processing with Apache Hadoop & Spark! Learn to use these tools and their ecosystems for efficient data management & analytics.

This course cannot be purchased separately. To access the complete learning experience and graded assignments, and to earn certificates, you'll need to enroll in the full IBM Data Engineering Professional Certificate program. You can audit this course for free, which gives you access to the course materials and lectures and lets you learn at your own pace without any financial commitment.

4.4

(358 ratings)

52,135 already enrolled

English

پښتو, বাংলা, اردو, 2 more

This course includes

  • 19 hours of self-paced video lessons

  • Intermediate level

  • Completion certificate awarded on course completion

  • Free course

What you'll learn

  • Implement big data solutions using Hadoop ecosystem

  • Develop applications with Apache Spark and RDDs

  • Optimize queries using Spark SQL and DataFrames

  • Manage and monitor Spark applications

  • Master distributed computing concepts

Skills you'll gain

Big Data Processing
Apache Spark
Hadoop Ecosystem
Data Parallelism
SparkSQL
MapReduce
HDFS
Data Engineering
Cluster Management
Performance Tuning

This course includes:

  • 2.7 hours of pre-recorded video

  • 14 assignments

  • Access on mobile, desktop, and tablet

  • Full-time access

  • Shareable certificate

  • Closed captions

Get a Completion Certificate

Share your certificate with prospective employers and your professional network on LinkedIn.

Top companies offer this course to their employees

Top companies provide this course to enhance their employees' skills, ensuring they excel in handling complex projects and drive organizational success.

There are 7 modules in this course

This comprehensive course explores big data processing using Apache Hadoop and Spark frameworks. Students learn about distributed computing principles, parallel processing, and data parallelism. The curriculum covers the Hadoop ecosystem including HDFS, MapReduce, Hive, and HBase, along with Spark's capabilities in data processing and analysis. Through hands-on labs using Docker and Kubernetes, participants gain practical experience with DataFrame operations, SparkSQL, and cluster management.
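
To give a feel for the data parallelism and MapReduce ideas mentioned above, here is a minimal sketch in plain Python. It is not taken from the course materials and does not use Hadoop or Spark: the function names (`split_into_partitions`, `map_partition`, `merge_counts`, `word_count`) are hypothetical, and threads on one machine stand in for the nodes of a cluster. The input is split into partitions, each partition is counted independently (the map step), and the partial results are merged (the reduce step).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_partitions(records, num_partitions):
    """Split the input into roughly equal chunks, one per (hypothetical) worker."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def map_partition(lines):
    """Map step: count the words of a single partition, independently of the others."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial word counts into one."""
    return left + right


def word_count(lines, num_partitions=3):
    """Count words with a map step per partition followed by a single reduce."""
    partitions = split_into_partitions(lines, num_partitions)
    # Each partition is handled by its own thread, standing in for a cluster node.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(map_partition, partitions))
    return reduce(merge_counts, partial_counts, Counter())


if __name__ == "__main__":
    sample = [
        "spark and hadoop process big data",
        "hadoop stores data in hdfs",
        "spark keeps data in memory",
    ]
    print(word_count(sample).most_common(3))
```

Because every partition is processed without reference to the others, the same pattern can in principle be spread over many machines, which is the core idea that frameworks such as Hadoop MapReduce and Spark build on at much larger scale.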

What Is Big Data?

Module 1 · 1 Hour to complete

Introduction to the Hadoop Ecosystem

Module 2 · 2 Hours to complete

Apache Spark

Module 3 · 2 Hours to complete

DataFrames and Spark SQL

Module 4 · 2 Hours to complete

Development and Runtime Environment Options

Module 5 · 3 Hours to complete

Monitoring and Tuning

Module 6 · 2 Hours to complete

Final Project and Assessment

Module 7 · 4 Hours to complete

Instructors

Aije Egwaikhide

4.3 rating

87 Reviews

631,843 Students

6 Courses

Data Scientist Aije Egwaikhide: Empowering Women in STEM and Innovating AI Solutions at IBM

Aije Egwaikhide is a fantastic example of how dedication and passion can lead to a successful career in tech! With her background in Economics and Statistics, paired with advanced qualifications in Business and Management Analytics, she’s truly paving the way in the field of data science. Her work at IBM, particularly in creating innovative machine learning solutions for the Oil and Gas sector, is an inspiring achievement.

Rav Ahuja

4.6 rating

140 Reviews

3,149,375 Students

53 Courses

Technology Education and Skills Development Leader at IBM

Rav Ahuja serves as the Chief Content Officer and Global Program Director at IBM Skills Network, where he leads curriculum creation, growth strategy, and partner programs. After earning his B.Eng. from McGill University and MBA from the University of Western Ontario, he co-founded Cognitive Class, an IBM initiative focused on democratizing access to in-demand technology skills. Based at the IBM Canada Lab in Toronto, he specializes in developing instructional solutions for AI, Data Science, Cloud Computing, and Cybersecurity. He is the architect of numerous IBM Professional Certificates and an instructor for over 35 online courses, including popular offerings like "What is Data Science?", "Introduction to Cloud Computing," and "Introduction to Artificial Intelligence (AI)." His courses have reached hundreds of thousands of learners worldwide, with "What is Data Science?" alone enrolling over 638,000 students. His recent work includes developing new Generative AI courses and career guidance content for IBM's Professional Certificate programs.

Testimonials

Testimonials and success stories are a testament to the quality of this program and its impact on your career and learning journey. Be the first to help others make an informed decision by sharing your review of the course.

4.4 course rating

358 ratings

Frequently asked questions

Below are some of the most commonly asked questions about this course. We aim to provide clear and concise answers to help you better understand the course content, structure, and any other relevant information. If you have any additional questions or if your question is not listed here, please don't hesitate to reach out to our support team for further assistance.