High Performance Programming Fundamentals

Master efficient computing and parallelism techniques through matrix multiplication to optimize code performance on modern CPU architectures.

This comprehensive course focuses on achieving high performance in compute-intensive applications through the practical example of matrix-matrix multiplication. Students progress from basic implementation to advanced optimization techniques, learning to leverage instruction-level parallelism and multithreading. The course emphasizes the crucial role of data movement in efficient computing, covering topics from algorithm selection to parallel computing. Through hands-on exercises in C programming, participants develop skills in performance optimization and software architecture, with free access to MATLAB Online for practical application.

5,403 already enrolled

Instructors:

Devangi Parikh

Robert van de Geijn

English

This course includes

5 weeks of self-paced video lessons

Intermediate Level

Completion certificate awarded on course completion

8,448

Audit For Free

What you'll learn

Master algorithm-to-architecture mapping for optimal performance

Implement multiple levels of parallelism in computational tasks

Optimize data movement patterns in high-performance applications

Analyze and interpret performance metrics effectively

Develop layered software architectures for complex systems

Skills you'll gain

High Performance Computing

Parallel Programming

Matrix Multiplication

CPU Architecture

Algorithm Optimization

Data Movement

Scientific Computing

Machine Learning

Multithreading

Performance Analysis

This course includes:

Prerecorded video

Graded assignments and exams

Access on mobile, tablet, and desktop

Limited access

Shareable certificate

Closed captions

Get a Completion Certificate

Share your certificate with prospective employers and your professional network on LinkedIn.

Created by

University of Texas at Austin

Provided by

edX

There are 4 modules in this course

This course teaches advanced techniques for optimizing code performance on modern CPUs through the practical lens of matrix multiplication. The curriculum progresses from basic implementations to sophisticated optimizations, covering instruction-level parallelism, multithreading, and efficient data movement. Students learn to analyze performance bottlenecks, implement parallel computing strategies, and manage software complexity through layered architecture. The course combines theoretical understanding with hands-on practice using C programming and MATLAB.
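As a rough illustration of the starting point described above, the following is a minimal sketch of the basic triple-loop matrix multiplication in C, written with two different loop orderings. It is not taken from the course materials: the function names gemm_ijp and gemm_jpi, the ENTRY macro, and the choice of column-major storage with a leading dimension are all assumptions made for this example. Both functions compute the same result; they differ only in the order in which memory is touched, which is the kind of data-movement effect the course examines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (an assumption of this sketch): entry (i, j) of a
   matrix stored in column-major order with leading dimension ld. */
#define ENTRY(M, ld, i, j) ((M)[(size_t)(j) * (size_t)(ld) + (size_t)(i)])

/* C := A * B + C, where A is m x k, B is k x n, and C is m x n.
   IJP order: the innermost loop is a dot product of row i of A with
   column p-entries of column j of B, so A is read with stride ldA. */
void gemm_ijp(int m, int n, int k,
              const double *A, int ldA,
              const double *B, int ldB,
              double *C, int ldC)
{
    assert(ldA >= m && ldB >= k && ldC >= m);
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++)
            for (int p = 0; p < k; p++)
                ENTRY(C, ldC, i, j) += ENTRY(A, ldA, i, p) * ENTRY(B, ldB, p, j);
}

/* Same computation in JPI order: the innermost loop walks down column j
   of C and column p of A, both with stride 1 in column-major storage. */
void gemm_jpi(int m, int n, int k,
              const double *A, int ldA,
              const double *B, int ldB,
              double *C, int ldC)
{
    assert(ldA >= m && ldB >= k && ldC >= m);
    for (int j = 0; j < n; j++)
        for (int p = 0; p < k; p++)
            for (int i = 0; i < m; i++)
                ENTRY(C, ldC, i, j) += ENTRY(A, ldA, i, p) * ENTRY(B, ldB, p, j);
}
```

With column-major storage, the JPI ordering accesses contiguous memory in its innermost loop, which often makes it faster on cache-based CPUs, but the actual difference depends on the compiler, the matrix sizes, and the machine, so it is worth timing both.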

Module 1: Loops and More Loops

Module 2: Start Your Engines

Module 3: Pushing the Limits

Module 4: Multithreaded Parallelism

Instructors

Devangi Parikh

2 Courses

A Distinguished Expert in High-Performance Computing and Computer Science Education

Dr. Devangi Parikh has established herself as an accomplished computer scientist at the University of Texas at Austin, where she currently serves as Assistant Professor of Instruction in the Department of Computer Science. Her academic journey includes a PhD in Electrical and Computer Engineering from Georgia Institute of Technology (2006-2012) and early career experience as a System Engineer at Texas Instruments, where she developed high-performance dense linear algebra libraries. Her research interests span high-performance computing, numerical software, and digital signal processing for speech and audio enhancement.

Robert van de Geijn

4 Courses

A Pioneering Authority in High-Performance Computing and Linear Algebra

Prof. Dr. Robert van de Geijn has established himself as a distinguished leader in computer science at the University of Texas at Austin, where he served until becoming Professor Emeritus in 2021. Born in the Netherlands in 1962, he earned his BS in Mathematics and Computer Science from the University of Wisconsin-Madison and his PhD in Applied Mathematics from the University of Maryland, College Park. As leader of the Science of High-Performance Computing group and a member of the Oden Institute for Computational Engineering and Sciences, he has made groundbreaking contributions to linear algebra, high-performance computing, and parallel processing. His pioneering work includes developing the Formal Linear Algebra Methods Environment (FLAME) and creating widely adopted open-source software libraries. His research excellence has been recognized with numerous awards, including the 2020 SIAM Activity Group on Supercomputing Best Paper Prize and the 2007-2008 President's Associates Teaching Excellence Award. His influential publications, including "Anatomy of High-Performance Matrix Multiplication" and "The Science of Deriving Dense Linear Algebra Algorithms," have significantly advanced the field of computational science. Though officially retired, he maintains 25% research activity, continuing his influential work in algorithm development and high-performance computing education.
