Master efficient computing and parallelism techniques through matrix multiplication to optimize code performance on modern CPU architectures.
This comprehensive course focuses on achieving high performance in compute-intensive applications through the practical example of matrix-matrix multiplication. Students progress from basic implementation to advanced optimization techniques, learning to leverage instruction-level parallelism and multithreading. The course emphasizes the crucial role of data movement in efficient computing, covering topics from algorithm selection to parallel computing. Through hands-on exercises in C programming, participants develop skills in performance optimization and software architecture, with free access to MATLAB Online for practical application.
5,403 already enrolled
Instructors:
English
What you'll learn
Master algorithm-to-architecture mapping for optimal performance
Implement multiple levels of parallelism in computational tasks
Optimize data movement patterns in high-performance applications
Analyze and interpret performance metrics effectively
Develop layered software architectures for complex systems
This course includes:
Prerecorded video
Graded assignments, exams
Access on Mobile, Tablet, Desktop
Limited access
Shareable certificate
Closed caption
Get a Completion Certificate
Share your certificate with prospective employers and your professional network on LinkedIn.
There are 4 modules in this course
This course teaches advanced techniques for optimizing code performance on modern CPUs through the practical lens of matrix multiplication. The curriculum progresses from basic implementations to sophisticated optimizations, covering instruction-level parallelism, multithreading, and efficient data movement. Students learn to analyze performance bottlenecks, implement parallel computing strategies, and manage software complexity through layered architecture. The course combines theoretical understanding with hands-on practice using C programming and MATLAB.
Loops and More Loops
Module 1
Start Your Engines
Module 2
Pushing the Limits
Module 3
Multithreaded Parallelism
Module 4
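As a rough illustration of the starting point this kind of course builds on, the sketch below shows a baseline matrix-matrix multiplication in C and one loop reordering. It is not taken from the course materials: the function names gemm_ijp and gemm_jpi, the ELEM macro, and the column-major storage layout are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element (i, j) of a column-major matrix
   whose leading dimension (column stride) is ld. */
#define ELEM(M, ld, i, j) ((M)[(size_t)(j) * (ld) + (i)])

/* Baseline: C += A * B with loops ordered i, j, p.
   A is m x k, B is k x n, C is m x n, all stored column-major. */
void gemm_ijp(size_t m, size_t n, size_t k,
              const double *A, size_t ldA,
              const double *B, size_t ldB,
              double *C, size_t ldC)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t p = 0; p < k; p++)
                ELEM(C, ldC, i, j) += ELEM(A, ldA, i, p) * ELEM(B, ldB, p, j);
}

/* Same arithmetic with loops ordered j, p, i. With column-major
   storage the innermost loop now walks down a column of C and a
   column of A, so consecutive iterations touch adjacent memory. */
void gemm_jpi(size_t m, size_t n, size_t k,
              const double *A, size_t ldA,
              const double *B, size_t ldB,
              double *C, size_t ldC)
{
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t j = 0; j < n; j++)
        for (size_t p = 0; p < k; p++) {
            const double b = ELEM(B, ldB, p, j); /* reused across the i loop */
            for (size_t i = 0; i < m; i++)
                ELEM(C, ldC, i, j) += ELEM(A, ldA, i, p) * b;
        }
}
```

Both functions perform the same 2·m·n·k floating-point operations and differ only in the order in which memory is visited, which is the kind of data-movement effect the course description emphasizes. Any actual speed difference depends on the compiler, the flags used, and the target CPU.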
Instructors

2 Courses
A Distinguished Expert in High-Performance Computing and Computer Science Education
Dr. Devangi Parikh has established herself as an accomplished computer scientist at the University of Texas at Austin, where she currently serves as Assistant Professor of Instruction in the Department of Computer Science. Her academic journey includes a PhD in Electrical and Computer Engineering from Georgia Institute of Technology (2006-2012) and early career experience as a System Engineer at Texas Instruments, where she developed high-performance dense linear algebra libraries. Her research interests span high-performance computing, numerical software, and digital signal processing for speech and audio enhancement.

4 Courses
A Pioneering Authority in High-Performance Computing and Linear Algebra
Prof. Dr. Robert van de Geijn has established himself as a distinguished leader in computer science at the University of Texas at Austin, where he served until becoming Professor Emeritus in 2021. Born in the Netherlands in 1962, he earned his BS in Mathematics and Computer Science from the University of Wisconsin-Madison and PhD in Applied Mathematics from the University of Maryland, College Park. As leader of the Science of High-Performance Computing group and member of the Oden Institute for Computational Engineering and Science, he has made groundbreaking contributions to linear algebra, high-performance computing, and parallel processing. His pioneering work includes developing the Formal Linear Algebra Method (FLAME) and creating widely-adopted open-source software libraries. His research excellence has been recognized with numerous awards, including the 2020 SIAM Activity Group on Supercomputing Best Paper Prize and the 2007-2008 President's Associates Teaching Excellence Award. His influential publications, including "Anatomy of High-Performance Matrix Multiplication" and "The Science of Deriving Dense Linear Algebra Algorithms," have significantly advanced the field of computational science. Though officially retired, he maintains 25% research activity, continuing his influential work in algorithm development and high-performance computing education.