Decision Making and Reinforcement Learning

Master sequential decision making and reinforcement learning. Learn dynamic programming, Monte Carlo methods, and temporal difference learning.

This comprehensive course introduces sequential decision making and reinforcement learning fundamentals. Students explore utility theory, multi-armed bandit problems, and Markov decision processes (MDPs). The curriculum covers dynamic programming algorithms, partial observability concepts, and reinforcement learning paradigms including Monte Carlo methods and temporal difference learning. Through hands-on programming assignments and theoretical study, learners develop practical skills in implementing key algorithms and understanding their applications.

Rating: 4.2 (17 ratings)

2,885 already enrolled

Instructor: Tony Dear

English

Pashto, Bengali, Urdu, 2 more

This course includes

47 hours of self-paced video lessons

Intermediate level

Completion certificate awarded on course completion

2,435

Audit For Free


What you'll learn

Map between qualitative preferences and appropriate quantitative utilities

Model sequential decision problems using multi-armed bandits and Markov processes

Implement dynamic programming algorithms for optimal policies

Master Monte Carlo and temporal difference learning methods

Understand partial observability in real-world problems

Develop practical skills in reinforcement learning implementation

Skills you'll gain

reinforcement learning

Monte Carlo methods

Markov decision process

dynamic programming

machine learning

temporal difference learning

utility theory

Python programming

algorithms

decision making

This course includes:

264 minutes of prerecorded video

8 assignments

Access on Mobile, Tablet, Desktop

Full-time access

Shareable certificate

Closed captions

Get a Completion Certificate

Share your certificate with prospective employers and your professional network on LinkedIn.

Created by

Columbia University

Provided by

Coursera


There are 8 modules in this course

This course provides a comprehensive introduction to sequential decision making and reinforcement learning. Students begin with utility theory foundations and progress through multi-armed bandit problems, Markov decision processes, and dynamic programming solutions. The curriculum covers both theoretical concepts and practical implementations, including partial observability, Monte Carlo methods, and temporal difference learning. Through eight modules, learners develop skills in implementing various reinforcement learning algorithms while understanding their mathematical foundations and real-world applications.

Module 1 · Decision Making and Utility Theory · 5 hours to complete

Module 2 · Bandit Problems · 4 hours to complete

Module 3 · Markov Decision Processes · 4 hours to complete

Module 4 · Dynamic Programming · 7 hours to complete

Module 5 · Partially Observable Markov Decision Processes · 5 hours to complete

Module 6 · Monte Carlo Methods · 4 hours to complete

Module 7 · Temporal-Difference Learning · 8 hours to complete

Module 8 · Reinforcement Learning - Generalization · 5 hours to complete


Instructor

Tony Dear

4.3 rating · 6 reviews · 3,063 students · 1 course

Innovator in Robotics and Artificial Intelligence at Columbia University

Tony Dear is a Lecturer in the Discipline of Computer Science at Columbia University, a position he has held since 2018. He holds a Bachelor's degree in Electrical Engineering and Computer Science from the University of California, Berkeley, and a Master's and PhD in Robotics from Carnegie Mellon University. He teaches courses such as Artificial Intelligence, Discrete Mathematics, and Data-Driven Decision Modeling, with an emphasis on connecting theoretical concepts to practical applications. He also serves as faculty director for Columbia's Online Artificial Intelligence Executive Education program and developed the Decision Making and Reinforcement Learning course available on Coursera. His research interests lie in geometric mechanics and deep reinforcement learning, particularly as applied to improving robot locomotion.


