Master sequential decision making and reinforcement learning. Learn dynamic programming, Monte Carlo methods, and temporal difference learning.
This comprehensive course introduces sequential decision making and reinforcement learning fundamentals. Students explore utility theory, multi-armed bandit problems, and Markov decision processes (MDPs). The curriculum covers dynamic programming algorithms, partial observability concepts, and reinforcement learning paradigms including Monte Carlo methods and temporal difference learning. Through hands-on programming assignments and theoretical study, learners develop practical skills in implementing key algorithms and understanding their applications.
4.2 (17 ratings)
2,885 already enrolled
English
Pashto, Bengali, Urdu, 2 more
What you'll learn
Map between qualitative preferences and appropriate quantitative utilities
Model sequential decision problems using multi-armed bandits and Markov processes
Implement dynamic programming algorithms for optimal policies
Master Monte Carlo and temporal difference learning methods
Understand partial observability in real-world problems
Develop practical skills in reinforcement learning implementation
This course includes:
264 minutes of prerecorded video
8 assignments
Access on mobile, tablet, and desktop
Full-time access
Shareable certificate
Closed captions
Get a Completion Certificate
Share your certificate with prospective employers and your professional network on LinkedIn.
There are 8 modules in this course
This course provides a comprehensive introduction to sequential decision making and reinforcement learning. Students begin with utility theory foundations and progress through multi-armed bandit problems, Markov decision processes, and dynamic programming solutions. The curriculum covers both theoretical concepts and practical implementations, including partial observability, Monte Carlo methods, and temporal difference learning. Through eight modules, learners develop skills in implementing various reinforcement learning algorithms while understanding their mathematical foundations and real-world applications.
Decision Making and Utility Theory
Module 1 · 5 hours to complete
Bandit Problems
Module 2 · 4 hours to complete
Markov Decision Processes
Module 3 · 4 hours to complete
Dynamic Programming
Module 4 · 7 hours to complete
Partially Observable Markov Decision Processes
Module 5 · 5 hours to complete
Monte Carlo Methods
Module 6 · 4 hours to complete
Temporal-Difference Learning
Module 7 · 8 hours to complete
Reinforcement Learning - Generalization
Module 8 · 5 hours to complete
Instructor
Innovator in Robotics and Artificial Intelligence at Columbia University
Tony Dear is a Lecturer in the Discipline of Computer Science at Columbia University, where he has been shaping the future of robotics and artificial intelligence since 2018. He holds a Bachelor’s degree in Electrical Engineering and Computer Science from the University of California, Berkeley, as well as a Master’s and PhD in Robotics from Carnegie Mellon University. His academic focus includes teaching courses such as Artificial Intelligence, Discrete Mathematics, and Data-Driven Decision Modeling, where he emphasizes the integration of theoretical concepts with practical applications. In addition to his teaching responsibilities, Tony serves as the faculty director for Columbia's Online Artificial Intelligence Executive Education program and has developed a course on Decision Making and Reinforcement Learning available on Coursera. His research interests lie in geometric mechanics and deep reinforcement learning, particularly in applying these concepts to improve robot locomotion. With a commitment to fostering innovation and collaboration among students, Tony Dear continues to make significant contributions to the fields of computer science and robotics through his teaching and research initiatives.