FI:PA230 Reinforcement Learning - Course Information
PA230 Reinforcement Learning
Faculty of Informatics
Autumn 2024
- Extent and Intensity
- 2/0/1. 3 credit(s) (plus extra credits for completion). Type of Completion: zk (examination).
In-person direct teaching
- Teacher(s)
- doc. RNDr. Petr Novotný, Ph.D. (lecturer)
Mgr. Martin Kurečka (assistant)
Bc. Václav Nevyhoštěný (assistant)
Bc. Vít Unčovský (assistant)
- Guaranteed by
- doc. RNDr. Petr Novotný, Ph.D.
Department of Computer Science – Faculty of Informatics
Supplier department: Department of Computer Science – Faculty of Informatics
- Timetable
- Thu 26. 9. to Thu 19. 12. Thu 14:00–15:50 B410
- Prerequisites
- PV021 Neural Networks
Knowledge of basic types of neural networks and of their training. Elementary knowledge of probability and statistics.
- Course Enrolment Limitations
- The course is also offered to students of fields other than those it is directly associated with.
- Fields of study / plans the course is directly associated with
- Machine learning and artificial intelligence (programme FI, N-UIZD)
- Course objectives
- The main aim of the course is to introduce the participants to the field of reinforcement learning and to acquaint them with the major approaches to training agent policies. This will include both classical, model-based approaches and modern deep-learning and tree search algorithms. The knowledge will be reinforced by a hands-on project in which the participants will train their own agents on selected benchmarks.
- Learning outcomes
- After completing the course the student:
+ will have a formal understanding of the problems solved in the field of reinforcement learning (RL).
+ will be able to formulate core principles of RL algorithms.
+ will be able to describe the most prominent RL algorithms and reason about their performance characteristics and tradeoffs.
+ will have practical experience with training an RL agent using state-of-the-art deep learning frameworks.
+ will be able to read scientific literature from the RL domain.
- Syllabus
- Aims of reinforcement learning (RL), neuropsychological connection, brief history.
- Problem formalization: Markov decision processes, policies, payoffs.
- Exact policy synthesis methods: value iteration, policy iteration, their relevance for RL.
- Brief overview of tabular methods: Monte Carlo, SARSA, Q-learning. General principles: temporal difference learning, value bootstrapping.
- Deep reinforcement learning: function approximators and issues pertaining to their use, gradient-based optimization.
- Value-based Deep RL methods: DQN, DDQN, Rainbow heuristics, deep TD(lambda).
- Policy gradient methods: policy gradient theorem, REINFORCE, Actor-Critic methods, SAC, trust region policy optimization (TRPO), proximal policy optimization (PPO).
- Monte Carlo tree search (MCTS) methods: conceptual foundations (exploration vs. exploitation, multi-armed bandits, upper confidence bound), UCT-based MCTS, MCTS and deep RL (AlphaZero, MuZero, stochastic MuZero).
- Case study: RL with human feedback in fine-tuning of large language models.
- Some current topics in RL research: multi-agent learning (AlphaStar), safe and risk-constrained learning, hierarchical learning, transfer learning.
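
As an illustration of the "value iteration" item in the syllabus above, here is a minimal Python sketch. The two-state MDP, its rewards, the discount factor 0.9 and all function names are hypothetical choices made for this example only; they are not taken from the course materials.

```python
# Hypothetical two-state MDP, used only for illustration.
# TRANSITIONS[state][action] = list of (probability, next_state, reward)
TRANSITIONS = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "go": [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    },
    "s1": {
        "stay": [(1.0, "s1", 2.0)],
        "go": [(1.0, "s0", 0.0)],
    },
}
GAMMA = 0.9  # assumed discount factor


def q_value(values, state, action, gamma=GAMMA):
    """Expected one-step reward plus discounted value of the successor state."""
    return sum(
        prob * (reward + gamma * values[next_state])
        for prob, next_state, reward in TRANSITIONS[state][action]
    )


def value_iteration(gamma=GAMMA, tol=1e-10, max_iters=10_000):
    """Repeat the Bellman optimality update until the values stop changing."""
    values = {state: 0.0 for state in TRANSITIONS}
    for _ in range(max_iters):
        new_values = {
            state: max(q_value(values, state, a, gamma) for a in TRANSITIONS[state])
            for state in TRANSITIONS
        }
        delta = max(abs(new_values[s] - values[s]) for s in TRANSITIONS)
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the (approximately) converged values.
    policy = {
        state: max(TRANSITIONS[state], key=lambda a: q_value(values, state, a, gamma))
        for state in TRANSITIONS
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration()
    print(values, policy)
```

The sketch assumes the transition probabilities and rewards are fully known, which is what makes this an exact synthesis method. The tabular and deep RL methods listed afterwards address the setting where they have to be learned from samples.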
- Literature
- LAPAN, Maxim. Deep reinforcement learning hands-on : apply modern RL methods to practical problems of chatbots, robotics, discrete optimization, web automation, and more. Second edition. Birmingham: Packt, 2020, xix, 798. ISBN 9781838826994.
- SUTTON, Richard S. and Andrew G. BARTO. Reinforcement learning : an introduction. Second edition. Cambridge, Massachusetts: The MIT Press, 2018, xxii, 526. ISBN 9780262039246.
- WIERING, Marco. Reinforcement learning : state of the art. Edited by Martijn van Otterlo. Berlin: Springer-Verlag, 2012, xxxiv, 638. ISBN 9783642446856.
- Teaching methods
- lecture, semestral project
- Assessment methods
- semestral project, oral exam
- Language of instruction
- English
- Further comments (probably available only in Czech)
- Study Materials
The course is taught annually.
- Teacher's information
- An exception from the requirement of passing the PV021 course can be granted in some circumstances (e.g., if you enroll in PV021 in the same semester and there is enough space in PA230).
- Permalink: https://is.muni.cz/course/fi/autumn2024/PA230