INFO8003-1 Reinforcement Learning

Course motivation

In a world where intelligent systems are increasingly autonomous, reinforcement learning (RL) is revolutionising decision-making across a range of complex problems (e.g., control of anti-UAV robots on a battlefield). From optimising robotic controls to developing strategies for financial markets, RL enables agents to learn from interactions with their environments and make decisions that maximise long-term rewards.

This course provides a comprehensive introduction to RL, focusing on both theoretical foundations and practical applications. Theoretical topics include learning in low-data environments (particularly useful for designing efficient medical treatments for chronic conditions such as obesity, alcoholism and cancer), operating in partially observable settings (problems encountered, for example, in robotics, in games or when interacting with energy markets) and coordinating multiple agents, a topic of growing importance as the defence industry develops drone-swarm technologies. Practical applications of RL to real-world problems will include robotics, large language models (LLMs) and infrastructure management planning.

Course information

This class is given during the second semester on Tuesday afternoons in Building B28, Room 1.21. It starts at 1:45pm and runs until 5:45pm. The first class takes place on the 4th of February. Course description.

The teaching assistants for the class are Arthur Louette and Raphaël Fonteneau. You can contact them at the following email addresses: arthur.louette@uliege.be and raphael.fonteneau@uliege.be.

Lectures schedule

| Date | Activity | Topic | Speaker |
|---|---|---|---|
| 04/02/25 | Course organisation, Lec 1 | Introduction to Reinforcement Learning (RL) | Arthur Louette, Damien Ernst |
| 11/02/25 | Lec 2 | Introduction to RL: Q-learning | Damien Ernst |
| 18/02/25 | Lec 3 | Introduction to RL: Fitted-Q iteration and convergence of Q-learning | Damien Ernst |
| 25/02/25 | Lec 4 | Low data reinforcement learning | Raphaël Fonteneau |
| 04/03/25 | No class | | |
| 11/03/25 | Lec 5 | Advanced algorithms for learning Q-functions | Gaspard Lambrechts |
| 18/03/25 | Lec 6 | Introduction to gradient-based direct policy search | Adrien Bolland |
| 25/03/25 | Lec 7 | Advanced policy gradient algorithms | Adrien Bolland |
| 01/04/25 | Lec 8 | Reinforcement learning for partially observable Markov decision processes | Gaspard Lambrechts |
| 08/04/25 | Lec 9 | Multi-agent reinforcement learning | Pascal Leroy |
| 15/04/25 | Lec 10 | Robotic reinforcement learning | Arthur Louette |
| 22/04/25 | No class | | |
| 29/04/25 | No class | | |
| 06/05/25 | Lec 11 | Reinforcement learning and large language models | Lize Pirenne |
| 13/05/25 | Q&A | | |

Practical sessions and deadlines

To avoid conflicting information, the submission platform is the point of reference for all deadlines.

The installation guide for the notebooks can be found here.

| Date | Activity | Topic | Materials |
|---|---|---|---|
| 04/02/25 | TP1 | Value function and Gym environment | Statement, Notebook |
| 11/02/25 | TP2 | Q-learning and system identification | Statement, Notebook |
| 18/02/25 | Q&A | Q&A notebooks TP 1 & 2 | |
| 24/02/25 | Homework 1 | Complete notebooks TP 1 & 2 | |
| 25/02/25 | TP3 + Homework correction | FQI and parametric Q-learning | Notebook |
| 11/03/25 | TP4 | Advanced Q-learning | Notebook |
| 18/03/25 | | | |
| 24/03/25 | Homework 2 | Complete notebooks TP 3 & 4 | |
| 25/03/25 | | | |
| 01/04/25 | TP5, Project | Policy-gradient: PPO; Project Presentation | Statement, code; Project |
| 08/04/25 | Q&A | Q&A for the project and the theoretical lectures | |
| 15/04/25 | Q&A | Q&A for the project and the theoretical lectures | |
| 22/04/25 | | | |
| 29/04/25 | | | |
| 06/05/25 | Q&A | Q&A for the project and the theoretical lectures | |
| 09/05/25 | Project | Deadline for the project | |
| 13/05/25 | Q&A | | |

Exam

The modalities for the exam and a list of potential questions can be downloaded here.

The schedule is available here.

If you do not plan to take the oral exam, please send us an email at arthur.louette@uliege.be. Also tell us as soon as possible if you have a problem with the exam schedule.

Highly recommended books

Prince, S. J. D. (2023). Understanding deep learning. The MIT Press. http://udlbook.com

Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press. http://incompleteideas.net/book/the-book-2nd.html

Previous projects + supplementary material

Project 1 – Sections 1 to 4 need to be submitted: see submission platform. Section 5: see submission platform.

Project 2 – Sections 1 to 4 need to be submitted: see submission platform. Deadline for the final submission: see submission platform.

Project 3 – Deadline for the final submission: see submission platform.

Deep RL with Vision: Statement.

Network management: ANM6-Easy project.

Robot equilibrium: Double Inverted Pendulum project.

Exploration/exploitation in Reinforcement Learning: The multi-armed bandit problems. Research paper (first 25 pages).

Evaluations that took place during the previous years: Evaluation 1, Evaluation 2, Evaluation 3, Evaluation 4, Evaluation 5.