INFO8003-1 Reinforcement Learning

Course motivation

In a world where intelligent systems are increasingly autonomous, reinforcement learning (RL) is revolutionising decision-making across a range of complex problems (e.g., control of anti-UAV robots on a battlefield). From optimising robotic controls to developing strategies for financial markets, RL enables agents to learn from interactions with their environments and make decisions that maximise long-term rewards.

This course provides a comprehensive introduction to RL, covering both theoretical foundations and practical applications. Theoretical themes include learning in low-data environments (particularly useful for designing efficient medical treatments for chronic conditions such as obesity, alcoholism and cancer), operating in partially observable settings (encountered, for example, in robotics, in games or when interacting with energy markets) and coordinating multiple agents, a theme of growing importance as the defence industry develops drone-swarm technologies. Practical applications of RL to real-world problems will include robotics, large language models (LLMs), and infrastructure management planning.

Course information

This class is given during the second semester on Tuesday afternoons in Building B28, Room 1.21, from 1:45pm to 5:45pm. The first class takes place on the 3rd of February. Course description.

The teaching assistants for the class are Arthur Louette and Raphaël Fonteneau. You should contact them at arthur.louette@uliege.be and raphael.fonteneau@uliege.be.

Lectures schedule

Date | Activity | Topic | Speaker
03/02/26 | Course organisation, Lec 1 | Introduction to Reinforcement Learning (RL) | Arthur Louette, Raphaël Fonteneau
10/02/26 | Lec 2 | Introduction to RL: Q-learning | Raphaël Fonteneau
24/02/26 | Lec 3 | Introduction to RL: Fitted-Q iteration and convergence of Q-learning | Raphaël Fonteneau
03/03/26 | Lec 4 | Advanced algorithms for learning Q-functions | Arthur Louette
10/03/26 | Lec 5 | Low data reinforcement learning | Raphaël Fonteneau
17/03/26 | Lec 6 | Policy gradient methods | Adrien Bolland
24/03/26 | Lec 7 | Partially observable Markov decision processes | Arthur Louette
31/03/26 | Lec 8 | Model-based reinforcement learning | Samy Mokeddem
07/04/26 | Lec 9 | Multi-agent reinforcement learning | Julien Hansen
14/04/26 | Lec 10 | Robotic reinforcement learning | Arthur Louette
05/05/26 | Lec 11 | Reinforcement learning and large language models | Lize Pirenne

Practical sessions and deadlines

The submission platform is the authoritative reference for deadlines; if a date listed here differs, the platform takes precedence.

The installation guide for the notebooks can be found here.

Date | Activity | Topic | Materials
03/02/26 | TP1 | Value function and Gym environment | Statement, Notebook
10/02/26 | TP2 | Q-learning and system identification | Statement, Notebook
24/02/26 | Q&A | Q&A notebooks TP 1 & 2 |
03/03/26 | Homework 1 | Complete notebooks TP 1 & 2 |
03/03/26 | TP3 + Homework correction | FQI and parametric Q-learning | Notebook
10/03/26 | TP4 | Advanced Q-learning | Notebook
17/03/26 | | |
24/03/26 | Homework 2 | Complete notebooks TP 3 & 4 |

Exam

The exam modalities and a list of potential questions can be downloaded here.

The schedule is available here.

If you do not plan to take the oral exam, please email us at arthur.louette@uliege.be. Also, tell us as soon as possible if you have a problem with the exam schedule.

Highly recommended books

Prince, S. J. D. (2023). Understanding deep learning. The MIT Press. http://udlbook.com

Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press. http://incompleteideas.net/book/the-book-2nd.html

Previous projects + supplementary material

Project 1 – Sections 1 to 4 need to be submitted; see the submission platform. Section 5: see the submission platform.

Project 2 – Sections 1 to 4 need to be submitted; see the submission platform. Deadline for the final submission: see the submission platform.

Project 3 – Deadline for the final submission: see the submission platform.

Deep RL with Vision: Statement.

Network management: ANM6-Easy project.

Robot equilibrium: Double Inverted Pendulum project.

Exploration/exploitation in Reinforcement Learning: The multi-armed bandit problems. Research paper (first 25 pages).

Evaluations that took place during the previous years: Evaluation 1, Evaluation 2, Evaluation 3, Evaluation 4, Evaluation 5.