INFO8003-1 Reinforcement Learning

Course motivation

In a world where intelligent systems are increasingly autonomous, reinforcement learning (RL) is revolutionising decision-making across a range of complex problems (e.g., control of anti-UAV robots on a battlefield). From optimising robotic controls to developing strategies for financial markets, RL enables agents to learn from interactions with their environments and make decisions that maximise long-term rewards.

This course provides a comprehensive introduction to RL, covering both theoretical foundations and practical applications. Theoretical topics include learning in low-data environments (particularly useful for designing efficient medical treatments for chronic diseases such as obesity, alcoholism and cancer), operating in partially observable settings (encountered, for example, in robotics, in games or when interacting with energy markets) and coordinating multiple agents, a topic of growing importance as the defense industry develops drone-swarm technologies. Practical applications of RL to real-world problems will include robotics, large language models (LLMs), and infrastructure management planning.

Course information

This class is given during the second semester on Tuesday afternoons in Building B28, Room 1.21, from 1:45pm to 5:45pm. The first class takes place on the 3rd of February. Course description.

The teaching assistants for the class are Arthur Louette and Raphaël Fonteneau. You should contact them at arthur.louette@uliege.be and raphael.fonteneau@uliege.be.

Lectures schedule

Date	Activity	Topic	Speaker
03/02/26	Course organisation, Lec 1	Introduction to Reinforcement Learning (RL)	Arthur Louette, Raphaël Fonteneau
10/02/26	Lec 2	Introduction to RL: Q-learning	Raphaël Fonteneau
24/02/26	Lec 3	Introduction to RL: Fitted-Q iteration and convergence of Q-learning	Raphaël Fonteneau
03/03/26	Lec 4	Advanced algorithms for learning Q-functions	Arthur Louette
10/03/26	Lec 5	Low data reinforcement learning	Raphaël Fonteneau
17/03/26	Lec 6a, Lec 6b	Policy gradient methods	Adrien Bolland
24/03/26	Lec 7	Partially observable Markov decision processes	Arthur Louette
31/03/26	Lec 8	Model-based reinforcement learning	Samy Mokeddem
07/04/26	Lec 9	Multi-agent reinforcement learning	Julien Hansen
14/04/26	Lec 10	Robotic reinforcement learning	Arthur Louette
05/05/26	Lec 11	Reinforcement learning and large language models	Lize Pirenne

Practical sessions and deadlines

In case of discrepancy, the submission platform is the reference for deadlines.

The installation guide for the notebooks can be found here.

Date	Activity	Topic	Materials
03/02/26	TP1	Value function and Gym environment	Statement, Notebook
10/02/26	TP2	Q-learning and system identification	Statement, Notebook
24/02/26	Q&A	Q&A notebooks TP 1 & 2	–
03/03/26	Homework 1	Complete notebooks TP 1 & 2	–
03/03/26	TP3 + Homework correction	FQI and parametric Q-learning	Notebook
10/03/26	TP4	Advanced Q-learning	Notebook
17/03/26	–	–	–
24/03/26	Homework 2	Complete notebooks TP 3 & 4	–

Exam

The exam modalities and a list of potential questions can be downloaded here.

The schedule is available here.

If you do not plan to take the oral exam, please email us at arthur.louette@uliege.be. Also, tell us as soon as possible if you have a problem with the exam schedule.

Highly recommended books

Prince, S. J. D. (2023). Understanding deep learning. The MIT Press. http://udlbook.com

Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction (2nd ed.). MIT Press. http://incompleteideas.net/book/the-book-2nd.html

Previous projects + supplementary material

Project 1 – Sections 1 to 4 need to be submitted, see the submission platform. Section 5: see submission platform.

Project 2 – Sections 1 to 4 need to be submitted, see the submission platform. Deadline for the final submission: see submission platform.

Project 3 – Deadline for the final submission: see submission platform.

Deep RL with Vision: Statement.

Network management: ANM6-Easy project.

Robot equilibrium: Double Inverted Pendulum project.

Exploration/exploitation in Reinforcement Learning: the multi-armed bandit problem. Research paper (first 25 pages).

Evaluations that took place during the previous years: Evaluation 1; Evaluation 2; Evaluation 3; Evaluation 4; Evaluation 5.