PBT for Adversarial Reward Shaping - A Work in Progress
Software Institute
Date: 12 March 2026 / 17:00 - 18:00
USI East Campus, Room D1.13
Speaker: Andréa Cristina de Souza Doreste, USI
Abstract: Autonomous Driving Systems (ADS) are inherently safety-critical applications and must be thoroughly tested before deployment. One way to achieve this is to use Reinforcement Learning (RL) to dynamically test ADS behavior by controlling Non-Playable Characters (NPCs) as adversarial agents. In this approach, the RL adversarial agent is guided by a reward function to learn how to challenge the original behavior of the ADS under test. However, creating and fine-tuning a reward function requires domain expertise and manual experimentation. To address this, we propose using Population-Based Training (PBT), an optimization method originally designed to train a population of models simultaneously, to fine-tune the adversarial reward. During training with PBT, the rewards with the best performance are retained, while the underperforming ones are mutated. We applied PBT to our case study, Highway-Env, and are currently conducting experiments to evaluate its effectiveness in this setting.
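For readers unfamiliar with PBT, the retain-and-mutate step described in the abstract can be sketched roughly as follows. This is only an illustrative sketch and not the speaker's implementation: the weight names (`collision`, `proximity`), the `toy_fitness` function, and the parameters `keep_frac` and `sigma` are all hypothetical. In a real setup, the fitness would presumably come from training an adversarial agent with each candidate reward and measuring how well it challenges the ADS under test.

```python
import random
from dataclasses import dataclass


@dataclass
class Member:
    """One population member: a candidate set of reward-shaping weights."""
    weights: dict          # hypothetical weight names, e.g. "collision"
    score: float = 0.0     # filled in when the member is evaluated


def toy_fitness(weights):
    """Stand-in for the real evaluation (ASSUMPTION, for illustration only).

    Scores a candidate by its closeness to an arbitrary target; a real
    evaluation would train and assess an adversarial agent instead.
    """
    target = {"collision": 1.0, "proximity": 0.5}
    return -sum((weights[k] - target[k]) ** 2 for k in target)


def pbt_step(population, evaluate, rng, keep_frac=0.5, sigma=0.2):
    """One simplified PBT step: keep the best members, mutate the rest.

    Underperformers are replaced by perturbed copies of randomly chosen
    survivors, so the population size stays constant.
    """
    for member in population:
        member.score = evaluate(member.weights)
    ranked = sorted(population, key=lambda m: m.score, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_frac))
    survivors = ranked[:n_keep]

    next_generation = list(survivors)
    for _ in ranked[n_keep:]:
        parent = rng.choice(survivors)
        # Mutation: scale each weight by a random factor in [1-sigma, 1+sigma].
        child = {
            name: value * rng.uniform(1.0 - sigma, 1.0 + sigma)
            for name, value in parent.weights.items()
        }
        next_generation.append(Member(child))
    return next_generation


if __name__ == "__main__":
    rng = random.Random(0)
    population = [
        Member({"collision": rng.uniform(0, 2), "proximity": rng.uniform(0, 2)})
        for _ in range(6)
    ]
    for generation in range(5):
        population = pbt_step(population, toy_fitness, rng)
        best = max(toy_fitness(m.weights) for m in population)
        print(f"generation {generation}: best fitness {best:.4f}")
```

Because the survivors are carried over unchanged and the toy evaluation is deterministic, the best fitness in the population cannot decrease from one step to the next. With a noisy, training-based evaluation that guarantee would no longer hold.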
Bonus: Rio de Janeiro – a beginner’s guide
Since ICSE 2026 is happening in Rio de Janeiro, my hometown, I thought it would be a good opportunity to share some information about the city with people who are planning to travel there.
Biography: I’m a Ph.D. student in the TAU (Testing AUtomated) research group at the Software Institute, USI, Lugano, supervised by Prof. Dr. Paolo Tonella. I received both my BS degree in Computer and Information Engineering and my Master’s degree in Systems Engineering and Computer Science from the Federal University of Rio de Janeiro, Brazil. My current research focuses on testing Autonomous Driving Systems (ADS) using Reinforcement Learning to train an Adversarial Agent.
Chair: Mattia Giannaccari
*************************
In February 2019, the Software Institute started its SI Seminar Series. Every Thursday afternoon, a researcher of the Institute gives a short public talk on a software engineering topic of their choice. Examples include, but are not limited to, novel interesting papers, seminal papers, personal research overviews, discussions of preliminary research ideas, tutorials, and small experiments.
On our YouTube playlist you can watch some of the past seminars. On the SI website you can find more details on the next seminar, the upcoming seminars, and an archive of the past speakers.