PBT for Adversarial Reward Shaping - A Work in Progress

Software Institute

Date: 12 March 2026 / 17:00 - 18:00

USI East Campus, Room D1.13

Speaker: Andréa Cristina de Souza Doreste, USI

Abstract: Autonomous Driving Systems (ADS) are inherently safety-critical applications and must be thoroughly tested before deployment. One way to achieve this is to use Reinforcement Learning (RL) to test ADS behavior dynamically, by controlling Non-Playable Characters (NPCs) as adversarial agents. In this approach, the RL adversarial agent is guided by a reward function to learn how to challenge the behavior of the ADS under test. However, creating and fine-tuning a reward function requires domain expertise and manual experimentation. To address this, we propose using Population-Based Training (PBT), an optimization method originally designed to train a population of models simultaneously, to fine-tune the adversarial reward. During training with PBT, the best-performing rewards are retained, while the underperforming ones are mutated. We applied PBT to our case study, Highway-Env, and are currently conducting experiments to evaluate its effectiveness in this setting.
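
To make the retain-and-mutate step concrete, here is a minimal Python sketch of one PBT update applied to a population of reward configurations. It is an illustration under stated assumptions, not the speaker's implementation: the function name pbt_step, the reward-weight names ("collision", "proximity"), the 25% truncation fraction, and the 0.8/1.2 perturbation factors are all hypothetical, and the scores stand in for whatever performance measure the experiments actually use.

```python
import random
from typing import Dict, List, Sequence

# A reward configuration is modeled as named weights (names are hypothetical).
RewardWeights = Dict[str, float]


def pbt_step(
    population: List[RewardWeights],
    scores: Sequence[float],
    rng: random.Random,
    fraction: float = 0.25,
    factors: Sequence[float] = (0.8, 1.2),
) -> List[RewardWeights]:
    """Return a new population after one exploit/explore step.

    The best-scoring members are kept unchanged; each of the worst-scoring
    members is replaced by a perturbed copy of a randomly chosen top member.
    """
    if len(population) != len(scores):
        raise ValueError("population and scores must have the same length")

    n = len(population)
    # Number of members to replace; capped so top and bottom never overlap.
    k = min(max(1, int(n * fraction)), n // 2)

    # Indices sorted from best score to worst score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top, bottom = ranked[:k], ranked[n - k:] if k > 0 else []

    # Start from copies so the input population is not modified.
    new_population = [dict(member) for member in population]
    for loser in bottom:
        parent = population[rng.choice(top)]
        # Exploit: copy the parent's weights. Explore: perturb each weight.
        new_population[loser] = {
            name: value * rng.choice(factors) for name, value in parent.items()
        }
    return new_population


# Example usage with four hypothetical reward configurations.
example_population = [
    {"collision": 1.0, "proximity": 0.5},
    {"collision": 2.0, "proximity": 0.25},
    {"collision": 3.0, "proximity": 0.1},
    {"collision": 4.0, "proximity": 0.2},
]
example_scores = [0.1, 0.9, 0.5, 0.3]  # assumed: higher is better
next_population = pbt_step(example_population, example_scores, random.Random(0))
```

In this example the lowest-scoring configuration (index 0) is replaced by a perturbed copy of the highest-scoring one (index 1), while the other three are carried over unchanged. In a full PBT loop, this step would alternate with further training of the adversarial agents under each reward configuration.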

Bonus: Rio de Janeiro – a beginner’s guide

Since ICSE 2026 is happening in Rio de Janeiro, my hometown, I thought it would be a good opportunity to share some information about the city with people who are planning to travel there.

Biography: I’m a Ph.D. student in the TAU (Testing AUtomated) research group at the Software Institute, USI, Lugano, supervised by Prof. Dr. Paolo Tonella. I received both my BS degree in Computer and Information Engineering and my Master’s degree in Systems Engineering and Computer Science from the Federal University of Rio de Janeiro, Brazil. My current research focuses on testing Autonomous Driving Systems (ADS) using Reinforcement Learning to train an Adversarial Agent.

Chair: Mattia Giannaccari

*************************

In February 2019, the Software Institute started its SI Seminar Series. Every Thursday afternoon, a researcher of the Institute gives a short public talk on a software engineering topic of their choice. Examples include, but are not limited to, interesting new papers, seminal papers, personal research overviews, discussions of preliminary research ideas, tutorials, and small experiments.

On our YouTube playlist you can watch some of the past seminars. On the SI website you can find more details on the next seminar, the upcoming seminars, and an archive of the past speakers.

Contact

Software Institute

+41 58 666 46 90

decanato.inf@usi.ch
