A USI project selected among the Leading House MENA's Research Partnership Grants 2025

Institutional Communication Service

11 December 2025

A project proposed by the Faculty of Informatics at USI has been selected by the Leading House for the Middle East and North Africa (LHMENA) as one of 13 international projects funded under the Research Partnership Grants 2025. The selection was made from over 150 applications involving 15 countries and 36 Swiss institutions, an acceptance rate of just 8.6%. The six-month grant will support a scientific collaboration with the Cyber Security Group at NYU Abu Dhabi on a critical emerging topic: how the linguistic styles of artificial intelligence models can subtly convey deception, misinformation, or manipulation.

The selected project, "Tracing Deception and Misinformation Heuristics in LLMs via Causal Neural Circuit Discovery", was presented by PhD assistant Francesco Sovrano and involves Michele Guerra and full professor Marc Langheinrich from USI, together with Prof. Christina Pöpper from NYU Abu Dhabi.

The starting point for the research is a crucial question: in large language models (LLMs), what are the heuristics – i.e. learned cognitive shortcuts – that govern response styles, and where in the model do they reside? The project stems from evidence that LLMs influence users not only through content but also through the way they present it: tone and style can emphasise, simplify, overload, or guide the reader, regardless of factual accuracy. Precisely because style is not a "lie" but a form, it often escapes traditional fact-checking tools.

The initiative aims to adapt recent causal interpretability techniques, initially developed for arithmetic models, to the study of natural language in two languages, English and Arabic, in order to identify the neural circuits that determine specific deceptive styles. The project also involves defining a Deceptive Style Index (DSI), useful for cross-language comparisons and for making risks and vulnerabilities measurable.

In addition to the methodological aspect, the research addresses relevant security implications. Malicious actors can manipulate styles through prompt design, obtaining more persuasive outputs in contexts such as phishing, one-sided propaganda or "information dumping". The risk is particularly acute in languages that are under-represented in datasets, such as Arabic, where data asymmetry can lead to less stable and more easily manipulated models.

The work programme includes:

– the construction of a multilingual dataset with and without stylistic manipulations;

– the discovery and causal validation of the circuits that control these styles;

– crowd-sourcing studies to identify visible clues and potential heuristics;

– selective circuit attenuation tests to assess whether deceptive style can be reduced while preserving the model's capabilities.

All materials – datasets, metrics, code and documentation – will be released in open access, supporting lasting collaboration between Switzerland and the United Arab Emirates (UAE) and a transparent and reproducible research ecosystem.

According to the Leading House MENA, the high number and quality of proposals received represent "a strong signal of the creativity and strength of the global scientific community". The inclusion of the USI project among those funded confirms the international relevance of the university's expertise in AI safety, cybersecurity, and model interpretability.