[2602.19582] Advantage-based Temporal Attack in Reinforcement Learning
Summary
This article presents the Advantage-based Adversarial Transformer (AAT), a novel method for generating time-correlated adversarial examples in reinforcement learning, enhancing attack performance against DRL models.
Why It Matters
As reinforcement learning systems become more prevalent, understanding their vulnerabilities to adversarial attacks is crucial. This research addresses a significant gap in existing methods by improving the temporal correlation of adversarial perturbations, which in turn informs the design of more robust agents and defenses.
Key Takeaways
- AAT improves adversarial example generation by capturing temporal dependencies.
- The method uses a multi-scale causal self-attention mechanism for better correlation.
- AAT's performance surpasses existing adversarial attack baselines in various environments.
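To make the multi-scale causal self-attention idea concrete, here is a minimal numpy sketch. It is not the paper's implementation: the window sizes, the averaging across scales, and all function names are illustrative assumptions. The key properties it demonstrates are causality (each time step attends only to past and current steps) and multiple history scales (short and long attention windows combined).

```python
import numpy as np

def causal_attention(q, k, v, window):
    """Single-scale causal self-attention: step t attends only to the
    last `window` steps up to and including t (no future leakage)."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)  # (T, T) attention logits
    idx = np.arange(T)
    # Mask future steps and steps older than the attention window.
    mask = (idx[None, :] > idx[:, None]) | (idx[:, None] - idx[None, :] >= window)
    scores = np.where(mask, -np.inf, scores)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

def multi_scale_causal_attention(x, windows=(1, 4, 16)):
    """Hypothetical MSCSA-style sketch: run causal self-attention at several
    history scales and average, mixing short- and long-range context.
    (The actual AAT mechanism may combine scales differently.)"""
    outs = [causal_attention(x, x, x, w) for w in windows]
    return np.mean(outs, axis=0)

# Toy usage: 8 time steps of 4-dimensional state features.
x = np.random.default_rng(0).normal(size=(8, 4))
y = multi_scale_causal_attention(x)
```

Because every scale is causally masked, the output at step 0 depends only on the step-0 input, which is the property that lets such a module be applied online during an attack.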
Computer Science > Machine Learning
arXiv:2602.19582 (cs) [Submitted on 23 Feb 2026]
Title: Advantage-based Temporal Attack in Reinforcement Learning
Authors: Shenghong He
Abstract: Extensive research demonstrates that Deep Reinforcement Learning (DRL) models are susceptible to adversarially constructed inputs (i.e., adversarial examples), which can mislead the agent into taking suboptimal or unsafe actions. Recent methods improve attack effectiveness by leveraging future rewards to guide adversarial perturbation generation over sequential time steps (i.e., reward-based attacks). However, these methods are unable to capture dependencies between different time steps in the perturbation generation process, resulting in a weak temporal correlation between the current perturbation and previous ones. In this paper, we propose a novel method called Advantage-based Adversarial Transformer (AAT), which can generate adversarial examples with stronger temporal correlations (i.e., time-correlated adversarial examples) to improve the attack performance. AAT employs a multi-scale causal self-attention (MSCSA) mechanism to dynamically capture dependencies between historical information from different time periods and the current state, thus enhancing the correlation between the current perturbation and the previous perturbation. Moreover, AAT introduces a ...
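The abstract describes guiding perturbations by an advantage-style signal while keeping consecutive perturbations temporally correlated. Since the full method is not given here, the following is only a minimal sketch of that general idea: a sign-gradient step against the advantage, blended with the previous step's perturbation to induce temporal correlation. The function name, the momentum blending, and all parameters are assumptions, not the paper's algorithm.

```python
import numpy as np

def advantage_guided_perturbation(state, grad_advantage, prev_delta=None,
                                  eps=0.05, momentum=0.8):
    """Sketch of an advantage-guided, temporally correlated attack step.

    state          -- current observation (numpy array)
    grad_advantage -- gradient of the agent's advantage w.r.t. the state
    prev_delta     -- previous perturbation, used to correlate steps
    Returns the perturbed state and the perturbation applied.
    """
    # Step against the advantage: push the agent toward low-advantage actions.
    delta = -eps * np.sign(grad_advantage)
    if prev_delta is not None:
        # Blend with the previous perturbation for temporal correlation.
        delta = momentum * prev_delta + (1.0 - momentum) * delta
    delta = np.clip(delta, -eps, eps)  # keep within the epsilon-ball
    return state + delta, delta

# Toy usage over two consecutive steps with made-up gradients.
s0 = np.zeros(3)
p0, d0 = advantage_guided_perturbation(s0, np.array([1.0, -2.0, 0.5]))
p1, d1 = advantage_guided_perturbation(s0, np.array([-1.0, 1.0, 1.0]), prev_delta=d0)
```

AAT replaces the fixed momentum blend above with a learned transformer over the perturbation history, which is what lets it capture dependencies across many time scales rather than just the immediately preceding step.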