[2602.19582] Advantage-based Temporal Attack in Reinforcement Learning
Summary
This article presents the Advantage-based Adversarial Transformer (AAT), a novel method for generating time-correlated adversarial examples in reinforcement learning, enhancing attack performance against DRL models.
Why It Matters
As reinforcement learning systems become more prevalent, understanding their vulnerabilities to adversarial attacks is crucial. This research addresses a significant gap in existing methods by improving the temporal correlation of adversarial perturbations, which in turn informs the design of more robust agents and defenses.
Key Takeaways
- AAT improves adversarial example generation by capturing temporal dependencies.
- The method uses a multi-scale causal self-attention mechanism for better correlation.
- AAT's performance surpasses existing adversarial attack baselines in various environments.
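To make the multi-scale causal self-attention idea concrete, here is a minimal numpy sketch. It is not the paper's implementation: the window sizes, the averaging across scales, and all function names are illustrative assumptions. The key properties it demonstrates are causality (each time step attends only to past and current steps) and multiple history scales (short and long attention windows combined).

```python
import numpy as np

def causal_attention(q, k, v, window):
    """Single-scale causal self-attention: step t attends only to the
    last `window` steps up to and including t (no future leakage)."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)  # (T, T) attention logits
    idx = np.arange(T)
    # Mask future steps and steps older than the attention window.
    mask = (idx[None, :] > idx[:, None]) | (idx[:, None] - idx[None, :] >= window)
    scores = np.where(mask, -np.inf, scores)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

def multi_scale_causal_attention(x, windows=(1, 4, 16)):
    """Hypothetical MSCSA-style sketch: run causal self-attention at several
    history scales and average, mixing short- and long-range context.
    (The actual AAT mechanism may combine scales differently.)"""
    outs = [causal_attention(x, x, x, w) for w in windows]
    return np.mean(outs, axis=0)

# Toy usage: 8 time steps of 4-dimensional state features.
x = np.random.default_rng(0).normal(size=(8, 4))
y = multi_scale_causal_attention(x)
```

Because every scale is causally masked, the output at step 0 depends only on the step-0 input, which is the property that lets such a module be applied online during an attack.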
Computer Science > Machine Learning
arXiv:2602.19582 (cs) [Submitted on 23 Feb 2026]
Title: Advantage-based Temporal Attack in Reinforcement Learning
Authors: Shenghong He
Abstract: Extensive research demonstrates that Deep Reinforcement Learning (DRL) models are susceptible to adversarially constructed inputs (i.e., adversarial examples), which can mislead the agent into taking suboptimal or unsafe actions. Recent methods improve attack effectiveness by leveraging future rewards to guide adversarial perturbation generation over sequential time steps (i.e., reward-based attacks). However, these methods are unable to capture dependencies between different time steps in the perturbation generation process, resulting in a weak temporal correlation between the current perturbation and previous ones. In this paper, we propose a novel method called Advantage-based Adversarial Transformer (AAT), which can generate adversarial examples with stronger temporal correlations (i.e., time-correlated adversarial examples) to improve the attack performance. AAT employs a multi-scale causal self-attention (MSCSA) mechanism to dynamically capture dependencies between historical information from different time periods and the current state, thus enhancing the correlation between the current perturbation and the previous perturbation. Moreover, AAT introduces a ...
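The abstract describes guiding perturbations by an advantage-style signal while keeping consecutive perturbations temporally correlated. Since the full method is not given here, the following is only a minimal sketch of that general idea: a sign-gradient step against the advantage, blended with the previous step's perturbation to induce temporal correlation. The function name, the momentum blending, and all parameters are assumptions, not the paper's algorithm.

```python
import numpy as np

def advantage_guided_perturbation(state, grad_advantage, prev_delta=None,
                                  eps=0.05, momentum=0.8):
    """Sketch of an advantage-guided, temporally correlated attack step.

    state          -- current observation (numpy array)
    grad_advantage -- gradient of the agent's advantage w.r.t. the state
    prev_delta     -- previous perturbation, used to correlate steps
    Returns the perturbed state and the perturbation applied.
    """
    # Step against the advantage: push the agent toward low-advantage actions.
    delta = -eps * np.sign(grad_advantage)
    if prev_delta is not None:
        # Blend with the previous perturbation for temporal correlation.
        delta = momentum * prev_delta + (1.0 - momentum) * delta
    delta = np.clip(delta, -eps, eps)  # keep within the epsilon-ball
    return state + delta, delta

# Toy usage over two consecutive steps with made-up gradients.
s0 = np.zeros(3)
p0, d0 = advantage_guided_perturbation(s0, np.array([1.0, -2.0, 0.5]))
p1, d1 = advantage_guided_perturbation(s0, np.array([-1.0, 1.0, 1.0]), prev_delta=d0)
```

AAT replaces the fixed momentum blend above with a learned transformer over the perturbation history, which is what lets it capture dependencies across many time scales rather than just the immediately preceding step.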