[2603.00186] RLShield: Practical Multi-Agent RL for Financial Cyber Defense with Attack-Surface MDPs and Real-Time Response Orchestration

arXiv:2603.00186 (cs), Computer Science > Cryptography and Security
Submitted on 26 Feb 2026

Title: RLShield: Practical Multi-Agent RL for Financial Cyber Defense with Attack-Surface MDPs and Real-Time Response Orchestration

Authors: Srikumar Nayak

Abstract: Financial systems run nonstop and must stay reliable even during cyber incidents. Modern attacks move across many services (apps, APIs, identity, payment rails), so defenders must make a sequence of actions under time pressure. Most security tools still use fixed rules or static playbooks, which can be slow to adapt when the attacker changes behavior. Reinforcement learning (RL) is a good fit for sequential decisions, but much of the RL-in-finance literature targets trading and does not model real cyber response limits such as action cost, service disruption, and defender coordination across many assets.

This paper proposes RLShield, a practical multi-agent RL pipeline for financial cyber defense. We model the enterprise attack surface as a Markov decision process (MDP) where states summarize alerts, asset exposure, and service health, and actions represent real response steps (e.g., isolate a host, rotate credentials, ratelimit an API, block an account, or trigger recovery). RLShield learns coordinated policies across m...
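
To make the MDP framing in the abstract concrete, here is a minimal single-asset sketch in Python. It is an illustration only, not the paper's implementation: the abstract names the state components (alerts, asset exposure, service health) and the response steps, but the `AssetState` fields, the `NO_OP` action, the `ACTION_EFFECTS` numbers, the weights, and the reward shape are all assumptions made up for this example. It also does not attempt the multi-agent coordination the paper describes.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Response steps listed in the abstract. NO_OP is an added assumption.
    NO_OP = "no_op"
    ISOLATE_HOST = "isolate_host"
    ROTATE_CREDENTIALS = "rotate_credentials"
    RATE_LIMIT_API = "rate_limit_api"
    BLOCK_ACCOUNT = "block_account"
    TRIGGER_RECOVERY = "trigger_recovery"


@dataclass(frozen=True)
class AssetState:
    # Hypothetical encoding of the three state components, each in [0, 1].
    alert_level: float     # summarized alert severity
    exposure: float        # how reachable the asset is to an attacker
    service_health: float  # 1.0 means fully healthy


# Hypothetical per-action (cost, disruption, risk_reduction) values.
# These numbers are illustrative and do not come from the paper.
ACTION_EFFECTS = {
    Action.NO_OP: (0.0, 0.0, 0.0),
    Action.ISOLATE_HOST: (0.3, 0.6, 0.9),
    Action.ROTATE_CREDENTIALS: (0.2, 0.2, 0.5),
    Action.RATE_LIMIT_API: (0.1, 0.3, 0.4),
    Action.BLOCK_ACCOUNT: (0.1, 0.2, 0.5),
    Action.TRIGGER_RECOVERY: (0.5, 0.4, 0.7),
}


def reward(state: AssetState, action: Action,
           w_cost: float = 0.5, w_disrupt: float = 0.5) -> float:
    """One-step reward: penalize residual risk, action cost, and disruption."""
    cost, disruption, risk_reduction = ACTION_EFFECTS[action]
    risk = state.alert_level * state.exposure
    residual_risk = risk * (1.0 - risk_reduction)
    # Assumption: disruption hurts more when the service is already degraded.
    disruption_penalty = w_disrupt * disruption * (2.0 - state.service_health)
    return -(residual_risk + w_cost * cost + disruption_penalty)


def greedy_action(state: AssetState) -> Action:
    """Pick the action with the highest one-step reward (a baseline, not RL)."""
    return max(Action, key=lambda a: reward(state, a))


calm = AssetState(alert_level=0.1, exposure=0.1, service_health=1.0)
under_attack = AssetState(alert_level=1.0, exposure=1.0, service_health=1.0)
print(greedy_action(calm).value, greedy_action(under_attack).value)
```

The greedy rule here only shows how action cost and service disruption can enter the objective alongside risk. A learned policy would instead optimize the long-run return over a sequence of such steps.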

Originally published on March 03, 2026. Curated by AI News.
