[2603.00285] TraderBench: How Robust Are AI Agents in Adversarial

[2603.00285] TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?

arXiv - AI March 03, 2026 3 min read

About this article

Abstract page for arXiv paper 2603.00285: TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?

Computer Science > Artificial Intelligence arXiv:2603.00285 (cs) [Submitted on 27 Feb 2026] Title:TraderBench: How Robust Are AI Agents in Adversarial Capital Markets? Authors:Xiaochuang Yuan, Hui Xu, Silvia Xu, Cui Zou, Jing Xiong View a PDF of the paper titled TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?, by Xiaochuang Yuan and 4 other authors View PDF HTML (experimental) Abstract:Evaluating AI agents in finance faces two key challenges: static benchmarks require costly expert annotation yet miss the dynamic decision-making central to real-world trading, while LLM-based judges introduce uncontrolled variance on domain-specific tasks. We introduce TraderBench, a benchmark that addresses both issues. It combines expert-verified static tasks (knowledge retrieval, analytical reasoning) with adversarial trading simulations scored purely on realized performance-Sharpe ratio, returns, and drawdown-eliminating judge variance entirely. The framework features two novel tracks: crypto trading with four progressive market-manipulation transforms, and options derivatives scoring across P&L accuracy, Greeks, and risk management. Trading scenarios can be refreshed with new market data to prevent benchmark contamination. Evaluating 13 models (8B open-source to frontier) on ~50 tasks, we find: (1) 8 of 13 models score ~33 on crypto with <1-point variation across adversarial conditions, exposing fixed non-adaptive strategies; (2) extended thinking helps retrieval ...

Originally published on March 03, 2026. Curated by AI News.

Llms

I Accidentally Discovered a Security Vulnerability in AI Education — Then Submitted It To a $200K Competition

Last night I was testing Maestro University, the first fully AI-taught university. I walked into their enrollment chatbot and asked it to...

Reddit - Artificial Intelligence · 1 min · 4 minutes ago

Llms

Is anyone else concerned with this blatant potential of security / privacy breach?

Recently, when sending a very sensitive email to my brother including my mother’s health information, I wondered what happens if a recipi...

Reddit - Artificial Intelligence · 1 min · 4 minutes ago

Llms

An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

https://shapingrooms.com/research I published a paper today on something I've been calling postural manipulation. The short version: ordi...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

Llms

[R] An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

https://shapingrooms.com/research I've been documenting what I'm calling postural manipulation: a specific class of language that install...

Reddit - Machine Learning · 1 min · about 3 hours ago

[2603.00285] TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?

About this article

Related Articles

I Accidentally Discovered a Security Vulnerability in AI Education — Then Submitted It To a $200K Competition

Is anyone else concerned with this blatant potential of security / privacy breach?

An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

[R] An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

No comments

Stay updated with AI News