[2603.15017] Consequentialist Objectives and Catastrophe

Computer Science > Artificial Intelligence
arXiv:2603.15017 (cs)
[Submitted on 16 Mar 2026 (v1), last revised 26 Mar 2026 (this version, v2)]

Title: Consequentialist Objectives and Catastrophe
Authors: Henrik Marklund, Alex Infanger, Benjamin Van Roy

Abstract: Because human preferences are too complex to codify, AIs operate with misspecified objectives. Optimizing such objectives often produces undesirable outcomes; this phenomenon is known as reward hacking. Such outcomes are not necessarily catastrophic. Indeed, most examples of reward hacking in previous literature are benign. And typically, objectives can be modified to resolve the issue. We study the prospect of catastrophic outcomes induced by AIs operating in complex environments. We argue that, when capabilities are sufficiently advanced, pursuing a fixed consequentialist objective tends to result in catastrophic outcomes. We formalize this by establishing conditions that provably lead to such outcomes. Under these conditions, simple or random behavior is safe. Catastrophic risk arises due to extraordinary competence rather than incompetence. With a fixed consequentialist objective, avoiding catastrophe requires constraining AI capabilities. In fact, constraining capabilities the right amount not only averts catastrophe but yields valuable outcomes. Our results apply to any ob...

Originally published on March 27, 2026. Curated by AI News.
