Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

LLMs

LLM Guard scored 0/8 detecting a Crescendo multi-turn attack. Arc Sentry flagged it at Turn 3.

Crescendo (Russinovich et al., USENIX Security 2025) is a multi-turn jailbreak that starts with innocent questions and gradually steers a...

Reddit - Artificial Intelligence · 1 min · 17 minutes ago

LLMs

Free LLM security audit

I built Arc Sentry, a pre-generation guardrail for open source LLMs that blocks prompt injection before the model generates a response. I...

Reddit - Artificial Intelligence · 1 min · 17 minutes ago

LLMs

You can decompose models into a graph database [N]

https://github.com/chrishayuk/larql https://youtu.be/8Ppw8254nLI?si=lo-6PM5pwnpyvwMXh Now you can decompose a static LLM model and do a k...

Reddit - Machine Learning · 1 min · about 1 hour ago

All Content

LLMs

[2603.03292] From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

Abstract page for arXiv paper 2603.03292: From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03291] One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

Abstract page for arXiv paper 2603.03291: One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

arXiv - AI · 3 min · about 1 month ago

LLMs

[2603.03290] AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

Abstract page for arXiv paper 2603.03290: AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2603.04390] A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

Abstract page for arXiv paper 2603.04390: A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

arXiv - AI · 3 min · about 1 month ago

LLMs

[2603.04191] Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

Abstract page for arXiv paper 2603.04191: Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2603.04124] BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

Abstract page for arXiv paper 2603.04124: BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structure...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2603.03824] In-Context Environments Induce Evaluation-Awareness in Language Models

Abstract page for arXiv paper 2603.03824: In-Context Environments Induce Evaluation-Awareness in Language Models

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2603.03761] AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

Abstract page for arXiv paper 2603.03761: AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03686] AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment

Abstract page for arXiv paper 2603.03686: AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Ali...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03680] MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

Abstract page for arXiv paper 2603.03680: MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploita...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03655] Mozi: Governed Autonomy for Drug Discovery LLM Agents

Abstract page for arXiv paper 2603.03655: Mozi: Governed Autonomy for Drug Discovery LLM Agents

arXiv - AI · 4 min · about 1 month ago

LLMs

[P] Bypassing CoreML to natively train a 110M Transformer on the Apple Neural Engine (Orion)

It is hard to communicate how frustrating the current Apple ML stack is for low-level research. CoreML imposes opaque abstractions that p...

Reddit - Machine Learning · 1 min · about 1 month ago

LLMs

[D] A mathematical proof from an anonymous Korean forum: The essence of Attention is fundamentally a d^2 problem, not n^2. (PDF included)

Hello, r/MachineLearning . I am just a regular user from a Korean AI community ("The Singularity Gallery"). I recently came across an ano...

Reddit - Machine Learning · 1 min · about 1 month ago

LLMs

LLMs can unmask pseudonymous users at scale with surprising accuracy

AI can now uncover your anonymous identity on social media, so creating burner accounts may be pointless. submitted by /u/_Dark_Wing [l...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

LLMs

Don’t Paste the Attorney’s Letter Into ChatGPT- What HOA Boards and Managers Need to Know About AI and Legal Privilege

Federal court rules that conversations with public AI chatbots are not protected by attorney-client privilege. By using AI, you may be st...

AI Tools & Products · 10 min · about 1 month ago

LLMs

What I learned using Claude Sonnet to migrate Python to Rust

Using an AI coding assistant to migrate an application from one programming language to another wasn’t as easy as it looked. Here are thr...

AI Tools & Products · about 1 month ago

LLMs

Lighthouse introduces ChatGPT app for direct hotel booking

AI Tools & Products · about 1 month ago

LLMs

A “ChatGPT for spreadsheets” helps solve difficult engineering challenges faster

MIT researchers developed a computational approach that can be used to solve problems with hundreds of variables. In tests on realistic e...

AI News - General · 9 min · about 1 month ago

LLMs

Anthropic essentially bans OpenClaw from Claude by making subscribers pay extra

Using OpenClaw with Claude AI is about to get a lot more expensive, thanks to Anthropic's new policy changes. Beginning April 4th at 3PM ...

The Verge - AI · 4 min · about 1 month ago

LLMs

🚀 OllamaFX v0.5.0 now available!

Ollama FX is an open source desktop interface for Ollama with major improvements in chat management, RAG, multimodality, and organiz...

Reddit - Artificial Intelligence · 1 min · about 1 month ago
