Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

LLM Guard scored 0/8 detecting a Crescendo multi-turn attack. Arc Sentry flagged it at Turn 3.

Crescendo (Russinovich et al., USENIX Security 2025) is a multi-turn jailbreak that starts with innocent questions and gradually steers a...

Reddit - Artificial Intelligence · 1 min ·
Llms

Free LLM security audit

I built Arc Sentry, a pre-generation guardrail for open source LLMs that blocks prompt injection before the model generates a response. I...

Reddit - Artificial Intelligence · 1 min ·
Llms

You can decompose models into a graph database [N]

https://github.com/chrishayuk/larql https://youtu.be/8Ppw8254nLI?si=lo-6PM5pwnpyvwMXh Now you can decompose a static LLM and do a k...

Reddit - Machine Learning · 1 min ·

All Content

Llms

[2603.03292] From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

Abstract page for arXiv paper 2603.03292: From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

arXiv - AI · 4 min ·
Llms

[2603.03291] One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

Abstract page for arXiv paper 2603.03291: One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

arXiv - AI · 3 min ·
Llms

[2603.03290] AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

Abstract page for arXiv paper 2603.03290: AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

arXiv - Machine Learning · 4 min ·
Llms

[2603.04390] A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

Abstract page for arXiv paper 2603.04390: A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

arXiv - AI · 3 min ·
Llms

[2603.04191] Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

Abstract page for arXiv paper 2603.04191: Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized...

arXiv - AI · 3 min ·
Llms

[2603.04124] BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

Abstract page for arXiv paper 2603.04124: BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structure...

arXiv - Machine Learning · 4 min ·
Llms

[2603.03824] In-Context Environments Induce Evaluation-Awareness in Language Models

Abstract page for arXiv paper 2603.03824: In-Context Environments Induce Evaluation-Awareness in Language Models

arXiv - Machine Learning · 4 min ·
Llms

[2603.03761] AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

Abstract page for arXiv paper 2603.03761: AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

arXiv - AI · 4 min ·
Llms

[2603.03686] AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment

Abstract page for arXiv paper 2603.03686: AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Ali...

arXiv - AI · 4 min ·
Llms

[2603.03680] MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

Abstract page for arXiv paper 2603.03680: MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploita...

arXiv - AI · 4 min ·
Llms

[2603.03655] Mozi: Governed Autonomy for Drug Discovery LLM Agents

Abstract page for arXiv paper 2603.03655: Mozi: Governed Autonomy for Drug Discovery LLM Agents

arXiv - AI · 4 min ·
Llms

[P] Bypassing CoreML to natively train a 110M Transformer on the Apple Neural Engine (Orion)

It is hard to communicate how frustrating the current Apple ML stack is for low-level research. CoreML imposes opaque abstractions that p...

Reddit - Machine Learning · 1 min ·
Llms

[D] A mathematical proof from an anonymous Korean forum: The essence of Attention is fundamentally a d^2 problem, not n^2. (PDF included)

Hello, r/MachineLearning . I am just a regular user from a Korean AI community ("The Singularity Gallery"). I recently came across an ano...

Reddit - Machine Learning · 1 min ·
Llms

LLMs can unmask pseudonymous users at scale with surprising accuracy

So AI can now uncover your anonymous identity on social media, so creating burner accounts may be pointless. submitted by /u/_Dark_Wing [l...

Reddit - Artificial Intelligence · 1 min ·
Llms

Don’t Paste the Attorney’s Letter Into ChatGPT- What HOA Boards and Managers Need to Know About AI and Legal Privilege

Federal court rules that conversations with public AI chatbots are not protected by attorney-client privilege. By using AI, you may be st...

AI Tools & Products · 10 min ·
Llms

What I learned using Claude Sonnet to migrate Python to Rust

Using an AI coding assistant to migrate an application from one programming language to another wasn’t as easy as it looked. Here are thr...

AI Tools & Products ·
Llms

Lighthouse introduces ChatGPT app for direct hotel booking

AI Tools & Products ·
Llms

A “ChatGPT for spreadsheets” helps solve difficult engineering challenges faster

MIT researchers developed a computational approach that can be used to solve problems with hundreds of variables. In tests on realistic e...

AI News - General · 9 min ·
Llms

Anthropic essentially bans OpenClaw from Claude by making subscribers pay extra

Using OpenClaw with Claude AI is about to get a lot more expensive, thanks to Anthropic's new policy changes. Beginning April 4th at 3PM ...

The Verge - AI · 4 min ·
Llms

🚀 OllamaFX v0.5.0 now available!

Ollama FX is an open-source desktop interface for Ollama with major improvements in chat management, RAG, multimodality, and organization...

Reddit - Artificial Intelligence · 1 min ·