Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

LLMs

Gemini caught a $280M crypto exploit before it hit the news, then retracted it as a hallucination because it couldn't verify it - because the news hadn't dropped yet

So this happened mere hours ago and I feel like I genuinely stumbled onto something worth documenting for people interested in AI behavio...

Reddit - Artificial Intelligence · 1 min

LLMs

GPT-4 vs Claude vs Gemini for coding — honest breakdown after 3 months of daily use

I am a solo developer who has been using all three seriously. Here is what I actually think: GPT-4o — Strengths: Large context window, st...

Reddit - Artificial Intelligence · 1 min

LLMs

You're giving feedback on a new version of ChatGPT

So I will be paying attention to these system messages more now - the last time I got one of these, not so long back, the 'tone' changed to ...

Reddit - Artificial Intelligence · 1 min

All Content

LLMs

[2603.03378] AOI: Turning Failed Trajectories into Training Signals for Autonomous Cloud Diagnosis

arXiv - Machine Learning · 4 min

LLMs

[2603.03318] Quantum-Inspired Self-Attention in a Large Language Model

arXiv - AI · 3 min

LLMs

[2603.03314] Towards Self-Robust LLMs: Intrinsic Prompt Noise Resistance via CoIPO

arXiv - Machine Learning · 4 min

LLMs

[2603.03313] How does fine-tuning improve sensorimotor representations in large language models?

arXiv - AI · 3 min

LLMs

[2603.03308] Old Habits Die Hard: How Conversational History Geometrically Traps LLMs

arXiv - AI · 3 min
LLMs

[2603.03306] Token-Oriented Object Notation vs JSON: A Benchmark of Plain and Constrained Decoding Generation

arXiv - AI · 4 min

LLMs

[2603.03305] Draft-Conditioned Constrained Decoding for Structured Generation in LLMs

arXiv - Machine Learning · 3 min

LLMs

[2603.03303] HumanLM: Simulating Users with State Alignment Beats Response Imitation

arXiv - AI · 4 min

LLMs

[2603.03301] From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings

arXiv - Machine Learning · 3 min

LLMs

[2603.03298] TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

arXiv - AI · 4 min
LLMs

[2603.03297] TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

arXiv - Machine Learning · 4 min

LLMs

[2603.03296] PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

arXiv - AI · 4 min

LLMs

[2603.03295] Language Model Goal Selection Differs from Humans' in an Open-Ended Task

arXiv - AI · 3 min

LLMs

[2603.03294] Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

arXiv - Machine Learning · 4 min

LLMs

[2603.03292] From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

arXiv - AI · 4 min
LLMs

[2603.03291] One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

arXiv - AI · 3 min

LLMs

[2603.03290] AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

arXiv - Machine Learning · 4 min

LLMs

[2603.04390] A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

arXiv - AI · 3 min

LLMs

[2603.04191] Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

arXiv - AI · 3 min

LLMs

[2603.04124] BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

arXiv - Machine Learning · 4 min