Large Language Models

GPT, Claude, Gemini, and other LLMs

This Week's Best | Monthly Best | Guide | Trending

Top This Week

Llms

What's your "When Language Model AI can do X, I'll be impressed"?

I have two at the top of my mind: When it can read musical notes. I will be mildly impressed when I can paste in a picture of musical not...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

Llms

Google’s Gemini AI can answer your questions with 3D models and simulations

Google's latest upgrade for Gemini will allow the chatbot to generate interactive 3D models and simulations in response to your questions...

The Verge - AI · 4 min · about 8 hours ago

Llms

Moody’s Integrates AI Agents With Anthropic’s Claude

AI Tools & Products · 4 min · about 8 hours ago

All Content

Llms

[2603.03303] HumanLM: Simulating Users with State Alignment Beats Response Imitation

Abstract page for arXiv paper 2603.03303: HumanLM: Simulating Users with State Alignment Beats Response Imitation

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03301] From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings

Abstract page for arXiv paper 2603.03301: From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings

arXiv - Machine Learning · 3 min · about 1 month ago

Llms

[2603.03298] TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

Abstract page for arXiv paper 2603.03298: TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03297] TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

Abstract page for arXiv paper 2603.03297: TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.03296] PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

Abstract page for arXiv paper 2603.03296: PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03295] Language Model Goal Selection Differs from Humans' in an Open-Ended Task

Abstract page for arXiv paper 2603.03295: Language Model Goal Selection Differs from Humans' in an Open-Ended Task

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.03294] Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

Abstract page for arXiv paper 2603.03294: Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.03292] From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

Abstract page for arXiv paper 2603.03292: From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03291] One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

Abstract page for arXiv paper 2603.03291: One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.03290] AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

Abstract page for arXiv paper 2603.03290: AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.04390] A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

Abstract page for arXiv paper 2603.04390: A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.04191] Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

Abstract page for arXiv paper 2603.04191: Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized...

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.04124] BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

Abstract page for arXiv paper 2603.04124: BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structure...

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.03824] In-Context Environments Induce Evaluation-Awareness in Language Models

Abstract page for arXiv paper 2603.03824: In-Context Environments Induce Evaluation-Awareness in Language Models

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.03761] AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

Abstract page for arXiv paper 2603.03761: AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03686] AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment

Abstract page for arXiv paper 2603.03686: AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Ali...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03680] MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

Abstract page for arXiv paper 2603.03680: MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploita...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03655] Mozi: Governed Autonomy for Drug Discovery LLM Agents

Abstract page for arXiv paper 2603.03655: Mozi: Governed Autonomy for Drug Discovery LLM Agents

arXiv - AI · 4 min · about 1 month ago

Llms

[P] Bypassing CoreML to natively train a 110M Transformer on the Apple Neural Engine (Orion)

It is hard to communicate how frustrating the current Apple ML stack is for low-level research. CoreML imposes opaque abstractions that p...

Reddit - Machine Learning · 1 min · about 1 month ago

Llms

[D] A mathematical proof from an anonymous Korean forum: The essence of Attention is fundamentally a d^2 problem, not n^2. (PDF included)

Hello, r/MachineLearning . I am just a regular user from a Korean AI community ("The Singularity Gallery"). I recently came across an ano...

Reddit - Machine Learning · 1 min · about 1 month ago

Previous Page 143 Next

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Subscribe to Newsletter

Daily or weekly digest • Unsubscribe anytime

Large Language Models

Top This Week

What's your "When Language Model AI can do X, I'll be impressed"?

Google’s Gemini AI can answer your questions with 3D models and simulations

Moody’s Integrates AI Agents With Anthropic’s Claude

All Content

[2603.03303] HumanLM: Simulating Users with State Alignment Beats Response Imitation

[2603.03301] From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings

[2603.03298] TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

[2603.03297] TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

[2603.03296] PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

[2603.03295] Language Model Goal Selection Differs from Humans' in an Open-Ended Task

[2603.03294] Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

[2603.03292] From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

[2603.03291] One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

[2603.03290] AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

[2603.04390] A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

[2603.04191] Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

[2603.04124] BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

[2603.03824] In-Context Environments Induce Evaluation-Awareness in Language Models

[2603.03761] AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

[2603.03686] AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment

[2603.03680] MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

[2603.03655] Mozi: Governed Autonomy for Drug Discovery LLM Agents

[P] Bypassing CoreML to natively train a 110M Transformer on the Apple Neural Engine (Orion)

[D] A mathematical proof from an anonymous Korean forum: The essence of Attention is fundamentally a d^2 problem, not n^2. (PDF included)

Related Topics

Stay updated with AI News