Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

What's your "When Language Model AI can do X, I'll be impressed"?

I have two at the top of my mind: When it can read musical notes. I will be mildly impressed when I can paste in a picture of musical not...

Reddit - Artificial Intelligence · 1 min ·
Google’s Gemini AI can answer your questions with 3D models and simulations

Google's latest upgrade for Gemini will allow the chatbot to generate interactive 3D models and simulations in response to your questions...

The Verge - AI · 4 min ·
Moody’s Integrates AI Agents With Anthropic’s Claude

AI Tools & Products · 4 min ·

All Content

[2603.03303] HumanLM: Simulating Users with State Alignment Beats Response Imitation

Abstract page for arXiv paper 2603.03303: HumanLM: Simulating Users with State Alignment Beats Response Imitation

arXiv - AI · 4 min ·
[2603.03301] From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings

Abstract page for arXiv paper 2603.03301: From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings

arXiv - Machine Learning · 3 min ·
[2603.03298] TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

Abstract page for arXiv paper 2603.03298: TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation

arXiv - AI · 4 min ·
[2603.03297] TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

Abstract page for arXiv paper 2603.03297: TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement

arXiv - Machine Learning · 4 min ·
[2603.03296] PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

Abstract page for arXiv paper 2603.03296: PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents

arXiv - AI · 4 min ·
[2603.03295] Language Model Goal Selection Differs from Humans' in an Open-Ended Task

Abstract page for arXiv paper 2603.03295: Language Model Goal Selection Differs from Humans' in an Open-Ended Task

arXiv - AI · 3 min ·
[2603.03294] Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

Abstract page for arXiv paper 2603.03294: Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory

arXiv - Machine Learning · 4 min ·
[2603.03292] From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

Abstract page for arXiv paper 2603.03292: From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG

arXiv - AI · 4 min ·
[2603.03291] One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

Abstract page for arXiv paper 2603.03291: One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models

arXiv - AI · 3 min ·
[2603.03290] AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

Abstract page for arXiv paper 2603.03290: AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents

arXiv - Machine Learning · 4 min ·
[2603.04390] A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

Abstract page for arXiv paper 2603.04390: A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

arXiv - AI · 3 min ·
[2603.04191] Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

Abstract page for arXiv paper 2603.04191: Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized...

arXiv - AI · 3 min ·
[2603.04124] BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structured Beam Mechanics Reasoning

Abstract page for arXiv paper 2603.04124: BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structure...

arXiv - Machine Learning · 4 min ·
[2603.03824] In-Context Environments Induce Evaluation-Awareness in Language Models

Abstract page for arXiv paper 2603.03824: In-Context Environments Induce Evaluation-Awareness in Language Models

arXiv - Machine Learning · 4 min ·
[2603.03761] AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

Abstract page for arXiv paper 2603.03761: AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation

arXiv - AI · 4 min ·
[2603.03686] AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment

Abstract page for arXiv paper 2603.03686: AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Ali...

arXiv - AI · 4 min ·
[2603.03680] MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

Abstract page for arXiv paper 2603.03680: MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploita...

arXiv - AI · 4 min ·
[2603.03655] Mozi: Governed Autonomy for Drug Discovery LLM Agents

Abstract page for arXiv paper 2603.03655: Mozi: Governed Autonomy for Drug Discovery LLM Agents

arXiv - AI · 4 min ·
[P] Bypassing CoreML to natively train a 110M Transformer on the Apple Neural Engine (Orion)

It is hard to communicate how frustrating the current Apple ML stack is for low-level research. CoreML imposes opaque abstractions that p...

Reddit - Machine Learning · 1 min ·
[D] A mathematical proof from an anonymous Korean forum: The essence of Attention is fundamentally a d^2 problem, not n^2. (PDF included)

Hello, r/MachineLearning . I am just a regular user from a Korean AI community ("The Singularity Gallery"). I recently came across an ano...

Reddit - Machine Learning · 1 min ·