What's your "When Language Model AI can do X, I'll be impressed"?
I have two at the top of my mind: When it can read musical notes. I will be mildly impressed when I can paste in a picture of musical not...
GPT, Claude, Gemini, and other LLMs
I have two at the top of my mind: When it can read musical notes. I will be mildly impressed when I can paste in a picture of musical not...
Google's latest upgrade for Gemini will allow the chatbot to generate interactive 3D models and simulations in response to your questions...
Abstract page for arXiv paper 2603.03303: HumanLM: Simulating Users with State Alignment Beats Response Imitation
Abstract page for arXiv paper 2603.03301: From Exact Hits to Close Enough: Semantic Caching for LLM Embeddings
Abstract page for arXiv paper 2603.03298: TATRA: Training-Free Instance-Adaptive Prompting Through Rephrasing and Aggregation
Abstract page for arXiv paper 2603.03297: TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement
Abstract page for arXiv paper 2603.03296: PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents
Abstract page for arXiv paper 2603.03295: Language Model Goal Selection Differs from Humans' in an Open-Ended Task
Abstract page for arXiv paper 2603.03294: Fine-Tuning and Evaluating Conversational AI for Agricultural Advisory
Abstract page for arXiv paper 2603.03292: From Conflict to Consensus: Boosting Medical Reasoning via Multi-Round Agentic RAG
Abstract page for arXiv paper 2603.03291: One Bias After Another: Mechanistic Reward Shaping and Persistent Biases in Language Reward Models
Abstract page for arXiv paper 2603.03290: AriadneMem: Threading the Maze of Lifelong Memory for LLM Agents
Abstract page for arXiv paper 2603.04390: A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development
Abstract page for arXiv paper 2603.04191: Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized...
Abstract page for arXiv paper 2603.04124: BeamPERL: Parameter-Efficient RL with Verifiable Rewards Specializes Compact LLMs for Structure...
Abstract page for arXiv paper 2603.03824: In-Context Environments Induce Evaluation-Awareness in Language Models
Abstract page for arXiv paper 2603.03761: AgentSelect: Benchmark for Narrative Query-to-Agent Recommendation
Abstract page for arXiv paper 2603.03686: AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Ali...
Abstract page for arXiv paper 2603.03680: MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploita...
Abstract page for arXiv paper 2603.03655: Mozi: Governed Autonomy for Drug Discovery LLM Agents
It is hard to communicate how frustrating the current Apple ML stack is for low-level research. CoreML imposes opaque abstractions that p...
Hello, r/MachineLearning . I am just a regular user from a Korean AI community ("The Singularity Gallery"). I recently came across an ano...
Get the latest news, tools, and insights delivered to your inbox.
Daily or weekly digest • Unsubscribe anytime