Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

Llms

When Robots Have Their ChatGPT Moment, Remember These Pincers | WIRED

From sorting chicken nuggets to screwing in light bulbs, Eka’s robots are eerily lifelike. But do they have real physical smarts?

Wired - AI · 13 min · about 2 hours ago

Llms

87% Cost Savings & Sub-3s Latency: I built a "Warm-Cache" harness for persistent Claude agents.

**The "Goldfish Problem" is expensive. I decided to fix the plumbing.** Most Claude implementations leave 90% of their money on the table...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

Llms

What are people using for low-latency autocomplete in production? [P]

I’ve been looking into autocomplete/typeahead systems recently, especially in contexts where latency really matters (e.g. search-as-you-t...

Reddit - Machine Learning · 1 min · about 3 hours ago

All Content

Llms

[2511.03441] CareMedEval dataset: Evaluating Critical Appraisal and Reasoning in the Biomedical Field

Abstract page for arXiv paper 2511.03441: CareMedEval dataset: Evaluating Critical Appraisal and Reasoning in the Biomedical Field

arXiv - AI · 4 min · about 2 months ago

Llms

[2510.24702] Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Abstract page for arXiv paper 2510.24702: Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

arXiv - AI · 4 min · about 2 months ago

Llms

[2510.24178] MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations

Abstract page for arXiv paper 2510.24178: MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations

arXiv - AI · 4 min · about 2 months ago

Llms

[2510.10889] Topological Alignment of Shared Vision-Language Embedding Space

Abstract page for arXiv paper 2510.10889: Topological Alignment of Shared Vision-Language Embedding Space

arXiv - AI · 3 min · about 2 months ago

Llms

[2510.07181] TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics

Abstract page for arXiv paper 2510.07181: TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics

arXiv - AI · 4 min · about 2 months ago

Llms

[2505.06046] Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information

Abstract page for arXiv paper 2505.06046: Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.25541] Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Abstract page for arXiv paper 2509.25541: Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

arXiv - AI · 4 min · about 2 months ago

Llms

[2504.08714] Generating Fine Details of Entity Interactions

Abstract page for arXiv paper 2504.08714: Generating Fine Details of Entity Interactions

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2509.24222] Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

Abstract page for arXiv paper 2509.24222: Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

arXiv - AI · 4 min · about 2 months ago

Llms

[2412.19436] Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback

Abstract page for arXiv paper 2412.19436: Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2509.13471] An LLM Agentic Approach for Legal-Critical Software: A Case Study for Tax Prep Software

Abstract page for arXiv paper 2509.13471: An LLM Agentic Approach for Legal-Critical Software: A Case Study for Tax Prep Software

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.06415] Index-Preserving Lightweight Token Pruning for Efficient Document Understanding in Vision-Language Models

Abstract page for arXiv paper 2509.06415: Index-Preserving Lightweight Token Pruning for Efficient Document Understanding in Vision-Langu...

arXiv - AI · 3 min · about 2 months ago

Llms

[2508.07321] ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering

Abstract page for arXiv paper 2508.07321: ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answe...

arXiv - AI · 3 min · about 2 months ago

Llms

[2508.00450] When Relevance Meets Novelty: Dual-Stable Periodic Optimization for Serendipitous Recommendation

Abstract page for arXiv paper 2508.00450: When Relevance Meets Novelty: Dual-Stable Periodic Optimization for Serendipitous Recommendation

arXiv - AI · 4 min · about 2 months ago

Llms

[2507.09875] Function Induction and Task Generalization: An Interpretability Study with Off-by-One Addition

Abstract page for arXiv paper 2507.09875: Function Induction and Task Generalization: An Interpretability Study with Off-by-One Addition

arXiv - AI · 4 min · about 2 months ago

Llms

[2507.07847] From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems

Abstract page for arXiv paper 2507.07847: From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Au...

arXiv - AI · 4 min · about 2 months ago

Llms

[2602.05630] Rewards as Labels: Revisiting RLVR from a Classification Perspective

Abstract page for arXiv paper 2602.05630: Rewards as Labels: Revisiting RLVR from a Classification Perspective

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2601.17473] LeanTutor: Towards a Verified AI Mathematical Proof Tutor

Abstract page for arXiv paper 2601.17473: LeanTutor: Towards a Verified AI Mathematical Proof Tutor

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2505.23783] Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning

Abstract page for arXiv paper 2505.23783: Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning

arXiv - AI · 4 min · about 2 months ago

Llms

[2512.20760] Generalization of RLVR Using Causal Reasoning as a Testbed

Abstract page for arXiv paper 2512.20760: Generalization of RLVR Using Causal Reasoning as a Testbed

arXiv - AI · 4 min · about 2 months ago

