Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Started a video series on building an orchestration layer for LLM post-training [P]

Hi everyone! Context, motivation, a lot of yapping, feel free to skip to TL;DR. A while back I posted here asking [D] What framework do y...

Reddit - Machine Learning · 1 min · about 2 hours ago

ChatGPT finally offers $100/month Pro plan

OpenAI announced on Thursday something that power users have been asking for: a $100/month plan. Previously, subscriptions jumped from $2...

TechCrunch - AI · 4 min · about 2 hours ago

Anthropic says new Claude Mythos AI is too risky for public use

Dubbed Claude Mythos, the software is part of the Claude AI family, an artificial intelligence model that can act like a chatbot and AI a...

AI Tools & Products · 10 min · about 2 hours ago

All Content

[2511.22935] EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model

Abstract page for arXiv paper 2511.22935: EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model

arXiv - AI · 4 min · about 1 month ago

[2412.13091] LMUnit: Fine-grained Evaluation with Natural Language Unit Tests

Abstract page for arXiv paper 2412.13091: LMUnit: Fine-grained Evaluation with Natural Language Unit Tests

arXiv - AI · 3 min · about 1 month ago

[2510.15982] AMiD: Knowledge Distillation for LLMs with $α$-mixture Assistant Distribution

Abstract page for arXiv paper 2510.15982: AMiD: Knowledge Distillation for LLMs with $α$-mixture Assistant Distribution

arXiv - AI · 4 min · about 1 month ago

[2406.06512] Merlin: A Computed Tomography Vision-Language Foundation Model and Dataset

Abstract page for arXiv paper 2406.06512: Merlin: A Computed Tomography Vision-Language Foundation Model and Dataset

arXiv - AI · 4 min · about 1 month ago

[2405.15374] Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

Abstract page for arXiv paper 2405.15374: Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

arXiv - AI · 4 min · about 1 month ago

[2509.23405] Planner Aware Path Learning in Diffusion Language Models Training

Abstract page for arXiv paper 2509.23405: Planner Aware Path Learning in Diffusion Language Models Training

arXiv - Machine Learning · 4 min · about 1 month ago

[2509.22263] Erase or Hide? Suppressing Spurious Unlearning Neurons for Robust Unlearning

Abstract page for arXiv paper 2509.22263: Erase or Hide? Suppressing Spurious Unlearning Neurons for Robust Unlearning

arXiv - Machine Learning · 3 min · about 1 month ago

[2509.21465] Talking Trees: Reasoning-Assisted Induction of Decision Trees for Tabular Data

Abstract page for arXiv paper 2509.21465: Talking Trees: Reasoning-Assisted Induction of Decision Trees for Tabular Data

arXiv - Machine Learning · 4 min · about 1 month ago

[2509.17874] Deep Hierarchical Learning with Nested Subspace Networks for Large Language Models

Abstract page for arXiv paper 2509.17874: Deep Hierarchical Learning with Nested Subspace Networks for Large Language Models

arXiv - Machine Learning · 4 min · about 1 month ago

[2602.09937] Why Do AI Agents Systematically Fail at Cloud Root Cause Analysis?

Abstract page for arXiv paper 2602.09937: Why Do AI Agents Systematically Fail at Cloud Root Cause Analysis?

arXiv - AI · 4 min · about 1 month ago

[2506.15963] On the Limits of Sparse Autoencoders: A Theoretical Framework and Reweighted Remedy

Abstract page for arXiv paper 2506.15963: On the Limits of Sparse Autoencoders: A Theoretical Framework and Reweighted Remedy

arXiv - Machine Learning · 4 min · about 1 month ago

[2601.16529] SycoEval-EM: Sycophancy Evaluation of Large Language Models in Simulated Clinical Encounters for Emergency Care

Abstract page for arXiv paper 2601.16529: SycoEval-EM: Sycophancy Evaluation of Large Language Models in Simulated Clinical Encounters fo...

arXiv - AI · 3 min · about 1 month ago

[2601.15160] Knowledge Graphs are Implicit Reward Models: Path-Derived Signals Enable Compositional Reasoning

Abstract page for arXiv paper 2601.15160: Knowledge Graphs are Implicit Reward Models: Path-Derived Signals Enable Compositional Reasoning

arXiv - AI · 4 min · about 1 month ago

[2511.22235] Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation

Abstract page for arXiv paper 2511.22235: Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon ...

arXiv - AI · 4 min · about 1 month ago

[2511.21471] SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

Abstract page for arXiv paper 2511.21471: SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

arXiv - AI · 4 min · about 1 month ago

[2511.05854] Can a Small Model Learn to Look Before It Leaps? Dynamic Learning and Proactive Correction for Hallucination Detection

Abstract page for arXiv paper 2511.05854: Can a Small Model Learn to Look Before It Leaps? Dynamic Learning and Proactive Correction for ...

arXiv - AI · 4 min · about 1 month ago

[2510.26905] Cognition Envelopes for Bounded Decision Making in Autonomous UAS Operations

Abstract page for arXiv paper 2510.26905: Cognition Envelopes for Bounded Decision Making in Autonomous UAS Operations

arXiv - AI · 4 min · about 1 month ago

[2505.20065] SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

Abstract page for arXiv paper 2505.20065: SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

arXiv - AI · 4 min · about 1 month ago

[2510.09782] The Geometry of Reasoning: Flowing Logics in Representation Space

Abstract page for arXiv paper 2510.09782: The Geometry of Reasoning: Flowing Logics in Representation Space

arXiv - Machine Learning · 4 min · about 1 month ago

[2510.07972] SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance

Abstract page for arXiv paper 2510.07972: SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance

arXiv - AI · 4 min · about 1 month ago
