Trending Large Language Models

The most popular large language model content from the past 3 days. Curated by AI News.

LLMs

LLM rankings are not a ladder: experimental results from a transitive benchmark graph [D]

I built a small website called LLM Win (https://llm-win.com). It turns LLM benchmark results into a directed graph: If model A beats m...

Reddit - Machine Learning · 1 min ·
LLMs

[2605.07631] Inference Time Causal Probing in LLMs

Abstract page for arXiv paper 2605.07631: Inference Time Causal Probing in LLMs

arXiv - AI · 3 min ·
LLMs

[2605.07692] GASim: A Graph-Accelerated Hybrid Framework for Social Simulation

Abstract page for arXiv paper 2605.07692: GASim: A Graph-Accelerated Hybrid Framework for Social Simulation

arXiv - AI · 3 min ·
LLMs

[2605.07926] AgentEscapeBench: Evaluating Out-of-Domain Tool-Grounded Reasoning in LLM Agents

Abstract page for arXiv paper 2605.07926: AgentEscapeBench: Evaluating Out-of-Domain Tool-Grounded Reasoning in LLM Agents

arXiv - AI · 4 min ·
LLMs

[2605.08011] Abductive Reasoning with Probabilistic Commonsense

Abstract page for arXiv paper 2605.08011: Abductive Reasoning with Probabilistic Commonsense

arXiv - AI · 3 min ·
LLMs

[2605.06765] VITA-QinYu: Expressive Spoken Language Model for Role-Playing and Singing

Abstract page for arXiv paper 2605.06765: VITA-QinYu: Expressive Spoken Language Model for Role-Playing and Singing

arXiv - AI · 4 min ·
LLMs

[2605.06903] MELD: Multi-Task Equilibrated Learning Detector for AI-Generated Text

Abstract page for arXiv paper 2605.06903: MELD: Multi-Task Equilibrated Learning Detector for AI-Generated Text

arXiv - AI · 4 min ·
LLMs

[2605.06936] Bridging the Last Mile of Circuit Design: PostEDA-Bench, a Hierarchical Benchmark for PPA Convergence and DRC Fixing

Abstract page for arXiv paper 2605.06936: Bridging the Last Mile of Circuit Design: PostEDA-Bench, a Hierarchical Benchmark for PPA Conve...

arXiv - AI · 3 min ·
LLMs

[2605.07019] LensVLM: Selective Context Expansion for Compressed Visual Representation of Text

Abstract page for arXiv paper 2605.07019: LensVLM: Selective Context Expansion for Compressed Visual Representation of Text

arXiv - AI · 4 min ·
LLMs

[2605.07068] WiCER: Wiki-memory Compile, Evaluate, Refine Iterative Knowledge Compilation for LLM Wiki Systems

Abstract page for arXiv paper 2605.07068: WiCER: Wiki-memory Compile, Evaluate, Refine Iterative Knowledge Compilation for LLM Wiki Systems

arXiv - AI · 4 min ·
LLMs

[2605.07186] The Text Uncanny Valley: Non-Monotonic Performance Degradation in LLM Information Retrieval

Abstract page for arXiv paper 2605.07186: The Text Uncanny Valley: Non-Monotonic Performance Degradation in LLM Information Retrieval

arXiv - AI · 4 min ·
LLMs

[2605.07234] Reformulating KV Cache Eviction Problem for Long-Context LLM Inference

Abstract page for arXiv paper 2605.07234: Reformulating KV Cache Eviction Problem for Long-Context LLM Inference

arXiv - AI · 3 min ·
LLMs

[2605.07299] EgoPro-Bench: Benchmarking Personalized Proactive Interaction in Egocentric Video Streams

Abstract page for arXiv paper 2605.07299: EgoPro-Bench: Benchmarking Personalized Proactive Interaction in Egocentric Video Streams

arXiv - AI · 4 min ·
LLMs

[2605.07314] DCGL: Dual-Channel Graph Learning with Large Language Models for Knowledge-Aware Recommendation

Abstract page for arXiv paper 2605.07314: DCGL: Dual-Channel Graph Learning with Large Language Models for Knowledge-Aware Recommendation

arXiv - AI · 4 min ·
LLMs

[2605.07394] BalCapRL: A Balanced Framework for RL-Based MLLM Image Captioning

Abstract page for arXiv paper 2605.07394: BalCapRL: A Balanced Framework for RL-Based MLLM Image Captioning

arXiv - AI · 4 min ·
