Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

Llms

Persistent memory system for AI agents: single SQLite file, no external server, no API keys. Free and open source - BrainCTL

Every agent I build forgets everything between sessions. I got tired of it and built brainctl. pip install brainctl, then: from agentmemo...

Reddit - Artificial Intelligence · 1 min · 5 minutes ago

Llms

How has Claude far surpassed the competitors? They were not first to market and never had the most cash, yet their features are far and away the best on the market.

How has Claude far surpassed the competitors? They were not first to market and never had the most cash, yet their features are far and away ...

Reddit - Artificial Intelligence · 1 min · 5 minutes ago

Llms

Anthropic temporarily banned OpenClaw's creator from accessing Claude | TechCrunch

This ban took place after Claude's pricing changed for OpenClaw users last week.

TechCrunch - AI · 5 min · about 2 hours ago

All Content

Llms

[2510.06410] Off-Trajectory Reasoning: Can LLMs Collaborate on Reasoning Trajectory?

Abstract page for arXiv paper 2510.06410: Off-Trajectory Reasoning: Can LLMs Collaborate on Reasoning Trajectory?

arXiv - AI · 4 min · about 1 month ago

Llms

[2510.05684] D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Abstract page for arXiv paper 2510.05684: D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

arXiv - AI · 4 min · about 1 month ago

Llms

[2509.23725] MedLA: A Logic-Driven Multi-Agent Framework for Complex Medical Reasoning with Large Language Models

Abstract page for arXiv paper 2509.23725: MedLA: A Logic-Driven Multi-Agent Framework for Complex Medical Reasoning with Large Language M...

arXiv - AI · 4 min · about 1 month ago

Llms

[2509.22613] Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective

Abstract page for arXiv paper 2509.22613: Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Pers...

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2507.08207] Toward a Dynamic Stackelberg Game-Theoretic Framework for Agentic AI Defense Against LLM Jailbreaking

Abstract page for arXiv paper 2507.08207: Toward a Dynamic Stackelberg Game-Theoretic Framework for Agentic AI Defense Against LLM Jailbr...

arXiv - AI · 3 min · about 1 month ago

Llms

[2505.19892] OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging

Abstract page for arXiv paper 2505.19892: OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging

arXiv - AI · 4 min · about 1 month ago

Llms

[2505.13909] Efficient Agent Training for Computer Use

Abstract page for arXiv paper 2505.13909: Efficient Agent Training for Computer Use

arXiv - Machine Learning · 3 min · about 1 month ago

Llms

[2505.13180] ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models

Abstract page for arXiv paper 2505.13180: ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03269] LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

Abstract page for arXiv paper 2603.03269: LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.03180] Type-Aware Retrieval-Augmented Generation with Dependency Closure for Solver-Executable Industrial Optimization Modeling

Abstract page for arXiv paper 2603.03180: Type-Aware Retrieval-Augmented Generation with Dependency Closure for Solver-Executable Industr...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.03192] MoD-DPO: Towards Mitigating Cross-modal Hallucinations in Omni LLMs using Modality Decoupled Preference Optimization

Abstract page for arXiv paper 2603.03192: MoD-DPO: Towards Mitigating Cross-modal Hallucinations in Omni LLMs using Modality Decoupled Pr...

arXiv - Machine Learning · 3 min · about 1 month ago

Llms

[2603.02952] Sparse autoencoders reveal organized biological knowledge but minimal regulatory logic in single-cell foundation models: a comparative atlas of Geneformer and scGPT

Abstract page for arXiv paper 2603.02952: Sparse autoencoders reveal organized biological knowledge but minimal regulatory logic in singl...

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.03095] Compact Prompting in Instruction-tuned LLMs for Joint Argumentative Component Detection

Abstract page for arXiv paper 2603.03095: Compact Prompting in Instruction-tuned LLMs for Joint Argumentative Component Detection

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02775] From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench

Abstract page for arXiv paper 2603.02775: From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.03047] TrustMH-Bench: A Comprehensive Benchmark for Evaluating the Trustworthiness of Large Language Models in Mental Health

Abstract page for arXiv paper 2603.03047: TrustMH-Bench: A Comprehensive Benchmark for Evaluating the Trustworthiness of Large Language M...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02983] Contextualized Privacy Defense for LLM Agents

Abstract page for arXiv paper 2603.02983: Contextualized Privacy Defense for LLM Agents

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02623] Uni-Skill: Building Self-Evolving Skill Repository for Generalizable Robotic Manipulation

Abstract page for arXiv paper 2603.02623: Uni-Skill: Building Self-Evolving Skill Repository for Generalizable Robotic Manipulation

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.02949] SEALing the Gap: A Reference Framework for LLM Inference Carbon Estimation via Multi-Benchmark Driven Embodiment

Abstract page for arXiv paper 2603.02949: SEALing the Gap: A Reference Framework for LLM Inference Carbon Estimation via Multi-Benchmark ...

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02909] Learning to Generate and Extract: A Multi-Agent Collaboration Framework For Zero-shot Document-level Event Arguments Extraction

Abstract page for arXiv paper 2603.02909: Learning to Generate and Extract: A Multi-Agent Collaboration Framework For Zero-shot Document-...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02830] Faster, Cheaper, More Accurate: Specialised Knowledge Tracing Models Outperform LLMs

Abstract page for arXiv paper 2603.02830: Faster, Cheaper, More Accurate: Specialised Knowledge Tracing Models Outperform LLMs

arXiv - AI · 4 min · about 1 month ago
