Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week


OpenClaw security checklist: practical safeguards for AI agents

Here is one of the better-quality guides on ensuring safety when deploying OpenClaw: https://chatgptguide.ai/openclaw-security-checkl...

Reddit - Artificial Intelligence · 1 min ·
I let Gemini in Google Maps plan my day and it went surprisingly well | The Verge

Gemini in Google Maps is a surprisingly useful way to explore new territory.

The Verge - AI · 11 min ·

The person who replaces you probably won't be AI. It'll be someone from the next department over who learned to use it - opinion/discussion

I'm a strategy person by background. Two years ago I'd write a recommendation and hand it to a product team. Now... I describe what I want...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2603.19265] When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models

arXiv - AI · 4 min ·
[2603.19264] Generative Active Testing: Efficient LLM Evaluation via Proxy Task Adaptation

arXiv - AI · 3 min ·
[2603.19262] The α-Law of Observable Belief Revision in Large Language Model Inference

arXiv - AI · 4 min ·
[2603.19255] LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

arXiv - AI · 4 min ·
[2603.19258] MAPLE: Metadata Augmented Private Language Evolution

arXiv - Machine Learning · 4 min ·
[2603.19252] GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams

arXiv - AI · 3 min ·
[2603.19253] A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

arXiv - AI · 4 min ·
[2603.19236] L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)

arXiv - AI · 3 min ·
[2603.19247] When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models

arXiv - AI · 4 min ·
[2603.17765] Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Similarity Search

arXiv - AI · 4 min ·
[2603.20170] Learning Dynamic Belief Graphs for Theory-of-mind Reasoning

arXiv - AI · 3 min ·
[2603.20101] Pitfalls in Evaluating Interpretability Agents

arXiv - AI · 4 min ·
[2603.20046] Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs

arXiv - AI · 4 min ·
[2603.19896] Utility-Guided Agent Orchestration for Efficient LLM Tool Use

arXiv - AI · 3 min ·
[2603.19715] Stepwise: Neuro-Symbolic Proof Search for Automated Systems Verification

arXiv - AI · 4 min ·
[2603.19685] A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

arXiv - Machine Learning · 4 min ·
[2603.19639] HyEvo: Self-Evolving Hybrid Agentic Workflows for Efficient Reasoning

arXiv - AI · 3 min ·
[2603.19584] PowerLens: Taming LLM Agents for Safe and Personalized Mobile Power Management

arXiv - AI · 4 min ·
[2603.19515] ItinBench: Benchmarking Planning Across Multiple Cognitive Dimensions with Large Language Models

arXiv - AI · 3 min ·
[2603.19514] Learning to Disprove: Formal Counterexample Generation with Large Language Models

arXiv - AI · 3 min ·