Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Anthropic Restricts Claude Agent Access Amid AI Automation Boom in Crypto

AI Tools & Products · 7 min · about 1 hour ago

Is cutting ‘please’ when talking to ChatGPT better for the planet? An expert explains

AI Tools & Products · 5 min · about 1 hour ago

AI Desktop 98 lets you chat with Claude, ChatGPT, and Gemini through a Windows 98-inspired interface

AI Tools & Products · 3 min · about 1 hour ago

All Content

[2603.19276] From Flat to Structural: Enhancing Automated Short Answer Grading with GraphRAG

arXiv - AI · 4 min · 14 days ago

[2603.19275] Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language Models

arXiv - AI · 3 min · 14 days ago

[2603.19274] CURE: A Multimodal Benchmark for Clinical Understanding and Retrieval Evaluation

arXiv - AI · 4 min · 14 days ago

[2603.19273] LSR: Linguistic Safety Robustness Benchmark for Low-Resource West African Languages

arXiv - AI · 3 min · 14 days ago

[2603.19271] A Human-Centered Workflow for Using Large Language Models in Content Analysis

arXiv - AI · 3 min · 14 days ago

[2603.19268] Full-Stack Domain Enhancement for Combustion LLMs: Construction and Optimization

arXiv - AI · 3 min · 14 days ago

[2603.19266] Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion

arXiv - Machine Learning · 4 min · 14 days ago

[2603.19265] When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models

arXiv - AI · 4 min · 14 days ago

[2603.19264] Generative Active Testing: Efficient LLM Evaluation via Proxy Task Adaptation

arXiv - AI · 3 min · 14 days ago

[2603.19262] The α-Law of Observable Belief Revision in Large Language Model Inference

arXiv - AI · 4 min · 14 days ago

[2603.19255] LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

arXiv - AI · 4 min · 14 days ago

[2603.19258] MAPLE: Metadata Augmented Private Language Evolution

arXiv - Machine Learning · 4 min · 14 days ago

[2603.19252] GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams

arXiv - AI · 3 min · 14 days ago

[2603.19253] A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

arXiv - AI · 4 min · 14 days ago

[2603.19236] L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)

arXiv - AI · 3 min · 14 days ago

[2603.19247] When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models

arXiv - AI · 4 min · 14 days ago

[2603.17765] Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Similarity Search

arXiv - AI · 4 min · 14 days ago

[2603.20170] Learning Dynamic Belief Graphs for Theory-of-mind Reasoning

arXiv - AI · 3 min · 14 days ago

[2603.20101] Pitfalls in Evaluating Interpretability Agents

arXiv - AI · 4 min · 14 days ago

[2603.20046] Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs

arXiv - AI · 4 min · 14 days ago
