Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

Gemini gets major upgrade towards interactive AI learning

AI News - General · 3 min · 37 minutes ago

Llms

8 free AI courses from Anthropic’s Claude platform with certificates

AI News - General · 37 minutes ago

Llms

Anthropic launches Claude Managed Agents — composable APIs for shipping production AI agents 10x faster. Notion, Rakuten, Asana, and Sentry already in production.

Anthropic launches Claude Managed Agents in public beta: composable APIs for shipping production AI agents 10x faster. Handles sandboxing...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

All Content

Llms

[2603.02960] Architecting Trust in Artificial Epistemic Agents

Abstract page for arXiv paper 2603.02960: Architecting Trust in Artificial Epistemic Agents

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02939] ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization

Abstract page for arXiv paper 2603.02939: ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative...

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02908] SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training

Abstract page for arXiv paper 2603.02908: SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs with...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02858] LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for Reasoning about Debates

Abstract page for arXiv paper 2603.02858: LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for R...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02798] Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

Abstract page for arXiv paper 2603.02798: Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02787] Rethinking Code Similarity for Automated Algorithm Design with LLMs

Abstract page for arXiv paper 2603.02787: Rethinking Code Similarity for Automated Algorithm Design with LLMs

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02680] LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization

Abstract page for arXiv paper 2603.02680: LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Opt...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02268] PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differential Diagnosis

Abstract page for arXiv paper 2603.02268: PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differentia...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02626] See and Remember: A Multimodal Agent for Web Traversal

Abstract page for arXiv paper 2603.02626: See and Remember: A Multimodal Agent for Web Traversal

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02599] SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

Abstract page for arXiv paper 2603.02599: SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.02586] LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges

Abstract page for arXiv paper 2603.02586: LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02237] Concept Heterogeneity-aware Representation Steering

Abstract page for arXiv paper 2603.02237: Concept Heterogeneity-aware Representation Steering

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02542] AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical Scenario Generation

Abstract page for arXiv paper 2603.02542: AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02236] CUDABench: Benchmarking LLMs for Text-to-CUDA Generation

Abstract page for arXiv paper 2603.02236: CUDABench: Benchmarking LLMs for Text-to-CUDA Generation

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02540] A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities

Abstract page for arXiv paper 2603.02540: A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02528] LLM-MLFFN: Multi-Level Autonomous Driving Behavior Feature Fusion via Large Language Model

Abstract page for arXiv paper 2603.02528: LLM-MLFFN: Multi-Level Autonomous Driving Behavior Feature Fusion via Large Language Model

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02504] NeuroProlog: Multi-Task Fine-Tuning for Neurosymbolic Mathematical Reasoning via the Cocktail Effect

Abstract page for arXiv paper 2603.02504: NeuroProlog: Multi-Task Fine-Tuning for Neurosymbolic Mathematical Reasoning via the Cocktail E...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02232] Beyond Binary Preferences: A Principled Framework for Reward Modeling with Ordinal Feedback

Abstract page for arXiv paper 2603.02232: Beyond Binary Preferences: A Principled Framework for Reward Modeling with Ordinal Feedback

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.02473] Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory

Abstract page for arXiv paper 2603.02473: Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.02435] VL-KGE: Vision-Language Models Meet Knowledge Graph Embeddings

Abstract page for arXiv paper 2603.02435: VL-KGE: Vision-Language Models Meet Knowledge Graph Embeddings

arXiv - Machine Learning · 4 min · about 1 month ago

