Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

it is impossible to stop AI chatbots from using quotes (any instance of the character ")

no matter how i phrase it in the instructions, how many times i repeat the rule not to use quotes, and which LLM i use, i have failed to ...

Reddit - Artificial Intelligence · 1 min ·
Llms

Converting XQuery to SQL with Local LLMs: Do I Need Fine-Tuning or a Better Approach? [P]

​ I am trying to convert XQuery statements into SQL queries within an enterprise context, with the constraint that the solution must rely...

Reddit - Machine Learning · 1 min ·
AI: Fragility of today's Claude Cowork type AI Agent Apps. RTZ 1061
Llms

AI: Fragility of today's Claude Cowork type AI Agent Apps. RTZ 1061

...realities like memory management, highlight a longer road to resilient AI Agents and AGI

AI Tools & Products · 11 min ·

All Content

[2603.02510] ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution
Llms

[2603.02510] ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution

Abstract page for arXiv paper 2603.02510: ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evol...

arXiv - Machine Learning · 4 min ·
[2603.03002] SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models
Llms

[2603.03002] SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models

Abstract page for arXiv paper 2603.03002: SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models

arXiv - AI · 4 min ·
[2603.02482] MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models
Llms

[2603.02482] MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models

Abstract page for arXiv paper 2603.02482: MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models

arXiv - Machine Learning · 4 min ·
[2603.03078] RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization
Llms

[2603.03078] RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization

Abstract page for arXiv paper 2603.03078: RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization

arXiv - AI · 4 min ·
[2603.03072] TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning
Llms

[2603.03072] TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning

Abstract page for arXiv paper 2603.03072: TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning

arXiv - AI · 4 min ·
[2603.03018] REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise Telemetry
Llms

[2603.03018] REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise Telemetry

Abstract page for arXiv paper 2603.03018: REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise T...

arXiv - AI · 4 min ·
[2603.03005] OrchMAS: Orchestrated Reasoning with Multi Collaborative Heterogeneous Scientific Expert Structured Agents
Llms

[2603.03005] OrchMAS: Orchestrated Reasoning with Multi Collaborative Heterogeneous Scientific Expert Structured Agents

Abstract page for arXiv paper 2603.03005: OrchMAS: Orchestrated Reasoning with Multi Collaborative Heterogeneous Scientific Expert Struct...

arXiv - AI · 4 min ·
[2603.02960] Architecting Trust in Artificial Epistemic Agents
Llms

[2603.02960] Architecting Trust in Artificial Epistemic Agents

Abstract page for arXiv paper 2603.02960: Architecting Trust in Artificial Epistemic Agents

arXiv - AI · 4 min ·
[2603.02939] ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization
Llms

[2603.02939] ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization

Abstract page for arXiv paper 2603.02939: ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative...

arXiv - AI · 3 min ·
[2603.02908] SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training
Llms

[2603.02908] SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training

Abstract page for arXiv paper 2603.02908: SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs with...

arXiv - AI · 4 min ·
[2603.02858] LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for Reasoning about Debates
Llms

[2603.02858] LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for Reasoning about Debates

Abstract page for arXiv paper 2603.02858: LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for R...

arXiv - AI · 4 min ·
[2603.02798] Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification
Llms

[2603.02798] Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

Abstract page for arXiv paper 2603.02798: Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

arXiv - AI · 3 min ·
[2603.02787] Rethinking Code Similarity for Automated Algorithm Design with LLMs
Llms

[2603.02787] Rethinking Code Similarity for Automated Algorithm Design with LLMs

Abstract page for arXiv paper 2603.02787: Rethinking Code Similarity for Automated Algorithm Design with LLMs

arXiv - AI · 4 min ·
[2603.02680] LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization
Llms

[2603.02680] LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization

Abstract page for arXiv paper 2603.02680: LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Opt...

arXiv - AI · 4 min ·
[2603.02268] PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differential Diagnosis
Llms

[2603.02268] PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differential Diagnosis

Abstract page for arXiv paper 2603.02268: PRISM: Exploring Heterogeneous Pretrained EEG Foundation Model Transfer to Clinical Differentia...

arXiv - AI · 4 min ·
[2603.02626] See and Remember: A Multimodal Agent for Web Traversal
Llms

[2603.02626] See and Remember: A Multimodal Agent for Web Traversal

Abstract page for arXiv paper 2603.02626: See and Remember: A Multimodal Agent for Web Traversal

arXiv - AI · 3 min ·
[2603.02599] SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving
Llms

[2603.02599] SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

Abstract page for arXiv paper 2603.02599: SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

arXiv - Machine Learning · 4 min ·
[2603.02586] LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges
Llms

[2603.02586] LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges

Abstract page for arXiv paper 2603.02586: LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges

arXiv - AI · 3 min ·
[2603.02237] Concept Heterogeneity-aware Representation Steering
Llms

[2603.02237] Concept Heterogeneity-aware Representation Steering

Abstract page for arXiv paper 2603.02237: Concept Heterogeneity-aware Representation Steering

arXiv - AI · 4 min ·
[2603.02542] AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical Scenario Generation
Llms

[2603.02542] AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical Scenario Generation

Abstract page for arXiv paper 2603.02542: AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical...

arXiv - AI · 4 min ·
Previous Page 208 Next

Related Topics

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime