Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

I compiled every major AI agent security incident from 2024-2026 in one place - 90 incidents, all sourced, updated weekly

After tracking AI agent security incidents for the past year, I put together a single reference covering every major breach, vulnerabilit...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

[R] Forced Depth Consideration Reduces Type II Errors in LLM Self-Classification: Evidence from an Exploration Prompting Ablation Study - (200 trap prompts, 4 models, 8 Step-0 variants)

LLM-based task classifiers tend to misroute prompts that look simple at first glance but require deeper understanding - I call it "Type I...

Reddit - Machine Learning · 1 min · about 2 hours ago

I asked ChatGPT and Gemini to generate a world map

submitted by /u/Pitiful-Entrance5769

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

All Content

[2603.04727] Are Multimodal LLMs Ready for Surveillance? A Reality Check on Zero-Shot Anomaly Detection in the Wild

Abstract page for arXiv paper 2603.04727: Are Multimodal LLMs Ready for Surveillance? A Reality Check on Zero-Shot Anomaly Detection in t...

arXiv - AI · 4 min · about 1 month ago

[2603.04707] Detection of Illicit Content on Online Marketplaces using Large Language Models

Abstract page for arXiv paper 2603.04707: Detection of Illicit Content on Online Marketplaces using Large Language Models

arXiv - AI · 4 min · about 1 month ago

[2603.04698] Hate Speech Detection using Large Language Models with Data Augmentation and Feature Enhancement

Abstract page for arXiv paper 2603.04698: Hate Speech Detection using Large Language Models with Data Augmentation and Feature Enhancement

arXiv - AI · 3 min · about 1 month ago

[2603.04678] Optimizing Language Models for Crosslingual Knowledge Consistency

Abstract page for arXiv paper 2603.04678: Optimizing Language Models for Crosslingual Knowledge Consistency

arXiv - AI · 3 min · about 1 month ago

[2603.04676] Decoding the Pulse of Reasoning VLMs in Multi-Image Understanding Tasks

Abstract page for arXiv paper 2603.04676: Decoding the Pulse of Reasoning VLMs in Multi-Image Understanding Tasks

arXiv - AI · 3 min · about 1 month ago

[2603.04663] Neuro-Symbolic Financial Reasoning via Deterministic Fact Ledgers and Adversarial Low-Latency Hallucination Detector

Abstract page for arXiv paper 2603.04663: Neuro-Symbolic Financial Reasoning via Deterministic Fact Ledgers and Adversarial Low-Latency H...

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.04597] Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Abstract page for arXiv paper 2603.04597: Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

arXiv - AI · 4 min · about 1 month ago

[2603.04474] From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration

Abstract page for arXiv paper 2603.04474: From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration

arXiv - AI · 4 min · about 1 month ago

[2603.04464] Understanding the Dynamics of Demonstration Conflict in In-Context Learning

Abstract page for arXiv paper 2603.04464: Understanding the Dynamics of Demonstration Conflict in In-Context Learning

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.04459] Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks

Abstract page for arXiv paper 2603.04459: Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks

arXiv - AI · 4 min · about 1 month ago

[2603.04460] VSPrefill: Vertical-Slash Sparse Attention with Lightweight Indexing for Long-Context Prefilling

Abstract page for arXiv paper 2603.04460: VSPrefill: Vertical-Slash Sparse Attention with Lightweight Indexing for Long-Context Prefilling

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.04455] Large Language Models as Bidding Agents in Repeated HetNet Auction

Abstract page for arXiv paper 2603.04455: Large Language Models as Bidding Agents in Repeated HetNet Auction

arXiv - AI · 4 min · about 1 month ago

[2603.04454] Query Disambiguation via Answer-Free Context: Doubling Performance on Humanity's Last Exam

Abstract page for arXiv paper 2603.04454: Query Disambiguation via Answer-Free Context: Doubling Performance on Humanity's Last Exam

arXiv - AI · 3 min · about 1 month ago

[2603.04453] Induced Numerical Instability: Hidden Costs in Multimodal Large Language Models

Abstract page for arXiv paper 2603.04453: Induced Numerical Instability: Hidden Costs in Multimodal Large Language Models

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.04452] A unified foundational framework for knowledge injection and evaluation of Large Language Models in Combustion Science

Abstract page for arXiv paper 2603.04452: A unified foundational framework for knowledge injection and evaluation of Large Language Model...

arXiv - AI · 3 min · about 1 month ago

[2603.04444] vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models

Abstract page for arXiv paper 2603.04444: vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models

arXiv - AI · 4 min · about 1 month ago

[2603.04436] ZorBA: Zeroth-order Federated Fine-tuning of LLMs with Heterogeneous Block Activation

Abstract page for arXiv paper 2603.04436: ZorBA: Zeroth-order Federated Fine-tuning of LLMs with Heterogeneous Block Activation

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.04443] AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

Abstract page for arXiv paper 2603.04443: AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.04429] What Is Missing: Interpretable Ratings for Large Language Model Outputs

Abstract page for arXiv paper 2603.04429: What Is Missing: Interpretable Ratings for Large Language Model Outputs

arXiv - AI · 4 min · about 1 month ago

[2603.04428] Agent Memory Below the Prompt: Persistent Q4 KV Cache for Multi-Agent LLM Inference on Edge Devices

Abstract page for arXiv paper 2603.04428: Agent Memory Below the Prompt: Persistent Q4 KV Cache for Multi-Agent LLM Inference on Edge Dev...

arXiv - Machine Learning · 4 min · about 1 month ago
