Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

I compiled every major AI agent security incident from 2024-2026 in one place - 90 incidents, all sourced, updated weekly

After tracking AI agent security incidents for the past year, I put together a single reference covering every major breach, vulnerabilit...

Reddit - Artificial Intelligence · 1 min ·

[R] Forced Depth Consideration Reduces Type II Errors in LLM Self-Classification: Evidence from an Exploration Prompting Ablation Study - (200 trap prompts, 4 models, 8 Step-0 variants)

LLM-based task classifiers tend to misroute prompts that look simple at first glance but require deeper understanding - I call it "Type I...

Reddit - Machine Learning · 1 min ·

I asked ChatGPT and Gemini to generate a world map

submitted by /u/Pitiful-Entrance5769

Reddit - Artificial Intelligence · 1 min ·

All Content

[2603.05028] Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure

Abstract page for arXiv paper 2603.05028: Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure

arXiv - AI · 4 min ·
[2603.05016] BioLLMAgent: A Hybrid Framework with Enhanced Structural Interpretability for Simulating Human Decision-Making in Computational Psychiatry

Abstract page for arXiv paper 2603.05016: BioLLMAgent: A Hybrid Framework with Enhanced Structural Interpretability for Simulating Human ...

arXiv - AI · 3 min ·
[2603.04951] Retrieval-Augmented Generation with Covariate Time Series

Abstract page for arXiv paper 2603.04951: Retrieval-Augmented Generation with Covariate Time Series

arXiv - AI · 4 min ·
[2603.04904] Alignment Backfire: Language-Dependent Reversal of Safety Interventions Across 16 Languages in LLM Multi-Agent Systems

Abstract page for arXiv paper 2603.04904: Alignment Backfire: Language-Dependent Reversal of Safety Interventions Across 16 Languages in ...

arXiv - AI · 4 min ·
[2603.04900] EvoTool: Self-Evolving Tool-Use Policy Optimization in LLM Agents via Blame-Aware Mutation and Diversity-Aware Selection

Abstract page for arXiv paper 2603.04900: EvoTool: Self-Evolving Tool-Use Policy Optimization in LLM Agents via Blame-Aware Mutation and ...

arXiv - AI · 4 min ·
[2603.04896] Authorize-on-Demand: Dynamic Authorization with Legality-Aware Intellectual Property Protection for VLMs

Abstract page for arXiv paper 2603.04896: Authorize-on-Demand: Dynamic Authorization with Legality-Aware Intellectual Property Protection...

arXiv - AI · 3 min ·
[2603.04894] Differentially Private Multimodal In-Context Learning

Abstract page for arXiv paper 2603.04894: Differentially Private Multimodal In-Context Learning

arXiv - AI · 3 min ·
[2603.04868] K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

Abstract page for arXiv paper 2603.04868: K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory ...

arXiv - AI · 3 min ·
[2603.04837] Design Behaviour Codes (DBCs): A Taxonomy-Driven Layered Governance Benchmark for Large Language Models

Abstract page for arXiv paper 2603.04837: Design Behaviour Codes (DBCs): A Taxonomy-Driven Layered Governance Benchmark for Large Languag...

arXiv - AI · 4 min ·
[2603.04822] VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

Abstract page for arXiv paper 2603.04822: VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

arXiv - AI · 4 min ·
[2603.04818] LLM-Grounded Explainability for Port Congestion Prediction via Temporal Graph Attention Networks

Abstract page for arXiv paper 2603.04818: LLM-Grounded Explainability for Port Congestion Prediction via Temporal Graph Attention Networks

arXiv - AI · 4 min ·
[2603.04791] Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

Abstract page for arXiv paper 2603.04791: Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

arXiv - AI · 4 min ·
[2603.04783] Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-Turn Interaction

Abstract page for arXiv paper 2603.04783: Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-T...

arXiv - AI · 4 min ·
[2603.04751] Evaluating the Search Agent in a Parallel World

Abstract page for arXiv paper 2603.04751: Evaluating the Search Agent in a Parallel World

arXiv - AI · 4 min ·
[2603.04750] HiMAP-Travel: Hierarchical Multi-Agent Planning for Long-Horizon Constrained Travel

Abstract page for arXiv paper 2603.04750: HiMAP-Travel: Hierarchical Multi-Agent Planning for Long-Horizon Constrained Travel

arXiv - AI · 3 min ·
[2603.04741] CONE: Embeddings for Complex Numerical Data Preserving Unit and Variable Semantics

Abstract page for arXiv paper 2603.04741: CONE: Embeddings for Complex Numerical Data Preserving Unit and Variable Semantics

arXiv - Machine Learning · 3 min ·
[2603.04735] Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery

Abstract page for arXiv paper 2603.04735: Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery

arXiv - AI · 4 min ·
[2603.04670] Using Vision + Language Models to Predict Item Difficulty

Abstract page for arXiv paper 2603.04670: Using Vision + Language Models to Predict Item Difficulty

arXiv - AI · 3 min ·
[2603.04636] When Agents Persuade: Propaganda Generation and Mitigation in LLMs

Abstract page for arXiv paper 2603.04636: When Agents Persuade: Propaganda Generation and Mitigation in LLMs

arXiv - AI · 3 min ·
[2603.04631] Towards automated data analysis: A guided framework for LLM-based risk estimation

Abstract page for arXiv paper 2603.04631: Towards automated data analysis: A guided framework for LLM-based risk estimation

arXiv - AI · 3 min ·