Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

I built an LLM proxy that uses differential geometry to detect prompt injection — here’s what actually works (and what doesn’t)

I’ve spent the last few months building Arc Gate, a monitoring proxy for deployed LLMs. The pitch: one URL change, and you get real-time ...

Reddit - Artificial Intelligence · 1 min · about 5 hours ago

Reality of SaaS

Why on earth would you pay $49/mo for a polished SaaS product when you can spend $500 a day building one for yourself in Claude? Absolute...

Reddit - Artificial Intelligence · 1 min · about 5 hours ago

Deterministic vs. probabilistic guardrails for agentic AI — our approach and an open-source tool [D]

We've been thinking hard about whether safety guardrails for AI agents should be LLM-based (probabilistic) or rule-based (deterministic)....

Reddit - Machine Learning · 1 min · about 7 hours ago

All Content

[2510.03253] Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents

Abstract page for arXiv paper 2510.03253: Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents

arXiv - Machine Learning · 4 min · about 2 months ago

[2510.02999] Untargeted Jailbreak Attack

Abstract page for arXiv paper 2510.02999: Untargeted Jailbreak Attack

arXiv - AI · 4 min · about 2 months ago

[2510.02245] ExGRPO: Learning to Reason from Experience

Abstract page for arXiv paper 2510.02245: ExGRPO: Learning to Reason from Experience

arXiv - Machine Learning · 4 min · about 2 months ago

[2510.01051] GEM: A Gym for Agentic LLMs

Abstract page for arXiv paper 2510.01051: GEM: A Gym for Agentic LLMs

arXiv - Machine Learning · 4 min · about 2 months ago

[2510.00819] Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning

Abstract page for arXiv paper 2510.00819: Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning

arXiv - Machine Learning · 4 min · about 2 months ago

[2509.25678] Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized Mixture-of-Experts

Abstract page for arXiv paper 2509.25678: Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized...

arXiv - Machine Learning · 4 min · about 2 months ago

[2510.00041] Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness

Abstract page for arXiv paper 2510.00041: Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness

arXiv - AI · 4 min · about 2 months ago

[2509.26601] MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47 Languages

Abstract page for arXiv paper 2509.26601: MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47...

arXiv - Machine Learning · 4 min · about 2 months ago

[2509.26432] AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size

Abstract page for arXiv paper 2509.26432: AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size

arXiv - Machine Learning · 4 min · about 2 months ago

[2509.26346] EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Abstract page for arXiv paper 2509.26346: EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

arXiv - AI · 4 min · about 2 months ago

[2509.24198] Negative Pre-activations Differentiate Syntax

Abstract page for arXiv paper 2509.24198: Negative Pre-activations Differentiate Syntax

arXiv - Machine Learning · 4 min · about 2 months ago

[2509.26324] COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models

Abstract page for arXiv paper 2509.26324: COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models

arXiv - AI · 4 min · about 2 months ago

[2509.23365] Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

Abstract page for arXiv paper 2509.23365: Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

arXiv - Machine Learning · 4 min · about 2 months ago

[2509.25837] Distillation of Large Language Models via Concrete Score Matching

Abstract page for arXiv paper 2509.25837: Distillation of Large Language Models via Concrete Score Matching

arXiv - Machine Learning · 4 min · about 2 months ago

[2509.25532] Calibrating Verbalized Confidence with Self-Generated Distractors

Abstract page for arXiv paper 2509.25532: Calibrating Verbalized Confidence with Self-Generated Distractors

arXiv - AI · 4 min · about 2 months ago

[2509.25390] SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs

Abstract page for arXiv paper 2509.25390: SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs

arXiv - AI · 4 min · about 2 months ago

[2509.22957] Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas

Abstract page for arXiv paper 2509.22957: Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas

arXiv - Machine Learning · 4 min · about 2 months ago

[2509.25175] EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

Abstract page for arXiv paper 2509.25175: EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

arXiv - AI · 3 min · about 2 months ago

[2509.25087] Scaling with Collapse: Efficient and Predictable Training of LLM Families

Abstract page for arXiv paper 2509.25087: Scaling with Collapse: Efficient and Predictable Training of LLM Families

arXiv - Machine Learning · 4 min · about 2 months ago

[2509.24385] Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy

Abstract page for arXiv paper 2509.24385: Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy

arXiv - AI · 4 min · about 2 months ago
