Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

Llms

How strongly do you believe the LLM judges for ML papers? [D]

I'm curious about your thoughts on these, as far as I've seen most of the comments are nitpicking about "missing ablations" while some co...

Reddit - Machine Learning · 1 min · about 2 hours ago

Llms

Google just dropped "Deep Research Max" — We are officially entering the era of Autonomous Agents. RIP to the "Junior Analyst"?

The shift from "Chatbot" to "Agent" just hit warp speed. Google’s release of Deep Research Max isn't just another incremental update; it’...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

Llms

Built a prompt injection proxy that beats OpenAI Moderation and LlamaGuard — try it in 30 seconds without leaving this

Built Arc Gate — sits in front of any OpenAI-compatible endpoint and blocks prompt injection before it reaches your model. Just change yo...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

All Content

Llms

[2603.03897] IROSA: Interactive Robot Skill Adaptation using Natural Language

Abstract page for arXiv paper 2603.03897: IROSA: Interactive Robot Skill Adaptation using Natural Language

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03881] On the Suitability of LLM-Driven Agents for Dark Pattern Audits

Abstract page for arXiv paper 2603.03881: On the Suitability of LLM-Driven Agents for Dark Pattern Audits

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.03336] Prompt-Dependent Ranking of Large Language Models with Uncertainty Quantification

Abstract page for arXiv paper 2603.03336: Prompt-Dependent Ranking of Large Language Models with Uncertainty Quantification

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2603.03310] Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

Abstract page for arXiv paper 2603.03310: Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2603.03823] SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

Abstract page for arXiv paper 2603.03823: SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03790] T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning

Abstract page for arXiv paper 2603.03790: T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Re...

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04378] Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

Abstract page for arXiv paper 2603.04378: Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04355] Efficient Refusal Ablation in LLM through Optimal Transport

Abstract page for arXiv paper 2603.04355: Efficient Refusal Ablation in LLM through Optimal Transport

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04354] Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading

Abstract page for arXiv paper 2603.04354: Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2603.03752] Confidence-Calibrated Small-Large Language Model Collaboration for Cost-Efficient Reasoning

Abstract page for arXiv paper 2603.03752: Confidence-Calibrated Small-Large Language Model Collaboration for Cost-Efficient Reasoning

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04300] LUMINA: Foundation Models for Topology Transferable ACOPF

Abstract page for arXiv paper 2603.04300: LUMINA: Foundation Models for Topology Transferable ACOPF

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2603.03739] PROSPECT: Unified Streaming Vision-Language Navigation via Semantic-Spatial Fusion and Latent Predictive Representation

Abstract page for arXiv paper 2603.03739: PROSPECT: Unified Streaming Vision-Language Navigation via Semantic-Spatial Fusion and Latent ...

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.03727] Understanding Parents' Desires in Moderating Children's Interactions with GenAI Chatbots through LLM-Generated Probes

Abstract page for arXiv paper 2603.03727: Understanding Parents' Desires in Moderating Children's Interactions with GenAI Chatbots throug...

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04276] Causality Elicitation from Large Language Models

Abstract page for arXiv paper 2603.04276: Causality Elicitation from Large Language Models

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04142] A Multi-Agent Framework for Interpreting Multivariate Physiological Time Series

Abstract page for arXiv paper 2603.04142: A Multi-Agent Framework for Interpreting Multivariate Physiological Time Series

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2603.03681] EvoPrune: Early-Stage Visual Token Pruning for Efficient MLLMs

Abstract page for arXiv paper 2603.03681: EvoPrune: Early-Stage Visual Token Pruning for Efficient MLLMs

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03677] MIND: Unified Inquiry and Diagnosis RL with Criteria Grounded Clinical Supports for Psychiatric Consultation

Abstract page for arXiv paper 2603.03677: MIND: Unified Inquiry and Diagnosis RL with Criteria Grounded Clinical Supports for Psychiatric...

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04135] Unbiased Dynamic Pruning for Efficient Group-Based Policy Optimization

Abstract page for arXiv paper 2603.04135: Unbiased Dynamic Pruning for Efficient Group-Based Policy Optimization

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.03637] Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial Instructions

Abstract page for arXiv paper 2603.03637: Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial I...

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03633] Goal-Driven Risk Assessment for LLM-Powered Systems: A Healthcare Case Study

Abstract page for arXiv paper 2603.03633: Goal-Driven Risk Assessment for LLM-Powered Systems: A Healthcare Case Study

arXiv - AI · 4 min · about 2 months ago

