Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

How strongly do you believe LLM judges for ML papers? [D]

I'm curious about your thoughts on these, as far as I've seen most of the comments are nitpicking about "missing ablations" while some co...

Reddit - Machine Learning · 1 min ·

Google just dropped "Deep Research Max" — We are officially entering the era of Autonomous Agents. RIP to the "Junior Analyst"?

The shift from "Chatbot" to "Agent" just hit warp speed. Google’s release of Deep Research Max isn't just another incremental update; it’...

Reddit - Artificial Intelligence · 1 min ·

Built a prompt injection proxy that beats OpenAI Moderation and LlamaGuard — try it in 30 seconds without leaving this

Built Arc Gate — sits in front of any OpenAI-compatible endpoint and blocks prompt injection before it reaches your model. Just change yo...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2603.03897] IROSA: Interactive Robot Skill Adaptation using Natural Language

arXiv - AI · 3 min ·
[2603.03881] On the Suitability of LLM-Driven Agents for Dark Pattern Audits

arXiv - AI · 4 min ·
[2603.03336] Prompt-Dependent Ranking of Large Language Models with Uncertainty Quantification

arXiv - Machine Learning · 4 min ·
[2603.03310] Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

arXiv - Machine Learning · 3 min ·
[2603.03823] SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

arXiv - AI · 3 min ·
[2603.03790] T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning

arXiv - AI · 4 min ·
[2603.04378] Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

arXiv - AI · 3 min ·
[2603.04355] Efficient Refusal Ablation in LLM through Optimal Transport

arXiv - AI · 4 min ·
[2603.04354] Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading

arXiv - Machine Learning · 3 min ·
[2603.03752] Confidence-Calibrated Small-Large Language Model Collaboration for Cost-Efficient Reasoning

arXiv - AI · 3 min ·
[2603.04300] LUMINA: Foundation Models for Topology Transferable ACOPF

arXiv - Machine Learning · 3 min ·
[2603.03739] PROSPECT: Unified Streaming Vision-Language Navigation via Semantic--Spatial Fusion and Latent Predictive Representation

arXiv - AI · 4 min ·
[2603.03727] Understanding Parents' Desires in Moderating Children's Interactions with GenAI Chatbots through LLM-Generated Probes

arXiv - AI · 3 min ·
[2603.04276] Causality Elicitation from Large Language Models

arXiv - AI · 3 min ·
[2603.04142] A Multi-Agent Framework for Interpreting Multivariate Physiological Time Series

arXiv - Machine Learning · 4 min ·
[2603.03681] EvoPrune: Early-Stage Visual Token Pruning for Efficient MLLMs

arXiv - AI · 3 min ·
[2603.03677] MIND: Unified Inquiry and Diagnosis RL with Criteria Grounded Clinical Supports for Psychiatric Consultation

arXiv - AI · 4 min ·
[2603.04135] Unbiased Dynamic Pruning for Efficient Group-Based Policy Optimization

arXiv - AI · 4 min ·
[2603.03637] Image-based Prompt Injection: Hijacking Multimodal LLMs through Visually Embedded Adversarial Instructions

arXiv - AI · 3 min ·
[2603.03633] Goal-Driven Risk Assessment for LLM-Powered Systems: A Healthcare Case Study

arXiv - AI · 4 min ·