Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

OpenAI expands its cyber defense program with GPT-5.4-Cyber for vetted researchers

The company is scaling its Trusted Access for Cyber (TAC) program to thousands of verified individual defenders and hundreds of teams res...

Reddit - Artificial Intelligence · 1 min · 9 minutes ago

Gemini Robotics-ER 1.6: Powering real-world robotics tasks through enhanced embodied reasoning

Gemini Robotics-ER 1.6 is a significant upgrade to the reasoning-first model that enables robots to understand their environments with un...

Reddit - Artificial Intelligence · 1 min · 9 minutes ago

[2603.10652] Are Video Reasoning Models Ready to Go Outside?

Abstract page for arXiv paper 2603.10652: Are Video Reasoning Models Ready to Go Outside?

arXiv - AI · 4 min · about 1 hour ago

All Content

[2603.04002] Discriminative Perception via Anchored Description for Reasoning Segmentation

Abstract page for arXiv paper 2603.04002: Discriminative Perception via Anchored Description for Reasoning Segmentation

arXiv - AI · 4 min · about 1 month ago

[2603.03589] stratum: A System Infrastructure for Massive Agent-Centric ML Workloads

Abstract page for arXiv paper 2603.03589: stratum: A System Infrastructure for Massive Agent-Centric ML Workloads

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.03983] GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery

Abstract page for arXiv paper 2603.03983: GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery

arXiv - AI · 3 min · about 1 month ago

[2603.03583] ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

Abstract page for arXiv paper 2603.03583: ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.03964] BLOCK: An Open-Source Bi-Stage MLLM Character-to-Skin Pipeline for Minecraft

Abstract page for arXiv paper 2603.03964: BLOCK: An Open-Source Bi-Stage MLLM Character-to-Skin Pipeline for Minecraft

arXiv - AI · 3 min · about 1 month ago

[2603.03915] Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects

Abstract page for arXiv paper 2603.03915: Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personalit...

arXiv - AI · 3 min · about 1 month ago

[2603.03897] IROSA: Interactive Robot Skill Adaptation using Natural Language

Abstract page for arXiv paper 2603.03897: IROSA: Interactive Robot Skill Adaptation using Natural Language

arXiv - AI · 3 min · about 1 month ago

[2603.03881] On the Suitability of LLM-Driven Agents for Dark Pattern Audits

Abstract page for arXiv paper 2603.03881: On the Suitability of LLM-Driven Agents for Dark Pattern Audits

arXiv - AI · 4 min · about 1 month ago

[2603.03336] Prompt-Dependent Ranking of Large Language Models with Uncertainty Quantification

Abstract page for arXiv paper 2603.03336: Prompt-Dependent Ranking of Large Language Models with Uncertainty Quantification

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.03310] Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

Abstract page for arXiv paper 2603.03310: Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.03823] SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

Abstract page for arXiv paper 2603.03823: SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

arXiv - AI · 3 min · about 1 month ago

[2603.03790] T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning

Abstract page for arXiv paper 2603.03790: T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Re...

arXiv - AI · 4 min · about 1 month ago

[2603.04378] Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

Abstract page for arXiv paper 2603.04378: Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

arXiv - AI · 3 min · about 1 month ago

[2603.04355] Efficient Refusal Ablation in LLM through Optimal Transport

Abstract page for arXiv paper 2603.04355: Efficient Refusal Ablation in LLM through Optimal Transport

arXiv - AI · 4 min · about 1 month ago

[2603.04354] Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading

Abstract page for arXiv paper 2603.04354: Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.03752] Confidence-Calibrated Small-Large Language Model Collaboration for Cost-Efficient Reasoning

Abstract page for arXiv paper 2603.03752: Confidence-Calibrated Small-Large Language Model Collaboration for Cost-Efficient Reasoning

arXiv - AI · 3 min · about 1 month ago

[2603.04300] LUMINA: Foundation Models for Topology Transferable ACOPF

Abstract page for arXiv paper 2603.04300: LUMINA: Foundation Models for Topology Transferable ACOPF

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.03739] PROSPECT: Unified Streaming Vision-Language Navigation via Semantic-Spatial Fusion and Latent Predictive Representation

Abstract page for arXiv paper 2603.03739: PROSPECT: Unified Streaming Vision-Language Navigation via Semantic-Spatial Fusion and Latent ...

arXiv - AI · 4 min · about 1 month ago

[2603.03727] Understanding Parents' Desires in Moderating Children's Interactions with GenAI Chatbots through LLM-Generated Probes

Abstract page for arXiv paper 2603.03727: Understanding Parents' Desires in Moderating Children's Interactions with GenAI Chatbots throug...

arXiv - AI · 3 min · about 1 month ago

[2603.04276] Causality Elicitation from Large Language Models

Abstract page for arXiv paper 2603.04276: Causality Elicitation from Large Language Models

arXiv - AI · 3 min · about 1 month ago
