Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Mira Murati’s deposition pulled back the curtain on Sam Altman’s ouster | The Verge

Thanks to Musk v. Altman, the public is getting a concrete look at details of Sam Altman’s ouster from OpenAI, much of it centered on for...

The Verge - AI · 11 min ·

Diffusion for generating/editing ASTs? [D]

I’m not a machine learning expert or anything, but I do enjoy learning about how it all works. I’ve noticed that one of the main limitati...

Reddit - Machine Learning · 1 min ·
ChatGPT’s ‘Trusted Contact’ will alert loved ones of safety concerns | The Verge

OpenAI is launching an optional safety feature for ChatGPT that allows adult users to assign an emergency contact for mental health and s...

The Verge - AI · 4 min ·

All Content

[2510.02209] StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

Abstract page for arXiv paper 2510.02209: StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

arXiv - Machine Learning · 4 min ·
[2510.03253] Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents

Abstract page for arXiv paper 2510.03253: Solving the Granularity Mismatch: Hierarchical Preference Learning for Long-Horizon LLM Agents

arXiv - Machine Learning · 4 min ·
[2510.02999] Untargeted Jailbreak Attack

Abstract page for arXiv paper 2510.02999: Untargeted Jailbreak Attack

arXiv - AI · 4 min ·
[2510.02245] ExGRPO: Learning to Reason from Experience

Abstract page for arXiv paper 2510.02245: ExGRPO: Learning to Reason from Experience

arXiv - Machine Learning · 4 min ·
[2510.01051] GEM: A Gym for Agentic LLMs

Abstract page for arXiv paper 2510.01051: GEM: A Gym for Agentic LLMs

arXiv - Machine Learning · 4 min ·
[2510.00819] Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning

Abstract page for arXiv paper 2510.00819: Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning

arXiv - Machine Learning · 4 min ·
[2509.25678] Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized Mixture-of-Experts

Abstract page for arXiv paper 2509.25678: Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized...

arXiv - Machine Learning · 4 min ·
[2510.00041] Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness

Abstract page for arXiv paper 2510.00041: Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness

arXiv - AI · 4 min ·
[2509.26601] MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47 Languages

Abstract page for arXiv paper 2509.26601: MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47...

arXiv - Machine Learning · 4 min ·
[2509.26432] AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size

Abstract page for arXiv paper 2509.26432: AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size

arXiv - Machine Learning · 4 min ·
[2509.26346] EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Abstract page for arXiv paper 2509.26346: EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

arXiv - AI · 4 min ·
[2509.24198] Negative Pre-activations Differentiate Syntax

Abstract page for arXiv paper 2509.24198: Negative Pre-activations Differentiate Syntax

arXiv - Machine Learning · 4 min ·
[2509.26324] COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models

Abstract page for arXiv paper 2509.26324: COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models

arXiv - AI · 4 min ·
[2509.23365] Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

Abstract page for arXiv paper 2509.23365: Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

arXiv - Machine Learning · 4 min ·
[2509.25837] Distillation of Large Language Models via Concrete Score Matching

Abstract page for arXiv paper 2509.25837: Distillation of Large Language Models via Concrete Score Matching

arXiv - Machine Learning · 4 min ·
[2509.25532] Calibrating Verbalized Confidence with Self-Generated Distractors

Abstract page for arXiv paper 2509.25532: Calibrating Verbalized Confidence with Self-Generated Distractors

arXiv - AI · 4 min ·
[2509.25390] SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs

Abstract page for arXiv paper 2509.25390: SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs

arXiv - AI · 4 min ·
[2509.22957] Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas

Abstract page for arXiv paper 2509.22957: Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas

arXiv - Machine Learning · 4 min ·
[2509.25175] EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

Abstract page for arXiv paper 2509.25175: EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

arXiv - AI · 3 min ·
[2509.25087] Scaling with Collapse: Efficient and Predictable Training of LLM Families

Abstract page for arXiv paper 2509.25087: Scaling with Collapse: Efficient and Predictable Training of LLM Families

arXiv - Machine Learning · 4 min ·