Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

Llms

I Cut Claude API Costs by 50% Using This Self Modifying Agentic System

I've been developing a self-modifying AI agent system that effectively cuts my Claude API usage in half. Claude thinks and then I basical...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

Llms

Sentient OS: a custom on-device vision LLM that understands your entire digital life (every screenshot, note, file, email...), while your device charges overnight. Talk to your data, get proactive reminders, and explore knowledge graphs!

99% of "AI" apps are just GPT wrappers that pipe your data to cloud LLMs and call it a product. No one's ever created an intelligence lay...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

Llms

What to build while we still have access to cheap AI?

AI companies are subsidizing access the same way Uber subsidized rides and AWS subsidized compute in the early days - burning cash to gra...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

All Content

Llms

[2510.01051] GEM: A Gym for Agentic LLMs

Abstract page for arXiv paper 2510.01051: GEM: A Gym for Agentic LLMs

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2510.00819] Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning

Abstract page for arXiv paper 2510.00819: Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.25678] Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized Mixture-of-Experts

Abstract page for arXiv paper 2509.25678: Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized...

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2510.00041] Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness

Abstract page for arXiv paper 2510.00041: Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.26601] MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47 Languages

Abstract page for arXiv paper 2509.26601: MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47...

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.26432] AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size

Abstract page for arXiv paper 2509.26432: AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.26346] EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Abstract page for arXiv paper 2509.26346: EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.24198] Negative Pre-activations Differentiate Syntax

Abstract page for arXiv paper 2509.24198: Negative Pre-activations Differentiate Syntax

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.26324] COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models

Abstract page for arXiv paper 2509.26324: COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.23365] Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

Abstract page for arXiv paper 2509.23365: Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.25837] Distillation of Large Language Models via Concrete Score Matching

Abstract page for arXiv paper 2509.25837: Distillation of Large Language Models via Concrete Score Matching

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.25532] Calibrating Verbalized Confidence with Self-Generated Distractors

Abstract page for arXiv paper 2509.25532: Calibrating Verbalized Confidence with Self-Generated Distractors

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.25390] SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs

Abstract page for arXiv paper 2509.25390: SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.22957] Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas

Abstract page for arXiv paper 2509.22957: Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.25175] EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

Abstract page for arXiv paper 2509.25175: EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

arXiv - AI · 3 min · about 2 months ago

Llms

[2509.25087] Scaling with Collapse: Efficient and Predictable Training of LLM Families

Abstract page for arXiv paper 2509.25087: Scaling with Collapse: Efficient and Predictable Training of LLM Families

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.24385] Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy

Abstract page for arXiv paper 2509.24385: Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.24282] SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents

Abstract page for arXiv paper 2509.24282: SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.24245] Prompt and Parameter Co-Optimization for Large Language Models

Abstract page for arXiv paper 2509.24245: Prompt and Parameter Co-Optimization for Large Language Models

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.24203] Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends

Abstract page for arXiv paper 2509.24203: Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRP...

arXiv - Machine Learning · 4 min · about 2 months ago

