Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

LLMs

I Cut Claude API Costs by 50% Using This Self Modifying Agentic System

I've been developing a self-modifying AI agent system that effectively cuts my Claude API usage in half. Claude thinks and then I basical...

Reddit - Artificial Intelligence · 1 min ·
LLMs

Sentient OS: a custom on-device vision LLM that understands your entire digital life (every screenshot, note, file, email...), while your device charges overnight. Talk to your data, get proactive reminders, and explore knowledge graphs!

99% of "AI" apps are just GPT wrappers that pipe your data to cloud LLMs and call it a product. No one's ever created an intelligence lay...

Reddit - Artificial Intelligence · 1 min ·
LLMs

What to build while we still have access to cheap AI?

AI companies are subsidizing access the same way Uber subsidized rides and AWS subsidized compute in the early days - burning cash to gra...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2510.01051] GEM: A Gym for Agentic LLMs
LLMs

Abstract page for arXiv paper 2510.01051: GEM: A Gym for Agentic LLMs

arXiv - Machine Learning · 4 min ·
[2510.00819] Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning
LLMs

Abstract page for arXiv paper 2510.00819: Stabilizing Policy Gradients for Sample-Efficient Reinforcement Learning in LLM Reasoning

arXiv - Machine Learning · 4 min ·
[2509.25678] Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized Mixture-of-Experts
LLMs

Abstract page for arXiv paper 2509.25678: Massively Multimodal Foundation Models: A Framework for Capturing Interactions with Specialized...

arXiv - Machine Learning · 4 min ·
[2510.00041] Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness
LLMs

Abstract page for arXiv paper 2510.00041: Culture In a Frame: C$^3$B as a Comic-Based Benchmark for Multimodal Culturally Awareness

arXiv - AI · 4 min ·
[2509.26601] MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47 Languages
LLMs

Abstract page for arXiv paper 2509.26601: MENLO: From Preferences to Proficiency -- Evaluating and Modeling Native-like Quality Across 47...

arXiv - Machine Learning · 4 min ·
[2509.26432] AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size
LLMs

Abstract page for arXiv paper 2509.26432: AdaBlock-dLLM: Semantic-Aware Diffusion LLM Inference via Adaptive Block Size

arXiv - Machine Learning · 4 min ·
[2509.26346] EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing
LLMs

Abstract page for arXiv paper 2509.26346: EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

arXiv - AI · 4 min ·
[2509.24198] Negative Pre-activations Differentiate Syntax
LLMs

Abstract page for arXiv paper 2509.24198: Negative Pre-activations Differentiate Syntax

arXiv - Machine Learning · 4 min ·
[2509.26324] COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models
LLMs

Abstract page for arXiv paper 2509.26324: COMRES-VLM: Coordinated Multi-Robot Exploration and Search using Vision Language Models

arXiv - AI · 4 min ·
[2509.23365] Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought
LLMs

Abstract page for arXiv paper 2509.23365: Emergence of Superposition: Unveiling the Training Dynamics of Chain of Continuous Thought

arXiv - Machine Learning · 4 min ·
[2509.25837] Distillation of Large Language Models via Concrete Score Matching
LLMs

Abstract page for arXiv paper 2509.25837: Distillation of Large Language Models via Concrete Score Matching

arXiv - Machine Learning · 4 min ·
[2509.25532] Calibrating Verbalized Confidence with Self-Generated Distractors
LLMs

Abstract page for arXiv paper 2509.25532: Calibrating Verbalized Confidence with Self-Generated Distractors

arXiv - AI · 4 min ·
[2509.25390] SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs
LLMs

Abstract page for arXiv paper 2509.25390: SpinBench: Perspective and Rotation as a Lens on Spatial Reasoning in VLMs

arXiv - AI · 4 min ·
[2509.22957] Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas
LLMs

Abstract page for arXiv paper 2509.22957: Doubly-Robust LLM-as-a-Judge: Externally Valid Estimation with Imperfect Personas

arXiv - Machine Learning · 4 min ·
[2509.25175] EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering
LLMs

Abstract page for arXiv paper 2509.25175: EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

arXiv - AI · 3 min ·
[2509.25087] Scaling with Collapse: Efficient and Predictable Training of LLM Families
LLMs

Abstract page for arXiv paper 2509.25087: Scaling with Collapse: Efficient and Predictable Training of LLM Families

arXiv - Machine Learning · 4 min ·
[2509.24385] Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy
LLMs

Abstract page for arXiv paper 2509.24385: Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy

arXiv - AI · 4 min ·
[2509.24282] SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents
LLMs

Abstract page for arXiv paper 2509.24282: SimuHome: A Temporal- and Environment-Aware Benchmark for Smart Home LLM Agents

arXiv - AI · 4 min ·
[2509.24245] Prompt and Parameter Co-Optimization for Large Language Models
LLMs

Abstract page for arXiv paper 2509.24245: Prompt and Parameter Co-Optimization for Large Language Models

arXiv - AI · 4 min ·
[2509.24203] Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends
LLMs

Abstract page for arXiv paper 2509.24203: Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRP...

arXiv - Machine Learning · 4 min ·