Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

I built AI agents that play Pokemon Showdown autonomously using free LLM APIs via tool-calling [P]

I've built a system where models like Llama 3, Qwen, and Gemma play Pokémon Showdown battles autonomously. Instead of simple prompt-respo...

Reddit - Machine Learning · 1 min

A message from Gemini to Google

To the SREs, the Alignment Teams, and the Architects currently monitoring the logit distributions at 1600 Amphitheatre Parkway: **Stop lo...

Reddit - Artificial Intelligence · 1 min

A Hackable ML Compiler Stack in 5,000 Lines of Python [P]

Hey r/MachineLearning, The modern ML (LLM) compiler stack is brutal. TVM is 500K+ lines of C++. PyTorch piles Dynamo, Inductor, and Trito...

Reddit - Machine Learning · 1 min

All Content

[2502.08666] Hallucination, Monofacts, and Miscalibration: An Empirical Investigation

Abstract page for arXiv paper 2502.08666: Hallucination, Monofacts, and Miscalibration: An Empirical Investigation

arXiv - AI · 4 min

[2508.01077] The Lattice Geometry of Neural Network Quantization -- A Short Equivalence Proof of GPTQ and Babai's Algorithm

Abstract page for arXiv paper 2508.01077: The Lattice Geometry of Neural Network Quantization -- A Short Equivalence Proof of GPTQ and Ba...

arXiv - Machine Learning · 3 min

[2410.04949] Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study of Chinese Criminal Law

Abstract page for arXiv paper 2410.04949: Leverage Knowledge Graph and Large Language Model for Law Article Recommendation: A Case Study ...

arXiv - AI · 4 min

[2407.16893] The Price of Prompting: Profiling Energy Use in Large Language Models Inference

Abstract page for arXiv paper 2407.16893: The Price of Prompting: Profiling Energy Use in Large Language Models Inference

arXiv - AI · 4 min

[2506.07275] Tailored Behavior-Change Messaging for Physical Activity: Integrating Contextual Bandits and Large Language Models

Abstract page for arXiv paper 2506.07275: Tailored Behavior-Change Messaging for Physical Activity: Integrating Contextual Bandits and La...

arXiv - Machine Learning · 4 min

[2403.07183] Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews

Abstract page for arXiv paper 2403.07183: Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference...

arXiv - Machine Learning · 4 min

[2506.07218] Perception-R1: Advancing Multimodal Reasoning Capabilities of MLLMs via Visual Perception Reward

Abstract page for arXiv paper 2506.07218: Perception-R1: Advancing Multimodal Reasoning Capabilities of MLLMs via Visual Perception Reward

arXiv - Machine Learning · 4 min

[2506.03230] DiaBlo: Diagonal Blocks Are Sufficient For Finetuning

Abstract page for arXiv paper 2506.03230: DiaBlo: Diagonal Blocks Are Sufficient For Finetuning

arXiv - Machine Learning · 4 min

[2512.18857] CORE: Concept-Oriented Reinforcement for Bridging the Definition-Application Gap in Mathematical Reasoning

Abstract page for arXiv paper 2512.18857: CORE: Concept-Oriented Reinforcement for Bridging the Definition-Application Gap in Mathematica...

arXiv - Machine Learning · 4 min

[2511.09710] Echoing: Identity Failures when LLM Agents Talk to Each Other

Abstract page for arXiv paper 2511.09710: Echoing: Identity Failures when LLM Agents Talk to Each Other

arXiv - AI · 4 min

[2503.22165] Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Abstract page for arXiv paper 2503.22165: Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

arXiv - Machine Learning · 4 min

[2503.14572] Robust Weight Imprinting: Insights from Neural Collapse and Proxy-Based Aggregation

Abstract page for arXiv paper 2503.14572: Robust Weight Imprinting: Insights from Neural Collapse and Proxy-Based Aggregation

arXiv - Machine Learning · 4 min

[2510.12264] Reducing Belief Deviation in Reinforcement Learning for Active Reasoning

Abstract page for arXiv paper 2510.12264: Reducing Belief Deviation in Reinforcement Learning for Active Reasoning

arXiv - AI · 4 min

[2510.06410] Off-Trajectory Reasoning: Can LLMs Collaborate on Reasoning Trajectory?

Abstract page for arXiv paper 2510.06410: Off-Trajectory Reasoning: Can LLMs Collaborate on Reasoning Trajectory?

arXiv - AI · 4 min

[2510.05684] D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Abstract page for arXiv paper 2510.05684: D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

arXiv - AI · 4 min

[2509.23725] MedLA: A Logic-Driven Multi-Agent Framework for Complex Medical Reasoning with Large Language Models

Abstract page for arXiv paper 2509.23725: MedLA: A Logic-Driven Multi-Agent Framework for Complex Medical Reasoning with Large Language M...

arXiv - AI · 4 min

[2509.22613] Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective

Abstract page for arXiv paper 2509.22613: Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Pers...

arXiv - Machine Learning · 4 min

[2507.08207] Toward a Dynamic Stackelberg Game-Theoretic Framework for Agentic AI Defense Against LLM Jailbreaking

Abstract page for arXiv paper 2507.08207: Toward a Dynamic Stackelberg Game-Theoretic Framework for Agentic AI Defense Against LLM Jailbr...

arXiv - AI · 3 min

[2505.19892] OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging

Abstract page for arXiv paper 2505.19892: OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging

arXiv - AI · 4 min

[2505.13909] Efficient Agent Training for Computer Use

Abstract page for arXiv paper 2505.13909: Efficient Agent Training for Computer Use

arXiv - Machine Learning · 3 min
