Machine Learning

ML algorithms, training, and inference

Top This Week

Llms

The loss curve said tie. The judges said otherwise. Seeking replication for an early LLM training result [R]

TL;DR - I've written two novel functions that shape the training signal for LLMs. Early tests show people prefer responses from models tr...

Reddit - Machine Learning · 1 min ·
Machine Learning

Fast experiment on T4 GPU. Self-play training on Dark Hex (Colab notebook) [P]

Last week I ran a fun experiment on Dark Hex. Here's a visualization of two iterations (1800 vs 1900) of the agent playing against each other ...

Reddit - Machine Learning · 1 min ·
Machine Learning

Dynamic batching for Encoder-Decoder MT training or generation when long sequence caps the batch size [P]

I built a small PyTorch sampler called dynabatch after facing this specific batching issue while fine-tuning an NLLB-200 600M model. Train...

Reddit - Machine Learning · 1 min ·

All Content

Machine Learning

[2604.04528] Receding-Horizon Control via Drifting Models

Abstract page for arXiv paper 2604.04528: Receding-Horizon Control via Drifting Models

arXiv - AI · 3 min ·
Llms

[2604.04482] Scalable and Explainable Learner-Video Interaction Prediction using Multimodal Large Language Models

Abstract page for arXiv paper 2604.04482: Scalable and Explainable Learner-Video Interaction Prediction using Multimodal Large Language M...

arXiv - AI · 3 min ·
Llms

[2604.04468] What Makes a Sale? Rethinking End-to-End Seller--Buyer Retail Dynamics with LLM Agents

Abstract page for arXiv paper 2604.04468: What Makes a Sale? Rethinking End-to-End Seller--Buyer Retail Dynamics with LLM Agents

arXiv - AI · 3 min ·
Machine Learning

[2604.04448] PSY-STEP: Structuring Therapeutic Targets and Action Sequences for Proactive Counseling Dialogue Systems

Abstract page for arXiv paper 2604.04448: PSY-STEP: Structuring Therapeutic Targets and Action Sequences for Proactive Counseling Dialogu...

arXiv - AI · 3 min ·
Llms

[2604.04403] MolDA: Molecular Understanding and Generation via Large Language Diffusion Model

Abstract page for arXiv paper 2604.04403: MolDA: Molecular Understanding and Generation via Large Language Diffusion Model

arXiv - AI · 3 min ·
Llms

[2604.04383] Optimizing Service Operations via LLM-Powered Multi-Agent Simulation

Abstract page for arXiv paper 2604.04383: Optimizing Service Operations via LLM-Powered Multi-Agent Simulation

arXiv - AI · 3 min ·
Machine Learning

[2604.04344] Domain-Contextualized Inference: A Computable Graph Architecture for Explicit-Domain Reasoning

Abstract page for arXiv paper 2604.04344: Domain-Contextualized Inference: A Computable Graph Architecture for Explicit-Domain Reasoning

arXiv - AI · 3 min ·
Llms

[2604.04297] PanLUNA: An Efficient and Robust Query-Unified Multimodal Model for Edge Biosignal Intelligence

Abstract page for arXiv paper 2604.04297: PanLUNA: An Efficient and Robust Query-Unified Multimodal Model for Edge Biosignal Intelligence

arXiv - AI · 3 min ·
Machine Learning

[2604.04281] Preservation Is Not Enough for Width Growth: Regime-Sensitive Selection of Dense LM Warm Starts

Abstract page for arXiv paper 2604.04281: Preservation Is Not Enough for Width Growth: Regime-Sensitive Selection of Dense LM Warm Starts

arXiv - AI · 3 min ·
Llms

[2604.04274] InferenceEvolve: Towards Automated Causal Effect Estimators through Self-Evolving AI

Abstract page for arXiv paper 2604.04274: InferenceEvolve: Towards Automated Causal Effect Estimators through Self-Evolving AI

arXiv - AI · 3 min ·
Llms

[2604.04220] TimeSeek: Temporal Reliability of Agentic Forecasters

Abstract page for arXiv paper 2604.04220: TimeSeek: Temporal Reliability of Agentic Forecasters

arXiv - AI · 3 min ·
Llms

[2604.04190] Schema-Aware Planning and Hybrid Knowledge Toolset for Reliable Knowledge Graph Triple Verification

Abstract page for arXiv paper 2604.04190: Schema-Aware Planning and Hybrid Knowledge Toolset for Reliable Knowledge Graph Triple Verifica...

arXiv - AI · 4 min ·
Llms

[2604.04182] Comparative reversal learning reveals rigid adaptation in LLMs under non-stationary uncertainty

Abstract page for arXiv paper 2604.04182: Comparative reversal learning reveals rigid adaptation in LLMs under non-stationary uncertainty

arXiv - AI · 3 min ·
Machine Learning

[2604.04171] A Model of Understanding in Deep Learning Systems

Abstract page for arXiv paper 2604.04171: A Model of Understanding in Deep Learning Systems

arXiv - AI · 3 min ·
Llms

[2604.04157] Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents

Abstract page for arXiv paper 2604.04157: Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents

arXiv - AI · 4 min ·
Llms

[2604.04145] Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting

Abstract page for arXiv paper 2604.04145: Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting

arXiv - AI · 4 min ·
Llms

[2604.04131] Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents

Abstract page for arXiv paper 2604.04131: Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents

arXiv - AI · 3 min ·
Machine Learning

[2604.04106] InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories

Abstract page for arXiv paper 2604.04106: InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories

arXiv - AI · 3 min ·
Machine Learning

[2604.03976] Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

Abstract page for arXiv paper 2604.03976: Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

arXiv - AI · 4 min ·
Llms

[2604.03898] LLM-Agent-based Social Simulation for Attitude Diffusion

Abstract page for arXiv paper 2604.03898: LLM-Agent-based Social Simulation for Attitude Diffusion

arXiv - AI · 3 min ·