Machine Learning

ML algorithms, training, and inference

Top This Week

LLMs

The loss curve said tie. The judges said otherwise. Seeking replication for an early LLM training result [R]

TL;DR - I've written two novel functions that shape the training signal for LLMs. Early tests show people prefer responses from models tr...

Reddit - Machine Learning · 1 min · 8 minutes ago

Machine Learning

Fast experiment on T4 GPU. Self play training on Dark Hex (Colab notebook) [P]

Last week I ran a fun experiment on Dark Hex. Here's a visualization of two iterations (1800 vs 1900) of the agent playing against each other ...

Reddit - Machine Learning · 1 min · 8 minutes ago

Machine Learning

Dynamic batching for Encoder-Decoder MT training or generation when long sequence caps the batch size [P]

I built a small PyTorch sampler called dynabatch after facing this specific batching issue while fine-tuning an NLLB-200 600M model. Train...

Reddit - Machine Learning · 1 min · about 1 hour ago

All Content

Machine Learning

[2604.04528] Receding-Horizon Control via Drifting Models

Abstract page for arXiv paper 2604.04528: Receding-Horizon Control via Drifting Models

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04482] Scalable and Explainable Learner-Video Interaction Prediction using Multimodal Large Language Models

Abstract page for arXiv paper 2604.04482: Scalable and Explainable Learner-Video Interaction Prediction using Multimodal Large Language M...

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04468] What Makes a Sale? Rethinking End-to-End Seller--Buyer Retail Dynamics with LLM Agents

Abstract page for arXiv paper 2604.04468: What Makes a Sale? Rethinking End-to-End Seller--Buyer Retail Dynamics with LLM Agents

arXiv - AI · 3 min · 21 days ago

Machine Learning

[2604.04448] PSY-STEP: Structuring Therapeutic Targets and Action Sequences for Proactive Counseling Dialogue Systems

Abstract page for arXiv paper 2604.04448: PSY-STEP: Structuring Therapeutic Targets and Action Sequences for Proactive Counseling Dialogu...

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04403] MolDA: Molecular Understanding and Generation via Large Language Diffusion Model

Abstract page for arXiv paper 2604.04403: MolDA: Molecular Understanding and Generation via Large Language Diffusion Model

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04383] Optimizing Service Operations via LLM-Powered Multi-Agent Simulation

Abstract page for arXiv paper 2604.04383: Optimizing Service Operations via LLM-Powered Multi-Agent Simulation

arXiv - AI · 3 min · 21 days ago

Machine Learning

[2604.04344] Domain-Contextualized Inference: A Computable Graph Architecture for Explicit-Domain Reasoning

Abstract page for arXiv paper 2604.04344: Domain-Contextualized Inference: A Computable Graph Architecture for Explicit-Domain Reasoning

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04297] PanLUNA: An Efficient and Robust Query-Unified Multimodal Model for Edge Biosignal Intelligence

Abstract page for arXiv paper 2604.04297: PanLUNA: An Efficient and Robust Query-Unified Multimodal Model for Edge Biosignal Intelligence

arXiv - AI · 3 min · 21 days ago

Machine Learning

[2604.04281] Preservation Is Not Enough for Width Growth: Regime-Sensitive Selection of Dense LM Warm Starts

Abstract page for arXiv paper 2604.04281: Preservation Is Not Enough for Width Growth: Regime-Sensitive Selection of Dense LM Warm Starts

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04274] InferenceEvolve: Towards Automated Causal Effect Estimators through Self-Evolving AI

Abstract page for arXiv paper 2604.04274: InferenceEvolve: Towards Automated Causal Effect Estimators through Self-Evolving AI

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04220] TimeSeek: Temporal Reliability of Agentic Forecasters

Abstract page for arXiv paper 2604.04220: TimeSeek: Temporal Reliability of Agentic Forecasters

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04190] Schema-Aware Planning and Hybrid Knowledge Toolset for Reliable Knowledge Graph Triple Verification

Abstract page for arXiv paper 2604.04190: Schema-Aware Planning and Hybrid Knowledge Toolset for Reliable Knowledge Graph Triple Verifica...

arXiv - AI · 4 min · 21 days ago

LLMs

[2604.04182] Comparative reversal learning reveals rigid adaptation in LLMs under non-stationary uncertainty

Abstract page for arXiv paper 2604.04182: Comparative reversal learning reveals rigid adaptation in LLMs under non-stationary uncertainty

arXiv - AI · 3 min · 21 days ago

Machine Learning

[2604.04171] A Model of Understanding in Deep Learning Systems

Abstract page for arXiv paper 2604.04171: A Model of Understanding in Deep Learning Systems

arXiv - AI · 3 min · 21 days ago

LLMs

[2604.04157] Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents

Abstract page for arXiv paper 2604.04157: Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents

arXiv - AI · 4 min · 21 days ago

LLMs

[2604.04145] Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting

Abstract page for arXiv paper 2604.04145: Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting

arXiv - AI · 4 min · 21 days ago

LLMs

[2604.04131] Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents

Abstract page for arXiv paper 2604.04131: Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents

arXiv - AI · 3 min · 21 days ago

Machine Learning

[2604.04106] InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories

Abstract page for arXiv paper 2604.04106: InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories

arXiv - AI · 3 min · 21 days ago

Machine Learning

[2604.03976] Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

Abstract page for arXiv paper 2604.03976: Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

arXiv - AI · 4 min · 21 days ago

LLMs

[2604.03898] LLM-Agent-based Social Simulation for Attitude Diffusion

Abstract page for arXiv paper 2604.03898: LLM-Agent-based Social Simulation for Attitude Diffusion

arXiv - AI · 3 min · 21 days ago

