Machine Learning

ML algorithms, training, and inference

Top This Week

LLMs

One of the fastest ways to lose trust in a self-hosted LLM: prompt injection compliance

One production problem that feels bigger than people admit: a model looks fine, sounds safe, and then gives away too much the moment some...

Reddit - Artificial Intelligence · 1 min ·
LLMs

One of the fastest ways to lose trust in a self-hosted LLM: prompt injection compliance [P]

One production problem that feels bigger than people admit: a model looks fine, sounds safe, and then gives away too much the moment some...

Reddit - Machine Learning · 1 min ·
Machine Learning

Trained a Qwen2.5-0.5B-Instruct bf16 model on Reddit post summarization task with GRPO written from scratch in PyTorch - updates! [P]

So, yesterday's run was a success and I did get an avg rollout length of about 64 tokens as attached in the image! This was with quality_re...

Reddit - Machine Learning · 1 min ·

All Content

[2603.28248] Reasoning as Energy Minimization over Structured Latent Trajectories
Machine Learning

arXiv - AI · 4 min ·
[2603.28197] EpiPersona: Persona Projection and Episode Coupling for Pluralistic Preference Modeling
LLMs

arXiv - AI · 3 min ·
[2603.28183] PReD: An LLM-based Foundation Multimodal Model for Electromagnetic Perception, Recognition, and Decision
LLMs

arXiv - AI · 4 min ·
[2603.28135] CoT2-Meta: Budgeted Metacognitive Control for Test-Time Reasoning
Machine Learning

arXiv - AI · 4 min ·
[2603.28062] SLOW: Strategic Logical-inference Open Workspace for Cognitive Adaptation in AI Tutoring
LLMs

arXiv - AI · 4 min ·
[2603.28052] Meta-Harness: End-to-End Optimization of Model Harnesses
LLMs

arXiv - AI · 3 min ·
[2603.28026] When Choices Become Priors: Contrastive Decoding for Scientific Figure Multiple-Choice QA
Machine Learning

arXiv - AI · 3 min ·
[2603.28015] What an Autonomous Agent Discovers About Molecular Transformer Design: Does It Transfer?
Machine Learning

arXiv - AI · 3 min ·
[2603.28010] HeteroHub: An Applicable Data Management Framework for Heterogeneous Multi-Embodied Agent System
Machine Learning

arXiv - AI · 3 min ·
[2603.27977] SARL: Label-Free Reinforcement Learning by Rewarding Reasoning Topology
Machine Learning

arXiv - AI · 4 min ·
[2603.27958] CARV: A Diagnostic Benchmark for Compositional Analogical Reasoning in Multimodal LLMs
LLMs

arXiv - AI · 3 min ·
[2603.27751] SkyNet: Belief-Aware Planning for Partially-Observable Stochastic Games
Machine Learning

arXiv - AI · 4 min ·
[2603.27738] TianJi: An autonomous AI meteorologist for discovering physical mechanisms in atmospheric science
Machine Learning

arXiv - AI · 4 min ·
[2603.27438] The Novelty Bottleneck: A Framework for Understanding Human Effort Scaling in AI-Assisted Work
Machine Learning

arXiv - AI · 4 min ·
[2603.27423] AstraAI: LLMs, Retrieval, and AST-Guided Assistance for HPC Codebases
LLMs

arXiv - AI · 3 min ·
[2603.27404] Heterogeneous Debate Engine: Identity-Grounded Cognitive Architecture for Resilient LLM-Based Ethical Tutoring
LLMs

arXiv - AI · 4 min ·
[2603.27360] Defend: Automated Rebuttals for Peer Review with Minimal Author Guidance
LLMs

arXiv - AI · 4 min ·
[2603.27343] Beyond Completion: Probing Cumulative State Tracking to Predict LLM Agent Performance
LLMs

arXiv - AI · 3 min ·
[2603.27338] CounterMoral: Editing Morals in Language Models
LLMs

arXiv - AI · 3 min ·
[2603.27314] TokenDance: Token-to-Token Music-to-Dance Generation with Bidirectional Mamba
Machine Learning

arXiv - AI · 3 min ·