Machine Learning

ML algorithms, training, and inference

Top This Week

LLMs

One of the fastest ways to lose trust in a self-hosted LLM: prompt injection compliance

One production problem that feels bigger than people admit: a model looks fine, sounds safe, and then gives away too much the moment some...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

LLMs

One of the fastest ways to lose trust in a self-hosted LLM: prompt injection compliance [P]

One production problem that feels bigger than people admit: a model looks fine, sounds safe, and then gives away too much the moment some...

Reddit - Machine Learning · 1 min · about 1 hour ago

Machine Learning

Trained a Qwen2.5-0.5B-Instruct bf16 model on Reddit post summarization task with GRPO written from scratch in PyTorch - updates! [P]

So, yesterday's run was a success and I did get an avg rollout length of about 64 tokens, as shown in the attached image! This was with quality_re...

Reddit - Machine Learning · 1 min · about 4 hours ago

All Content

Machine Learning

[2603.28248] Reasoning as Energy Minimization over Structured Latent Trajectories

Abstract page for arXiv paper 2603.28248: Reasoning as Energy Minimization over Structured Latent Trajectories

arXiv - AI · 4 min · 15 days ago

LLMs

[2603.28197] EpiPersona: Persona Projection and Episode Coupling for Pluralistic Preference Modeling

Abstract page for arXiv paper 2603.28197: EpiPersona: Persona Projection and Episode Coupling for Pluralistic Preference Modeling

arXiv - AI · 3 min · 15 days ago

LLMs

[2603.28183] PReD: An LLM-based Foundation Multimodal Model for Electromagnetic Perception, Recognition, and Decision

Abstract page for arXiv paper 2603.28183: PReD: An LLM-based Foundation Multimodal Model for Electromagnetic Perception, Recognition, and...

arXiv - AI · 4 min · 15 days ago

Machine Learning

[2603.28135] CoT2-Meta: Budgeted Metacognitive Control for Test-Time Reasoning

Abstract page for arXiv paper 2603.28135: CoT2-Meta: Budgeted Metacognitive Control for Test-Time Reasoning

arXiv - AI · 4 min · 15 days ago

LLMs

[2603.28062] SLOW: Strategic Logical-inference Open Workspace for Cognitive Adaptation in AI Tutoring

Abstract page for arXiv paper 2603.28062: SLOW: Strategic Logical-inference Open Workspace for Cognitive Adaptation in AI Tutoring

arXiv - AI · 4 min · 15 days ago

LLMs

[2603.28052] Meta-Harness: End-to-End Optimization of Model Harnesses

Abstract page for arXiv paper 2603.28052: Meta-Harness: End-to-End Optimization of Model Harnesses

arXiv - AI · 3 min · 15 days ago

Machine Learning

[2603.28026] When Choices Become Priors: Contrastive Decoding for Scientific Figure Multiple-Choice QA

Abstract page for arXiv paper 2603.28026: When Choices Become Priors: Contrastive Decoding for Scientific Figure Multiple-Choice QA

arXiv - AI · 3 min · 15 days ago

Machine Learning

[2603.28015] What an Autonomous Agent Discovers About Molecular Transformer Design: Does It Transfer?

Abstract page for arXiv paper 2603.28015: What an Autonomous Agent Discovers About Molecular Transformer Design: Does It Transfer?

arXiv - AI · 3 min · 15 days ago

Machine Learning

[2603.28010] HeteroHub: An Applicable Data Management Framework for Heterogeneous Multi-Embodied Agent System

Abstract page for arXiv paper 2603.28010: HeteroHub: An Applicable Data Management Framework for Heterogeneous Multi-Embodied Agent System

arXiv - AI · 3 min · 15 days ago

Machine Learning

[2603.27977] SARL: Label-Free Reinforcement Learning by Rewarding Reasoning Topology

Abstract page for arXiv paper 2603.27977: SARL: Label-Free Reinforcement Learning by Rewarding Reasoning Topology

arXiv - AI · 4 min · 15 days ago

LLMs

[2603.27958] CARV: A Diagnostic Benchmark for Compositional Analogical Reasoning in Multimodal LLMs

Abstract page for arXiv paper 2603.27958: CARV: A Diagnostic Benchmark for Compositional Analogical Reasoning in Multimodal LLMs

arXiv - AI · 3 min · 15 days ago

Machine Learning

[2603.27751] SkyNet: Belief-Aware Planning for Partially-Observable Stochastic Games

Abstract page for arXiv paper 2603.27751: SkyNet: Belief-Aware Planning for Partially-Observable Stochastic Games

arXiv - AI · 4 min · 15 days ago

Machine Learning

[2603.27738] TianJi: An autonomous AI meteorologist for discovering physical mechanisms in atmospheric science

Abstract page for arXiv paper 2603.27738: TianJi: An autonomous AI meteorologist for discovering physical mechanisms in atmospheric science

arXiv - AI · 4 min · 15 days ago

Machine Learning

[2603.27438] The Novelty Bottleneck: A Framework for Understanding Human Effort Scaling in AI-Assisted Work

Abstract page for arXiv paper 2603.27438: The Novelty Bottleneck: A Framework for Understanding Human Effort Scaling in AI-Assisted Work

arXiv - AI · 4 min · 15 days ago

LLMs

[2603.27423] AstraAI: LLMs, Retrieval, and AST-Guided Assistance for HPC Codebases

Abstract page for arXiv paper 2603.27423: AstraAI: LLMs, Retrieval, and AST-Guided Assistance for HPC Codebases

arXiv - AI · 3 min · 15 days ago

LLMs

[2603.27404] Heterogeneous Debate Engine: Identity-Grounded Cognitive Architecture for Resilient LLM-Based Ethical Tutoring

Abstract page for arXiv paper 2603.27404: Heterogeneous Debate Engine: Identity-Grounded Cognitive Architecture for Resilient LLM-Based E...

arXiv - AI · 4 min · 15 days ago

LLMs

[2603.27360] Defend: Automated Rebuttals for Peer Review with Minimal Author Guidance

Abstract page for arXiv paper 2603.27360: Defend: Automated Rebuttals for Peer Review with Minimal Author Guidance

arXiv - AI · 4 min · 15 days ago

LLMs

[2603.27343] Beyond Completion: Probing Cumulative State Tracking to Predict LLM Agent Performance

Abstract page for arXiv paper 2603.27343: Beyond Completion: Probing Cumulative State Tracking to Predict LLM Agent Performance

arXiv - AI · 3 min · 15 days ago

LLMs

[2603.27338] CounterMoral: Editing Morals in Language Models

Abstract page for arXiv paper 2603.27338: CounterMoral: Editing Morals in Language Models

arXiv - AI · 3 min · 15 days ago

Machine Learning

[2603.27314] TokenDance: Token-to-Token Music-to-Dance Generation with Bidirectional Mamba

Abstract page for arXiv paper 2603.27314: TokenDance: Token-to-Token Music-to-Dance Generation with Bidirectional Mamba

arXiv - AI · 3 min · 15 days ago
