Machine Learning

ML algorithms, training, and inference

Top This Week

LLMs

[R] Depth-first pruning transfers: GPT-2 → TinyLlama with stable gains and minimal loss

TL;DR: Removing the right layers (instead of shrinking all layers) makes transformer models ~8–12% smaller with only ~6–8% quality loss, ...

Reddit - Machine Learning · 1 min ·
LLMs

Built a training stability monitor that detects instability before your loss curve shows anything — open sourced the core today

Been working on a weight divergence trajectory curvature approach to detecting neural network training instability. Treats weight updates...

Reddit - Artificial Intelligence · 1 min ·
UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·

All Content

[2603.23974] Machine vision with small numbers of detected photons per inference
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.23971] The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More
LLMs

arXiv - Machine Learning · 4 min ·
[2603.23943] ChargeFlow: Flow-Matching Refinement of Charge-Conditioned Electron Densities
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.23937] Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development
LLMs

arXiv - Machine Learning · 4 min ·
[2603.23911] Self-Distillation for Multi-Token Prediction
LLMs

arXiv - Machine Learning · 3 min ·
[2603.23933] ORACLE: Orchestrate NPC Daily Activities using Contrastive Learning with Transformer-CVAE
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.23873] The DeepXube Software Package for Solving Pathfinding Problems with Learned Heuristic Functions and Search
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.23914] Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding
LLMs

arXiv - Machine Learning · 4 min ·
[2603.23835] Beyond Consistency: Inference for the Relative risk functional in Deep Nonparametric Cox Models
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.23822] How Vulnerable Are Edge LLMs?
LLMs

arXiv - Machine Learning · 3 min ·
[2603.23821] Perturbation: A simple and efficient adversarial tracer for representation learning in language models
LLMs

arXiv - Machine Learning · 3 min ·
[2603.23800] Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt Selection
LLMs

arXiv - Machine Learning · 4 min ·
[2603.23794] Sparse Autoencoders for Interpretable Medical Image Representation Learning
LLMs

arXiv - Machine Learning · 3 min ·
[2603.23785] Retinal Disease Classification from Fundus Images using CNN Transfer Learning
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.23722] Dual-Gated Epistemic Time-Dilation: Autonomous Compute Modulation in Asynchronous MARL
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.23736] Wasserstein Parallel Transport for Predicting the Dynamics of Statistical Systems
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.23685] The Economics of Builder Saturation in Digital Markets
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.23668] Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models
LLMs

arXiv - Machine Learning · 3 min ·
[2603.23640] LLM Inference at the Edge: Mobile, NPU, and GPU Performance Efficiency Trade-offs Under Sustained Load
LLMs

arXiv - Machine Learning · 4 min ·
[2603.23611] LLMORPH: Automated Metamorphic Testing of Large Language Models
LLMs

arXiv - Machine Learning · 4 min ·