Machine Learning

ML algorithms, training, and inference


Top This Week

LLMs

[R] Depth-first pruning transfers: GPT-2 → TinyLlama with stable gains and minimal loss

TL;DR: Removing the right layers (instead of shrinking all layers) makes transformer models ~8–12% smaller with only ~6–8% quality loss, ...

Reddit - Machine Learning · 1 min · 7 minutes ago

LLMs

Built a training stability monitor that detects instability before your loss curve shows anything — open sourced the core today

Been working on a weight divergence trajectory curvature approach to detecting neural network training instability. Treats weight updates...

Reddit - Artificial Intelligence · 1 min · 8 minutes ago

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · 11 minutes ago

All Content

Machine Learning

[2603.23974] Machine vision with small numbers of detected photons per inference

Abstract page for arXiv paper 2603.23974: Machine vision with small numbers of detected photons per inference

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2603.23971] The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More

Abstract page for arXiv paper 2603.23971: The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2603.23943] ChargeFlow: Flow-Matching Refinement of Charge-Conditioned Electron Densities

Abstract page for arXiv paper 2603.23943: ChargeFlow: Flow-Matching Refinement of Charge-Conditioned Electron Densities

arXiv - Machine Learning · 3 min · 5 days ago

LLMs

[2603.23937] Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development

Abstract page for arXiv paper 2603.23937: Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2603.23911] Self-Distillation for Multi-Token Prediction

Abstract page for arXiv paper 2603.23911: Self-Distillation for Multi-Token Prediction

arXiv - Machine Learning · 3 min · 5 days ago

Machine Learning

[2603.23933] ORACLE: Orchestrate NPC Daily Activities using Contrastive Learning with Transformer-CVAE

Abstract page for arXiv paper 2603.23933: ORACLE: Orchestrate NPC Daily Activities using Contrastive Learning with Transformer-CVAE

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2603.23873] The DeepXube Software Package for Solving Pathfinding Problems with Learned Heuristic Functions and Search

Abstract page for arXiv paper 2603.23873: The DeepXube Software Package for Solving Pathfinding Problems with Learned Heuristic Functions...

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2603.23914] Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding

Abstract page for arXiv paper 2603.23914: Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient ...

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2603.23835] Beyond Consistency: Inference for the Relative risk functional in Deep Nonparametric Cox Models

Abstract page for arXiv paper 2603.23835: Beyond Consistency: Inference for the Relative risk functional in Deep Nonparametric Cox Models

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2603.23822] How Vulnerable Are Edge LLMs?

Abstract page for arXiv paper 2603.23822: How Vulnerable Are Edge LLMs?

arXiv - Machine Learning · 3 min · 5 days ago

LLMs

[2603.23821] Perturbation: A simple and efficient adversarial tracer for representation learning in language models

Abstract page for arXiv paper 2603.23821: Perturbation: A simple and efficient adversarial tracer for representation learning in language...

arXiv - Machine Learning · 3 min · 5 days ago

LLMs

[2603.23800] Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt Selection

Abstract page for arXiv paper 2603.23800: Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt ...

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2603.23794] Sparse Autoencoders for Interpretable Medical Image Representation Learning

Abstract page for arXiv paper 2603.23794: Sparse Autoencoders for Interpretable Medical Image Representation Learning

arXiv - Machine Learning · 3 min · 5 days ago

Machine Learning

[2603.23785] Retinal Disease Classification from Fundus Images using CNN Transfer Learning

Abstract page for arXiv paper 2603.23785: Retinal Disease Classification from Fundus Images using CNN Transfer Learning

arXiv - Machine Learning · 3 min · 5 days ago

Machine Learning

[2603.23722] Dual-Gated Epistemic Time-Dilation: Autonomous Compute Modulation in Asynchronous MARL

Abstract page for arXiv paper 2603.23722: Dual-Gated Epistemic Time-Dilation: Autonomous Compute Modulation in Asynchronous MARL

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2603.23736] Wasserstein Parallel Transport for Predicting the Dynamics of Statistical Systems

Abstract page for arXiv paper 2603.23736: Wasserstein Parallel Transport for Predicting the Dynamics of Statistical Systems

arXiv - Machine Learning · 4 min · 5 days ago

Machine Learning

[2603.23685] The Economics of Builder Saturation in Digital Markets

Abstract page for arXiv paper 2603.23685: The Economics of Builder Saturation in Digital Markets

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2603.23668] Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models

Abstract page for arXiv paper 2603.23668: Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language...

arXiv - Machine Learning · 3 min · 5 days ago

LLMs

[2603.23640] LLM Inference at the Edge: Mobile, NPU, and GPU Performance Efficiency Trade-offs Under Sustained Load

Abstract page for arXiv paper 2603.23640: LLM Inference at the Edge: Mobile, NPU, and GPU Performance Efficiency Trade-offs Under Sustain...

arXiv - Machine Learning · 4 min · 5 days ago

LLMs

[2603.23611] LLMORPH: Automated Metamorphic Testing of Large Language Models

Abstract page for arXiv paper 2603.23611: LLMORPH: Automated Metamorphic Testing of Large Language Models

arXiv - Machine Learning · 4 min · 5 days ago

