Machine Learning
ML algorithms, training, and inference
Top This Week
New technique makes AI models leaner and faster while they’re still learning
Question regarding Transformer's pipeline module [D]
from transformers import pipeline, DistilBertTokenizer, DistilBertModel
model = DistilBertModel.from_pretrained('distilbert-base-cas...
All Content
[2604.04681] Batch Loss Score for Dynamic Data Pruning
[2604.04655] Grokking as Dimensional Phase Transition in Neural Networks
[2604.04648] From Curiosity to Caution: Mitigating Reward Hacking for Best-of-N with Pessimism
[2604.04614] A Clinical Point Cloud Paradigm for In-Hospital Mortality Prediction from Multi-Level Incomplete Multimodal EHRs
[2604.04611] Dynamic Free-Rider Detection in Federated Learning via Simulated Attack Patterns
[2604.04535] Learning from Equivalence Queries, Revisited
[2604.04518] Reproducibility study on how to find Spurious Correlations, Shortcut Learning, Clever Hans or Group-Distributional non-robustness and how to fix them
[2604.04516] GAIN: Multiplicative Modulation for Domain Adaptation
[2604.04497] One Model for All: Multi-Objective Controllable Language Models
[2604.04493] SLaB: Sparse-Lowrank-Binary Decomposition for Efficient Large Language Models
[2604.04485] ECG Biometrics with ArcFace-Inception: External Validation on MIMIC and HEEDB
[2604.04475] Discrete Prototypical Memories for Federated Time Series Foundation Models
[2604.04474] MAVEN: A Mesh-Aware Volumetric Encoding Network for Simulating 3D Flexible Deformation
[2604.04461] DP-OPD: Differentially Private On-Policy Distillation for Language Models
[2604.04410] Relative Density Ratio Optimization for Stable and Statistically Consistent Model Alignment
[2604.04394] Finite-Time Analysis of Q-Value Iteration for General-Sum Stackelberg Games
[2604.04380] CPT: Controllable and Editable Design Variations with Language Models
[2604.04364] Context is All You Need
[2604.04343] Deep Kuratowski Embedding Neural Networks for Wasserstein Metric Learning
[2604.04342] Generative models for decision-making under distributional shift