[2604.00830] Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies
Abstract page for arXiv paper 2604.00830: Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies
Abstract page for arXiv paper 2604.00821: Optimal Brain Decomposition for Accurate LLM Low-Rank Approximation
Abstract page for arXiv paper 2604.00812: Cost-Penalized Fitness in FMA-Orchestrated Mixture of Experts: Experimental Evidence for Molecu...
Abstract page for arXiv paper 2604.00785: Scalable Pretraining of Large Mixture of Experts Language Models on Aurora Super Computer
Abstract page for arXiv paper 2604.00801: Routing-Free Mixture-of-Experts
Abstract page for arXiv paper 2604.00800: MIRANDA: MId-feature RANk-adversarial Domain Adaptation toward climate change-robust ecological...
Abstract page for arXiv paper 2604.00779: Using predefined vector systems to speed up neural network multimillion class classification
Abstract page for arXiv paper 2604.00770: Thinking Wrong in Silence: Backdoor Attacks on Continuous Latent Reasoning
Abstract page for arXiv paper 2604.00767: ActivityNarrated: An Open-Ended Narrative Paradigm for Wearable Human Activity Understanding
Abstract page for arXiv paper 2604.00739: BioCOMPASS: Integrating Biomarkers into Transformer-Based Immunotherapy Response Prediction
Abstract page for arXiv paper 2604.00698: Learning to Hint for Reinforcement Learning
Abstract page for arXiv paper 2604.00733: Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and S...
Abstract page for arXiv paper 2604.00689: Performance of Neural and Polynomial Operator Surrogates
Abstract page for arXiv paper 2604.00726: Exploring Silent Data Corruption as a Reliability Challenge in LLM Training
Abstract page for arXiv paper 2604.00686: Full-Gradient Successor Feature Representations
Abstract page for arXiv paper 2604.00669: Embedded Variational Neural Stochastic Differential Equations for Learning Heterogeneous Dynamics
Abstract page for arXiv paper 2604.00653: Chameleons do not Forget: Prompt-Based Online Continual Learning for Next Activity Prediction
Abstract page for arXiv paper 2604.00626: A Survey of On-Policy Distillation for Large Language Models
Abstract page for arXiv paper 2604.00599: Predicting Dynamics of Ultra-Large Complex Systems by Inferring Governing Equations