[2603.17677] Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models
Abstract page for arXiv paper 2603.17677: Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models
GPT, Claude, Gemini, and other LLMs
Abstract page for arXiv paper 2511.14617: Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning
Abstract page for arXiv paper 2510.05497: Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference
Abstract page for arXiv paper 2506.08762: EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements
Abstract page for arXiv paper 2601.18734: Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models
Abstract page for arXiv paper 2512.07419: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery v...
Abstract page for arXiv paper 2510.17276: Breaking and Fixing Defenses Against Control-Flow Hijacking in Multi-Agent Systems
Abstract page for arXiv paper 2509.25762: OPPO: Accelerating PPO-based RLHF via Pipeline Overlap
Abstract page for arXiv paper 2508.02833: TIC-GRPO: Provable and Efficient Optimization for Reinforcement Learning from Human Feedback
Abstract page for arXiv paper 2506.09016: SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning
Abstract page for arXiv paper 2505.23648: Continuous Chain of Thought Enables Parallel Exploration and Reasoning
Abstract page for arXiv paper 2603.05280: Layer by layer, module by module: Choose both for optimal OOD probing of ViT
Abstract page for arXiv paper 2603.05143: Feature Resemblance: On the Theoretical Understanding of Analogical Reasoning in Transformers
Abstract page for arXiv paper 2603.05035: Good-Enough LLM Obfuscation (GELO)
Abstract page for arXiv paper 2603.05026: RepoLaunch: Automating Build&Test Pipeline of Code Repositories on ANY Language and ANY Platform
Abstract page for arXiv paper 2603.04964: Replaying pre-training data improves fine-tuning
Abstract page for arXiv paper 2603.04716: SLO-Aware Compute Resource Allocation for Prefill-Decode Disaggregated LLM Inference
Abstract page for arXiv paper 2603.04480: AbAffinity: A Large Language Model for Predicting Antibody Binding Affinity against SARS-CoV-2
Abstract page for arXiv paper 2603.04466: Act-Observe-Rewrite: Multimodal Coding Agents as In-Context Policy Learners for Robot Manipulation
Abstract page for arXiv paper 2603.05232: SlideSparse: Fast and Flexible (2N-2):2N Structured Sparsity
Abstract page for arXiv paper 2603.04972: Functionality-Oriented LLM Merging on the Fisher--Rao Manifold
Abstract page for arXiv paper 2603.04956: WaterSIC: information-theoretically (near) optimal linear layer quantization
Abstract page for arXiv paper 2603.04948: $\nabla$-Reasoner: LLM Reasoning via Test-Time Gradient Descent in Latent Space