[D] How come Muon is only being used for Transformers?
Muon has quickly been adopted in LLM training, yet we don't see it being talked about in other contexts. Searches for Muon on ConvNets tu...
Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M,...
A study found that sycophancy is pervasive among chatbots, and that bots are more likely than human peers to affirm a person's bad behavior.
Abstract page for arXiv paper 2601.20009: LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?
Abstract page for arXiv paper 2601.14958: Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala
Abstract page for arXiv paper 2601.12494: Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs
Abstract page for arXiv paper 2601.07148: Measuring Iterative Temporal Reasoning with Time Puzzles
Abstract page for arXiv paper 2601.01547: Vision-language models lag human performance on physical dynamics and intent reasoning
Abstract page for arXiv paper 2601.01279: Collusive Pricing Under LLM
Abstract page for arXiv paper 2512.16523: TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models
Abstract page for arXiv paper 2512.03903: BERnaT: Basque Encoders for Representing Natural Textual Diversity
Abstract page for arXiv paper 2512.05959: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG
Abstract page for arXiv paper 2511.23455: The Price of Progress: Price Performance and the Future of AI
Abstract page for arXiv paper 2511.19299: Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning
Abstract page for arXiv paper 2511.22169: Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization
Abstract page for arXiv paper 2511.17561: LexInstructEval: Lexical Instruction Following Evaluation for Large Language Models
Abstract page for arXiv paper 2511.14977: SVBRD-LLM: Self-Verifying Behavioral Rule Discovery for Autonomous Vehicle Identification
Abstract page for arXiv paper 2511.11828: Conformal Constrained Policy Optimization for Cost-Effective LLM Agents
Abstract page for arXiv paper 2511.06174: LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs
Abstract page for arXiv paper 2510.27543: DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Mo...
Abstract page for arXiv paper 2510.13232: What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging
Abstract page for arXiv paper 2510.08138: Understanding Temporal Logic Consistency in Video-Language Models through Cross-Modal Attention...
Abstract page for arXiv paper 2510.06638: StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering