Uber burned its entire 2026 AI coding budget in 4 months - $500-2k per engineer per month
Uber deployed Claude Code to engineers in December 2025. By April 2026, the company had consumed its entire annual AI budget - not becaus...
I’m sharing a research prototype exploring a different approach to LLM-based multi-agent systems. Most current agent frameworks rely on f...
I have realised Claude answers only as well as you prompt it, and I suck at prompting. 😂 I have tried role-playing prompts ("you are top 1%", etc.) and adding cons...
Abstract page for arXiv paper 2506.06683: RoboPARA: Dual-Arm Robot Planning with Parallel Allocation and Recomposition Across Tasks
Abstract page for arXiv paper 2506.03135: OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models
Abstract page for arXiv paper 2506.02860: Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs
Abstract page for arXiv paper 2505.11076: Addition is almost all you need: Compressing large language models with double binary factoriza...
Abstract page for arXiv paper 2505.24298: AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
Abstract page for arXiv paper 2505.21786: VeriTrail: Closed-Domain Hallucination Detection with Traceability
Abstract page for arXiv paper 2504.03889: Identifying and Evaluating Inactive Heads in Pretrained LLMs
Abstract page for arXiv paper 2505.20278: Characterizing Pattern Matching and Its Limits on Compositional Task Structures
Abstract page for arXiv paper 2505.21413: RefTool: Reference-Guided Tool Creation for Knowledge-Intensive Reasoning
Abstract page for arXiv paper 2505.21396: Augmenting Research Ideation with Data: An Empirical Investigation in Social Science
Abstract page for arXiv paper 2503.08980: I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts...
Abstract page for arXiv paper 2505.16056: Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models
Abstract page for arXiv paper 2505.17702: Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via De...
Abstract page for arXiv paper 2505.17568: JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models
Abstract page for arXiv paper 2505.15504: Exploiting Low-Dimensional Manifold of Features for Few-Shot Whole Slide Image Classification
Abstract page for arXiv paper 2505.13109: FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference
Abstract page for arXiv paper 2505.12186: Self-Destructive Language Model
Abstract page for arXiv paper 2502.01481: Intrinsic Entropy of Context Length Scaling in LLMs
Abstract page for arXiv paper 2505.02881: Rewriting Pre-Training Data Boosts LLM Performance in Math and Code
Abstract page for arXiv paper 2505.02872: Decoding Open-Ended Information Seeking Goals from Eye Movements in Reading