Frameworks For Supporting LLM/Agentic Benchmarking [P]
I think the way we are approaching benchmarking is a bit problematic. From reading about how frontier labs benchmark their models, they e...
ML algorithms, training, and inference
Hello everyone! I built an AI/ML algorithm simulation and visualization app. You can run each algorithm step-by-step, edit parameters, an...
Abstract page for arXiv paper 2603.24041: Minimal Sufficient Representations for Self-interpretable Deep Neural Networks
Abstract page for arXiv paper 2603.23974: Machine vision with small numbers of detected photons per inference
Abstract page for arXiv paper 2603.23971: The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More
Abstract page for arXiv paper 2603.23943: ChargeFlow: Flow-Matching Refinement of Charge-Conditioned Electron Densities
Abstract page for arXiv paper 2603.23937: Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development
Abstract page for arXiv paper 2603.23911: Self-Distillation for Multi-Token Prediction
Abstract page for arXiv paper 2603.23933: ORACLE: Orchestrate NPC Daily Activities using Contrastive Learning with Transformer-CVAE
Abstract page for arXiv paper 2603.23873: The DeepXube Software Package for Solving Pathfinding Problems with Learned Heuristic Functions...
Abstract page for arXiv paper 2603.23914: Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient ...
Abstract page for arXiv paper 2603.23835: Beyond Consistency: Inference for the Relative risk functional in Deep Nonparametric Cox Models
Abstract page for arXiv paper 2603.23822: How Vulnerable Are Edge LLMs?
Abstract page for arXiv paper 2603.23821: Perturbation: A simple and efficient adversarial tracer for representation learning in language...
Abstract page for arXiv paper 2603.23800: Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt ...
Abstract page for arXiv paper 2603.23794: Sparse Autoencoders for Interpretable Medical Image Representation Learning
Abstract page for arXiv paper 2603.23785: Retinal Disease Classification from Fundus Images using CNN Transfer Learning
Abstract page for arXiv paper 2603.23722: Dual-Gated Epistemic Time-Dilation: Autonomous Compute Modulation in Asynchronous MARL
Abstract page for arXiv paper 2603.23736: Wasserstein Parallel Transport for Predicting the Dynamics of Statistical Systems
Abstract page for arXiv paper 2603.23685: The Economics of Builder Saturation in Digital Markets
Abstract page for arXiv paper 2603.23668: Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language...
Abstract page for arXiv paper 2603.23640: LLM Inference at the Edge: Mobile, NPU, and GPU Performance Efficiency Trade-offs Under Sustain...