[D] Tested model routing on financial AI datasets — good savings and curious what benchmarks others use.
Ran a benchmark evaluating whether prompt complexity-based routing delivers meaningful savings. Used public HuggingFace datasets. Here's ...
I'm doing research on some trending fields in AI. I'm currently working on small language models and would love to meet people who are workin...
I'm using Gemini only because they gave us a free student Pro pack. It can't see the images I send; most of the time it just rewrites the mes...
Abstract page for arXiv paper 2603.23219: Decoding AI Authorship: Can LLMs Truly Mimic Human Style Across Literature and Politics?
Abstract page for arXiv paper 2603.23007: AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screensho...
Abstract page for arXiv paper 2603.23269: Not All Tokens Are Created Equal: Query-Efficient Jailbreak Fuzzing for LLMs
Abstract page for arXiv paper 2603.23251: Is AI Catching Up to Human Expression? Exploring Emotion, Personality, Authorship, and Linguist...
Abstract page for arXiv paper 2603.22966: Set-Valued Prediction for Large Language Models with Feasibility-Aware Coverage Guarantees
Abstract page for arXiv paper 2603.22918: EVA: Efficient Reinforcement Learning for End-to-End Video Agent
Abstract page for arXiv paper 2603.23159: Conformal Cross-Modal Active Learning
Abstract page for arXiv paper 2603.22911: ForestPrune: High-ratio Visual Token Compression for Video Multimodal Large Language Models via...
Abstract page for arXiv paper 2603.23171: Robust Safety Monitoring of Language Models via Activation Watermarking
Abstract page for arXiv paper 2603.22876: Grounding Sim-to-Real Generalization in Dexterous Manipulation: An Empirical Study with Vision-...
Abstract page for arXiv paper 2603.23136: HGNet: Scalable Foundation Model for Automated Knowledge Graph Generation from Scientific Liter...
Abstract page for arXiv paper 2603.23057: Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Reco...
Abstract page for arXiv paper 2603.23063: Machine Learning Models for the Early Detection of Burnout in Software Engineering: a Systemati...
Abstract page for arXiv paper 2603.22854: Avoiding Over-smoothing in Social Media Rumor Detection with Pre-trained Propagation Tree Trans...
Abstract page for arXiv paper 2603.22853: Agent Audit: A Security Analysis System for LLM Agent Applications
Abstract page for arXiv paper 2603.23055: Post-Selection Distributional Model Evaluation
Abstract page for arXiv paper 2603.23041: HUydra: Full-Range Lung CT Synthesis via Multiple HU Interval Generative Modelling
Abstract page for arXiv paper 2603.22851: UniQueR: Unified Query-based Feedforward 3D Reconstruction
Abstract page for arXiv paper 2603.23037: YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable...
Abstract page for arXiv paper 2603.22841: UAV-DETR: DETR for Anti-Drone Target Detection