[R] Is autoresearch really better than classic hyperparameter tuning?
We did experiments comparing Optuna & autoresearch. Autoresearch converges faster, is more cost-efficient, and even generalizes bette...
Text understanding and language tasks
Automate iOS apps with XCUITest and Droidrun using just natural language. You send the command to Droidrun, and the agent starts the task...
I trained a BERT-style transformer on 276K Kubernetes YAML files, replacing standard positional encoding with learned tree coordinates (d...
The paper presents GATES, a self-distillation method for document-grounded question answering, enhancing model performance by leveraging ...
The paper presents a novel framework, Memory-guided Prototypical Co-occurrence Learning (MPCL), aimed at improving mixed emotion recognit...
The paper introduces CREDIT, a method for certified ownership verification of deep neural networks to combat model extraction attacks, en...
$κ$-Explorer presents a novel framework for active model estimation in Markov decision processes (MDPs), focusing on optimizing explorati...
The paper presents KBVQ-MoE, a novel framework for improving vector quantization in Mixture of Experts (MoE) large language models, addre...
The paper presents FROST, an innovative method that utilizes attention mechanisms to filter out reasoning outliers, enhancing the efficie...
The paper presents HiGR, a novel framework for generative slate recommendation that enhances efficiency and user preference alignment thr...
The paper introduces Refusal Steering, a method for controlling Large Language Models' refusal behavior on sensitive topics without retra...
The paper presents E-GRPO, a novel framework for training search agents using synthetic data, enhancing their ability to learn from near-...
The paper presents Latent-Augmented Discrete Diffusion Models (LADD), which enhance discrete diffusion models for improved language gener...
This article explores how rationales generated by large language models (LLMs) influence human judgments of plausibility in commonsense r...
This paper evaluates the robustness of Vision-Language-Action (VLA) models against various multi-modal perturbations, proposing a new met...
The paper presents RHYTHM, a framework utilizing hierarchical temporal tokenization to enhance human mobility predictions by leveraging l...
RooseBERT introduces a specialized language model for political discourse, enhancing the analysis of political debates through improved s...
This article explores the performance of State Space Models (SSMs) and hybrid language models in processing long-context inputs, highligh...
The K-Function framework enhances children's language evaluation by integrating precise phoneme transcription with LLM-driven scoring, im...
HSSBench introduces a benchmark for evaluating Multimodal Large Language Models (MLLMs) in Humanities and Social Sciences, addressing gap...
The paper explores performance asymmetry in Model-Based Reinforcement Learning (MBRL), highlighting significant disparities in agent perf...
HoloLLM introduces a Multimodal Large Language Model that enhances human sensing and reasoning by integrating diverse sensory inputs, out...
The paper presents CONTINA, a method for predicting traffic demand with confidence intervals that adapt to changing conditions, ensuring ...