Inside Real Estate Launches Streams AI Mobile App to Boost Agent Productivity and Response
Inside Real Estate launched Streams, an AI-powered mobile app that delivers real-time lead insights, follow-ups and productivity tools to...
arXiv 2603.05659: When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual T...
arXiv 2512.16081: Evaluation of Generative Models for Emotional 3D Animation Generation in VR
This pilot study explores the orchestration of LLM agents in scientific research, focusing on the generation and evaluation of multiple-c...
The paper presents a metacognitive framework for Large Language Models (LLMs) that enhances their reasoning capabilities by integrating p...
The paper presents MiSCHiEF, a benchmark for evaluating fine-grained image-caption alignment, focusing on safety and cultural contexts, h...
This article evaluates the accuracy of discrete diffusion language models (dLLMs) through a sampler-centric framework, revealing signific...
This article presents SME-HGT, a Heterogeneous Graph Transformer framework designed to identify high-potential small and medium enterpris...
Luna-2 introduces a scalable architecture for single-token evaluation using small language models, enhancing accuracy and reducing costs ...
This paper presents a novel statistical method for modeling irregular multivariate time series with missing data, demonstrating superior ...
The paper introduces 1D-Bench, a benchmark for evaluating iterative UI code generation with visual feedback, aimed at improving design-to...
The paper presents VLANeXt, a framework for building effective Vision-Language-Action (VLA) models, addressing inconsistencies in trainin...
The paper introduces SenTSR-Bench, a framework that enhances time-series reasoning by integrating insights from specialized time-series l...
The article examines red teaming as a socio-technical practice in evaluating large language models (LLMs), highlighting the importance of...
The paper introduces AlphaForgeBench, a framework for evaluating trading strategies using Large Language Models (LLMs), addressing issues...
This article evaluates SAP's RPT-1 model for enterprise business process prediction, comparing its performance against traditional machin...
The article presents a novel evaluation framework for mechanistic interpretability research, utilizing AI agents to enhance research rigo...
This study evaluates the effectiveness of large language models (LLMs) in generating subject lines for mental health counseling emails, h...
TimeRadar introduces a novel approach to time series anomaly detection using a domain-rotatable foundation model that enhances the differ...
This paper investigates the alignment of representations from time series, vision, and language modalities, revealing insights into their...
The paper presents ARTIST, a novel approach to time series reasoning that utilizes adaptive segment selection to improve accuracy in answ...
This article evaluates 15 large language models on quantum mechanics problem-solving across diverse tasks, revealing performance stratifi...
This paper presents a diagnostic method for evaluating LLM reranker behavior using fixed evidence pools, isolating ranking policies from ...