AI Startups

AI startup funding, launches, and acquisitions


Top This Week

AI Startups

Inside Real Estate Launches Streams AI Mobile App to Boost Agent Productivity and Response

Inside Real Estate launched Streams, an AI-powered mobile app that delivers real-time lead insights, follow-ups and productivity tools to...

AI Tools & Products · 5 min · about 4 hours ago

Machine Learning

[2603.05659] When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual Try-On

Abstract page for arXiv paper 2603.05659: When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual T...

arXiv - AI · 4 min · about 6 hours ago

Machine Learning

[2512.16081] Evaluation of Generative Models for Emotional 3D Animation Generation in VR

Abstract page for arXiv paper 2512.16081: Evaluation of Generative Models for Emotional 3D Animation Generation in VR

arXiv - AI · 4 min · about 6 hours ago

All Content

LLMs

[2507.20174] LRR-Bench: Left, Right or Rotate? Vision-Language models Still Struggle With Spatial Understanding Tasks

The paper introduces LRR-Bench, a benchmark for evaluating Vision-Language Models (VLMs) on spatial understanding tasks, revealing signif...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2412.17596] Evaluating LLMs' Divergent Thinking Capabilities for Scientific Idea Generation with Minimal Context

This article evaluates the divergent thinking capabilities of Large Language Models (LLMs) for scientific idea generation using minimal c...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2511.04934] Leak@$k$: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding

The paper discusses the limitations of current unlearning methods in large language models (LLMs), revealing that they fail to effectivel...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2509.24228] Accessible, Realistic, and Fair Evaluation of Positive-Unlabeled Learning Algorithms

This paper presents a benchmark for evaluating positive-unlabeled (PU) learning algorithms, addressing inconsistencies in experimental se...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2509.22295] Aurora: Towards Universal Generative Multimodal Time Series Forecasting

Aurora introduces a Multimodal Time Series Foundation Model that enhances cross-domain generalization in time series forecasting by integ...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2512.06393] Conflict-Aware Fusion: Resolving Logic Inertia in Large Language Models via Structured Cognitive Priors

This article introduces Conflict-Aware Fusion, a framework designed to address Logic Inertia in large language models (LLMs) by integrati...

arXiv - Machine Learning · 4 min · about 1 month ago

Data Science

[2510.05761] Early Multimodal Prediction of Cross-Lingual Meme Virality on Reddit: A Time-Window Analysis

This article presents a novel approach to predicting the virality of memes on Reddit using a multimodal dataset and advanced machine lear...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2506.22740] Explanations are a Means to an End: Decision Theoretic Explanation Evaluation

The paper presents a decision-theoretic framework for evaluating explanations in AI, emphasizing their role as information signals that i...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2504.12764] GraphOmni: A Comprehensive and Extensible Benchmark Framework for Large Language Models on Graph-theoretic Tasks

GraphOmni introduces a benchmark framework for evaluating large language models on graph-theoretic tasks, highlighting performance variab...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2501.00773] Revisiting Graph Neural Networks for Graph-level Tasks: Taxonomy, Empirical Study, and Future Directions

This article presents a comprehensive study on Graph Neural Networks (GNNs) for graph-level tasks, categorizing them into five types and ...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2411.01685] Reducing Biases in Record Matching Through Scores Calibration

This paper explores methods to reduce biases in record matching through score calibration, proposing two model-agnostic post-processing t...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.19948] Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming

This article presents a framework for assessing the risks associated with using large language models (LLMs) in mental health support, hi...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.19984] Multivariate time-series forecasting of ASTRI-Horn monitoring data: A Normal Behavior Model

This article presents a Normal Behavior Model (NBM) for forecasting monitoring data from the ASTRI-Horn telescope, demonstrating effectiv...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.19843] MAS-FIRE: Fault Injection and Reliability Evaluation for LLM-Based Multi-Agent Systems

The paper presents MAS-FIRE, a framework for evaluating the reliability of LLM-based Multi-Agent Systems through fault injection, address...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.19339] SplitLight: An Exploratory Toolkit for Recommender Systems Datasets and Splits

SplitLight is an open-source toolkit designed to enhance the evaluation of recommender systems by providing measurable and comparable dat...

arXiv - Machine Learning · 3 min · about 1 month ago

Data Science

[2602.19329] Dynamic Elasticity Between Forest Loss and Carbon Emissions: A Subnational Panel Analysis of the United States

This article analyzes the dynamic relationship between forest loss and carbon emissions in the U.S. using a comprehensive dataset from 20...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.19320] Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations

This article presents a comprehensive analysis of agentic memory systems in large language models, highlighting their architectural frame...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.18813] Habilis-$β$: A Fast-Motion and Long-Lasting On-Device Vision-Language-Action Model

Habilis-$β$ is a new on-device vision-language-action model that excels in fast-motion tasks, demonstrating superior performance in real-...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.18525] Do Generative Metrics Predict YOLO Performance? An Evaluation Across Models, Augmentation Ratios, and Dataset Complexity

This paper evaluates the effectiveness of generative metrics in predicting the performance of YOLO object detection models across various...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.18922] Why Agent Caching Fails and How to Fix It: Structured Intent Canonicalization with Few-Shot Learning

The paper discusses the limitations of current agent caching methods in AI, proposing a new framework, W5H2, that improves efficiency and...

arXiv - Machine Learning · 3 min · about 1 month ago


