AI Startups

AI startup funding, launches, and acquisitions

Top This Week

LLMs

[D] The problem with comparing AI memory system benchmarks — different evaluation methods make scores meaningless

I've been reviewing how various AI memory systems evaluate their performance and noticed a fundamental issue with cross-system comparison...

Reddit - Machine Learning · 1 min ·
Machine Learning

Exclusive: Runway launches $10M fund, Builders program to support early stage AI startups | TechCrunch

Runway is launching a $10 million fund and startup program to back companies building with its AI video models, as it pushes toward inter...

TechCrunch - AI · 7 min ·
AI Startups

The Download: AI health tools and the Pentagon’s Anthropic culture war | MIT Technology Review

California has defied Trump's demands to stop AI regulation.

MIT Technology Review · 5 min ·

All Content

LLMs

[2602.22752] Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction

This article presents a study on the operational validity of using Large Language Models (LLMs) to simulate social media user behavior th...

arXiv - AI · 4 min ·
AI Safety

[2602.22710] Same Words, Different Judgments: Modality Effects on Preference Alignment

This study explores how modality affects preference alignment in AI systems, comparing human and synthetic evaluations of audio and text ...

arXiv - AI · 3 min ·
LLMs

[2602.22903] PSQE: A Theoretical-Practical Approach to Pseudo Seed Quality Enhancement for Unsupervised MMEA

The paper presents PSQE, a method for enhancing pseudo seed quality in unsupervised multimodal entity alignment, addressing challenges in...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.22827] TARAZ: Persian Short-Answer Question Benchmark for Cultural Evaluation of Language Models

The paper presents TARAZ, a Persian short-answer question benchmark designed to evaluate the cultural competence of large language models...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.22801] Unleashing the Potential of Diffusion Models for End-to-End Autonomous Driving

This article explores the application of diffusion models in end-to-end autonomous driving, demonstrating their effectiveness through ext...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.22570] Guidance Matters: Rethinking the Evaluation Pitfall for Text-to-Image Generation

The paper discusses the evaluation challenges in text-to-image generation, focusing on classifier-free guidance (CFG) and proposing a new...

arXiv - AI · 4 min ·
Machine Learning

[2602.22609] EvolveGen: Algorithmic Level Hardware Model Checking Benchmark Generation through Reinforcement Learning

EvolveGen introduces a novel framework for generating hardware model checking benchmarks using reinforcement learning, addressing the ben...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.22543] Ruyi2 Technical Report

The Ruyi2 Technical Report presents advancements in adaptive computing strategies for Large Language Models (LLMs), focusing on efficienc...

arXiv - AI · 3 min ·
Machine Learning

[2602.22488] Explainability-Aware Evaluation of Transfer Learning Models for IoT DDoS Detection Under Resource Constraints

This article evaluates transfer learning models for IoT DDoS detection, focusing on explainability and resource constraints. It analyzes ...

arXiv - AI · 3 min ·
Machine Learning

[2602.23305] A Proper Scoring Rule for Virtual Staining

The paper introduces a novel scoring rule for evaluating generative virtual staining models in high-throughput screening, emphasizing the...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.22221] Misinformation Exposure in the Chinese Web: A Cross-System Evaluation of Search Engines, LLMs, and AI Overviews

This article evaluates misinformation exposure on the Chinese web by comparing traditional search engines, LLMs, and AI-generated overvie...

arXiv - AI · 3 min ·
LLMs

[2602.23060] RhythmBERT: A Self-Supervised Language Model Based on Latent Representations of ECG Waveforms for Heart Disease Detection

RhythmBERT is a novel self-supervised language model designed for ECG waveform analysis, enhancing heart disease detection by treating EC...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.22902] A Data-Driven Approach to Support Clinical Renal Replacement Therapy

This study explores a machine learning approach to predict membrane fouling in patients undergoing Continuous Renal Replacement Therapy (...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.23199] SC-Arena: A Natural Language Benchmark for Single-Cell Reasoning with Knowledge-Augmented Evaluation

SC-Arena introduces a natural language benchmark for evaluating single-cell reasoning in large language models, addressing gaps in curren...

arXiv - AI · 4 min ·
LLMs

[2602.22831] Moral Preferences of LLMs Under Directed Contextual Influence

This paper explores how contextual influences affect the moral decision-making of large language models (LLMs) in scenarios akin to troll...

arXiv - AI · 4 min ·
LLMs

[2602.23161] PATRA: Pattern-Aware Alignment and Balanced Reasoning for Time Series Question Answering

The paper presents PATRA, a novel model for Time Series Question Answering that enhances reasoning by incorporating pattern awareness and...

arXiv - AI · 3 min ·
Machine Learning

[2602.22747] Set-based v.s. Distribution-based Representations of Epistemic Uncertainty: A Comparative Study

This study compares set-based and distribution-based representations of epistemic uncertainty in neural networks, highlighting their rela...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.22703] Enhancing Geometric Perception in VLMs via Translator-Guided Reinforcement Learning

The paper presents GeoPerceive, a benchmark for evaluating geometric perception in vision-language models (VLMs), and introduces GeoDPO, ...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.22953] General Agent Evaluation

This paper introduces a framework for evaluating general-purpose agents, proposing a Unified Protocol and Exgentic framework, and benchma...

arXiv - AI · 3 min ·
Machine Learning

[2602.22610] DP-aware AdaLN-Zero: Taming Conditioning-Induced Heavy-Tailed Gradients in Differentially Private Diffusion

The paper introduces DP-aware AdaLN-Zero, a novel mechanism to mitigate heavy-tailed gradients in differentially private diffusion models...

arXiv - Machine Learning · 4 min ·
