Machine Learning

ML algorithms, training, and inference

Top This Week

Machine Learning

Musk v. Altman is just getting started | TechCrunch

Watch as the Equity podcast team discusses what's actually at stake in the courtroom and what to watch for as Altman and others take the ...

TechCrunch - AI · 3 min · 29 minutes ago

Machine Learning

Did you know you can't steal a charity? Don't worry. Elon Musk will remind you. | TechCrunch

Today on Equity, we break down what's actually at stake in the Musk v Altman case, plus deals, defense tech, and what Big Tech's earnings...

TechCrunch - AI · 4 min · 29 minutes ago

Machine Learning

Why ML conference reviews sometimes feel like a “lottery” [D]

I’ve been trying to make sense of all the “ML conferences are a lottery” takes, and honestly I think it’s both true and not true dependin...

Reddit - Machine Learning · 1 min · about 2 hours ago

All Content

Machine Learning

[2603.12510] Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Robot Policies

Abstract page for arXiv paper 2603.12510: Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Ro...

arXiv - AI · 4 min · 24 days ago

LLMs

[2603.11749] Truth as a Compression Artifact in Language Model Training

Abstract page for arXiv paper 2603.11749: Truth as a Compression Artifact in Language Model Training

arXiv - AI · 4 min · 24 days ago

LLMs

[2603.10047] Toward Epistemic Stability: Engineering Consistent Procedures for Industrial LLM Hallucination Reduction

Abstract page for arXiv paper 2603.10047: Toward Epistemic Stability: Engineering Consistent Procedures for Industrial LLM Hallucination ...

arXiv - AI · 4 min · 24 days ago

Machine Learning

[2603.09030] PlayWorld: Learning Robot World Models from Autonomous Play

Abstract page for arXiv paper 2603.09030: PlayWorld: Learning Robot World Models from Autonomous Play

arXiv - AI · 4 min · 24 days ago

LLMs

[2602.08392] ST-BiBench: Benchmarking Multi-Stream Multimodal Coordination in Bimanual Embodied Tasks for MLLMs

Abstract page for arXiv paper 2602.08392: ST-BiBench: Benchmarking Multi-Stream Multimodal Coordination in Bimanual Embodied Tasks for MLLMs

arXiv - AI · 4 min · 24 days ago

LLMs

[2601.11109] Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning

Abstract page for arXiv paper 2601.11109: Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning

arXiv - AI · 3 min · 24 days ago

Machine Learning

[2601.08565] Rewriting Video: Text-Driven Reauthoring of Video Footage

Abstract page for arXiv paper 2601.08565: Rewriting Video: Text-Driven Reauthoring of Video Footage

arXiv - AI · 3 min · 24 days ago

Machine Learning

[2512.18388] Exploration vs. Fixation: Scaffolding Divergent and Convergent Thinking for Human-AI Co-Creation with Generative Models

Abstract page for arXiv paper 2512.18388: Exploration vs. Fixation: Scaffolding Divergent and Convergent Thinking for Human-AI Co-Creatio...

arXiv - AI · 4 min · 24 days ago

LLMs

[2601.00263] Parallel Universes, Parallel Languages: A Comprehensive Study on LLM-based Multilingual Counterfactual Example Generation

Abstract page for arXiv paper 2601.00263: Parallel Universes, Parallel Languages: A Comprehensive Study on LLM-based Multilingual Counter...

arXiv - AI · 4 min · 24 days ago

LLMs

[2512.11919] A fine-grained look at causal effects in causal spaces

Abstract page for arXiv paper 2512.11919: A fine-grained look at causal effects in causal spaces

arXiv - AI · 4 min · 24 days ago

LLMs

[2510.15746] LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation

Abstract page for arXiv paper 2510.15746: LLMs Judge Themselves: A Game-Theoretic Framework for Human-Aligned Evaluation

arXiv - AI · 4 min · 24 days ago

LLMs

[2511.06448] When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms

Abstract page for arXiv paper 2511.06448: When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Plat...

arXiv - AI · 4 min · 24 days ago

Machine Learning

[2511.06391] HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate Speech Detection

Abstract page for arXiv paper 2511.06391: HatePrototypes: Interpretable and Transferable Representations for Implicit and Explicit Hate S...

arXiv - AI · 4 min · 24 days ago

LLMs

[2510.25890] ATLAS: A Layered Constraint-Guided Framework for Structured Artifact Generation in LLM-Assisted MDE

Abstract page for arXiv paper 2510.25890: ATLAS: A Layered Constraint-Guided Framework for Structured Artifact Generation in LLM-Assisted...

arXiv - AI · 4 min · 24 days ago

LLMs

[2510.15148] XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models

Abstract page for arXiv paper 2510.15148: XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models

arXiv - AI · 4 min · 24 days ago

LLMs

[2510.13829] A Linguistics-Aware LLM Watermarking via Syntactic Predictability

Abstract page for arXiv paper 2510.13829: A Linguistics-Aware LLM Watermarking via Syntactic Predictability

arXiv - AI · 3 min · 24 days ago

LLMs

[2510.06800] FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline

Abstract page for arXiv paper 2510.06800: FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipe...

arXiv - AI · 4 min · 24 days ago

LLMs

[2509.24186] Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks

Abstract page for arXiv paper 2509.24186: Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks

arXiv - AI · 4 min · 24 days ago

Machine Learning

[2509.23279] Vid-Freeze: Protecting Images from Malicious Image-to-Video Generation via Temporal Freezing

Abstract page for arXiv paper 2509.23279: Vid-Freeze: Protecting Images from Malicious Image-to-Video Generation via Temporal Freezing

arXiv - AI · 3 min · 24 days ago

LLMs

[2509.22258] Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks

Abstract page for arXiv paper 2509.22258: Beyond Classification Accuracy: Neural-MedBench and the Need for Deeper Reasoning Benchmarks

arXiv - AI · 4 min · 24 days ago
