Machine Learning

ML algorithms, training, and inference

Top This Week

[2604.17188] Beyond Overlap Metrics: Rewarding Reasoning and Preferences for Faithful Multi-Role Dialogue Summarization
Machine Learning

arXiv - AI · 4 min ·
[2603.09723] RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation
LLMs

arXiv - AI · 4 min ·
[2601.21225] MGSM-Pro: A Simple Strategy for Robust Multilingual Mathematical Reasoning Evaluation
LLMs

arXiv - AI · 4 min ·

All Content

[2501.07813] Talk to Right Specialists: Iterative Routing in Multi-agent Systems for Question Answering
Machine Learning

arXiv - AI · 4 min ·
[2408.11871] MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models
LLMs

arXiv - AI · 3 min ·
[2406.14194] VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model
LLMs

arXiv - AI · 4 min ·
[2604.01438] ClawSafety: "Safe" LLMs, Unsafe Agents
LLMs

arXiv - AI · 4 min ·
[2603.18633] An Onto-Relational-Sophic Framework for Governing Synthetic Minds
LLMs

arXiv - AI · 4 min ·
[2603.09127] Collective AI can amplify tiny perturbations into divergent decisions
LLMs

arXiv - AI · 4 min ·
[2602.07943] IV Co-Scientist: Multi-Agent LLM Framework for Causal Instrumental Variable Discovery
LLMs

arXiv - AI · 3 min ·
[2602.03151] Enhancing Foundation VLM Robustness to Missing Modality: Scalable Diffusion for Bi-directional Feature Restoration
LLMs

arXiv - AI · 4 min ·
[2601.22776] TSPO: Breaking the Double Homogenization Dilemma in Multi-turn Search Policy Optimization
LLMs

arXiv - AI · 3 min ·
[2601.21439] The Paradox of Robustness: Decoupling Rule-Based Logic from Affective Noise in High-Stakes Decision-Making
LLMs

arXiv - AI · 4 min ·
[2511.16383] An Agent-Based Framework for the Automatic Validation of Mathematical Optimization Models
LLMs

arXiv - AI · 3 min ·
[2601.05656] HAG: Hierarchical Demographic Tree-based Agent Generation for Topic-Adaptive Simulation
LLMs

arXiv - AI · 3 min ·
[2512.13168] Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows
Machine Learning

arXiv - AI · 4 min ·
[2511.14130] PRISM: Prompt-Refined In-Context System Modelling for Financial Retrieval
LLMs

arXiv - AI · 4 min ·
[2510.09901] Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics
LLMs

arXiv - AI · 3 min ·
[2508.02900] Seemingly Simple Planning Problems are Computationally Challenging: The Countdown Game
Machine Learning

arXiv - AI · 4 min ·
[2502.13388] Reflection of Episodes: Learning to Play Game from Expert and Self Experiences
LLMs

arXiv - AI · 3 min ·
[2411.06498] Barriers to Complexity-Theoretic Proofs that "AGI" Using Machine Learning is Impossible
Machine Learning

arXiv - AI · 3 min ·
[2604.04924] Your Pre-trained Diffusion Model Secretly Knows Restoration
Machine Learning

arXiv - AI · 3 min ·
[2604.04906] How AI Aggregation Affects Knowledge
Machine Learning

arXiv - AI · 3 min ·