Machine Learning

ML algorithms, training, and inference

Top This Week

Machine Learning

Phone screen: Microsoft AI Principal MLE

submitted by /u/sustain-able-tea

Reddit - ML Jobs · 1 min · 13 minutes ago

LLMs

We open-sourced our AI agent config management tool — 888 stars, nearly 100 forks — requesting community feedback

We've been building Caliber to solve AI agent configuration management and released our full setup as open source. The response has been ...

Reddit - Artificial Intelligence · 1 min · 13 minutes ago

LLMs

The open-source AI agent config repo the community has been building just hit 888 stars — asking for feedback & feature ideas

Over the past year our team and community have been building an open-source collection of AI agent configs: production-ready system promp...

Reddit - Artificial Intelligence · 1 min · 13 minutes ago

All Content

LLMs

[2604.04157] Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents

Abstract page for arXiv paper 2604.04157: Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.04145] Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting

Abstract page for arXiv paper 2604.04145: Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.04131] Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents

Abstract page for arXiv paper 2604.04131: Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents

arXiv - AI · 3 min · 25 days ago

Machine Learning

[2604.04106] InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories

Abstract page for arXiv paper 2604.04106: InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories

arXiv - AI · 3 min · 25 days ago

Machine Learning

[2604.03976] Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

Abstract page for arXiv paper 2604.03976: Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03898] LLM-Agent-based Social Simulation for Attitude Diffusion

Abstract page for arXiv paper 2604.03898: LLM-Agent-based Social Simulation for Attitude Diffusion

arXiv - AI · 3 min · 25 days ago

LLMs

[2604.03888] PolySwarm: A Multi-Agent Large Language Model Framework for Prediction Market Trading and Latency Arbitrage

Abstract page for arXiv paper 2604.03888: PolySwarm: A Multi-Agent Large Language Model Framework for Prediction Market Trading and Laten...

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03893] FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning

Abstract page for arXiv paper 2604.03893: FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03820] Affording Process Auditability with QualAnalyzer: An Atomistic LLM Analysis Tool for Qualitative Research

Abstract page for arXiv paper 2604.03820: Affording Process Auditability with QualAnalyzer: An Atomistic LLM Analysis Tool for Qualitativ...

arXiv - AI · 3 min · 25 days ago

LLMs

[2604.03742] Structured Multi-Criteria Evaluation of Large Language Models with Fuzzy Analytic Hierarchy Process and DualJudge

Abstract page for arXiv paper 2604.03742: Structured Multi-Criteria Evaluation of Large Language Models with Fuzzy Analytic Hierarchy Pro...

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03675] PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training

Abstract page for arXiv paper 2604.03675: PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training

arXiv - AI · 3 min · 25 days ago

LLMs

[2604.03660] TableVision: A Large-Scale Benchmark for Spatially Grounded Reasoning over Complex Hierarchical Tables

Abstract page for arXiv paper 2604.03660: TableVision: A Large-Scale Benchmark for Spatially Grounded Reasoning over Complex Hierarchical...

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03656] Beyond Retrieval: Modeling Confidence Decay and Deterministic Agentic Platforms in Generative Engine Optimization

Abstract page for arXiv paper 2604.03656: Beyond Retrieval: Modeling Confidence Decay and Deterministic Agentic Platforms in Generative E...

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03631] Single-agent vs. Multi-agents for Automated Video Analysis of On-Screen Collaborative Learning Behaviors

Abstract page for arXiv paper 2604.03631: Single-agent vs. Multi-agents for Automated Video Analysis of On-Screen Collaborative Learning ...

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03630] A Multimodal Foundation Model of Spatial Transcriptomics and Histology for Biological Discovery and Clinical Prediction

Abstract page for arXiv paper 2604.03630: A Multimodal Foundation Model of Spatial Transcriptomics and Histology for Biological Discovery...

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03589] Entropy and Attention Dynamics in Small Language Models: A Trace-Level Structural Analysis on the TruthfulQA Benchmark

Abstract page for arXiv paper 2604.03589: Entropy and Attention Dynamics in Small Language Models: A Trace-Level Structural Analysis on t...

arXiv - AI · 4 min · 25 days ago

Machine Learning

[2604.03571] Selective Forgetting for Large Reasoning Models

Abstract page for arXiv paper 2604.03571: Selective Forgetting for Large Reasoning Models

arXiv - AI · 4 min · 25 days ago

LLMs

[2604.03557] When Do Hallucinations Arise? A Graph Perspective on the Evolution of Path Reuse and Path Compression

Abstract page for arXiv paper 2604.03557: When Do Hallucinations Arise? A Graph Perspective on the Evolution of Path Reuse and Path Compr...

arXiv - AI · 3 min · 25 days ago

Machine Learning

[2604.03527] Explainable Model Routing for Agentic Workflows

Abstract page for arXiv paper 2604.03527: Explainable Model Routing for Agentic Workflows

arXiv - AI · 3 min · 25 days ago

LLMs

[2604.03524] Structural Rigidity and the 57-Token Predictive Window: A Physical Framework for Inference-Layer Governability in Large Language Models

Abstract page for arXiv paper 2604.03524: Structural Rigidity and the 57-Token Predictive Window: A Physical Framework for Inference-Laye...

arXiv - AI · 4 min · 25 days ago
