Machine Learning

ML algorithms, training, and inference

Top This Week

Machine Learning

Phone screen: Microsoft AI Principal MLE

submitted by /u/sustain-able-tea

Reddit - ML Jobs · 1 min ·
Llms

We open-sourced our AI agent config management tool — 888 stars, nearly 100 forks — requesting community feedback

We've been building Caliber to solve AI agent configuration management and released our full setup as open source. The response has been ...

Reddit - Artificial Intelligence · 1 min ·
Llms

The open-source AI agent config repo the community has been building just hit 888 stars — asking for feedback & feature ideas

Over the past year our team and community have been building an open-source collection of AI agent configs: production-ready system promp...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2604.04157] Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents
Llms

Abstract page for arXiv paper 2604.04157: Readable Minds: Emergent Theory-of-Mind-Like Behavior in LLM Poker Agents

arXiv - AI · 4 min ·
[2604.04145] Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting
Llms

Abstract page for arXiv paper 2604.04145: Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting

arXiv - AI · 4 min ·
[2604.04131] Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents
Llms

Abstract page for arXiv paper 2604.04131: Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents

arXiv - AI · 3 min ·
[2604.04106] InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories
Machine Learning

Abstract page for arXiv paper 2604.04106: InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories

arXiv - AI · 3 min ·
[2604.03976] Quantifying Trust: Financial Risk Management for Trustworthy AI Agents
Machine Learning

Abstract page for arXiv paper 2604.03976: Quantifying Trust: Financial Risk Management for Trustworthy AI Agents

arXiv - AI · 4 min ·
[2604.03898] LLM-Agent-based Social Simulation for Attitude Diffusion
Llms

Abstract page for arXiv paper 2604.03898: LLM-Agent-based Social Simulation for Attitude Diffusion

arXiv - AI · 3 min ·
[2604.03888] PolySwarm: A Multi-Agent Large Language Model Framework for Prediction Market Trading and Latency Arbitrage
Llms

Abstract page for arXiv paper 2604.03888: PolySwarm: A Multi-Agent Large Language Model Framework for Prediction Market Trading and Laten...

arXiv - AI · 4 min ·
[2604.03893] FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning
Llms

Abstract page for arXiv paper 2604.03893: FeynmanBench: Benchmarking Multimodal LLMs on Diagrammatic Physics Reasoning

arXiv - AI · 4 min ·
[2604.03820] Affording Process Auditability with QualAnalyzer: An Atomistic LLM Analysis Tool for Qualitative Research
Llms

Abstract page for arXiv paper 2604.03820: Affording Process Auditability with QualAnalyzer: An Atomistic LLM Analysis Tool for Qualitativ...

arXiv - AI · 3 min ·
[2604.03742] Structured Multi-Criteria Evaluation of Large Language Models with Fuzzy Analytic Hierarchy Process and DualJudge
Llms

Abstract page for arXiv paper 2604.03742: Structured Multi-Criteria Evaluation of Large Language Models with Fuzzy Analytic Hierarchy Pro...

arXiv - AI · 4 min ·
[2604.03675] PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training
Llms

Abstract page for arXiv paper 2604.03675: PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training

arXiv - AI · 3 min ·
[2604.03660] TableVision: A Large-Scale Benchmark for Spatially Grounded Reasoning over Complex Hierarchical Tables
Llms

Abstract page for arXiv paper 2604.03660: TableVision: A Large-Scale Benchmark for Spatially Grounded Reasoning over Complex Hierarchical...

arXiv - AI · 4 min ·
[2604.03656] Beyond Retrieval: Modeling Confidence Decay and Deterministic Agentic Platforms in Generative Engine Optimization
Llms

Abstract page for arXiv paper 2604.03656: Beyond Retrieval: Modeling Confidence Decay and Deterministic Agentic Platforms in Generative E...

arXiv - AI · 4 min ·
[2604.03631] Single-agent vs. Multi-agents for Automated Video Analysis of On-Screen Collaborative Learning Behaviors
Llms

Abstract page for arXiv paper 2604.03631: Single-agent vs. Multi-agents for Automated Video Analysis of On-Screen Collaborative Learning ...

arXiv - AI · 4 min ·
[2604.03630] A Multimodal Foundation Model of Spatial Transcriptomics and Histology for Biological Discovery and Clinical Prediction
Llms

Abstract page for arXiv paper 2604.03630: A Multimodal Foundation Model of Spatial Transcriptomics and Histology for Biological Discovery...

arXiv - AI · 4 min ·
[2604.03589] Entropy and Attention Dynamics in Small Language Models: A Trace-Level Structural Analysis on the TruthfulQA Benchmark
Llms

Abstract page for arXiv paper 2604.03589: Entropy and Attention Dynamics in Small Language Models: A Trace-Level Structural Analysis on t...

arXiv - AI · 4 min ·
[2604.03571] Selective Forgetting for Large Reasoning Models
Machine Learning

Abstract page for arXiv paper 2604.03571: Selective Forgetting for Large Reasoning Models

arXiv - AI · 4 min ·
[2604.03557] When Do Hallucinations Arise? A Graph Perspective on the Evolution of Path Reuse and Path Compression
Llms

Abstract page for arXiv paper 2604.03557: When Do Hallucinations Arise? A Graph Perspective on the Evolution of Path Reuse and Path Compr...

arXiv - AI · 3 min ·
[2604.03527] Explainable Model Routing for Agentic Workflows
Machine Learning

Abstract page for arXiv paper 2604.03527: Explainable Model Routing for Agentic Workflows

arXiv - AI · 3 min ·
[2604.03524] Structural Rigidity and the 57-Token Predictive Window: A Physical Framework for Inference-Layer Governability in Large Language Models
Llms

Abstract page for arXiv paper 2604.03524: Structural Rigidity and the 57-Token Predictive Window: A Physical Framework for Inference-Laye...

arXiv - AI · 4 min ·