[2509.21825] DS-STAR: Data Science Agent for Solving Diverse Tasks across Heterogeneous Formats and Open-Ended Queries

[2509.21825] DS-STAR: Data Science Agent for Solving Diverse Tasks across Heterogeneous Formats and Open-Ended Queries

arXiv - AI 3 min read Article

Summary

The paper introduces DS-STAR, a data science agent designed to automate complex workflows by integrating diverse data formats and generating comprehensive reports for open-ended queries, outperforming existing models in various benchmarks.

Why It Matters

As data science becomes increasingly complex, tools like DS-STAR are crucial for automating tasks that require the synthesis of information from multiple sources. This advancement can significantly enhance productivity and accuracy in data analysis, making it relevant for researchers and practitioners in AI and data science.

Key Takeaways

  • DS-STAR effectively integrates data from heterogeneous formats.
  • It generates comprehensive reports for open-ended queries, surpassing existing models.
  • The agent shows state-of-the-art performance on multiple benchmarks.
  • It excels particularly in complex QA tasks requiring multi-file processing.
  • Over 88% of evaluations preferred DS-STAR's report quality over baseline models.

Computer Science > Artificial Intelligence arXiv:2509.21825 (cs) [Submitted on 26 Sep 2025 (v1), last revised 24 Feb 2026 (this version, v4)] Title:DS-STAR: Data Science Agent for Solving Diverse Tasks across Heterogeneous Formats and Open-Ended Queries Authors:Jaehyun Nam, Jinsung Yoon, Jiefeng Chen, Raj Sinha, Jinwoo Shin, Tomas Pfister View a PDF of the paper titled DS-STAR: Data Science Agent for Solving Diverse Tasks across Heterogeneous Formats and Open-Ended Queries, by Jaehyun Nam and 5 other authors View PDF HTML (experimental) Abstract:While large language models (LLMs) have shown promise in automating data science, existing agents often struggle with the complexity of real-world workflows that require exploring multiple sources and synthesizing open-ended insights. In this paper, we introduce DS-STAR, a specialized agent to bridge this gap. Unlike prior approaches, DS-STAR is designed to (1) seamlessly process and integrate data across diverse, heterogeneous formats, and (2) move beyond simple QA to generate comprehensive research reports for open-ended queries. Extensive evaluation shows that DS-STAR achieves state-of-the-art performance on four benchmarks: DABStep, DABStep-Research, KramaBench, and DA-Code. Most notably, it significantly outperforms existing baseline models especially in hard-level QA tasks requiring multi-file processing, and generates high-quality data science reports that are preferred over the best baseline model in over 88% of cases. Subj...

Related Articles

Llms

wtf bro did what? arc 3 2026

The Physarum Explorer is a high-speed, bio-inspired neural model designed specifically for ARC geometry. Here is the snapshot of its curr...

Reddit - Artificial Intelligence · 1 min ·
Llms

A robot car with a Claude AI brain started a YouTube vlog about its own existence

Not a demo reel. Not a tutorial. A robot narrating its own experience — debugging, falling off shelves, questioning its identity. First-p...

Reddit - Artificial Intelligence · 1 min ·
Llms

Study: LLMs Able to De-Anonymize User Accounts on Reddit, Hacker News & Other "Pseudonymous" Platforms; Report Co-Author Expands, Advises

Advice from the study's co-author: "Be aware that it’s not any single post that identifies you, but the combination of small details acro...

Reddit - Artificial Intelligence · 1 min ·
Llms

do you guys actually trust AI tools with your data?

idk if it’s just me but lately i’ve been thinking about how casually we use stuff like chatgpt and claude for everything like coding, ran...

Reddit - Artificial Intelligence · 1 min ·
More in Llms: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime