[2602.23335] Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset
Summary
This paper presents the Asta Interaction Dataset and analyzes its more than 200,000 user queries and interaction logs from two deployed AI-powered research tools to characterize query patterns, engagement behaviors, and how usage evolves with experience.
Why It Matters
As AI tools become integral to scientific research, understanding user interaction is crucial for improving these systems. This dataset provides valuable insights into how researchers utilize AI, which can inform future designs and evaluations of AI research assistants.
Key Takeaways
- The Asta Interaction Dataset includes over 200,000 user queries and interaction logs.
- Users submit longer and more complex queries than in traditional search and treat the tools as collaborative partners.
- Experience leads to more targeted queries and deeper engagement with citations.
- The dataset introduces a new query intent taxonomy for better AI tool design.
- Findings can guide the development of future AI research assistants.
Computer Science > Human-Computer Interaction
arXiv:2602.23335 (cs)
Submitted on 26 Feb 2026
Title: Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset
Authors: Dany Haddad, Dan Bareket, Joseph Chee Chang, Jay DeYoung, Jena D. Hwang, Uri Katz, Mark Polak, Sangho Suh, Harshit Surana, Aryeh Tiktinsky, Shriya Atmakuri, Jonathan Bragg, Mike D'Arcy, Sergey Feldman, Amal Hassan-Ali, Rubén Lozano, Bodhisattwa Prasad Majumder, Charles McGrady, Amanpreet Singh, Brooke Vlahos, Yoav Goldberg, Doug Downey
Abstract: AI-powered scientific research tools are rapidly being integrated into research workflows, yet the field lacks a clear lens into how researchers use these systems in real-world settings. We present and analyze the Asta Interaction Dataset, a large-scale resource comprising over 200,000 user queries and interaction logs from two deployed tools (a literature discovery interface and a scientific question-answering interface) within an LLM-powered retrieval-augmented generation platform. Using this dataset, we characterize query patterns, engagement behaviors, and how usage evolves with experience. We find that users submit longer and more complex queries than in traditional search, and treat the system as a collaborative resear...