[2602.16930] Say It My Way: Exploring Control in Conversational Visual Question Answering with Blind Users
Summary
The paper explores how blind users can customize interactions with conversational visual question answering systems, highlighting the need for user control in AI tools.
Why It Matters
This research addresses the limitations of existing assistive technologies for blind users, emphasizing the importance of user control in enhancing accessibility and interaction quality. By identifying customization techniques, the study aims to improve the user experience and effectiveness of visual question answering systems.
Key Takeaways
- Blind users benefit from customizable interactions in visual question answering systems.
- Current systems often lack flexibility and verbosity control, impacting user experience.
- The study introduces prompting techniques that enhance user control and interaction quality.
- Data from 418 interactions provides insights for improving the design of assistive technologies.
- Customization techniques can help users navigate system limitations effectively.
Computer Science > Human-Computer Interaction
arXiv:2602.16930 (cs)
[Submitted on 18 Feb 2026]
Title: Say It My Way: Exploring Control in Conversational Visual Question Answering with Blind Users
Authors: Farnaz Zamiri Zeraati, Yang Trista Cao, Yuehan Qiao, Hal Daumé III, Hernisa Kacorri
Abstract: Prompting and steering techniques are well established in general-purpose generative AI, yet assistive visual question answering (VQA) tools for blind users still follow rigid interaction patterns with limited opportunities for customization. User control can be helpful when system responses are misaligned with users' goals and contexts, a gap that becomes especially consequential for blind users who may rely on these systems for access. We invite 11 blind users to customize their interactions with a real-world conversational VQA system. Drawing on 418 interactions, reflections, and post-study interviews, we analyze prompting-based techniques participants adopted, including those introduced in the study and those developed independently in real-world settings. VQA interactions were often lengthy: participants averaged 3 turns, sometimes up to 21, with input text typically tenfold shorter than the responses they heard. Built on state-of-the-art LLMs, the system lacked verbosity controls,...