[2602.20918] Predicting Sentence Acceptability Judgments in Multimodal Contexts

arXiv - AI · 4 min read

Summary

This paper explores how visual context influences sentence acceptability judgments in humans and large language models (LLMs), revealing that visual images have minimal impact on human ratings, while LLMs show varied performance based on context.

Why It Matters

Understanding how multimodal contexts affect sentence acceptability is crucial for advancing natural language processing and improving the design of AI systems. This research highlights the differences in processing between humans and LLMs, which can inform future AI development and applications in language understanding.

Key Takeaways

  • Visual context has little impact on human sentence acceptability judgments.
  • LLMs can predict human acceptability judgments with high accuracy, especially without visual context.
  • Different LLMs exhibit varying patterns in sentence acceptability, with some closely resembling human judgments.
  • The presence of visual contexts decreases the correlation between LLM acceptability predictions and their normalised log probabilities.
  • This study provides insights into the processing differences between humans and LLMs in multimodal contexts.
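The normalised log probabilities mentioned above can be sketched with a toy example: a sentence's score is its mean per-token log probability, which is then correlated with human acceptability ratings. Everything below is invented for illustration; the paper derives these scores from LLM token log probabilities, not from the hypothetical unigram model used here.

```python
import math

# Toy unigram "language model": word -> probability. These probabilities,
# the sentences, and the ratings are all hypothetical stand-ins for the
# LLM-derived scores and human judgments used in the paper.
UNIGRAM = {"the": 0.07, "cat": 0.01, "sat": 0.005, "on": 0.03,
           "mat": 0.002, "colorless": 1e-6, "ideas": 1e-4, "sleep": 1e-4}

def normalised_log_prob(sentence, model, oov=1e-8):
    """Length-normalised log probability: mean per-token log p(w).
    Dividing by length keeps longer sentences from scoring lower
    merely because they contain more tokens."""
    tokens = sentence.lower().split()
    return sum(math.log(model.get(t, oov)) for t in tokens) / len(tokens)

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

sentences = [
    "the cat sat on the mat",   # ordinary, high-probability words
    "colorless ideas sleep",    # rare words -> very low score
    "the cat sleep on mat",     # mildly degraded
]
# Hypothetical human acceptability ratings on a 1-5 scale.
ratings = [4.7, 1.5, 3.2]

scores = [normalised_log_prob(s, UNIGRAM) for s in sentences]
r = pearson(scores, ratings)  # strongly positive on this toy data
```

A unigram model is blind to word order, so this only illustrates the scoring and correlation machinery; an LLM's autoregressive log probabilities additionally penalise ungrammatical orderings.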

Computer Science > Artificial Intelligence

arXiv:2602.20918 (cs) [Submitted on 24 Feb 2026]

Title: Predicting Sentence Acceptability Judgments in Multimodal Contexts

Authors: Hyewon Jang, Nikolai Ilinykh, Sharid Loáiciga, Jey Han Lau, Shalom Lappin

Abstract: Previous work has examined the capacity of deep neural networks (DNNs), particularly transformers, to predict human sentence acceptability judgments, both independently of context and in document contexts. We consider the effect of prior exposure to visual images (i.e., visual context) on these judgments for humans and large language models (LLMs). Our results suggest that, in contrast to textual context, visual images appear to have little if any impact on human acceptability ratings. However, LLMs display the compression effect seen in previous work on human judgments in document contexts. Different sorts of LLMs are able to predict human acceptability judgments to a high degree of accuracy, but in general, their performance is slightly better when visual contexts are removed. Moreover, the distribution of LLM judgments varies among models, with Qwen resembling human patterns and others diverging from them. LLM-generated predictions on sentence acceptability are highly correlated with their normalised log probabilities in general. However, the correlations d...

Related Articles

[2603.29957] Think Anywhere in Code Generation
arXiv - Machine Learning · 3 min

[2603.16880] NeuroNarrator: A Generalist EEG-to-Text Foundation Model for Clinical Interpretation via Spectro-Spatial Grounding and Temporal State-Space Reasoning
arXiv - Machine Learning · 4 min

[2512.21106] Semantic Refinement with LLMs for Graph Representations
arXiv - Machine Learning · 4 min

[2511.18123] Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models
arXiv - Machine Learning · 4 min