[2410.15173] Uncovering Autoregressive LLM Knowledge of Thematic Fit in Event Representation

arXiv - AI

Summary

This paper explores how autoregressive large language models (LLMs) assess thematic fit in event representations, setting new state-of-the-art results while revealing performance differences between closed- and open-weight models.

Why It Matters

Understanding how LLMs evaluate thematic fit is crucial for improving their application in natural language processing tasks. This research highlights the strengths and weaknesses of different model types, which can inform future developments in AI and machine learning.

Key Takeaways

  • LLMs can effectively estimate thematic fit for semantic roles.
  • Closed models achieve higher overall scores than open models, but are worse at filtering out generated sentences incompatible with the specified predicate, role, and argument.
  • Multi-step reasoning enhances performance in closed models.

Computer Science > Computation and Language · arXiv:2410.15173 (cs)

[Submitted on 19 Oct 2024 (v1), last revised 22 Feb 2026 (this version, v3)]

Title: Uncovering Autoregressive LLM Knowledge of Thematic Fit in Event Representation

Authors: Safeyah Khaled Alshemali, Daniel Bauer, Yuval Marton

Abstract: The thematic fit estimation task measures semantic arguments' compatibility with a specific semantic role for a specific predicate. We investigate whether LLMs have consistent, expressible knowledge of event arguments' thematic fit by experimenting with various prompt designs, manipulating input context, reasoning, and output forms. We set a new state of the art on thematic fit benchmarks, but show that closed- and open-weight LLMs respond differently to our prompting strategies: closed models achieve better scores overall and benefit from multi-step reasoning, but they perform worse at filtering out generated sentences incompatible with the specified predicate, role, and argument.

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Cite as: arXiv:2410.15173 [cs.CL] (or arXiv:2410.15173v3 [cs.CL] for this version), https://doi.org/10.48550/arXiv.2410.15173
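To make the task concrete, the thematic fit setup described in the abstract can be sketched as a prompt-and-parse loop: ask the model to rate how well an argument fills a semantic role for a predicate, then extract a numeric score from its reply. The wording, 1-7 scale, and parsing below are illustrative assumptions, not the paper's actual prompts or evaluation code.

```python
# Minimal sketch of thematic fit estimation via prompting, under assumptions:
# the exact prompt wording, rating scale, and answer format are hypothetical.
import re
from typing import Optional

def build_prompt(predicate: str, role: str, argument: str) -> str:
    """Compose a single-step rating prompt for one (predicate, role, argument) triple."""
    return (
        f"On a scale from 1 (very poor fit) to 7 (very good fit), "
        f"how typical is '{argument}' as the {role} of the verb '{predicate}'? "
        f"Answer with a single number."
    )

def parse_rating(model_output: str) -> Optional[int]:
    """Extract the first standalone digit 1-7 from the model's reply, else None."""
    match = re.search(r"\b([1-7])\b", model_output)
    return int(match.group(1)) if match else None

# Example triple: how well does "knife" fit the instrument role of "cut"?
prompt = build_prompt("cut", "instrument", "knife")
rating = parse_rating("I would say 7, since a knife is a prototypical instrument.")
```

In practice the prompt would be sent to an LLM API and the parsed ratings correlated with human thematic fit judgments from the benchmarks; the multi-step variants the paper mentions would insert a reasoning turn before the final numeric answer.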

