Large Language Models
GPT, Claude, Gemini, and other LLMs
[2603.19276] From Flat to Structural: Enhancing Automated Short Answer Grading with GraphRAG
[2603.19275] Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language Models
[2603.19274] CURE: A Multimodal Benchmark for Clinical Understanding and Retrieval Evaluation
[2603.19273] LSR: Linguistic Safety Robustness Benchmark for Low-Resource West African Languages
[2603.19271] A Human-Centered Workflow for Using Large Language Models in Content Analysis
[2603.19268] Full-Stack Domain Enhancement for Combustion LLMs: Construction and Optimization
[2603.19266] Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion
[2603.19265] When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models
[2603.19264] Generative Active Testing: Efficient LLM Evaluation via Proxy Task Adaptation
[2603.19262] The α-Law of Observable Belief Revision in Large Language Model Inference
[2603.19255] LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models
[2603.19258] MAPLE: Metadata Augmented Private Language Evolution
[2603.19252] GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams
[2603.19253] A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2
[2603.19236] L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)
[2603.19247] When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models
[2603.17765] Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Similarity Search
[2603.20170] Learning Dynamic Belief Graphs for Theory-of-mind Reasoning
[2603.20101] Pitfalls in Evaluating Interpretability Agents
[2603.20046] Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs