Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Anthropic gave Claude $100 to go shopping, here’s what the AI ended up buying

Anthropic’s AI experiment showed Claude independently handled 186 deals worth over $4,000, but results varied by model capability, with u...

AI Tools & Products · 5 min · about 2 hours ago

CoreWeave (CRWV) Partners with Anthropic to Provide Infrastructure for Claude AI Models

CoreWeave Inc. (NASDAQ:CRWV) is one of the best technology stocks to buy for the next decade. On April 20, CoreWeave announced a multi-ye...

AI Tools & Products · 2 min · about 2 hours ago

[2604.01650] AromaGen: Interactive Generation of Rich Olfactory Experiences with Multimodal Language Models

Abstract page for arXiv paper 2604.01650: AromaGen: Interactive Generation of Rich Olfactory Experiences with Multimodal Language Models

arXiv - AI · 4 min · about 3 hours ago

All Content

[2603.19281] URAG: A Benchmark for Uncertainty Quantification in Retrieval-Augmented Large Language Models

Abstract page for arXiv paper 2603.19281: URAG: A Benchmark for Uncertainty Quantification in Retrieval-Augmented Large Language Models

arXiv - AI · 4 min · about 1 month ago

[2603.19280] From Feature-Based Models to Generative AI: Validity Evidence for Constructed Response Scoring

Abstract page for arXiv paper 2603.19280: From Feature-Based Models to Generative AI: Validity Evidence for Constructed Response Scoring

arXiv - AI · 4 min · about 1 month ago

[2603.19278] HypeLoRA: Hyper-Network-Generated LoRA Adapters for Calibrated Language Model Fine-Tuning

Abstract page for arXiv paper 2603.19278: HypeLoRA: Hyper-Network-Generated LoRA Adapters for Calibrated Language Model Fine-Tuning

arXiv - AI · 4 min · about 1 month ago

[2603.19276] From Flat to Structural: Enhancing Automated Short Answer Grading with GraphRAG

Abstract page for arXiv paper 2603.19276: From Flat to Structural: Enhancing Automated Short Answer Grading with GraphRAG

arXiv - AI · 4 min · about 1 month ago

[2603.19275] Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language Models

Abstract page for arXiv paper 2603.19275: Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language M...

arXiv - AI · 3 min · about 1 month ago

[2603.19274] CURE: A Multimodal Benchmark for Clinical Understanding and Retrieval Evaluation

Abstract page for arXiv paper 2603.19274: CURE: A Multimodal Benchmark for Clinical Understanding and Retrieval Evaluation

arXiv - AI · 4 min · about 1 month ago

[2603.19273] LSR: Linguistic Safety Robustness Benchmark for Low-Resource West African Languages

Abstract page for arXiv paper 2603.19273: LSR: Linguistic Safety Robustness Benchmark for Low-Resource West African Languages

arXiv - AI · 3 min · about 1 month ago

[2603.19271] A Human-Centered Workflow for Using Large Language Models in Content Analysis

Abstract page for arXiv paper 2603.19271: A Human-Centered Workflow for Using Large Language Models in Content Analysis

arXiv - AI · 3 min · about 1 month ago

[2603.19268] Full-Stack Domain Enhancement for Combustion LLMs: Construction and Optimization

Abstract page for arXiv paper 2603.19268: Full-Stack Domain Enhancement for Combustion LLMs: Construction and Optimization

arXiv - AI · 3 min · about 1 month ago

[2603.19266] Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion

Abstract page for arXiv paper 2603.19266: Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.19265] When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models

Abstract page for arXiv paper 2603.19265: When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the ...

arXiv - AI · 4 min · about 1 month ago

[2603.19264] Generative Active Testing: Efficient LLM Evaluation via Proxy Task Adaptation

Abstract page for arXiv paper 2603.19264: Generative Active Testing: Efficient LLM Evaluation via Proxy Task Adaptation

arXiv - AI · 3 min · about 1 month ago

[2603.19262] The α-Law of Observable Belief Revision in Large Language Model Inference

Abstract page for arXiv paper 2603.19262: The α-Law of Observable Belief Revision in Large Language Model Inference

arXiv - AI · 4 min · about 1 month ago

[2603.19255] LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

Abstract page for arXiv paper 2603.19255: LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

arXiv - AI · 4 min · about 1 month ago

[2603.19258] MAPLE: Metadata Augmented Private Language Evolution

Abstract page for arXiv paper 2603.19258: MAPLE: Metadata Augmented Private Language Evolution

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.19252] GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams

Abstract page for arXiv paper 2603.19252: GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams

arXiv - AI · 3 min · about 1 month ago

[2603.19253] A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

Abstract page for arXiv paper 2603.19253: A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

arXiv - AI · 4 min · about 1 month ago

[2603.19236] L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)

Abstract page for arXiv paper 2603.19236: L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)

arXiv - AI · 3 min · about 1 month ago

[2603.19247] When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models

Abstract page for arXiv paper 2603.19247: When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models

arXiv - AI · 4 min · about 1 month ago

[2603.17765] Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Similarity Search

Abstract page for arXiv paper 2603.17765: Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Simi...

arXiv - AI · 4 min · about 1 month ago

