Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Agents Can Now Propose and Deploy Their Own Code Changes

150 clones yesterday. 43 stars in 3 days. Every agent framework you've used (LangChain, LangGraph, Claude Code) assumes agents are tools ...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

[2603.17839] How do LLMs Compute Verbal Confidence

arXiv - AI · 4 min · about 3 hours ago

[2603.15970] 100x Cost & Latency Reduction: Performance Analysis of AI Query Approximation using Lightweight Proxy Models

arXiv - AI · 4 min · about 3 hours ago

All Content

[2603.14672] Seamless Deception: Larger Language Models Are Better Knowledge Concealers

arXiv - AI · 3 min · 8 days ago

[2603.14602] PA3: Policy-Aware Agent Alignment through Chain-of-Thought

arXiv - Machine Learning · 3 min · 8 days ago

[2603.13406] Nuanced Emotion Recognition Based on a Segment-based MLLM Framework Leveraging Qwen3-Omni for AH Detection

arXiv - AI · 4 min · 8 days ago

[2603.13275] PREBA: Surgical Duration Prediction via PCA-Weighted Retrieval-Augmented LLMs and Bayesian Averaging Aggregation

arXiv - Machine Learning · 4 min · 8 days ago

[2603.07496] From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents

arXiv - AI · 3 min · 8 days ago

[2602.11549] Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

arXiv - Machine Learning · 4 min · 8 days ago

[2602.07077] CALM: Class-Conditional Sparse Attention Vectors for Large Audio-Language Models

arXiv - AI · 4 min · 8 days ago

[2602.00319] Detecting AI-Generated Content in Academic Peer Reviews

arXiv - Machine Learning · 3 min · 8 days ago

[2601.20009] LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

arXiv - Machine Learning · 4 min · 8 days ago

[2601.14958] Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala

arXiv - AI · 3 min · 8 days ago

[2601.12494] Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

arXiv - AI · 4 min · 8 days ago

[2601.07148] Measuring Iterative Temporal Reasoning with Time Puzzles

arXiv - AI · 3 min · 8 days ago

[2601.01547] Vision-language models lag human performance on physical dynamics and intent reasoning

arXiv - Machine Learning · 4 min · 8 days ago

[2601.01279] Collusive Pricing Under LLM

arXiv - AI · 4 min · 8 days ago

[2512.16523] TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

arXiv - AI · 4 min · 8 days ago

[2512.03903] BERnaT: Basque Encoders for Representing Natural Textual Diversity

arXiv - AI · 3 min · 8 days ago

[2512.05959] M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

arXiv - AI · 4 min · 8 days ago

[2511.23455] The Price of Progress: Price Performance and the Future of AI

arXiv - Machine Learning · 4 min · 8 days ago

[2511.19299] Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

arXiv - Machine Learning · 4 min · 8 days ago

[2511.22169] Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

arXiv - AI · 4 min · 8 days ago
