Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Agents Can Now Propose and Deploy Their Own Code Changes

150 clones yesterday. 43 stars in 3 days. Every agent framework you've used (LangChain, LangGraph, Claude Code) assumes agents are tools ...

Reddit - Artificial Intelligence · 1 min

[2603.17839] How do LLMs Compute Verbal Confidence

Abstract page for arXiv paper 2603.17839: How do LLMs Compute Verbal Confidence

arXiv - AI · 4 min

[2603.15970] 100x Cost & Latency Reduction: Performance Analysis of AI Query Approximation using Lightweight Proxy Models

Abstract page for arXiv paper 2603.15970: 100x Cost & Latency Reduction: Performance Analysis of AI Query Approximation using Lightweight...

arXiv - AI · 4 min

All Content

[2603.14672] Seamless Deception: Larger Language Models Are Better Knowledge Concealers

Abstract page for arXiv paper 2603.14672: Seamless Deception: Larger Language Models Are Better Knowledge Concealers

arXiv - AI · 3 min

[2603.14602] PA3: Policy-Aware Agent Alignment through Chain-of-Thought

Abstract page for arXiv paper 2603.14602: PA3: Policy-Aware Agent Alignment through Chain-of-Thought

arXiv - Machine Learning · 3 min

[2603.13406] Nuanced Emotion Recognition Based on a Segment-based MLLM Framework Leveraging Qwen3-Omni for AH Detection

Abstract page for arXiv paper 2603.13406: Nuanced Emotion Recognition Based on a Segment-based MLLM Framework Leveraging Qwen3-Omni for A...

arXiv - AI · 4 min

[2603.13275] PREBA: Surgical Duration Prediction via PCA-Weighted Retrieval-Augmented LLMs and Bayesian Averaging Aggregation

Abstract page for arXiv paper 2603.13275: PREBA: Surgical Duration Prediction via PCA-Weighted Retrieval-Augmented LLMs and Bayesian Aver...

arXiv - Machine Learning · 4 min

[2603.07496] From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents

Abstract page for arXiv paper 2603.07496: From Thinker to Society: Security in Hierarchical Autonomy Evolution of AI Agents

arXiv - AI · 3 min

[2602.11549] Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

Abstract page for arXiv paper 2602.11549: Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

arXiv - Machine Learning · 4 min

[2602.07077] CALM: Class-Conditional Sparse Attention Vectors for Large Audio-Language Models

Abstract page for arXiv paper 2602.07077: CALM: Class-Conditional Sparse Attention Vectors for Large Audio-Language Models

arXiv - AI · 4 min

[2602.00319] Detecting AI-Generated Content in Academic Peer Reviews

Abstract page for arXiv paper 2602.00319: Detecting AI-Generated Content in Academic Peer Reviews

arXiv - Machine Learning · 3 min

[2601.20009] LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

Abstract page for arXiv paper 2601.20009: LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

arXiv - Machine Learning · 4 min

[2601.14958] Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala

Abstract page for arXiv paper 2601.14958: Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala

arXiv - AI · 3 min

[2601.12494] Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

Abstract page for arXiv paper 2601.12494: Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

arXiv - AI · 4 min

[2601.07148] Measuring Iterative Temporal Reasoning with Time Puzzles

Abstract page for arXiv paper 2601.07148: Measuring Iterative Temporal Reasoning with Time Puzzles

arXiv - AI · 3 min

[2601.01547] Vision-language models lag human performance on physical dynamics and intent reasoning

Abstract page for arXiv paper 2601.01547: Vision-language models lag human performance on physical dynamics and intent reasoning

arXiv - Machine Learning · 4 min

[2601.01279] Collusive Pricing Under LLM

Abstract page for arXiv paper 2601.01279: Collusive Pricing Under LLM

arXiv - AI · 4 min

[2512.16523] TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

Abstract page for arXiv paper 2512.16523: TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

arXiv - AI · 4 min

[2512.03903] BERnaT: Basque Encoders for Representing Natural Textual Diversity

Abstract page for arXiv paper 2512.03903: BERnaT: Basque Encoders for Representing Natural Textual Diversity

arXiv - AI · 3 min

[2512.05959] M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

Abstract page for arXiv paper 2512.05959: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

arXiv - AI · 4 min

[2511.23455] The Price of Progress: Price Performance and the Future of AI

Abstract page for arXiv paper 2511.23455: The Price of Progress: Price Performance and the Future of AI

arXiv - Machine Learning · 4 min

[2511.19299] Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

Abstract page for arXiv paper 2511.19299: Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

arXiv - Machine Learning · 4 min

[2511.22169] Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

Abstract page for arXiv paper 2511.22169: Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

arXiv - AI · 4 min