Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

[2604.17460] Agentic Education: Using Claude Code to Teach Claude Code

Abstract page for arXiv paper 2604.17460: Agentic Education: Using Claude Code to Teach Claude Code

arXiv - AI · 4 min · about 3 hours ago

Llms

[2603.09117] Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards

Abstract page for arXiv paper 2603.09117: Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Ve...

arXiv - AI · 3 min · about 3 hours ago

Llms

[2602.10140] Can Large Language Models Implement Agent-Based Models? An ODD-based Replication Study

Abstract page for arXiv paper 2602.10140: Can Large Language Models Implement Agent-Based Models? An ODD-based Replication Study

arXiv - AI · 4 min · about 3 hours ago

All Content

Llms

[2505.05619] LiteLMGuard: Seamless and Lightweight On-Device Prompt Filtering for Safeguarding Small Language Models against Quantization-induced Risks and Vulnerabilities

Abstract page for arXiv paper 2505.05619: LiteLMGuard: Seamless and Lightweight On-Device Prompt Filtering for Safeguarding Small Languag...

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2404.02138] Topic-Based Watermarks for Large Language Models

Abstract page for arXiv paper 2404.02138: Topic-Based Watermarks for Large Language Models

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2602.04288] Contextual Drag: How Errors in the Context Affect LLM Reasoning

Abstract page for arXiv paper 2602.04288: Contextual Drag: How Errors in the Context Affect LLM Reasoning

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2601.09566] Hot-Start from Pixels: Low-Resolution Visual Tokens for Chinese Language Modeling

Abstract page for arXiv paper 2601.09566: Hot-Start from Pixels: Low-Resolution Visual Tokens for Chinese Language Modeling

arXiv - AI · 3 min · about 2 months ago

Llms

[2511.12832] From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation

Abstract page for arXiv paper 2511.12832: From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation

arXiv - AI · 3 min · about 2 months ago

Llms

[2510.14686] xLLM Technical Report

Abstract page for arXiv paper 2510.14686: xLLM Technical Report

arXiv - AI · 4 min · about 2 months ago

Llms

[2510.14086] Every Language Model Has a Forgery-Resistant Signature

Abstract page for arXiv paper 2510.14086: Every Language Model Has a Forgery-Resistant Signature

arXiv - AI · 4 min · about 2 months ago

Llms

[2510.13900] Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences

Abstract page for arXiv paper 2510.13900: Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences

arXiv - AI · 4 min · about 2 months ago

Llms

[2510.13315] Self-Aug: Query and Entropy Adaptive Decoding for Large Vision-Language Models

Abstract page for arXiv paper 2510.13315: Self-Aug: Query and Entropy Adaptive Decoding for Large Vision-Language Models

arXiv - AI · 4 min · about 2 months ago

Llms

[2510.06084] Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability

Abstract page for arXiv paper 2510.06084: Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.22641] Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity

Abstract page for arXiv paper 2509.22641: Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.21091] Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute

Abstract page for arXiv paper 2509.21091: Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute

arXiv - AI · 3 min · about 2 months ago

Llms

[2509.20986] SiNGER: A Clearer Voice Distills Vision Transformers Further

Abstract page for arXiv paper 2509.20986: SiNGER: A Clearer Voice Distills Vision Transformers Further

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.12610] ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

Abstract page for arXiv paper 2509.12610: ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.10625] No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

Abstract page for arXiv paper 2509.10625: No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.05425] No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata

Abstract page for arXiv paper 2509.05425: No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata

arXiv - AI · 3 min · about 2 months ago

Llms

[2511.10833] SURFACEBENCH: A Geometry-Aware Benchmark for Symbolic Surface Discovery

Abstract page for arXiv paper 2511.10833: SURFACEBENCH: A Geometry-Aware Benchmark for Symbolic Surface Discovery

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2511.08939] TransactionGPT

Abstract page for arXiv paper 2511.08939: TransactionGPT

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2507.05890] Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators

Abstract page for arXiv paper 2507.05890: Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators

arXiv - AI · 4 min · about 2 months ago

Llms

[2507.01335] LEDOM: Reverse Language Model

Abstract page for arXiv paper 2507.01335: LEDOM: Reverse Language Model

arXiv - AI · 3 min · about 2 months ago
