Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

The Day AI Stopped Being a Tab You Switch To — Claude Is Now Inside Your Software

Submitted by /u/monotvtv

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

Llms

How are LLMs 'corrected' when users identify them spreading misinformation or saying something harmful?

I watched Last Week Tonight's piece on AI chatbots today, and it got me thinking about that old screenshot of a Google search in which Ge...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

Llms

What is the scientific value of administering the standard Rorschach test to LLMs when the training data is almost certainly contaminated? (R) + [D]

A recent paper published in JMIR Mental Health (Csigó & Cserey, 2026) caught my attention. The researchers administered the 10 standa...

Reddit - Machine Learning · 1 min · about 3 hours ago

All Content

Llms

[2603.04162] Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04069] Monitoring Emergent Reward Hacking During Generation via Internal Activations

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.03683] CONCUR: Benchmarking LLMs for Concurrent Code Generation

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2603.04002] Discriminative Perception via Anchored Description for Reasoning Segmentation

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.03589] stratum: A System Infrastructure for Massive Agent-Centric ML Workloads

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2603.03983] GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03583] ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2603.03964] BLOCK: An Open-Source Bi-Stage MLLM Character-to-Skin Pipeline for Minecraft

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03915] Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03897] IROSA: Interactive Robot Skill Adaptation using Natural Language

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03881] On the Suitability of LLM-Driven Agents for Dark Pattern Audits

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.03336] Prompt-Dependent Ranking of Large Language Models with Uncertainty Quantification

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2603.03310] Entropic-Time Inference: Self-Organizing Large Language Model Decoding Beyond Attention

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2603.03823] SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03790] T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04378] Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04355] Efficient Refusal Ablation in LLM through Optimal Transport

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04354] Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2603.03752] Confidence-Calibrated Small-Large Language Model Collaboration for Cost-Efficient Reasoning

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04300] LUMINA: Foundation Models for Topology Transferable ACOPF

arXiv - Machine Learning · 3 min · about 2 months ago
