Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Apple to open Siri to rival AI services beyond ChatGPT

Apple plans to open its Siri voice assistant to rival artificial intelligence (AI) services, moving beyond its partnership with OpenAI, a...

AI Tools & Products · 4 min
Claude's scheduled tasks finally fixed what ChatGPT, Gemini, and every other AI tool got wrong

The boring stuff finally does itself.

AI Tools & Products · 9 min
ChatGPT Just Got 33% More Accurate (The AI News You Missed)

ChatGPT has improved its accuracy by 33%, marking a notable enhancement for users of the AI platform.

AI Tools & Products · 1 min

All Content

[2603.22816] When AI Shows Its Work, Is It Actually Working? Step-Level Evaluation Reveals Frontier Language Models Frequently Bypass Their Own Reasoning

arXiv - AI · 4 min
[2603.22714] PopResume: Causal Fairness Evaluation of LLM/VLM Resume Screeners with Population-Representative Dataset

arXiv - AI · 3 min
[2603.22755] KALAVAI: Predicting When Independent Specialist Fusion Works -- A Quantitative Model for Post-Hoc Cooperative LLM Training

arXiv - AI · 3 min
[2603.22629] LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation

arXiv - AI · 4 min
[2603.22665] Improving LLM Predictions via Inter-Layer Structural Encoders

arXiv - Machine Learning · 3 min
[2603.22623] To Agree or To Be Right? The Grounding-Sycophancy Tradeoff in Medical Vision-Language Models

arXiv - AI · 4 min
[2603.22563] Privacy-Preserving Reinforcement Learning from Human Feedback via Decoupled Reward Modeling

arXiv - Machine Learning · 3 min
[2603.22499] OrgForge-IT: A Verifiable Synthetic Benchmark for LLM-Based Insider Threat Detection

arXiv - Machine Learning · 4 min
[2603.22593] Language Models Can Explain Visual Features via Steering

arXiv - AI · 3 min
[2603.22582] Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?

arXiv - AI · 4 min
[2603.22577] STRIATUM-CTF: A Protocol-Driven Agentic Framework for General-Purpose CTF Solving

arXiv - AI · 4 min
[2603.22528] GraphRAG for Engineering Diagrams: ChatP&ID Enables LLM Interaction with P&IDs

arXiv - AI · 4 min
[2603.22519] LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface

arXiv - AI · 4 min
[2603.22510] Do Large Language Models Reduce Research Novelty? Evidence from Information Systems Journals

arXiv - AI · 3 min
[2603.22492] Tiny Inference-Time Scaling with Latent Verifiers

arXiv - AI · 4 min
[2603.22479] Cognitive Training for Language Models: Towards General Capabilities via Cross-Entropy Games

arXiv - AI · 3 min
[2603.22473] Functional Component Ablation Reveals Specialization Patterns in Hybrid Language Model Architectures

arXiv - AI · 3 min
[2603.22355] Demystifying Low-Rank Knowledge Distillation in Large Language Models: Convergence, Generalization, and Information-Theoretic Guarantees

arXiv - Machine Learning · 4 min
[2603.22344] Errors in AI-Assisted Retrieval of Medical Literature: A Comparative Study

arXiv - Machine Learning · 4 min
[2603.22459] LLM-guided headline rewriting for clickability enhancement without clickbait

arXiv - AI · 4 min