Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Apple to open Siri to rival AI services beyond ChatGPT

Apple plans to open its Siri voice assistant to rival artificial intelligence (AI) services, moving beyond its partnership with OpenAI, a...

AI Tools & Products · 4 min
Claude's scheduled tasks finally fixed what ChatGPT, Gemini, and every other AI tool got wrong

The boring stuff finally does itself.

AI Tools & Products · 9 min
ChatGPT Just Got 33% More Accurate (The AI News You Missed)

ChatGPT has improved its accuracy by 33%, marking a notable enhancement for users of the AI platform.

AI Tools & Products · 1 min

All Content

[2603.22816] When AI Shows Its Work, Is It Actually Working? Step-Level Evaluation Reveals Frontier Language Models Frequently Bypass Their Own Reasoning

arXiv - AI · 4 min
[2603.22714] PopResume: Causal Fairness Evaluation of LLM/VLM Resume Screeners with Population-Representative Dataset

arXiv - AI · 3 min
[2603.22755] KALAVAI: Predicting When Independent Specialist Fusion Works -- A Quantitative Model for Post-Hoc Cooperative LLM Training

arXiv - AI · 3 min
[2603.22629] LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation

arXiv - AI · 4 min
[2603.22665] Improving LLM Predictions via Inter-Layer Structural Encoders

arXiv - Machine Learning · 3 min
[2603.22623] To Agree or To Be Right? The Grounding-Sycophancy Tradeoff in Medical Vision-Language Models

arXiv - AI · 4 min
[2603.22563] Privacy-Preserving Reinforcement Learning from Human Feedback via Decoupled Reward Modeling

arXiv - Machine Learning · 3 min
[2603.22499] OrgForge-IT: A Verifiable Synthetic Benchmark for LLM-Based Insider Threat Detection

arXiv - Machine Learning · 4 min
[2603.22593] Language Models Can Explain Visual Features via Steering

arXiv - AI · 3 min
[2603.22582] Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?

arXiv - AI · 4 min
[2603.22577] STRIATUM-CTF: A Protocol-Driven Agentic Framework for General-Purpose CTF Solving

arXiv - AI · 4 min
[2603.22528] GraphRAG for Engineering Diagrams: ChatP&ID Enables LLM Interaction with P&IDs

arXiv - AI · 4 min
[2603.22519] LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface

arXiv - AI · 4 min
[2603.22510] Do Large Language Models Reduce Research Novelty? Evidence from Information Systems Journals

arXiv - AI · 3 min
[2603.22492] Tiny Inference-Time Scaling with Latent Verifiers

arXiv - AI · 4 min
[2603.22479] Cognitive Training for Language Models: Towards General Capabilities via Cross-Entropy Games

arXiv - AI · 3 min
[2603.22473] Functional Component Ablation Reveals Specialization Patterns in Hybrid Language Model Architectures

arXiv - AI · 3 min
[2603.22355] Demystifying Low-Rank Knowledge Distillation in Large Language Models: Convergence, Generalization, and Information-Theoretic Guarantees

arXiv - Machine Learning · 4 min
[2603.22344] Errors in AI-Assisted Retrieval of Medical Literature: A Comparative Study

arXiv - Machine Learning · 4 min
[2603.22459] LLM-guided headline rewriting for clickability enhancement without clickbait

arXiv - AI · 4 min