Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

[R] The Lyra Technique — A framework for interpreting internal cognitive states in LLMs (Zenodo, open access)

We're releasing a paper on a new framework for reading and interpreting the internal cognitive states of large language models: "The Lyra...

Reddit - Machine Learning · 1 min · 32 minutes ago

Looking to build a production-level AI/ML project (agentic systems), need guidance on what to build

Hi everyone, I’m a final-year undergraduate AI/ML student currently focusing on applied AI / agentic systems. So far, I’ve spent time und...

Reddit - ML Jobs · 1 min · about 2 hours ago

Google isn’t an AI-first company despite Gemini being great

Any time I see an article quoting a Google executive about how "successfully" they’ve implemented AI, I roll my eyes. People treat these ...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

All Content

[2505.13770] Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Inference

Abstract page for arXiv paper 2505.13770: Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Infe...

arXiv - Machine Learning · 4 min · about 1 month ago

[2511.21033] Towards Trustworthy Legal AI through LLM Agents and Formal Reasoning

Abstract page for arXiv paper 2511.21033: Towards Trustworthy Legal AI through LLM Agents and Formal Reasoning

arXiv - AI · 4 min · about 1 month ago

[2511.04439] CoRPO: Adding a Correctness Bias to GRPO Improves Generalization

Abstract page for arXiv paper 2511.04439: CoRPO: Adding a Correctness Bias to GRPO Improves Generalization

arXiv - Machine Learning · 4 min · about 1 month ago

[2510.08966] Beyond Prefixes: Graph-as-Memory Cross-Attention for Knowledge Graph Completion with Large Language Models

Abstract page for arXiv paper 2510.08966: Beyond Prefixes: Graph-as-Memory Cross-Attention for Knowledge Graph Completion with Large Lang...

arXiv - AI · 4 min · about 1 month ago

[2505.04997] Foam-Agent: Towards Automated Intelligent CFD Workflows

Abstract page for arXiv paper 2505.04997: Foam-Agent: Towards Automated Intelligent CFD Workflows

arXiv - AI · 3 min · about 1 month ago

[2503.07928] The StudyChat Dataset: Analyzing Student Dialogues With ChatGPT in an Artificial Intelligence Course

Abstract page for arXiv paper 2503.07928: The StudyChat Dataset: Analyzing Student Dialogues With ChatGPT in an Artificial Intelligence C...

arXiv - AI · 4 min · about 1 month ago

[2603.05500] POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

Abstract page for arXiv paper 2603.05500: POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.05494] Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

Abstract page for arXiv paper 2603.05494: Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.05488] Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

Abstract page for arXiv paper 2603.05488: Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.05471] Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

Abstract page for arXiv paper 2603.05471: Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

arXiv - AI · 4 min · about 1 month ago

[2603.05432] Ensembling Language Models with Sequential Monte Carlo

Abstract page for arXiv paper 2603.05432: Ensembling Language Models with Sequential Monte Carlo

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.05421] MobileFetalCLIP: Selective Repulsive Knowledge Distillation for Mobile Fetal Ultrasound Analysis

Abstract page for arXiv paper 2603.05421: MobileFetalCLIP: Selective Repulsive Knowledge Distillation for Mobile Fetal Ultrasound Analysis

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.05308] Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution

Abstract page for arXiv paper 2603.05308: Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution

arXiv - AI · 4 min · about 1 month ago

[2603.05210] Balancing Coverage and Draft Latency in Vocabulary Trimming for Faster Speculative Decoding

Abstract page for arXiv paper 2603.05210: Balancing Coverage and Draft Latency in Vocabulary Trimming for Faster Speculative Decoding

arXiv - Machine Learning · 4 min · about 1 month ago

[2603.05299] WavSLM: Single-Stream Speech Language Modeling via WavLM Distillation

Abstract page for arXiv paper 2603.05299: WavSLM: Single-Stream Speech Language Modeling via WavLM Distillation

arXiv - Machine Learning · 3 min · about 1 month ago

[2603.05167] C2-Faith: Benchmarking LLM Judges for Causal and Coverage Faithfulness in Chain-of-Thought Reasoning

Abstract page for arXiv paper 2603.05167: C2-Faith: Benchmarking LLM Judges for Causal and Coverage Faithfulness in Chain-of-Thought Reas...

arXiv - AI · 3 min · about 1 month ago

[2603.05121] Measuring the Redundancy of Decoder Layers in SpeechLLMs

Abstract page for arXiv paper 2603.05121: Measuring the Redundancy of Decoder Layers in SpeechLLMs

arXiv - AI · 3 min · about 1 month ago

[2603.04982] Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis

Abstract page for arXiv paper 2603.04982: Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis

arXiv - AI · 4 min · about 1 month ago

[2603.04976] 3D-RFT: Reinforcement Fine-Tuning for Video-based 3D Scene Understanding

Abstract page for arXiv paper 2603.04976: 3D-RFT: Reinforcement Fine-Tuning for Video-based 3D Scene Understanding

arXiv - AI · 4 min · about 1 month ago

[2603.04968] When Weak LLMs Speak with Confidence, Preference Alignment Gets Stronger

Abstract page for arXiv paper 2603.04968: When Weak LLMs Speak with Confidence, Preference Alignment Gets Stronger

arXiv - AI · 3 min · about 1 month ago
