Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

OpenAI introduces new 'Trusted Contact' safeguard for cases of possible self-harm | TechCrunch

The company is expanding its efforts to protect ChatGPT users in cases where conversations may turn to self-harm.

TechCrunch - AI · 5 min ·
Mira Murati’s deposition pulled back the curtain on Sam Altman’s ouster | The Verge

Thanks to Musk v. Altman, the public is getting a concrete look at details of Sam Altman’s ouster from OpenAI, much of it centered on for...

The Verge - AI · 11 min ·

Diffusion for generating/editing ASTs? [D]

I’m not a machine learning expert or anything, but I do enjoy learning about how it all works. I’ve noticed that one of the main limitati...

Reddit - Machine Learning · 1 min ·

All Content

[2506.20746] Dynamic Weight Grafting: Localizing Finetuned Factual Knowledge in Transformers

arXiv - Machine Learning · 4 min ·
[2506.15872] Hidden Breakthroughs in Language Model Training

arXiv - Machine Learning · 3 min ·
[2508.11999] MOON: Generative MLLM-based Multimodal Representation Learning for E-commerce Product Understanding

arXiv - Machine Learning · 4 min ·
[2508.06526] PiKV: KV Cache Management System for Mixture of Experts

arXiv - AI · 4 min ·
[2506.15307] SecP-Tuning: Efficient Privacy-Preserving Prompt Tuning for Large Language Models via MPC

arXiv - Machine Learning · 4 min ·
[2506.14003] Unlearning Isn't Invisible: Detecting Unlearning Traces in LLMs from Model Outputs

arXiv - Machine Learning · 4 min ·
[2507.15852] Advancing Complex Video Object Segmentation via Progressive Concept Construction

arXiv - AI · 4 min ·
[2507.04219] Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs

arXiv - Machine Learning · 4 min ·
[2506.02939] QKV Projections Require a Fraction of Their Memory

arXiv - Machine Learning · 3 min ·
[2506.20666] Cognitive models can reveal interpretable value trade-offs in language models

arXiv - AI · 4 min ·
[2506.18841] LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning

arXiv - AI · 4 min ·
[2505.19590] Learning to Reason without External Rewards

arXiv - Machine Learning · 3 min ·
[2506.13474] Language Agents for Hypothesis-driven Clinical Decision Making with Reinforcement Learning

arXiv - Machine Learning · 4 min ·
[2506.15498] SPARE: Single-Pass Annotation with Reference-Guided Evaluation for Automatic Process Supervision and Reward Modelling

arXiv - Machine Learning · 4 min ·
[2505.18116] NFT: Bridging Supervised Learning and Reinforcement Learning in Math Reasoning

arXiv - Machine Learning · 4 min ·
[2506.10085] VITA: Zero-Shot Value Functions via Test-Time Adaptation of Vision-Language Models

arXiv - AI · 4 min ·
[2505.16122] Plan and Budget: Effective and Efficient Test-Time Scaling on Reasoning Large Language Models

arXiv - Machine Learning · 4 min ·
[2505.14042] Adversarially Pretrained Transformers May Be Universally Robust In-Context Learners

arXiv - Machine Learning · 4 min ·
[2506.08902] Intention-Conditioned Flow Occupancy Models

arXiv - Machine Learning · 4 min ·
[2506.06683] RoboPARA: Dual-Arm Robot Planning with Parallel Allocation and Recomposition Across Tasks

arXiv - AI · 4 min ·