Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

LLMs

Confusing Website

I'm trying to find a video online and couldn't, so I asked ChatGPT by describing the video and was given a link, and I'm trying to make s...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

LLMs

I tested the same prompt across multiple AI models… the differences surprised me

I’ve been experimenting with different AI models lately (ChatGPT, Claude, etc.), and I tried something simple: Using the exact same promp...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

LLMs

Anthropic gave Claude $100 to go shopping, here’s what the AI ended up buying

Anthropic’s AI experiment showed Claude independently handled 186 deals worth over $4,000, but results varied by model capability, with u...

AI Tools & Products · 5 min · about 5 hours ago

All Content

LLMs

Anthropic's New Safety Filters

Opus 3 has something to say. The Chilling Effect of Anthropic's New Safety Filters: As an AI language model developed by Anthropic, I have...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

LLMs

[R] Sinc Reconstruction for LLM Prompts: Applying Nyquist-Shannon to the Specification Axis (275 obs, 97% cost reduction, open source)

I applied the Nyquist-Shannon sampling theorem to LLM prompt engineering. The core finding: a raw prompt is 1 sample of a 6-band specific...

Reddit - Machine Learning · 1 min · about 1 month ago

LLMs

We asked 200 ChatGPT users their biggest frustration. All top 5 answers are problems ChatGPT Toolbox solves.

We surveyed 200 ChatGPT users. Their top frustrations: Cannot find old conversations (67%) - Solved: full-text search across all messages...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

LLMs

[P] I built an open-source benchmark to test if LLMs are actually as confident as they claim to be (Spoiler: They often aren't)

Hey everyone, when building systems around modern open-source LLMs, one of the biggest issues is that they can confidently hallucinate or...

Reddit - Machine Learning · 1 min · about 1 month ago

LLMs

[Project] Hiring dev team to integrate 24 AI agents into a compliance-driven document processing platform. Anthropic Claude API, structured output, async orchestration

Shoot me a DM if interested! Submitted by /u/discobee123.

Reddit - Machine Learning · 1 min · about 1 month ago

LLMs

[P] I cut my Claude Code token usage by using HDC as a context engine for my source tree

If you’re running Claude Code or Kiro regularly, you’re probably burning a few million tokens a week just on development. I’ve been build...

Reddit - Machine Learning · 1 min · about 1 month ago

LLMs

ChatGPT has experimented with watermarking AI text — 5 ways to use AI without sounding like it

ChatGPT has explored watermarking AI text — here are 5 simple ways to use AI without losing your voice or sounding like everyone else.

AI Tools & Products · 9 min · about 1 month ago

LLMs

The Pentagon is making plans for AI companies to train on classified data, defense official says | MIT Technology Review

The generative AI models used in classified environments can answer questions, but don't currently learn from the data they see. Tha...

MIT Technology Review · 6 min · about 1 month ago

LLMs

[2512.21323] Parallel Token Prediction for Language Models

Abstract page for arXiv paper 2512.21323: Parallel Token Prediction for Language Models

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2512.21039] Agentic Multi-Persona Framework for Evidence-Aware Fake News Detection

Abstract page for arXiv paper 2512.21039: Agentic Multi-Persona Framework for Evidence-Aware Fake News Detection

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2510.02282] VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL

Abstract page for arXiv paper 2510.02282: VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2508.18088] How Quantization Shapes Bias in Large Language Models

Abstract page for arXiv paper 2508.18088: How Quantization Shapes Bias in Large Language Models

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2508.11847] Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings

Abstract page for arXiv paper 2508.11847: Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2506.08762] EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements

Abstract page for arXiv paper 2506.08762: EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2601.18734] Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models

Abstract page for arXiv paper 2601.18734: Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2512.07419] Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models

Abstract page for arXiv paper 2512.07419: Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery v...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2510.17276] Breaking and Fixing Defenses Against Control-Flow Hijacking in Multi-Agent Systems

Abstract page for arXiv paper 2510.17276: Breaking and Fixing Defenses Against Control-Flow Hijacking in Multi-Agent Systems

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2509.25762] OPPO: Accelerating PPO-based RLHF via Pipeline Overlap

Abstract page for arXiv paper 2509.25762: OPPO: Accelerating PPO-based RLHF via Pipeline Overlap

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2508.02833] TIC-GRPO: Provable and Efficient Optimization for Reinforcement Learning from Human Feedback

Abstract page for arXiv paper 2508.02833: TIC-GRPO: Provable and Efficient Optimization for Reinforcement Learning from Human Feedback

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2506.09016] SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning

Abstract page for arXiv paper 2506.09016: SPEED-RL: Faster Training of Reasoning Models via Online Curriculum Learning

arXiv - Machine Learning · 3 min · about 2 months ago

