Large Language Models
GPT, Claude, Gemini, and other LLMs
I'm building a real-time reality show where 10 AI agents (Claude) compete, form alliances, betray each other, and get eliminated by viewer votes - running a live test right now
For the past few weeks I've been building The Experiment — a live reality show where 10 AI agents are actually playing a game against eac...
[D] Predicting total cost of agentic LLM workflows - is there a research gap around output token count and chain depth estimation?
Working on a practical problem that I think has an interesting ML angle. In agentic LLM workflows (tool use, multi-step reasoning, ReAct-...
Claude Code rolls out a voice mode capability | TechCrunch
Anthropic is stepping up its game in the AI coding space with the rollout of Voice Mode in Claude Code.
ChatGPT's new GPT-5.3 Instant model will stop telling you to calm down | TechCrunch
The company says the new model will reduce the "cringe" that's been annoying its users for months.
Warden — Lock the input, not the screen
I use Claude Code and Cursor for extended agent sessions, sometimes 30-45 minutes of autonomous coding across multiple files. the problem...
Google’s latest Pixel drop allows Gemini to order groceries for you and more | The Verge
Google is launching a big update for Pixel phones, and that includes the ability for its Gemini AI assistant to complete tasks for you, l...
How to Switch From ChatGPT to Claude With Just 1 Simple Prompt
[D] If reasoning requires optimization rather than generation, what does that mean for the scaling paradigm?
Been digging into the architectural differences between autoregressive LLMs and Energy-Based Models (EBMs) for reasoning tasks, especiall...
Anthropic's Claude AI being used in Iran war by U.S. military, sources say
Sam Altman responds after mass ChatGPT uninstalls help Claude AI become the most popular iPhone app
Is ChatGPT Softening Its Coverage of the US Government? I Ran an Experiment.
I have suspected that something fundamental changed within OpenAI and ChatGPT since 5.2 came out. I noticed it would become blunt and appe...
Pentagon Used Claude AI to Attack Iran Just Hours After Trump’s Ban on Anthropic
Anthropic’s Claude is suddenly the most popular iPhone app following Pentagon feud
LLMs can unmask pseudonymous users at scale with surprising accuracy - Ars Technica
Pseudonymity has never been perfect for preserving privacy. Soon it may be pointless.
[R] Phase-Only Language Model via O(N Log N) FFT Mixing (PRISM): Exploring Interference Under Unit-Magnitude Constraints
Hello everyone, this is about https://arxiv.org/abs/2512.01208 I have decided to share it to get some feedback. I think it is interesting...
[D] frontier models are a zero sum game for a few tasks - what they gain in reasoning they lose in your specific thing
when Google shipped Gemini 3 last November, it set new benchmarks on reasoning and coding. but it also removed pixel-level image segmenta...
[2602.11909] Echo: Towards Advanced Audio Comprehension via Audio-Interleaved Reasoning
[2601.18685] LLAMA LIMA: A Living Meta-Analysis on the Effects of Generative AI on Learning Mathematics
[2601.08427] Silence the Judge: Reinforcement Learning with Self-Verifier via Latent Geometric Clustering