Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

https://shapingrooms.com/research I published a paper today on something I've been calling postural manipulation. The short version: ordi...

Reddit - Artificial Intelligence · 1 min ·
Llms

[R] An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

https://shapingrooms.com/research I've been documenting what I'm calling postural manipulation: a specific class of language that install...

Reddit - Machine Learning · 1 min ·
Llms

There are more AI health tools than ever—but how well do they work? | MIT Technology Review

Earlier this month, Microsoft launched Copilot Health, a new space within its Copilot app where users will be able to connect their medic...

MIT Technology Review · 11 min ·

All Content

Llms

[2603.22651] Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Study of Orchestration Patterns, Cost-Accuracy Tradeoffs and Production Scaling Strategies

Abstract page for arXiv paper 2603.22651: Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Stu...

arXiv - AI · 4 min ·
Llms

[2603.22619] Bridging the Know-Act Gap via Task-Level Autoregressive Reasoning

Abstract page for arXiv paper 2603.22619: Bridging the Know-Act Gap via Task-Level Autoregressive Reasoning

arXiv - AI · 4 min ·
Llms

[2603.22608] Understanding LLM Performance Degradation in Multi-Instance Processing: The Roles of Instance Count and Context Length

Abstract page for arXiv paper 2603.22608: Understanding LLM Performance Degradation in Multi-Instance Processing: The Roles of Instance C...

arXiv - AI · 4 min ·
Llms

[2603.22386] From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

Abstract page for arXiv paper 2603.22386: From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

arXiv - AI · 4 min ·
Llms

[2603.22305] CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset Allocation from Daily Trending Financial News

Abstract page for arXiv paper 2603.22305: CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset ...

arXiv - AI · 4 min ·
Llms

[2603.22304] Mitigating Premature Discretization with Progressive Quantization for Robust Vector Tokenization

Abstract page for arXiv paper 2603.22304: Mitigating Premature Discretization with Progressive Quantization for Robust Vector Tokenization

arXiv - Machine Learning · 3 min ·
Llms

[2603.22303] Sample Transform Cost-Based Training-Free Hallucination Detector for Large Language Models

Abstract page for arXiv paper 2603.22303: Sample Transform Cost-Based Training-Free Hallucination Detector for Large Language Models

arXiv - AI · 4 min ·
Llms

[2603.22301] Latent Semantic Manifolds in Large Language Models

Abstract page for arXiv paper 2603.22301: Latent Semantic Manifolds in Large Language Models

arXiv - AI · 3 min ·
Llms

[2603.22299] Between the Layers Lies the Truth: Uncertainty Estimation in LLMs Using Intra-Layer Local Information Scores

Abstract page for arXiv paper 2603.22299: Between the Layers Lies the Truth: Uncertainty Estimation in LLMs Using Intra-Layer Local Infor...

arXiv - AI · 3 min ·
Llms

[2603.22294] Efficient Embedding-based Synthetic Data Generation for Complex Reasoning Tasks

Abstract page for arXiv paper 2603.22294: Efficient Embedding-based Synthetic Data Generation for Complex Reasoning Tasks

arXiv - AI · 3 min ·
Llms

[P] Cold Validation: Open-source system where one AI agent audits another with zero shared context

We released an open-source architecture for independent AI agent verification. The core idea: the agent that built something should never...

Reddit - Machine Learning · 1 min ·
Llms

FREE HUMANIZER SERVICES WITH GPT HUMAN!!!

Hey, I know how much it sucks to deal with AI detectors at school right now, so I wanted to help out. I recently paid for an unlimited me...

Reddit - Artificial Intelligence · 1 min ·
Llms

Open-source AI system on a $500 GPU outperforms Claude Sonnet on coding benchmarks

What if building more and more datacenters was not the only option? If we are able to get similar levels of performance for top models at...

Reddit - Artificial Intelligence · 1 min ·
Llms

Anthropic says Claude can now use your computer to finish tasks for you in AI agent push

Anthropic and its rivals are trying to ramp up capabilities of AI agents after OpenClaw went viral earlier this year.

AI Tools & Products · 3 min ·
Llms

Alright I'm just going to crash out a bit about LLMs rn downvote me upvote me up to you

Hello everyone hope you're having a nice day I'm just ugh I'm so tired and confused and frustrated. I'm desperately trying to map/figure ...

Reddit - Artificial Intelligence · 1 min ·
Llms

Pentagon’s ‘Attempt to Cripple’ Anthropic Is Troubling, Judge Says | WIRED

During a hearing Tuesday, a district court judge questioned the Department of Defense’s motivations for labeling the Claude AI developer ...

Wired - AI · 6 min ·
Llms

I used an app to analyze 3 years of my Claude conversations. It identified a behavioral pattern I'd never named.

Exported everything. Normalized it. Ran cross-source analysis against my journal entries, calendar, and sleep data. The output I couldn't...

Reddit - Artificial Intelligence · 1 min ·
Llms

I tested ChatGPT vs Claude vs Gemini for coding ...here's what I found

So ive been going back and forth between these three for actual work (not just asking it to write fizzbuzz) and wanted to share what I fo...

Reddit - Artificial Intelligence · 1 min ·
Llms

OpenAI's plans to make ChatGPT more like Amazon aren't going so well | TechCrunch

OpenAI says it's moving away from Instant Checkout, which allowed users to buy items directly through the ChatGPT interface.

TechCrunch - AI · 4 min ·
Llms

Google TV's new Gemini features keep fans updated on sports teams and more | TechCrunch

Three Gemini-powered features are coming to your Google TV. This includes visual responses, deep dives, and sports briefs.

TechCrunch - AI · 4 min ·