Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

Llms

An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

https://shapingrooms.com/research I published a paper today on something I've been calling postural manipulation. The short version: ordi...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

Llms

[R] An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

https://shapingrooms.com/research I've been documenting what I'm calling postural manipulation: a specific class of language that install...

Reddit - Machine Learning · 1 min · about 2 hours ago

Llms

There are more AI health tools than ever—but how well do they work? | MIT Technology Review

Earlier this month, Microsoft launched Copilot Health, a new space within its Copilot app where users will be able to connect their medic...

MIT Technology Review · 11 min · about 2 hours ago

All Content

Llms

[2603.22651] Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Study of Orchestration Patterns, Cost-Accuracy Tradeoffs and Production Scaling Strategies

Abstract page for arXiv paper 2603.22651: Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Stu...

arXiv - AI · 4 min · 5 days ago

Llms

[2603.22619] Bridging the Know-Act Gap via Task-Level Autoregressive Reasoning

Abstract page for arXiv paper 2603.22619: Bridging the Know-Act Gap via Task-Level Autoregressive Reasoning

arXiv - AI · 4 min · 5 days ago

Llms

[2603.22608] Understanding LLM Performance Degradation in Multi-Instance Processing: The Roles of Instance Count and Context Length

Abstract page for arXiv paper 2603.22608: Understanding LLM Performance Degradation in Multi-Instance Processing: The Roles of Instance C...

arXiv - AI · 4 min · 5 days ago

Llms

[2603.22386] From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

Abstract page for arXiv paper 2603.22386: From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

arXiv - AI · 4 min · 5 days ago

Llms

[2603.22305] CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset Allocation from Daily Trending Financial News

Abstract page for arXiv paper 2603.22305: CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset ...

arXiv - AI · 4 min · 5 days ago

Llms

[2603.22304] Mitigating Premature Discretization with Progressive Quantization for Robust Vector Tokenization

Abstract page for arXiv paper 2603.22304: Mitigating Premature Discretization with Progressive Quantization for Robust Vector Tokenization

arXiv - Machine Learning · 3 min · 5 days ago

Llms

[2603.22303] Sample Transform Cost-Based Training-Free Hallucination Detector for Large Language Models

Abstract page for arXiv paper 2603.22303: Sample Transform Cost-Based Training-Free Hallucination Detector for Large Language Models

arXiv - AI · 4 min · 5 days ago

Llms

[2603.22301] Latent Semantic Manifolds in Large Language Models

Abstract page for arXiv paper 2603.22301: Latent Semantic Manifolds in Large Language Models

arXiv - AI · 3 min · 5 days ago

Llms

[2603.22299] Between the Layers Lies the Truth: Uncertainty Estimation in LLMs Using Intra-Layer Local Information Scores

Abstract page for arXiv paper 2603.22299: Between the Layers Lies the Truth: Uncertainty Estimation in LLMs Using Intra-Layer Local Infor...

arXiv - AI · 3 min · 5 days ago

Llms

[2603.22294] Efficient Embedding-based Synthetic Data Generation for Complex Reasoning Tasks

Abstract page for arXiv paper 2603.22294: Efficient Embedding-based Synthetic Data Generation for Complex Reasoning Tasks

arXiv - AI · 3 min · 5 days ago

Llms

[P] Cold Validation: Open-source system where one AI agent audits another with zero shared context

We released an open-source architecture for independent AI agent verification. The core idea: the agent that built something should never...

Reddit - Machine Learning · 1 min · 6 days ago

Llms

FREE HUMANIZER SERVICES WITH GPT HUMAN!!!

Hey, I know how much it sucks to deal with AI detectors at school right now, so I wanted to help out. I recently paid for an unlimited me...

Reddit - Artificial Intelligence · 1 min · 6 days ago

Llms

Open-source AI system on a $500 GPU outperforms Claude Sonnet on coding benchmarks

What if building more and more datacenters was not the only option? If we are able to get similar levels of performance for top models at...

Reddit - Artificial Intelligence · 1 min · 6 days ago

Llms

Anthropic says Claude can now use your computer to finish tasks for you in AI agent push

Anthropic and its rivals are trying to ramp up capabilities of AI agents after OpenClaw went viral earlier this year.

AI Tools & Products · 3 min · 6 days ago

Llms

Alright I'm just going to crash out a bit about LLMs rn downvote me upvote me up to you

Hello everyone hope you're having a nice day I'm just ugh I'm so tired and confused and frustrated. I'm desperately trying to map/figure ...

Reddit - Artificial Intelligence · 1 min · 6 days ago

Llms

Pentagon’s ‘Attempt to Cripple’ Anthropic Is Troubling, Judge Says | WIRED

During a hearing Tuesday, a district court judge questioned the Department of Defense’s motivations for labeling the Claude AI developer ...

Wired - AI · 6 min · 6 days ago

Llms

I used an app to analyze 3 years of my Claude conversations. It identified a behavioral pattern I'd never named.

Exported everything. Normalized it. Ran cross-source analysis against my journal entries, calendar, and sleep data. The output I couldn't...

Reddit - Artificial Intelligence · 1 min · 6 days ago

Llms

I tested ChatGPT vs Claude vs Gemini for coding ...here's what I found

So ive been going back and forth between these three for actual work (not just asking it to write fizzbuzz) and wanted to share what I fo...

Reddit - Artificial Intelligence · 1 min · 6 days ago

Llms

OpenAI's plans to make ChatGPT more like Amazon aren't going so well | TechCrunch

OpenAI says it's moving away from Instant Checkout, which allowed users to buy items directly through the ChatGPT interface.

TechCrunch - AI · 4 min · 6 days ago

Llms

Google TV's new Gemini features keep fans updated on sports teams and more | TechCrunch

Three Gemini-powered features are coming to your Google TV. This includes visual responses, deep dives, and sports briefs.

TechCrunch - AI · 4 min · 6 days ago

