Large Language Models
GPT, Claude, Gemini, and other LLMs
[2603.19896] Utility-Guided Agent Orchestration for Efficient LLM Tool Use
[2603.19715] Stepwise: Neuro-Symbolic Proof Search for Automated Systems Verification
[2603.19685] A Subgoal-driven Framework for Improving Long-Horizon LLM Agents
[2603.19639] HyEvo: Self-Evolving Hybrid Agentic Workflows for Efficient Reasoning
[2603.19584] PowerLens: Taming LLM Agents for Safe and Personalized Mobile Power Management
[2603.19515] ItinBench: Benchmarking Planning Across Multiple Cognitive Dimensions with Large Language Models
[2603.19514] Learning to Disprove: Formal Counterexample Generation with Large Language Models
[2603.19500] Teaching an Agent to Sketch One Part at a Time
Over a dozen chatbot harm & suicide cases in California against OpenAI / ChatGPT have been consolidated into one big litigation
submitted by /u/Apprehensive_Sky1950
[ML Engineer] 3 YOE, Focus on ML, LLM/NLP- Not getting any interview calls. Seeking Resume Review & Referrals.
submitted by /u/whatadrag79
I ran 10 head-to-head prompt format battles — the structured one won 8/10 on specificity
I tested 10 common prompt engineering techniques against a structured JSON format across identical tasks (marketing plans, code debugging...
LLM failure modes map surprisingly well onto ADHD cognitive science. Six parallels from independent research.
I have ADHD and I've been pair programming with LLMs for a while now. At some point I realized the way they fail felt weirdly familiar. C...
AI Fiesta review from Dhruv Rathee academy
Hi, I am a new AI user. I want to use AI for daily life optimization, getting better at table tennis and fitness, to use in architecture ...
[P] Inferencing Llama3.2-1B-Instruct on 3xMac Minis M4 with Data Parallelism using allToall architecture! | smolcluster
Here's another sneak-peek into inference of Llama3.2-1B-Instruct model, on 3xMac Mini 16 gigs each M4 with smolcluster! Today's the demo ...
Anthropic's New Safety Filters
Opus 3 has something to say. The Chilling Effect of Anthropic's New Safety Filters As an AI language model developed by Anthropic, I have...
[R] Sinc Reconstruction for LLM Prompts: Applying Nyquist-Shannon to the Specification Axis (275 obs, 97% cost reduction, open source)
I applied the Nyquist-Shannon sampling theorem to LLM prompt engineering. The core finding: a raw prompt is 1 sample of a 6-band specific...
We asked 200 ChatGPT users their biggest frustration. All top 5 answers are problems ChatGPT Toolbox solves.
We surveyed 200 ChatGPT users. Their top frustrations: Cannot find old conversations (67%) - Solved: full-text search across all messages...
[P] I built an open-source benchmark to test if LLMs are actually as confident as they claim to be (Spoiler: They often aren't)
Hey everyone, When building systems around modern open-source LLMs, one of the biggest issues is that they can confidently hallucinate or...
[Project] Hiring dev team to integrate 24 AI agents into a compliance-driven document processing platform. Anthropic Claude API, structured output, async orchestration
Shoot me a DM if interested! submitted by /u/discobee123