Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Building knowledge bases from YouTube data using LLMs -- my workflow after 52 guides

I've been building a system that turns YouTube channels into structured knowledge bases. Thought I'd share the workflow since Karpathy's ...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

What is AI, how do apps like ChatGPT work and why are there concerns?

AI is transforming modern life, but some critics worry about its potential misuse and environmental impact.

AI News - General · 7 min · about 6 hours ago

[2603.29957] Think Anywhere in Code Generation

arXiv - Machine Learning · 3 min · about 9 hours ago

All Content

[2603.22370] FAAR: Format-Aware Adaptive Rounding for NVFP4

arXiv - AI · 4 min · 9 days ago

[2603.22935] Ran Score: a LLM-based Evaluation Score for Radiology Report Generation

arXiv - AI · 4 min · 9 days ago

[2603.22934] ProGRank: Probe-Gradient Reranking to Defend Dense-Retriever RAG from Corpus Poisoning

arXiv - AI · 4 min · 9 days ago

[2603.22904] Separating Diagnosis from Control: Auditable Policy Adaptation in Agent-Based Simulations with LLM-Based Diagnostics

arXiv - AI · 3 min · 9 days ago

[2603.22352] WIST: Web-Grounded Iterative Self-Play Tree for Domain-Targeted Reasoning Improvement

arXiv - AI · 4 min · 9 days ago

[2603.22871] Dynamical Systems Theory Behind a Hierarchical Reasoning Model

arXiv - AI · 4 min · 9 days ago

[2603.22869] Chain-of-Authorization: Internalizing Authorization into Large Language Models via Reasoning Trajectories

arXiv - AI · 4 min · 9 days ago

[2603.22339] Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits

arXiv - Machine Learning · 4 min · 9 days ago

[2603.22333] Graph Signal Processing Meets Mamba2: Adaptive Filter Bank via Delta Modulation

arXiv - AI · 3 min · 9 days ago

[2603.22332] Large Language Models for Missing Data Imputation: Understanding Behavior, Hallucination Effects, and Control Mechanisms

arXiv - AI · 4 min · 9 days ago

[2603.22829] Improving Safety Alignment via Balanced Direct Preference Optimization

arXiv - AI · 3 min · 9 days ago

[2603.22329] Trained Persistent Memory for Frozen Decoder-Only LLMs

arXiv - AI · 4 min · 9 days ago

[2603.22777] AgriPestDatabase-v1.0: A Structured Insect Dataset for Training Agricultural Large Language Model

arXiv - AI · 4 min · 9 days ago

[2603.22767] Can LLM Agents Generate Real-World Evidence? Evaluating Observational Studies in Medical Databases

arXiv - AI · 4 min · 9 days ago

[2603.22324] DAQ: Delta-Aware Quantization for Post-Training LLM Weight Compression

arXiv - AI · 3 min · 9 days ago

[2603.22744] Beyond Binary Correctness: Scaling Evaluation of Long-Horizon Agents on Subjective Enterprise Tasks

arXiv - AI · 4 min · 9 days ago

[2603.22651] Benchmarking Multi-Agent LLM Architectures for Financial Document Processing: A Comparative Study of Orchestration Patterns, Cost-Accuracy Tradeoffs and Production Scaling Strategies

arXiv - AI · 4 min · 9 days ago

[2603.22619] Bridging the Know-Act Gap via Task-Level Autoregressive Reasoning

arXiv - AI · 4 min · 9 days ago

[2603.22608] Understanding LLM Performance Degradation in Multi-Instance Processing: The Roles of Instance Count and Context Length

arXiv - AI · 4 min · 9 days ago

[2603.22386] From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

arXiv - AI · 4 min · 9 days ago

