Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

How strongly do you believe LLM judges for ML papers? [D]

I'm curious about your thoughts on these, as far as I've seen most of the comments are nitpicking about "missing ablations" while some co...

Reddit - Machine Learning · 1 min · 18 minutes ago

Llms

Google just dropped "Deep Research Max" — We are officially entering the era of Autonomous Agents. RIP to the "Junior Analyst"?

The shift from "Chatbot" to "Agent" just hit warp speed. Google’s release of Deep Research Max isn't just another incremental update; it’...

Reddit - Artificial Intelligence · 1 min · 18 minutes ago

Llms

Built a prompt injection proxy that beats OpenAI Moderation and LlamaGuard — try it in 30 seconds without leaving this

Built Arc Gate — sits in front of any OpenAI-compatible endpoint and blocks prompt injection before it reaches your model. Just change yo...

Reddit - Artificial Intelligence · 1 min · 18 minutes ago

All Content

Llms

[2505.21281] RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models

Abstract page for arXiv paper 2505.21281: RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models

arXiv - AI · 4 min · about 2 months ago

Llms

[2504.20505] MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living

Abstract page for arXiv paper 2504.20505: MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities o...

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04317] World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings

Abstract page for arXiv paper 2603.04317: World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurr...

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04257] Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Abstract page for arXiv paper 2603.04257: Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2603.04293] LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance

Abstract page for arXiv paper 2603.04293: LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04277] VANGUARD: Vehicle-Anchored Ground Sample Distance Estimation for UAVs in GPS-Denied Environments

Abstract page for arXiv paper 2603.04277: VANGUARD: Vehicle-Anchored Ground Sample Distance Estimation for UAVs in GPS-Denied Environments

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04259] When AI Fails, What Works? A Data-Driven Taxonomy of Real-World AI Risk Mitigation Strategies

Abstract page for arXiv paper 2603.04259: When AI Fails, What Works? A Data-Driven Taxonomy of Real-World AI Risk Mitigation Strategies

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04222] PRAM-R: A Perception-Reasoning-Action-Memory Framework with LLM-Guided Modality Routing for Adaptive Autonomous Driving

Abstract page for arXiv paper 2603.04222: PRAM-R: A Perception-Reasoning-Action-Memory Framework with LLM-Guided Modality Routing for Ada...

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04165] PlaneCycle: Training-Free 2D-to-3D Lifting of Foundation Models Without Adapters

Abstract page for arXiv paper 2603.04165: PlaneCycle: Training-Free 2D-to-3D Lifting of Foundation Models Without Adapters

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04177] CodeTaste: Can LLMs Generate Human-Level Code Refactorings?

Abstract page for arXiv paper 2603.04177: CodeTaste: Can LLMs Generate Human-Level Code Refactorings?

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04128] Crab$^{+}$: A Scalable and Unified Audio-Visual Scene Understanding Model with Explicit Cooperation

Abstract page for arXiv paper 2603.04128: Crab$^{+}$: A Scalable and Unified Audio-Visual Scene Understanding Model with Explicit Coopera...

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.04162] Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model

Abstract page for arXiv paper 2603.04162: Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Lan...

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.04069] Monitoring Emergent Reward Hacking During Generation via Internal Activations

Abstract page for arXiv paper 2603.04069: Monitoring Emergent Reward Hacking During Generation via Internal Activations

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.03683] CONCUR: Benchmarking LLMs for Concurrent Code Generation

Abstract page for arXiv paper 2603.03683: CONCUR: Benchmarking LLMs for Concurrent Code Generation

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2603.04002] Discriminative Perception via Anchored Description for Reasoning Segmentation

Abstract page for arXiv paper 2603.04002: Discriminative Perception via Anchored Description for Reasoning Segmentation

arXiv - AI · 4 min · about 2 months ago

Llms

[2603.03589] stratum: A System Infrastructure for Massive Agent-Centric ML Workloads

Abstract page for arXiv paper 2603.03589: stratum: A System Infrastructure for Massive Agent-Centric ML Workloads

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2603.03983] GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery

Abstract page for arXiv paper 2603.03983: GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03583] ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

Abstract page for arXiv paper 2603.03583: ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2603.03964] BLOCK: An Open-Source Bi-Stage MLLM Character-to-Skin Pipeline for Minecraft

Abstract page for arXiv paper 2603.03964: BLOCK: An Open-Source Bi-Stage MLLM Character-to-Skin Pipeline for Minecraft

arXiv - AI · 3 min · about 2 months ago

Llms

[2603.03915] Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects

Abstract page for arXiv paper 2603.03915: Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personalit...

arXiv - AI · 3 min · about 2 months ago
