Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

How strongly do you believe LLM judges for ML papers? [D]

I'm curious about your thoughts on these. As far as I've seen, most of the comments are nitpicking about "missing ablations" while some co...

Reddit - Machine Learning · 1 min ·

Google just dropped "Deep Research Max" — We are officially entering the era of Autonomous Agents. RIP to the "Junior Analyst"?

The shift from "Chatbot" to "Agent" just hit warp speed. Google’s release of Deep Research Max isn't just another incremental update; it’...

Reddit - Artificial Intelligence · 1 min ·

Built a prompt injection proxy that beats OpenAI Moderation and LlamaGuard — try it in 30 seconds without leaving this

Built Arc Gate — sits in front of any OpenAI-compatible endpoint and blocks prompt injection before it reaches your model. Just change yo...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2505.21281] RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models

arXiv - AI · 4 min ·
[2504.20505] MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living

arXiv - AI · 4 min ·
[2603.04317] World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings

arXiv - AI · 3 min ·
[2603.04257] Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

arXiv - Machine Learning · 4 min ·
[2603.04293] LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance

arXiv - AI · 3 min ·
[2603.04277] VANGUARD: Vehicle-Anchored Ground Sample Distance Estimation for UAVs in GPS-Denied Environments

arXiv - AI · 4 min ·
[2603.04259] When AI Fails, What Works? A Data-Driven Taxonomy of Real-World AI Risk Mitigation Strategies

arXiv - AI · 4 min ·
[2603.04222] PRAM-R: A Perception-Reasoning-Action-Memory Framework with LLM-Guided Modality Routing for Adaptive Autonomous Driving

arXiv - AI · 3 min ·
[2603.04165] PlaneCycle: Training-Free 2D-to-3D Lifting of Foundation Models Without Adapters

arXiv - AI · 3 min ·
[2603.04177] CodeTaste: Can LLMs Generate Human-Level Code Refactorings?

arXiv - AI · 3 min ·
[2603.04128] Crab$^{+}$: A Scalable and Unified Audio-Visual Scene Understanding Model with Explicit Cooperation

arXiv - AI · 4 min ·
[2603.04162] Bielik-Q2-Sharp: A Comparative Study of Extreme 2-bit Quantization Methods for a Polish 11B Language Model

arXiv - AI · 3 min ·
[2603.04069] Monitoring Emergent Reward Hacking During Generation via Internal Activations

arXiv - AI · 4 min ·
[2603.03683] CONCUR: Benchmarking LLMs for Concurrent Code Generation

arXiv - Machine Learning · 4 min ·
[2603.04002] Discriminative Perception via Anchored Description for Reasoning Segmentation

arXiv - AI · 4 min ·
[2603.03589] stratum: A System Infrastructure for Massive Agent-Centric ML Workloads

arXiv - Machine Learning · 4 min ·
[2603.03983] GeoSeg: Training-Free Reasoning-Driven Segmentation in Remote Sensing Imagery

arXiv - AI · 3 min ·
[2603.03583] ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

arXiv - Machine Learning · 3 min ·
[2603.03964] BLOCK: An Open-Source Bi-Stage MLLM Character-to-Skin Pipeline for Minecraft

arXiv - AI · 3 min ·
[2603.03915] Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects

arXiv - AI · 3 min ·