Large Language Models
GPT, Claude, Gemini, and other LLMs
[2505.20065] SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety
[2510.09782] The Geometry of Reasoning: Flowing Logics in Representation Space
[2510.07972] SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance
[2509.21782] Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety
[2508.03284] ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools
[2505.02888] When Your Own Output Becomes Your Training Data: Noise-to-Meaning Loops and a Formal RSI Trigger
[2507.15796] From Privacy to Trust in the Agentic Era: A Taxonomy of Challenges in Trustworthy Federated Learning Through the Lens of Trust Report 2.0
[2507.15518] HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics
[2502.01534] Preference Leakage: A Contamination Problem in LLM-as-a-judge
[2506.08321] LeanTutor: Towards a Verified AI Mathematical Proof Tutor
[2505.21668] R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning
[2505.21281] RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models
[2504.20505] MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living
[2603.04317] World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings
[2603.04257] Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory
[2603.04293] LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance
[2603.04277] VANGUARD: Vehicle-Anchored Ground Sample Distance Estimation for UAVs in GPS-Denied Environments
[2603.04259] When AI Fails, What Works? A Data-Driven Taxonomy of Real-World AI Risk Mitigation Strategies
[2603.04222] PRAM-R: A Perception-Reasoning-Action-Memory Framework with LLM-Guided Modality Routing for Adaptive Autonomous Driving
[2603.04165] PlaneCycle: Training-Free 2D-to-3D Lifting of Foundation Models Without Adapters