Large Language Models

GPT, Claude, Gemini, and other LLMs

This Week's Best | Monthly Best | Guide | Trending

Top This Week

Llms

I thought of something while cooking up a simple RL AI. Please Validate it. [R]

So, I was trying to build a simple AI when I thought of, 'How could I give an AI some emotions? ' This led to one thing after another, an...

Reddit - Machine Learning · 1 min · about 3 hours ago

Llms

Open-source list of GenAI-related incidents

I am sharing this open-source list of cases where the ethics of GenAI use were put in the spotlight, in the hopes of sparking discussion ...

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

Llms

I built a repo for implementing and training LLM architectures from scratch in minimal PyTorch — contributions welcome! [P]

Hey everyone, I've been working on a repo where I implement large language model architectures using the simplest PyTorch code possible. ...

Reddit - Machine Learning · 1 min · about 4 hours ago

All Content

Llms

[2511.22235] Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation

Abstract page for arXiv paper 2511.22235: Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon ...

arXiv - AI · 4 min · about 1 month ago

Llms

[2511.21471] SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

Abstract page for arXiv paper 2511.21471: SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

arXiv - AI · 4 min · about 1 month ago

Llms

[2511.05854] Can a Small Model Learn to Look Before It Leaps? Dynamic Learning and Proactive Correction for Hallucination Detection

Abstract page for arXiv paper 2511.05854: Can a Small Model Learn to Look Before It Leaps? Dynamic Learning and Proactive Correction for ...

arXiv - AI · 4 min · about 1 month ago

Llms

[2510.26905] Cognition Envelopes for Bounded Decision Making in Autonomous UAS Operations

Abstract page for arXiv paper 2510.26905: Cognition Envelopes for Bounded Decision Making in Autonomous UAS Operations

arXiv - AI · 4 min · about 1 month ago

Llms

[2505.20065] SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

Abstract page for arXiv paper 2505.20065: SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

arXiv - AI · 4 min · about 1 month ago

Llms

[2510.09782] The Geometry of Reasoning: Flowing Logics in Representation Space

Abstract page for arXiv paper 2510.09782: The Geometry of Reasoning: Flowing Logics in Representation Space

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2510.07972] SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance

Abstract page for arXiv paper 2510.07972: SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance

arXiv - AI · 4 min · about 1 month ago

Llms

[2509.21782] Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety

Abstract page for arXiv paper 2509.21782: Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety

arXiv - AI · 4 min · about 1 month ago

Llms

[2508.03284] ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools

Abstract page for arXiv paper 2508.03284: ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools

arXiv - AI · 4 min · about 1 month ago

Llms

[2505.02888] When Your Own Output Becomes Your Training Data: Noise-to-Meaning Loops and a Formal RSI Trigger

Abstract page for arXiv paper 2505.02888: When Your Own Output Becomes Your Training Data: Noise-to-Meaning Loops and a Formal RSI Trigger

arXiv - AI · 3 min · about 1 month ago

Llms

[2507.15796] From Privacy to Trust in the Agentic Era: A Taxonomy of Challenges in Trustworthy Federated Learning Through the Lens of Trust Report 2.0

Abstract page for arXiv paper 2507.15796: From Privacy to Trust in the Agentic Era: A Taxonomy of Challenges in Trustworthy Federated Lea...

arXiv - AI · 4 min · about 1 month ago

Llms

[2507.15518] HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics

Abstract page for arXiv paper 2507.15518: HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics

arXiv - AI · 4 min · about 1 month ago

Llms

[2502.01534] Preference Leakage: A Contamination Problem in LLM-as-a-judge

Abstract page for arXiv paper 2502.01534: Preference Leakage: A Contamination Problem in LLM-as-a-judge

arXiv - AI · 4 min · about 1 month ago

Llms

[2506.08321] LeanTutor: Towards a Verified AI Mathematical Proof Tutor

Abstract page for arXiv paper 2506.08321: LeanTutor: Towards a Verified AI Mathematical Proof Tutor

arXiv - AI · 3 min · about 1 month ago

Llms

[2505.21668] R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning

Abstract page for arXiv paper 2505.21668: R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning

arXiv - AI · 4 min · about 1 month ago

Llms

[2505.21281] RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models

Abstract page for arXiv paper 2505.21281: RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models

arXiv - AI · 4 min · about 1 month ago

Llms

[2504.20505] MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living

Abstract page for arXiv paper 2504.20505: MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities o...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.04317] World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings

Abstract page for arXiv paper 2603.04317: World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurr...

arXiv - AI · 3 min · about 1 month ago

Llms

[2603.04257] Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Abstract page for arXiv paper 2603.04257: Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.04293] LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance

Abstract page for arXiv paper 2603.04293: LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance

arXiv - AI · 3 min · about 1 month ago

Previous Page 195 Next

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Subscribe to Newsletter

Daily or weekly digest • Unsubscribe anytime

Large Language Models

Top This Week

I thought of something while cooking up a simple RL AI. Please Validate it. [R]

Open-source list of GenAI-related incidents

I built a repo for implementing and training LLM architectures from scratch in minimal PyTorch — contributions welcome! [P]

All Content

[2511.22235] Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation

[2511.21471] SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

[2511.05854] Can a Small Model Learn to Look Before It Leaps? Dynamic Learning and Proactive Correction for Hallucination Detection

[2510.26905] Cognition Envelopes for Bounded Decision Making in Autonomous UAS Operations

[2505.20065] SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety

[2510.09782] The Geometry of Reasoning: Flowing Logics in Representation Space

[2510.07972] SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance

[2509.21782] Benchmarking MLLM-based Web Understanding: Reasoning, Robustness and Safety

[2508.03284] ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools

[2505.02888] When Your Own Output Becomes Your Training Data: Noise-to-Meaning Loops and a Formal RSI Trigger

[2507.15796] From Privacy to Trust in the Agentic Era: A Taxonomy of Challenges in Trustworthy Federated Learning Through the Lens of Trust Report 2.0

[2507.15518] HAMLET: A Hierarchical and Adaptive Multi-Agent Framework for Live Embodied Theatrics

[2502.01534] Preference Leakage: A Contamination Problem in LLM-as-a-judge

[2506.08321] LeanTutor: Towards a Verified AI Mathematical Proof Tutor

[2505.21668] R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning

[2505.21281] RLJP: Legal Judgment Prediction via First-Order Logic Rule-enhanced with Large Language Models

[2504.20505] MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living

[2603.04317] World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings

[2603.04257] Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

[2603.04293] LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance

Related Topics

Stay updated with AI News