Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Arc Gate: LLM proxy that hits P=1.00 R=1.00 F1=1.00 on indirect/roleplay prompt injection (beats OpenAI Moderation and LlamaGuard)

Benchmarked on 40 out-of-distribution prompts, indirect requests, roleplay framings, hypothetical scenarios, technical phrasings. The stu...

Reddit - Artificial Intelligence · 1 min

Claude can now plug directly into Photoshop, Blender, and Ableton | The Verge

Anthropic has launched a set of connectors for Claude that allow the AI chatbot to tap into popular creative software

The Verge - AI · 4 min

Built a multiplayer map where you can see everyone's Claude Code activity as creatures battling it out

Hello r/artificial I built this specifically for Claude Code users - every prompt you run feeds a digital pet called a Prompt Creature. T...

Reddit - Artificial Intelligence · 1 min

All Content

[2510.07181] TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics

Abstract page for arXiv paper 2510.07181: TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics

arXiv - AI · 4 min

[2505.06046] Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information

Abstract page for arXiv paper 2505.06046: Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information

arXiv - Machine Learning · 4 min

[2509.25541] Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Abstract page for arXiv paper 2509.25541: Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

arXiv - AI · 4 min

[2504.08714] Generating Fine Details of Entity Interactions

Abstract page for arXiv paper 2504.08714: Generating Fine Details of Entity Interactions

arXiv - Machine Learning · 3 min

[2509.24222] Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

Abstract page for arXiv paper 2509.24222: Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

arXiv - AI · 4 min

[2412.19436] Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback

Abstract page for arXiv paper 2412.19436: Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback

arXiv - Machine Learning · 3 min

[2509.13471] An LLM Agentic Approach for Legal-Critical Software: A Case Study for Tax Prep Software

Abstract page for arXiv paper 2509.13471: An LLM Agentic Approach for Legal-Critical Software: A Case Study for Tax Prep Software

arXiv - AI · 4 min

[2509.06415] Index-Preserving Lightweight Token Pruning for Efficient Document Understanding in Vision-Language Models

Abstract page for arXiv paper 2509.06415: Index-Preserving Lightweight Token Pruning for Efficient Document Understanding in Vision-Langu...

arXiv - AI · 3 min

[2508.07321] ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering

Abstract page for arXiv paper 2508.07321: ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answe...

arXiv - AI · 3 min

[2508.00450] When Relevance Meets Novelty: Dual-Stable Periodic Optimization for Serendipitous Recommendation

Abstract page for arXiv paper 2508.00450: When Relevance Meets Novelty: Dual-Stable Periodic Optimization for Serendipitous Recommendation

arXiv - AI · 4 min

[2507.09875] Function Induction and Task Generalization: An Interpretability Study with Off-by-One Addition

Abstract page for arXiv paper 2507.09875: Function Induction and Task Generalization: An Interpretability Study with Off-by-One Addition

arXiv - AI · 4 min

[2507.07847] From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems

Abstract page for arXiv paper 2507.07847: From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Au...

arXiv - AI · 4 min

[2602.05630] Rewards as Labels: Revisiting RLVR from a Classification Perspective

Abstract page for arXiv paper 2602.05630: Rewards as Labels: Revisiting RLVR from a Classification Perspective

arXiv - Machine Learning · 4 min

[2601.17473] LeanTutor: Towards a Verified AI Mathematical Proof Tutor

Abstract page for arXiv paper 2601.17473: LeanTutor: Towards a Verified AI Mathematical Proof Tutor

arXiv - Machine Learning · 3 min

[2505.23783] Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning

Abstract page for arXiv paper 2505.23783: Boosting In-Context Learning in LLMs Through the Lens of Classical Supervised Learning

arXiv - AI · 4 min

[2512.20760] Generalization of RLVR Using Causal Reasoning as a Testbed

Abstract page for arXiv paper 2512.20760: Generalization of RLVR Using Causal Reasoning as a Testbed

arXiv - AI · 4 min

[2504.07109] OSCAR: Online Soft Compression And Reranking

Abstract page for arXiv paper 2504.07109: OSCAR: Online Soft Compression And Reranking

arXiv - AI · 3 min

[2503.07885] Safety Guardrails for LLM-Enabled Robots

Abstract page for arXiv paper 2503.07885: Safety Guardrails for LLM-Enabled Robots

arXiv - AI · 4 min

[2511.22935] EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model

Abstract page for arXiv paper 2511.22935: EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model

arXiv - AI · 4 min

[2412.13091] LMUnit: Fine-grained Evaluation with Natural Language Unit Tests

Abstract page for arXiv paper 2412.13091: LMUnit: Fine-grained Evaluation with Natural Language Unit Tests

arXiv - AI · 3 min