Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

LLMs

I am not an "anti" like this guy, but it's still an interesting video of a person interacting with ChatGPT 4o

(Posting here because it was removed by the ChatGPT Complaints moderators, since the model here is 4o, and they refuse to believe there were any safety ...

Reddit - Artificial Intelligence · 1 min ·
LLMs

We built a way for two people's AI context to talk to each other (without sharing their conversations)

We've been thinking about how we use AI in our relationships. A big part of it is about other people: talking about them, figuring out what...

Reddit - Artificial Intelligence · 1 min ·
No flattery please, Claude: I’m British | Brief letters
LLMs

AI Tools & Products · 2 min ·

All Content

[2507.08207] Toward a Dynamic Stackelberg Game-Theoretic Framework for Agentic AI Defense Against LLM Jailbreaking
LLMs

arXiv - AI · 3 min ·
[2505.19892] OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging
LLMs

arXiv - AI · 4 min ·
[2505.13909] Efficient Agent Training for Computer Use
LLMs

arXiv - Machine Learning · 3 min ·
[2505.13180] ViPlan: A Benchmark for Visual Planning with Symbolic Predicates and Vision-Language Models
LLMs

arXiv - AI · 4 min ·
[2603.03269] LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory
LLMs

arXiv - Machine Learning · 4 min ·
[2603.03180] Type-Aware Retrieval-Augmented Generation with Dependency Closure for Solver-Executable Industrial Optimization Modeling
LLMs

arXiv - AI · 4 min ·
[2603.03192] MoD-DPO: Towards Mitigating Cross-modal Hallucinations in Omni LLMs using Modality Decoupled Preference Optimization
LLMs

arXiv - Machine Learning · 3 min ·
[2603.02952] Sparse autoencoders reveal organized biological knowledge but minimal regulatory logic in single-cell foundation models: a comparative atlas of Geneformer and scGPT
LLMs

arXiv - Machine Learning · 4 min ·
[2603.03095] Compact Prompting in Instruction-tuned LLMs for Joint Argumentative Component Detection
LLMs

arXiv - AI · 3 min ·
[2603.02775] From Solver to Tutor: Evaluating the Pedagogical Intelligence of LLMs with KMP-Bench
LLMs

arXiv - Machine Learning · 4 min ·
[2603.03047] TrustMH-Bench: A Comprehensive Benchmark for Evaluating the Trustworthiness of Large Language Models in Mental Health
LLMs

arXiv - AI · 4 min ·
[2603.02983] Contextualized Privacy Defense for LLM Agents
LLMs

arXiv - AI · 3 min ·
[2603.02623] Uni-Skill: Building Self-Evolving Skill Repository for Generalizable Robotic Manipulation
LLMs

arXiv - Machine Learning · 4 min ·
[2603.02949] SEALing the Gap: A Reference Framework for LLM Inference Carbon Estimation via Multi-Benchmark Driven Embodiment
LLMs

arXiv - AI · 3 min ·
[2603.02909] Learning to Generate and Extract: A Multi-Agent Collaboration Framework For Zero-shot Document-level Event Arguments Extraction
LLMs

arXiv - AI · 4 min ·
[2603.02830] Faster, Cheaper, More Accurate: Specialised Knowledge Tracing Models Outperform LLMs
LLMs

arXiv - AI · 4 min ·
[2603.02789] OCR or Not? Rethinking Document Information Extraction in the MLLMs Era with Real-World Large-Scale Datasets
LLMs

arXiv - AI · 3 min ·
[2603.02470] Video TokenCom: Textual Intent-Guided Multi-Rate Video Token Communications with UEP-Based Adaptive Source-Channel Coding
LLMs

arXiv - Machine Learning · 4 min ·
[2603.02760] Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration
LLMs

arXiv - AI · 3 min ·
[2603.02748] iGVLM: Dynamic Instruction-Guided Vision Encoding for Question-Aware Multimodal Understanding
LLMs

arXiv - AI · 3 min ·