Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Gemini Can Now Create AI Images Using Your Own Photos and Videos

Gemini, Google Photos, Nano Banana 2, and Personal Intelligence have all combined to give you new features through the AI prompt box.

AI Tools & Products · 6 min ·
Claude Mythos: Finance ministers and top bankers raise serious concerns about AI model

Experts say Mythos potentially has an unprecedented ability to identify and exploit cyber-security weaknesses.

AI Tools & Products · 6 min ·
What is Anthropic's Claude Mythos and what risks does it pose?

The company's claim the AI tool can outperform humans at some hacking and cyber-security tasks has sparked fears in the financial world.

AI Tools & Products · 6 min ·

All Content

[2603.04421] Do Mixed-Vendor Multi-Agent LLMs Improve Clinical Diagnosis?

Abstract page for arXiv paper 2603.04421: Do Mixed-Vendor Multi-Agent LLMs Improve Clinical Diagnosis?

arXiv - AI · 3 min ·
[2603.04419] Context-Dependent Affordance Computation in Vision-Language Models

Abstract page for arXiv paper 2603.04419: Context-Dependent Affordance Computation in Vision-Language Models

arXiv - Machine Learning · 4 min ·
[2603.04413] Simulating Meaning, Nevermore! Introducing ICR: A Semiotic-Hermeneutic Metric for Evaluating Meaning in LLM Text Summaries

Abstract page for arXiv paper 2603.04413: Simulating Meaning, Nevermore! Introducing ICR: A Semiotic-Hermeneutic Metric for Evaluating Me...

arXiv - AI · 4 min ·
[2603.04411] One Size Does Not Fit All: Token-Wise Adaptive Compression for KV Cache

Abstract page for arXiv paper 2603.04411: One Size Does Not Fit All: Token-Wise Adaptive Compression for KV Cache

arXiv - Machine Learning · 3 min ·
[2603.04410] SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

Abstract page for arXiv paper 2603.04410: SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

arXiv - AI · 4 min ·
[2603.04409] Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework

Abstract page for arXiv paper 2603.04409: Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework

arXiv - AI · 4 min ·
[2603.04406] CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG Models

Abstract page for arXiv paper 2603.04406: CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG M...

arXiv - AI · 4 min ·
[2603.04407] Semantic Containment as a Fundamental Property of Emergent Misalignment

Abstract page for arXiv paper 2603.04407: Semantic Containment as a Fundamental Property of Emergent Misalignment

arXiv - AI · 3 min ·
[2603.04405] Lost in Translation: How Language Re-Aligns Vision for Cross-Species Pathology

Abstract page for arXiv paper 2603.04405: Lost in Translation: How Language Re-Aligns Vision for Cross-Species Pathology

arXiv - Machine Learning · 4 min ·
[2603.05498] The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks

Abstract page for arXiv paper 2603.05498: The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks

arXiv - AI · 3 min ·
[2603.05485] Towards Provably Unbiased LLM Judges via Bias-Bounded Evaluation

Abstract page for arXiv paper 2603.05485: Towards Provably Unbiased LLM Judges via Bias-Bounded Evaluation

arXiv - AI · 3 min ·
[2603.05399] Judge Reliability Harness: Stress Testing the Reliability of LLM Judges

Abstract page for arXiv paper 2603.05399: Judge Reliability Harness: Stress Testing the Reliability of LLM Judges

arXiv - AI · 3 min ·
[2603.05392] Legal interpretation and AI: from expert systems to argumentation and LLMs

Abstract page for arXiv paper 2603.05392: Legal interpretation and AI: from expert systems to argumentation and LLMs

arXiv - AI · 3 min ·
[2603.05294] STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks

Abstract page for arXiv paper 2603.05294: STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks

arXiv - AI · 3 min ·
[2603.05290] X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes

Abstract page for arXiv paper 2603.05290: X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes

arXiv - AI · 4 min ·
[2603.05240] GCAgent: Enhancing Group Chat Communication through Dialogue Agents System

Abstract page for arXiv paper 2603.05240: GCAgent: Enhancing Group Chat Communication through Dialogue Agents System

arXiv - AI · 3 min ·
[2603.05129] MedCoRAG: Interpretable Hepatology Diagnosis via Hybrid Evidence Retrieval and Multispecialty Consensus

Abstract page for arXiv paper 2603.05129: MedCoRAG: Interpretable Hepatology Diagnosis via Hybrid Evidence Retrieval and Multispecialty C...

arXiv - AI · 4 min ·
[2603.05120] Bidirectional Curriculum Generation: A Multi-Agent Framework for Data-Efficient Mathematical Reasoning

Abstract page for arXiv paper 2603.05120: Bidirectional Curriculum Generation: A Multi-Agent Framework for Data-Efficient Mathematical Re...

arXiv - AI · 3 min ·
[2603.05044] WebFactory: Automated Compression of Foundational Language Intelligence into Grounded Web Agents

Abstract page for arXiv paper 2603.05044: WebFactory: Automated Compression of Foundational Language Intelligence into Grounded Web Agents

arXiv - AI · 4 min ·
[2603.05040] Enhancing Zero-shot Commonsense Reasoning by Integrating Visual Knowledge via Machine Imagination

Abstract page for arXiv paper 2603.05040: Enhancing Zero-shot Commonsense Reasoning by Integrating Visual Knowledge via Machine Imagination

arXiv - AI · 3 min ·
