Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

[2602.07238] Is there "Secret Sauce" in Large Language Model Development?

arXiv - Machine Learning · 3 min · about 5 hours ago

[2602.01203] Attention Sink Forges Native MoE in Attention Layers: Sink-Aware Training to Address Head Collapse

arXiv - Machine Learning · 4 min · about 5 hours ago

[2601.01322] LinMU: Multimodal Understanding Made Linear

arXiv - Machine Learning · 4 min · about 5 hours ago

All Content

[2603.02080] From Pixels to Patches: Pooling Strategies for Earth Embeddings

arXiv - Machine Learning · 3 min · 2 months ago

[2603.02026] Learning to Read Where to Look: Disease-Aware Vision-Language Pretraining for 3D CT

arXiv - Machine Learning · 4 min · 2 months ago

[2603.01834] Probing Materials Knowledge in LLMs: From Latent Embeddings to Reliable Predictions

arXiv - Machine Learning · 3 min · 2 months ago

[2602.11661] Quark Medical Alignment: A Holistic Multi-Dimensional Alignment and Collaborative Optimization Paradigm

arXiv - AI · 4 min · 2 months ago

[2602.10625] To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks

arXiv - AI · 4 min · 2 months ago

[2602.09794] Learning Global Hypothesis Space for Enhancing Synergistic Reasoning Chain

arXiv - AI · 4 min · 2 months ago

[2602.09463] SpotAgent: Grounding Visual Geo-localization in Large Vision-Language Models through Agentic Reasoning

arXiv - AI · 4 min · 2 months ago

[2602.07543] MSP-LLM: A Unified Large Language Model Framework for Complete Material Synthesis Planning

arXiv - AI · 4 min · 2 months ago

[2601.10729] OrbitFlow: SLO-Aware Long-Context LLM Serving with Fine-Grained KV Cache Reconfiguration

arXiv - Machine Learning · 4 min · 2 months ago

[2601.08166] ZeroDVFS: Zero-Shot LLM-Guided Core and Frequency Allocation for Embedded Platforms

arXiv - AI · 3 min · 2 months ago

[2601.06502] DRAGON: LLM-Driven Decomposition and Reconstruction Agents for Large-Scale Combinatorial Optimization

arXiv - AI · 4 min · 2 months ago

[2603.01691] Building a Strong Instruction Language Model for a Less-Resourced Language

arXiv - Machine Learning · 4 min · 2 months ago

[2512.20745] AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent

arXiv - Machine Learning · 4 min · 2 months ago

[2512.12411] Detecting the Disturbance: A Nuanced View of Introspective Abilities in LLMs

arXiv - AI · 4 min · 2 months ago

[2603.01590] IDProxy: Cold-Start CTR Prediction for Ads and Recommendation at Xiaohongshu with Multimodal LLMs

arXiv - Machine Learning · 3 min · 2 months ago

[2512.01351] Benchmarking Overton Pluralism in LLMs

arXiv - AI · 3 min · 2 months ago

[2512.01210] Knowledge Graph Augmented Large Language Models for Disease Prediction

arXiv - AI · 3 min · 2 months ago

[2511.10788] From Efficiency to Adaptivity: A Deeper Look at Adaptive Reasoning in Large Language Models

arXiv - AI · 4 min · 2 months ago

[2511.00206] Addressing Longstanding Challenges in Cognitive Science with Language Models

arXiv - AI · 3 min · 2 months ago

[2603.01471] Reconstructing Content via Collaborative Attention to Improve Multimodal Embedding Quality

arXiv - Machine Learning · 4 min · 2 months ago
