Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

[2602.07238] Is there "Secret Sauce" in Large Language Model Development?
arXiv - Machine Learning · 3 min

[2602.01203] Attention Sink Forges Native MoE in Attention Layers: Sink-Aware Training to Address Head Collapse
arXiv - Machine Learning · 4 min

[2601.01322] LinMU: Multimodal Understanding Made Linear
arXiv - Machine Learning · 4 min

All Content

[2603.02080] From Pixels to Patches: Pooling Strategies for Earth Embeddings
arXiv - Machine Learning · 3 min

[2603.02026] Learning to Read Where to Look: Disease-Aware Vision-Language Pretraining for 3D CT
arXiv - Machine Learning · 4 min

[2603.01834] Probing Materials Knowledge in LLMs: From Latent Embeddings to Reliable Predictions
arXiv - Machine Learning · 3 min

[2602.11661] Quark Medical Alignment: A Holistic Multi-Dimensional Alignment and Collaborative Optimization Paradigm
arXiv - AI · 4 min

[2602.10625] To Think or Not To Think, That is The Question for Large Reasoning Models in Theory of Mind Tasks
arXiv - AI · 4 min

[2602.09794] Learning Global Hypothesis Space for Enhancing Synergistic Reasoning Chain
arXiv - AI · 4 min

[2602.09463] SpotAgent: Grounding Visual Geo-localization in Large Vision-Language Models through Agentic Reasoning
arXiv - AI · 4 min

[2602.07543] MSP-LLM: A Unified Large Language Model Framework for Complete Material Synthesis Planning
arXiv - AI · 4 min

[2601.10729] OrbitFlow: SLO-Aware Long-Context LLM Serving with Fine-Grained KV Cache Reconfiguration
arXiv - Machine Learning · 4 min

[2601.08166] ZeroDVFS: Zero-Shot LLM-Guided Core and Frequency Allocation for Embedded Platforms
arXiv - AI · 3 min

[2601.06502] DRAGON: LLM-Driven Decomposition and Reconstruction Agents for Large-Scale Combinatorial Optimization
arXiv - AI · 4 min

[2603.01691] Building a Strong Instruction Language Model for a Less-Resourced Language
arXiv - Machine Learning · 4 min

[2512.20745] AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent
arXiv - Machine Learning · 4 min

[2512.12411] Detecting the Disturbance: A Nuanced View of Introspective Abilities in LLMs
arXiv - AI · 4 min

[2603.01590] IDProxy: Cold-Start CTR Prediction for Ads and Recommendation at Xiaohongshu with Multimodal LLMs
arXiv - Machine Learning · 3 min

[2512.01351] Benchmarking Overton Pluralism in LLMs
arXiv - AI · 3 min

[2512.01210] Knowledge Graph Augmented Large Language Models for Disease Prediction
arXiv - AI · 3 min

[2511.10788] From Efficiency to Adaptivity: A Deeper Look at Adaptive Reasoning in Large Language Models
arXiv - AI · 4 min

[2511.00206] Addressing Longstanding Challenges in Cognitive Science with Language Models
arXiv - AI · 3 min

[2603.01471] Reconstructing Content via Collaborative Attention to Improve Multimodal Embedding Quality
arXiv - Machine Learning · 4 min