Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

[2602.07238] Is there "Secret Sauce" in Large Language Model Development?

arXiv - Machine Learning · 3 min · about 4 hours ago

[2602.01203] Attention Sink Forges Native MoE in Attention Layers: Sink-Aware Training to Address Head Collapse

arXiv - Machine Learning · 4 min · about 4 hours ago

[2601.01322] LinMU: Multimodal Understanding Made Linear

arXiv - Machine Learning · 4 min · about 4 hours ago

All Content

[2505.15504] Exploiting Low-Dimensional Manifold of Features for Few-Shot Whole Slide Image Classification

arXiv - AI · 4 min · 2 months ago

[2505.13109] FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference

arXiv - Machine Learning · 4 min · 2 months ago

[2505.12186] Self-Destructive Language Model

arXiv - Machine Learning · 4 min · 2 months ago

[2502.01481] Intrinsic Entropy of Context Length Scaling in LLMs

arXiv - Machine Learning · 4 min · 2 months ago

[2505.02881] Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

arXiv - Machine Learning · 4 min · 2 months ago

[2505.02872] Decoding Open-Ended Information Seeking Goals from Eye Movements in Reading

arXiv - AI · 4 min · 2 months ago

[2504.02010] When Reasoning Meets Compression: Understanding the Effects of LLMs Compression on Large Reasoning Models

arXiv - Machine Learning · 4 min · 2 months ago

[2503.12988] ROMA: a Read-Only-Memory-based Accelerator for QLoRA-based On-Device LLM

arXiv - AI · 4 min · 2 months ago

[2503.21735] GateLens: A Reasoning-Enhanced LLM Agent for Automotive Software Release Analytics

arXiv - AI · 4 min · 2 months ago

[2503.06749] Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

arXiv - Machine Learning · 4 min · 2 months ago

[2503.06238] Token-Efficient Item Representation via Images for LLM Recommender Systems

arXiv - AI · 4 min · 2 months ago

[2404.08480] Using ChatGPT for Data Science Analyses

arXiv - Machine Learning · 3 min · 2 months ago

[2503.03862] Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions

arXiv - AI · 4 min · 2 months ago

[2503.02879] Wikipedia in the Era of LLMs: Evolution and Risks

arXiv - Machine Learning · 4 min · 2 months ago

[2502.12179] Sparse Shift Autoencoders for Identifying Concepts from Large Language Model Activations

arXiv - Machine Learning · 4 min · 2 months ago

[2502.04326] WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs

arXiv - AI · 4 min · 2 months ago

[2412.19496] Multi-PA: A Multi-perspective Benchmark on Privacy Assessment for Large Vision-Language Models

arXiv - AI · 4 min · 2 months ago

[2411.03292] Interaction2Code: Benchmarking MLLM-based Interactive Webpage Code Generation from Interactive Prototyping

arXiv - AI · 4 min · 2 months ago

[2410.13648] SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs

arXiv - AI · 4 min · 2 months ago

[2410.05254] GLEE: A Unified Framework and Benchmark for Language-based Economic Environments

arXiv - Machine Learning · 4 min · 2 months ago
