Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

Llms

This Is Not Hacking. This Is Structured Intelligence.

Watch me demonstrate everything I've been talking about—live, in real time. The Setup: Maestro University AI enrollment system Standard c...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

Llms

[D] How come Muon is only being used for Transformers?

Muon has quickly been adopted in LLM training, yet we don't see it being talked about in other contexts. Searches for Muon on ConvNets tu...

Reddit - Machine Learning · 1 min · about 2 hours ago

Llms

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M,...

Reddit - Machine Learning · 1 min · about 3 hours ago

All Content

Llms

[2510.05181] Auditing Pay-Per-Token in Large Language Models

Abstract page for arXiv paper 2510.05181: Auditing Pay-Per-Token in Large Language Models

arXiv - AI · 4 min · 7 days ago

Llms

[2510.05092] Learning to Interpret Weight Differences in Language Models

Abstract page for arXiv paper 2510.05092: Learning to Interpret Weight Differences in Language Models

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2510.02375] Pretraining with hierarchical memories: separating long-tail and common knowledge

Abstract page for arXiv paper 2510.02375: Pretraining with hierarchical memories: separating long-tail and common knowledge

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2510.02249] Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation

Abstract page for arXiv paper 2510.02249: Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2510.01037] CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs

Abstract page for arXiv paper 2510.01037: CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2508.05694] DMFI: A Dual-Modality Log Analysis Framework for Insider Threat Detection with LoRA-Tuned Language Models

Abstract page for arXiv paper 2508.05694: DMFI: A Dual-Modality Log Analysis Framework for Insider Threat Detection with LoRA-Tuned Langu...

arXiv - AI · 4 min · 7 days ago

Llms

[2507.08704] Knowledge Fusion via Bidirectional Information Aggregation

Abstract page for arXiv paper 2507.08704: Knowledge Fusion via Bidirectional Information Aggregation

arXiv - AI · 4 min · 7 days ago

Llms

[2507.03156] The Impact of LLM-Assistants on Software Developer Productivity: A Systematic Review and Mapping Study

Abstract page for arXiv paper 2507.03156: The Impact of LLM-Assistants on Software Developer Productivity: A Systematic Review and Mappin...

arXiv - AI · 4 min · 7 days ago

Llms

[2506.13925] Segmenting Visuals With Querying Words: Language Anchors For Semi-Supervised Image Segmentation

Abstract page for arXiv paper 2506.13925: Segmenting Visuals With Querying Words: Language Anchors For Semi-Supervised Image Segmentation

arXiv - AI · 4 min · 7 days ago

Llms

[2506.11128] Theory-Grounded Evaluation of Human-Like Fallacy Patterns in LLM Reasoning

Abstract page for arXiv paper 2506.11128: Theory-Grounded Evaluation of Human-Like Fallacy Patterns in LLM Reasoning

arXiv - AI · 3 min · 7 days ago

Llms

[2505.20730] Do LLMs Understand Collaborative Signals? Diagnosis and Repair

Abstract page for arXiv paper 2505.20730: Do LLMs Understand Collaborative Signals? Diagnosis and Repair

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2504.14636] AlphaZero-Edu: Democratizing Access to AlphaZero

Abstract page for arXiv paper 2504.14636: AlphaZero-Edu: Democratizing Access to AlphaZero

arXiv - Machine Learning · 3 min · 7 days ago

Llms

[2503.13401] Levels of Analysis for Large Language Models

Abstract page for arXiv paper 2503.13401: Levels of Analysis for Large Language Models

arXiv - AI · 3 min · 7 days ago

Llms

[2502.11026] RLHF in an SFT Way: From Optimal Solution to Reward-Weighted Alignment

Abstract page for arXiv paper 2502.11026: RLHF in an SFT Way: From Optimal Solution to Reward-Weighted Alignment

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2502.00618] DesCLIP: Robust Continual Learning via General Attribute Descriptions for VLM-Based Visual Recognition

Abstract page for arXiv paper 2502.00618: DesCLIP: Robust Continual Learning via General Attribute Descriptions for VLM-Based Visual Reco...

arXiv - AI · 4 min · 7 days ago

Llms

[2501.02406] A Training-free Method for LLM Text Attribution

Abstract page for arXiv paper 2501.02406: A Training-free Method for LLM Text Attribution

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2410.01591] Imaging foundation model for universal enhancement of non-ideal measurement CT

Abstract page for arXiv paper 2410.01591: Imaging foundation model for universal enhancement of non-ideal measurement CT

arXiv - AI · 4 min · 7 days ago

Llms

[2402.01749] Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models

Abstract page for arXiv paper 2402.01749: Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2406.01914] HPE-CogVLM: Advancing Vision Language Models with a Head Pose Grounding Task

Abstract page for arXiv paper 2406.01914: HPE-CogVLM: Advancing Vision Language Models with a Head Pose Grounding Task

arXiv - AI · 4 min · 7 days ago

Llms

[2603.18908] Secure Linear Alignment of Large Language Models

Abstract page for arXiv paper 2603.18908: Secure Linear Alignment of Large Language Models

arXiv - AI · 3 min · 7 days ago

