Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week


This Is Not Hacking. This Is Structured Intelligence.

Watch me demonstrate everything I've been talking about—live, in real time. The Setup: Maestro University AI enrollment system Standard c...

Reddit - Artificial Intelligence · 1 min ·

[D] How come Muon is only being used for Transformers?

Muon has quickly been adopted in LLM training, yet we don't see it being talked about in other contexts. Searches for Muon on ConvNets tu...

Reddit - Machine Learning · 1 min ·

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M,...

Reddit - Machine Learning · 1 min ·

All Content

[2510.05181] Auditing Pay-Per-Token in Large Language Models

arXiv - AI · 4 min ·
[2510.05092] Learning to Interpret Weight Differences in Language Models

arXiv - Machine Learning · 4 min ·
[2510.02375] Pretraining with hierarchical memories: separating long-tail and common knowledge

arXiv - Machine Learning · 4 min ·
[2510.02249] Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation

arXiv - Machine Learning · 4 min ·
[2510.01037] CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs

arXiv - Machine Learning · 4 min ·
[2508.05694] DMFI: A Dual-Modality Log Analysis Framework for Insider Threat Detection with LoRA-Tuned Language Models

arXiv - AI · 4 min ·
[2507.08704] Knowledge Fusion via Bidirectional Information Aggregation

arXiv - AI · 4 min ·
[2507.03156] The Impact of LLM-Assistants on Software Developer Productivity: A Systematic Review and Mapping Study

arXiv - AI · 4 min ·
[2506.13925] Segmenting Visuals With Querying Words: Language Anchors For Semi-Supervised Image Segmentation

arXiv - AI · 4 min ·
[2506.11128] Theory-Grounded Evaluation of Human-Like Fallacy Patterns in LLM Reasoning

arXiv - AI · 3 min ·
[2505.20730] Do LLMs Understand Collaborative Signals? Diagnosis and Repair

arXiv - Machine Learning · 4 min ·
[2504.14636] AlphaZero-Edu: Democratizing Access to AlphaZero

arXiv - Machine Learning · 3 min ·
[2503.13401] Levels of Analysis for Large Language Models

arXiv - AI · 3 min ·
[2502.11026] RLHF in an SFT Way: From Optimal Solution to Reward-Weighted Alignment

arXiv - Machine Learning · 4 min ·
[2502.00618] DesCLIP: Robust Continual Learning via General Attribute Descriptions for VLM-Based Visual Recognition

arXiv - AI · 4 min ·
[2501.02406] A Training-free Method for LLM Text Attribution

arXiv - Machine Learning · 4 min ·
[2410.01591] Imaging foundation model for universal enhancement of non-ideal measurement CT

arXiv - AI · 4 min ·
[2402.01749] Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models

arXiv - Machine Learning · 4 min ·
[2406.01914] HPE-CogVLM: Advancing Vision Language Models with a Head Pose Grounding Task

arXiv - AI · 4 min ·
[2603.18908] Secure Linear Alignment of Large Language Models

arXiv - AI · 3 min ·