Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

https://arxiv.org/abs/2604.05091 Abstract: "We present MegaTrain, a memory-centric system that efficiently trains 100B+ parameter large l...

Reddit - Artificial Intelligence · 1 min · 28 minutes ago

[D] The Bitter Lesson of Optimization: Why training Neural Networks to update themselves is mathematically brutal (but probably inevitable)

Are we still stuck in the "feature engineering" era of optimization? We trust neural networks to learn unimaginably complex patterns from...

Reddit - Machine Learning · 1 min · 42 minutes ago

main skill in software engineering in 2026 is knowing what to ask Claude, not knowing how to code. and I can’t decide if that’s depressing or just the next abstraction layer.

Been writing code professionally for 8+ years. I’m now spending more time describing features in plain English than writing actual c...

Reddit - Artificial Intelligence · 1 min · about 5 hours ago

All Content

[2601.08393] Controlled LLM Training on Spectral Sphere

arXiv - Machine Learning · 3 min · about 1 month ago

[2601.04548] Identifying Good and Bad Neurons for Task-Level Controllable LLMs

arXiv - AI · 4 min · about 1 month ago

[2601.02663] When Do Tools and Planning Help Large Language Models Think? A Cost- and Latency-Aware Benchmark

arXiv - AI · 4 min · about 1 month ago

[2512.15163] MCP-SafetyBench: A Benchmark for Safety Evaluation of Large Language Models with Real-World MCP Servers

arXiv - AI · 4 min · about 1 month ago

[2512.14391] RePo: Language Models with Context Re-Positioning

arXiv - Machine Learning · 4 min · about 1 month ago

[2512.13586] ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding

arXiv - Machine Learning · 4 min · about 1 month ago

[2511.21399] Steering Awareness: Models Can Be Trained to Detect Activation Steering

arXiv - AI · 4 min · about 1 month ago

[2511.16786] Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach

arXiv - Machine Learning · 4 min · about 1 month ago

[2511.03153] RefAgent: A Multi-agent LLM-based Framework for Automatic Software Refactoring

arXiv - AI · 4 min · about 1 month ago

[2511.01870] CytoNet: A Foundation Model for the Human Cerebral Cortex at Cellular Resolution

arXiv - Machine Learning · 4 min · about 1 month ago

[2510.27173] FMint-SDE: A Multimodal Foundation Model for Accelerating Numerical Simulation of SDEs via Error Correction

arXiv - Machine Learning · 4 min · about 1 month ago

[2510.22503] LLEMA: Evolutionary Search with LLMs for Multi-Objective Materials Discovery

arXiv - Machine Learning · 4 min · about 1 month ago

[2510.20333] GhostEI-Bench: Do Mobile Agents Resilience to Environmental Injection in Dynamic On-Device Environments?

arXiv - AI · 4 min · about 1 month ago

[2510.18876] Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs

arXiv - AI · 4 min · about 1 month ago

[2510.16714] SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes

arXiv - AI · 3 min · about 1 month ago

[2510.16688] Pursuing Minimal Sufficiency in Spatial Reasoning

arXiv - AI · 4 min · about 1 month ago

[2510.00507] Graph2Eval: Automatic Multimodal Task Generation for Agents via Knowledge Graphs

arXiv - AI · 4 min · about 1 month ago

[2509.25149] Pretraining Large Language Models with NVFP4

arXiv - Machine Learning · 5 min · about 1 month ago

[2510.00177] PrefDisco: Benchmarking Proactive Personalized Reasoning

arXiv - AI · 4 min · about 1 month ago

[2509.24210] BeyondBench: Contamination-Resistant Evaluation of Reasoning in Language Models

arXiv - Machine Learning · 4 min · about 1 month ago
