Large Language Models

GPT, Claude, Gemini, and other LLMs

This Week's Best | Monthly Best | Guide | Trending

Top This Week

Llms

[D] Howcome Muon is only being used for Transformers?

Muon has quickly been adopted in LLM training, yet we don't see it being talked about in other contexts. Searches for Muon on ConvNets tu...

Reddit - Machine Learning · 1 min · 5 minutes ago

Llms

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M,...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

Paper Finds That Leading AI Chatbots Like ChatGPT and Claude Remain Incredibly Sycophantic, Resulting in Twisted Effects on Users

A study found that sycophancy is pervasive among chatbots, and that bots are more likely than human peers to affirm a person's bad behavior.

AI Tools & Products · 6 min · about 2 hours ago

All Content

Llms

[2601.20009] LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

Abstract page for arXiv paper 2601.20009: LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2601.14958] Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala

Abstract page for arXiv paper 2601.14958: Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala

arXiv - AI · 3 min · 7 days ago

Llms

[2601.12494] Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

Abstract page for arXiv paper 2601.12494: Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

arXiv - AI · 4 min · 7 days ago

Llms

[2601.07148] Measuring Iterative Temporal Reasoning with Time Puzzles

Abstract page for arXiv paper 2601.07148: Measuring Iterative Temporal Reasoning with Time Puzzles

arXiv - AI · 3 min · 7 days ago

Llms

[2601.01547] Vision-language models lag human performance on physical dynamics and intent reasoning

Abstract page for arXiv paper 2601.01547: Vision-language models lag human performance on physical dynamics and intent reasoning

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2601.01279] Collusive Pricing Under LLM

Abstract page for arXiv paper 2601.01279: Collusive Pricing Under LLM

arXiv - AI · 4 min · 7 days ago

Llms

[2512.16523] TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

Abstract page for arXiv paper 2512.16523: TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

arXiv - AI · 4 min · 7 days ago

Llms

[2512.03903] BERnaT: Basque Encoders for Representing Natural Textual Diversity

Abstract page for arXiv paper 2512.03903: BERnaT: Basque Encoders for Representing Natural Textual Diversity

arXiv - AI · 3 min · 7 days ago

Llms

[2512.05959] M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

Abstract page for arXiv paper 2512.05959: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

arXiv - AI · 4 min · 7 days ago

Llms

[2511.23455] The Price of Progress: Price Performance and the Future of AI

Abstract page for arXiv paper 2511.23455: The Price of Progress: Price Performance and the Future of AI

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2511.19299] Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

Abstract page for arXiv paper 2511.19299: Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2511.22169] Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

Abstract page for arXiv paper 2511.22169: Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

arXiv - AI · 4 min · 7 days ago

Llms

[2511.17561] LexInstructEval: Lexical Instruction Following Evaluation for Large Language Models

Abstract page for arXiv paper 2511.17561: LexInstructEval: Lexical Instruction Following Evaluation for Large Language Models

arXiv - AI · 3 min · 7 days ago

Llms

[2511.14977] SVBRD-LLM: Self-Verifying Behavioral Rule Discovery for Autonomous Vehicle Identification

Abstract page for arXiv paper 2511.14977: SVBRD-LLM: Self-Verifying Behavioral Rule Discovery for Autonomous Vehicle Identification

arXiv - AI · 4 min · 7 days ago

Llms

[2511.11828] Conformal Constrained Policy Optimization for Cost-Effective LLM Agents

Abstract page for arXiv paper 2511.11828: Conformal Constrained Policy Optimization for Cost-Effective LLM Agents

arXiv - Machine Learning · 4 min · 7 days ago

Llms

[2511.06174] LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs

Abstract page for arXiv paper 2511.06174: LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs

arXiv - AI · 4 min · 7 days ago

Llms

[2510.27543] DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models

Abstract page for arXiv paper 2510.27543: DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Mo...

arXiv - AI · 4 min · 7 days ago

Llms

[2510.13232] What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging

Abstract page for arXiv paper 2510.13232: What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging

arXiv - AI · 4 min · 7 days ago

Llms

[2510.08138] Understanding Temporal Logic Consistency in Video-Language Models through Cross-Modal Attention Discriminability

Abstract page for arXiv paper 2510.08138: Understanding Temporal Logic Consistency in Video-Language Models through Cross-Modal Attention...

arXiv - AI · 4 min · 7 days ago

Llms

[2510.06638] StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering

Abstract page for arXiv paper 2510.06638: StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering

arXiv - AI · 4 min · 7 days ago

Previous Page 33 Next

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Subscribe to Newsletter

Daily or weekly digest • Unsubscribe anytime

Large Language Models

Top This Week

[D] Howcome Muon is only being used for Transformers?

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Paper Finds That Leading AI Chatbots Like ChatGPT and Claude Remain Incredibly Sycophantic, Resulting in Twisted Effects on Users

All Content

[2601.20009] LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

[2601.14958] Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala

[2601.12494] Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

[2601.07148] Measuring Iterative Temporal Reasoning with Time Puzzles

[2601.01547] Vision-language models lag human performance on physical dynamics and intent reasoning

[2601.01279] Collusive Pricing Under LLM

[2512.16523] TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

[2512.03903] BERnaT: Basque Encoders for Representing Natural Textual Diversity

[2512.05959] M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

[2511.23455] The Price of Progress: Price Performance and the Future of AI

[2511.19299] Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

[2511.22169] Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

[2511.17561] LexInstructEval: Lexical Instruction Following Evaluation for Large Language Models

[2511.14977] SVBRD-LLM: Self-Verifying Behavioral Rule Discovery for Autonomous Vehicle Identification

[2511.11828] Conformal Constrained Policy Optimization for Cost-Effective LLM Agents

[2511.06174] LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs

[2510.27543] DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models

[2510.13232] What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging

[2510.08138] Understanding Temporal Logic Consistency in Video-Language Models through Cross-Modal Attention Discriminability

[2510.06638] StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering

Related Topics

Stay updated with AI News