Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

[D] How come Muon is only being used for Transformers?

Muon has quickly been adopted in LLM training, yet we don't see it being talked about in other contexts. Searches for Muon on ConvNets tu...

Reddit - Machine Learning · 1 min ·
Llms

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M,...

Reddit - Machine Learning · 1 min ·
Llms

Paper Finds That Leading AI Chatbots Like ChatGPT and Claude Remain Incredibly Sycophantic, Resulting in Twisted Effects on Users

A study found that sycophancy is pervasive among chatbots, and that bots are more likely than human peers to affirm a person's bad behavior.

AI Tools & Products · 6 min ·

All Content

Llms

[2601.20009] LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

Abstract page for arXiv paper 2601.20009: LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

arXiv - Machine Learning · 4 min ·
Llms

[2601.14958] Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala

Abstract page for arXiv paper 2601.14958: Script Sensitivity: Benchmarking Language Models on Unicode, Romanized and Mixed-Script Sinhala

arXiv - AI · 3 min ·
Llms

[2601.12494] Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

Abstract page for arXiv paper 2601.12494: Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs

arXiv - AI · 4 min ·
Llms

[2601.07148] Measuring Iterative Temporal Reasoning with Time Puzzles

Abstract page for arXiv paper 2601.07148: Measuring Iterative Temporal Reasoning with Time Puzzles

arXiv - AI · 3 min ·
Llms

[2601.01547] Vision-language models lag human performance on physical dynamics and intent reasoning

Abstract page for arXiv paper 2601.01547: Vision-language models lag human performance on physical dynamics and intent reasoning

arXiv - Machine Learning · 4 min ·
Llms

[2601.01279] Collusive Pricing Under LLM

Abstract page for arXiv paper 2601.01279: Collusive Pricing Under LLM

arXiv - AI · 4 min ·
Llms

[2512.16523] TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

Abstract page for arXiv paper 2512.16523: TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

arXiv - AI · 4 min ·
Llms

[2512.03903] BERnaT: Basque Encoders for Representing Natural Textual Diversity

Abstract page for arXiv paper 2512.03903: BERnaT: Basque Encoders for Representing Natural Textual Diversity

arXiv - AI · 3 min ·
Llms

[2512.05959] M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

Abstract page for arXiv paper 2512.05959: M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG

arXiv - AI · 4 min ·
Llms

[2511.23455] The Price of Progress: Price Performance and the Future of AI

Abstract page for arXiv paper 2511.23455: The Price of Progress: Price Performance and the Future of AI

arXiv - Machine Learning · 4 min ·
Llms

[2511.19299] Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

Abstract page for arXiv paper 2511.19299: Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

arXiv - Machine Learning · 4 min ·
Llms

[2511.22169] Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

Abstract page for arXiv paper 2511.22169: Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

arXiv - AI · 4 min ·
Llms

[2511.17561] LexInstructEval: Lexical Instruction Following Evaluation for Large Language Models

Abstract page for arXiv paper 2511.17561: LexInstructEval: Lexical Instruction Following Evaluation for Large Language Models

arXiv - AI · 3 min ·
Llms

[2511.14977] SVBRD-LLM: Self-Verifying Behavioral Rule Discovery for Autonomous Vehicle Identification

Abstract page for arXiv paper 2511.14977: SVBRD-LLM: Self-Verifying Behavioral Rule Discovery for Autonomous Vehicle Identification

arXiv - AI · 4 min ·
Llms

[2511.11828] Conformal Constrained Policy Optimization for Cost-Effective LLM Agents

Abstract page for arXiv paper 2511.11828: Conformal Constrained Policy Optimization for Cost-Effective LLM Agents

arXiv - Machine Learning · 4 min ·
Llms

[2511.06174] LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs

Abstract page for arXiv paper 2511.06174: LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs

arXiv - AI · 4 min ·
Llms

[2510.27543] DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Models

Abstract page for arXiv paper 2510.27543: DialectalArabicMMLU: Benchmarking Dialectal Capabilities in Arabic and Multilingual Language Mo...

arXiv - AI · 4 min ·
Llms

[2510.13232] What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging

Abstract page for arXiv paper 2510.13232: What "Not" to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging

arXiv - AI · 4 min ·
Llms

[2510.08138] Understanding Temporal Logic Consistency in Video-Language Models through Cross-Modal Attention Discriminability

Abstract page for arXiv paper 2510.08138: Understanding Temporal Logic Consistency in Video-Language Models through Cross-Modal Attention...

arXiv - AI · 4 min ·
Llms

[2510.06638] StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering

Abstract page for arXiv paper 2510.06638: StaR-KVQA: Structured Reasoning Traces for Implicit-Knowledge Visual Question Answering

arXiv - AI · 4 min ·
