Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

LLMs

Why are we blindly trusting AI companies with our data?

Lately I’ve been seeing a story floating around that really made me pause. Apparently, there were claims that the US government asked Ant...

Reddit - Artificial Intelligence · 1 min · 30 minutes ago

De-aged casts, ChatGPT-generated programs: How AI is changing Korean TV

Artificial intelligence is transforming every corner of industry, and television is no exception. Major networks in Korea have recently a...

AI Tools & Products · 4 min · about 2 hours ago

[2603.16629] MLLM-based Textual Explanations for Face Comparison

Abstract page for arXiv paper 2603.16629: MLLM-based Textual Explanations for Face Comparison

arXiv - AI · 4 min · about 2 hours ago

All Content

[2505.22318] Flying Pigs, FaR and Beyond: Evaluating LLM Reasoning in Counterfactual Worlds

Abstract page for arXiv paper 2505.22318: Flying Pigs, FaR and Beyond: Evaluating LLM Reasoning in Counterfactual Worlds

arXiv - Machine Learning · 4 min · 5 days ago

[2504.16956] GeneMamba: An Efficient and Effective Foundation Model on Single Cell Data

Abstract page for arXiv paper 2504.16956: GeneMamba: An Efficient and Effective Foundation Model on Single Cell Data

arXiv - Machine Learning · 4 min · 5 days ago

[2412.08686] LatentQA: Teaching LLMs to Decode Activations Into Natural Language

Abstract page for arXiv paper 2412.08686: LatentQA: Teaching LLMs to Decode Activations Into Natural Language

arXiv - Machine Learning · 4 min · 5 days ago

[2410.12164] Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator Fine-tuning

Abstract page for arXiv paper 2410.12164: Table-LLM-Specialist: Language Model Specialists for Tables using Iterative Generator-Validator...

arXiv - Machine Learning · 4 min · 5 days ago

[2504.07396] Automating quantum feature map design via large language models

Abstract page for arXiv paper 2504.07396: Automating quantum feature map design via large language models

arXiv - AI · 4 min · 5 days ago

[2502.01969] Mitigating Object Hallucinations in Large Vision-Language Models via Attention Calibration

Abstract page for arXiv paper 2502.01969: Mitigating Object Hallucinations in Large Vision-Language Models via Attention Calibration

arXiv - AI · 4 min · 5 days ago

[2306.05036] Mapping the Challenges of HCI: An Application and Evaluation of ChatGPT for Mining Insights at Scale

Abstract page for arXiv paper 2306.05036: Mapping the Challenges of HCI: An Application and Evaluation of ChatGPT for Mining Insights at ...

arXiv - AI · 4 min · 5 days ago

[2601.12138] DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants

Abstract page for arXiv paper 2601.12138: DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants

arXiv - AI · 3 min · 5 days ago

[2511.22076] Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in Internet of Agents

Abstract page for arXiv paper 2511.22076: Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in ...

arXiv - AI · 4 min · 5 days ago

[2601.18858] Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer Language Model

Abstract page for arXiv paper 2601.18858: Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer...

arXiv - AI · 4 min · 5 days ago

[2510.05318] BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions

Abstract page for arXiv paper 2510.05318: BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynami...

arXiv - AI · 4 min · 5 days ago

[2510.00415] Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm

Abstract page for arXiv paper 2510.00415: Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration und...

arXiv - AI · 4 min · 5 days ago

[2603.23501] MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

Abstract page for arXiv paper 2603.23501: MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

arXiv - AI · 4 min · 5 days ago

[2603.23485] Failure of contextual invariance in gender inference with large language models

Abstract page for arXiv paper 2603.23485: Failure of contextual invariance in gender inference with large language models

arXiv - AI · 3 min · 5 days ago

[2603.23482] ReqFusion: A Multi-Provider Framework for Automated PEGS Analysis Across Software Domains

Abstract page for arXiv paper 2603.23482: ReqFusion: A Multi-Provider Framework for Automated PEGS Analysis Across Software Domains

arXiv - AI · 4 min · 5 days ago

[2510.16051] GUIrilla: A Scalable Framework for Automated Desktop UI Exploration

Abstract page for arXiv paper 2510.16051: GUIrilla: A Scalable Framework for Automated Desktop UI Exploration

arXiv - AI · 4 min · 5 days ago

[2603.23447] 3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Understanding

Abstract page for arXiv paper 2603.23447: 3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Un...

arXiv - AI · 4 min · 5 days ago

[2603.23443] Evaluating LLM-Based Test Generation Under Software Evolution

Abstract page for arXiv paper 2603.23443: Evaluating LLM-Based Test Generation Under Software Evolution

arXiv - AI · 4 min · 5 days ago

[2603.23322] Leveraging LLMs and Social Media to Understand User Perception of Smartphone-Based Earthquake Early Warnings

Abstract page for arXiv paper 2603.23322: Leveraging LLMs and Social Media to Understand User Perception of Smartphone-Based Earthquake E...

arXiv - AI · 4 min · 5 days ago

[2507.00026] RedTopic: Toward Topic-Diverse Red Teaming of Large Language Models

Abstract page for arXiv paper 2507.00026: RedTopic: Toward Topic-Diverse Red Teaming of Large Language Models

arXiv - AI · 4 min · 5 days ago
