Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

We gave 45 psychological questionnaires to 50 LLMs. What we found was not “personality.”

What is the “personality” of an LLM? What actually differentiates models psychometrically? Since LLMs entered public use, researchers hav...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

How to Disable Google's Gemini in Chrome | WIRED

Chrome users were caught off guard by a 4-GB Google AI model baked into Chrome, sparking privacy concerns. The good news: You can easily ...

Wired - AI · 6 min · about 3 hours ago

OpenAI introduces new 'Trusted Contact' safeguard for cases of possible self-harm | TechCrunch

The company is expanding its efforts to protect ChatGPT users in cases where conversations may turn to self-harm.

TechCrunch - AI · 5 min · about 4 hours ago

All Content

[2510.03605] Understanding the Role of Training Data in Test-Time Scaling

Abstract page for arXiv paper 2510.03605: Understanding the Role of Training Data in Test-Time Scaling

arXiv - Machine Learning · 4 min · 2 months ago

[2603.01327] SWE-Adept: An LLM-Based Agentic Framework for Deep Codebase Analysis and Structured Issue Resolution

Abstract page for arXiv paper 2603.01327: SWE-Adept: An LLM-Based Agentic Framework for Deep Codebase Analysis and Structured Issue Resol...

arXiv - Machine Learning · 4 min · 2 months ago

[2603.01326] Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning

Abstract page for arXiv paper 2603.01326: Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning

arXiv - Machine Learning · 4 min · 2 months ago

[2509.23465] ViTSP: A Vision Language Models Guided Framework for Solving Large-Scale Traveling Salesman Problems

Abstract page for arXiv paper 2509.23465: ViTSP: A Vision Language Models Guided Framework for Solving Large-Scale Traveling Salesman Pro...

arXiv - AI · 4 min · 2 months ago

[2509.23415] From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database Agents

Abstract page for arXiv paper 2509.23415: From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database ...

arXiv - AI · 4 min · 2 months ago

[2509.21993] Bilinear representation mitigates reversal curse and enables consistent model editing

Abstract page for arXiv paper 2509.21993: Bilinear representation mitigates reversal curse and enables consistent model editing

arXiv - Machine Learning · 4 min · 2 months ago

[2603.01236] AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models

Abstract page for arXiv paper 2603.01236: AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in...

arXiv - Machine Learning · 4 min · 2 months ago

[2509.21028] Who Gets Cited Most? Benchmarking Long-Context Numerical Reasoning on Scientific Articles

Abstract page for arXiv paper 2509.21028: Who Gets Cited Most? Benchmarking Long-Context Numerical Reasoning on Scientific Articles

arXiv - AI · 3 min · 2 months ago

[2603.01214] Reasoning Boosts Opinion Alignment in LLMs

Abstract page for arXiv paper 2603.01214: Reasoning Boosts Opinion Alignment in LLMs

arXiv - Machine Learning · 3 min · 2 months ago

[2509.12282] AISSISTANT: Human-AI Collaborative Review and Perspective Research Workflows in Data Science

Abstract page for arXiv paper 2509.12282: AISSISTANT: Human-AI Collaborative Review and Perspective Research Workflows in Data Science

arXiv - Machine Learning · 4 min · 2 months ago

[2603.01213] Can AI Agents Agree?

Abstract page for arXiv paper 2603.01213: Can AI Agents Agree?

arXiv - Machine Learning · 3 min · 2 months ago

[2509.03906] Toward Clinically Explainable AI for Medical Diagnosis: A Foundation Model with Human-Compatible Reasoning via Reinforcement Learning

Abstract page for arXiv paper 2509.03906: Toward Clinically Explainable AI for Medical Diagnosis: A Foundation Model with Human-Compatibl...

arXiv - AI · 4 min · 2 months ago

[2509.01938] EigenBench: A Comparative Behavioral Measure of Value Alignment

Abstract page for arXiv paper 2509.01938: EigenBench: A Comparative Behavioral Measure of Value Alignment

arXiv - Machine Learning · 4 min · 2 months ago

[2508.20729] Re4: Scientific Computing Agent with Rewriting, Resolution, Review and Revision

Abstract page for arXiv paper 2508.20729: Re4: Scientific Computing Agent with Rewriting, Resolution, Review and Revision

arXiv - AI · 4 min · 2 months ago

[2508.15030] Collab-REC: An LLM-based Agentic Framework for Balancing Recommendations in Tourism

Abstract page for arXiv paper 2508.15030: Collab-REC: An LLM-based Agentic Framework for Balancing Recommendations in Tourism

arXiv - AI · 3 min · 2 months ago

[2507.16145] SpiroLLM: Finetuning Pretrained LLMs to Understand Spirogram Time Series with Clinical Validation in COPD Reporting

Abstract page for arXiv paper 2507.16145: SpiroLLM: Finetuning Pretrained LLMs to Understand Spirogram Time Series with Clinical Validati...

arXiv - AI · 4 min · 2 months ago

[2506.24119] SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Abstract page for arXiv paper 2506.24119: SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforce...

arXiv - Machine Learning · 4 min · 2 months ago

[2603.01089] CARD: Towards Conditional Design of Multi-agent Topological Structures

Abstract page for arXiv paper 2603.01089: CARD: Towards Conditional Design of Multi-agent Topological Structures

arXiv - Machine Learning · 3 min · 2 months ago

[2506.00530] CityLens: Evaluating Large Vision-Language Models for Urban Socioeconomic Sensing

Abstract page for arXiv paper 2506.00530: CityLens: Evaluating Large Vision-Language Models for Urban Socioeconomic Sensing

arXiv - AI · 4 min · 2 months ago

[2505.12565] mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules

Abstract page for arXiv paper 2505.12565: mCLM: A Modular Chemical Language Model that Generates Functional and Makeable Molecules

arXiv - Machine Learning · 4 min · 2 months ago

