Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

We gave 45 psychological questionnaires to 50 LLMs. What we found was not “personality.”

What is the “personality” of an LLM? What actually differentiates models psychometrically? Since LLMs entered public use, researchers hav...

Reddit - Artificial Intelligence · 1 min · 36 minutes ago

How to Disable Google's Gemini in Chrome | WIRED

Chrome users were caught off guard by a 4-GB Google AI model baked into Chrome, sparking privacy concerns. The good news: You can easily ...

Wired - AI · 6 min · 36 minutes ago

OpenAI introduces new 'Trusted Contact' safeguard for cases of possible self-harm | TechCrunch

The company is expanding its efforts to protect ChatGPT users in cases where conversations may turn to self-harm.

TechCrunch - AI · 5 min · about 1 hour ago

All Content

[2506.08902] Intention-Conditioned Flow Occupancy Models

Abstract page for arXiv paper 2506.08902: Intention-Conditioned Flow Occupancy Models

arXiv - Machine Learning · 4 min · 2 months ago

[2506.06683] RoboPARA: Dual-Arm Robot Planning with Parallel Allocation and Recomposition Across Tasks

Abstract page for arXiv paper 2506.06683: RoboPARA: Dual-Arm Robot Planning with Parallel Allocation and Recomposition Across Tasks

arXiv - AI · 4 min · 2 months ago

[2506.03135] OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models

Abstract page for arXiv paper 2506.03135: OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models

arXiv - AI · 4 min · 2 months ago

[2506.02860] Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs

Abstract page for arXiv paper 2506.02860: Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs

arXiv - AI · 3 min · 2 months ago

[2505.11076] Addition is almost all you need: Compressing large language models with double binary factorization

Abstract page for arXiv paper 2505.11076: Addition is almost all you need: Compressing large language models with double binary factoriza...

arXiv - Machine Learning · 4 min · 2 months ago

[2505.24298] AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Abstract page for arXiv paper 2505.24298: AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

arXiv - Machine Learning · 4 min · 2 months ago

[2505.21786] VeriTrail: Closed-Domain Hallucination Detection with Traceability

Abstract page for arXiv paper 2505.21786: VeriTrail: Closed-Domain Hallucination Detection with Traceability

arXiv - AI · 3 min · 2 months ago

[2504.03889] Identifying and Evaluating Inactive Heads in Pretrained LLMs

Abstract page for arXiv paper 2504.03889: Identifying and Evaluating Inactive Heads in Pretrained LLMs

arXiv - Machine Learning · 4 min · 2 months ago

[2505.20278] Characterizing Pattern Matching and Its Limits on Compositional Task Structures

Abstract page for arXiv paper 2505.20278: Characterizing Pattern Matching and Its Limits on Compositional Task Structures

arXiv - Machine Learning · 4 min · 2 months ago

[2505.21413] RefTool: Reference-Guided Tool Creation for Knowledge-Intensive Reasoning

Abstract page for arXiv paper 2505.21413: RefTool: Reference-Guided Tool Creation for Knowledge-Intensive Reasoning

arXiv - AI · 4 min · 2 months ago

[2505.21396] Augmenting Research Ideation with Data: An Empirical Investigation in Social Science

Abstract page for arXiv paper 2505.21396: Augmenting Research Ideation with Data: An Empirical Investigation in Social Science

arXiv - AI · 4 min · 2 months ago

[2503.08980] I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data?

Abstract page for arXiv paper 2503.08980: I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts...

arXiv - Machine Learning · 4 min · 2 months ago

[2505.16056] Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models

Abstract page for arXiv paper 2505.16056: Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models

arXiv - Machine Learning · 4 min · 2 months ago

[2505.17702] Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via DeepSeek

Abstract page for arXiv paper 2505.17702: Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via De...

arXiv - AI · 4 min · 2 months ago

[2505.17568] JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models

Abstract page for arXiv paper 2505.17568: JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models

arXiv - AI · 4 min · 2 months ago

[2505.15504] Exploiting Low-Dimensional Manifold of Features for Few-Shot Whole Slide Image Classification

Abstract page for arXiv paper 2505.15504: Exploiting Low-Dimensional Manifold of Features for Few-Shot Whole Slide Image Classification

arXiv - AI · 4 min · 2 months ago

[2505.13109] FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference

Abstract page for arXiv paper 2505.13109: FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference

arXiv - Machine Learning · 4 min · 2 months ago

[2505.12186] Self-Destructive Language Model

Abstract page for arXiv paper 2505.12186: Self-Destructive Language Model

arXiv - Machine Learning · 4 min · 2 months ago

[2502.01481] Intrinsic Entropy of Context Length Scaling in LLMs

Abstract page for arXiv paper 2502.01481: Intrinsic Entropy of Context Length Scaling in LLMs

arXiv - Machine Learning · 4 min · 2 months ago

[2505.02881] Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

Abstract page for arXiv paper 2505.02881: Rewriting Pre-Training Data Boosts LLM Performance in Math and Code

arXiv - Machine Learning · 4 min · 2 months ago
