Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

I asked ChatGPT and Gemini to generate a world map

submitted by /u/Pitiful-Entrance5769

Reddit - Artificial Intelligence · 1 min · 13 minutes ago

Llms

Can't wait to use Mythos model - Anthropic refuses to release Claude Mythos publicly: model found thousands of zero-days across every major OS and browser. Launches Project Glasswing with Apple, Microsoft, Google, and others for defensive use.

Anthropic announced Project Glasswing, a defensive cybersecurity initiative with Apple, Microsoft, Google, AWS, NVIDIA, CrowdStrike, and ...

Reddit - Artificial Intelligence · 1 min · 13 minutes ago

Llms

Studying Sutton and Barto's RL book and its connections to RL for LLMs (e.g., tool use, math reasoning, agents, and so on)? [D]

Hi everyone, I graduated from a Master in Math program last summer. In recent months, I have been trying to understand more about ML/DL a...

Reddit - Machine Learning · 1 min · 43 minutes ago

All Content

Llms

[2506.01062] SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models

Abstract page for arXiv paper 2506.01062: SealQA: Raising the Bar for Reasoning in Search-Augmented Language Models

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2505.19255] VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

Abstract page for arXiv paper 2505.19255: VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2504.04372] Assessing the Impact of Code Changes on the Fault Localizability of Large Language Models

Abstract page for arXiv paper 2504.04372: Assessing the Impact of Code Changes on the Fault Localizability of Large Language Models

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2602.00485] Replacing Parameters with Preferences: Federated Alignment of Heterogeneous Vision-Language Models

Abstract page for arXiv paper 2602.00485: Replacing Parameters with Preferences: Federated Alignment of Heterogeneous Vision-Language Models

arXiv - AI · 4 min · about 1 month ago

Llms

[2601.03604] Interleaved Tool-Call Reasoning for Protein Function Understanding

Abstract page for arXiv paper 2601.03604: Interleaved Tool-Call Reasoning for Protein Function Understanding

arXiv - AI · 3 min · about 1 month ago

Llms

[2512.10534] Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Abstract page for arXiv paper 2512.10534: Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforceme...

arXiv - AI · 4 min · about 1 month ago

Llms

[2601.22571] PerfGuard: A Performance-Aware Agent for Visual Content Generation

Abstract page for arXiv paper 2601.22571: PerfGuard: A Performance-Aware Agent for Visual Content Generation

arXiv - AI · 4 min · about 1 month ago

Llms

[2512.14106] HydroGEM: A Self Supervised Zero Shot Hybrid TCN Transformer Foundation Model for Continental Scale Streamflow Quality Control

Abstract page for arXiv paper 2512.14106: HydroGEM: A Self Supervised Zero Shot Hybrid TCN Transformer Foundation Model for Continental S...

arXiv - AI · 4 min · about 1 month ago

Llms

[2512.07081] ClinNoteAgents: An LLM Multi-Agent System for Predicting and Interpreting Heart Failure 30-Day Readmission from Clinical Notes

Abstract page for arXiv paper 2512.07081: ClinNoteAgents: An LLM Multi-Agent System for Predicting and Interpreting Heart Failure 30-Day ...

arXiv - AI · 4 min · about 1 month ago

Llms

[2505.13770] Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Inference

Abstract page for arXiv paper 2505.13770: Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Infe...

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2511.21033] Towards Trustworthy Legal AI through LLM Agents and Formal Reasoning

Abstract page for arXiv paper 2511.21033: Towards Trustworthy Legal AI through LLM Agents and Formal Reasoning

arXiv - AI · 4 min · about 1 month ago

Llms

[2511.04439] CoRPO: Adding a Correctness Bias to GRPO Improves Generalization

Abstract page for arXiv paper 2511.04439: CoRPO: Adding a Correctness Bias to GRPO Improves Generalization

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2510.08966] Beyond Prefixes: Graph-as-Memory Cross-Attention for Knowledge Graph Completion with Large Language Models

Abstract page for arXiv paper 2510.08966: Beyond Prefixes: Graph-as-Memory Cross-Attention for Knowledge Graph Completion with Large Lang...

arXiv - AI · 4 min · about 1 month ago

Llms

[2505.04997] Foam-Agent: Towards Automated Intelligent CFD Workflows

Abstract page for arXiv paper 2505.04997: Foam-Agent: Towards Automated Intelligent CFD Workflows

arXiv - AI · 3 min · about 1 month ago

Llms

[2503.07928] The StudyChat Dataset: Analyzing Student Dialogues With ChatGPT in an Artificial Intelligence Course

Abstract page for arXiv paper 2503.07928: The StudyChat Dataset: Analyzing Student Dialogues With ChatGPT in an Artificial Intelligence C...

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.05500] POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

Abstract page for arXiv paper 2603.05500: POET-X: Memory-efficient LLM Training by Scaling Orthogonal Transformation

arXiv - Machine Learning · 3 min · about 1 month ago

Llms

[2603.05494] Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

Abstract page for arXiv paper 2603.05494: Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2603.05488] Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

Abstract page for arXiv paper 2603.05488: Reasoning Theater: Disentangling Model Beliefs from Chain-of-Thought

arXiv - Machine Learning · 3 min · about 1 month ago

Llms

[2603.05471] Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

Abstract page for arXiv paper 2603.05471: Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

arXiv - AI · 4 min · about 1 month ago

Llms

[2603.05432] Ensembling Language Models with Sequential Monte Carlo

Abstract page for arXiv paper 2603.05432: Ensembling Language Models with Sequential Monte Carlo

arXiv - Machine Learning · 4 min · about 1 month ago
