Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

LLMs

Claude mythos preview GameJam contestant

Claude was able to create this Indie Game Jam Challenge with simple user-guided prompts in the Godot engine with Mythos Preview with Zer...

Reddit - Artificial Intelligence · 1 min ·
LLMs

I implemented a Meta paper [P]

GitHub link: genji970/Scaling-Test-Time-Compute-for-Agentic-Coding-: paper implementation of Meta AI. Paper link: https://arxiv.org/abs/...

Reddit - Machine Learning · 1 min ·
LLMs

How do I actually learn AI/ML deeply enough to build systems (not just follow tutorials)? [D]

I'm stuck in a loop where I consume AI/ML content but can’t move towards actually building real systems. - I understand things at a surfa...

Reddit - Machine Learning · 1 min ·

All Content

LLMs

[2510.19842] DAG-Math: Graph-of-Thought Guided Mathematical Reasoning in LLMs

Abstract page for arXiv paper 2510.19842: DAG-Math: Graph-of-Thought Guided Mathematical Reasoning in LLMs

arXiv - Machine Learning · 4 min ·
LLMs

[2603.01399] Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verification

Abstract page for arXiv paper 2603.01399: Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verifi...

arXiv - Machine Learning · 4 min ·
LLMs

[2510.04284] Doctor-R1: Mastering Clinical Inquiry with Experiential Agentic Reinforcement Learning

Abstract page for arXiv paper 2510.04284: Doctor-R1: Mastering Clinical Inquiry with Experiential Agentic Reinforcement Learning

arXiv - AI · 4 min ·
LLMs

[2510.04040] FaithCoT-Bench: Benchmarking Instance-Level Faithfulness of Chain-of-Thought Reasoning

Abstract page for arXiv paper 2510.04040: FaithCoT-Bench: Benchmarking Instance-Level Faithfulness of Chain-of-Thought Reasoning

arXiv - AI · 4 min ·
LLMs

[2510.03605] Understanding the Role of Training Data in Test-Time Scaling

Abstract page for arXiv paper 2510.03605: Understanding the Role of Training Data in Test-Time Scaling

arXiv - Machine Learning · 4 min ·
LLMs

[2603.01327] SWE-Adept: An LLM-Based Agentic Framework for Deep Codebase Analysis and Structured Issue Resolution

Abstract page for arXiv paper 2603.01327: SWE-Adept: An LLM-Based Agentic Framework for Deep Codebase Analysis and Structured Issue Resol...

arXiv - Machine Learning · 4 min ·
LLMs

[2603.01326] Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning

Abstract page for arXiv paper 2603.01326: Truth as a Trajectory: What Internal Representations Reveal About Large Language Model Reasoning

arXiv - Machine Learning · 4 min ·
LLMs

[2509.23465] ViTSP: A Vision Language Models Guided Framework for Solving Large-Scale Traveling Salesman Problems

Abstract page for arXiv paper 2509.23465: ViTSP: A Vision Language Models Guided Framework for Solving Large-Scale Traveling Salesman Pro...

arXiv - AI · 4 min ·
LLMs

[2509.23415] From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database Agents

Abstract page for arXiv paper 2509.23415: From Conversation to Query Execution: Benchmarking User and Tool Interactions for EHR Database ...

arXiv - AI · 4 min ·
LLMs

[2509.21993] Bilinear representation mitigates reversal curse and enables consistent model editing

Abstract page for arXiv paper 2509.21993: Bilinear representation mitigates reversal curse and enables consistent model editing

arXiv - Machine Learning · 4 min ·
LLMs

[2603.01236] AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in Large Vision-Language Models

Abstract page for arXiv paper 2603.01236: AgilePruner: An Empirical Study of Attention and Diversity for Adaptive Visual Token Pruning in...

arXiv - Machine Learning · 4 min ·
LLMs

[2509.21028] Who Gets Cited Most? Benchmarking Long-Context Numerical Reasoning on Scientific Articles

Abstract page for arXiv paper 2509.21028: Who Gets Cited Most? Benchmarking Long-Context Numerical Reasoning on Scientific Articles

arXiv - AI · 3 min ·
LLMs

[2603.01214] Reasoning Boosts Opinion Alignment in LLMs

Abstract page for arXiv paper 2603.01214: Reasoning Boosts Opinion Alignment in LLMs

arXiv - Machine Learning · 3 min ·
LLMs

[2509.12282] AISSISTANT: Human-AI Collaborative Review and Perspective Research Workflows in Data Science

Abstract page for arXiv paper 2509.12282: AISSISTANT: Human-AI Collaborative Review and Perspective Research Workflows in Data Science

arXiv - Machine Learning · 4 min ·
LLMs

[2603.01213] Can AI Agents Agree?

Abstract page for arXiv paper 2603.01213: Can AI Agents Agree?

arXiv - Machine Learning · 3 min ·
LLMs

[2509.03906] Toward Clinically Explainable AI for Medical Diagnosis: A Foundation Model with Human-Compatible Reasoning via Reinforcement Learning

Abstract page for arXiv paper 2509.03906: Toward Clinically Explainable AI for Medical Diagnosis: A Foundation Model with Human-Compatibl...

arXiv - AI · 4 min ·
LLMs

[2509.01938] EigenBench: A Comparative Behavioral Measure of Value Alignment

Abstract page for arXiv paper 2509.01938: EigenBench: A Comparative Behavioral Measure of Value Alignment

arXiv - Machine Learning · 4 min ·
LLMs

[2508.20729] Re4: Scientific Computing Agent with Rewriting, Resolution, Review and Revision

Abstract page for arXiv paper 2508.20729: Re4: Scientific Computing Agent with Rewriting, Resolution, Review and Revision

arXiv - AI · 4 min ·
LLMs

[2508.15030] Collab-REC: An LLM-based Agentic Framework for Balancing Recommendations in Tourism

Abstract page for arXiv paper 2508.15030: Collab-REC: An LLM-based Agentic Framework for Balancing Recommendations in Tourism

arXiv - AI · 3 min ·
LLMs

[2507.16145] SpiroLLM: Finetuning Pretrained LLMs to Understand Spirogram Time Series with Clinical Validation in COPD Reporting

Abstract page for arXiv paper 2507.16145: SpiroLLM: Finetuning Pretrained LLMs to Understand Spirogram Time Series with Clinical Validati...

arXiv - AI · 4 min ·