Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Researchers asked ChatGPT, Gemini and Claude which jobs are most exposed to AI. The chatbots wildly disagree

A study reveals that AI models disagree on which jobs are most vulnerable to automation, highlighting the unreliability of AI-generated e...

AI Tools & Products · 4 min ·
I stopped treating ChatGPT like Google — and everything suddenly clicked

I stopped using ChatGPT like Google and started treating it like a thinking partner — here’s why that simple shift made the AI dramatical...

AI Tools & Products · 8 min ·
Hackers abuse Google ads, Claude.ai chats to push Mac malware

AI Tools & Products · 6 min ·

All Content

[2502.01941] Semantic Integrity Matters: Benchmarking and Preserving High-Density Reasoning in KV Cache Compression

Abstract page for arXiv paper 2502.01941: Semantic Integrity Matters: Benchmarking and Preserving High-Density Reasoning in KV Cache Comp...

arXiv - AI · 4 min ·
[2407.04183] Seeing Like an AI: How LLMs Apply (and Misapply) Wikipedia Neutrality Norms

Abstract page for arXiv paper 2407.04183: Seeing Like an AI: How LLMs Apply (and Misapply) Wikipedia Neutrality Norms

arXiv - AI · 4 min ·
[2603.09652] MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants

Abstract page for arXiv paper 2603.09652: MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assis...

arXiv - AI · 4 min ·
[2512.05439] BEAVER: An Efficient Deterministic LLM Verifier

Abstract page for arXiv paper 2512.05439: BEAVER: An Efficient Deterministic LLM Verifier

arXiv - AI · 3 min ·
[2506.00886] Position: Agent Should Invoke External Tools ONLY When Epistemically Necessary

Abstract page for arXiv paper 2506.00886: Position: Agent Should Invoke External Tools ONLY When Epistemically Necessary

arXiv - AI · 4 min ·
[2510.01569] InvThink: Premortem Reasoning for Safer Language Models

Abstract page for arXiv paper 2510.01569: InvThink: Premortem Reasoning for Safer Language Models

arXiv - AI · 3 min ·
[2508.16571] LLM-Based Agents for Competitive Landscape Mapping in Drug Asset Due Diligence

Abstract page for arXiv paper 2508.16571: LLM-Based Agents for Competitive Landscape Mapping in Drug Asset Due Diligence

arXiv - AI · 4 min ·
[2605.08063] Flow-OPD: On-Policy Distillation for Flow Matching Models

Abstract page for arXiv paper 2605.08063: Flow-OPD: On-Policy Distillation for Flow Matching Models

arXiv - AI · 4 min ·
[2605.08060] The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents

Abstract page for arXiv paper 2605.08060: The Memory Curse: How Expanded Recall Erodes Cooperative Intent in LLM Agents

arXiv - AI · 4 min ·
[2605.08057] CA-SQL: Complexity-Aware Inference Time Reasoning for Text-to-SQL via Exploration and Compute Budget Allocation

Abstract page for arXiv paper 2605.08057: CA-SQL: Complexity-Aware Inference Time Reasoning for Text-to-SQL via Exploration and Compute B...

arXiv - AI · 3 min ·
[2605.07985] Dooly: Configuration-Agnostic, Redundancy-Aware Profiling for LLM Inference Simulation

Abstract page for arXiv paper 2605.07985: Dooly: Configuration-Agnostic, Redundancy-Aware Profiling for LLM Inference Simulation

arXiv - AI · 4 min ·
[2605.07830] CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

Abstract page for arXiv paper 2605.07830: CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios

arXiv - AI · 3 min ·
[2605.07817] GazeVLM: Active Vision via Internal Attention Control for Multimodal Reasoning

Abstract page for arXiv paper 2605.07817: GazeVLM: Active Vision via Internal Attention Control for Multimodal Reasoning

arXiv - AI · 4 min ·
[2605.07731] Benchmarking EngGPT2-16B-A3B against Comparable Italian and International Open-source LLMs

Abstract page for arXiv paper 2605.07731: Benchmarking EngGPT2-16B-A3B against Comparable Italian and International Open-source LLMs

arXiv - AI · 4 min ·
[2605.07723] LLM hallucinations in the wild: Large-scale evidence from non-existent citations

Abstract page for arXiv paper 2605.07723: LLM hallucinations in the wild: Large-scale evidence from non-existent citations

arXiv - AI · 4 min ·
[2605.07725] SOD: Step-wise On-policy Distillation for Small Language Model Agents

Abstract page for arXiv paper 2605.07725: SOD: Step-wise On-policy Distillation for Small Language Model Agents

arXiv - AI · 3 min ·
[2605.07717] The AI-Native Large-Scale Agile Software Development Manifesto

Abstract page for arXiv paper 2605.07717: The AI-Native Large-Scale Agile Software Development Manifesto

arXiv - AI · 3 min ·
[2605.07705] Cross-Attention and Encoder-Decoder Transformers: A Logical Characterization

Abstract page for arXiv paper 2605.07705: Cross-Attention and Encoder-Decoder Transformers: A Logical Characterization

arXiv - AI · 3 min ·
[2605.07699] DRIP-R: A Benchmark for Decision-Making and Reasoning Under Real-World Policy Ambiguity in the Retail Domain

Abstract page for arXiv paper 2605.07699: DRIP-R: A Benchmark for Decision-Making and Reasoning Under Real-World Policy Ambiguity in the ...

arXiv - AI · 3 min ·
[2605.07647] Quality-Conditioned Agreement in Automated Short Answer Scoring: Mid-Range Degradation and the Impact of Task-Specific Adaptation

Abstract page for arXiv paper 2605.07647: Quality-Conditioned Agreement in Automated Short Answer Scoring: Mid-Range Degradation and the ...

arXiv - AI · 4 min ·