[2604.05253] Spike Hijacking in Late-Interaction Retrieval

arXiv - Machine Learning · April 08, 2026 · 3 min read


Computer Science > Information Retrieval

arXiv:2604.05253 (cs)

[Submitted on 6 Apr 2026]

Title: Spike Hijacking in Late-Interaction Retrieval

Authors: Karthik Suresh, Tushar Vatsa, Tracy King, Asim Kadav, Michael Friedrich

Abstract: Late-interaction retrieval models rely on hard maximum similarity (MaxSim) to aggregate token-level similarities. Although effective, this winner-take-all pooling rule may structurally bias training dynamics. We provide a mechanistic study of gradient routing and robustness in MaxSim-based retrieval. In a controlled synthetic environment with in-batch contrastive training, we demonstrate that MaxSim induces significantly higher patch-level gradient concentration than smoother alternatives such as Top-k pooling and softmax aggregation. While sparse routing can improve early discrimination, it also increases sensitivity to document length: as the number of document patches grows, MaxSim degrades more sharply than mild smoothing variants. We corroborate these findings on a real-world multi-vector retrieval benchmark, where controlled document-length sweeps reveal similar brittleness under hard max pooling. Together, our results isolate pooling-induced gradient concentration as a structural property of late-interaction retrieval and highlight a sparsity-robustness tradeoff. These findings motivate principled ...
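The three pooling rules named in the abstract can be made concrete with a minimal Python sketch. It is an illustration under assumptions, not the authors' implementation: the abstract gives no formulas, so the function names (maxsim, topk_pool, softmax_pool), the dot-product similarity, the mean over the k largest entries, and the temperature parameter are all hypothetical choices modeled on the common ColBERT-style definition of MaxSim.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def similarity_matrix(query, doc):
    """Rows are query tokens, columns are document patches."""
    return [[dot(q, d) for d in doc] for q in query]


def maxsim(sim):
    """Hard MaxSim: each query token keeps only its best-matching patch."""
    return sum(max(row) for row in sim)


def topk_pool(sim, k):
    """Assumed Top-k variant: average the k largest similarities per query token."""
    total = 0.0
    for row in sim:
        top = sorted(row, reverse=True)[:k]
        total += sum(top) / len(top)
    return total


def softmax_pool(sim, temperature):
    """Assumed softmax variant: softmax-weighted average of each row."""
    total = 0.0
    for row in sim:
        m = max(row)  # subtract the row max for numerical stability
        weights = [math.exp((s - m) / temperature) for s in row]
        z = sum(weights)
        total += sum(w * s for w, s in zip(weights, row)) / z
    return total


# Toy data (made up for illustration): 2 query tokens, 3 document patches.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
sim = similarity_matrix(query, doc)

print(maxsim(sim))             # 2.0
print(topk_pool(sim, 2))       # 1.5
print(softmax_pool(sim, 1.0))  # smoother score, below the MaxSim value
```

With k = 1 the top-k sketch coincides with MaxSim, and a small temperature pushes the softmax sketch toward it. Under MaxSim only the single winning patch per query token receives gradient, whereas larger k or temperature spread the score, and therefore the gradient, over more patches.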

Originally published on April 08, 2026. Curated by AI News.
