[2603.24934] CVA: Context-aware Video-text Alignment for Video Temporal Grounding
Computer Science > Machine Learning

arXiv:2603.24934 (cs) [Submitted on 26 Mar 2026]

Title: CVA: Context-aware Video-text Alignment for Video Temporal Grounding

Authors: Sungho Moon, Seunghun Lee, Jiwan Seo, Sunghoon Im

Abstract: We propose Context-aware Video-text Alignment (CVA), a novel framework that addresses a central challenge in video temporal grounding: achieving temporally sensitive video-text alignment that remains robust to irrelevant background context. Our framework is built on three key components. First, we propose Query-aware Context Diversification (QCD), a new data augmentation strategy that ensures only semantically unrelated content is mixed in: it builds a video-text similarity-based pool of replacement clips to simulate diverse contexts while preventing the "false negatives" caused by query-agnostic mixing. Second, we introduce the Context-invariant Boundary Discrimination (CBD) loss, a contrastive loss that enforces semantic consistency at challenging temporal boundaries, making their representations robust to contextual shifts and hard negatives. Third, we introduce the Context-enhanced Transformer Encoder (CTE), a hierarchical architecture that combines windowed self-attention and bidirectional cross-attention with learnable queries to capture multi-scale temporal context. Through the syn...
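To make the QCD idea concrete, the following is a minimal NumPy sketch of query-aware background replacement as the abstract describes it: candidate clips from other videos are admitted to the replacement pool only if their similarity to the text query is low, so no query-relevant ("false negative") content is mixed into the background. The function name `qcd_augment`, the cosine-similarity measure, and the threshold interface are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def cosine_sim(clips, query):
    # Cosine similarity between each row of `clips` (N, D) and `query` (D,).
    c = clips / np.linalg.norm(clips, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return c @ q

def qcd_augment(video, query_emb, candidate_clips, bg_mask,
                sim_thresh=0.3, rng=None):
    """Sketch of query-aware context diversification (hypothetical interface).

    video:           (T, D) clip features of the source video
    query_emb:       (D,)   text-query embedding
    candidate_clips: (N, D) clip features drawn from other videos
    bg_mask:         (T,)   bool, True for background (non-ground-truth) clips

    Only candidates whose similarity to the query falls BELOW `sim_thresh`
    enter the replacement pool, so semantically related content is never
    mixed in as fake background.
    """
    rng = rng or np.random.default_rng(0)
    sims = cosine_sim(candidate_clips, query_emb)
    pool = candidate_clips[sims < sim_thresh]   # semantically unrelated pool
    out = video.copy()
    bg_idx = np.flatnonzero(bg_mask)
    if len(pool) and len(bg_idx):
        # Replace each background clip with a random unrelated candidate.
        repl = pool[rng.integers(0, len(pool), size=len(bg_idx))]
        out[bg_idx] = repl
    return out                                  # ground-truth clips untouched
```

Foreground (ground-truth) clips are left intact, so the grounding target is unchanged while the surrounding context varies across augmented samples.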