[2603.28103] Transcription and Recognition of Italian Parliamentary Speeches Using Vision-Language Models
Abstract page for arXiv paper 2603.28103: Transcription and Recognition of Italian Parliamentary Speeches Using Vision-Language Models
Abstract page for arXiv paper 2603.28086: MOSS-VoiceGenerator: Create Realistic Voices with Natural Language Descriptions
Abstract page for arXiv paper 2603.28069: MolmoPoint: Better Pointing for VLMs with Grounding Tokens
Abstract page for arXiv paper 2603.28066: Synonymix: Unified Group Personas for Generative Simulations
Abstract page for arXiv paper 2603.28032: CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied ...
Abstract page for arXiv paper 2603.27987: Beyond Dataset Distillation: Lossless Dataset Concentration via Diffusion-Assisted Distribution...
Abstract page for arXiv paper 2603.27991: ViviDoc: Generating Interactive Documents through Human-Agent Collaboration
Abstract page for arXiv paper 2603.27918: Adversarial Attacks on Multimodal Large Language Models: A Comprehensive Survey
Abstract page for arXiv paper 2603.27982: CDH-Bench: A Commonsense-Driven Hallucination Benchmark for Evaluating Visual Fidelity in Visio...
Abstract page for arXiv paper 2603.27942: JaWildText: A Benchmark for Vision-Language Models on Japanese Scene Text Understanding
Abstract page for arXiv paper 2603.27817: Towards Context-Aware Image Anonymization with Multi-Agent Reasoning
Abstract page for arXiv paper 2603.27886: AI-ready design of realistic 2D materials and interfaces with Mat3ra-2D
Abstract page for arXiv paper 2603.27868: A Revealed Preference Framework for AI Alignment
Abstract page for arXiv paper 2603.27798: Towards Emotion Recognition with 3D Pointclouds Obtained from Facial Expression Images
Abstract page for arXiv paper 2603.27756: Heracles: Bridging Precise Tracking and Generative Synthesis for General Humanoid Control
Abstract page for arXiv paper 2603.27747: AI-Powered Facial Mask Removal Is Not Suitable For Biometric Identification
Abstract page for arXiv paper 2603.27745: Needle in the Repo: A Benchmark for Maintainability in AI-Generated Repository Edits
Abstract page for arXiv paper 2603.27727: Suppression of $^{14}\mathrm{C}$ photon hits in large liquid scintillator detectors via spatiot...
Abstract page for arXiv paper 2603.27716: The role of neuromorphic principles in the future of biomedicine and healthcare
Abstract page for arXiv paper 2603.27705: RAP: Retrieve, Adapt, and Prompt-Fit for Training-Free Few-Shot Medical Image Segmentation