[2508.00955] From Generator to Embedder: Harnessing Innate Abilities of Multimodal LLMs via Building Zero-Shot Discriminative Embedding Model
Computer Science > Machine Learning
arXiv:2508.00955 (cs)
[Submitted on 1 Aug 2025 (v1), last revised 27 Feb 2026 (this version, v2)]

Title: From Generator to Embedder: Harnessing Innate Abilities of Multimodal LLMs via Building Zero-Shot Discriminative Embedding Model
Authors: Yeong-Joon Ju, Seong-Whan Lee

Abstract: Adapting generative Multimodal Large Language Models (MLLMs) into universal embedding models typically demands resource-intensive contrastive pre-training, while traditional hard-negative mining methods suffer from severe false-negative contamination. In this paper, we propose a highly data-efficient framework that bypasses extensive pre-training to build a robust multimodal representation space. We first introduce a hierarchical embedding prompt that provides strong latent conditioning. By explicitly anchoring task definitions at the system level, this prompting strategy effectively bridges the modality gap and unlocks powerful zero-shot embedding capabilities. Building on this latent conditioning, we present Self-aware Hard Negative Sampling (SaHa). Unlike conventional candidate-space mining, SaHa shifts the mechanism to the query space by mapping retrieved candidates back to their owner queries to rigorously filter out semantic false negatives…
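The hierarchical prompting idea (task definition anchored at the system level, instance-level instruction and content in the user turn) can be sketched as below. This is a minimal illustration, not the paper's exact template; the function name, the one-word summarization cue, and the example strings are assumptions.

```python
def build_embedding_prompt(task_definition, instance_instruction, content):
    """Hierarchical embedding prompt (illustrative sketch):
    the task definition conditions the model at the system level,
    while the per-instance instruction and input go in the user turn."""
    return [
        # System level: explicit task anchoring, the source of the
        # "strong latent conditioning" described in the abstract.
        {"role": "system",
         "content": f"Task: {task_definition} "
                    "Represent the following input as a single embedding."},
        # Instance level: per-example instruction plus the actual content.
        {"role": "user",
         "content": f"{instance_instruction}\n{content}"},
    ]

messages = build_embedding_prompt(
    task_definition="Retrieve the image that matches the given caption.",
    instance_instruction="Represent this caption:",
    content="A dog running on the beach.",
)
```

In a chat-formatted MLLM, `messages` would be passed through the model's chat template and the hidden state of the final token taken as the embedding.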
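The query-space filtering behind SaHa, as described in the abstract, can be sketched as follows: each retrieved candidate is mapped back to the query that owns it, and candidates whose owner query is the anchor itself or is highly similar to it are discarded as likely semantic false negatives. This is a toy reconstruction under stated assumptions; `saha_filter`, `owner_of`, `query_sim`, and the threshold `tau` are all hypothetical names, not the paper's implementation.

```python
def saha_filter(anchor_qid, retrieved_cids, owner_of, query_sim, tau=0.9):
    """Query-space hard-negative filtering (illustrative sketch).

    Conventional mining compares the anchor query to candidates directly;
    here each retrieved candidate is mapped back to its owner query, and
    candidates owned by the anchor or by a near-duplicate query are dropped.
    """
    hard_negatives = []
    for cid in retrieved_cids:
        owner = owner_of[cid]            # map candidate back to its owner query
        if owner == anchor_qid:
            continue                     # the anchor's own positive, not a negative
        if query_sim(anchor_qid, owner) >= tau:
            continue                     # owner query ~ anchor: likely false negative
        hard_negatives.append(cid)       # survives filtering: a usable hard negative
    return hard_negatives

# Hypothetical toy data: three candidates, each paired with one owner query.
owner_of = {"c1": "q1", "c2": "q2", "c3": "q3"}
pairwise = {frozenset(("q1", "q2")): 0.95, frozenset(("q1", "q3")): 0.30}

def query_sim(a, b):
    """Stand-in for an embedding-space similarity between two queries."""
    return 1.0 if a == b else pairwise.get(frozenset((a, b)), 0.0)

hard_negs = saha_filter("q1", ["c1", "c2", "c3"], owner_of, query_sim)
```

Here `c1` is dropped as q1's own positive and `c2` is dropped because its owner `q2` near-duplicates `q1`, leaving only `c3` as a hard negative.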