[2602.24181] A Mixed Diet Makes DINO An Omnivorous Vision Encoder
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.24181 (cs)
[Submitted on 27 Feb 2026]

Title: A Mixed Diet Makes DINO An Omnivorous Vision Encoder
Authors: Rishabh Kabra, Maks Ovsjanikov, Drew A. Hudson, Ye Xia, Skanda Koppula, Andre Araujo, Joao Carreira, Niloy J. Mitra

Abstract: Pre-trained vision encoders like DINOv2 have demonstrated exceptional performance on unimodal tasks. However, we observe that their feature representations are poorly aligned across different modalities. For instance, the feature embeddings for an RGB image and the corresponding depth map of the same scene exhibit a cosine similarity nearly identical to that of two random, unrelated images. To address this, we propose the Omnivorous Vision Encoder, a novel framework that learns a modality-agnostic feature space. We train the encoder with a dual objective: first, to maximize the feature alignment between different modalities of the same scene; and second, a distillation objective that anchors the learned representations to the output of a fully frozen teacher such as DINOv2. The resulting student encoder becomes "omnivorous," producing a consistent, powerful embedding for a given scene regardless of the input modality (RGB, depth, segmentation, etc.). This approach enables robust cross-modal understanding while retaining ...
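The abstract does not give the loss in closed form, but the dual objective it describes can be sketched as a sum of a cross-modal alignment term and a teacher-distillation term. The function name `omnivorous_loss`, the weighting factor `lam`, the cosine form of the alignment term, and the L2 form of the distillation term are all assumptions for illustration, not details from the paper:

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def omnivorous_loss(z_rgb, z_depth, z_teacher, lam=1.0):
    """Hypothetical sketch of the dual objective:
    (1) align the student's embeddings of two modalities of the same scene;
    (2) anchor the student's RGB embedding to a frozen teacher's output
        (e.g. DINOv2) via an L2 distillation term, weighted by `lam`.
    """
    align = 1.0 - cosine_sim(z_rgb, z_depth)          # alignment term
    distill = float(np.mean((z_rgb - z_teacher) ** 2)) # distillation term
    return align + lam * distill

# If the three embeddings already coincide, both terms vanish:
v = np.array([0.5, 0.5, 0.7])
print(omnivorous_loss(v, v, v))  # → 0.0
```

Under this formulation, minimizing the first term pushes all modalities of a scene toward one shared embedding, while the second term keeps that shared embedding close to the frozen teacher's (unimodal) feature space rather than collapsing to a trivial solution.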