
[2602.14401] pFedNavi: Structure-Aware Personalized Federated Vision-Language Navigation for Embodied AI

arXiv - AI · February 17, 2026 · 3 min read · Article

Summary

The paper presents pFedNavi, a personalized federated learning framework for Vision-Language Navigation (VLN) that addresses privacy concerns and improves navigation success rates through adaptive client-specific model adjustments.

Why It Matters

As VLN applications grow, privacy and data heterogeneity pose significant challenges. pFedNavi offers a novel solution that enhances performance while maintaining user privacy, making it relevant for developers and researchers in AI and robotics.

Key Takeaways

pFedNavi personalizes federated learning by identifying client-specific layers.
The framework outperforms traditional FedAvg methods in navigation tasks.
Improvements include up to 7.5% in navigation success and faster convergence rates.
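For context on the FedAvg baseline named in the takeaways, here is a minimal sketch of its aggregation step: the server averages client parameters, weighting each client by its number of training samples. The `Params` layout, the `fedavg` function name, and the toy values are assumptions made for illustration; none of them come from the paper.

```python
from typing import Dict, List

# Hypothetical layout: layer name -> flattened list of weights.
Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Sample-weighted average of client parameters (the FedAvg aggregation step)."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    averaged: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        # Each weight is the sample-weighted mean of that weight across clients.
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Toy example: two clients holding 1 and 3 samples respectively.
clients = [
    {"encoder": [1.0, 2.0], "decoder": [0.0]},
    {"encoder": [3.0, 4.0], "decoder": [4.0]},
]
global_model = fedavg(clients, client_sizes=[1, 3])
print(global_model)  # {'encoder': [2.5, 3.5], 'decoder': [3.0]}
```

Every client receives the same averaged model here, which is the single-global-model setting that personalized approaches aim to improve on.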

Computer Science > Computer Vision and Pattern Recognition

arXiv:2602.14401 (cs) [Submitted on 16 Feb 2026]

Title: pFedNavi: Structure-Aware Personalized Federated Vision-Language Navigation for Embodied AI

Authors: Qingqian Yang, Hao Wang, Sai Qian Zhang, Jian Li, Yang Hua, Miao Pan, Tao Song, Zhengwei Qi, Haibing Guan

Abstract: Vision-Language Navigation (VLN) requires large-scale trajectory instruction data from private indoor environments, raising significant privacy concerns. Federated Learning (FL) mitigates this by keeping data on-device, but vanilla FL struggles under VLN's extreme cross-client heterogeneity in environments and instruction styles, making a single global model suboptimal. This paper proposes pFedNavi, a structure-aware and dynamically adaptive personalized federated learning framework tailored for VLN. Our key idea is to personalize where it matters: pFedNavi adaptively identifies client-specific layers via layer-wise mixing coefficients, and performs fine-grained parameter fusion on the selected components (e.g., the encoder-decoder projection and environment-sensitive decoder layers) to balance global knowledge sharing with local specialization. We evaluate pFedNavi on two standard VLN benchmarks, R2R and RxR, using both ResNet and CLIP visual representations. Across ...
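The idea of layer-wise mixing coefficients can be illustrated with a small sketch. It assumes the simplest possible fusion rule, a per-layer convex combination of local and global weights, with coefficients fixed by hand. The abstract does not give the actual fusion rule or say how the coefficients are adapted, so the function `mix_layers`, the layer names, and the `alpha` values below are hypothetical and should not be read as the authors' method.

```python
from typing import Dict, List

# Hypothetical layout: layer name -> flattened list of weights.
Params = Dict[str, List[float]]


def mix_layers(global_params: Params, local_params: Params,
               alpha: Dict[str, float]) -> Params:
    """Blend each layer as alpha * local + (1 - alpha) * global.

    alpha = 0 adopts the shared global layer; alpha = 1 keeps the client's own.
    Layers with no coefficient are treated as fully shared (assumption).
    """
    mixed: Params = {}
    for name, global_weights in global_params.items():
        a = alpha.get(name, 0.0)
        if not 0.0 <= a <= 1.0:
            raise ValueError(f"mixing coefficient for {name!r} must lie in [0, 1]")
        local_weights = local_params[name]
        mixed[name] = [a * lw + (1.0 - a) * gw
                       for lw, gw in zip(local_weights, global_weights)]
    return mixed


# Toy example: share the encoder, half-personalize the projection,
# and keep the decoder fully local.
global_params = {"encoder": [1.0, 1.0], "projection": [2.0], "decoder": [4.0]}
local_params = {"encoder": [3.0, 3.0], "projection": [6.0], "decoder": [0.0]}
alpha = {"projection": 0.5, "decoder": 1.0}

personalized = mix_layers(global_params, local_params, alpha)
print(personalized)  # {'encoder': [1.0, 1.0], 'projection': [4.0], 'decoder': [0.0]}
```

In this sketch a larger coefficient means more local specialization for that layer, which mirrors the abstract's trade-off between global knowledge sharing and local specialization.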


