[2602.20066] HeatPrompt: Zero-Shot Vision-Language Modeling of Urban Heat Demand from Satellite Images

arXiv - AI 3 min read Article

Summary

The paper presents HeatPrompt, a zero-shot vision-language framework for estimating urban heat demand from satellite images, enhancing energy planning in data-scarce regions.

Why It Matters

As cities work to decarbonize space heating, accurate heat-demand maps are essential, yet most municipalities lack the detailed building-level data needed to calculate them. HeatPrompt estimates annual heat demand from satellite images, basic GIS data, and building-level features, offering lightweight support for heat planning in data-scarce regions.

Key Takeaways

  • HeatPrompt uses satellite images to estimate urban heat demand.
  • An MLP regressor trained on the VLM-generated captions shows a 93.7% R^2 uplift and a 30% lower mean absolute error (MAE) than the baseline model.
  • It is particularly useful for municipalities lacking detailed building data.
  • Qualitative analysis shows that high-impact tokens align with high-demand zones.
  • The approach supports energy planning in data-scarce areas.
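The two headline numbers are relative comparisons against a baseline model. The sketch below shows one way such figures could be computed. It assumes "uplift" means the relative change (new - baseline) / |baseline|, which the abstract does not define, and the demand values and predictions are invented for illustration, not taken from the paper.

```python
# Illustrative only: the data below is made up, and "uplift" is ASSUMED to
# mean relative change versus the baseline (the abstract does not define it).

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_change(baseline, new):
    """Signed relative change of `new` with respect to `baseline`."""
    return (new - baseline) / abs(baseline)

# Hypothetical annual heat demand values and two sets of predictions.
y_true = [100.0, 200.0, 300.0, 400.0]
baseline_pred = [160.0, 160.0, 340.0, 340.0]
model_pred = [120.0, 180.0, 320.0, 380.0]

r2_uplift = relative_change(r2_score(y_true, baseline_pred),
                            r2_score(y_true, model_pred))
mae_change = relative_change(mae(y_true, baseline_pred),
                             mae(y_true, model_pred))

print(f"R^2 uplift: {r2_uplift:.1%}, MAE change: {mae_change:.1%}")
```

A negative MAE change means the error shrank, so a reported 30% reduction would correspond to a value of -0.30 under this convention.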

Computer Science > Computer Vision and Pattern Recognition

arXiv:2602.20066 (cs) [Submitted on 23 Feb 2026]

Title: HeatPrompt: Zero-Shot Vision-Language Modeling of Urban Heat Demand from Satellite Images

Authors: Kundan Thota, Xuanhao Mu, Thorsten Schlachter, Veit Hagenmeyer

Abstract: Accurate heat-demand maps play a crucial role in decarbonizing space heating, yet most municipalities lack detailed building-level data needed to calculate them. We introduce HeatPrompt, a zero-shot vision-language energy modeling framework that estimates annual heat demand using semantic features extracted from satellite images, basic Geographic Information System (GIS), and building-level features. We feed pretrained Large Vision Language Models (VLMs) with a domain-specific prompt to act as an energy planner and extract the visual attributes such as roof age, building density, etc, from the RGB satellite image that correspond to the thermal load. A Multi-Layer Perceptron (MLP) regressor trained on these captions shows an $R^2$ uplift of 93.7% and shrinks the mean absolute error (MAE) by 30% compared to the baseline model. Qualitative analysis shows that high-impact tokens align with high-demand zones, offering lightweight support for heat planning in data-scarce regions.

Subjects: Computer Vision and Pattern...
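The abstract describes prompting a VLM for a caption and then training a regressor on those captions, but it does not say how the text is turned into numeric inputs. The sketch below shows one simple possibility, a bag-of-words token count, which is a stand-in chosen for illustration and not the authors' confirmed method. The prompt wording, the example captions, and the function names `tokenize`, `build_vocabulary`, and `caption_to_vector` are all hypothetical.

```python
import re
from collections import Counter

# HYPOTHETICAL prompt wording; the paper's actual prompt is not given
# in the abstract. It would be sent to a VLM together with the image.
PROMPT = (
    "You are an energy planner. Describe the visual attributes of this RGB "
    "satellite image that relate to heat demand, such as roof age and "
    "building density."
)

def tokenize(caption):
    """Lowercase the caption and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", caption.lower())

def build_vocabulary(captions):
    """Map each distinct token to a column index (alphabetical order)."""
    tokens = sorted({tok for cap in captions for tok in tokenize(cap)})
    return {tok: idx for idx, tok in enumerate(tokens)}

def caption_to_vector(caption, vocab):
    """Bag-of-words count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for tok, count in Counter(tokenize(caption)).items():
        if tok in vocab:
            vector[vocab[tok]] = count
    return vector

# Made-up captions standing in for VLM output.
captions = [
    "Dense old buildings with aged roofs",
    "Sparse new buildings with new roofs",
]
vocab = build_vocabulary(captions)
features = [caption_to_vector(cap, vocab) for cap in captions]

for cap, vec in zip(captions, features):
    print(cap, "->", vec)
```

Vectors like these could be concatenated with GIS and building-level features and passed to an MLP regressor. With token-level inputs, each column maps back to a word, which is one way a token's influence on the prediction could be inspected.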
