[2510.25166] A Study on Inference Latency for Vision Transformers on Mobile Devices

arXiv - Machine Learning · 3 min read

Summary

This study quantitatively analyzes the inference latency of 190 real-world vision transformers (ViTs) on mobile devices, compares them with 102 convolutional neural networks (CNNs), and identifies the factors that influence ViT latency on mobile hardware.

Why It Matters

As mobile devices increasingly run advanced machine learning workloads, understanding how vision transformers perform on them is crucial for optimizing real-world applications. This research offers measured data that can inform developers and researchers about the efficiency of ViTs relative to traditional CNNs and guide future mobile AI deployments.

Key Takeaways

  • The study measures the on-device inference latency of 190 real-world ViTs (see the measurement sketch after this list).
  • It compares ViTs with 102 CNNs to highlight performance differences.
  • A dataset of measured latencies for 1000 synthetic ViTs was built to enable accurate latency prediction for new architectures.
  • Insights from this research can guide the design of more efficient mobile AI applications.
  • Understanding latency factors is essential for optimizing real-world applications.
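
As a concrete picture of how such latency measurements are typically taken, here is a minimal PyTorch sketch: warm-up runs followed by timed runs, summarized by the median. This is not the paper's on-device protocol; the timm model name, input shape, and run counts are illustrative assumptions.

```python
# Minimal latency-measurement sketch (illustrative, not the paper's
# on-device protocol): warm-up runs amortize one-time overheads,
# then timed runs are summarized by the median.
import time

import timm  # assumed available; the model name below is illustrative
import torch

model = timm.create_model("vit_tiny_patch16_224", pretrained=False).eval()
x = torch.randn(1, 3, 224, 224)  # single image, as in mobile inference

with torch.no_grad():
    for _ in range(10):   # warm-up runs
        model(x)
    times = []
    for _ in range(50):   # timed runs
        t0 = time.perf_counter()
        model(x)
        times.append(time.perf_counter() - t0)

times.sort()
print(f"median latency: {1000 * times[len(times) // 2]:.1f} ms")
```

The median (rather than the mean) is a common choice here because it is robust to occasional scheduling hiccups that inflate individual runs.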

Computer Science > Computer Vision and Pattern Recognition

arXiv:2510.25166 (cs). Submitted on 29 Oct 2025 (v1), last revised 18 Feb 2026 (this version, v2).

Title: A Study on Inference Latency for Vision Transformers on Mobile Devices
Authors: Zhuojin Li, Marco Paolieri, Leana Golubchik

Abstract: Given the significant advances in machine learning techniques on mobile devices, particularly in the domain of computer vision, in this work we quantitatively study the performance characteristics of 190 real-world vision transformers (ViTs) on mobile devices. Through a comparison with 102 real-world convolutional neural networks (CNNs), we provide insights into the factors that influence the latency of ViT architectures on mobile devices. Based on these insights, we develop a dataset including measured latencies of 1000 synthetic ViTs with representative building blocks and state-of-the-art architectures from two machine learning frameworks and six mobile platforms. Using this dataset, we show that inference latency of new ViTs can be predicted with sufficient accuracy for real-world applications.

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Performance (cs.PF)
Cite as: arXiv:2510.25166 [cs.CV] (arXiv:2510.25166v2 [cs.CV] for this version), https://doi.org/10.48550/arXiv.2510.25166
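
To make the prediction claim concrete, the following sketch trains a simple regressor to map architectural descriptors to latency. It is a hedged stand-in, not the authors' method: the feature set (depth, embedding dimension, heads, MLP ratio, token count), the synthetic stand-in latencies, and the random-forest model are all assumptions; in the paper, the targets are latencies measured on real devices.

```python
# Hedged sketch of latency prediction from architectural features.
# The features, the synthetic stand-in latencies, and the model
# choice are assumptions, not the paper's method; in practice the
# targets would be latencies measured on-device.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical descriptors per synthetic ViT:
# depth, embedding dim, heads, MLP ratio, token count.
low = [4, 128, 2, 2.0, 49]
high = [24, 1024, 16, 4.0, 196]
X = rng.uniform(low, high, size=(1000, 5))

# Stand-in latency (ms): roughly proportional to depth * dim * tokens,
# plus measurement noise. Real targets come from device measurements.
y = 0.05 * X[:, 0] * X[:, 1] * X[:, 4] / 1000 + rng.normal(0.0, 1.0, 1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
reg = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

pred = reg.predict(X_te)
mape = np.mean(np.abs(pred - y_te) / np.maximum(np.abs(y_te), 1e-6))
print(f"held-out MAPE: {100 * mape:.1f}%")
```

A tree ensemble is one plausible choice for this kind of tabular regression because it handles nonlinear interactions between depth, width, and token count without feature engineering; the paper itself does not prescribe this model.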
