[2603.23668] Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models
Computer Science > Hardware Architecture
arXiv:2603.23668 (cs)
[Submitted on 24 Mar 2026]

Title: Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models

Authors: Mohammad Saleh Vahdatpour, Yanqing Zhang

Abstract: The rapid deployment of machine learning across platforms, from milliwatt-class TinyML devices to large language models, has made energy efficiency a primary constraint for sustainable AI. Across these scales, performance and energy are increasingly limited by data movement and memory-system behavior rather than by arithmetic throughput alone. This work reviews energy-efficient software-hardware co-design methods spanning edge inference and training to datacenter-scale LLM serving, covering accelerator architectures (e.g., ASIC/FPGA dataflows, processing-/compute-in-memory designs) and system-level techniques (e.g., partitioning, quantization, scheduling, and runtime adaptation). We distill common design levers and trade-offs, and highlight recurring gaps, including limited cross-platform generalization, large and costly co-design search spaces, and inconsistent benchmarking across workloads and deployment settings. Finally, we outline a hierarchical decomposition perspective that maps optimization strategies to com...
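As a concrete illustration of one system-level lever the abstract names, below is a minimal sketch of symmetric per-tensor int8 post-training quantization in Python/NumPy. This is a generic example of the technique, not the co-design method proposed in the paper; the function names and array shapes are illustrative assumptions.

```python
# Minimal sketch of symmetric per-tensor int8 post-training quantization.
# Illustrative only -- not the method from arXiv:2603.23668.
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float weights to int8 using a single per-tensor scale."""
    scale = float(np.abs(w).max()) / 127.0  # largest magnitude maps to +/-127
    scale = scale if scale > 0 else 1.0     # guard against all-zero tensors
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

# Usage: a toy weight matrix shrinks 4x (int8 vs. float32), which reduces
# the data movement the abstract identifies as the dominant energy cost.
w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
print("max abs error:", np.abs(w - w_hat).max())
```

Per-tensor symmetric scaling is the simplest variant; deployed systems often use per-channel scales or asymmetric zero points to reduce accuracy loss at the same bit width.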