[2602.19789] Stop Preaching and Start Practising Data Frugality for Responsible Development of AI

arXiv - Machine Learning · 4 min read

Summary

This paper advocates for the machine learning community to adopt data frugality in AI development, emphasizing its environmental benefits and practical applications.

Why It Matters

As AI development increasingly relies on large datasets, the environmental impact of data scaling has become a critical concern. This paper highlights the need for a shift towards data frugality, which can reduce energy consumption and carbon emissions while maintaining model performance. It addresses a significant gap in the current practices of AI development, making it relevant for researchers and practitioners focused on sustainable AI.

Key Takeaways

  • Data frugality can significantly reduce training energy consumption.
  • Larger datasets yield diminishing returns in performance gains.
  • The environmental impacts of data scaling are substantial and often overlooked.
  • Empirical evidence supports the effectiveness of coreset-based subset selection.
  • Actionable recommendations are provided for implementing data frugality.

Computer Science > Machine Learning
arXiv:2602.19789 (cs) · [Submitted on 23 Feb 2026]

Title: Stop Preaching and Start Practising Data Frugality for Responsible Development of AI
Authors: Sophia N. Wilson, Guðrún Fjóla Guðmundsdóttir, Andrew Millard, Raghavendra Selvan, Sebastian Mair

Abstract: This position paper argues that the machine learning community must move from preaching to practising data frugality for responsible artificial intelligence (AI) development. For a long time, progress has been equated with ever-larger datasets, driving remarkable advances but now yielding diminishing performance gains alongside rising energy use and carbon emissions. While awareness of data-frugal approaches has grown, their adoption has remained rhetorical, and data scaling continues to dominate development practice. We argue that this gap between preaching and practice must be closed, as continued data scaling entails substantial and under-accounted environmental impacts. To ground our position, we provide indicative estimates of the energy use and carbon emissions associated with the downstream use of ImageNet-1K. We then present empirical evidence that data frugality is both practical and beneficial, demonstrating that coreset-based subset selection can substantially reduce training energy consumption…
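The abstract names coreset-based subset selection as the mechanism for cutting training energy, but this summary doesn't specify which coreset method the authors evaluate. The sketch below illustrates one widely used baseline, k-center greedy selection over feature embeddings; the function name, the `embeddings` input, and the commented usage lines are hypothetical illustrations, not the paper's code.

```python
# Minimal sketch of one common coreset strategy: k-center greedy selection.
# Assumption: each training example has a feature embedding (e.g., from a
# model's penultimate layer), and budget <= number of examples.
import numpy as np


def k_center_greedy(embeddings: np.ndarray, budget: int, seed: int = 0) -> np.ndarray:
    """Pick `budget` indices whose points approximately cover the dataset.

    Greedily adds the point farthest from the current selection, the
    standard 2-approximation for the k-center objective.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    selected = [int(rng.integers(n))]  # arbitrary first center
    # Distance from every point to its nearest selected center.
    dists = np.linalg.norm(embeddings - embeddings[selected[0]], axis=1)
    while len(selected) < budget:
        idx = int(np.argmax(dists))  # farthest point from current centers
        selected.append(idx)
        new_d = np.linalg.norm(embeddings - embeddings[idx], axis=1)
        dists = np.minimum(dists, new_d)  # refresh nearest-center distances
    return np.array(selected)


# Hypothetical usage: train on the selected subset instead of the full set.
# feats = embed(train_images)                       # assumed embedding step
# subset_idx = k_center_greedy(feats, budget=len(feats) // 10)
```

Training only on the selected subset, rather than the full dataset, is what yields the reduction in training energy consumption that the abstract describes; the subset size (here a hypothetical 10%) sets the trade-off between savings and accuracy.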

