[2602.01379] WAKESET: A Large-Scale, High-Reynolds Number Flow Dataset for Machine Learning of Turbulent Wake Dynamics

Summary

The paper introduces WAKESET, a large-scale dataset of high-Reynolds number turbulent wake flows for machine learning, addressing the scarcity of high-fidelity training data in fluid dynamics.

Why It Matters

WAKESET fills a critical gap in the availability of high-quality datasets for training machine learning models in fluid dynamics, particularly for complex, high-Reynolds number flows. This dataset can significantly enhance the development of predictive models and improve engineering applications in turbulent environments.

Key Takeaways

  • WAKESET provides over 4,000 high-fidelity simulations of turbulent flows.
  • The dataset focuses on practical engineering problems, enhancing model training for real-world applications.
  • High-Reynolds number data is crucial for developing robust machine learning models in fluid dynamics.

Physics > Fluid Dynamics
arXiv:2602.01379 (physics)
Submitted on 1 Feb 2026 (v1), last revised 22 Feb 2026 (this version, v2)

Title: WAKESET: A Large-Scale, High-Reynolds Number Flow Dataset for Machine Learning of Turbulent Wake Dynamics

Authors: Zachary Cooper-Baldock, Paulo E. Santos, Russell S.A. Brinkworth, Karl Sammut

Abstract: Machine learning (ML) offers transformative potential for computational fluid dynamics (CFD), promising to accelerate simulations, improve turbulence modelling, and enable real-time flow prediction and control, capabilities that could fundamentally change how engineers approach fluid dynamics problems. However, the exploration of ML in fluid dynamics is critically hampered by the scarcity of large, diverse, and high-fidelity datasets suitable for training robust models. This limitation is particularly acute for highly turbulent flows, which dominate practical engineering applications yet remain computationally prohibitive to simulate at scale. High-Reynolds number turbulent datasets are essential for ML models to learn the complex, multi-scale physics characteristic of real-world flows, enabling generalisation beyond the simplified, low-Reynolds number regimes often represented in existing dat...
