AAAI 2025

ml4xcube: Machine Learning Toolkits for Earth System Data Cubes

Abstract

Rapidly changing climate conditions and the increase in extreme events pose severe challenges to human life and infrastructure, requiring sophisticated analytical capabilities for hazard prediction and disaster risk management. Earth System Data Cubes (ESDCs) have become an essential tool in Earth System Sciences (ESS) by organizing large-scale, multivariate environmental datasets into a structured, scalable, and analysis-ready format. However, modern machine learning (ML) techniques are not yet being utilized to their full potential on ESDCs, owing to a lack of proper tooling, domain-specific challenges, and high barriers to entry for practitioners. We introduce ml4xcube, an open-source Python framework designed to assist ESS domain experts in applying ML techniques to ESDCs for advanced analysis and prediction of environmental variables and impacts. Through a comprehensive suite of tools, it addresses challenges specific to ESS data, such as non-uniform data distribution caused by dynamic gaps and the spatio-temporal autocorrelation of environmental variables. Its modular architecture covers the complete analysis process, from data exploration and preparation to model development, result interpretation, and evaluation. With support for distributed computing, it handles large ESDC datasets efficiently. To ease adoption, it includes extensive documentation and tutorial notebooks. We demonstrate ml4xcube's capabilities through three examples that showcase its potential for integrating machine learning with ESDC data.
