We fund the Earth system science community to create educational content through annual calls for educational pilots. The outcomes are made publicly available through the NFDI4Earth educational portal. The submitted proposals are evaluated by NFDI4Earth co-applicants according to the following criteria (read the full guideline here):
a. Relevance to NFDI4Earth
b. State-of-the-art content
c. Novelty (addressing the gaps in existing OERs in ESS)
d. Use of active teaching methods
e. Relevance to RDM in ESS
f. Potential for integration into NFDI4Earth curricula
Here's a peek at the educational pilots that have made the cut year after year, each adding something special to our understanding of Earth system sciences:
The increasing availability of satellite data in recent years opens up new applications in many areas of environmental science. The processing of large amounts of data, especially satellite data, is one of the most important pillars of environmental monitoring. However, these workflows also require extensive knowledge and appropriately trained personnel. Educational institutions such as universities have the task of adapting to these requirements. This adaptation must cover all steps of the complex process chain for processing satellite data. In this context, it is important not only to train technical skills but also to build the methodological competencies that enable students to critically evaluate their own work steps. To reduce complexity, a modular structure is used, which also makes it possible to take existing skills in the data processing chain into account.
Digital teaching and learning opportunities experienced an enormous boost, at the latest with the restrictions of the COVID-19 pandemic. Experience has shown that flexible content can be an essential element in motivating learning. The growing importance of MOOCs impressively underlines this development. These developments and effects represent an opportunity to transform the necessary content of satellite data processing into teaching and self-learning materials.
The objective is to develop Jupyter notebooks as self-learning material that provide a complete processing chain for a common classification task with remote sensing data.
The project comprises three main work stages, with the technical implementation of the modules as the central element: methodological development of learning modules on the process flow of satellite data processing, technical implementation of the modules in Jupyter notebooks with example datasets, and testing and assessment of the modules with MSc students in Environmental Sciences at TU Dresden.
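Purely as an illustration of the kind of processing chain meant here, a minimal sketch in Python with scikit-learn could look as follows; the synthetic data, feature dimensions, and classifier choice are assumptions for this sketch and not the pilot's actual material:

```python
# Minimal sketch of a supervised classification chain for multispectral pixels.
# Synthetic data stands in for satellite bands and reference labels; in the
# actual notebooks these would come from real satellite scenes.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(42)

# 1. "Preprocessing": a feature matrix of pixels x spectral bands (here random).
n_pixels, n_bands = 5000, 6
X = rng.normal(size=(n_pixels, n_bands))
y = (X[:, 3] - X[:, 2] > 0).astype(int)  # toy "vegetation vs. other" labels

# 2. Split the reference data into training and validation sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# 3. Train a classifier and 4. assess it on held-out reference pixels.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```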
Simulation and modelling of environmental processes are generally carried out at the grid level, where the study region is discretized into numerous grid points in the three dimensions of space plus time. Consequently, these simulations produce enormous data sets, and processing them exceeds the capacity of the average personal computer. At the same time, only some researchers have access to high-performance computing centers. However, calculations and modelling can be sped up on any PC by using compiled programming languages such as Fortran. This speeds up computations and can drastically reduce the CO2 footprint.
R is one of the languages widely used for data analysis, visualization, and presentation, and it has a large supporting community and thousands of packages. Fortran, on the other hand, is one of the fastest-performing languages (if not the fastest for number crunching) and one of the oldest; partly because of its age, interest in Fortran remains consistently low. Considering all of the above, there is a clear need for educational material that links R and Fortran.
This project aims to provide a single OER platform that serves as a one-stop resource for all R users looking for speed in general, and for users from the Environmental Science disciplines in particular.
Many developers have made efforts to speed up R using C++; however, to the best of the authors' knowledge, no comparable package integrates Fortran and R. Filling this gap is important because Fortran is well suited for numerical and scientific computations due to its array processing capabilities, performance, and efficiency. Computationally demanding models are commonly written in Fortran; thus, integrating Fortran and R will allow environmental modelers and researchers to minimize switching between different programming languages.
Lead isotopes are a well-known geochronological tool. However, lead isotope signatures can also be used to link non-ferrous metal objects to ore deposits because they do not fractionate in metallurgical processes. Based on this link, lead isotopes are a powerful tool to reconstruct past economic networks. When combined with other methods, they also help to decipher past interactions between humankind and the environment, especially the impact of mining activities. For these reasons, lead isotopes are a particularly well-suited example for an interdisciplinary approach that combines Earth System Sciences, Humanities, and Data Sciences.
The Educational Pilot “Teaching lead isotope geochemistry and application in archaeometry (LIGA-A)” will create a collection of educational materials that highlights this interlinkage and the importance of modern data-scientific approaches to the topic. The educational materials will stand on their own but follow the path of lead isotope signatures from their generation in ore deposits, through the metallurgical process and their measurement in the lab, to the proper handling of such data, their visualization and interpretation, and finally their application in concert with data from, e.g., archaeological excavations, textual sources, and sediment cores.
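As an illustration of the data handling and visualization steps mentioned above, a minimal sketch of a common lead isotope biplot (207Pb/206Pb versus 208Pb/206Pb) in Python is given below; all sample values are invented for this sketch and do not represent real ore fields or artefacts:

```python
# Minimal sketch: comparing artefact lead isotope ratios with ore field data
# in a 207Pb/206Pb vs. 208Pb/206Pb biplot. All values are invented.
import matplotlib.pyplot as plt

ores = {  # hypothetical ore field signatures
    "Ore field A": ([0.845, 0.848, 0.846], [2.080, 2.090, 2.085]),
    "Ore field B": ([0.860, 0.862, 0.861], [2.100, 2.110, 2.105]),
}
artefacts = ([0.847, 0.861], [2.086, 2.104])  # hypothetical artefact measurements

fig, ax = plt.subplots()
for name, (pb76, pb86) in ores.items():
    ax.scatter(pb76, pb86, label=name)
ax.scatter(*artefacts, marker="x", color="black", label="Artefacts")
ax.set_xlabel("207Pb/206Pb")
ax.set_ylabel("208Pb/206Pb")
ax.legend()
plt.show()
```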
To reach this aim, the educational resources will utilize a wide range of formats such as presentations, quizzes, animations, interactive visualizations, and coding exercises. At the same time, the Educational Pilot will focus on creating materials that are as inclusive as possible, both from a technical point of view and with regard to learners with different impairments.
For the efficient handling of large gridded datasets, the concept of a datacube has received much attention in recent years. A datacube stores datasets with common axes (such as latitude, longitude, and time) in a neatly organized and easily accessible format that, for example, allows fast data subsetting. Part of the convenience of a datacube comes from the data being stored in so-called chunks: standardized subsets of the data that fit into memory and allow efficient data access and parallel processing. However, accessing data on disk also adds an overhead in computation time from input/output operations. Access to the datacube is therefore only fast when the data is chunked in a way that is aligned with the analysis in question: if data is chunked for time-series access, it is inefficient to read a map (one time point has to be fetched from every chunk); conversely, if data is chunked for spatial processing, it is inefficient to read a time series that is spread across many chunks.
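This effect can be sketched with xarray and dask; the array size and chunk shapes below are illustrative assumptions, not the course's actual material:

```python
# Minimal sketch: the same array chunked for map access vs. time-series access.
import numpy as np
import xarray as xr

data = xr.DataArray(
    np.random.rand(365, 180, 360),
    dims=("time", "lat", "lon"),
    name="tas",
)

# Chunked for maps: each chunk holds one time step over the full spatial grid.
maps = data.chunk({"time": 1, "lat": 180, "lon": 360})

# Chunked for time series: each chunk holds the full time axis for a small tile.
series = data.chunk({"time": 365, "lat": 18, "lon": 36})

# Reading one map touches 1 chunk of `maps` but all 100 chunks of `series`;
# reading one time series touches 1 chunk of `series` but all 365 chunks of `maps`.
one_map = maps.isel(time=0).compute()
one_series = series.isel(lat=90, lon=180).compute()
```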
Proper chunking for efficient data reading and writing is especially important for the following reasons. The datasets that have to be handled in the Earth system sciences are becoming so large that they can no longer be loaded into working memory in full. When data has to be accessed on disk, the number of input/output operations should be minimized so that they do not limit computation speed. In addition, more and more data is available in the cloud and needs to be made cloud-compatible. Since data latency becomes even more important in the cloud, the data is compressed, and it is then essential to decompress only the data that is needed for the given analysis in order to optimize resources and computational speed. Both can be achieved by optimal chunking.
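For the cloud and compression aspects, chunked datacubes are often persisted as Zarr stores, where each chunk is written as a separate compressed object. A minimal sketch, assuming xarray with dask and zarr and using an illustrative chunk shape and file name, could look like this:

```python
# Minimal sketch: persisting a chunked datacube as a compressed Zarr store,
# so that later analyses only read and decompress the chunks they need.
# Array size, chunk shape, and file name are illustrative assumptions.
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"tas": (("time", "lat", "lon"), np.random.rand(365, 180, 360))}
).chunk({"time": 30, "lat": 90, "lon": 90})

# Each chunk becomes a separate compressed object, which suits cloud object storage.
ds.to_zarr("tas_cube.zarr", mode="w")

# Re-opening is lazy; only the chunks touched by the analysis are decompressed.
cube = xr.open_zarr("tas_cube.zarr")
monthly_mean = cube["tas"].isel(time=slice(0, 30)).mean().compute()
```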
This course provides interactive notebooks and explorable explanations to give students an intuition for different chunking strategies and their influence on the performance of computations. The material will be provided as interactive Jupyter notebooks so that learners can follow along, experiment, and modify the code at their own pace. The notebooks will be made available on Binder, allowing interactive online code execution and lowering the entry barrier. The material will be provided in English. The target group is expected to have some programming experience and some experience working with gridded data.
Coding exercises are an important component of teaching data analysis in ESS today. Manually correcting assignments is often a heavy workload for exercise instructors, and students often neither submit on time nor receive timely feedback. Automated code checking systems are therefore promising for a wide range of teaching activities in ESS education. Several universities offer such a service, based on different software architectures and infrastructures, but most of them are restricted to their own students. In addition, the same basic content is often developed repeatedly at different universities, or even in different departments of the same university.
Nbgrader is an existing tool that supports creating and grading assignments in Jupyter Notebooks. It can easily be deployed on a conventional server, where students can write Python code online in a Jupyter Notebook interface and exercise instructors can automatically grade their submissions. The Institute of Cartography and Geoinformatics at the University of Hannover has implemented such a system and, since 2021, has successfully deployed it for teaching activities that use Python as the programming language, for courses such as GIS I - modeling and data structure, laser scanning data processing, and SLAM.
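As an illustration of how such assignments are typically structured, a minimal sketch of an nbgrader-style solution cell and its autograder test cell is shown below; the exercise and function name are hypothetical and not taken from the Hannover courses:

```python
# --- Solution cell: the instructor's reference solution sits between the
# nbgrader markers and is stripped from the student version of the notebook.
def ndvi(nir, red):
    """Compute the normalized difference vegetation index."""
    ### BEGIN SOLUTION
    return (nir - red) / (nir + red)
    ### END SOLUTION

# --- Autograded test cell: executed by nbgrader to assign points automatically.
assert abs(ndvi(0.8, 0.2) - 0.6) < 1e-9
assert ndvi(0.5, 0.5) == 0.0
```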
The reuse of existing teaching materials is also of great importance. Within the education-oriented project ICAML - Interdisciplinary Center for Applied Machine Learning (coordinated by co-applicant Martin Werner, BMBF-funded 2018-2020), numerous Jupyter Notebook tutorials on machine learning topics in geospatial data analysis were developed and introduced to the community. An interactive code checking process is important to further develop these tutorials and to make their contents interactive and effortless to include in future e-teaching activities related to geospatial data analysis.
Changes in land use/cover are taking place worldwide at a variety of spatiotemporal scales and intensities. In this context, urbanization is a process that affects more and more areas of society and nature. Today, more than half of the world's population already lives in cities; in some European countries, the figure is up to 80%. Even though built-up areas account for only 2-3% of the land surface worldwide, their “ecological footprint” is enormous. Agricultural land, in particular, is being taken up for the expansion of settlement and transport areas. The analysis of such changes based on heterogeneous geospatial data sources is an important work step for estimating the future evolution of socio-ecological parameters such as migration, erosion, runoff patterns, and biodiversity.
Regional case studies from “hot spots” of urbanization will be used to carry out the work steps needed to capture and quantify urbanization in the context of sustainable development (Sustainable Development Goal 11). Modern methods for accessing open geodata will be presented, and the extraction of thematic information from volunteered geographic information (VGI), social media geographic information (SMGI), and earth observation (EO) data with Python will be taught. Learners can reproduce all work steps independently on their own computers. Basic knowledge of digital image processing and Geographic Information Systems is required.
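As an illustration of the kind of open-geodata access step meant here, a minimal sketch using the osmnx package to retrieve OpenStreetMap building footprints could look as follows; the package choice and the place name are assumptions for this sketch and not necessarily those used in the course:

```python
# Minimal sketch: quantify built-up area from OpenStreetMap building footprints
# (illustrative only; the course's case studies and tooling may differ).
import osmnx as ox

# Hypothetical study area; any place name understood by OSM's geocoder works.
place = "Leipzig, Germany"

# Download building footprints as a GeoDataFrame and keep polygon geometries.
buildings = ox.features_from_place(place, tags={"building": True})
buildings = buildings[buildings.geometry.geom_type.isin(["Polygon", "MultiPolygon"])]

# Project to a metric CRS so that areas are in square metres.
buildings = buildings.to_crs(buildings.estimate_utm_crs())

built_up_km2 = buildings.geometry.area.sum() / 1e6
print(f"Building footprint area in {place}: {built_up_km2:.1f} km²")
```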