6. Case studies

The following section describes implementation details of four case studies. The GPs are implemented in GPyTorch and GPflow in Jupyter notebooks. Come back soon to run these directly with Binder or Google Colab!

6.1. Glacier elevation change (GP interpolation)

The framework is applied to glacier elevation change over Greenland using data derived from the ICESat and ICESat-2 satellites. In this regression problem, we would like to estimate glacier elevation change at unsampled locations. This case study has direct applications to estimating sea level rise. Ocean distance, topographic elevation, slope, aspect, surface glacier velocity, and spatial coordinates are used as inputs ($D=7$). The dataset is large, with $N=5 \times 10^5$. The model is evaluated with an emphasis on capturing extremes. We compare the framework with a pre-existing application of GPs to this problem by Gardner et al. (2019).
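
As a rough illustration of what estimating a quantity at unsampled locations means for a GP, the sketch below implements exact GP regression in plain NumPy. It is not the notebook code: the data are synthetic, the two input columns stand in for the seven covariates, and the kernel hyperparameters (`lengthscale`, `variance`, `noise`) are fixed by hand rather than learned.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def gp_predict(x_train, y_train, x_test, lengthscale=1.0, variance=1.0, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train, lengthscale, variance)
    k_ts = rbf_kernel(x_train, x_test, lengthscale, variance)
    # Cholesky factor of the noisy training covariance.
    chol = np.linalg.cholesky(k_tt + noise * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ts)
    mean = k_ts.T @ alpha
    var = variance - np.sum(v**2, axis=0)  # variance of the latent function
    return mean, var


# Synthetic stand-in data (assumption): 200 noisy samples of a smooth surface.
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(200, 2))
y_train = np.sin(x_train[:, 0]) * np.cos(x_train[:, 1])
y_train = y_train + 0.1 * rng.standard_normal(200)

# "Unsampled" locations where we want a prediction with uncertainty.
x_test = rng.uniform(-3.0, 3.0, size=(50, 2))
y_true = np.sin(x_test[:, 0]) * np.cos(x_test[:, 1])

mean, var = gp_predict(x_train, y_train, x_test, noise=0.01)
rmse = np.sqrt(np.mean((mean - y_true) ** 2))
print(f"RMSE at unsampled locations: {rmse:.3f}")
```

The Cholesky factorisation costs $O(N^3)$, which is why a dataset with $N=5 \times 10^5$ calls for scalable approximations rather than this exact form.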

Part I: Problem definition
Part II: Data exploration
Part III: Model iteration - Coming soon!
Part IV: Scaling and testing - Coming soon!

6.2. Himalayan precipitation (GP extrapolation)

The aim of this case study is to predict future precipitation in the Upper Indus Basin, Himalayas. This area acts as a water tower for over 270 million people. We use ERA5 climate reanalysis data and historical atmospheric indices (derived from observations). The dataset has $N = 1 \times 10^6$ data points and up to $D = 29$ input dimensions. Transformations are applied to the data to generate more accurate uncertainty bounds. Clustering is used to overcome the non-stationarity in the temporal distribution of precipitation across the basin. These key steps allow for accurate monthly predictions over a 15-year horizon.
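
To see why a transformation can improve uncertainty bounds, consider the minimal sketch below. The text does not say which transformation is used; a log transform is assumed here purely for illustration, the precipitation values are made up, and a single Gaussian stands in for the GP predictive distribution at one location. The function name `back_transform_interval` is hypothetical.

```python
import numpy as np


def back_transform_interval(mean_log, std_log, z=1.96):
    """Map a Gaussian interval in log space back to the original scale."""
    lower = np.exp(mean_log - z * std_log)
    median = np.exp(mean_log)
    upper = np.exp(mean_log + z * std_log)
    return lower, median, upper


# Illustrative, strictly positive precipitation values in mm/day (assumption).
precip = np.array([0.5, 1.2, 0.8, 6.0, 2.5, 0.3, 12.0, 1.0])

# Gaussian interval fitted in the original space: the lower bound is negative,
# which is physically impossible for precipitation.
naive_lower = precip.mean() - 1.96 * precip.std()
naive_upper = precip.mean() + 1.96 * precip.std()

# Gaussian interval fitted in log space, then mapped back: the bounds stay
# positive and are asymmetric around the median, matching the skewed data.
log_precip = np.log(precip)
lower, median, upper = back_transform_interval(log_precip.mean(), log_precip.std())

print(f"original space: [{naive_lower:.2f}, {naive_upper:.2f}]")
print(f"log space:      [{lower:.2f}, {upper:.2f}], median {median:.2f}")
```

A plain log transform requires strictly positive targets; data containing zeros would need an offset or a different transformation.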

Part I: Problem definition - Coming soon!
Part II: Data exploration - Coming soon!
Part III: Model iteration - Coming soon!
Part IV: Scaling and testing - Coming soon!

6.3. Lorenz 96 model (Multi-task GP)

We apply a multi-task GP to learn the physical relationships between dimensions of the Lorenz 96 model in order to forecast future behaviour. The Lorenz 96 model is often used as a simpler proxy for climate models, as it can recreate the chaotic patterns observed in the atmosphere. It is therefore a useful tool for testing new ideas such as better parametrisation schemes. The dataset is large, with $N=4 \times 10^5$, $D=1$, and $M=8$. A sparse variational GP is used to cope with the large amount of data required for training and testing. In this task, it is important that the samples generated by the model share the same properties as the data, such as slow and fast oscillation regimes. We follow the tests proposed by Parthipan et al. (2022).
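
For readers unfamiliar with the Lorenz 96 model, the sketch below integrates its standard single-tier form, $\dot{x}_k = (x_{k+1} - x_{k-2})\,x_{k-1} - x_k + F$ with cyclic indices, using a fourth-order Runge-Kutta scheme. The configuration used in the notebooks is not stated here, so the eight state variables (assumed to correspond to $M=8$), the forcing $F=8$, the time step, and the one-step-ahead training pairs are all illustrative assumptions.

```python
import numpy as np


def lorenz96_tendency(x, forcing):
    """Time derivative of the Lorenz 96 state with cyclic boundary conditions."""
    # np.roll(x, -1)[k] = x[k+1], np.roll(x, 2)[k] = x[k-2], np.roll(x, 1)[k] = x[k-1]
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def integrate_rk4(x0, forcing, dt, n_steps):
    """Integrate with classical RK4; returns an array of shape (n_steps + 1, K)."""
    x = np.asarray(x0, dtype=float)
    trajectory = np.empty((n_steps + 1, x.size))
    trajectory[0] = x
    for i in range(n_steps):
        k1 = lorenz96_tendency(x, forcing)
        k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
        k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
        k4 = lorenz96_tendency(x + dt * k3, forcing)
        x = x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        trajectory[i + 1] = x
    return trajectory


forcing = 8.0              # assumed forcing value
x0 = forcing * np.ones(8)  # the uniform state x_k = F is a fixed point
x0[0] += 0.01              # small perturbation to move off the fixed point

trajectory = integrate_rk4(x0, forcing, dt=0.01, n_steps=1000)

# One possible way to frame forecasting as supervised learning (assumption):
# each state is an input, the next state is the multi-output target.
inputs, targets = trajectory[:-1], trajectory[1:]
print(inputs.shape, targets.shape)
```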

Part I: Problem definition - Coming soon!
Part II: Data exploration - Coming soon!
Part III: Model iteration - Coming soon!
Part IV: Scaling and testing - Coming soon!

6.4. Arctic water bodies (Classification)

Finally, we apply the framework to water characteristic classification in the Arctic. The uncertainty in the predictions is then used to infer potential mixing of water bodies. It will also give an understanding of sea ice formation and melt processes. Coordinates, depth and season are used as inputs ($D=4$). The dataset is also large, with $N=10^5$. Again, a sparse variational GP is used to accelerate inference whilst maintaining strong predictive skill (accuracy = , AUC = ).
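
One possible way to turn classification uncertainty into a mixing indicator is to compute the entropy of the predicted class probabilities and flag high-entropy samples. The sketch below assumes this approach; the text does not specify how the uncertainty is used in practice. The three classes, the probability values, and the threshold are all hypothetical.

```python
import numpy as np


def predictive_entropy(probs):
    """Shannon entropy (in nats) of each row of class probabilities."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(p), axis=1)


# Hypothetical predicted probabilities for three water-mass classes.
probs = np.array(
    [
        [0.98, 0.01, 0.01],   # confident: a single water mass
        [0.50, 0.45, 0.05],   # torn between two classes
        [1 / 3, 1 / 3, 1 / 3],  # maximally uncertain
    ]
)

entropy = predictive_entropy(probs)

# Hypothetical threshold: half of the maximum possible entropy, log(n_classes).
threshold = 0.5 * np.log(probs.shape[1])
possible_mixing = entropy > threshold

print(np.round(entropy, 3))
print(possible_mixing)
```

With these values, only the first sample falls below the threshold; the other two would be flagged as candidates for mixing.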

Part I: Problem definition - Coming soon!
Part II: Data exploration - Coming soon!
Part III: Model iteration - Coming soon!
Part IV: Scaling and testing - Coming soon!

6.5. Further reading

Below are further examples of GP studies (with code) that deal with scalability and kernel design for real-world problems.