A Python package and open-source project for modelling environmental data with neural processes
NOTE: This package is a work in progress and breaking changes are likely. If you are interested in using DeepSensor, please get in touch first (tomand@bas.ac.uk).
For demonstrators, use cases, and videos showcasing the functionality of DeepSensor, check out the DeepSensor Gallery!
NPs are a highly flexible class of probabilistic models that can:

- ingest multiple context sets (i.e. data streams) containing gridded or pointwise observations
- handle multiple gridded resolutions
- predict at arbitrary target locations
- quantify prediction uncertainty
These capabilities make NPs well suited to modelling spatio-temporal data, such as satellite observations, climate model output, and in-situ measurements. NPs have been used for a range of environmental applications, including:

- downscaling (i.e. super-resolution)
- forecasting
- infilling missing satellite data
- sensor placement
DeepSensor aims to faithfully match the flexibility of NPs with a simple and intuitive interface. DeepSensor wraps around the powerful `neuralprocesses` package for the core modelling functionality, while allowing users to stay in the familiar `xarray` and `pandas` world and avoid the murky depths of tensors!
DeepSensor leverages the `backends` package to be compatible with either PyTorch or TensorFlow. Simply `import deepsensor.torch` or `import deepsensor.tensorflow` to choose between them!
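For example, the backend is selected with a single import at the top of your script:

```python
import deepsensor.torch          # use the PyTorch backend
# import deepsensor.tensorflow  # ...or this line instead for TensorFlow
```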
Here we will demonstrate a simple example of training a convolutional conditional neural process (ConvCNP) to spatially interpolate ERA5 data. First, pip install the package. In this case we will use the PyTorch backend.
```bash
pip install deepsensor
pip install torch
```
We can go from imports to predictions with a trained model in less than 30 lines of code!
```python
import deepsensor.torch
from deepsensor.data.processor import DataProcessor
from deepsensor.data.loader import TaskLoader
from deepsensor.model.convnp import ConvNP
from deepsensor.train.train import Trainer

import xarray as xr
import pandas as pd
import numpy as np

# Load raw data
ds_raw = xr.tutorial.open_dataset("air_temperature")

# Normalise data
data_processor = DataProcessor(x1_name="lat", x2_name="lon")
ds = data_processor(ds_raw)

# Set up task loader
task_loader = TaskLoader(context=ds, target=ds)

# Set up model
model = ConvNP(data_processor, task_loader)

# Generate training tasks with up to 10% of grid cells passed as context
# and all grid cells passed as targets
train_tasks = []
for date in pd.date_range("2013-01-01", "2014-11-30")[::7]:
    task = task_loader(date, context_sampling=np.random.uniform(0.0, 0.1), target_sampling="all")
    train_tasks.append(task)

# Train model
trainer = Trainer(model, lr=5e-5)
for epoch in range(10):
    trainer(train_tasks, progress_bar=True)

# Predict on new task with 10% of context data and a dense grid of target points
test_task = task_loader("2014-12-31", 0.1)
mean_ds, std_ds = model.predict(test_task, X_t=ds_raw)
```
After training, the model can predict directly to `xarray` in your data's original units and coordinate system:
```
>>> mean_ds
<xarray.Dataset>
Dimensions:  (time: 1, lat: 25, lon: 53)
Coordinates:
  * time     (time) datetime64[ns] 2014-12-31
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
Data variables:
    air      (time, lat, lon) float32 246.7 244.4 245.5 ... 290.2 289.8 289.4
```
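Because `mean_ds` and `std_ds` are regular `xarray` Datasets, the standard `xarray` API applies downstream. As a small sketch (the file names here are illustrative, not part of DeepSensor):

```python
# Persist the gridded mean and uncertainty predictions with plain xarray
mean_ds.to_netcdf("air_temp_mean.nc")
std_ds.to_netcdf("air_temp_std.nc")
```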
We can also predict directly to `pandas`, producing a timeseries of predictions at off-grid locations, by passing a `numpy` array of target locations to the `X_t` argument of `.predict`:
```python
# Predict at two off-grid locations for three days in December 2014
test_tasks = task_loader(pd.date_range("2014-12-01", "2014-12-03"), 0.1)
mean_df, std_df = model.predict(test_tasks, X_t=np.array([[50, 280], [40, 250]]).T)
```
```
>>> mean_df
                              air
time       lat  lon
2014-12-01 50.0 280.0  260.183056
           40.0 250.0  277.947373
2014-12-02 50.0 280.0   261.08943
           40.0 250.0  278.219599
2014-12-03 50.0 280.0  257.128185
           40.0 250.0  278.444229
```
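Likewise, `mean_df` is a plain `pandas` DataFrame indexed by a `(time, lat, lon)` MultiIndex, so ordinary `pandas` selection works. A small sketch pulling out the timeseries at the first target location:

```python
# Cross-section the MultiIndex to get the "air" timeseries at (lat=50, lon=280)
series = mean_df.xs((50.0, 280.0), level=("lat", "lon"))["air"]
print(series)
```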
This quickstart example is also available as a Jupyter notebook with added visualisations.
To extend DeepSensor with a new model, simply create a new class that inherits from `deepsensor.model.DeepSensorModel` and implement the low-level prediction methods defined in `deepsensor.model.model.ProbabilisticModel`, such as `.mean` and `.stddev`.
```python
class NewModel(DeepSensorModel):
    """A very naive model that predicts the mean of the first context set with a fixed stddev"""

    def __init__(self, data_processor: DataProcessor, task_loader: TaskLoader):
        super().__init__(data_processor, task_loader)

    def mean(self, task: Task):
        """Compute mean at target locations"""
        return np.mean(task["Y_c"][0])

    def stddev(self, task: Task):
        """Compute stddev at target locations"""
        return 0.1

    ...
```
`NewModel` can then be used in the same way as the built-in `ConvNP` model.
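For instance, a hypothetical usage sketch reusing the `data_processor`, `task_loader`, and `test_task` from the quickstart above (and assuming the remaining `ProbabilisticModel` methods elided by the `...` are filled in):

```python
# NewModel plugs into the same high-level prediction API as ConvNP
new_model = NewModel(data_processor, task_loader)
mean_ds, std_ds = new_model.predict(test_task, X_t=ds_raw)
```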
See this Jupyter notebook for more details.
If you use DeepSensor in your research, please consider citing this repository. You can generate a BibTeX entry by clicking the 'Cite this repository' button on the top right of this page.
DeepSensor is funded by The Alan Turing Institute.