Patchwise training and inference support #75
Conversation
Thank you very much for opening this PR @nilsleh. Addressing #22 will be a significant addition to DeepSensor's functionality. It is really appreciated that you've taken the time to try tackling this. I will start adding some high-level line comments. But firstly, a general point: In DeepSensor, I distinguish between 'slicing' a variable and 'sampling' a variable. In the
General comments about PRs
deepsensor/data/loader.py (Outdated)

```
    :return sequence of patch spatial extent as [lat_min, lat_max, lon_min, lon_max]
    """
    # assumption of normalized spatial coordinates between 0 and 1
```
We can't assume data is bounded in [0, 1]. This is not guaranteed or enforced in any part of the DeepSensor data processing pipeline. Instead, we need a new method, run during the `TaskLoader` init, which computes the global min/max coordinate values of the context/target data; the central point of the patch should then be sampled uniformly in this range.
Okay, to my understanding the TaskLoader only works on already normalized/standardized data, and I assumed the coordinate bounds were normalized to [0, 1]. That is good to know, thanks!
By default, the `DataProcessor` linearly normalises the coords of the first data variable it is provided with to lie in [0, 1], but subsequent variables may exceed that range. Thus, although the data coords will typically lie in [0, 1], there is nothing constraining this to always hold.
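For illustration, here is that behaviour in plain numpy (a sketch of a linear map fitted on the first variable's coordinate range, not the actual `DataProcessor` code):

```python
import numpy as np

lat_var1 = np.linspace(30.0, 60.0, 5)  # first variable: its range is mapped to [0, 1]
lat_var2 = np.linspace(20.0, 70.0, 5)  # second variable with a wider extent

lat_min, lat_max = lat_var1.min(), lat_var1.max()

def norm(x):
    # linear normalisation fitted on the *first* variable's coordinate range
    return (x - lat_min) / (lat_max - lat_min)

print(norm(lat_var1))  # [0.   0.25 0.5  0.75 1.  ]
print(norm(lat_var2))  # includes values below 0 and above 1 (-0.33 ... 1.33)
```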
deepsensor/data/loader.py (Outdated)

```diff
@@ -881,6 +974,9 @@ def task_generation(
         "split" sampling strategy for linked context and target set pairs.
         The remaining observations are used for the target set. Default is
         0.5.
+    patch_size: Sequence[float], optional
+        Desired patch size in lat/lon used for patchwise task generation. Useful when considering
```
There are a few references to lat/lon specifically. Please instead use the DeepSensor standardised coordinate names `x1`/`x2` in comments and variables. The TaskLoader operates only on standardised/normalised data.
deepsensor/data/loader.py (Outdated)

```diff
@@ -1226,7 +1302,7 @@ def sample_variable(var, sampling_strat, seed):
     X_c_offrid_all = np.concatenate(X_c_offgrid, axis=1)
     Y_c_aux = (
         self.sample_offgrid_aux(
-            X_c_offrid_all, self.time_slice_variable(self.aux_at_contexts, date)
+            X_c_offrid_all, self.time_slice_variable(self.aux_at_contexts, date), sample_patch_size
```
We shouldn't need to spatially slice the off-grid aux data; this will happen implicitly because the context data used for sampling the `self.aux_at_contexts` xarray data will already have been spatially sliced.
deepsensor/data/loader.py (Outdated)

```
    lon_side = lon_extend / 2

    # sample a point that satisfies the boundary and target conditions
    continue_looking = True
```
I would remove the `continue_looking` logic entirely. Firstly, it's fine if the patch contains no context data; DeepSensor models should be able to handle this. The main risk here is that the patch contains no target data, which can lead to NaNs when passed to `ConvNP.loss_fn`. However, it is much, much easier to check for `Task`s with no target data as a training pre-processing step. This would be a separate PR or something we expect the user to be aware of.
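A minimal sketch of such a pre-processing check (hypothetical helper; it assumes a `Task` exposes its target values as a list of arrays under the `"Y_t"` key, which may differ from the actual `Task` structure):

```python
import numpy as np

def has_target_data(task) -> bool:
    # True if any target set in the task contains at least one observation
    return any(np.asarray(y).size > 0 for y in task["Y_t"])

# Usage: drop patchwise tasks with empty targets before computing the loss
tasks = [{"Y_t": [np.ones((1, 5))]}, {"Y_t": [np.empty((1, 0))]}]
train_tasks = [t for t in tasks if has_target_data(t)]  # keeps only the first task
```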
Is there an assumption of a common coordinate range between the context and the target? If so, we could gather the coordinate bounds of the target variable and use those for the random window sampling.
No, unfortunately we can't assume that. We'll have to loop over all the `self.context` and `self.target` variables, updating the min/max data coordinate bounds.
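A minimal sketch of what that loop could look like (hypothetical function name, assuming the variables are xarray objects with `x1`/`x2` coords; pandas variables would need analogous handling):

```python
import numpy as np
import xarray as xr

def compute_global_coord_bounds(variables):
    """Return ((x1_min, x1_max), (x2_min, x2_max)) across all variables."""
    x1_min = x2_min = np.inf
    x1_max = x2_max = -np.inf
    for var in variables:  # e.g. self.context + self.target in the TaskLoader
        x1_min = min(x1_min, float(var["x1"].min()))
        x1_max = max(x1_max, float(var["x1"].max()))
        x2_min = min(x2_min, float(var["x2"].min()))
        x2_max = max(x2_max, float(var["x2"].max()))
    return (x1_min, x1_max), (x2_min, x2_max)

# Usage with two toy variables covering different extents
da1 = xr.DataArray(np.zeros((3, 3)), coords={"x1": [0.0, 0.5, 1.0], "x2": [0.0, 0.5, 1.0]}, dims=("x1", "x2"))
da2 = xr.DataArray(np.zeros((3, 3)), coords={"x1": [0.2, 0.8, 1.4], "x2": [-0.2, 0.4, 1.0]}, dims=("x1", "x2"))
bounds = compute_global_coord_bounds([da1, da2])  # ((0.0, 1.4), (-0.2, 1.0))
```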
deepsensor/data/loader.py (Outdated)

```
    target_slices[target_idx] = target_var
    # sample common patch size for context and target set
    if self.patch_size is not None:
        sample_patch_size = self.sample_patch_size_extent()
```
I would suggest we don't make `patch_size` a class attribute like this; it should only exist in the scope of `__call__` here.
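A toy sketch of the suggested pattern (not the actual TaskLoader code): `patch_size` is a `__call__` argument, so it lives only in the call scope rather than being stored as instance state.

```python
class ToyTaskLoader:
    def __call__(self, date, patch_size=None):
        if patch_size is not None:
            # sample/slice patches here; no self.patch_size attribute needed
            print(f"Generating a {patch_size} patchwise task for {date}")
        return {"date": date}

task = ToyTaskLoader()("2020-01-01", patch_size=(0.1, 0.1))
```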
f"Must be one of [None, 'random', 'sliding']." | ||
) | ||
|
||
if patch_strategy is None: |
I moved the logic to the `__call__` function; however, there is quite a bit of code redundancy because of:

- checking separate sampling strategies
- checking whether one supplies a single date or a sequence of dates, which determines whether a `Task` or a `list[Task]` is returned

So that can be made more concise; see the sketch below.
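One hedged sketch of how the single-date vs list-of-dates redundancy could be factored out (hypothetical helper, not the PR's code): normalise the input once, then run a single task-generation path.

```python
from typing import Callable, Sequence, Union

def generate_tasks(
    dates: Union[str, Sequence[str]],
    make_task: Callable[[str], dict],
) -> Union[dict, list]:
    # Normalise the input once so one code path handles both cases
    single = isinstance(dates, str)
    date_list = [dates] if single else list(dates)
    tasks = [make_task(d) for d in date_list]
    return tasks[0] if single else tasks

# Returns a single task for one date and a list of tasks for many
task = generate_tasks("2020-01-01", lambda d: {"date": d})
tasks = generate_tasks(["2020-01-01", "2020-01-02"], lambda d: {"date": d})
```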
```
    # TODO it would be better to do this with pytest.fixtures
    # but could not get to work so far
    task = tl(
```
It would be better to have fixtures that generate the data setup; then we can test with different configurations, like a single date, a list of dates, different context and target sampling strategies, etc.
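A hedged sketch of that fixture-based setup (hypothetical data and test names, not the repo's actual tests): a fixture builds the shared data once, and `parametrize` covers the configurations.

```python
import numpy as np
import pytest

@pytest.fixture
def dummy_grid():
    # Toy stand-in for the gridded data the real tests would build
    return np.random.rand(10, 10)

@pytest.mark.parametrize("dates", ["2020-01-01", ["2020-01-01", "2020-01-02"]])
@pytest.mark.parametrize("sampling_strat", ["all", 10, 0.5])
def test_patchwise_task_generation(dummy_grid, dates, sampling_strat):
    # Build the TaskLoader from the fixture data here and assert on the
    # returned Task (single date) or list[Task] (sequence of dates).
    assert dummy_grid.shape == (10, 10)
```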
PR to make patching and stitching agnostic to coordinate direction
For patchwise prediction, get `patch_size` and `stride` directly from task
Update `patchwise_train` with latest changes from `main`
Refactoring of patchwise training and inference
Going to close this PR as we're managing this feature on a branch on my fork of the repo (davidwilby#4); will open a new PR when that's ready to go soon.
This PR aims to close #22 by implementing an option to run patchwise training.
The current approach is to expect normalized coordinates as a patch size sequence argument for the x1 and x2 dimensions. The current patch size sampling strategy is random uniform sampling.
The way I have currently thought about supporting patchwise training is the following:

- a `patch_size` argument to the `TaskLoader`, which samples a uniform point in the normalized coordinate frame and takes the patch size to define a "bounding box" around that sampled point
- the patch is then sliced from the data with a `sel`/`isel` statement
- if no `patch_size` is specified in the task loader call, there are no changes; the default is `None`, so everything should run as before

TODO:
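A minimal sketch of the described approach (hypothetical helper, not the PR's actual implementation): sample a patch centre uniformly within the coordinate bounds, then slice the "bounding box" with a `sel` statement.

```python
import numpy as np
import xarray as xr

def sample_patch(da: xr.DataArray, patch_size, rng):
    size_x1, size_x2 = patch_size
    # sample the patch centre uniformly so the bounding box stays in-bounds
    x1_c = rng.uniform(float(da.x1.min()) + size_x1 / 2, float(da.x1.max()) - size_x1 / 2)
    x2_c = rng.uniform(float(da.x2.min()) + size_x2 / 2, float(da.x2.max()) - size_x2 / 2)
    # slice the "bounding box" around the sampled centre with a sel statement
    return da.sel(
        x1=slice(x1_c - size_x1 / 2, x1_c + size_x1 / 2),
        x2=slice(x2_c - size_x2 / 2, x2_c + size_x2 / 2),
    )

da = xr.DataArray(
    np.random.rand(50, 50),
    coords={"x1": np.linspace(0, 1, 50), "x2": np.linspace(0, 1, 50)},
    dims=("x1", "x2"),
)
patch = sample_patch(da, (0.2, 0.2), np.random.default_rng(0))
```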