TaskLoader fails when declaring multiple `target_delta_t` #129

acocac · 2024-09-18T12:51:40Z

I've started experimenting with a forecasting setup in DeepSensor (see the MWE in Colab). The example below shows how I define a TaskLoader for predicting air temperature over the next two days (two lead times):

task_loader = TaskLoader(
    context=[era5_ds["air"],] * 3,
    context_delta_t=[-1, -2, 0],
    target=[era5_ds["air"],era5_ds["air"]],
    target_delta_t=[1, 2],
    time_freq="D",  # daily frequency (the default)
)
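To make the offsets concrete, here is a minimal sketch, assuming each `delta_t` value is counted in days (because `time_freq="D"`) relative to the date a task is generated for. The `offset_dates` helper and the example date are hypothetical and not part of DeepSensor; only the two `delta_t` lists come from the TaskLoader call above.

```python
from datetime import date, timedelta

# Offsets copied from the TaskLoader definition above
context_delta_t = [-1, -2, 0]
target_delta_t = [1, 2]


def offset_dates(t0, deltas):
    """Hypothetical helper: shift the task date t0 by each offset, in days."""
    return [t0 + timedelta(days=d) for d in deltas]


t0 = date(2020, 1, 10)  # arbitrary example task date (an assumption)
context_dates = offset_dates(t0, context_delta_t)  # three context sets
target_dates = offset_dates(t0, target_delta_t)    # two target sets, one per lead time
```

Under that assumption, each task carries two target sets, so anything that iterates over targets receives a list of two arrays rather than a single array.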

Then I reuse the training procedure suggested in the DeepSensor tutorials. However, training stops with an error when computing the RMSE for the validation tasks.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-d73321d37ac8> in <cell line: 16>()
     18     batch_losses = trainer(train_tasks)
     19     losses.append(np.mean(batch_losses))
---> 20     val_rmses.append(compute_val_rmse(model, val_tasks))
     21     if val_rmses[-1] < val_rmse_best:
     22         val_rmse_best = val_rmses[-1]

1 frames
/usr/local/lib/python3.10/dist-packages/deepsensor/data/processor.py in map_array(self, data, var_ID, method, unnorm, add_offset)
    516             c = -c / m
    517             m = 1 / m
--> 518         data = data * m
    519         if add_offset:
    520             data = data + c

TypeError: can't multiply sequence by non-int of type 'float'

My guess is that map_array needs some changes to handle multiple targets. I suggest checking the type of data at the line below: if it is a list, apply the multiplication to each element (here, each np.array).

deepsensor/deepsensor/data/processor.py

Line 518 in 6de4ddb

data = data * m
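The traceback suggests that `data` arrived as a Python list (one array per lead time), and a list cannot be multiplied by a float. The following is a minimal sketch of the per-element idea proposed above, not DeepSensor's implementation: `scale_array_or_list` is a hypothetical stand-in for the scaling step inside `map_array`, and the values of `m` and `c` are made up.

```python
import numpy as np


def scale_array_or_list(data, m, c=0.0, add_offset=True):
    """Hypothetical helper: apply data * m (+ c) to one array,
    or to every array in a list or tuple."""
    if isinstance(data, (list, tuple)):
        return [scale_array_or_list(d, m, c, add_offset) for d in data]
    data = np.asarray(data) * m
    if add_offset:
        data = data + c
    return data


# Two lead times -> a list of two arrays, mimicking the situation in the report
preds = [np.zeros((1, 4)), np.ones((1, 4))]

# Multiplying the list itself by a float reproduces the reported failure
try:
    preds * 2.5
    raised = False
except TypeError:
    raised = True

# The per-element version returns one scaled array per lead time
scaled = scale_array_or_list(preds, m=2.5, c=10.0)
```

Whether this dispatch belongs inside `map_array` or in the calling code is a design choice; the sketch only shows that handling each element separately avoids the `TypeError`.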


tom-andersson · 2024-10-20T22:14:12Z

Hey @acocac, thanks for raising this and the MWE. So you have two target sets for the two lead times, and you want to compute unnormalised RMSE in Kelvin for the first lead time. The model.predict interface is the intended way to get unnormalised predictions for computing unnormalised metrics. I've recently improved DeepSensor's forecasting functionality in deepsensor v0.4 which fixes model.predict forecast outputs; see #130 and #132.

However, in the MWE, you are not using the data_processor in the right way. .map_array is intended for a single array, not a list, so I suggest we keep the interface as-is. As a workaround, keeping the current approach:

# Don't do this:
# mean = data_processor.map_array(model.mean(task), target_var_ID, unnorm=True)
# true = data_processor.map_array(task["Y_t"][0], target_var_ID, unnorm=True)
# Do this:
lead_time_idx = 0
mean = model.mean(task)[lead_time_idx]
true = task["Y_t"][lead_time_idx]
error = np.abs(mean - true)
error_unnormalised = data_processor.map_array(error, target_var_ID, unnorm=True, add_offset=False)

But I'd suggest updating DeepSensor and using model.predict :-)
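For anyone wondering why the workaround passes `add_offset=False`: an error is a difference of two normalised values, so the offset cancels and only the scale needs to be undone. Here is a small self-contained sketch of that arithmetic, assuming a simple mean/std normalisation. The `normalise` and `unnormalise` functions and all numbers are hypothetical and do not call DeepSensor.

```python
import numpy as np

# Hypothetical normalisation parameters in Kelvin (not taken from the issue)
mean_K, std_K = 280.0, 12.0


def normalise(x):
    return (x - mean_K) / std_K


def unnormalise(x, add_offset=True):
    """Undo the scaling, and optionally the offset as well."""
    x = x * std_K
    if add_offset:
        x = x + mean_K
    return x


true_K = np.array([275.0, 281.0, 290.0])
pred_K = np.array([277.0, 280.0, 287.0])

# Absolute error computed in normalised space, as in the workaround
error_norm = np.abs(normalise(pred_K) - normalise(true_K))

# Scale only: recovers the absolute error in Kelvin
error_K = unnormalise(error_norm, add_offset=False)

# Adding the offset would wrongly shift every error by mean_K
error_wrong = unnormalise(error_norm, add_offset=True)

rmse_K = np.sqrt(np.mean(error_K ** 2))
```

Here `error_K` matches `np.abs(pred_K - true_K)`, whereas `error_wrong` is larger by `mean_K` in every entry.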

acocac added the "help wanted" label Sep 18, 2024

tom-andersson closed this as completed Oct 20, 2024
