Tensor Size Mismatch During Training #1949
-
Hello, I'm trying to classify objects in aerial imagery using Raster Vision, and I'm following the training example in the Raster Vision tutorials. I'm hitting an error I can't figure out when I train the model. My training and validation images/labels are all 326x326 pixels (starting out with a small image). My code is as follows:

```python
class_config = ClassConfig(
names=['background', '0'],
colors=['lightgray', 'darkred'],
null_class='background')
train_image_uri = "path_to_geotiff"
train_label_uri = "path_to_geojson"
val_image_uri = "path_to_geotiff"
val_label_uri = "path_to_geojson"
train_ds = SemanticSegmentationSlidingWindowGeoDataset.from_uris(
class_config=class_config,
image_uri=train_image_uri,
label_vector_uri=train_label_uri,
label_vector_default_class_id=class_config.get_class_id('0'),
size=163,
stride=163,
)
val_ds = SemanticSegmentationSlidingWindowGeoDataset.from_uris(
class_config=class_config,
image_uri=val_image_uri,
label_vector_uri=val_label_uri,
label_vector_default_class_id=class_config.get_class_id('0'),
size=163,
stride=163,
)
```

For the model:

```python
model = torch.hub.load(
'AdeelH/pytorch-fpn:0.3',
'make_fpn_resnet',
name='resnet18',
fpn_type='panoptic',
num_classes=len(class_config),
fpn_channels=128,
in_channels=3,
out_size=(326, 326),
pretrained=True)
data_cfg = SemanticSegmentationGeoDataConfig(
class_names=class_config.names,
class_colors=class_config.colors,
num_workers=0,
)
solver_cfg = SolverConfig(
batch_sz=8,
lr=3e-2,
class_loss_weights=[1., 10.]
)
learner_cfg = SemanticSegmentationLearnerConfig(data=data_cfg, solver=solver_cfg)
learner = SemanticSegmentationLearner(
cfg=learner_cfg,
output_dir='./train-test/',
model=model,
train_ds=train_ds,
valid_ds=val_ds,
)
```

Everything runs fine up to this point. Next, I train the model and get this error:

```python
learner.train(epochs=3)
```
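In case it's useful for debugging, here is a quick (hypothetical) sanity check of the chip shapes coming out of the dataset, assuming that indexing it returns an (image, label) pair like a regular PyTorch dataset:

```python
# Pull one chip from the training dataset and inspect its shape.
x, y = train_ds[0]
print(x.shape, y.shape)  # with size=163, I'd expect (3, 163, 163) and (163, 163)
```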
Since it seems like this error is happening internally in PyTorch, I'm not sure how I can debug it. I've tried this solution on this site, but to no avail (specifically, the transform …). Wondering if anyone has any advice.
-
I can reproduce your error with this:

```python
import torch
model = torch.hub.load(
'AdeelH/pytorch-fpn:0.3',
'make_fpn_resnet',
name='resnet18',
fpn_type='panoptic',
num_classes=2,
fpn_channels=128,
in_channels=3,
out_size=(326, 326),
pretrained=True)
x = torch.randn((1, 3, 163, 163))
out = model(x)
```

Assuming this is correct, the problem is the mismatch between the `out_size` passed to the model and the size of the input going into the model. You need to make sure that the `out_size` matches the size of the inputs to the model. In your case, this is currently 163. If you want to resize the 163x163 chips before they are passed to the model to e.g. 256x256, you can do that as well; see the sketch below.
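Here is a rough sketch of the resize approach, assuming (as in the Raster Vision tutorials) that the dataset's `transform` argument accepts an albumentations transform; the `A.Resize(256, 256)` call and the change to `out_size` are the only modifications to your code, and the other variables are reused from your snippet:

```python
import albumentations as A
import torch
from rastervision.pytorch_learner import SemanticSegmentationSlidingWindowGeoDataset

# Resize every 163x163 chip to 256x256 before it reaches the model.
resize = A.Resize(256, 256)

train_ds = SemanticSegmentationSlidingWindowGeoDataset.from_uris(
    class_config=class_config,
    image_uri=train_image_uri,
    label_vector_uri=train_label_uri,
    label_vector_default_class_id=class_config.get_class_id('0'),
    size=163,
    stride=163,
    transform=resize,
)

model = torch.hub.load(
    'AdeelH/pytorch-fpn:0.3',
    'make_fpn_resnet',
    name='resnet18',
    fpn_type='panoptic',
    num_classes=len(class_config),
    fpn_channels=128,
    in_channels=3,
    out_size=(256, 256),  # must match the size of the chips actually fed to the model
    pretrained=True)
```

The simpler alternative is to leave the datasets as they are and build the model with `out_size=(163, 163)`.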
-
Also, if your images are all only 326x326, your dataset may be pre-chipped, in which case this might be relevant: https://docs.rastervision.io/en/0.21/usage/tutorials/prechipped_datasets.html.
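If that's the case, a minimal sketch would look something like the following; the `SemanticSegmentationImageDataset` class and its `img_dir`/`label_dir` arguments are my reading of that tutorial page, so double-check the exact names there:

```python
from rastervision.pytorch_learner import SemanticSegmentationImageDataset

# Read already-chipped 326x326 images and label rasters straight from directories,
# instead of sampling sliding windows from a larger scene.
train_ds = SemanticSegmentationImageDataset(
    img_dir='path/to/train/img',
    label_dir='path/to/train/labels',
)
```

With pre-chipped 326x326 inputs, `out_size=(326, 326)` would then match the chips fed to the model.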