Hi!
Thanks for the great work on RoMa! I've been investigating its use in medical image registration (specifically histopathology), and it works quite well out of the box. There are, however, several histopathology-specific pretrained DINOv2 ViT-L/14 models out there, so I've been experimenting with one of those to get more domain-specific features. Unfortunately, with any backbone other than the original my results deteriorate drastically (e.g. barely any matches versus thousands with the original backbone), even after accounting for domain-specific image normalization etc.
Would you have any thoughts on why a different ViT-L/14 may not work as expected? Would swapping out the backbone perhaps require retraining the matcher as well? Or are there any other (minor) details I may have overlooked while integrating the domain-specific backbone?
That's unfortunate. Do you see any avenues (e.g. retraining the matcher to handle the different features) that might resolve this? I'd imagine this is very tricky for a different architecture (e.g. ViT-S), but intuitively it seems like it should be quite feasible with the exact same architecture and just different weights.
You could try mapping the new backbone to DINOv2 on some reasonable in-distribution images, for example by attaching a linear head at the end of it. Then RoMa should be able to handle the input, as it's aligned with the DINOv2 features.
In general it's tricky to make the matcher invariant to the backbone features without losing performance on the target task.
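A minimal sketch of that alignment idea in PyTorch. The function name `fit_alignment_head` and the arguments `histo_dino` and `dataloader` are hypothetical placeholders, not part of the RoMa codebase; it assumes both backbones are frozen and expose DINOv2's `forward_features()` interface (which returns a dict with `"x_norm_patchtokens"` of shape (B, N, 1024) for ViT-L/14), and uses a simple MSE regression as one possible objective:

```python
import torch
import torch.nn as nn

def fit_alignment_head(histo_dino, orig_dino, dataloader,
                       dim=1024, lr=1e-4, device="cuda"):
    """Fit a linear head that maps the domain-specific backbone's patch
    features onto the original DINOv2 ViT-L/14 features.

    Both backbones are assumed frozen and to follow DINOv2's
    forward_features() API, returning a dict whose "x_norm_patchtokens"
    entry has shape (B, N, 1024).
    """
    head = nn.Linear(dim, dim).to(device)
    opt = torch.optim.AdamW(head.parameters(), lr=lr)
    for imgs in dataloader:  # in-distribution histopathology crops, (B, 3, H, W)
        imgs = imgs.to(device)
        with torch.no_grad():
            # Features RoMa was trained on: the regression target.
            target = orig_dino.forward_features(imgs)["x_norm_patchtokens"]
            # Domain-specific features to be aligned.
            source = histo_dino.forward_features(imgs)["x_norm_patchtokens"]
        loss = nn.functional.mse_loss(head(source), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return head
```

At match time you would then feed `head(histo_features)` to RoMa wherever it currently consumes the DINOv2 coarse features. Whether a purely linear map is expressive enough to close the domain gap is an empirical question; a small MLP head is the obvious next thing to try if it isn't.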