Display version specific collection ids on resource cards, e.g. affable-shark/1.2 #424

Open
FynnBe opened this issue Oct 30, 2024 · 2 comments

@FynnBe
Member

FynnBe commented Oct 30, 2024

Currently we do not encourage version-specific identifiers:
[image]

@oeway
Collaborator

oeway commented Dec 4, 2024

Let's move the discussion here:
Here are some notes from the meeting minutes today:

(Fynn) Do we want model versions on the website?
(Tomaz) I'd argue we should just use the git model: branch names (animal names for us) move around, so if people want stability, they should reference specific versions, and our APIs should expose (or even require) versions as parameters.
(Iván) I think it would be useful for cases such as: I fine-tuned a model from affable-shark:1.1; if a new version then comes out (e.g. affable-shark:1.2), it's good to know which version the model was trained with. It would promote backward compatibility and stability.

Here are my thoughts on this; let's iterate:
Initially I thought that of course we should support versions, and I have already implemented versioning for the new artifact manager.
However, after reading Tomaz's and Iván's comments, I now think we need versions but should not promote them as the main way to refer to models. The arguments against over-emphasising versions are:

  • We have decided that one model is just one set of weights, characterised by producing specific outputs for given inputs. If we fine-tune a model based on affable-shark, the result should not be a new version. Fine-tuning is essentially like breeding: it produces another animal, or a baby shark of affable-shark, rather than a new version.
  • The versions we have used so far exist only because of format changes, which do not change the weights or the input/output mapping at all; they are implementation details. This differs from Git for source code, where changes mostly add features that alter the behavior of a program or library. Here we are storing the same model; format changes won't change its behavior.
  • In that sense, linking to a specific version to ensure reproducibility is not that important, since all versions should produce the same output for a given input; it's just that some versions are broken for this or that software. They are different distributions of the same model, targeting different software or implementations.

Therefore, I am thinking it might not be a bad idea to use only the animal names and present all the format versions and weight types in a flattened way, always accessible from the same model with a URL like affable-shark instead of affable-shark:1.1. To be clear, we should still use versions to keep track of changes; however, we do not need to promote them as the best practice for referring to a model. That adds complexity without much benefit.
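
To make the two ways of referring to a model concrete, here is a minimal sketch of how a bare animal name could resolve to the latest published version while an explicit version stays pinned. Everything in it is an assumption for illustration: `ResourceId`, `parse_resource_id`, `resolve`, and the `published` mapping are hypothetical names, not the actual artifact manager API, and it accepts both the `affable-shark/1.2` and `affable-shark:1.2` spellings seen in this thread.

```python
from typing import Dict, List, NamedTuple, Optional


class ResourceId(NamedTuple):
    """Hypothetical parsed identifier: concept name plus optional version."""
    concept: str
    version: Optional[str]


def parse_resource_id(raw: str) -> ResourceId:
    """Split 'affable-shark/1.2' or 'affable-shark:1.2' into its parts.

    A bare name such as 'affable-shark' yields version=None.
    """
    for sep in ("/", ":"):
        concept, found, version = raw.partition(sep)
        if found:
            if not concept or not version:
                raise ValueError(f"malformed resource id: {raw!r}")
            return ResourceId(concept, version)
    return ResourceId(raw, None)


def resolve(raw: str, published: Dict[str, List[str]]) -> str:
    """Return a fully pinned id.

    `published` maps each concept name to its versions, oldest first
    (an assumed layout). A bare name resolves to the newest version.
    """
    rid = parse_resource_id(raw)
    versions = published[rid.concept]
    if rid.version is None:
        return f"{rid.concept}/{versions[-1]}"
    if rid.version not in versions:
        raise KeyError(f"{rid.concept} has no version {rid.version}")
    return f"{rid.concept}/{rid.version}"
```

With this shape, the website could keep showing the bare name as the default while any API call can still be pinned when someone needs a specific distribution.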

If there is any change in the weights, we should just make another model, get another animal name, and also set the previous model as parent.
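
As a rough sketch of the parent idea, assuming each model record simply stores the name of the model it was derived from (the `ModelRecord` class, the `lineage` helper, and the `baby-shark` name are hypothetical, not part of any existing schema):

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical record: one animal name, optionally derived from a parent."""
    name: str
    parent: Optional[str] = None


def lineage(name: str, registry: Dict[str, ModelRecord]) -> List[str]:
    """Walk parent links from `name` back to the original model."""
    chain: List[str] = []
    seen: Set[str] = set()
    current: Optional[str] = name
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in parent links at {current!r}")
        seen.add(current)
        chain.append(current)
        current = registry[current].parent
    return chain


# A fine-tuned model gets its own name and points back at its parent.
registry = {
    "affable-shark": ModelRecord("affable-shark"),
    "baby-shark": ModelRecord("baby-shark", parent="affable-shark"),
}
print(lineage("baby-shark", registry))
```

This keeps the "which model was this trained from" question answerable without treating a fine-tune as a new version of the original.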

cc @FynnBe @Tomaz-Vieira @IvanHCenalmor

@FynnBe
Member Author

FynnBe commented Dec 5, 2024

I agree with all of you!
except maybe

linking to a specific version to ensure reproducibility is not that important

TL;DR: So we aim to promote the 'context/latest/vanilla id' while being transparent about published versions, right?

We do not need to promote it, but we need to make the versions visible.
For example, adding converted ONNX weights may motivate a model version bump. Logically it's the same model: name, description, etc. are all the same, just now there are also ONNX weights! Or maybe I remove ONNX weights because they have a limitation (e.g. batch size may only be 1). Either way, one version has something another does not, and if that little something is required for reproducibility, it is suddenly important to tell them apart.

I do agree, though, that we do not need to overly promote the versions and can keep the "copy concept id" at the top of the model card. In most use cases a newer version should not be worse.

We should maybe just add a section (maybe rework the compatibility check section?) where the specific versions are listed. This is exactly like how e.g. PyTorch tells you to install: you do pip install torch, and here (link) are all the versions if you are looking for something specific.

I think it would make sense to combine this with the compatibility checks that are already version-specific.
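
A minimal sketch of what such a version listing could be built from, assuming each published version records the set of weight formats it ships. The data layout and the helper names `versions_with_format` and `format_version_listing` are hypothetical, and the format labels are only illustrative:

```python
from typing import Dict, List, Set


def versions_with_format(weight_format: str,
                         versions: Dict[str, Set[str]]) -> List[str]:
    """Return the versions (in listing order) that ship `weight_format`."""
    return [v for v, formats in versions.items() if weight_format in formats]


def format_version_listing(concept: str,
                           versions: Dict[str, Set[str]]) -> List[str]:
    """One line per version-specific id, with its weight formats."""
    return [
        f"{concept}/{version}: {', '.join(sorted(formats))}"
        for version, formats in versions.items()
    ]


# Assumed example data: 1.2 added converted ONNX weights.
versions = {
    "1.1": {"pytorch_state_dict"},
    "1.2": {"pytorch_state_dict", "onnx"},
}
for line in format_version_listing("affable-shark", versions):
    print(line)
print(versions_with_format("onnx", versions))
```

Someone who needs ONNX weights for reproducibility could then see at a glance which version-specific ids provide them, while the bare name stays the default at the top of the card.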
