Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add v2 recipe editing capabilities for bumping version & build number #2920

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

wolfv
Copy link
Contributor

@wolfv wolfv commented Aug 7, 2024

Hi @beckermr - this is the initial PR to port over the functionality for the recipe editing for v2 recipes.

I'll add some tests now.

Copy link

codecov bot commented Aug 7, 2024

Codecov Report

Attention: Patch coverage is 77.38854% with 71 lines in your changes missing coverage. Please review.

Project coverage is 77.07%. Comparing base (4b7ef4d) to head (7fee6de).

Files Patch % Lines
conda_forge_tick/update_recipe/v2/version.py 74.35% 20 Missing ⚠️
conda_forge_tick/update_recipe/v2/jinja/filters.py 35.71% 9 Missing ⚠️
...da_forge_tick/update_recipe/v2/conditional_list.py 75.75% 8 Missing ⚠️
conda_forge_tick/update_recipe/v2/jinja/objects.py 55.55% 8 Missing ⚠️
conda_forge_tick/update_recipe/v2/source.py 68.00% 8 Missing ⚠️
conda_forge_tick/update_recipe/v2/jinja/jinja.py 72.72% 6 Missing ⚠️
conda_forge_tick/migrators/version.py 72.22% 5 Missing ⚠️
conda_forge_tick/update_recipe/v2/build_number.py 83.87% 5 Missing ⚠️
conda_forge_tick/migrators/core.py 88.88% 1 Missing ⚠️
conda_forge_tick/update_recipe/v2/jinja/utils.py 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2920      +/-   ##
==========================================
- Coverage   77.13%   77.07%   -0.07%     
==========================================
  Files         112      125      +13     
  Lines       12046    12336     +290     
==========================================
+ Hits         9292     9508     +216     
- Misses       2754     2828      +74     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@wolfv
Copy link
Contributor Author

wolfv commented Aug 7, 2024

@beckermr - I ported all recipe editing here. It's a bit involved because the tool does the following:

  • evaluate the variables in the context using Jinja
  • find all the sources, irregardless of if/else branches
  • render the version with the variables loaded from the context
  • edit the recipe to insert the new hashes & version

lmk what you think! :)

@beckermr
Copy link
Contributor

beckermr commented Aug 7, 2024

Yep. If you look at the existing code it detects a bunch of errors and outputs an error message etc. we need to match that api.

@beckermr
Copy link
Contributor

beckermr commented Aug 7, 2024

Also the hashing bits should use the util in the repo. We've had issues with hashing stalling for some URLs and killing the bot. I'll leave more comments later. Sorry for the terse comments.

@wolfv
Copy link
Contributor Author

wolfv commented Aug 8, 2024

@beckermr - I just tackled the API of update_version, will make sure that the build number update also matches. Reused the existing hashing function.

Comment on lines +56 to +61
data = _load_yaml(file)
build_number_modified = _update_build_number_in_context(data, new_build_number)
if not build_number_modified:
_update_build_number_in_recipe(data, new_build_number)

return _dump_yaml_to_str(data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

People can and have mixed setting build numbers in both the jinja2 vars / context and explicitly in the recipe. I know it is awful. We should update the code to do both and check if the build number in the non-context part is templated or not before updating.

) -> dict[str, str]:
"""
Load all string values from the context dictionary as Jinja2 templates.
Use linux-64 as default target_platform, build_platform, and mpi.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned that assuming linux as the default will result in invalid updates for some recipes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand where you are coming from but we do ignore the selectors in the version updating code (e.g. we go down both branches, then and else.

However, we do use the value for templating, so people could theoretically do something like

source:
  url: https://foo.${{ "linux" if linux else "bla" }}/...
  sha256: ${{ "foo" ...

But that doesn't seem very likely and also buggy so I don't think we'll see that a lot in the wild.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are folks allowed to use if..then in the context section to change the values of variables?

Copy link
Contributor Author

@wolfv wolfv Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that doesn't work in the context section. However you can use templating and inline Jinja expressions, such as:

context:
  version: "1.2.3"
  version_string: "v${{ version if linux else "foo" }}"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right so we need to make this correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That won't change other items that might be platform dependent in the context section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the SHA hash will make it impossible, thankfully :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. Folks could define parts of urls in the context section that are platform dependent that are not hashes. So when the bot comes through to render those urls and parts, it will compute the wrong hash since it might have a valid, but incorrect url for that platform.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beckermr - i sorta get where you come from but is it really necessary to care about all of these edge cases on day one?

It is also not a very clever way to write the recipe. You would have to "double-if" the source.

E.g.

context:
  foo: ${{ "bla" if linux else "foo" }}

source:
  - if: linux
    then:
      url: ${{ foo }}
      sha256: 1abcd
    else:
      url: ${{ foo }}
      sha256: 2bcde

So I don't think many people would wanna write their sources like this (doesn't make much logical sense).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should care. My suggested procedure for this work was to abstract the current version update algorithm to be independent of the recipe format backend and then use that. It should handle most, if not all, of the cases I am bringing up.

Comment on lines +36 to +43
env = SandboxedEnvironment(
variable_start_string="${{",
variable_end_string="}}",
trim_blocks=True,
lstrip_blocks=True,
autoescape=jinja2.select_autoescape(default_for_string=False),
undefined=_MissingUndefined,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment on these options, why they are set, and how that relates to the supported features for the v2 recipe?

Also, will this parser fail on if conditions, for loops, or other non-supported jinja2 filters / syntax?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it will fail. rattler-build will later fail if things aren't valid YAML though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we can make it fail and output an error?

) -> dict[str, str]:
"""
Load all string values from the context dictionary as Jinja2 templates.
Use linux-64 as default target_platform, build_platform, and mpi.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again I am concerned about the defaults here for platform etc.

return re.search(pattern, url) is not None


def update_hash(source: Source, url: str, hash_: Hash | None) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some people will set the hash in the context section. yay edge cases!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's pretty discouraged by the schema right now and I don't see the benefit for the users to set the hash in the context. We just inject it in the source and the old hash will persist in the context for now.
If someone gets annoyed we can modify this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could break some recipes and people will be even more annoyed by that. We need to fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a lint to prevent this in conda-smithy: conda-forge/conda-smithy#2028

Still needs tests etc. but I think that's the better way forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beckermr how would it break a recipe though. You would have:

context:
  sha256: foo

source:
  url: bla
  sha256: "1234..."

At some point someone would come and clean up the left-over hash from the context.

It would only break if the SHA hash was part of the URL. But in that case, the autotick updater has basically no chance anyways :)

class HashError(Exception):
def __init__(self, url: str) -> None:
self.url = url
self.message = f"Could not hash {url}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is useful to add the templated string that was rendered and what version in that string was tried to this error so that folks can diagnose issues.

Comment on lines 128 to 129
template = env.from_string(url)
rendered_url = template.render(context_variables)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bot tries a series of url template modifications in order to get version updates to work. People do things like add / remove v, there are multiple pypi urls etc. We need to use those here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would all be done in the context or in the rendering process. There is no functional difference here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, you mean "people" as in upstream maintainers doing that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean upstream changes the urls and the bot tries to account for it. See here: https://github.com/regro/cf-scripts/blob/master/conda_forge_tick/url_transforms.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I am slightly confused because I would have expected the version ticker thingy to already find these if necessary - and this to be present in the new_version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which also lets me wonder if we need to adjust the code somewhere else where the different versions are being tried out :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version ticker thing does this too, but it does not record which url variant worked. Hence it is done here again. It is possible that the recipe can change between when the version is found and when we update the recipe as well.

Copy link
Contributor

@beckermr beckermr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a bunch of comments. One more is that the tests are not complete in that they don't touch a lot of the edge cases.

This PR is getting pretty big, so my suggestion is that we split to three or four PRs: 1) add the parser + tests for it, 2) the build number bump, 3) the version change + basic tests, and 4) version tests for edge cases.

@wolfv
Copy link
Contributor Author

wolfv commented Aug 8, 2024

Thanks for the comments @beckermr - I'll try to get to it asap.

@beckermr
Copy link
Contributor

beckermr commented Aug 8, 2024

No rush!

@wolfv
Copy link
Contributor Author

wolfv commented Aug 8, 2024

I wanted to quickly type out what a v2 recipe looks like:

context:
  version: "0.9.6"

package:
  name: bottom
  version: ${{ version }}
  
source:
  url: https://github.com/ClementTsang/bottom/archive/refs/tags/${{ version }}.tar.gz
  sha256: 202130e0d7c362d0d0cf211f6a13e31be3a02f13f998f88571e59a7735d60667

build:
  script:
    - if: unix
      then:
        - cargo install --locked --root $PREFIX --path . --no-track
      else:
        - cargo install --locked --root %PREFIX% --path . --no-track

requirements:
  build:
    - ${{ compiler("rust") }}

tests:
  - script:
      - btm --help 

Now if you had multiple sources, or conditional sources, you are still forced to spell them out. E.g.

source:
  - if: linux
    then:
      url: https://github.com/ClementTsang/bottom/archive/refs/tags/${{ version }}-linux.tar.gz
      sha256: 202130e0d7c362d0d0cf211f6a13e31be3a02f13f998f88571e59a7735d60667
  - if: win
    then:
      url: https://github.com/ClementTsang/bottom/archive/refs/tags/${{ version }}-win.tar.gz
      sha256: 8a86c4eecf12446ff273afc03e1b3a09a911d0b7981db1af58cb45c439161295
  - if: osx
    then:
      url: https://github.com/ClementTsang/bottom/archive/refs/tags/${{ version }}-osx.tar.gz
      sha256: 688787d8ff144c502c7f5cffaafe2cc588d86079f9de88304c26b0cb99ce91c6

I looked at the code for the selector evaluation and it makes a lot of sense for meta.yamls'.

But for the new recipe format it doesn't really make sense. The use case doesn't really exist because people wouldn't write it that way. You wouldn't even really use any platform specific jinja variables in the URLs etc. since you can just insert the proper URL (just include the version).

So IMO we can ignore this for the new recipes. I also do not (really not) see a good use case for templating the SHA hash.

Can someone come up with a good use case for templating that hash (for the new recipe format)? Maybe I am blind! :)

@wolfv
Copy link
Contributor Author

wolfv commented Aug 8, 2024

Regarding the version variations - that makes sense. We only support a subset of what is being tried in the current munger but that should be fine. I'll try to get to it. There is also an item about the R-version needing some slight adjustment that we should handle.

@beckermr
Copy link
Contributor

beckermr commented Aug 8, 2024

Yeah it's not really about what you think is a good or bad use of the format. If it is allowed, it will happen and the bot will have to deal with it. :/

@mgorny
Copy link

mgorny commented Dec 23, 2024

Hope you don't me asking but where are we standing here? Is there anything I could do to help here? I'm new to this codebase, but I think I can spend some time on it; just don't want to step on anyone's toes.

@wolfv
Copy link
Contributor Author

wolfv commented Dec 27, 2024

Hi @mgorny - sorry for the slow response!

This was working, but Isuru and Matt wanted to cover a number of special cases (e.g. templated SHA256 hashes).

I have the following prototype to render out the sources here: prefix-dev/rattler-build-conda-compat#57

The idea is:

  • we render against all variant configs and extract all URL sources
  • check for updates and replace URLs and sha256 hashes as "text replacements" so that any comments stay intact, etc.

If you have some cycles to spend on this, that would be amazing. I added some tests in the PR linked that should show the URL source extraction to be working.

@mgorny
Copy link

mgorny commented Dec 30, 2024

Do I understand correctly that you're suggesting that I take the prototype code from prefix-dev/rattler-build-conda-compat#57 and try to integrate it into this pull request? Or do you have something else in mind?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants