Release a new version of PyNWB on every HDMF release #1343

Closed
5 tasks done
rly opened this issue Mar 10, 2021 · 8 comments
Labels
category: enhancement improvements of code or code behavior

Comments

@rly
Contributor

rly commented Mar 10, 2021

While investigating #1334, I realized that bug fixes in HDMF may not propagate to users' installations of PyNWB because PyNWB allows a range of HDMF versions (>=2.1.0, <3). One install of PyNWB 1.4.0 might have HDMF version 2.1.0 and another might have version 2.4.0. Some of these bug fixes may break previously working code that relied on a bug, and it is difficult for users to figure out which version of PyNWB broke the code because the change is really in HDMF, which users should not need to know about.
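To make the version-range point concrete, here is a minimal sketch of how a range such as ">=2.1.0, <3" admits more than one HDMF release. It assumes plain dotted release numbers and compares them as integer tuples, which is a simplification of what pip actually does. The helper names `parse_version` and `in_range` are hypothetical, and of the candidate versions only 2.1.0 and 2.4.0 come from this issue; the others are illustrative.

```python
def parse_version(text):
    """Turn a simple dotted release string like '2.4.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, minimum, upper_bound):
    """Return True if minimum <= version < upper_bound."""
    return parse_version(minimum) <= parse_version(version) < parse_version(upper_bound)


# 2.1.0 and 2.4.0 are mentioned in this issue; 2.0.1 and 3.0.0 are
# illustrative values used only to show what falls outside the range.
candidates = ["2.0.1", "2.1.0", "2.4.0", "3.0.0"]

# Every entry in `allowed` could end up installed next to the same PyNWB release.
allowed = [v for v in candidates if in_range(v, "2.1.0", "3")]
print(allowed)
```

Two environments with the same PyNWB version can therefore resolve to different entries of `allowed`, which is the situation described above.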

I think it would be best for users not to have to list HDMF as a requirement or restrict versions of HDMF. Listing or restricting PyNWB versions as a requirement should suffice. So to make bug fixes in HDMF propagate to PyNWB in a visible, reproducible manner, I think we should release a new version of PyNWB every time we release a new version of HDMF that might affect user interaction with PyNWB.

As such, we should make an immediate 1.5.0 release of PyNWB to account for the changes between HDMF versions 2.1.0 and the latest 2.4.0. Since we were moving toward a 2.0 version of PyNWB, some recent commits may need to be extracted and saved for later.

Checklist

  • Have you ensured the feature or change was not already reported?
  • Have you included a brief and descriptive title?
  • Have you included a clear description of the problem you are trying to solve?
  • Have you included a minimal code snippet that reproduces the issue you are encountering?
  • Have you checked our Contributing document?
@rly rly added the category: enhancement improvements of code or code behavior label Mar 10, 2021
@t-b
Collaborator

t-b commented Apr 2, 2021

Isn't the real issue that pynwb's version requirement on hdmf is too loose? Why not just require a fixed version of hdmf?

@rly
Contributor Author

rly commented Apr 23, 2021

Yes, I now realize that releasing a new version of PyNWB on every HDMF release does not actually solve the original issue. PyNWB would still have to pin a version of HDMF.

Since PyNWB is a library, in general, it is preferred / conventional to accept a range of dependencies, so that updates in required packages, e.g., numpy, scipy, pandas, h5py, can also be installed. See also discussion in #1133

However, given the tight relationship between PyNWB and HDMF, pinning to a particular version of HDMF in order to make bug tracking and package management easier for the end-user seems worthwhile to me (then, PyNWB+HDMF would be a monolithic package, and users need not know about HDMF). This makes sense only if we make a PyNWB release on every HDMF release.

A downside of pinning HDMF and making paired releases is that users cannot use an older release of PyNWB with a newer release of HDMF, but I am not sure why one would want to do that. To use the latest HDMF, users should update PyNWB, which pins to the latest HDMF release.

To summarize:

  • pin HDMF and release PyNWB on every HDMF release -- GOOD
  • pin HDMF and don't release PyNWB on every HDMF release -- BAD (PyNWB users cannot get HDMF updates)
  • don't pin HDMF (allow a range of versions) -- OK (but PyNWB behavior is highly dependent on HDMF version installed, users must track both packages)

@yarikoptic I'm curious about your thoughts on this.

@oruebel
Contributor

oruebel commented Apr 23, 2021

To summarize:

  • pin HDMF and release PyNWB on every HDMF release -- GOOD
  • pin HDMF and don't release PyNWB on every HDMF release -- BAD (PyNWB users cannot get HDMF updates)
  • don't pin HDMF (allow a range of versions) -- OK (but PyNWB behavior is highly dependent on HDMF version installed, users must track both packages)

Unlike libraries, such as numpy, which is a requirement for many different libraries, the vast majority of users only need HDMF because they are using PyNWB. As such, pinning to a particular version of HDMF seems less problematic here, because version-conflicts where users need to use different versions of HDMF for different packages are much less likely. In general I would say:

  • release PyNWB on every HDMF release and
  • pin HDMF to the latest version if it changes behavior in PyNWB
  • allow a range of versions for HDMF if those versions provide the same behavior in PyNWB. E.g., if PyNWB uses HDMF 2.7 and changes in HDMF 2.8 affect only internal optimizations that are not exposed in PyNWB, then we can allow the range.

@rly
Contributor Author

rly commented Apr 23, 2021

  • allow a range of versions for HDMF if those versions provide the same behavior in PyNWB. E.g., if PyNWB uses HDMF 2.7 and changes in HDMF 2.8 affect only internal optimizations that are not exposed in PyNWB, then we can allow the range.

Perhaps I am misunderstanding here. PyNWB does not know ahead of time what the next minor/patch version of HDMF will do, and I would say, most bug fixes and added features in HDMF do affect behavior in PyNWB. So it does not solve the issue of PyNWB==1.4.0 behaving differently for HDMF 2.2.0, 2.3.0, etc. - if HDMF is not pinned, users still need to track their HDMF version.

@oruebel
Contributor

oruebel commented Apr 23, 2021

most bug fixes and added features in HDMF do affect behavior in PyNWB

If behavior is affected, fixing HDMF version in PyNWB seems reasonable to me.

@yarikoptic
Contributor

Pinning would just be a band-aid with negative side effects. Having libraries adhere to semver properly (an API/ABI break means bumping the major version), and then having dependent libraries declare which major version they need along with a minimal version, is the way to go IMHO.
Can discuss more.

@yarikoptic
Contributor

Or, if the relationship is that tight and hdmf isn't used by anything else, absorb it into pynwb and say that independent hdmf is no longer

@rly
Contributor Author

rly commented Apr 28, 2021

@yarikoptic Thanks for the feedback! HDMF is used by a couple other small projects of ours, so absorbing it is not preferable.

After further reading and thought, I agree with @yarikoptic and think we should 1) accept that PyNWB is a living software package and have PyNWB require a major version of HDMF with a minimal version, and 2) be super strict about bumping up HDMF major version numbers when we make any breaking changes in HDMF (even ones "internal" to PyNWB and HDMF).
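A minimal sketch of that policy, assuming plain MAJOR.MINOR.PATCH version strings. The helper names `bump` and `accepted_by_dependent` are hypothetical and not part of HDMF or PyNWB, and the 2.4.0 minimum is used only as an example value.

```python
def bump(version, change):
    """Return the next version under semantic versioning.

    change: 'breaking' bumps major, 'feature' bumps minor, 'fix' bumps patch.
    """
    major, minor, patch = (int(p) for p in version.split("."))
    if change == "breaking":
        return f"{major + 1}.0.0"
    if change == "feature":
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


def accepted_by_dependent(version, minimal):
    """True if a dependent requiring '>=minimal, <next major' accepts `version`."""
    v = tuple(int(p) for p in version.split("."))
    m = tuple(int(p) for p in minimal.split("."))
    return v[0] == m[0] and v >= m


# A bug fix stays inside the range a dependent declared against 2.4.0 ...
fix_release = bump("2.4.0", "fix")
# ... while a breaking change moves to a new major version and is excluded.
breaking_release = bump("2.4.0", "breaking")

print(fix_release, accepted_by_dependent(fix_release, "2.4.0"))
print(breaking_release, accepted_by_dependent(breaking_release, "2.4.0"))
```

The scheme only works if the second half of the policy holds: a breaking change released as a minor or patch version would still be accepted by the dependent's range.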

If you care to read further thoughts (mostly for me)

I had initially opened this issue with the idea that PyNWB version x.y.z should be a reproducible software package targeted to end users. When a user reports a bug in PyNWB x.y.z, we would know exactly their environment and can trace the bug to PyNWB or one of its dependencies. If the bug is in HDMF, we would fix the bug and push new releases in both PyNWB and HDMF. To fully realize the vision of a monolithic PyNWB, PyNWB would have to pin all dependencies (and their dependencies and so on, i.e., pip freeze).

But this is in deep contrast to the idea that PyNWB is a library that is part of a larger ecosystem and is "living software". We do want PyNWB to be used in an environment with other packages without dependency conflicts, and we do want PyNWB to work with non-breaking updates/bugfixes from dependencies. In other words, do not pin versions. While this means that users and other software must track and manage their versions of both PyNWB and HDMF, this kind of package management is accepted by the community and is already a norm, e.g., when working with numpy, pandas, etc.
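Tracking both versions can be as simple as recording them alongside a bug report. A small sketch using the standard library's `importlib.metadata` (Python 3.8+); the function name `installed_versions` is hypothetical, and the distribution names "pynwb" and "hdmf" are assumed to match the names the packages are installed under.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("pynwb", "hdmf")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


# Print one line per package, suitable for pasting into an issue.
for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```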

So my initial statement of "I think it would be best for users not to have to list HDMF as a requirement or restrict versions of HDMF." was ill-informed (I am more used to monolithic end-user applications) and incompatible with key use cases of PyNWB.

Credit to this useful discussion too: jupyterhub/mybinder.org-user-guide#161

@rly rly closed this as completed Apr 28, 2021
4 participants