Suggestion: Manage notebooks in common way #2
Comments
We can surely use this technique there. So the output of the notebooks is visible in the documentation? That's great. The best thing would be to have this functionality directly in Jupyter. Maybe the guys who developed this technique could make a PR?
Yeah, we probably need a dedicated server with the data, where we can run all of these via a CI system and eventually upload the results to a web server (could be internal).
We might want to look at this example from what Netflix (!) is doing with automated notebooks:
It could serve as a nice platform for automated benchmarking for each version release. Thanks to @adonath for mentioning this.
We should definitely add @Bultako to the discussion here, as he did most of the work on |
Hi!
It is not a big effort to split out
We have filed several issues and comments in the
Yes.
In Gammapy we do the CI with Travis and fetch the needed datasets from a GitHub repo during the CI build. The notebooks are tested with It is a very simple method: we do not assert values as is done in nbval, and we can test with different kernels (gammapy versions). Papermill seems great.
(other) cons using notebooks:
I copy here the useful guidelines exposed in the Netflix post
I've re-organized the notebooks (there were only 2 so far) into a cleaner structure and written a better README based on this discussion. I'll open some separate issues for the parts that need improvement (e.g. the build script, an example of how to do a summary, where to put library function code, etc).
Also, the tests now use papermill instead of gammapy, but we still don't have a nice way to strip the output unless we also install gammapy (or we simply require that people strip the output before committing). I've enabled the requirement that we have a code review before accepting a PR, so we could just enforce it that way. The two notebooks that are included so far break both rules: they use data committed to the repo, which is no longer allowed, and their output is still not stripped.
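A papermill-based test run of the kind mentioned here could look roughly like the sketch below. The helper names `find_notebooks` and `run_notebook` and the directory layout are assumptions made for illustration, not the repository's actual build script; `papermill.execute_notebook` is papermill's entry point for executing a notebook and writing an executed copy.

```python
from pathlib import Path


def find_notebooks(root):
    """Return all notebooks under root, skipping Jupyter checkpoint copies."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )


def run_notebook(path, out_dir, kernel_name=None):
    """Execute one notebook with papermill and return the executed copy's path.

    Hypothetical helper: an exception raised by papermill counts as a
    failed test. Notebooks sharing a file name would overwrite each other
    in out_dir; a real script would need to handle that.
    """
    import papermill as pm  # imported lazily so discovery works without papermill

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    executed = out_dir / Path(path).name
    # The input notebook is left untouched; the executed copy (with output)
    # goes to out_dir, which should sit outside the scanned tree and
    # outside version control.
    pm.execute_notebook(str(path), str(executed), kernel_name=kernel_name)
    return executed
```

A CI job would then loop over `find_notebooks(...)` and call `run_notebook` on each path. Passing `kernel_name` is one way to run the same notebook against different kernels.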
It seems the nbstripout tool can be used instead of gammapy's solution:
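Whichever tool does the stripping, the "only stripped notebooks get committed" rule could also be checked automatically rather than only in code review. The following is a minimal sketch that assumes the standard version 4 `.ipynb` JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). The function names are hypothetical, and it does not call nbstripout or gammapy.

```python
import json


def has_output(notebook):
    """Return True if any code cell still carries outputs or an execution count."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        if cell.get("outputs") or cell.get("execution_count") is not None:
            return True
    return False


def unstripped(paths):
    """Return the subset of notebook paths that still contain output."""
    dirty = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            if has_output(json.load(handle)):
                dirty.append(path)
    return dirty
```

A CI step could fail the build whenever `unstripped(...)` returns a non-empty list.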
FYI -- We do have some issues with the |
I love the idea of benchmarks being a set of Jupyter notebooks, so they are easier to add and read.
The problem with notebooks in Git is always:
There is a nice solution from the Gammapy developers: their gammapy jupyter command lets you manage a set of notebooks in a common way:
- gammapy jupyter strip: removes output, so diffs make sense (use nbdime diff for even better diffs)
- gammapy jupyter black: runs the black code formatter on cells in all notebooks
- etc.
Then they only commit stripped notebooks to git, and automatically run the notebooks, producing documentation in Sphinx format for viewing.
Can we re-use this technique? Or perhaps ask them to split out that functionality into a stand-alone tool, so gammapy is not needed?
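To illustrate how small a stand-alone "strip" step can be: a notebook file is plain JSON, so output can be removed with the standard library alone. This is only a sketch assuming the version 4 notebook layout. It is not gammapy's implementation, and the names `strip_output` and `strip_file` are made up for this example.

```python
import copy
import json


def strip_output(notebook):
    """Return a copy of a v4 notebook dict with code-cell outputs removed."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


def strip_file(path):
    """Rewrite the notebook at path in place, without outputs."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(strip_output(notebook), handle, indent=1, ensure_ascii=False)
        handle.write("\n")
```

Only code cells are modified. Markdown cells and notebook metadata pass through unchanged, so diffs of stripped notebooks show source edits only.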