Regenerate notebooks for cell-type-wilms-tumor-06 #906
Comments
Are you referring to the content of the |
This is specifically referring to the notebooks
In fact, this situation describes all notebooks in |
I think as long as we have a copy of the code that was used to generate the rendered copies of the notebook (as in the Rmd files that live in
This seems like the move to me. I would just update any paths to save to |
I love this move! |
I actually disagree here. I think having out-of-date notebooks in an exploratory folder is generally fine, as long as there is documentation to say they are out of date. They are a record of analysis that was done, and having them available in the repository seems important for outside transparency. I would expect items in the I am not sure we need to rerun the notebooks with all exploratory steps at all? I would simply remove them from the workflow. If they really do need to be rerun, then we should be able to spin up enough compute to make it work, even if we have to do it outside Lightsail. |
Yes, they are rendered with results from the workflow, but they are part of an optional step in the workflow that performs exploratory analyses that do not directly contribute to the results. There are a few other notebooks like this: 03 which explores clusterings, 04 which explores label transfer results, and 07 which does the annotation. At a minimum, we should be keeping 07 and its HTML output in the repo since it's doing the annotation. Many notebooks in this module end up just inflating the repo size excessively without contributing much information, too:
|
Unfortunately, it is too late now for the size question: the files are still there unless we go through and prune them from the whole history. I agree that the correct move might have been to only include a representative set, and not all exploratory notebooks, but we can't really solve the size issue retrospectively. |
Here's how I'm going to approach this:
|
This issue is consolidating #875 as well as new todo items for cell-type-wilms-tumor-06. Currently, notebooks in this module are out-of-date.
Why are notebooks out of date?
Issue #875
New items
Additional notebooks are/will be out of date due to these currently-in-progress changes:
03_clustering_exploration.Rmd
00b_characterize_fetal_kidney_reference_Stewart.Rmd, 02a_label-transfer_fetal_full_reference_Cao.Rmd, and 02b_label-transfer_fetal_kidney_reference_Stewart.Rmd
Agenda
We'll want to re-run the module in full to regenerate notebooks, including all exploratory steps. It makes the most sense to wait until the next data release (currently planned for late November: https://github.com/AlexsLemonade/OpenScPCA-admin/issues/286) is out.
That said, these diffs make reviewing hard, because the GitHub UI will not even show which or how many files changed. When committing these, we'll therefore either have to do it in chunks across several PRs, or the reviewer can use GitKraken to review instead. In my experience, GitKraken will give a full diff view that GitHub will not render.
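If we go the chunking route, something like the following could help plan the PRs. This is only a rough sketch, not code that exists in the repo: the helper name `chunk_paths`, the chunk size, and the `.html` file names in the example are all hypothetical. It just splits a sorted list of changed files into groups small enough to review one PR at a time.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunk_paths(paths: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield lists of at most `chunk_size` paths, in sorted order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(sorted(paths))
    # islice pulls the next `chunk_size` items; an empty list ends the loop.
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        yield chunk


# Hypothetical rendered notebook names, for illustration only.
changed_files = [
    "03_clustering_exploration.html",
    "00b_characterize_fetal_kidney_reference_Stewart.html",
    "02a_label-transfer_fetal_full_reference_Cao.html",
    "02b_label-transfer_fetal_kidney_reference_Stewart.html",
]

# Assumed chunk size of 2 files per PR; adjust to whatever GitHub can render.
for pr_number, chunk in enumerate(chunk_paths(changed_files, 2), start=1):
    print(f"PR {pr_number}: {len(chunk)} file(s)")
    for path in chunk:
        print(f"  {path}")
```

In practice the list of changed files could come from `git diff --name-only` against the base branch rather than being hard-coded.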