ingest: Add build-configs for CI #56
Comments
Thinking out loud on options to "subsample" ingest data.
I'm assuming there are two goals of ingest CI: a. Ensure an update to the ingest workflow works with existing NCBI data Re: the 3 options above
Thanks for the thoughts here @victorlin! I agree we care more about (a), ensuring the ingest workflow runs as expected. @corneliusroemer has raised nextstrain/measles#46 for not relying on external services in CI, which aligns with option [1]. I'll probably port whatever we implement in measles into this repo.
Good discussion! I hadn't seen this here before the measles PR. I'm not sure how often datasets-cli changes its schema. I don't expect this to happen very often, but I might be wrong. If the zip package/schema changes, we could just start with test files after all ncbi datasets commands so that we don't rely on the stability of NCBI's somewhat internal API (not sure how internal the downloaded package is, we'll find out).
Copying @tsibley's relevant comments from the measles issue:
Stepping back to consider the goals of ingest CI that @victorlin astutely pointed out
I think (a) is the more frequent check within a pathogen repo, while (b) is important to test when we update the NCBI datasets version in docker-base/conda-base. I don't think there's a simple way to separate those two concerns with the current
From yesterday's dev chat:
Having reviewed this issue again, in light of my recent question in Slack about including Nextclade and geo data in the embedded example-data for ingest CI, I think doing that is most consistent with the reasoning in this issue and the previous comment from Jover about the dev chat summary.
We already have the build-configs for CI in the phylogenetic workflow, so I think it's reasonable to add build-configs for CI in the ingest workflow. This will make it simpler for the internal team to set up the GH Action workflow using pathogen-repo-ci.
Things to consider
{}
) of an empty config file (Originally posted by @tsibley in 44f27a5)