feat: warn when standalone ref and tree ref don't match exactly #1474

Merged
merged 2 commits into from
Jun 5, 2024

Conversation

ivan-aksamentov
Member

@ivan-aksamentov ivan-aksamentov commented Jun 4, 2024

Following conversation in #1455 (comment)

Let's add a warning if the reference sequence, provided in any of the possible ways (FASTA in dataset files, CLI argument, Web URL param, or Web "Customization" interface), does not exactly match (as in string comparison) the .root_sequence.nuc in Auspice JSON.
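
For illustration, the comparison described above can be sketched roughly as follows. This is not the code from this PR (which lives in the Rust part of Nextclade); the function name `ref_matches_tree_root` is hypothetical, and treating a tree without a root sequence as "nothing to compare" is an assumption made only for this sketch.

```python
import json


def ref_matches_tree_root(ref_seq: str, auspice_json_text: str) -> bool:
    """Return True if ref_seq is string-identical to .root_sequence.nuc.

    Hypothetical helper for illustration; not part of Nextclade.
    """
    tree = json.loads(auspice_json_text)
    # .root_sequence may be absent or null in an Auspice JSON
    root_sequence = tree.get("root_sequence") or {}
    root_nuc = root_sequence.get("nuc")
    if root_nuc is None:
        # Assumption for this sketch: no root sequence means nothing to compare
        return True
    # Exact string comparison: case, gaps and length all have to agree
    return ref_seq == root_nuc


if __name__ == "__main__":
    tree_text = json.dumps({"root_sequence": {"nuc": "ACGT"}})
    for ref in ("ACGT", "ACGA"):
        if not ref_matches_tree_root(ref, tree_text):
            print("warning: reference sequence does not exactly match tree root sequence")
```

Because the comparison is exact, even a difference in letter case or a single trailing character would count as a mismatch in this sketch.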

The warning is emitted:

  • CLI: into console
  • Web: into dev console, for every WebWorker "thread" (we don't currently have a good mechanism to display warnings in the app, especially the ones coming from Rust)

The warning message is the following. Please suggest improvements (paste a full quote into reply message or feel free to modify in the code).

Click to expand

Nextclade detected that reference sequence provided does not exactly match reference (root) sequence in Auspice JSON.

This could be due to one of the reasons:

  • Nextclade dataset author provided reference sequence and reference tree that are incompatible
  • The reference tree has been constructed incorrectly
  • The reference sequence provided using --input-ref CLI argument is not compatible with the reference tree in the dataset
  • The reference tree provided using --input-tree CLI argument is not compatible with the reference sequence in the dataset
  • The reference sequence provided using &input-ref parameter in Nextclade Web URL is not compatible with the reference tree in the dataset
  • The reference tree provided using &input-tree parameter in Nextclade Web URL is not compatible with the reference sequence in the dataset

This warning signals that there is a potential for failures if the mismatch is not intended.


vercel bot commented Jun 4, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
nextclade ✅ Ready (Inspect) Visit Preview Jun 4, 2024 8:37pm

@ivan-aksamentov
Member Author

Should the warning also explain why the match is important and perhaps link the docs? It's kinda getting long. Not sure if the enumeration of all of the cases is also important here. It's basically a list of ways of how you can get ref and tree into the program.

ivan-aksamentov added a commit to nextstrain/nextclade_data that referenced this pull request Jun 4, 2024
This is the same check as in nextstrain/nextclade#1474, but during dataset indexing, to catch it earlier for datasets we control.
@ivan-aksamentov
Member Author

Added the same check during index rebuild in the data repo for our own datasets: nextstrain/nextclade_data#206

@ivan-aksamentov ivan-aksamentov merged commit f9d426c into master Jun 5, 2024
20 checks passed
@ivan-aksamentov ivan-aksamentov deleted the feat/warn-ref-mismatch branch June 5, 2024 16:53