Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update entropy types #1915

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

Update entropy types #1915

wants to merge 10 commits into from

Conversation

victorlin
Copy link
Member

@victorlin victorlin commented Nov 21, 2024

Description of proposed changes

Updates TypeScript usage to be consistent with #1864, etc.

I thought of these changes after adding 6215591.

Checklist

This allows for more specific type narrowing.
Improves readability.
This allows some code editors to show the comment on hover.

I also added two new types (Phase, Frame) to replace some comments.
These were not consistent in the same file. I've also opted to not use
them in the tree component types.
The previous error message code indicated the possibility that this
property is unset. Express that directly in the type annotation and
simplify the error message code.
Checking against undefined is more direct.
@victorlin victorlin self-assigned this Nov 21, 2024
@nextstrain-bot nextstrain-bot temporarily deployed to auspice-victorlin-updat-jrdeox November 21, 2024 01:36 Inactive
end?: number
start?: number
segments?: JsonSegmentRange[]
strand?: Strand
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making sure this needed to become optional to match some usage somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 2d3a331 the matched usage was Object.prototype.hasOwnProperty.call(annotation, "strand") which to me indicates that it is optional. It's hard to tell from other usage because the entire JsonAnnotation object is represented by json.meta.genome_annotations here:

createGenomeMap(json.meta.genome_annotations),

entropy = entropyCreateState(json.meta.genome_annotations);

@jameshadfield might have a better idea.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's optional for the nuc annotation. You could argue it should be required, but we relaxed this and just assert that it's not negative:

if (nucAnnotation.strand==='-') throw new Error("Auspice can only display genomes represented as positive strand." +
"Note that -ve strand RNA viruses are typically annotated as 5' → 3'.");

For CDSs to be considered valid it must be "+" or "-":

const strand = annotation.strand;
if (!(strand==='+' || strand==='-')) {

At first, I disabled @typescript-eslint/consistent-type-assertions for
the type assertion in the original code, but that ended up polluting the
_frame() function with details unrelated to the direct purpose of the
function. Extracting the modulus operation into a separate function and
adding the details there felt like a better scope.
@victorlin victorlin temporarily deployed to auspice-victorlin-updat-jrdeox November 21, 2024 19:39 Inactive
@victorlin victorlin temporarily deployed to auspice-victorlin-updat-ih84hn December 6, 2024 17:48 Inactive
// enum Strand {'+', '-'} // other GFF-valid options are '.' and '?'
type Strand = string;
// '.' and '?' are GFF-valid options
type Strand = '+' | '-' | '.' | '?';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GFF spec allows '+' | '-' | '.' | '?' but our JSON schema doesn't use GFF spec here. There's some discussion I can dig up around this if you'd like (I wanted to use GFF spec) but Nextstrain uses None in python and null in JSON/JS instead of "." . Here's a salient comment:

/** GFF allows for strands '?' (features whose strandedness is relevant, but unknown) and '.' (features that are not stranded),
* which are represented by augur as '?' and null, respectively. (null comes from `None` in python.)

function calculateStackPosition(genes: Gene[], strand: (Strand|null) = null):number {
function calculateStackPosition(
genes: Gene[],
strand: Strand | null = null,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to a previous comment, null is probably not the right choice here (as it's a schema-valid strand). By the time this function is called we have already created cds objects (cdsFromAnnotation) and so restricted the strand to +/-, so in practice it doesn't mater. I'd suggest

  • (a) requiring the argument (and thus getting rid of the conditional if (strand) below)

and, potentially,

  • (b) using a more restricted type definition.

@@ -20,7 +20,7 @@ interface JsonAnnotation {
end?: number
start?: number
segments?: JsonSegmentRange[]
strand: Strand
strand?: Strand
Copy link
Member

@jameshadfield jameshadfield Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A broader comment / question about TS as it relates to the JSON dataset. Beyond being semantically valid JSON anything could be present in the JSON (we don't do formal schema validation within Auspice). So should any types relating to the dataset contents therefore allow any and every field to be optional, and also all values to be any type?

This relates to a7578d2, which proposes

-  if (typeof annotation.display_name === 'string') {
+  if (annotation.display_name !== undefined) {

(annotation here is direct from the JSON dataset). According to our schema display_name will always be unset or a string, but in reality it could be anything.

Comment on lines 335 to 352
return (positiveStrand ?
(start+phase-1)%3 :
Math.abs((end-phase-genomeLength)%3)) as Frame;
return positiveStrand ?
_mod3(start+phase-1) :
_mod3(end-phase-genomeLength);
}

/**
* Type-safe modulo operation. Return value is unsigned (i.e. non-negative).
*/
function _mod3(n: number): 0 | 1 | 2 {
if (!Number.isInteger(n)) {
throw new Error(`${n} is not an integer.`);
}

/* TypeScript cannot infer the exact range of values from a modulo operation,
* so it is manually provided.
*/
// eslint-disable-next-line @typescript-eslint/consistent-type-assertions
return (Math.abs(n) % 3) as 0 | 1 | 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth, I find these changes kind of unfortunate (I have no doubt they are just to appease tsc). I'd prefer expanding out the ternary return into a longer function with more comments than additional small functions. I like the cast the best!

The lack of an integer type is pretty frustrating (JS, not TS, to be fair). If the numbers supplied to this function aren't integers we've got bigger problems in Auspice than a floating point frame!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants