New GWAS & QTL Trait Terms page #91

Merged
merged 2 commits into from
Feb 1, 2024

StevenCannon-USDA

New page at /tools/trait_list/ shows all terms that give GWAS and QTL results in the Trait Association Search tool (/tools/search/trait.html). Clicking on QTL or GWAS adjacent to a trait term populates the Trait Association Search web component and initiates that search.

@maxglycine maxglycine left a comment

The component does do a lookup. It populates a table that gives a brief synopsis of the paper. Unfortunately, it does not do anything else. There are no links to get from a paper to a gene, a QTL, or a map. It is essentially a dead end. Looking at the "Tools" list, I could not identify where I could use the copied study name to get any more data. Is this just a lack of imagination on my part? Until it is clear what to do with the report, the component is not complete.

@StevenCannon-USDA
Author

StevenCannon-USDA commented Jan 25, 2024

@maxglycine Hi Rex - thanks for having a look.
I agree that the Trait Association Search web component should link to content. But that's an issue regarding the component rather than this page of trait terms. What I think would be most efficient is to accept and merge this trait-terms page (unless you have other suggestions for it), and then start a new issue to request linkouts for the Trait Association Search component.

Two natural linkouts from the Trait Association Search are to the source data and to the GWAS or QTL page.

Pinging @sammyjava, @adf-ncgr, and @That-Thing about the best place(s) to start an issue to add these linkouts. I am guessing https://github.com/legumeinfo/microservices ?

@sammyjava

Or just link to the trait page itself (https://mines.legumeinfo.org/glycinemine/report.do?id=305000030) and let the user go from there, whether it be to a GWAS or to a QTLStudy. But yes, this would be the job of the microservices linkout service, I think.
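
As a rough illustration of such a linkout, the Python sketch below rebuilds the report URL quoted above from a mine name and an object id. The function name and constant are hypothetical, and the sketch assumes only the URL pattern visible in that one example; how a linkout service would look up the id for a given trait is not addressed here.

```python
from urllib.parse import urlencode

# Host taken from the example URL in the comment above.
MINES_BASE = "https://mines.legumeinfo.org"


def trait_report_url(mine: str, object_id: int) -> str:
    """Build a mine report URL (hypothetical helper, pattern copied from the example)."""
    query = urlencode({"id": object_id})
    return f"{MINES_BASE}/{mine}/report.do?{query}"


# Reproduces the URL given in the comment above.
print(trait_report_url("glycinemine", 305000030))
```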

@jd-campbell jd-campbell left a comment

I think this is a great start. Others have already commented that these pages lead to a dead end, and Sam has already made a suggestion.

Here are my general comments and questions for @StevenCannon-USDA:

  • I think the "gwas" and/or "qtl" after the trait should be changed to uppercase: "GWAS", "QTL".
  • The category sections with many traits (example: Seed - composition) are very hard to read. My opinion is that having multiple lines with just three periods separating the traits makes it easy for the eye to skip lines or miss something. Is there a way to organize the list of traits into multiple columns across the page?
  • Is the goal of this page just to link to the Trait Association Search page? Are there any other ideas about linking to Cmap-JS?

Thanks again for taking this on.

@adf-ncgr
Collaborator

@StevenCannon-USDA yes, put an issue into microservices, unless you think that legumeinfo/microservices#604 already covers what you have been wanting for this (and either way, apologies that the issue has been languishing!)

@StevenCannon-USDA
Author

@jd-campbell - thanks for the suggestions. I'll try to make those changes on Friday.

Is the goal of this page just to link to the Trait Association Search page? Are there any other ideas about linking to Cmap-JS?

I think that if we can identify suitable link targets, anything should be fair game for the linkout service. Maybe Cmap-JS -- but the Mine trait report is the most obvious first step. If you think of additional targets (URLs we can get to with available information), please add those in the issue below.

@adf-ncgr - thanks for the reminder about the existing issue. Now that a functioning component is in place, I think it's worth starting a new one: legumeinfo/microservices#616 .

@nathanweeks

How is _data/trait_terms.yml generated / updated? Does it risk becoming stale?

@StevenCannon-USDA
Author

How is _data/trait_terms.yml generated / updated? Does it risk becoming stale?

@nathanweeks - I extracted the terms from the two types of files that have such terms in the Data Store. The process could be semi-automated with a script; but until then, the list does risk becoming stale. The following are the command-line invocations:

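# Run from the Glycine max directory of the Data Store.
cd /usr/local/www/data/v2/Glycine/max

# QTL traits: take column 1, drop comment lines, strip numbering and
# qualifiers from the trait names, then count each distinct term.
zcat qtl/*/*qtl.tsv.gz | cut -f1 | sed '/^#/d' |
  perl -pe 's/ +\d+-\d+//' | perl -pe 's/, .+//; s/-\d+$//; s/\.\d+$//' |
  sort | uniq -c | perl -pe 's/^ *(\d+) +(.+)/$1\t$2/' > ~/soy_traits/QTL_traits.tsv

# GWAS traits: same processing, applied to the GWAS result files.
zcat gwas/*/*result*gz | cut -f1 | sed '/^#/d' |
  perl -pe 's/ +\d+-\d+//' | perl -pe 's/, .+//; s/-\d+$//; s/\.\d+$//' |
  sort | uniq -c | perl -pe 's/^ *(\d+) +(.+)/$1\t$2/' > ~/soy_traits/GWAS_traits.tsv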
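
If the extraction were ever scripted, the same normalization could be sketched in Python roughly as follows. This is an illustration only, not the code used to produce `_data/trait_terms.yml`: the function names and the sample rows are made up, and the regular expressions simply mirror the Perl substitutions above, applied once per line in the same order.

```python
import re
from collections import Counter


def normalize_trait(raw: str) -> str:
    """Mirror the Perl substitutions from the pipeline (hypothetical helper)."""
    name = raw.rstrip("\n")
    name = re.sub(r" +\d+-\d+", "", name, count=1)  # drop " 1-2" style numbering
    name = re.sub(r", .+", "", name, count=1)       # drop everything from ", " onward
    name = re.sub(r"-\d+$", "", name)               # drop a trailing "-3"
    name = re.sub(r"\.\d+$", "", name)              # drop a trailing ".2"
    return name


def count_traits(lines):
    """Count normalized first-column values, skipping '#' comment lines."""
    counts = Counter()
    for line in lines:
        first = line.rstrip("\n").split("\t", 1)[0]  # like `cut -f1`
        if first.startswith("#"):                     # like `sed '/^#/d'`
            continue
        counts[normalize_trait(first)] += 1
    return counts


# Made-up rows for illustration; real input would be read from the gzipped TSV files.
sample = [
    "#trait_name\tmarker",
    "Seed oil 1-2\tmarker_a",
    "Seed oil 1-3\tmarker_b",
    "Plant height, R8\tmarker_c",
    "Seed protein-2\tmarker_d",
]

for name, n in sorted(count_traits(sample).items()):
    print(f"{n}\t{name}")
```

A script along these lines would still need a step that reads the `.gz` files and another that writes the YAML, neither of which is shown.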

@nathanweeks

  • The category sections with many traits (example: Seed - composition) are very hard to read. My opinion is that having multiple lines with just three periods separating the traits makes it easy for the eye to skip lines or miss something. Is there a way to organize the list of traits into multiple columns across the page?

@jd-campbell That would seem more readable. Or perhaps a couple dropdown menus?

@StevenCannon-USDA It would seem worth automating as part of the site build; I could volunteer for the task if there's consensus. Or otherwise, as a stopgap, perhaps the instructions for generating those files could be added to the README.md?

@nathanweeks

Another question: are all QTL traits in AmiGO?

@sammyjava

Traits are created per the collection, and therefore the publication. If they correspond to ontology terms, they are assigned to those terms in the OBO file in the collection. But a Trait could be "Wilting flower that looks like Cruella DeVille". There is probably not an AmiGO term for that one. :)

@sammyjava

More detail: the concept of Trait is itself given by an ontology term (TO:0000387) and in the mines it extends Annotatable, which means it has an ontologyAnnotations collection:

<class name="Trait" extends="Annotatable" is-interface="true" term="https://browser.planteome.org/amigo/term/TO:0000387">
	<attribute name="description" type="java.lang.String"/>
	<attribute name="name" type="java.lang.String"/>
	<reference name="organism" referenced-type="Organism"/>
	<reference name="gwas" referenced-type="GWAS"/>
	<reference name="qtlStudy" referenced-type="QTLStudy"/>
	<collection name="qtls" referenced-type="QTL" reverse-reference="trait"/>
	<collection name="gwasResults" referenced-type="GWASResult" reverse-reference="trait"/>
</class>
<class name="Annotatable" is-interface="true">
	<attribute name="primaryIdentifier" type="java.lang.String" term="http://semanticscience.org/resource/SIO_000673"/>
	<collection name="ontologyAnnotations" referenced-type="OntologyAnnotation" reverse-reference="subject" term="http://semanticscience.org/resource/SIO_000255"/>
	<collection name="publications" referenced-type="Publication" reverse-reference="entities" term="http://purl.org/dc/terms/bibliographicCitation"/>
	<collection name="dataSets" referenced-type="DataSet" reverse-reference="entities" term="http://semanticscience.org/resource/SIO_001278"/>
</class>

The primaryIdentifier is created to be unique to the trait and study. It also has a name, which is something like "Seed color." If the trait has ontology terms, they are stored in the ontologyAnnotations collection of Trait.
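
To make the relationship concrete, here is a deliberately simplified Python mirror of the model fragment above. It is a sketch under assumptions, not the mine's actual Java classes: the Python class and field names are adapted from the XML, `OntologyAnnotation` is reduced to a single term id, and the identifiers and term id in the example are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OntologyAnnotation:
    # Simplified stand-in: only the ontology term id is kept.
    term_id: str


@dataclass
class Trait:
    primary_identifier: str          # unique to the trait and study
    name: str                        # human-readable, e.g. "Seed color"
    description: Optional[str] = None
    organism: Optional[str] = None   # species reference in the model
    ontology_annotations: List[OntologyAnnotation] = field(default_factory=list)

    def ontology_term_ids(self) -> List[str]:
        """Return the ids of any ontology terms attached to this trait."""
        return [a.term_id for a in self.ontology_annotations]


# Placeholder values, for illustration only.
annotated = Trait(
    primary_identifier="example-study.seed-color",
    name="Seed color",
    ontology_annotations=[OntologyAnnotation("TO:XXXXXXX")],
)
free_text = Trait(
    primary_identifier="example-study.wilting-flower",
    name="Wilting flower",
)

print(annotated.ontology_term_ids())
print(free_text.ontology_term_ids())
```

The second trait shows the free-text case: it is still a valid Trait, but its annotation collection is empty.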

Right now I'm working on stuff in graphql-server like retrieving ontology annotations for any Annotatable.

Note that Trait above has an organism (species) reference. That will allow us, if we want, to narrow down on species (max vs. soja, for example), which we cannot currently do in the web component. This is in the 5.1.0.4 mine model currently under development.

@StevenCannon-USDA
Author

@jd-campbell - I have tried to restructure the page as you suggested. Please have a look.

The data pre-processing for this display is kind of ugly now -- at least in my implementation (I manually wrapped all of the tables into three column groups). There may be a better way to do it (maybe with JavaScript or a plugin), but I've run out of time to fuss with it. My suggestion is to go ahead with this as a prototype, and work on the data processing in a separate branch - probably along the lines that you suggest, @nathanweeks (but I think that automating the process isn't a high priority right now; the trait terms will be pretty stable, at least until we get a lot more GWAS and QTL data loaded).
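
The manual wrapping step could in principle be replaced by a small pre-processing helper. The Python sketch below is one possible approach, not what the page currently does: the function name is hypothetical and the trait terms are sample values. It splits a flat list of terms into rows of three cells and pads the last row so every table row has the same number of columns.

```python
from typing import List, Sequence


def rows_of(terms: Sequence[str], columns: int = 3) -> List[List[str]]:
    """Split a flat list of terms into table rows of `columns` cells each."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    rows = []
    for start in range(0, len(terms), columns):
        row = list(terms[start:start + columns])
        row += [""] * (columns - len(row))  # pad the final row with empty cells
        rows.append(row)
    return rows


# Sample terms for illustration only.
terms = ["Seed oil", "Seed protein", "Seed weight", "Seed coat color"]
for row in rows_of(terms):
    print(" | ".join(row))
```

This fills the table across rows. If the terms should instead read alphabetically down each column, the slicing would need to change.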

@jd-campbell jd-campbell left a comment

The tabular output is much easier to read.

@StevenCannon-USDA StevenCannon-USDA dismissed maxglycine’s stale review February 1, 2024 18:11

Addressed with a new issue (legumeinfo/microservices#616), for addition of linkouts to the Trait Association Search component.

@StevenCannon-USDA StevenCannon-USDA merged commit 365a293 into main Feb 1, 2024
3 of 5 checks passed
@StevenCannon-USDA StevenCannon-USDA deleted the traits-list2 branch February 22, 2024 15:26
6 participants