Updating the extract galaxy tools script to add two more columns one for reduced EDAM operation and one for reduced EDAM topic #52
…dating the tools.tsv to include a reduced EDAM operation column and to produce some figures, more updates and figures are to follow
…perations reduction script along with the resulted updatedtools.tsv file
To use this in the CI, could you please:
func(df) -> df (with 2 new columns having reduced terms)
…for reduced EDAM operation and one for reduced EDAM topic, such that if a tool has multiple EDAM terms and some of them are on the same branch of the EDAM ontology, we keep only the ones at the leaf of that branch
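The reduction described above can be sketched without loading the ontology. This is only a minimal illustration: the `PARENTS` mapping is a hypothetical stand-in for the EDAM hierarchy (the script itself loads EDAM_1.25.owl), the parent/child relations in it are assumed for the example, and the helper names `ancestors` and `reduce_terms` do not come from the PR.

```python
# Toy stand-in for the EDAM hierarchy: child term -> list of parent terms.
# These relations are illustrative assumptions, not read from the ontology.
PARENTS = {
    "Sequence alignment": ["Alignment"],
    "Pairwise sequence alignment": ["Sequence alignment"],
    "Alignment": ["Operation"],
    "Visualisation": ["Operation"],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def reduce_terms(terms, parents=PARENTS):
    """Drop any term that is an ancestor of another term in the same list."""
    covered = set()
    for term in terms:
        covered |= ancestors(term, parents)
    # Keep the original order; a term survives only if no listed term descends from it.
    return [term for term in terms if term not in covered]


print(reduce_terms(["Alignment", "Pairwise sequence alignment", "Visualisation"]))
# → ['Pairwise sequence alignment', 'Visualisation']
```

Collecting all ancestors first and filtering afterwards avoids comparing every pair of terms, and it handles terms with several parents, which a real ontology may have.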
Thanks @EngyNasr
please simplify as suggested
bin/extract_galaxy_tools.py
Outdated
# Convert the cleaned row to a list of EDAM terms using the provided ontology
edam_ontology = get_ontology('https://edamontology.org/EDAM_1.25.owl').load()

terms = cleaned_row
why ?
bin/extract_galaxy_tools.py
Outdated
    # only keep the class if it is not a parent class
    if include_class:
        new_classes.append(cla)
except Exception as e:
what errors are we getting ?
bin/extract_galaxy_tools.py
Outdated
@@ -516,9 +562,32 @@ def export_tools(

    if add_usage_stats:
        df = add_usage_stats_for_all_server(df)

    # df_edam = df[df['To keep']==True]
    df_edam1 = df[df['EDAM operation'].notna()]
Sorry, but this seems far too complicated. Can you not just do
df[["EDAM operation (reduced)", "EDAM topic (reduced)"]] = df[["EDAM operation", "EDAM topic"]].map(reduced_edam_term)
where reduced_edam_term is a function that takes the EDAM term as input and returns the reduced form?
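One way this could look is sketched below. Only the name `reduced_edam_term` comes from the suggestion; its body is an assumption. The sketch assumes each cell holds comma-separated term labels, and it uses a small hypothetical `PARENTS` mapping (one parent per term) instead of the ontology. It fills the two new columns with `Series.map`, one column at a time, which applies the function to each cell just as the element-wise call above would.

```python
import pandas as pd

# Hypothetical single-parent hierarchy standing in for the EDAM ontology.
PARENTS = {
    "Pairwise sequence alignment": "Sequence alignment",
    "Sequence alignment": "Alignment",
}


def reduced_edam_term(cell):
    """Reduce one cell of comma-separated terms to its most specific terms."""
    if not isinstance(cell, str):
        return cell  # leave missing values (None/NaN) untouched
    terms = [t.strip() for t in cell.split(",") if t.strip()]
    covered = set()
    for term in terms:
        # Walk up the toy hierarchy and record every ancestor of this term.
        parent = PARENTS.get(term)
        while parent is not None:
            covered.add(parent)
            parent = PARENTS.get(parent)
    return ", ".join(t for t in terms if t not in covered)


df = pd.DataFrame(
    {
        "EDAM operation": ["Alignment, Pairwise sequence alignment", None],
        "EDAM topic": ["Genomics", "Genomics"],
    }
)

# Add one reduced column per source column.
for source in ["EDAM operation", "EDAM topic"]:
    df[f"{source} (reduced)"] = df[source].map(reduced_edam_term)

print(df["EDAM operation (reduced)"].tolist())
```

Because the function works on a single cell, it can be unit-tested without a DataFrame, and rows without EDAM annotations need no separate filtering step.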
The test worked on my local branch: paulzierep@6a01007
Updating the extract galaxy tools script to add two more columns one for reduced EDAM operation and one for reduced EDAM topic
Adding the Jupyter notebook and R scripts that are responsible for updating the tools.tsv to include a reduced EDAM operation column and to produce some figures, more updates and figures are to follow