[DOC] Function Inventory | add documentation w/ docstring #3

Open
51 of 72 tasks
the-mayer opened this issue Aug 29, 2024 · 6 comments
Labels: documentation (Improvements or additions to documentation, incl. R docstring/roxygen2), enhancement (New feature or request), good first issue (Good for newcomers), outreachy (for outreachy interns), package (R package dev)

Comments

@the-mayer
Collaborator

List of expected functions from molevol_scripts/R

Ensure all functions are present and accounted for. Use list depth to determine whether a function should be user-facing (i.e., tagged with @export).
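As a rough illustration of that convention, a top-level function would carry a roxygen2 block ending in `@export`, while a nested helper would be documented but kept internal with `@noRd`. The names `remove_empty_rows_sketch` and `is_blank_sketch`, their parameters, and their bodies are hypothetical stand-ins, not the package's actual code.

```r
#' Remove rows with an empty value in a given column
#'
#' @param df A data frame.
#' @param by_column Name of the column to check (character, length 1).
#' @return `df` without the rows where `by_column` is NA or blank.
#' @examples
#' remove_empty_rows_sketch(data.frame(Species = c("E. coli", "")), "Species")
#' @export
remove_empty_rows_sketch <- function(df, by_column) {
  # Hypothetical top-level function: user-facing, hence @export above
  df[!is_blank_sketch(df[[by_column]]), , drop = FALSE]
}

#' Flag NA or whitespace-only values (internal helper, not exported)
#' @noRd
is_blank_sketch <- function(x) {
  # Hypothetical nested helper: documented for developers only
  is.na(x) | !nzchar(trimws(as.character(x)))
}
```

With this layout, running `roxygen2::roxygenise()` would add only the exported function to NAMESPACE.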

Import and combining input files

  • scripts/convert_opinscls_tsv.R, using:
    • clean_clust_file
    • add_colnames (currently colnames.op_ins_cls and colnames.op_ins_cls.clus2table)
    • remove # rows and convert to columns w/ ID and clust_names
    • add_uniq_ids (to add GCA_ID, IPG, taxID columns based on AccNums)
    • add lineage
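The "remove # rows and convert to columns" step could look roughly like the sketch below. It assumes, without confirmation from the source, that each `#`-prefixed line names a cluster and that the lines after it list that cluster's members. The function name `clean_clust_lines_sketch` and the output column names are hypothetical.

```r
# Sketch only: the input layout ('#' header lines followed by member rows)
# is an assumption, not the documented format of the real cluster files.
clean_clust_lines_sketch <- function(lines) {
  is_header <- startsWith(lines, "#")
  # Running cluster index: increases by one at every header line
  clust_id <- cumsum(is_header)
  # Header text without the leading '#' and surrounding whitespace
  header_names <- trimws(sub("^#+", "", lines[is_header]))
  # Keep non-empty member rows that follow at least one header
  keep <- !is_header & nzchar(trimws(lines)) & clust_id > 0
  data.frame(
    ClustID = clust_id[keep],
    ClustName = header_names[clust_id[keep]],
    Row = lines[keep],
    stringsAsFactors = FALSE
  )
}

example_lines <- c("# clusterA", "acc1\tdom1", "acc2\tdom2", "",
                   "# clusterB", "acc3\tdom3")
clust_table <- clean_clust_lines_sketch(example_lines)
```

The `Row` column could then be split on tabs and given column names in a later step.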

Cleanup

  • repeat2s

    • repeated domains with (s)
    • alternative using map?
  • remove_tails

    • by DomArch
  • remove_empty_rows

    • by Species, DomArch, ClustName, GenContext
  • cleanup_clust

    • ! ClustName acts as DomArch?
    • remove start and end '+'s
    • domains_keep remove rows without query (reads query_domains, domains_keep)
    • domains_rename (reads domains_rename)
    • ignored? (reads clustnames_ignore)
    • repeat2s: repeated domains with (s)
    • remove_tails
    • remove empty rows?
  • cleanup_species

    • remove_empty_rows (!change it to an alert about AccNums w/ no lineage/spp)
    • removes special characters
    • check if empty rows/taxIDs are because of server retrieval errors!
  • cleanup_domarch

    • ignored (reads domains_ignore)
    • domains_keep remove rows without query (reads query_domains, domains_keep)
    • replaced domains (reads domains_rename)
    • remove start and end '+'s
    • repeat2s: repeated domains with (s)
    • remove empty rows?
    • remove_tails
  • cleanup_gencontext

    • reverse_operons
    • repeat2s: repeated domains with (s)
    • remove empty rows? risky. many eukaryotes don't carry gencontexts!
  • add_leaves

    • to_titlecase
    • convert_aln2fa | convert_aln2tsv
    • add_leaves | adding leaves based on AccNum, Lineage and Spp.
    • add DA to add_leaves too?
    • convert_accnum2fasta
    • filter_for_phylo
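Two of the cleanup steps above, removing start and end '+'s and marking repeated domains with (s), can be sketched together. This assumes domain architectures are '+'-separated strings and that only consecutive repeats are collapsed. The name `collapse_repeats_sketch` is hypothetical and the real repeat2s may behave differently.

```r
# Sketch only: assumes '+'-separated domain strings; not the package's repeat2s.
collapse_repeats_sketch <- function(domarch, sep = "+") {
  vapply(domarch, function(da) {
    if (is.na(da)) return(NA_character_)
    parts <- strsplit(da, sep, fixed = TRUE)[[1]]
    # Dropping empty pieces also removes leading and trailing separators
    parts <- parts[nzchar(parts)]
    if (length(parts) == 0) return("")
    # Run-length encode to find consecutive repeats of the same domain
    runs <- rle(parts)
    labels <- ifelse(runs$lengths > 1, paste0(runs$values, "(s)"), runs$values)
    paste(labels, collapse = sep)
  }, character(1), USE.NAMES = FALSE)
}

cleaned <- collapse_repeats_sketch(c("+PspA+PspA+Snf7+", "A+B+A", NA))
# cleaned is c("PspA(s)+Snf7", "A+B+A", NA)
```

Non-adjacent repeats such as "A+B+A" are left unchanged here, which may or may not match the intended behavior.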

Summary stats

  • count_bycol
  • generate_wordcount
    • elements2words
    • words2wc
    • filter_freq | mostly used within other functions
  • summary_bylin (for DA, GC)
    • summ_DA_byLin
    • summ_GC_byDALin
    • summ_GC_byLin
  • summary_stats (for DA, GC)
    • summ_DA
    • summ_GC
  • total_counts
  • find_paralogs
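The word-count pipeline (elements2words followed by words2wc) can be sketched in base R as below, assuming '+'-separated elements. The function names, the separator, and the output column names are illustrative assumptions rather than the package's actual interface.

```r
# Sketch only: hypothetical stand-ins for elements2words and words2wc.
elements_to_words_sketch <- function(elements, sep = "+") {
  # Split every non-missing element into its component words
  words <- unlist(strsplit(elements[!is.na(elements)], sep, fixed = TRUE))
  words[nzchar(words)]
}

words_to_counts_sketch <- function(words) {
  counts <- table(words)
  out <- data.frame(
    words = as.character(names(counts)),
    freq = as.integer(counts),
    stringsAsFactors = FALSE
  )
  # Most frequent first; ties broken alphabetically for reproducibility
  out <- out[order(-out$freq, out$words), , drop = FALSE]
  rownames(out) <- NULL
  out
}

wc <- words_to_counts_sketch(elements_to_words_sketch(c("A+B", "A+C+A", NA)))
```

A frequency cutoff in the spirit of filter_freq could then be a simple subset such as `wc[wc$freq >= 2, ]`.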

Plotting

  • upset_plot
    • should depend on generate_wordcount
  • lineage_DA_plot
  • lineage_GC_plot
  • lineage_domain_repeats_plot??
  • wordcloud
  • msa_tree
    • msa_pdf
    • phylotree?
  • prot_network
    • by DA/domains
    • by GC/DA
    • subsets by prot/domain
the-mayer added the documentation and enhancement labels on Aug 29, 2024
the-mayer self-assigned this on Aug 29, 2024
@jananiravi
Member

@SunSummoner, would you be interested in starting by generating a function map for the existing package? This will give us all an understanding of the dependent vs. independent functions and which of these need to be exported. Again, @the-mayer will be a good resource to check in with regarding this.
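For reference, one possible way to build such a function map is sketched below using `codetools::findGlobals()`, which lists the global functions a function body refers to. The helper `function_map_sketch` and the demo functions are hypothetical, and this is not necessarily the approach used in any PR for this issue.

```r
# Sketch only: builds a caller -> callee edge list for functions in one environment.
function_map_sketch <- function(env) {
  fn_names <- Filter(function(n) is.function(get(n, envir = env)), ls(env))
  edges <- lapply(fn_names, function(caller) {
    called <- codetools::findGlobals(get(caller, envir = env), merge = FALSE)$functions
    # Keep only calls to functions defined in the same environment
    callees <- intersect(called, fn_names)
    data.frame(from = rep(caller, length(callees)), to = callees,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, edges)
}

# Hypothetical demo functions standing in for package code
demo <- new.env()
local({
  helper_a <- function(x) trimws(x)
  helper_b <- function(x) toupper(x)
  wrapper <- function(x) helper_b(helper_a(x))
}, envir = demo)

edges <- function_map_sketch(demo)
# Functions that no other function calls are candidates for export
top_level <- setdiff(ls(demo), edges$to)
```

For an installed package, passing `asNamespace()` of the package name as `env` should work the same way, though that has not been checked against this repository.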

jananiravi added the good first issue and outreachy labels on Oct 1, 2024
@SunSummoner
Collaborator

> @SunSummoner, would you be interested in starting by generating a function map for the existing package? This will give us all an understanding of the dependent vs. independent functions and which of these need to be exported. Again, @the-mayer will be a good resource to check in with regarding this.

@jananiravi Yes, I will check in with him.

jananiravi changed the title from "Function Inventory" to "Function Inventory | add documentation w/ docstring" on Oct 2, 2024
jananiravi added the package label on Oct 4, 2024
@awasyn
Collaborator

awasyn commented Oct 7, 2024

Hi, if this helps, you can have a look at this function dependency graph I created with visNetwork. It is an interactive HTML file; when opened, it takes about 5 s to load the components. Select functions by id and zoom in to see their dependencies. You can also view all function dependencies at once. @jananiravi @the-mayer
function_dependency_graph.zip

Some screenshots are added below for context.
[Five screenshots, captured 2024-10-08 00-04-22 through 00-06-15; images not reproduced here]

@jananiravi
Member

@awasyn @SunSummoner, please remind us if there are open PRs for this automatic function dependency map generation. Thanks!

jananiravi changed the title from "Function Inventory | add documentation w/ docstring" to "[DOC] Function Inventory | add documentation w/ docstring" on Oct 26, 2024
@SunSummoner
Collaborator

> @awasyn @SunSummoner, please remind us if there are open PRs for this automatic function dependency map generation. Thanks!

@jananiravi None from my side yet.

@awasyn
Collaborator

awasyn commented Oct 27, 2024

> @awasyn @SunSummoner, please remind us if there are open PRs for this automatic function dependency map generation. Thanks!

I just pushed a PR for this. I hesitated before because I was under the impression that the task was assigned to another contributor. Sorry for the delay.
