Germplasm and Genotype data #67
Comments
Absolutely 👍 I just don't know what sort of input to expect, so if you can share that we can take it from there. There are 3 components to dev seed a) guides on how to load the data if we can only get c) in for now that's fine, I think a) is great too, and I'm not sure how useful b) actually is...
I will definitely contribute both A and C :-) For the data, I noticed the current minified set focuses on Fraxinus excelsior. Is the preference for public datasets (difficult for germplasm data), or would real-life data that is anonymized as Tripalus databasica also be welcome? As far as what the data looks like:
This is actually why I switched to writing Python scripts that randomly generated biomaterials during the peer review process: too many questions about which records I used and why. I didn't go all the way and switch to a Tripalus databasica because I still wanted to use "real" biological sequences, since I wanted the feature annotations to make sense. So in your case we have to ask "will someone care if the biology makes sense?" In particular with variants, I guess. In an ideal world you would contribute to F. excelsior with fake or anonymous data. If that's not possible then I'd be OK with a separate organism folder for T. databasica if that's your preference. If you look at the biomaterial generator (https://github.com/statonlab/tripal_dev_seed/blob/master/bin/generate_biomaterials.py) you might be surprised at how easy it is to build a random metadata generator thanks to edit- well, since that file produces XML it's going to be a little extra confusing. How about https://github.com/statonlab/tripal_dev_seed/blob/master/bin/generate_expression.py which generates TSV data?
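To make the "random metadata generator" idea concrete for germplasm and genotypes, here is a minimal sketch in the same spirit as the TSV generator linked above. It is not taken from the repository's scripts: the column names, the `FEX-` naming scheme, the germplasm types, and the function names are all assumptions made up for illustration, and a real loader may expect a different layout.

```python
import csv
import io
import random

# Hypothetical column layouts; a real germplasm or genotype loader
# may expect different fields.
GERMPLASM_COLUMNS = ["germplasm_name", "accession", "genus", "species", "type"]
GENOTYPE_COLUMNS = ["variant_name", "germplasm_name", "genotype_call"]


def generate_germplasm(count, rng):
    """Return `count` fake germplasm records as dictionaries."""
    return [
        {
            "germplasm_name": f"FEX-{i:04d}",  # made-up naming scheme
            "accession": f"ACC{rng.randint(100000, 999999)}",
            "genus": "Fraxinus",
            "species": "excelsior",
            "type": rng.choice(["cultivar", "breeding line", "wild accession"]),
        }
        for i in range(1, count + 1)
    ]


def generate_genotypes(germplasm, variant_count, rng):
    """Return one random biallelic SNP call per germplasm per variant."""
    rows = []
    for v in range(1, variant_count + 1):
        # Pick two distinct bases to act as the two alleles of this variant.
        ref, alt = rng.sample(list("ACGT"), 2)
        for record in germplasm:
            alleles = sorted(rng.choice([ref, alt]) for _ in range(2))
            rows.append(
                {
                    "variant_name": f"SNP{v:05d}",
                    "germplasm_name": record["germplasm_name"],
                    "genotype_call": "/".join(alleles),
                }
            )
    return rows


def to_tsv(rows, columns):
    """Serialize a list of dictionaries as tab-separated text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the generated files are reproducible
    germplasm = generate_germplasm(5, rng)
    print(to_tsv(germplasm, GERMPLASM_COLUMNS), end="")
    print(to_tsv(generate_genotypes(germplasm, 3, rng), GENOTYPE_COLUMNS), end="")
```

Seeding the generator keeps the output stable between runs, which avoids the "which records did you use and why" question while still letting anyone regenerate a larger or smaller set.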
Are you hoping to integrate with the pre-generated DevSeed Seeder in the test suite? If so, would it be necessary to have all the files pre-generated for automatic loading? Or does your genotypes loader automatically load the resulting GFF/variants info? If it does, then I don't see the need to provide both unless you want it.
Yes please.
I've been enjoying Tripal DevSeed as a quick way to get a testing environment up but need data to test germplasm and genotype-based functionality... I am willing to contribute to this project to see such data added if there is interest :-)