I was thinking we could take one (or both) of two approaches:
Add arguments to the function getNCBISeqID that specify additional query criteria for the call to the API (e.g. COI[Gene]). We could either make these named arguments, or allow them to be passed as ...
One drawback to querying on specific terms related to the genomic region is that, surprisingly, GenBank doesn't seem to be very consistent in its naming conventions (e.g. COI is also sometimes CO1, cyt ox 1, cytochrome oxidase subunit 1, etc.). There are two possible work-arounds:
Make the function download all sequences for a species and then go back in and try to figure out which ones belong to the same genomic regions and which one of those regions is the one the user wanted (e.g. by fuzzy matching key words about the region, or by attempting to align all sequences and seeing which ones align to any sequences that are clearly labeled as matching the region of interest). I admit, this was the approach I had in mind at first.
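To make the fuzzy-matching idea concrete, here is a minimal sketch in Python (used only for illustration; it is not code from this package). The synonym table holds just the COI spellings mentioned above, and the names `REGION_SYNONYMS`, `normalize`, `match_region`, and the 0.8 cutoff are all assumptions, not anything that exists in the code base.

```python
import difflib
import re

# Illustrative synonym table built from the spellings mentioned in this thread.
# It is an assumption for this sketch and is far from exhaustive.
REGION_SYNONYMS = {
    "COI": ["COI", "CO1", "cyt ox 1", "cytochrome oxidase subunit 1"],
}


def normalize(label):
    """Lowercase a label and collapse punctuation/whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()


def match_region(label, synonyms=REGION_SYNONYMS, cutoff=0.8):
    """Return the canonical region whose synonym best matches `label`.

    Returns None when no synonym reaches the similarity cutoff.
    The cutoff value is a guess and would need tuning on real records.
    """
    norm = normalize(label)
    best_region, best_score = None, 0.0
    for region, names in synonyms.items():
        for name in names:
            score = difflib.SequenceMatcher(None, norm, normalize(name)).ratio()
            if score > best_score:
                best_region, best_score = region, score
    return best_region if best_score >= cutoff else None
```

With this table, a record labeled "Cytochrome oxidase subunit I" would be grouped under COI, while an unrelated label such as "16S ribosomal RNA" would be left unassigned. Very short names like CO1 are easy to confuse with other short gene names, so the alignment-based check described above would probably still be needed as a second pass.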
All approaches have some drawbacks and some advantages.
Whatever we end up going with, I think there should be a way for the user to say either that they want one or a few specific regions, or that they just want a dump of all available sequences.
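One way such a switch could look is sketched below in Python (for illustration only). `build_term` is a hypothetical helper, not the actual signature of getNCBISeqID; the only things taken from the thread are the species-level query and field-tagged criteria such as COI[Gene]. Passing no regions yields a species-only query, i.e. the "dump everything" case.

```python
def _quote(value):
    """Wrap multi-word values in double quotes so they stay one search phrase."""
    return f'"{value}"' if " " in value else value


def build_term(species, regions=None, field="Gene"):
    """Build an Entrez-style search term for a species.

    regions=None (or an empty list) means "all sequences for the species";
    otherwise the listed region names are OR-ed together under `field`.
    Hypothetical helper for this sketch, not part of the package.
    """
    term = f"{_quote(species)}[Organism]"
    if regions:
        clause = " OR ".join(f"{_quote(r)}[{field}]" for r in regions)
        term += f" AND ({clause})"
    return term
```

For example, `build_term("Danaus plexippus", ["COI", "CO1"])` produces a term restricted to those two spellings, and `build_term("Danaus plexippus")` produces the unrestricted species query.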