show_matching_datasets function restricts access for public histories #340

Open
mtekman opened this issue Oct 8, 2020 · 3 comments

mtekman commented Oct 8, 2020

Accessing datasets in a public history (that does not belong to me) is restricted in an inconsistent way.

When I try to match datasets using the show_matching_datasets function, it fails with a 403 error.

MWE:

from bioblend import galaxy

gi = galaxy.GalaxyInstance(url="https://usegalaxy.eu", key="<api-key>")  # replace with your own API key
dc = galaxy.datasets.DatasetClient(gi)
hc = galaxy.histories.HistoryClient(gi)

name_filter = "illumina.*"
# public_url = "https://usegalaxy.eu/u/sars-cov2-bot/h/2020-09-28-update"
# I found the ID below in the page source of the URL above
public_hid = "c36be749fd002b4d"

hc.show_matching_datasets(public_hid, name_filter=name_filter)

It waits for about a minute and then fails with:

403 error  "HistoryDatasetAssociation  is not accessible by user"

However, I can still iterate over the history and access the datasets one by one (albeit much more slowly):

from bioblend import galaxy

gi = galaxy.GalaxyInstance(url="https://usegalaxy.eu", key="<api-key>")  # replace with your own API key
dc = galaxy.datasets.DatasetClient(gi)
hc = galaxy.histories.HistoryClient(gi)

name_filter="illumina.*"
public_hid = "c36be749fd002b4d"

tmp = hc.show_history(public_hid)
set_ids = tmp['state_ids']['ok']
# One show_dataset() request per dataset ID, which is why this is slow
for dataset_id in set_ids:
    print(dc.show_dataset(dataset_id)['name'])

nsoranzo commented Oct 8, 2020

@mtekman Does HistoryClient.show_matching_datasets() work fine on one of your histories (not on a public one)? It may be that some of the datasets in that history are not public themselves.

I think you can speed up your workaround a lot by passing contents=True to show_history(); then you don't need the for loop over show_dataset().


mtekman commented Oct 8, 2020

Yes, this function works completely fine on my own histories.

Ah, let me try your solution

Edit: Yes, this does indeed work and is much faster -- thank you @nsoranzo !


mtekman commented Oct 8, 2020

Update using @nsoranzo's faster workaround:

from bioblend import galaxy

gi = galaxy.GalaxyInstance(url="https://usegalaxy.eu", key="<api-key>")  # replace with your own API key
dc = galaxy.datasets.DatasetClient(gi)
hc = galaxy.histories.HistoryClient(gi)

name_filter="illumina"
public_hid = "c36be749fd002b4d"

tmp = hc.show_history(public_hid, contents=True)
datasets = filter(lambda x: ('state' in x) and (x['state'] == 'ok') and (name_filter in x['name']), tmp)
for data in datasets:
    print(data['name'])
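
Note that the workaround above does a plain substring test, whereas the original MWE passed a regular expression ("illumina.*"). If regex matching is wanted, the filter could be written along the following lines. This is only a sketch: the helper name `match_ok_datasets` and the sample data are made up for illustration, and it assumes each entry of the `contents=True` listing is a dict that may carry `name` and `state` keys, as in the snippet above.

```python
import re


def match_ok_datasets(contents, name_pattern):
    """Keep entries whose state is 'ok' and whose name matches the regex.

    `contents` is assumed to be a list of dicts shaped like the result of
    hc.show_history(history_id, contents=True) used in this thread.
    """
    pattern = re.compile(name_pattern)
    return [
        item
        for item in contents
        # Entries without a 'state' key are skipped, like in the lambda above
        if item.get("state") == "ok" and pattern.match(item.get("name") or "")
    ]


# Hypothetical sample data standing in for a real history listing
sample = [
    {"name": "illumina_R1.fastq", "state": "ok"},
    {"name": "illumina_R2.fastq", "state": "error"},
    {"name": "nanopore.fastq", "state": "ok"},
    {"name": "entry without a state"},
]

matches = match_ok_datasets(sample, "illumina.*")
for item in matches:
    print(item["name"])
```

With a real history, `sample` would be replaced by `tmp` from the snippet above. `re.match` anchors only at the start of the name, so a pattern like "illumina.*" keeps every name that begins with "illumina".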
