R package for accessing data.gov.au open data sets via API #2
Comments
This would make a nice package. There's a lot of data here, as you've noted. I think it would be good to have some focus, at least for the Unconf, so that it's achievable. A package that accesses a certain group of data would be achievable, or at least a good start could be made. For example, there are 637 shapefiles available, http://www.data.gov.au/dataset?tags=Earth+Sciences&res_format=SHP or, maybe more accessible, 7 arcgrid files, http://www.data.gov.au/dataset?tags=Earth+Sciences&res_format=arcgrid. I'm looking at spatial files since I tend to use those quite a bit, but I'm willing to help with other files. This type of data access is in the realm of my two R packages on CRAN right now.
Pulling the data in at all would be the first step, but a valuable second step would be getting them R-ready, e.g. converting to If that all works out too easily, we could put some effort towards displaying them neatly like http://location.sa.gov.au/viewer/ or http://www.aginsight.sa.gov.au/.
This would make an awesome R package. Yeah, I agree with @jonocarroll: importing the data sets would make them much easier to work with. So I guess the package would need at least two functions: one to list all the available data sets (with names, descriptions, and links), and a second to download and import a given data set. Like @jonocarroll says, we could also include a Shiny app display function to explore data sets. Do you think the package should implement caching similar to
Sounds like a useful feature, @jeffreyhanson -- especially if we're saving the transformed/R-ready versions (too?). The data should be accessible via the API which I believe https://github.com/ropensci/ckanr should handle okay. There's a good chance that there's lots we can get done in just 2 days on this, especially with a division of labour across the various aspects.
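For illustration, the two functions proposed above might be sketched roughly as follows. This assumes ckanr is installed and that the portal's CKAN API is reachable at the base URL shown, which is an assumption to check against the portal's own documentation. The names `dga_url`, `list_datasets`, and `fetch_resource` are hypothetical and do not come from any existing package.

```r
# Hypothetical helpers; the base URL is an assumption about where
# the portal's CKAN API lives.
dga_url <- "https://data.gov.au/data/"

# List data sets matching a query, one row per data set.
list_datasets <- function(query = "*:*", rows = 10, url = dga_url) {
  if (!requireNamespace("ckanr", quietly = TRUE)) {
    stop("Package 'ckanr' is needed to query the portal.", call. = FALSE)
  }
  # as = "table" asks ckanr to return the results as a data frame
  res <- ckanr::package_search(q = query, rows = rows, url = url, as = "table")
  res$results
}

# Download one resource to a local file and return its path.
# Importing (e.g. reading a shapefile) is left to the caller.
fetch_resource <- function(resource_url, dest_dir = tempdir()) {
  dest <- file.path(dest_dir, basename(resource_url))
  utils::download.file(resource_url, destfile = dest, mode = "wb")
  dest
}
```

Keeping the download step separate from the import step would let the format-specific readers (spatial, tabular, and so on) be divided up among contributors.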
@jeffreyhanson, for caching data, I might suggest looking at
@adamhsparks rappdirs looks really handy - thanks for the heads up! Perhaps we could list rappdirs under Suggests in the DESCRIPTION and use it if it's installed. Otherwise, it could save the data to the working directory (or a temporary directory?).
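A minimal sketch of that fallback, assuming rappdirs is listed under Suggests rather than Imports. The helper name `cache_dir` and the application name `"dataaus"` are hypothetical, and the temporary directory is used as the fallback here (the working directory would be the other option mentioned).

```r
# Hypothetical helper: choose where downloaded files are kept.
cache_dir <- function(appname = "dataaus") {
  if (requireNamespace("rappdirs", quietly = TRUE)) {
    # Per-user cache location appropriate to the operating system
    dir <- rappdirs::user_cache_dir(appname)
  } else {
    # Fallback: lives only as long as the R session
    dir <- file.path(tempdir(), appname)
  }
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  dir
}
```

With the temporary-directory fallback the cache does not survive a restart of R, so users without rappdirs would re-download on each session.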
@jeffreyhanson, already ahead of you mate. See my It uses exactly that functionality. The CRU data won't change; if the data here change, we'll need to check the local files against the server files.
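One way the local-versus-server check could look, as a sketch only: compare the cached file's modification time with a timestamp reported by the server. The helper name `needs_refresh` is hypothetical, and where the server timestamp comes from (for example a last-modified field in the resource metadata) is an assumption left to the caller.

```r
# Hypothetical helper: TRUE if the cached copy is missing or older
# than the timestamp the server reports for the resource.
# `remote_modified` is assumed to be a POSIXct value.
needs_refresh <- function(local_path, remote_modified) {
  if (!file.exists(local_path)) {
    return(TRUE)
  }
  file.mtime(local_path) < remote_modified
}

# Usage with made-up values: a file written just now is newer than
# a server timestamp from an hour ago, so no refresh is needed.
tmp <- tempfile()
writeLines("cached", tmp)
needs_refresh(tmp, Sys.time() - 3600)
```

Comparing timestamps is cheap but relies on the server metadata being kept up to date; comparing checksums would be a stricter alternative if the portal exposes them.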
As per ropensci/auunconf#16 -- this has a lot going for it, not the least of which is a similarity to #1.
The data is mostly well-organised with attached metadata, various formats, and proper attributions to the relevant department. It's an under-utilised resource as far as I can tell, and there are currently big pushes to better use this (e.g. GovHack challenges).