Error in Script #33
I just did a test-run and I was able to execute this R code without problem.
library( jsonlite )
library( R.utils )
# CREATE A DATA FRAME OF ELECTRONIC FILERS FROM IRS JSON FILES
dat1 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2011.json")[[1]]
dat2 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2012.json")[[1]]
dat3 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2013.json")[[1]]
dat4 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2014.json")[[1]]
dat5 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2015.json")[[1]]
dat6 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2016.json")[[1]]
dat7 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2017.json")[[1]]
efiler.index <- rbind( dat1, dat2, dat3, dat4, dat5, dat6, dat7 )
head( efiler.index )
library( xml2 )
library( dplyr )
### EXAMPLE ORGANIZATIONS FROM EACH PERIOD
V_990_2014 <- "https://s3.amazonaws.com/irs-form-990/201543089349301829_public.xml"
V_990_2012 <- "https://s3.amazonaws.com/irs-form-990/201322949349300907_public.xml"
V_990EZ_2014 <- "https://s3.amazonaws.com/irs-form-990/201513089349200226_public.xml"
V_990EZ_2012 <- "https://s3.amazonaws.com/irs-form-990/201313549349200311_public.xml"
### GENERATE ALL XPATHS: V 990 2014
doc <- read_xml( V_990_2014 )
xml_ns_strip( doc )
doc %>% xml_find_all( '//*') %>% xml_path()
### GENERATE ALL XPATHS: V 990 2012
doc <- read_xml( V_990_2012 )
xml_ns_strip( doc )
doc %>% xml_find_all( '//*') %>% xml_path()
### GENERATE ALL XPATHS: V 990EZ 2014
doc <- read_xml( V_990EZ_2014 )
xml_ns_strip( doc )
doc %>% xml_find_all( '//*') %>% xml_path()
### GENERATE ALL XPATHS: V 990EZ 2012
doc <- read_xml( V_990EZ_2012 )
xml_ns_strip( doc )
doc %>% xml_find_all( '//*') %>% xml_path()
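The four "generate all XPaths" steps above repeat the same three calls, so they could be folded into one helper. The sketch below is only an illustration: `list_xpaths` and `sample_xml` are hypothetical names that do not appear in the original script, and the inline XML is a made-up stand-in so the example runs without network access. It uses the same xml2 calls as the script.

```r
library( xml2 )

# Hypothetical helper: return the XPath of every element in an XML source.
# `src` can be a URL, a file path, or a literal XML string.
list_xpaths <- function( src )
{
  doc <- read_xml( src )
  xml_ns_strip( doc )    # drops the default namespace in place, as in the script above
  xml_path( xml_find_all( doc, "//*" ) )
}

# Made-up stand-in for an e-file return, with a default namespace
sample_xml <- '<Return xmlns="http://www.irs.gov/efile"><ReturnHeader><TaxYr>2014</TaxYr></ReturnHeader><ReturnData/></Return>'

paths <- list_xpaths( sample_xml )
print( paths )
```

With the URLs defined earlier, the same helper could be called as `list_xpaths( V_990_2014 )`, and likewise for the other three example returns.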
Depending on how you're authenticating, you may be unable to get S3 data
via the s3:// protocol, while having no trouble downloading it anonymously
via https. When that happens, it's usually something about permissions on
your side--either your IAM role is too restrictive, or the client that
you're using to talk to S3 doesn't see your credentials. Do try to get it
working via S3, however--batch downloads via S3 are orders of magnitude
faster than individual https requests, which each require separate
handshakes between your machine and Amazon.
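If the s3:// route cannot be made to work, the anonymous https fallback described above might look something like the following. This is a minimal sketch, not a tested pipeline: `object_url` and `download_returns` are hypothetical names, and the URL pattern is inferred from the example URLs earlier in this thread (object ID followed by `_public.xml`). Each file is a separate https request, so this will be slow for large batches.

```r
# Hypothetical helper: build the public https URL for one or more object IDs.
# Keep the IDs as character strings; they have 18 digits and would lose
# precision if stored as numeric.
object_url <- function( object_id, base = "https://s3.amazonaws.com/irs-form-990/" )
{
  paste0( base, object_id, "_public.xml" )
}

# Hypothetical helper: download each return one at a time over https.
# Returns a data frame recording which downloads succeeded.
download_returns <- function( object_ids, dest_dir = "returns" )
{
  dir.create( dest_dir, showWarnings = FALSE, recursive = TRUE )
  urls <- object_url( object_ids )
  dest <- file.path( dest_dir, basename( urls ) )
  ok <- vapply( seq_along( urls ), function( i )
  {
    tryCatch( download.file( urls[i], dest[i], mode = "wb", quiet = TRUE ) == 0,
              error   = function( e ) FALSE,
              warning = function( w ) FALSE )
  }, logical( 1 ) )
  data.frame( object_id = object_ids, file = dest, downloaded = ok,
              stringsAsFactors = FALSE )
}

# Building a URL needs no network access
example_url <- object_url( "201543089349301829" )
print( example_url )
```

Calling `download_returns( c( "201543089349301829", "201322949349300907" ) )` would then attempt both files and report a `downloaded` flag per row, so failed requests can be retried.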
David Bruce Borenstein, PhD
On Mon, Apr 23, 2018 at 11:04 AM, Jesse Lecy wrote:
I just did a test-run and I was able to execute this R code without
problem.
Has anyone successfully downloaded all data recently? Getting the following error: