Error in Script #33

Open
ats1958 opened this issue Apr 23, 2018 · 2 comments

Comments

ats1958 commented Apr 23, 2018

Has anyone successfully downloaded all data recently? Getting the following error:

Exception in thread "main" java.lang.RuntimeException: java.io.IOException: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: All access to this object has been disabled (Service: Amazon S3; Status Code: 403; Error Code:

lecy (Member) commented Apr 23, 2018

I just did a test run and was able to execute this R code without a problem.

library( jsonlite )
library( R.utils )



# CREATE A DATA FRAME OF ELECTRONIC FILERS FROM IRS JSON FILES

dat1 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2011.json")[[1]]
dat2 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2012.json")[[1]]
dat3 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2013.json")[[1]]
dat4 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2014.json")[[1]]
dat5 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2015.json")[[1]]
dat6 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2016.json")[[1]]
dat7 <- fromJSON("https://s3.amazonaws.com/irs-form-990/index_2017.json")[[1]]

efiler.index <- rbind( dat1, dat2, dat3, dat4, dat5, dat6, dat7 )

head( efiler.index )


library( xml2 )
library( dplyr )


### EXAMPLE ORGANIZATIONS FROM EACH PERIOD

V_990_2014 <- "https://s3.amazonaws.com/irs-form-990/201543089349301829_public.xml"

V_990_2012 <- "https://s3.amazonaws.com/irs-form-990/201322949349300907_public.xml"

V_990EZ_2014 <- "https://s3.amazonaws.com/irs-form-990/201513089349200226_public.xml"

V_990EZ_2012 <- "https://s3.amazonaws.com/irs-form-990/201313549349200311_public.xml"





### GENERATE ALL XPATHS: V 990 2014
doc <- read_xml( V_990_2014 )
xml_ns_strip( doc )
doc %>% xml_find_all( '//*') %>% xml_path()



### GENERATE ALL XPATHS: V 990 2012
doc <- read_xml( V_990_2012 )
xml_ns_strip( doc )
doc %>% xml_find_all( '//*') %>% xml_path()



### GENERATE ALL XPATHS: V 990EZ 2014
doc <- read_xml( V_990EZ_2014 )
xml_ns_strip( doc )
doc %>% xml_find_all( '//*') %>% xml_path()



### GENERATE ALL XPATHS: V 990EZ 2012
doc <- read_xml( V_990EZ_2012 )
xml_ns_strip( doc )
doc %>% xml_find_all( '//*') %>% xml_path()
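
The four blocks above repeat the same three steps: read the file, strip the default namespace, list every element path. A minimal sketch of folding them into one helper follows. The function name `list_xpaths` and the toy XML string are assumptions made for illustration and are not part of the original script; the toy string stands in for a real return so the example runs without network access.

```r
library( xml2 )

# Hypothetical helper (not in the original script): returns the XPath of
# every element in an XML document. `x` can be a URL, a file path, or a
# literal XML string, all of which read_xml() accepts.
list_xpaths <- function( x ) {
  doc <- read_xml( x )
  xml_ns_strip( doc )                       # strips the default namespace in place
  xml_path( xml_find_all( doc, "//*" ) )    # one path per element node
}

# Toy document; element names and namespace are illustrative only
toy <- '<Return xmlns="http://www.irs.gov/efile"><ReturnHeader><TaxYr>2014</TaxYr></ReturnHeader></Return>'

paths <- list_xpaths( toy )
print( paths )
```

With the helper defined, each example from the script becomes a single call such as `list_xpaths( V_990_2014 )`, assuming the file is still reachable at that URL.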

borenstein commented Apr 24, 2018 via email
