Final Project #20

sunaynagoel · 2019-12-01T23:20:06Z

@Anthony-Howell-PhD I am running into the following error while knitting the .rmd document.

Quitting from lines 56-78 (Final_Project_Outline_Storyboard-Goel.Rmd)
Error in loadNamespace(name) : there is no package called 'lorem'
Calls: ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted

When I tried to install the package "lorem", the following output was produced.

 Package LibPath Version Priority Depends Imports LinkingTo Suggests Enhances License License_is_FOSS License_restricts_use OS_type Archs MD5sum
 NeedsCompilation Built

Anyone else running into this issue?

sunaynagoel · 2019-12-01T23:29:47Z

@Anthony-Howell-PhD. I was able to knit the file after including the following code.

install.packages("devtools")
devtools::install_github("gadenbuie/lorem")

And later by calling the library (with all other libraries).
library (lorem)
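
If the document is knitted more than once, it can help to install only when a package is actually missing. The sketch below assumes lorem is still hosted at gadenbuie/lorem on GitHub, as in the comment above. `needs_install()` and `ensure_lorem()` are hypothetical helper names made up for this example, not functions from any package.

```r
# Returns TRUE when a package is not yet installed (hypothetical helper).
needs_install <- function( pkg ) {
  !requireNamespace( pkg, quietly = TRUE )
}

# Installs lorem from GitHub only if it is missing (hypothetical helper).
# Best run once in the console rather than inside a knitted chunk.
ensure_lorem <- function() {
  if ( needs_install( "devtools" ) ) install.packages( "devtools" )
  if ( needs_install( "lorem" ) ) devtools::install_github( "gadenbuie/lorem" )
  invisible( !needs_install( "lorem" ) )
}

# ensure_lorem()     # uncomment to run the installation
# library( lorem )
```

Keeping the install step out of the knitted chunks means the dashboard does not try to reinstall packages every time it is rendered.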

Jigarci3 · 2019-12-02T04:07:33Z

@Anthony-Howell-PhD I might be completely off on this but I am trying to subset census.dats for my MSA. Here is my code

grep("^SEA", census.dats$msaname, value = TRUE)
these.sea <- census.dats$msaname == "SEATTLE-BELLEVUE-EVERETT, WA"
these.fips <- census.dats$fipscounty[ these.sea ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

sea.pop1 <-
  get_acs( geography = "tract", variables = "Median.HH.Value00", "Foreign.Born00", "Recent.Immigrant00", "Poor.English00", "Veteran00", "Poverty00", "Poverty.Black00", " Poverty.White00", "Poverty.Hispanic00", "Pop.Black00", "Pop.Hispanic00", "Pop.Unemp00", "Pop.Manufact00", "Pop.SelfEmp00", "Pop.Prof00", "Female.LaborForce00",
         state = "53", county = county.fips[state.fips=="53"], geometry = TRUE ) %>% 
         select( "TRTID10", estimate ) %>%
         rename( POP=estimate )

sea.pop2 <-
get_acs( geography = "tract", variables = "Median.HH.Value10", "Foreign.Born10", "Recent.Immigrant10", "Poor.English10", "Veteran10", "Poverty10", "Poverty.Black10", " Poverty.White10", "Poverty.Hispanic10", "Pop.Black10", "Pop.Hispanic10", "Pop.Unemp10", "Pop.Manufact10", "Pop.SelfEmp10", "Pop.Prof10", "Female.LaborForce10",
         state = "53", county = county.fips[state.fips=="53"], geometry = TRUE ) %>% 
         select( "TRTID10", estimate ) %>%
         rename( POP=estimate )

sea.pop <- rbind(sea.pop1, sea.pop2)

I am getting the following error: "Error in if (shift_geo) { : argument is not interpretable as logical"

I can't figure out how to correct this error or if I am on the right track with my attempt to only include Seattle data.

AntJam-Howell · 2019-12-02T05:15:35Z

@Jigarci3 You do not have to use the get_acs function to download data for the final project. The code chunk (below) gives you the 2000 and 2010 census variables. You have the census.dats dataframe that includes the tract ('TRTID10'), state ('state') and county ('county') information already. You need to subset the census.dats to include only the Seattle counties of your interest.

sunaynagoel · 2019-12-02T18:50:31Z

@Anthony-Howell-PhD The main (top horizontal) navigation bar is hiding the titles and descriptions of the widgets below it. Is there any way to customize it? I tried different things but could not achieve the desired result. Thanks.
I am attaching a screenshot.

lecy · 2019-12-02T18:54:53Z

You can create a custom Cascading Style Sheet (CSS) to moderate this behavior (you have not learned this yet), but the easiest solution is to simplify the menu bar.

Shorten the project title ("Community Analytics Practicum Extravaganza" is tongue-in-cheek, you can change it), and consider grouping some items (can you combine clustering, neighborhoods, and neighborhood change?).

sunaynagoel · 2019-12-02T19:02:04Z

@lecy Thank you. Shortening the menu bar helped.

sunaynagoel · 2019-12-03T03:07:44Z

I was wondering about limiting the decimal places in the table displayed using datatable() to 4 or 5. Will it affect the predictions?

AntJam-Howell · 2019-12-03T15:15:23Z

It is common to round to 2 or 3 decimal places, which should not have any noticeable effect on model outcomes or predictions.
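
One way to be sure that rounding cannot touch the model is to round a copy used only for display. This is a minimal sketch on made-up numbers; the column name `pov.rate` is hypothetical. If you use the DT package, its formatRound() function is another option, since it changes only how values are shown.

```r
# Toy table standing in for the data shown in the dashboard (values are made up).
dat <- data.frame( tract = c("A", "B", "C"),
                   pov.rate = c(0.123456, 0.234567, 0.345678) )

# Make a rounded copy for display only; the original keeps full precision.
dat.display <- dat
dat.display$pov.rate <- round( dat.display$pov.rate, 3 )

dat.display$pov.rate   # rounded values for the table
dat$pov.rate           # unrounded values still available for modeling
```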

-- Anthony Howell Asst. Prof. in Public Policy School of Public Affairs Arizona State University Faculty Profile <https://isearch.asu.edu/profile/3501621> (CV <https://www.dropbox.com/s/b1pxccpwxm6fats/Howell.CV.pdf?dl=0>)

etbartell · 2019-12-03T17:56:32Z

@Anthony-Howell-PhD I'm having this same issue with knitting the original rmd but it was not solved with the code provided above. When I try it with this code:

knitr::opts_chunk$set(  message=F, warning=F, echo=F )

install.packages("devtools")
devtools::install_github("gadenbuie/lorem")

#Load in libraries
library( tidycensus )
library( tidyverse )
library( ggplot2 )
library( plyr )
library( stargazer )
library( corrplot )
library( purrr )
library( flexdashboard )
library( leaflet )
library( mclust )
library( pander )
library( DT )
library( lorem )

I get the following error message:

When I try to simply use install.packages( "lorem" ), it tells me that "package ‘lorem’ is not available (for R version 3.6.1)". Do I need to download a different version of R? I thought we were all using the same version.

AntJam-Howell · 2019-12-03T18:06:39Z

@etbartell If you cannot download and load the lorem package, the easiest thing to do is to go through the .rmd file and remove the lorem calls. To do this, paste the following code into the search box of the .rmd file to find all instances of it: r lorem::ipsum(paragraphs = 1)

You can then delete these one by one or all at once. Just remember that every time you see that code, it marks a place for you to provide your own answer. You can still return to these places by searching for the <!--- symbol that denotes the instructions.

etbartell · 2019-12-03T18:51:39Z

That worked, thanks!

meliapetersen · 2019-12-03T21:54:14Z

I'm having a weird issue with my code from lab 4 (it didn't happen when I turned in the lab, but it's happening now).

I'm getting the error that I am not using an argument:

Error in rename(., POP = estimate) : unused argument (POP = estimate)

When running this code:

crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv",  stringsAsFactors=F, colClasses="character" )

these.seattle <- crosswalk$msaname == "SEATTLE-BELLEVUE-EVERETT, WA"
these.fips <- crosswalk$fipscounty[ these.seattle ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

seattle.pop <-
  get_acs( geography = "tract", variables = "B01003_001", state = "53", county = county.fips[state.fips=="53"], geometry = TRUE ) %>%
  select( GEOID, estimate ) %>%
  rename( POP = estimate )

URL <- "https://github.com/DS4PS/cpp-529-master/raw/master/data/ltdb_std_2010_sample.rds"
census.dat <- readRDS(gzcon(url( URL )))

# merge shapefile data with census data in new dataframe
seattle <- merge( seattle.pop, census.dat, by.x="GEOID", by.y="tractid" )
seattle2 <- seattle[ ! st_is_empty( seattle ) , ]
seattle.sp <- as_Spatial( seattle2 )
class( seattle.sp )

For the empirical framework portion of the dashboard.

Am I on the right track for this portion? I am also unclear on that as well. This was just the code I had from lab 4.
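
A note on the rename() error: this message is typical of a name clash. Both plyr and dplyr define rename(), and whichever package is loaded last wins. plyr's version takes a named vector through its `replace` argument, so it rejects `POP = estimate` as an unused argument. Assuming that is the cause here, one fix is to call dplyr::rename() explicitly. Another is to rename the column in base R, which does not depend on load order. A minimal sketch of the base R route on a made-up data frame:

```r
# Toy stand-in for the get_acs() result (values are made up).
pop <- data.frame( GEOID = c("53033000100", "53033000200"),
                   estimate = c(4200, 3100) )

# Rename 'estimate' to 'POP' without calling any rename() function,
# so it does not matter whether plyr or dplyr was loaded last.
names( pop )[ names( pop ) == "estimate" ] <- "POP"

names( pop )
```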

AntJam-Howell · 2019-12-03T22:04:46Z

@meliapetersen Sorry to hear that is happening. My suggestion is to focus on understanding how to subset the census.dats dataset to only your MSA of interest. Based on your code, your counties of interest are ("029" "033" "061"). The census.dats dataframe has the actual names of the counties, not numbers. This dilemma was intended to lead people to search online for county FIPS codes (see my Google search screenshot attached). The first option is a concordance (also attached below). You will have to match your county FIPS numbers to the names in the concordance, then subset those county names in your census.dats dataset.

Countyfipconcordance.pdf
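
The two steps described above (look up the county names, then subset) can be sketched in base R. Everything below is toy data: the layout of `concordance`, the data frame `census.toy`, and the spelling of the county names are assumptions, so check how counties are actually written in your census.dats before copying this.

```r
# Toy concordance: county FIPS codes and names (hypothetical layout).
concordance <- data.frame( fips = c("029", "033", "061", "063"),
                           name = c("Island County", "King County",
                                    "Snohomish County", "Spokane County"),
                           stringsAsFactors = FALSE )

# Toy stand-in for census.dats; the real county spellings may differ.
census.toy <- data.frame( TRTID10 = c("t1", "t2", "t3", "t4"),
                          state = "WA",
                          county = c("King County", "Spokane County",
                                     "Island County", "Snohomish County"),
                          stringsAsFactors = FALSE )

# Step 1: look up the names that belong to the MSA's county FIPS codes.
msa.fips <- c("029", "033", "061")
msa.names <- concordance$name[ concordance$fips %in% msa.fips ]

# Step 2: keep only the tracts located in those counties.
msa.dat <- census.toy[ census.toy$county %in% msa.names, ]
msa.dat$TRTID10
```

If county names repeat across states in your data, add a condition on the state column as well.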

meliapetersen · 2019-12-03T22:10:28Z

I see where I'm going wrong, thank you!

castower · 2019-12-03T23:13:13Z

@etbartell I ran into the same problem and found that entering the following code fixed it:

devtools::install_github("gadenbuie/lorem")

I read here for additional info: https://github.com/gadenbuie/lorem

Edit: oops, just realized this is the exact same code as above, I somehow overlooked that!

lepp12 · 2019-12-03T23:36:13Z

@Anthony-Howell-PhD

I'm running into a similar issue as others in the section requiring code from Lab 4. However, I'm not getting a descriptive error. When I run the following code:

crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv",  stringsAsFactors=F, colClasses="character" )

these.san <- crosswalk$msaname == "SAN DIEGO, CA"
these.fips <- crosswalk$fipscounty[ these.san ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

san.pop <-
  get_acs( geography = "tract", variables = "B01003_001", state = "06", county = county.fips[state.fips=="06"], geometry = TRUE ) %>%
  select( GEOID, estimate ) %>%

I only get "Error: "

AntJam-Howell · 2019-12-03T23:39:39Z

You do not need to download data using get_acs. You already have the data you need in census.dats. You only need to subset the census.dats data to your chosen MSA (which typically spans a few different counties). Please see the response to Melia above (pasted below) and let me know if that helps.

@meliapetersen Sorry to hear that is happening. My suggestion is to focus on understanding how to subset the census.dats dataset to only your MSA of interest. Based on your code, your counties of interest are ("029" "033" "061"). The census.dats dataframe has the actual names of the counties, not numbers. This dilemma was intended to lead people to search online for county FIPS codes (see my Google search screenshot attached). The first option is a concordance (also attached below). You will have to match your county FIPS numbers to the names in the concordance, then subset those county names in your census.dats dataset.

Countyfipconcordance.pdf

AntJam-Howell · 2019-12-03T23:39:54Z

@lepp12 please see above reply.

castower · 2019-12-04T00:07:44Z

@lepp12, if you don't want to have to Google the names, they are in the crosswalk dataset. Therefore, I just altered my data frame from the crosswalk to be as follows:

name.fips <- crosswalk$countyname[these.YOURCITY]
data.frame( state=state.fips, county=county.fips, FIPS=these.fips, name=name.fips)

This then gave me the names of each county.

AntJam-Howell · 2019-12-04T00:14:02Z

Nice find @castower

meliapetersen · 2019-12-04T00:19:28Z

I'm still having trouble understanding what I'm supposed to do with the names of the counties and pulling them from census.dats . I have identified the fip names, but is there a specific place I can refer to for an explanation of the code to pull just the select info for the rest of the dashboard? It feels like such a simple answer but I cannot seem to make sense of it. Thank you!

castower · 2019-12-04T00:23:27Z

@meliapetersen I used the filter function to just select the needed counties

sunaynagoel · 2019-12-04T01:54:14Z

I am a little lost reading the transition matrix. Here is a screenshot of my transition matrix.

AntJam-Howell · 2019-12-04T02:03:11Z

Example: looking at the last row, 80.6 percent of tracts classified as cluster 4 in 2000 were also classified as cluster 4 in 2010. 12.9 percent moved into cluster 3, 6.4 percent moved into cluster 2, and no tracts moved into cluster 1. How your clusters are defined will help to explain the meaning of these transitions. Note: the diagonal values give the share of tracts that remained in the same cluster in both 2000 and 2010.
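
To see where numbers like these come from, here is a small sketch that builds a transition matrix from made-up cluster labels. The vectors `cluster00` and `cluster10` are hypothetical and stand in for the cluster assignments in each year. The course .rmd may build its matrix differently.

```r
# Made-up cluster labels for ten tracts in each year.
cluster00 <- c(1, 1, 2, 2, 3, 3, 4, 4, 4, 4)
cluster10 <- c(1, 2, 2, 2, 3, 4, 4, 4, 4, 3)

# Rows are 2000 clusters, columns are 2010 clusters.
counts <- table( from2000 = cluster00, to2010 = cluster10 )

# Convert counts to row percentages so that each row sums to 100.
trans <- round( prop.table( counts, margin = 1 ) * 100, 1 )
trans
```

In this toy example the last row reads: of the four tracts in cluster 4 in 2000, 75 percent stayed in cluster 4 and 25 percent moved to cluster 3.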


etbartell · 2019-12-04T02:13:38Z

I'm having trouble understanding the change variables conceptually. If we were going for percent change, we would just use (2010var-2000var)/2000var, but since we're using the formula of 2000var/(2010var+1), I don't understand what the values are telling us. With the exception of home price, the other variables are all decimals, and adding 1 to the denominator completely alters its value. For example, if ForeigBornChange = 0.095, this doesn't mean that the foreign-born population changed by 9.5%. It's just what the formula spit out. I feel like I'm missing something. Does anyone have a solid grasp of what these variables mean?

castower · 2019-12-04T02:18:24Z

I have a question concerning the dorling maps. In Lab 4 we were creating them based on household income, but I'm not sure what we're clustering here. Should we group these by the cluster variable or something else? I may be overlooking a step, but I can't quite figure out what I'm plotting.

Thanks!

AntJam-Howell · 2019-12-04T02:45:06Z

@etbartell Nice question here and nice catch. Actually, it is more intuitive to have the change variables defined as 2010var/2000var rather than in the .rmd file which has it as 2000var/2010var. With respect to adding a constant to a variable, in this case it would be better to add a small value to the variables. So for home values, adding a 1 makes sense. When working with proportions it makes more sense to add a .01 instead of 1. I will update these changes to the .rmd file.
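
As a rough illustration of the point about the constant, the sketch below computes a 2010-to-2000 ratio with a small value added to the denominator. The function name `change.ratio` is hypothetical, and the exact placement of the constant in the updated .rmd is not shown in this thread; here it stays in the denominator, as in the original formula.

```r
# Ratio of the 2010 value to the 2000 value.
# 'small' guards against dividing by zero when the 2000 value is 0.
change.ratio <- function( var2000, var2010, small ) {
  var2010 / ( var2000 + small )
}

# Home values are in dollars, so adding 1 barely changes the ratio.
change.ratio( var2000 = 150000, var2010 = 300000, small = 1 )

# Proportions lie between 0 and 1, so a smaller constant (.01) is used.
change.ratio( var2000 = 0.10, var2010 = 0.15, small = 0.01 )

# Adding 1 to a proportion swamps the variable itself.
change.ratio( var2000 = 0.10, var2010 = 0.15, small = 1 )
```

A ratio above 1 means the variable grew between 2000 and 2010, and a ratio below 1 means it shrank. The last call shows why adding 1 to a proportion is misleading: the proportion rose, yet the ratio falls well below 1.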

sunaynagoel · 2019-12-04T02:51:09Z

This makes so much more sense now. Thanks @etbartell for asking this question and @Anthony-Howell-PhD for the help.

AntJam-Howell · 2019-12-04T02:51:38Z

@castower Besides household income, we also used dorling to map clusters in Lab 4. see the attached screenshot from lab 4 instructions.

castower · 2019-12-04T03:44:20Z

@Anthony-Howell-PhD Thank you!
I have another question about the data tab of the flexdashboard. Should there be labels on the blue tabs? I can't figure out how to name them.

castower · 2019-12-04T22:53:35Z

There is a way that it could be done. You could try to troubleshoot it with a Google search, but the easiest and perhaps more informative way is to change the variable names, either directly in the data or indirectly through ggplot. I googled "change variable names in ggplot" and the first result is the following link, which may get you started: https://stackoverflow.com/questions/52656493/renaming-variable-names-in-a-ggplot2

On Wed, Dec 4, 2019 at 2:28 PM Courtney wrote: Is there any way to set ggplot to not cut off the titles of my labels on the histogram grid? They look fine in RMarkdown, but when I knit the file some of the title labels are cut off.

Thank you!

One other question, I discovered that my data set has one massive outlier for the House Price change variable (there's an instance where in 2000 the median house price was only $300 and in 2010 it was $284,900). Should I exclude this outlier since it's skewing the data (especially the mean) or just mention it in my summary?

Thanks!

castower · 2019-12-05T00:40:30Z

If anyone else has questions about changing the grid labels, this website has great instructions: https://www.datanovia.com/en/blog/how-to-change-ggplot-facet-labels/

castower · 2019-12-05T01:10:58Z

Also, if you want to leave the variable names alone, you can use the fig.width chunk option in R Markdown to widen the figure.

sunaynagoel · 2019-12-05T02:49:44Z

Thanks @castower

AntJam-Howell · 2019-12-05T03:21:13Z

@lepp12 The problem is that you subset the data to only the 2000 variables, fit the model, and are then trying to predict on the full dataset, which includes both the 2000 and 2010 variables. Instead of Census2000 <- census.dats, I suggest creating two separate datasets, one for 2000 and one for 2010, and making sure the same variables are included in both.

AntJam-Howell · 2019-12-05T03:22:26Z

@castower There are several options, all of which are quite common, and you are free to choose what you think is best: winsorize the variable, trim the variable, remove the outlier outright, or take the log of the variable.
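
For anyone unsure what these options look like in practice, here is a minimal base R sketch on made-up data. The 5th and 95th percentile cutoffs are an assumption for illustration only; choose cutoffs that suit your own variable.

```r
# Made-up house price change ratios with one extreme value.
x <- c( seq( 0.8, 2.7, by = 0.1 ), 950 )

# Cutoffs at the 5th and 95th percentiles (an illustrative choice).
lims <- quantile( x, probs = c(0.05, 0.95) )

# Winsorize: pull values beyond the cutoffs back to the cutoffs.
x.wins <- pmin( pmax( x, lims[1] ), lims[2] )

# Trim: drop values beyond the cutoffs.
x.trim <- x[ x >= lims[1] & x <= lims[2] ]

# Log: compress the scale so the extreme value has less pull.
x.log <- log( x )

summary( x )        # the mean is dominated by the outlier
summary( x.wins )   # same number of observations, outlier capped
```

Winsorizing keeps every tract in the data, while trimming and outright removal reduce the sample size. Whichever you choose, it is worth mentioning in your write-up.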

castower · 2019-12-05T06:56:39Z

@Anthony-Howell-PhD thank you!

meliapetersen · 2019-12-05T16:18:56Z

@meliapetersen you can remove the ### Identifying Communities. There is no output to show there.

@Anthony-Howell-PhD It looks like below there is a place to "interpret results" before the cluster analysis. If there is no output to show, are there results to interpret?

AntJam-Howell · 2019-12-05T16:54:49Z

The main point of that section is to define and label each of the cluster groupings, which can be done on the side panel of each cluster output figure. You are free to keep in the ### identifying communities and add some basic description of what you did, i.e. perform cluster analysis, but there is no visualization for now. You can add your own visualization if you want though.

AntJam-Howell · 2019-12-05T16:55:12Z

@meliapetersen see above reply

meliapetersen · 2019-12-05T21:44:42Z

Hi, I'm having trouble identifying the variables needed for the dorling cartograms. Can someone help me make a little bit more sense of how to merge the spatial information with the Census2010 data frame? I keep going back to Lab 4 and what we've coded thus far, and I'm not fully understanding what I'm supposed to do. Thank you! :)

AntJam-Howell · 2019-12-05T21:51:52Z

@meliapetersen I would suggest getting the spatial data information using get_acs for your state. Once you have that, you will want to merge your census dataframe to the spatial dataframe.

SpatialData <-
get_acs( geography = "tract", variables = "B01003_001", state = "??", geometry = TRUE ) %>%
select( GEOID, estimate )

SpatialData <- merge( SpatialData, CensusDataframeNAME, by.x='GEOID', by.y='TRTID10' )

That should get you started
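
A small sketch of how merge() pairs the two tables may help. It matches rows on the key columns named in by.x and by.y, and the IDs must be stored in the same format in both tables. The data frames `spatial` and `census` below are toy stand-ins with made-up values, not the real objects.

```r
# Toy stand-ins: 'spatial' for the get_acs() result, 'census' for your census data frame.
spatial <- data.frame( GEOID = c("53033000100", "53033000200", "53033000300"),
                       estimate = c(4200, 3100, 5000),
                       stringsAsFactors = FALSE )
census <- data.frame( TRTID10 = c("53033000100", "53033000300"),
                      cluster = c(2, 4),
                      stringsAsFactors = FALSE )

# by.x / by.y name the key column in each table; both keys must be
# character strings in the same format for rows to match.
merged <- merge( spatial, census, by.x = "GEOID", by.y = "TRTID10" )
nrow( merged )

# IDs in 'spatial' that found no partner in 'census'.
setdiff( spatial$GEOID, census$TRTID10 )
```

If the merged table has zero rows, compare a few IDs from each table side by side; a dropped leading zero or a missing character is a common reason for keys not matching.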

lecy · 2019-12-05T22:22:04Z

@meliapetersen If helpful, the course GitHub site has information on how the dorling cartograms were built for labs 3 and 4. That code might be instructive:

https://github.com/DS4PS/cpp-529-master/blob/master/data/README.md

meliapetersen · 2019-12-05T22:23:02Z

Awesome, thank you! I think I figured it out, but I'm getting an error when I run my code:

census_api_key("b431c35dad89e2863681311677d12581e8f24c24")
options(tigris_use_cache = TRUE)

seattle.pop <-
  get_acs( geography = "tract", variables = "B01003_001", 
           state = "53", geometry = TRUE ) %>%
  select( GEOID, estimate )


seattle.pop$GEOID<-substring(seattle.pop$GEOID, 2)
seattle <- merge( seattle.pop, Census2010, by.x="GEOID", by.y="TRTID10" )


seattle2 <- seattle[ ! st_is_empty( seattle ) , ]


seattle.sp <- as_Spatial( seattle2 )
class( seattle.sp )

seattle.sp <- spTransform( seattle.sp, CRS("+init=epsg:3395"))
seattle.sp <- seattle.sp[ seattle.sp$POP != 0 & (! is.na( seattle.sp$POP )) , ]

seattle.sp$pop.w <- seattle.sp$POP / 9000 # max(msp.sp$POP)   # standardizes it to max of 1.5
seattle_dorling <- seattle_dorling( x=seattle.sp, weight="pop.w", k=0.05 )

tm_shape( seattle_dorling ) + 
  tm_polygons( size="POP", col="cluster", n=4, style="cat", palette="Spectral")

Error:

Error in st_cast_sfc_default(x) : list item(s) not of class sfg

lecy · 2019-12-05T22:31:22Z

I think you are missing an argument here:

substring( seattle.pop$GEOID, 2 )

You usually have a starting position and ending position for the substring() function.

What does the Census2010 tract ID TRTID10 look like? Is it state, county, and tract FIP IDs, or just one of them?

SS-CCC-TTTTTT
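
For reference, a quick sketch of how the two base R functions behave on an 11-character tract ID. substr() needs both a start and a stop position, while substring() defaults its end position to the end of the string, so it runs with only a start position. The GEOID value is just an illustrative example.

```r
geoid <- "53033000100"   # example 11-character tract GEOID: SS-CCC-TTTTTT

substr( geoid, 1, 2 )    # state FIPS
substr( geoid, 3, 5 )    # county FIPS
substr( geoid, 6, 11 )   # tract code

# substring() with only a start position keeps everything from there to the end,
# so starting at 2 silently drops the first character of the ID.
substring( geoid, 2 )
nchar( substring( geoid, 2 ) )
```

Whether dropping the first character is right depends on how TRTID10 is stored, which is why the format question above matters for the merge.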

RickyDuran · 2019-12-05T22:48:55Z

I am having trouble with the same area (creating dorling):

census_api_key("42bf5fcc6e6a6f05ebe97a0e647a5216a708613a")

aus.pop <-
get_acs( geography = "tract", variables = "B01003_001",
         state = "48", county = county.fips[state.fips=="48"], geometry = TRUE ) %>% 
         select( GEOID, estimate )

aus <- merge( aus.pop, austin.data, by.x="GEOID", by.y="TRTID10" )

aus.sp <- as_Spatial( aus )

class( aus.sp )

aus.sp <- spTransform( aus.sp, CRS("+init=epsg:3395"))

aus.sp <- aus.sp[ aus.sp$POP != 0 & (! is.na( aus.sp$POP )) , ]

aus_dorling <- cartogram_dorling( x=aus.sp, weight="pop.w", k=0.05 )

I get the error:
Error in packcircles::circleRepelLayout(x = dat.init, xysizecols = 1:3, : all sizes are missing and/or non-positive

meliapetersen · 2019-12-05T22:52:16Z

@lecy So I noticed that I didn't have the county = county.fips[state.fips=="53"] argument in my seattle.pop code, but I don't know if that's what you were talking about, because adding it didn't fix the issue. I'm still getting the same error.
code here:

census_api_key("b431c35dad89e2863681311677d12581e8f24c24")
options(tigris_use_cache = TRUE)

seattle.pop <-
  get_acs( geography = "tract", variables = "B01003_001", 
           state = "53", county = county.fips[state.fips=="53"], geometry = TRUE ) %>%
  select( GEOID, estimate )


seattle.pop$GEOID<-substring(seattle.pop$GEOID, 2)
seattle <- merge( seattle.pop, Census2010, by.x="GEOID", by.y="TRTID10" )


seattle2 <- seattle[ ! st_is_empty( seattle ) , ]


seattle.sp <- as_Spatial( seattle2 )
class( seattle.sp )

seattle.sp <- spTransform( seattle.sp, CRS("+init=epsg:3395"))
seattle.sp <- seattle.sp[ seattle.sp$POP != 0 & (! is.na( seattle.sp$POP )) , ]

seattle.sp$pop.w <- seattle.sp$POP / 9000 # max(msp.sp$POP)   # standardizes it to max of 1.5
seattle_dorling <- seattle_dorling( x=seattle.sp, weight="pop.w", k=0.05 )

tm_shape( seattle_dorling ) + 
  tm_polygons( size="POP", col="cluster", n=4, style="cat", palette="Spectral")

I'm not quite sure I understand what I need to add to the substring argument.

lecy · 2019-12-05T23:06:56Z

@RickyDuran Do you have a POP variable? I might have renamed it from the default census name.

Did you create the weighted pop variable?

phx$pop.w <- phx$POP / 10000   # standardizes it to max of 1.5

You might have to adjust the denominator depending on the max population. If you also have an NA or a 0 for population it could create the error you are getting. You should drop those polygons (filter them out) before the conversion.

summary( phx$POP )
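
The filter-then-weight order described above can be sketched on a toy data frame. The values are made up, and a plain data frame is used in place of the spatial object so the example runs without any packages.

```r
# Made-up tract populations, including a zero and a missing value.
toy <- data.frame( GEOID = c("a", "b", "c", "d"),
                   POP = c(4500, 0, NA, 9000) )

# Drop tracts with missing or zero population before building the cartogram.
toy <- toy[ !is.na( toy$POP ) & toy$POP != 0, ]

# Scale population into a weight; adjust the denominator to your maximum.
toy$pop.w <- toy$POP / 10000

summary( toy$pop.w )
```

After this step every weight is positive, which is what the circle layout needs.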

lecy · 2019-12-05T23:18:48Z

@meliapetersen The get_acs() function requires both state and county fips. Here is how they are being generated from the original GEOID (which is state-county-tract FIPS combined):

crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv",  stringsAsFactors=F, colClasses="character" )
these.msp <- crosswalk$msaname == "MINNEAPOLIS-ST. PAUL, MN-WI"
these.fips <- crosswalk$fipscounty[ these.msp ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

data.frame( these.fips, state.fips, county.fips ) %>% pander()

	these.fips	state.fips	county.fips
1	27003	27	003
2	27019	27	019
3	27025	27	025
4	27037	27	037
5	27053	27	053
6	27059	27	059
7	27123	27	123
8	27139	27	139
9	27141	27	141
10	27163	27	163
11	27171	27	171
12	55093	55	093
13	55109	55	109

If your MSA spans two states then you need to split these codes into two separate calls to get_acs(), I believe. One for each state using the corresponding county FIPS. Recall that county FIPS are not unique since each state will have a 001, 002, etc.

> county.fips
 [1] "003" "019" "025" "037" "053" "059" "123" "139" "141" "163" "171" "093"
[13] "109"
> county.fips[ state.fips=="55" ]
[1] "093" "109"

Substring is pulling out each FIPS.
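
For an MSA that spans more than one state, one way to organize the separate calls is to group the county codes by state first. This is a sketch using a shortened version of the FIPS list above; the commented loop shows how each group might feed its own get_acs() call and is not run here.

```r
these.fips <- c("27003", "27019", "27025", "55093", "55109")

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

# Group the county codes by state: one list element per state FIPS.
by.state <- split( county.fips, state.fips )
by.state

# Each element could then feed its own get_acs() call, for example:
# for( s in names( by.state ) ) {
#   get_acs( geography = "tract", variables = "B01003_001",
#            state = s, county = by.state[[ s ]], geometry = TRUE )
# }
```

The results of the separate calls would then need to be stacked, for example with rbind(), before merging with the census data.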

meliapetersen · 2019-12-05T23:29:49Z

@lecy So I fixed the issue I was having, and now it's giving me a different error. I added:

crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv",  stringsAsFactors=F, colClasses="character" )

these.seattle <- crosswalk$msaname == "SEATTLE-BELLEVUE-EVERETT, WA"
these.fips <- crosswalk$fipscounty[ these.seattle ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

name.fips <- crosswalk$countyname[these.seattle]

census_api_key("b431c35dad89e2863681311677d12581e8f24c24")
options(tigris_use_cache = TRUE)

And now it is giving me an issue with this line of code:


seattle.sp$pop.w <- seattle.sp$POP / 9000 # max(msp.sp$POP)   # standardizes it to max of 1.5
seattle_dorling <- cartogram_dorling( x=seattle.sp, weight="pop.w", k=0.05 )

Giving me this error:
[1] "SpatialPolygonsDataFrame"
attr(,"package")
[1] "sp"
Error in x@polygons[[1]] : subscript out of bounds

Here is both code chunks now:

crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv",  stringsAsFactors=F, colClasses="character" )

these.seattle <- crosswalk$msaname == "SEATTLE-BELLEVUE-EVERETT, WA"
these.fips <- crosswalk$fipscounty[ these.seattle ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

name.fips <- crosswalk$countyname[these.seattle]

census_api_key("b431c35dad89e2863681311677d12581e8f24c24")
options(tigris_use_cache = TRUE)

seattle.pop <-
  get_acs( geography = "tract", variables = "B01003_001", 
           state = "53", county = county.fips[state.fips=="53"], geometry = TRUE ) %>%
  select( GEOID, estimate )


seattle.pop$GEOID<-substring(seattle.pop$GEOID, 1)
seattle <- merge( seattle.pop, Census2010, by.x="GEOID", by.y="TRTID10" )


seattle2 <- seattle[ ! st_is_empty( seattle ) , ]


seattle.sp <- as_Spatial( seattle2 )
class( seattle.sp )

seattle.sp <- spTransform( seattle.sp, CRS("+init=epsg:3395"))
seattle.sp <- seattle.sp[ seattle.sp$POP != 0 & (! is.na( seattle.sp$POP )) , ]

seattle.sp$pop.w <- seattle.sp$POP / 9000 # max(msp.sp$POP)   # standardizes it to max of 1.5
seattle_dorling <- cartogram_dorling( x=seattle.sp, weight="pop.w", k=0.05 )

tm_shape( seattle_dorling ) + 
  tm_polygons( size="POP", col="cluster", n=4, style="cat", palette="Spectral")

JaesaR · 2019-12-05T23:44:01Z

I think you are missing an argument here:
substring( seattle.pop$GEOID, 2 )
You usually have a starting position and ending position for the substring() function.

What does the Census2010 tract ID TRTID10 look like? Is it state, county, and tract FIP IDs, or just one of them?
SS-CCC-TTTTTT
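To illustrate the point about substring(), here is a small base R sketch. The GEOID value is made up for illustration and assumes the 11-character state-county-tract layout.

```r
geoid <- "53033000100"   # illustrative tract GEOID: SS + CCC + TTTTTT

whole <- substring( geoid, 1 )   # start at 1, no end given: the string is unchanged
rest  <- substring( geoid, 2 )   # start at 2: drops only the first character

state  <- substr( geoid, 1, 2 )    # state part
county <- substr( geoid, 3, 5 )    # county part
tract  <- substr( geoid, 6, 11 )   # tract part

print( c( whole = whole, rest = rest, state = state, county = county, tract = tract ) )
```

So substring( x, 1 ) leaves GEOID untouched. Whether any trimming is needed depends on how the IDs in the other table are formatted.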

I am having the same issue that Melia is having, but adding the code she added to fix this issue did not work for me.

My code looks like this:

crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv",  stringsAsFactors=F, colClasses="character" )

these.chi <- crosswalk$msaname == "CHICAGO, IL"
these.fips <- crosswalk$fipscounty[ these.chi ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

name.fips <- crosswalk$countyname[these.chi]

census_api_key("624bc0325068577dab800279b9251a06f1200af3")
options(tigris_use_cache = TRUE)
chi.pop <-
get_acs( geography = "tract", variables = "B01003_001",
         state = "06", county = county.fips[state.fips=="17"], geometry = TRUE ) %>% 
         select( GEOID, estimate ) %>%
         dplyr::rename(POP = estimate)

# merge shapefile data with census data in a new data frame

chi.pop$GEOID<-substring(chi.pop$GEOID, 2)
chi <- merge( chi.pop, census.dats, by.x="GEOID", by.y="TRTID10" )
chi2 <- chi[! st_is_empty(chi), ]
chi.sp <- as_Spatial( chi2 )
class( chi.sp )

# project map and remove empty tracts
chi.sp <- spTransform( chi.sp, CRS("+init=epsg:3395"))
chi.sp <- chi.sp[ chi.sp$POP != 0 & (! is.na( chi.sp$POP )) , ]

# convert census tract polygons to dorling cartogram
chi.sp$pop.w <- chi.sp$POP / 9000 # max(msp.sp$POP)   # standardizes it to max of 1.5
chi_dorling <- cartogram_dorling( x=chi.sp, weight="pop.w", k=0.05 )

tm_shape( chi_dorling ) + 
  tm_polygons( size="POP", col="cluster", n=4, style="cat", palette="Spectral")

plot(chi.sp)

And is returning the error: "Error in st_cast_sfc_default(x) : list item(s) not of class sfg"

I do not understand what you mean about the substring argument not being complete. Am I supposed to add the FIPS within that argument?

RickyDuran · 2019-12-06T00:01:18Z

@lecy

I believe I have pop data. I am doing the exact same thing for downloading shapefiles with population data as in lab 4, but when I leave in "rename( POP=estimate )" in

aus.pop <- 
get_acs( geography = "tract", 
variables = "B01003_001", 
state = "48", 
county = county.fips[state.fips=="48"], 
geometry = TRUE ) %>% 
select( GEOID, estimate ) %>% 
rename( POP=estimate )

I get ERROR in rename(POP = estimate) : (POP=estimate) not used

When I take it out, I also get Error in x@polygons[[1]] : subscript out of bounds at:
aus.sp$pop.w <- aus.sp$POP / 9000

AntJam-Howell · 2019-12-06T01:02:46Z

Hi everyone, As many of you are experiencing difficulties recreating a dorling map with a new dataset, I’ve decided to make the dorling map optional. Part of the problem is that each chosen MSA often has unique problems that require unique troubleshooting, which can cost quite a bit of time. Given the fast-approaching deadline and the other requirements of the lab, it’s important to push through and get the remaining parts completed. For anyone who has successfully created the dorling map, a 10% bonus will be applied to your final project. For those who opt not to create the dorling map, there is no penalty. Hope that helps relieve the workload stress. Best

-- Anthony Howell School of Public Affairs Arizona State University (W) www.tonyjhowell.com

castower · 2019-12-06T01:28:16Z

@Anthony-Howell-PhD I just finished recording my video and it came out to right at 24 mins and 53 seconds. Is this okay?

AntJam-Howell · 2019-12-06T01:31:54Z

Absolutely!

lecy · 2019-12-06T01:32:24Z

@meliapetersen Not sure about your current error, this seems to work (but I don't know what data you were joining with the Census2010 object because you did not include that code):

library( sp )          # work with shapefiles
library( sf )          # work with shapefiles - simple features format
library( dplyr )       # data wrangling 
library( tidycensus )
library( cartogram )  # spatial maps w/ tract size bias reduction
library( maptools )   # spatial object manipulation 


crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv",  stringsAsFactors=F, colClasses="character" )

these.seattle <- crosswalk$msaname == "SEATTLE-BELLEVUE-EVERETT, WA"
these.fips <- crosswalk$fipscounty[ these.seattle ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

census_api_key("b431c35dad89e2863681311677d12581e8f24c24")
options( tigris_use_cache = TRUE )

# only have one state, so can use county fips directly
seattle.pop <-
  get_acs( geography = "tract", variables = "B01003_001", 
           state = "53", county = c("029", "033", "061"), geometry = TRUE ) %>%
  select( GEOID, estimate ) %>%
  rename( POP=estimate )

# I am not sure how you create Census2010 here
# seattle.pop$GEOID <- substring( seattle.pop$GEOID, 1 )
# seattle <- merge( seattle.pop, Census2010, by.x="GEOID", by.y="TRTID10" )
seattle <- seattle.pop

class( seattle.pop )  # sf
seattle2 <- seattle[ ! st_is_empty( seattle ) , ]

seattle.sp <- as_Spatial( seattle2 )
class( seattle.sp )  # sp
seattle.sp <- spTransform( seattle.sp, CRS("+init=epsg:3395"))

nrow( seattle.sp )  # 569
seattle.sp <- seattle.sp[ seattle.sp$POP != 0 & (! is.na( seattle.sp$POP )) , ]
nrow( seattle.sp )  # 567

seattle.sp$pop.w <- seattle.sp$POP / 9000 # max(msp.sp$POP)   # standardizes it to max of 1.5
summary( seattle.sp$pop.w )

seattle_dorling <- cartogram_dorling( x=seattle.sp, weight="pop.w", k=0.05 )

plot( seattle_dorling, col="red" )
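Because the merge step is commented out above, one thing that may be worth checking (a guess, not a diagnosis) is whether the merge keys actually match. If they do not, merge() returns zero rows, and an empty spatial object could plausibly lead to a "subscript out of bounds" error further down. The sketch below uses plain data frames with made-up values, and the dashed TRTID10 format is hypothetical.

```r
# made-up tract records standing in for the get_acs() result
shapes <- data.frame( GEOID = c( "53033000100", "53033000200" ),
                      POP   = c( 4200, 3900 ),
                      stringsAsFactors = FALSE )

# hypothetical census table whose IDs use a different format
census <- data.frame( TRTID10 = c( "53-033-000100", "53-033-000200" ),
                      cluster = c( 1, 2 ),
                      stringsAsFactors = FALSE )

bad <- merge( shapes, census, by.x = "GEOID", by.y = "TRTID10" )
nrow( bad )    # no keys match, so nothing survives the merge

# put both keys in the same format, then merge again
census$TRTID10 <- gsub( "-", "", census$TRTID10, fixed = TRUE )
good <- merge( shapes, census, by.x = "GEOID", by.y = "TRTID10" )
nrow( good )

# stop early instead of passing an empty object to later steps
stopifnot( nrow( good ) > 0 )
```

Checking nrow() right after the merge, as is done before and after the POP filter above, makes this kind of problem visible before the cartogram step.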

castower · 2019-12-06T01:33:29Z

@Anthony-Howell-PhD thank you!

lecy · 2019-12-06T01:43:14Z

@JaesaR Where does "06" come from?

these.chi <- crosswalk$msaname == "CHICAGO, IL"
these.fips <- crosswalk$fipscounty[ these.chi ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

census_api_key("624bc0325068577dab800279b9251a06f1200af3")
options(tigris_use_cache = TRUE)
chi.pop <-
get_acs( geography = "tract", variables = "B01003_001",
         state = "06", county = county.fips, geometry = TRUE ) %>% 
         select( GEOID, estimate ) %>%
         dplyr::rename(POP = estimate)

data.frame( state.fips, county.fips )

state.fips	county.fips
17	031
17	037
17	043
17	063
17	089
17	093
17	097
17	111
17	197

lecy · 2019-12-06T01:49:02Z

@JaesaR this works, note the commented out code:

these.chi <- crosswalk$msaname == "CHICAGO, IL"
these.fips <- crosswalk$fipscounty[ these.chi ]
these.fips <- na.omit( these.fips )

state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )

data.frame( state.fips, county.fips ) 

census_api_key("624bc0325068577dab800279b9251a06f1200af3")
options(tigris_use_cache = TRUE)
chi.pop <-
get_acs( geography = "tract", variables = "B01003_001",
         state = "17", county = county.fips, geometry = TRUE ) %>% 
         select( GEOID, estimate ) %>%
         dplyr::rename(POP = estimate)

# chi.pop$GEOID<-substring(chi.pop$GEOID, 2)
# chi <- merge( chi.pop, census.dats, by.x="GEOID", by.y="TRTID10" )
chi <- chi.pop

chi2 <- chi[! st_is_empty(chi), ]
chi.sp <- as_Spatial( chi2 )
class( chi.sp )

# project map and remove empty tracts
chi.sp <- spTransform( chi.sp, CRS("+init=epsg:3395"))
chi.sp <- chi.sp[ chi.sp$POP != 0 & (! is.na( chi.sp$POP )) , ]

# convert census tract polygons to dorling cartogram
chi.sp$pop.w <- chi.sp$POP / 9000 # max(msp.sp$POP)   # standardizes it to max of 1.5
chi_dorling <- cartogram_dorling( x=chi.sp, weight="pop.w", k=0.05 )

plot( chi_dorling, col="steelblue" )

lecy · 2019-12-06T01:52:46Z

@RickyDuran you did not have enough code for a reproducible example. So I'm not sure how the error is introduced, but what you have looks fine.
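One possible explanation, offered as an assumption since the full code is not shown: an "unused argument" error from rename( POP=estimate ) is what R reports when a different rename() with another signature is being called, for example if a package loaded later masks dplyr's. The base R sketch below imitates that with a stand-in function and made-up data. It is not the actual session.

```r
# made-up values standing in for the selected get_acs() columns
df <- data.frame( GEOID    = c( "48453000101", "48453000102" ),
                  estimate = c( 4100, 3650 ),
                  stringsAsFactors = FALSE )

# stand-in for a rename() that expects (x, replace) instead of new = old pairs
rename <- function( x, replace ) {
  idx <- match( names( replace ), names( x ) )
  names( x )[ idx ] <- unname( replace )
  x
}

# the new = old style call does not fit this signature, so R rejects the argument
msg <- tryCatch( rename( df, POP = estimate ),
                 error = function( e ) conditionMessage( e ) )
print( msg )

# the (x, replace) form works with the stand-in
df2 <- rename( df, c( estimate = "POP" ) )
print( names( df2 ) )

# with dplyr loaded, writing dplyr::rename( POP = estimate ) in the pipe,
# as in the Chicago code above, names the intended function explicitly
```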
