Final Project #20
@Anthony-Howell-PhD. I was able to knit the file after including the following code.
And later by calling the library (with all other libraries). |
@Anthony-Howell-PhD I might be completely off on this but I am trying to subset census.dats for my MSA. Here is my code
I am getting the following error: "Error in if (shift_geo) { : argument is not interpretable as logical" I can't figure out how to correct this error or if I am on the right track with my attempt to only include Seattle data. |
@Jigarci3 You do not have to use the |
You can create a custom Cascading Style Sheet (CSS) to moderate this behavior (you have not learned this yet), but the easiest solution is to simplify the menu bar. Shorten the project title ("Community Analytics Practicum Extravaganza" is tongue-in-cheek, you can change it), and consider grouping some items (can you combine clustering, neighborhoods, and neighborhood change?). |
@lecy Thank you. Shortening the menu bar helped. |
I was wondering about limiting the decimal places in the table displayed using datatable() to 4 or 5. Will it affect the predictions? |
It is common to round to 2 or 3 decimal places, which should not have any
noticeable effect on model outcomes or predictions.
--
Anthony Howell
Asst. Prof. in Public Policy
School of Public Affairs
Arizona State University
Faculty Profile <https://isearch.asu.edu/profile/3501621> (CV
<https://www.dropbox.com/s/b1pxccpwxm6fats/Howell.CV.pdf?dl=0>)
|
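To make the rounding point concrete, here is a minimal sketch. It assumes the model is fitted on the full-precision data and only a copy is rounded for display. The built-in mtcars data and the object names are stand-ins, not the project data.

```r
# Hypothetical stand-in data: mtcars replaces the project data frame.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Predictions from the full-precision data.
pred_before <- predict(fit, newdata = mtcars)

# Round only the copy that is shown to the reader.
display_table <- round(mtcars[, c("mpg", "wt", "hp")], 2)

# The model and the data it was fitted on are untouched,
# so the predictions are the same after building the display copy.
pred_after <- predict(fit, newdata = mtcars)
print(identical(pred_before, pred_after))
print(head(display_table))
```

If the table is built with DT::datatable(), the DT package's formatRound() should, as far as I know, round at display time in the same spirit, leaving the underlying values alone.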
@Anthony-Howell-PhD I'm having this same issue with knitting the original rmd but it was not solved with the code provided above. When I try it with this code:
I get the following error message: When I try to simply use install.packages( "lorem" ), it tells me that "package ‘lorem’ is not available (for R version 3.6.1)". Do I need to download a different version of R? I thought we were all using the same version. |
@etbartell If you cannot download and load the lorem package, the easiest thing to do is go through the .rmd file and remove the lorem calls. To do this, paste the following code into the search box of the .rmd file to find all instances of it: You can then delete these code chunks one by one or all at once. Just remember that every time you see that code, it marks a place for you to provide your own answer. You can still return to these places to provide your answer by searching for the <!--- symbol that denotes the instructions. |
That worked, thanks! |
I'm having a weird issue with my code from lab 4 (it didn't happen when I turned in the lab, but it's happening now). I'm getting the error that I am not using an argument: Error in rename(., POP = estimate) : unused argument (POP = estimate) When running this code:
For the empirical framework portion of the dashboard. Am I on the right track for this portion? I am also unclear on that as well. This was just the code I had from lab 4. |
@meliapetersen sorry to hear that is happening. My suggestion is to focus on understanding how to subset the |
I see where I'm going wrong, thank you! |
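A side note on the "unused argument (POP = estimate)" error above: one common cause, though not confirmed for this particular case, is that another loaded package masks dplyr's rename(). A minimal sketch with a made-up data frame follows. Calling the function with an explicit namespace avoids the ambiguity, and a base R fallback is included in case dplyr is not installed.

```r
# Hypothetical two-row stand-in for the get_acs() output.
acs <- data.frame(
  GEOID    = c("53033000100", "53033000200"),
  estimate = c(4200, 3100),
  stringsAsFactors = FALSE
)

if (requireNamespace("dplyr", quietly = TRUE)) {
  # The explicit namespace makes sure dplyr's rename() is the one called.
  renamed <- dplyr::rename(acs, POP = estimate)
} else {
  # Base R equivalent, with no package dependency.
  renamed <- acs
  names(renamed)[names(renamed) == "estimate"] <- "POP"
}

print(names(renamed))
```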
@etbartell I ran into the same problem and found that entering the following code fixed it:
I read here for additional info: https://github.com/gadenbuie/lorem Edit: oops, just realized this is the exact same code as above, I somehow overlooked that! |
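For anyone still stuck on the lorem install: the "not available" message suggests the package could not be found on CRAN for that R version, and the repository linked above hosts it on GitHub. A hedged sketch of installing from there is below. The helper name needs_install is made up for illustration, and the install step only runs in an interactive session.

```r
# Hypothetical helper: TRUE when a package is not yet installed.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Only attempt the install when run interactively (it needs network access).
if (interactive() && needs_install("lorem")) {
  if (needs_install("remotes")) install.packages("remotes")
  remotes::install_github("gadenbuie/lorem")
}
```

After this, library( lorem ) can be called along with the other libraries.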
@Anthony-Howell-PhD I'm running into a similar issue as others on the section requiring code from Lab 4. However, I'm not getting a descriptive error. When I run the following code:
crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv", stringsAsFactors=F, colClasses="character" )
these.san <- crosswalk$msaname == "SAN DIEGO, CA"
these.fips <- crosswalk$fipscounty[ these.san ]
these.fips <- na.omit( these.fips )
state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )
san.pop <-
get_acs( geography = "tract", variables = "B01003_001", state = "06", county = county.fips[state.fips=="06"], geometry = TRUE ) %>%
select( GEOID, estimate ) %>%
I only get "Error: " |
You do not need to download data using
|
@lepp12 please see above reply. |
@lepp12, if you don't want to have to Google the names, they are in the crosswalk dataset. Therefore, I just altered my data frame from the crosswalk to be as follows:
This then gave me the names of each county. |
Nice find @castower |
I'm still having trouble understanding what I'm supposed to do with the names of the counties and pulling them from |
@meliapetersen I used the filter function to just select the needed counties |
Example: looking at the last row, 80.6 percent of tracts classified as
cluster 4 in 2000 were also classified as cluster 4 in 2010. 12.9 percent
moved into cluster 3, 6.4 percent moved into cluster 2, and no tracts moved
into cluster 1. How your clusters are defined will help to
explain the meaning of these transitions. Note: the diagonal values
indicate the share of tracts that remained in the same cluster grouping in 2000 and 2010.
On Tue, Dec 3, 2019 at 6:54 PM sunaynagoel wrote:
I am a little lost reading the transition matrix. Here is a screenshot of
my transition matrix.
[image: Screen Shot 2019-12-03 at 6 53 04 PM]
|
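To make the row-by-row reading of a transition matrix concrete, here is a small sketch with made-up cluster labels for ten tracts (not project data). Each row is a 2000 cluster, each column is a 2010 cluster, and each row sums to 100 percent.

```r
# Hypothetical cluster assignments for ten tracts in each year.
cluster2000 <- c(1, 1, 2, 2, 2, 3, 3, 4, 4, 4)
cluster2010 <- c(1, 2, 2, 2, 3, 3, 3, 4, 4, 3)

# Cross-tabulate: rows are the 2000 cluster, columns are the 2010 cluster.
counts <- table(from2000 = cluster2000, to2010 = cluster2010)

# Convert counts to row percentages: "of the tracts that started in
# cluster r, what share ended up in cluster c?"
transition <- round(100 * prop.table(counts, margin = 1), 1)
print(transition)

# The diagonal holds the share of tracts that stayed in the same cluster.
print(diag(transition))
```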
I'm having trouble understanding the change variables conceptually. If we were going for percent change, we would just use (2010var-2000var)/2000var, but since we're using the formula of 2000var/(2010var+1), I don't understand what the values are telling us. With the exception of home price, the other variables are all decimals, and adding 1 to the denominator completely alters its value. For example, if ForeigBornChange = 0.095, this doesn't mean that the foreign-born population changed by 9.5%. It's just what the formula spit out. I feel like I'm missing something. Does anyone have a solid grasp of what these variables mean? |
I have a question concerning the dorling maps. In Lab 4 we were creating them based on household income, but I'm not sure what we're clustering here. Should we group these by the cluster variable or something else? I may be overlooking a step, but I can't quite figure out what I'm plotting. Thanks! |
@etbartell Nice question here and nice catch. Actually, it is more intuitive to define the change variables as 2010var/2000var rather than as 2000var/2010var, which is how the .rmd file has them. With respect to adding a constant to a variable, it would be better to add a value that is small relative to the variable's scale. So for home values, adding 1 makes sense. When working with proportions it makes more sense to add 0.01 instead of 1. I will make these changes to the .rmd file. |
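A quick worked sketch of why the size of the constant matters. It assumes the constant is added to the denominator (as described in the question above) and uses made-up values. The function names are hypothetical, not from the .rmd file.

```r
# Hypothetical helper: ratio of the 2010 value to the 2000 value,
# with a small constant eps in the denominator to avoid dividing by zero.
change_ratio <- function(var2000, var2010, eps) {
  var2010 / (var2000 + eps)
}

# Plain percent change, for comparison.
pct_change <- function(var2000, var2010) {
  100 * (var2010 - var2000) / var2000
}

# A proportion that rises from 0.10 to 0.12 (a 20 percent increase).
ratio_small_eps <- change_ratio(0.10, 0.12, eps = 0.01)  # 0.12 / 0.11
ratio_large_eps <- change_ratio(0.10, 0.12, eps = 1)     # 0.12 / 1.10
growth_pct      <- pct_change(0.10, 0.12)

# A home value on the dollar scale is barely affected by adding 1.
ratio_home <- change_ratio(150000, 180000, eps = 1)

print(c(ratio_small_eps, ratio_large_eps, growth_pct, ratio_home))
```

With eps = 0.01 the ratio stays above 1, which still reads as growth. With eps = 1 the same proportion yields a ratio near 0.11, which no longer says anything about the direction of change.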
This makes so much more sense now. Thanks @etbartell for asking this question and @Anthony-Howell-PhD for the help. |
@castower Besides household income, we also used dorling to map clusters in Lab 4. See the attached screenshot from the Lab 4 instructions. |
Thank you! One other question, I discovered that my data set has one massive outlier for the House Price change variable (there's an instance where in 2000 the median house price was only $300 and in 2010 it was $284,900). Should I exclude this outlier since it's skewing the data (especially the mean) or just mention it in my summary? Thanks! |
If anyone else has questions about changing the grid labels, this website has great instructions: https://www.datanovia.com/en/blog/how-to-change-ggplot-facet-labels/ |
I also want to note that if you want to leave the variables alone, you can use the fig.width chunk option in R Markdown to widen the figure. |
Thanks @castower |
@lepp12 the problem is that you've subsetted the data to only 2000 variables, run the prediction, and then you are trying to predict new data with the full dataset that includes both the 2000 and 2010 variables. I suggest instead of the following |
@castower There are several options, all of which are quite common, and you are free to choose what you think is best: winsorize the variable, trim the variable, remove the outlier outright, or take the log of the variable. |
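For reference, the outlier options listed above can be sketched in base R as follows. The data are made up (99 ordinary change ratios plus one extreme value), and the 5th/95th percentile cutoffs and function names are assumptions chosen for illustration.

```r
# Hypothetical change ratios: 99 ordinary values plus one extreme outlier.
price_change <- c(seq(0.8, 1.8, length.out = 99), 949.7)

# Winsorize: pull values beyond the chosen percentiles back to the cutoffs.
winsorize <- function(x, probs = c(0.05, 0.95)) {
  limits <- unname(quantile(x, probs = probs, na.rm = TRUE))
  pmin(pmax(x, limits[1]), limits[2])
}

# Trim: drop values beyond the chosen percentiles.
trim <- function(x, probs = c(0.05, 0.95)) {
  limits <- unname(quantile(x, probs = probs, na.rm = TRUE))
  x[x >= limits[1] & x <= limits[2]]
}

winsorized <- winsorize(price_change)
trimmed    <- trim(price_change)
no_outlier <- price_change[price_change != max(price_change)]
logged     <- log(price_change)

# Compare how each choice affects the mean.
print(c(raw = mean(price_change), winsorized = mean(winsorized),
        trimmed = mean(trimmed), dropped = mean(no_outlier)))
```

Winsorizing keeps every tract in the data, while trimming and outright removal shrink the sample. Whichever you pick, mention it in your summary.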
@Anthony-Howell-PhD thank you! |
@Anthony-Howell-PhD It looks like below there is a place to "interpret results" before the cluster analysis. If there is no output to show, are there results to interpret? |
The main point of that section is to define and label each of the cluster groupings, which can be done on the side panel of each cluster output figure. You are free to keep in the ### identifying communities and add some basic description of what you did, i.e. perform cluster analysis, but there is no visualization for now. You can add your own visualization if you want though. |
@meliapetersen see above reply |
Hi, I'm having trouble identifying the variables needed for the dorling cartograms. Can someone help me make a little bit more sense of how to merge the spatial information with the Census2010 data frame? I keep going back to Lab 4 and what we've coded thus far and I'm not fully understanding what I'm supposed to do. Thank you! :) |
@meliapetersen I would suggest getting the spatial data information using get_acs for your state. Once you have that, you will want to merge your census dataframe to the spatial dataframe.
That should get you started |
@meliapetersen If helpful, the course GitHub site has information on how the dorling cartograms were built for labs 3 and 4. That code might be instructive: https://github.com/DS4PS/cpp-529-master/blob/master/data/README.md |
Awesome, thank you! I think I figured it out, but I'm getting an error when I run my code:
Error: Error in st_cast_sfc_default(x) : list item(s) not of class sfg |
I think you are missing an argument here: substring( seattle.pop$GEOID, 2 ) You usually have a starting position and ending position for the substring() function. What does the
|
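Since the merge and the substring() call keep coming up, here is a small sketch of checking the join keys before merging. The data frames are made up and hold plain columns only (no geometry). The point is that substring( x, 2 ) drops the first character of every ID, which only helps when the other table's IDs are missing a leading character. One plausible consequence, not confirmed in this thread, is that a merge matching zero rows leaves nothing valid to convert further down.

```r
# Hypothetical stand-ins: tract IDs from the spatial data and from the census data.
tracts <- data.frame(
  GEOID = c("53033000100", "53033000200", "53033000300"),
  POP   = c(4200, 3100, 0),
  stringsAsFactors = FALSE
)
census <- data.frame(
  TRTID10 = c("53033000100", "53033000200"),
  cluster = c(1, 3),
  stringsAsFactors = FALSE
)

# How many keys match as-is?
shared_keys <- intersect(tracts$GEOID, census$TRTID10)

# How many match after dropping the first character of GEOID?
stripped_keys <- substring(tracts$GEOID, 2)
shared_after_strip <- intersect(stripped_keys, census$TRTID10)

print(length(shared_keys))
print(length(shared_after_strip))

# Merge on whichever version of the key actually matches.
merged <- merge(tracts, census, by.x = "GEOID", by.y = "TRTID10")
print(nrow(merged))
```

Running the two intersect() checks on your own data shows right away whether the substring() step is needed for your MSA.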
I am having trouble with the same area (creating dorling):
census_api_key("42bf5fcc6e6a6f05ebe97a0e647a5216a708613a")
aus.pop <-
get_acs( geography = "tract", variables = "B01003_001",
state = "48", county = county.fips[state.fips=="48"], geometry = TRUE ) %>%
select( GEOID, estimate )
aus <- merge( aus.pop, austin.data, by.x="GEOID", by.y="TRTID10" )
aus.sp <- as_Spatial( aus )
class( aus.sp )
aus.sp <- spTransform( aus.sp, CRS("+init=epsg:3395"))
aus.sp <- aus.sp[ aus.sp$POP != 0 & (! is.na( aus.sp$POP )) , ]
aus_dorling <- cartogram_dorling( x=aus.sp, weight="pop.w", k=0.05 )
I get the error: |
@lecy So I noticed that I didn't have the
I'm not quite sure I understand what I need to add to the substring argument. |
@RickyDuran Do you have a POP variable? I might have renamed it from the default census name. Did you create the weighted pop variable?
phx$pop.w <- phx$POP / 10000 # standardizes it to max of 1.5
You might have to adjust the denominator depending on the max population. If you also have an NA or a 0 for population it could create the error you are getting. You should drop those polygons (filter them out) before the conversion.
summary( phx$POP ) |
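A minimal sketch of the two steps described above, using a plain data frame with made-up populations in place of the spatial object. Deriving the denominator from the maximum population is one possible way to "adjust the denominator", not necessarily what the lab code does.

```r
# Hypothetical tract populations, including a zero and a missing value.
tracts <- data.frame(
  GEOID = c("t1", "t2", "t3", "t4", "t5"),
  POP   = c(4200, 0, NA, 9000, 6100),
  stringsAsFactors = FALSE
)

# Drop tracts with missing or zero population before building the weight.
keep   <- !is.na(tracts$POP) & tracts$POP != 0
tracts <- tracts[keep, ]

# Choose the denominator so that the largest tract gets a weight of 1.5.
denominator  <- max(tracts$POP) / 1.5
tracts$pop.w <- tracts$POP / denominator

print(summary(tracts$pop.w))
```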
@meliapetersen The
crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv", stringsAsFactors=F, colClasses="character" )
these.msp <- crosswalk$msaname == "MINNEAPOLIS-ST. PAUL, MN-WI"
these.fips <- crosswalk$fipscounty[ these.msp ]
these.fips <- na.omit( these.fips )
state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )
data.frame( these.fips, state.fips, county.fips ) %>% pander()
If your MSA spans two states then you need to split these codes into two separate calls to get_acs(), I believe. One for each state using the corresponding county FIPS. Recall that county FIPS are not unique since each state will have a 001, 002, etc.
> county.fips
[1] "003" "019" "025" "037" "053" "059" "123" "139" "141" "163" "171" "093"
[13] "109"
> county.fips[ state.fips=="55" ]
[1] "093" "109"
Substring is pulling out each FIPS. |
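For a two-state MSA, the county codes can be grouped by state before calling get_acs() once per state. A sketch is below. The county codes are the ones printed above, while the state prefix "27" for the first eleven is my assumption (Minnesota), since only "55" appears in the output. The get_acs() call is left as a comment because it needs an API key and network access.

```r
# County codes as printed above; the "27" prefix is an assumption for the
# first eleven, "55" comes from the output shown above.
these.fips <- c(
  paste0("27", c("003", "019", "025", "037", "053", "059",
                 "123", "139", "141", "163", "171")),
  paste0("55", c("093", "109"))
)

state.fips  <- substr(these.fips, 1, 2)
county.fips <- substr(these.fips, 3, 5)

# One vector of county codes per state.
counties.by.state <- split(county.fips, state.fips)
print(counties.by.state)

# One download per state, to be combined afterwards, e.g.:
# for (st in names(counties.by.state)) {
#   get_acs( geography = "tract", variables = "B01003_001",
#            state = st, county = counties.by.state[[st]], geometry = TRUE )
# }
```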
@lecy So I fixed the issue I was having, and now it's giving me a different error. I added:
And now it is giving me an issue with this line of code:
Giving me this error: Here is both code chunks now:
|
I am having the same issue that Melia is having, but adding the code she added to fix this issue did not work for me. My code looks like this:
And is returning the error: "Error in st_cast_sfc_default(x) : list item(s) not of class sfg" I do not understand what you mean about the substring argument not being complete. Am I supposed to add the FIPS within that argument? |
I believe I have pop data. I am doing the exact same thing for downloading shapefiles with population data as in lab 4, although when I leave in "rename( POP=estimate )" in
aus.pop <-
get_acs( geography = "tract",
variables = "B01003_001",
state = "48",
county = county.fips[state.fips=="48"],
geometry = TRUE ) %>%
select( GEOID, estimate ) %>%
rename( POP=estimate )
I get ERROR in rename(POP = estimate) : (POP=estimate) not used
When I take it out, I also get Error in x@polygons[[1]] : subscript out of bounds at:
aus.sp$pop.w <- aus.sp$POP / 9000 |
Hi everyone,
As many of you are experiencing difficulties trying to recreate a dorling
map using a new dataset, I’ve decided to make the dorling map optional.
Part of the problem is that each chosen MSA often has unique
problems that require unique troubleshooting that can cost quite a bit of
time. Given the fast approaching deadline and the other requirements of the
lab it’s important to push through and get the rest of the parts completed.
For any of you who have successfully created the dorling map for the final
project a bonus of 10% will be applied to your final project. For those of
you that opt to not create the dorling map there is no penalty.
Hope that helps relieve the workload stress.
Best
|
@Anthony-Howell-PhD I just finished recording my video and it came out to right at 24 mins and 53 seconds. Is this okay? |
Absolutely!
|
@meliapetersen Not sure about your current error, but this seems to work (I don't know what data you were joining with the Census2010 object because you did not include that code):
library( sp ) # work with shapefiles
library( sf ) # work with shapefiles - simple features format
library( dplyr ) # data wrangling
library( tidycensus )
library( cartogram ) # spatial maps w/ tract size bias reduction
library( maptools ) # spatial object manipulation
crosswalk <- read.csv( "https://raw.githubusercontent.com/DS4PS/cpp-529-master/master/data/cbsatocountycrosswalk.csv", stringsAsFactors=F, colClasses="character" )
these.seattle <- crosswalk$msaname == "SEATTLE-BELLEVUE-EVERETT, WA"
these.fips <- crosswalk$fipscounty[ these.seattle ]
these.fips <- na.omit( these.fips )
state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )
census_api_key("b431c35dad89e2863681311677d12581e8f24c24")
options( tigris_use_cache = TRUE )
# only have one state, so can use county fips directly
seattle.pop <-
get_acs( geography = "tract", variables = "B01003_001",
state = "53", county = c("029", "033", "061"), geometry = TRUE ) %>%
select( GEOID, estimate ) %>%
rename( POP=estimate )
# I am not sure how you create Census2010 here
# seattle.pop$GEOID <- substring( seattle.pop$GEOID, 1 )
# seattle <- merge( seattle.pop, Census2010, by.x="GEOID", by.y="TRTID10" )
seattle <- seattle.pop
class( seattle.pop ) # sf
seattle2 <- seattle[ ! st_is_empty( seattle ) , ]
seattle.sp <- as_Spatial( seattle2 )
class( seattle.sp ) # sp
seattle.sp <- spTransform( seattle.sp, CRS("+init=epsg:3395"))
nrow( seattle.sp ) # 569
seattle.sp <- seattle.sp[ seattle.sp$POP != 0 & (! is.na( seattle.sp$POP )) , ]
nrow( seattle.sp ) # 567
seattle.sp$pop.w <- seattle.sp$POP / 9000 # max(msp.sp$POP) # standardizes it to max of 1.5
summary( seattle.sp$pop.w )
seattle_dorling <- cartogram_dorling( x=seattle.sp, weight="pop.w", k=0.05 )
plot( seattle_dorling, col="red" ) |
@Anthony-Howell-PhD thank you! |
@JaesaR Where does "06" come from?
these.chi <- crosswalk$msaname == "CHICAGO, IL"
these.fips <- crosswalk$fipscounty[ these.chi ]
these.fips <- na.omit( these.fips )
state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )
census_api_key("624bc0325068577dab800279b9251a06f1200af3")
options(tigris_use_cache = TRUE)
chi.pop <-
get_acs( geography = "tract", variables = "B01003_001",
state = "06", county = county.fips, geometry = TRUE ) %>%
select( GEOID, estimate ) %>%
dplyr::rename(POP = estimate)
data.frame( state.fips, county.fips )
|
@JaesaR this works, note the commented out code:
these.chi <- crosswalk$msaname == "CHICAGO, IL"
these.fips <- crosswalk$fipscounty[ these.chi ]
these.fips <- na.omit( these.fips )
state.fips <- substr( these.fips, 1, 2 )
county.fips <- substr( these.fips, 3, 5 )
data.frame( state.fips, county.fips )
census_api_key("624bc0325068577dab800279b9251a06f1200af3")
options(tigris_use_cache = TRUE)
chi.pop <-
get_acs( geography = "tract", variables = "B01003_001",
state = "17", county = county.fips, geometry = TRUE ) %>%
select( GEOID, estimate ) %>%
dplyr::rename(POP = estimate)
# chi.pop$GEOID<-substring(chi.pop$GEOID, 2)
# chi <- merge( chi.pop, census.dats, by.x="GEOID", by.y="TRTID10" )
chi <- chi.pop
chi2 <- chi[! st_is_empty(chi), ]
chi.sp <- as_Spatial( chi2 )
class( chi.sp )
# project map and remove empty tracts
chi.sp <- spTransform( chi.sp, CRS("+init=epsg:3395"))
chi.sp <- chi.sp[ chi.sp$POP != 0 & (! is.na( chi.sp$POP )) , ]
# convert census tract polygons to dorling cartogram
chi.sp$pop.w <- chi.sp$POP / 9000 # max(msp.sp$POP) # standardizes it to max of 1.5
chi_dorling <- cartogram_dorling( x=chi.sp, weight="pop.w", k=0.05 )
plot( chi_dorling, col="steelblue" ) |
@RickyDuran you did not have enough code for a reproducible example. So I'm not sure how the error is introduced, but what you have looks fine. |
@Anthony-Howell-PhD I am running into following error while knitting the .rmd document.
Quitting from lines 56-78 (Final_Project_Outline_Storyboard-Goel.Rmd)
Error in loadNamespace(name) : there is no package called 'lorem'
Calls: ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted
When I tried to install the package "lorem", following error was produced.
Anyone else running into this issue?