Lab 02 - Error Output must be unique #4

Open
ecking opened this issue Mar 22, 2020 · 6 comments

ecking commented Mar 22, 2020

I've been staring at this for a while and I'm not sure why I'm getting the error: "Error: Each row of output must be identified by a unique combination of keys. Keys are shared for 6440 rows: * 1, 2 * 3, 4 * 5, 6 * 7, 8 * 9, 10 * 11, 12 * 13, 14 * ..."

When I run the first chunk here, I notice that the data under variable is the actual label (HHincome and HHvalue) instead of the code (B19013...). Does that have something to do with why I'm getting this message?

```{r}
CenDF <- c(HHincome = "B19013_001",
           HHvalue  = "B25077_001")

county_HH <- get_acs(geography = "county",
                     year      = 2017,
                     survey    = "acs5",
                     variables = CenDF,
                     geometry  = T)
head(county_HH)
```

```{r, message=F, warning=F}
county_HH <- county_HH %>%
  mutate(variable = case_when(
    variable == "B19013_001" ~ "HHincome",
    variable == "B25077_001" ~ "HHvalue")) %>%
  select(-moe) %>%
  spread(variable, estimate) %>%
  mutate(HHInc_HousePrice_Ratio = round(HHincome / HHvalue * 100, 2))

head(county_HH)
```



ecking commented Mar 23, 2020

Hi all, I solved the issue, and I think it's worth noting because I don't recall it being in the lecture (I could be wrong). In the first chunk of this code, I had B19013_001. I added an E to the end of it and that code chunk worked. In the lower code chunk, the variable does NOT have an E.


lecy commented Mar 24, 2020

@Anthony-Howell-PhD just making sure you are receiving these notifications?

@AntJam-Howell
Collaborator

@ecking Thanks for your post, and I'm glad you attempted to resolve the problem. It's important to note, though, that changing the variable name from B19013_001 to B19013_001E did not solve the problem; you actually called a new variable.

Let me explain the actual problem and the proper way to troubleshoot it. When you create CenDF, you convert the variable names from their original format (e.g. B19013_001) to an easier-to-read format (e.g. HHincome). You can confirm this by running head(county_HH) and looking at the variable column.

In the next step, you include the following code, which is not needed and should be removed:

    mutate(variable = case_when(
      variable == "B19013_001" ~ "HHincome",
      variable == "B25077_001" ~ "HHvalue")) %>%

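What mutate is doing here is taking the value of the original variable (e.g. B19013_001) and converting it to an easier-to-read format (e.g. HHincome). However, you already did this when you created CenDF, so when you try to mutate again there is no B19013_001 value left in the data. Because neither condition in case_when matches, every value of variable becomes NA, so spread can no longer tell the two rows for each county apart. That is why you get the error.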

Please remove the mutate part of the code, and re-run the analysis that you posted.
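
The failure described above can be reproduced on a small scale without calling the Census API. The sketch below is only an illustration: the toy tibble and its numbers are made up, and it assumes the dplyr and tidyr packages are installed. It stands in for the long-format output of get_acs after the names in CenDF have already been applied.

```r
library(dplyr)
library(tidyr)

# Toy long-format data shaped like the get_acs() output (values are invented).
toy <- tibble(
  GEOID    = c("01", "01", "02", "02"),
  variable = c("HHincome", "HHvalue", "HHincome", "HHvalue"),
  estimate = c(50000, 200000, 60000, 150000)
)

# Recode from codes that are no longer in the data: no condition matches,
# so case_when() returns NA for every row.
recoded <- toy %>%
  mutate(variable = case_when(
    variable == "B19013_001" ~ "HHincome",
    variable == "B25077_001" ~ "HHvalue"
  ))

all(is.na(recoded$variable))

# Each GEOID now has two rows with the same key (NA), so spread() stops
# with an error. tryCatch() captures the message instead of halting.
result <- tryCatch(
  spread(recoded, variable, estimate),
  error = function(e) conditionMessage(e)
)
result
```

Printing recoded before the spread step is a quick way to catch this kind of problem, since a variable column full of NA shows that the recode matched nothing.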

@AntJam-Howell
Collaborator

@ALL @ecking. Apologies for the delay in responding. Please remember to include @Anthony-Howell-PhD so that I receive the message more quickly.

DS4PS deleted a comment from AntJam-Howell Mar 25, 2020

ecking commented Mar 25, 2020

So I re-ran the code without that mutate section and hit an error saying "object HHvalue not found". I'll have to try again after work today, but if you have any idea what that error message means, it'd be great to know!

Initially I thought the mutate section was not needed either, but I copied and pasted the two parts from the ppt, so I figured it must be correct if written that way. Guess I was wrong! Ha.

Thanks!

@AntJam-Howell
Collaborator

You will be able to use some of the code exactly as written by copying and pasting it from the lecture. Other code, though, will require you to look at it closely and make some minor changes, so that you learn what the code is actually doing.

A hint: after you run the following code,

    CenDF <- c(HHincome = "B19013_001",
               HHvalue  = "B25077_001")

    county_HH <- get_acs(geography = "county",
                         year      = 2017,
                         survey    = "acs5",
                         variables = CenDF,
                         geometry  = T)
    head(county_HH)

you need to make sure that the values of variable are named correctly as HHincome and HHvalue. Then, when you use the spread(variable, estimate) command, you transform the data from long format to wide format, which allows you to create the new variable HHInc_HousePrice_Ratio using mutate.
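
The long-to-wide step and the ratio calculation can be tried on toy data first. This is a minimal sketch, not the lab solution: the tibble and its values are invented, and it assumes dplyr and tidyr are installed. The variable column already holds HHincome and HHvalue, as it would after get_acs is called with the named CenDF vector.

```r
library(dplyr)
library(tidyr)

# Toy long-format data: one row per GEOID per variable (values are invented).
toy <- tibble(
  GEOID    = c("01", "01", "02", "02"),
  variable = c("HHincome", "HHvalue", "HHincome", "HHvalue"),
  estimate = c(50000, 200000, 60000, 150000)
)

# spread() turns each value of `variable` into its own column (wide format),
# so HHincome and HHvalue exist as columns by the time mutate() runs.
wide <- toy %>%
  spread(variable, estimate) %>%
  mutate(HHInc_HousePrice_Ratio = round(HHincome / HHvalue * 100, 2))

wide
```

Here wide has one row per GEOID, with columns GEOID, HHincome, HHvalue, and HHInc_HousePrice_Ratio.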
