Commit

Merge pull request #196 from UI-Research/iss159-fix-data
Fix missing variables air quality included!
Deckart2 authored Apr 19, 2023
2 parents da762c2 + 47e9da0 commit 9586e82
Showing 170 changed files with 53,413 additions and 50,641 deletions.
6 changes: 6 additions & 0 deletions R/load_place_data.R
@@ -62,6 +62,11 @@ load_place_data <- function() {
 ) %>%
 prep_data(geography = "place")
 
+data_race <- read_csv(
+here("mobility-metrics", "07_mobility-metrics_place_race_longitudinal.csv")
+) %>%
+prep_data(geography = "place")
+
 data_race_share <- read_csv(
 here("mobility-metrics", "08_place_mobility-metrics_race-share_longitudinal.csv"),
 col_types = cols(
@@ -117,6 +122,7 @@ load_place_data <- function() {
 recent = data_recent,
 years = data_years,
 race_ethnicity = data_race_ethnicity,
+race = data_race,
 race_share = data_race_share,
 env = data_env,
 education_income = data_education_income,
11 changes: 8 additions & 3 deletions R/prep_data.R
@@ -71,23 +71,28 @@ prep_data <- function(data, geography = "county") {
 # filter to get the ones only existing in the data
 perc_vars_in_data <- all_perc_vars[(all_perc_vars %in% colnames(data))]
 
+quality_boolean <- grepl("_quality", colnames(data))
+quality_variables <- colnames(data)[quality_boolean]
+quality_variables <- quality_variables[quality_variables != "index_air_quality"]
+
+
 numeric_vars_one_digit <- data %>%
 select(
 -matches("ratio_average_to_living_wage"),
 -matches("share_desc_rep"),
 -fips,
--matches("_quality"),
+-any_of(quality_variables),
 -matches("year"),
 -starts_with("rate_learning"),
 -all_of(perc_vars_in_data),
--starts_with("pctl"),
+-starts_with("pctl")
 ) %>%
 select_if(is.numeric) %>%
 names()
 
 
 data <- data %>%
-mutate_at(vars(ends_with("_quality")),
+mutate_at(vars(any_of(quality_variables)),
 function(x) recode(x, `1` = "Strong", `2` = "Marginal", `3` = "Weak")) %>%
 mutate(
 across(
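The change to `prep_data()` swaps the blanket `_quality` pattern for an explicit list of quality-flag columns. A minimal base R sketch of the likely motivation follows. The column names are hypothetical apart from `index_air_quality` and `index_air_quality_quality`, which come from this commit. Once the metric is renamed to `index_air_quality`, the metric column itself ends in `_quality`, so a suffix match would also pick it up and, presumably, recode it as if it were a quality flag.

```r
# Hypothetical column names for illustration; only the two
# index_air_quality* names appear in this commit.
cols <- c(
  "fips", "year",
  "index_air_quality", "index_air_quality_quality",
  "share_employed", "share_employed_quality"
)

# Old approach (suffix match): also catches the renamed metric itself.
suffix_match <- cols[endsWith(cols, "_quality")]

# New approach from the diff: match the pattern, then drop the metric by name.
quality_boolean <- grepl("_quality", cols)
quality_variables <- cols[quality_boolean]
quality_variables <- quality_variables[quality_variables != "index_air_quality"]

print(suffix_match)
print(quality_variables)
```

The resulting character vector is what the diff passes to `any_of()` in both the `select()` exclusion and the `mutate_at()` recode.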
8 changes: 4 additions & 4 deletions R/varlist.R
@@ -208,12 +208,12 @@ neonatal_varlist <- list(
 
 env_varlist <- list(
 summary_vars = c(
-"Air quality index" = "air_quality_index"
+"Air quality index" = "index_air_quality"
 ),
 detail_vars = c(
-"Air quality index" = "air_quality_index",
-"Air quality index_ci" = "air_quality_index_ci",
-"air_quality_index_quality" = "air_quality_index_quality"
+"Air quality index" = "index_air_quality",
+"Air quality index_ci" = "index_air_quality_ci",
+"air_quality_index_quality" = "index_air_quality_quality"
 )
 )
 
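The rename in `env_varlist` only works if the data files use the same column names. As a sketch of how such a mismatch could be caught early, the check below compares a varlist against a data frame's columns. `missing_vars()` and both example data frames are hypothetical and not part of the repository; `env_varlist` is copied from the updated hunk.

```r
# Copied from the updated env_varlist in this commit.
env_varlist <- list(
  summary_vars = c(
    "Air quality index" = "index_air_quality"
  ),
  detail_vars = c(
    "Air quality index" = "index_air_quality",
    "Air quality index_ci" = "index_air_quality_ci",
    "air_quality_index_quality" = "index_air_quality_quality"
  )
)

# Hypothetical helper: returns the varlist columns that are absent from `data`.
missing_vars <- function(varlist, data) {
  wanted <- unique(unlist(varlist, use.names = FALSE))
  setdiff(wanted, colnames(data))
}

# Hypothetical data frames: one with the old names, one with the new names.
old_data <- data.frame(
  air_quality_index = 50,
  air_quality_index_ci = 1,
  air_quality_index_quality = 1
)
new_data <- data.frame(
  index_air_quality = 50,
  index_air_quality_ci = 1,
  index_air_quality_quality = 1
)

print(missing_vars(env_varlist, old_data))  # all three new names are reported
print(missing_vars(env_varlist, new_data))  # nothing is reported
```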
4 changes: 1 addition & 3 deletions create_standard_pages.R
@@ -2,9 +2,7 @@
 #This does not handle any comparisons. For bespoke comparisons for individual
 # counties, please see the create_bespoke_pages.R script
 
-#Gabe Morrison and Aaron Williams
-
-#2023-03-15
+#Gabe Morrison and Aaron R. Williams
 
 library(tidyverse)
 library(quarto)
2 changes: 1 addition & 1 deletion data/00_metrics-summary_county.csv
@@ -19,7 +19,7 @@ Financial security,share_debt_col,share_debt_col_quality,1,race_share,Metric: Sh
Wealth-building opportunities,"ratio_black_nh_house_value_households, ratio_hispanic_house_value_households, ratio_other_nh_house_value_households, ratio_white_nh_house_value_households","ratio_black_nh_house_value_households_quality, ratio_hispanic_house_value_households_quality, ratio_other_nh_house_value_households_quality, ratio_white_nh_house_value_households_quality",3,none,Metric: Ratio of the share of a community’s housing wealth held by a racial or ethnic group to the share of households of the same group,US Census Bureau’s 2021 1-Year American Community Survey Public Use Microdata Sample (via IPUMS); Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time period: 2021),US Census Bureau’s 2018 & 2021 1-Year American Community Survey Public Use Microdata Sample (via IPUMS); Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time periods: 2018 & 2021),"The percentage to the left of the colon for a given racial group reflects their share of primary-residence housing wealth in a community, and the percentage to the right of the colon reflects the number of households who are headed by a member of that racial group as a share of the community’s total number of households. If the percentage on the left side of the colon is smaller than the percentage on the right side, then that group has less proportionate housing wealth compared to their presence in the community. The greater the gap between these percentages, the more inequality in housing wealth in the community. This metric is based on self-reported housing value, does not account for the extent of mortgage debt, and does not account for other important demographic variations such as differences in age composition across race and ethnic groups, and as such this metric may not fully reflect the size of the actual housing wealth gap.",,,"2018, 2021"
Access to health services,ratio_population_pc_physician,ratio_population_pc_physician_quality,3,none,Metric: Ratio of population per primary care physician,"US Department of Health and Human Services, Health Resources and Services Administration, Area Health Resources File, 2020-21 (via County Health Rankings, 2022). (Time period: 2019)",,"The ratio represents the number of people served by one primary care physician in a county. It assumes the population is equally distributed across physicians and does not account for actual physician patient load. Missing values are reported for counties with population greater than 2,000 and 0 primary care physicians. The metric does not include nurse practitioners, physician assistants, or other primary care providers who are not physicians.",,,"2018, 2021"
Neonatal health,rate_low_birth_weight,rate_low_birth_weight_quality,1,race_ethnicity,Metric: Share with low birth weight,"Centers for Disease Control and Prevention National Center for Health Statistics, Division of Vital Statistics, Natality data, 2020 (via CDC WONDER). (Time period: 2020)","Centers for Disease Control and Prevention National Center for Health Statistics, Division of Vital Statistics, Natality data, 2018 & 2020 (via CDC WONDER). (Time period: 2018 & 2020)","The share of babies born weighing less than 5 pounds 8 ounces (<2,500 grams) out of all births with available birthweight information.",Race and ethnicity is based on the mother’s characteristics.,,"2018, 2020"
-Environmental quality,air_quality_index,air_quality_index_quality,3,"race_share, poverty",Metric: Air quality index,"US Environmental Protection Agency’s AirToxScreen data, 2018 (based on 2017 National Emissions Inventory data). (Time period: 2017-18)","Environmental Protection Agency’s National Air Toxics Assessment data, 2014 and AirToxScreen data, 2018 (based on 2014 & 2017 National Emissions Inventory data); US Census Bureau’s 2014 & 2018 5-Year American Community Survey. (Time periods: 2010-14 & 2014-18)","The index is a linear combination of standardized EPA estimates of air quality carcinogenic, respiratory, and neurological hazards measured at the census tract level. Values are inverted and percentile ranked nationally and range from 0 to 100. The higher the index value, the less exposure to toxins harmful to human health.",<br><br>'Majority' means that at least 60% of residents in a census tract are members of the specified group. 'High poverty' means that 40% or more of people in a census tract live in families with incomes below the federal poverty line.,,"2014, 2018"
+Environmental quality,index_air_quality,index_air_quality_quality,3,"race_share, poverty",Metric: Air quality index,"US Environmental Protection Agency’s AirToxScreen data, 2018 (based on 2017 National Emissions Inventory data). (Time period: 2017-18)","Environmental Protection Agency’s National Air Toxics Assessment data, 2014 and AirToxScreen data, 2018 (based on 2014 & 2017 National Emissions Inventory data); US Census Bureau’s 2014 & 2018 5-Year American Community Survey. (Time periods: 2010-14 & 2014-18)","The index is a linear combination of standardized EPA estimates of air quality carcinogenic, respiratory, and neurological hazards measured at the census tract level. Values are inverted and percentile ranked nationally and range from 0 to 100. The higher the index value, the less exposure to toxins harmful to human health.",<br><br>'Majority' means that at least 60% of residents in a census tract are members of the specified group. 'High poverty' means that 40% or more of people in a census tract live in families with incomes below the federal poverty line.,,"2014, 2018"
Safety from trauma,rate_injury_death,rate_injury_death_quality,1,none,"Metric: Deaths due to injury per 100,000 people","National Center for Health Statistics, 2016-20, drawn from the National Vital Statistics System (via County Health Rankings, 2022). (Time period: 2016-20)",,"Injury deaths is the number of deaths from planned (e.g., homicide or suicide) and unplanned (e.g., motor vehicle deaths) injuries per 100,000 people. Deaths are counted in the county of residence for the person who died, rather than the county where the death occurred. A missing value is reported for counties with fewer than 10 injury deaths in the time frame.",,,2020
Political participation,share_election_turnout,share_election_turnout_quality,3,none,Metric: Share of the voting-age population who turn out to vote,"Massachusetts Institute of Technology Election Data and Science Lab, 2020; US Census Bureau’s 2020 5-Year American Community Survey Citizen Voting Age Population Special Tabulation. (Time period: 2016-20)","Massachusetts Institute of Technology Election Data and Science Lab, 2016 & 2020; US Census Bureau’s 2016 & 2020 5-Year American Community Survey Citizen Voting Age Population Special Tabulation. (Time periods: 2012-16 & 2016-20)",This metric measures the share of the citizen voting-age population that voted in the most recent presidential election.,,,"2016, 2020"
Descriptive representation among local officials,"share_desc_rep_asian_other, share_desc_rep_black_nonhispanic, share_desc_rep_hispanic, share_desc_rep_white_nonhispanic","share_desc_rep_asian_other_quality, share_desc_rep_black_nonhispanic_quality, share_desc_rep_hispanic_quality, share_desc_rep_white_nonhispanic_quality",3,none,Metric: Ratio of the share of local elected officials of a racial or ethnic group to the share of residents of the same racial or ethnic group.<em>Part of this metric is shown. See the notes for information on finalizing this metric.</em>,US Census Bureau’s 2021 5-Year American Community Survey. (Time period: 2017-21),,"Shown are the share of that racial or ethnic group in your community. The community will need to calculate the missing percentages in order to complete the descriptive representation metric. See the [Planning Guide](https://upward-mobility.urban.org/boosting-upward-mobility-planning-guide-local-action) (pg. 27) on how to calculate the missing percentage.
2 changes: 1 addition & 1 deletion data/00_metrics-summary_place.csv
@@ -14,7 +14,7 @@ Employment opportunities,share_employed,share_employed_quality,3,race_ethnicity,
Opportunities for income,pctl_income,pctl_income_quality,2,race_ethnicity,"Metric: Household income at the 20th, 50th, and 80th percentiles",US Census Bureau’s 2021 1-Year American Community Survey Public Use Microdata Sample (via IPUMS); Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time Period: 2021),US Census Bureau’s 2018 & 2021 5-Year American Community Survey Public Use Microdata Sample (via IPUMS); Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time Periods: 2014-18 & 2017-21),"To identify income percentiles, all households are ranked by income from lowest to highest. The income level threshold for the poorest 20 percent of households is the value at the 20th percentile. The 50th percentile income threshold indicates the median, with half of households earning less and half of households earning more. The income level threshold for the richest 20 percent of households is the value at the 80th percentile. The difference in income between households at the 20th percentile and the 80th percentile illustrates the level of local economic inequality.",,,2021
Financial security,share_debt_col,share_debt_col_quality,3,race_share,Metric: Share with debt in collections,"August 2021 credit bureau data, from Urban Institute’s Financial Health and Wealth Dashboard. (Time period: August 2021)",,"The city-level measure captures the share of adults in an area with a credit bureau record with any derogatory debt, which is primarily debt in collections.","For city-level August 2021 data, ‘majority’ means that at least 50% of residents in a zip code are members of the specified population group.",,2021
Wealth-building opportunities,"ratio_black_nh_house_value_households, ratio_hispanic_house_value_households, ratio_other_nh_house_value_households, ratio_white_nh_house_value_households","ratio_black_nh_house_value_households_quality, ratio_hispanic_house_value_households_quality, ratio_other_nh_house_value_households_quality, ratio_white_nh_house_value_households_quality",3,none,Metric: Ratio of the share of a community’s housing wealth held by a racial or ethnic group to the share of households of the same group,US Census Bureau’s 2021 1-Year American Community Survey Public Use Microdata Sample (via IPUMS); Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time period: 2021),US Census Bureau’s 2018 & 2021 1-Year American Community Survey Public Use Microdata Sample (via IPUMS); Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time periods: 2018 & 2021),"The percentage to the left of the colon for a given racial group reflects their share of primary-residence housing wealth in a community, and the percentage to the right of the colon reflects the number of households who are headed by a member of that racial group as a share of the community’s total number of households. If the percentage on the left side of the colon is smaller than the percentage on the right side, then that group has less proportionate housing wealth compared to their presence in the community. The greater the gap between these percentages, the more inequality in housing wealth in the community. This metric is based on self-reported housing value, does not account for the extent of mortgage debt, and does not account for other important demographic variations such as differences in age composition across race and ethnic groups, and as such this metric may not fully reflect the size of the actual housing wealth gap.",,,"2018, 2021"
-Environmental quality,air_quality_index,air_quality_index_quality,3,"race_share, poverty",Metric: Air quality index,"US Environmental Protection Agency’s AirToxScreen data, 2018 (based on 2017 National Emissions Inventory data); Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time period: 2017-18)","Environmental Protection Agency’s National Air Toxics Assessment data, 2014 and AirToxScreen data, 2018 (based on 2014 & 2017 National Emissions Inventory data); US Census Bureau’s 2014 & 2018 5-Year American Community Survey; Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time periods: 2010-14 & 2014-18)","The index is a linear combination of standardized EPA estimates of air quality carcinogenic, respiratory, and neurological hazards measured at the census tract level. Values are inverted and percentile ranked nationally and range from 0 to 100. The higher the index value, the less exposure to toxins harmful to human health.",<br><br>'Majority' means that at least 60% of residents in a census tract are members of the specified group. 'High poverty' means that 40% or more of people in a census tract live in families with incomes below the federal poverty line.,,"2014, 2018"
+Environmental quality,index_air_quality,index_air_quality_quality,3,"race_share, poverty",Metric: Air quality index,"US Environmental Protection Agency’s AirToxScreen data, 2018 (based on 2017 National Emissions Inventory data); Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time period: 2017-18)","Environmental Protection Agency’s National Air Toxics Assessment data, 2014 and AirToxScreen data, 2018 (based on 2014 & 2017 National Emissions Inventory data); US Census Bureau’s 2014 & 2018 5-Year American Community Survey; Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time periods: 2010-14 & 2014-18)","The index is a linear combination of standardized EPA estimates of air quality carcinogenic, respiratory, and neurological hazards measured at the census tract level. Values are inverted and percentile ranked nationally and range from 0 to 100. The higher the index value, the less exposure to toxins harmful to human health.",<br><br>'Majority' means that at least 60% of residents in a census tract are members of the specified group. 'High poverty' means that 40% or more of people in a census tract live in families with incomes below the federal poverty line.,,"2014, 2018"
Political participation,share_election_turnout,share_election_turnout_quality,3,none,Metric: Share of the voting-age population who turn out to vote,"Voting and Election Science Team, Precinct-Level Election Results 2020 (via Harvard Dataverse); US Census Bureau’s 2020 5-Year American Community Survey Citizen Voting Age Population Special Tabulation; Missouri Census Data Center Geocorr 2022: Geographic Correspondence Engine. (Time period: 2016-20)",,This metric measures the share of the citizen voting-age population that voted in the most recent presidential election.,,,2020
Descriptive representation among local officials,"share_desc_rep_asian_other, share_desc_rep_black_nonhispanic, share_desc_rep_hispanic, share_desc_rep_white_nonhispanic","share_desc_rep_asian_other_quality, share_desc_rep_black_nonhispanic_quality, share_desc_rep_hispanic_quality, share_desc_rep_white_nonhispanic_quality",3,none,Metric: Ratio of the share of local elected officials of a racial or ethnic group to the share of residents of the same racial or ethnic group.<em>Part of this metric is shown. See the notes for information on finalizing this metric.</em>,US Census Bureau’s 2021 5-Year American Community Survey. (Time period: 2017-21),,"Shown are the share of that racial or ethnic group in your community. The community will need to calculate the missing percentages in order to complete the descriptive representation metric. See the [Planning Guide](https://upward-mobility.urban.org/boosting-upward-mobility-planning-guide-local-action) (pg. 27) on how to calculate the missing percentage. Say that of your 10 elected officials, nine are White, non-Hispanic and your community’s population is half White, non-Hispanic, the metric will read as “90.0%:50.0%.” If the share of local officials is higher than the share of people in the community, then this group is over-represented. If the share of local officials is lower than the share of people in the community, then this group is under-represented. We are presenting this as a ratio of percentages because it provides important context.<br><br>The quality index reflects the data quality only of the given value.",,,2021
Safety from crime,"rate_violent_crime, rate_property_crime",rate_crime_quality,3,none,"Metric: Reported property crimes per 100,000 people and reported violent crimes per 100,000 people",Federal Bureau of Investigations (FBI) National Incident Based Reporting System (via Kaplan J (2021). National Incident-Based Reporting System (NIBRS) Data. https://nibrsbook.com/); US Census Bureau’s 2021 1-Year American Community Survey. (Time period: 2021),,"Rates are calculated as the number of reported crimes against proprty or people per 100,000 people. Although these are the best national data source, communities should use their local data if they are available. The FBI cautions against using NIBRS data to rank or compare locales because there are many factors that cause the nature and type of crime to vary from place to place.",,,2021

Large diffs are not rendered by default.
