Merge pull request #5 from BU-Spark/district7_teamA_prelim
Team A District7 Project Early Insights - Updated
xyyy9 authored Oct 23, 2024
2 parents f8f0ce0 + 52a03a1 commit 20b66ab
Showing 4 changed files with 154 additions and 47 deletions.
18 changes: 14 additions & 4 deletions README.md
@@ -1,7 +1,17 @@
# Boston City Councilor Tania Fernandes Anderson District 7
# BOSTON CITY COUNCILOR ANDERSON: DISTRICT 7

Create a new branch from dev, add changes on the new branch you just created.

Open a Pull Request to dev. Add your PM and TPM as reviewers.
For our project with Councilor Anderson, we will be focusing on analyzing population and economic development data for District 7 in Boston.

At the end of the semester during project wrap up open a final Pull Request to main from dev branch.
Our team will analyze various datasets, including population demographics such as race, ethnicity, age, and education levels, as well as economic indicators like household income, the number of registered businesses, and job data for BIPOC workers. We'll also be tracking these metrics over the past 10 years, looking at changes in the district compared to the Boston city average, and possibly other districts.

Our ultimate goal is to create a dashboard that visually presents these findings, helping Councilor Anderson identify areas where residents, particularly marginalized groups, are seeing improvements or facing ongoing challenges. We’ll iterate on the dashboard design with Councilor Anderson’s feedback to ensure it highlights the most critical insights.

In our early insights, we analyzed Boston 311 service request data to understand the distribution of opened and closed cases across the city. We cleaned the dataset by removing irrelevant or missing information, then filtered cases by their status (opened or closed).
This allowed us to assess the city's responsiveness to non-emergency issues like potholes, streetlights, and waste management.
The analysis is in the 311_exploration.ipynb notebook in the fa24-team-a/311_requests folder.
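
A minimal sketch of the filtering step described above, assuming the column names used in the notebook (`neighborhood_services_district`, `case_status`, `open_dt`, `closed_dt`). The sample rows and the `summarize_district` helper are made up for illustration; the notebook itself reads CSV exports such as 311_req_2023.csv.

```python
import pandas as pd

# Made-up sample rows standing in for the 311 CSV export.
sample = pd.DataFrame({
    "case_status": ["Open", "Closed", "Closed", "Open", None],
    "neighborhood_services_district": ["7", "7", "3", "7", "7"],
    "open_dt": ["2023-01-05 10:00:00", "2023-01-06 09:30:00",
                "2023-01-07 12:00:00", "2023-02-01 08:15:00",
                "2023-02-02 11:00:00"],
    "closed_dt": [None, "2023-01-08 14:00:00", "2023-01-09 16:00:00",
                  None, None],
})


def summarize_district(df, district="7"):
    """Hypothetical helper: keep one district's rows and count cases per status."""
    # District codes are stored as strings in the data, hence the "7".
    d = df[df["neighborhood_services_district"] == district].copy()
    # Drop rows whose status is missing.
    d = d.dropna(subset=["case_status"])
    # Parse timestamps so later steps can compare dates.
    d["open_dt"] = pd.to_datetime(d["open_dt"])
    d["closed_dt"] = pd.to_datetime(d["closed_dt"])
    return d["case_status"].value_counts().to_dict()


counts = summarize_district(sample)
print(counts)
```

Passing a different district code to `summarize_district` would give the same summary for another district, which may help with the planned district comparisons.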

Using GeoPandas and a Boston Shapefile, we created heatmaps to visualize the geographical distribution of these cases.
The maps are saved as Boston_case_heatmap.html and Boston_case_map.html.

Opened cases represent unresolved issues, while closed cases indicate resolved ones. The heatmaps highlighted areas with high concentrations of service requests, providing insight into neighborhoods needing more attention and the overall effectiveness of the 311 system.
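
The per-district counts behind such a map can be sketched as follows. This is an illustration under stated assumptions, not the notebook's exact code: a plain pandas DataFrame stands in for the district GeoDataFrame (the geometry column is omitted), and the rows are made up. The column names `DISTRICT`, `neighborhood_services_district`, and `case_status` follow the notebook.

```python
import pandas as pd

# Made-up requests; column names follow the notebook in this commit.
requests = pd.DataFrame({
    "neighborhood_services_district": ["7", "7", "7", "3"],
    "case_status": ["Open", "Closed", "Closed", "Closed"],
})

# Stand-in for the district GeoDataFrame (geometry omitted in this sketch).
districts = pd.DataFrame({"DISTRICT": [7, 3, 1]})

# Count cases per district and status, giving one column per status.
case_counts = (
    requests.groupby(["neighborhood_services_district", "case_status"])
    .size()
    .unstack(fill_value=0)
)

# District codes are strings in the request data but integers in the shapefile.
case_counts.index = case_counts.index.astype(int)

# Left merge keeps every district; districts with no requests get 0.
merged = districts.merge(case_counts, left_on="DISTRICT", right_index=True, how="left")
merged[["Open", "Closed"]] = merged[["Open", "Closed"]].fillna(0).astype(int)
print(merged)
```

With a real GeoDataFrame in place of `districts`, the `Open` and `Closed` columns could then be passed to its `plot(column=...)` method to color each district by case count.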
150 changes: 108 additions & 42 deletions fa24-team-a/311_requests/311_exploration.ipynb
@@ -2,27 +2,32 @@
"cells": [
{
"cell_type": "code",
"execution_count": 66,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import geopandas as gpd\n",
"import folium"
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Loading Dataframes for 311 requests"
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\ishan\\AppData\\Local\\Temp\\ipykernel_4104\\1997487801.py:2: DtypeWarning: Columns (13) have mixed types. Specify dtype option on import or set low_memory=False.\n",
"C:\\Users\\ishan\\AppData\\Local\\Temp\\ipykernel_10880\\1997487801.py:2: DtypeWarning: Columns (13) have mixed types. Specify dtype option on import or set low_memory=False.\n",
" df2 = pd.read_csv('311_req_2023.csv')\n"
]
}
@@ -34,17 +39,18 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# Filter both dataframes down to District 7 only\n",
"df1 = df1[df1['neighborhood_services_district']=='7']\n",
"df2 = df2[df2['neighborhood_services_district']=='7']"
]
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 6,
"metadata": {},
"outputs": [
{
@@ -482,7 +488,7 @@
"[14437 rows x 30 columns]"
]
},
"execution_count": 16,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -493,7 +499,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 7,
"metadata": {},
"outputs": [
{
@@ -509,48 +515,30 @@
" dtype='object')"
]
},
"execution_count": 14,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Looking at the columns to decide which are relevant to the task\n",
"df1.columns"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'16'"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df1['ward'][235768]"
]
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"# Dropping unnecessary columns\n",
"columns_to_drop = ['closure_reason', 'case_title','queue','submitted_photo', 'closed_photo','fire_district', 'pwd_district','ward', 'precinct']\n",
"df1.drop(columns_to_drop,axis=1,inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"execution_count": 10,
"metadata": {},
"outputs": [
{
@@ -988,7 +976,7 @@
"[14437 rows x 21 columns]"
]
},
"execution_count": 35,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -997,9 +985,16 @@
"df1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Completed v/s Pending requests"
]
},
{
"cell_type": "code",
"execution_count": 46,
"execution_count": 11,
"metadata": {},
"outputs": [
{
@@ -1012,6 +1007,7 @@
}
],
"source": [
"# Conversion into Date time format\n",
"df1['open_dt'] = pd.to_datetime(df1['open_dt'])\n",
"df1['closed_dt'] = pd.to_datetime(df1['closed_dt'])\n",
"df1['sla_target_dt'] = pd.to_datetime(df1['sla_target_dt'])\n",
@@ -1023,9 +1019,37 @@
"print(f\"Total outstanding requests in D7: {outstanding_requests}\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### On-time v/s Overdue requests"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total ontime requests in D7: 11257\n",
"Total overdue requests in D7: 3180\n"
]
}
],
"source": [
"ontime_requests = df1[df1['on_time'] == 'ONTIME'].shape[0]\n",
"overdue_requests = df1[df1['on_time'] == 'OVERDUE'].shape[0]\n",
"print(f\"Total ontime requests in D7: {ontime_requests}\")\n",
"print(f\"Total overdue requests in D7: {overdue_requests}\")"
]
},
{
"cell_type": "code",
"execution_count": 48,
"execution_count": 13,
"metadata": {},
"outputs": [
{
@@ -1035,7 +1059,7 @@
" 'City Worker App', 'Employee Generated'], dtype=object)"
]
},
"execution_count": 48,
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
@@ -1046,7 +1070,7 @@
},
{
"cell_type": "code",
"execution_count": 47,
"execution_count": 14,
"metadata": {},
"outputs": [
{
@@ -1068,7 +1092,23 @@
},
{
"cell_type": "code",
"execution_count": 57,
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"import folium"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Open v/s Closed Cases Map"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
@@ -1078,6 +1118,7 @@
"\n",
"boston_map = folium.Map(location=[df_filtered['latitude'].mean(), df_filtered['longitude'].mean()], zoom_start=12)\n",
"\n",
"# Red marks open cases and green marks the closed ones\n",
"def get_marker_color(status):\n",
" if status == 'Open':\n",
" return 'red'\n",
@@ -1098,9 +1139,16 @@
"boston_map.save(map_filename)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Open v/s Closed Cases Heat-Map"
]
},
{
"cell_type": "code",
"execution_count": 58,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
@@ -1128,6 +1176,13 @@
"boston_map.save(map_filename)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Heat map with GeoPandas and D7 shapefile"
]
},
{
"cell_type": "code",
"execution_count": 86,
@@ -1154,6 +1209,7 @@
],
"source": [
"neighborhoods_gdf = gpd.read_file('CityCouncilDistricts_2023_5_25_1809585831812363727/CityCouncilDistricts_2024.shp')\n",
"# This filters the neighborhoods GeoDataFrame to include only rows where the 'DISTRICT' column is 7\n",
"neighborhoods_filtered = neighborhoods_gdf[neighborhoods_gdf['DISTRICT'] == 7]\n",
"\n",
"# Number of cases by neighborhood_services_district and case_status\n",
@@ -1165,9 +1221,11 @@
"if case_counts.index.dtype == 'object':\n",
" case_counts.index = case_counts.index.astype(int)\n",
"\n",
"# Merging the filtered neighborhood data with the case counts based on the 'DISTRICT' column.\n",
"neighborhoods_filtered = neighborhoods_filtered.merge(case_counts, left_on='DISTRICT', right_index=True, how='left')\n",
"neighborhoods_filtered.fillna(0, inplace=True)\n",
"\n",
"# Plotting the heatmap for closed cases\n",
"fig, axes = plt.subplots(1, 2, figsize=(20, 10))\n",
"closed_cases_limits = (0, neighborhoods_filtered['Closed'].max() * 1.2) \n",
"open_cases_limits = (0, neighborhoods_filtered['Open'].max() * 1.2) \n",
@@ -1181,6 +1239,7 @@
"axes[0].set_xlabel('Longitude')\n",
"axes[0].set_ylabel('Latitude')\n",
"\n",
"# Plotting the heatmap for open cases\n",
"neighborhoods_filtered.plot(column='Open', ax=axes[1], legend=True, \n",
" cmap='Greens', edgecolor='black', \n",
" legend_kwds={'label': \"Number of Open Cases\", 'orientation': \"vertical\"},\n",
@@ -1194,11 +1253,18 @@
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "myenv",
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
@@ -1212,7 +1278,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
"version": "3.10.11"
}
},
"nbformat": 4,
Binary file removed fa24-team-a/Project Early Insights.pdf
Binary file not shown.