Display local council boundaries and add some basic council data #506

Merged · 59 commits · Apr 24, 2024

Commits
ce55c5b
import local council areas and geometry
struan Mar 5, 2024
44c6e23
display local council boundaries on the map
struan Mar 5, 2024
7cc2dbd
add council area types to AreaType model
struan Mar 6, 2024
51bac3f
hide MP tab from council area pages
struan Mar 6, 2024
ce9ff6a
update map header area type description to handle councils
struan Mar 6, 2024
586751e
do not include MP name with council data
struan Mar 6, 2024
de05f7e
update areas available for GSS fake dataset
struan Mar 6, 2024
76e5d0b
update table display to cope with councils
struan Mar 6, 2024
f818b47
add mixin to allow base importers to handle multiple area types
struan Mar 11, 2024
3cf03a1
import some basic council data
struan Mar 11, 2024
208482d
only allow shadeable datasets to be selected for shading
struan Mar 12, 2024
45a475e
import council types
struan Mar 12, 2024
e4385ef
Do not default to datasets with descriptions matching their label
zarino Mar 14, 2024
fbbb06e
Thousands separators for large numbers on Area page
zarino Mar 14, 2024
7358922
Location search now handles all area types
zarino Apr 1, 2024
fdbe84b
Added UI for area type in Map page
lucascumsille Apr 4, 2024
d567bd0
Added dropdown icon
lucascumsille Apr 8, 2024
f818bfb
Only show “MP” column in Explore page modal if area_type has an MP
zarino Apr 23, 2024
d4c18aa
import council RUC classification
struan Mar 13, 2024
8d4e7b5
imports for council climate emergency declarations and net zero targets
struan Mar 13, 2024
b3ada63
add some useful import functions
struan Mar 13, 2024
d98c43f
imports for council emissions data
struan Mar 13, 2024
26ff3fc
add council type filter function to import utils
struan Mar 14, 2024
fa45f78
update council filter to only return current authorities
struan Mar 19, 2024
2b73702
update council imports to filter out types not being imported
struan Mar 19, 2024
7b98666
import if council has a climate action plan
struan Mar 20, 2024
0bd7a40
update area page to allow related categories for place
struan Mar 21, 2024
be5b591
improve display of net zero declarations
struan Mar 21, 2024
f52bf1d
import council countries
struan Mar 21, 2024
58a7cbb
update LatLon generator to add columns for all area types
struan Mar 21, 2024
2e45a0b
add mapit calls to get GSS codes by area type from lat/lon and pc
struan Mar 21, 2024
2579d7f
fix base constituency count importer to handle multiple area types
struan Mar 21, 2024
ee24f6e
import GWGW 2022 council event details
struan Mar 21, 2024
0c294a1
add council areas to WI CSV generator
struan Mar 25, 2024
355becb
add council areas to foodbank CSV generator
struan Mar 25, 2024
365b2ff
add council areas to national trust property CSV generator
struan Mar 25, 2024
8601b0f
add council areas to RSPB reserve CSV generator
struan Mar 25, 2024
0ffa0f1
add council areas to power postcode generator
struan Mar 25, 2024
59b5a5b
add council areas to onshore windfarm generator
struan Mar 25, 2024
eb2ac76
add council areas to save the children generator
struan Mar 25, 2024
cfca098
add council areas to wildlife trust reserves generator
struan Mar 25, 2024
245a6ae
update power postcodes importer to include councils
struan Mar 26, 2024
8eb40ad
update foodbank importer to handle councils
struan Mar 26, 2024
45ca360
update GBGW importers to handle councils
struan Mar 26, 2024
7dab603
update NT property importer to handle councils
struan Mar 26, 2024
1526f71
update windfarm importer to handle councils
struan Mar 26, 2024
1f3781f
update RSPB importer to handle councils
struan Mar 26, 2024
672038a
update save the children importer to handle councils
struan Mar 26, 2024
c7766b6
update wildlife trust reserve importer to handle councils
struan Mar 26, 2024
f17ad01
update WI group importer to handle councils
struan Mar 26, 2024
f4bc7ab
import council polling data from Onward and RenewableUK
struan Mar 27, 2024
673153d
import council level IMD data
struan Apr 2, 2024
f01ca1e
generate/import tearfund church data for all areas
struan Apr 9, 2024
112c7b4
update dataset list generator to add council area types
struan Apr 9, 2024
89c1599
local council action scorecards import
struan Apr 17, 2024
8c767e9
display changes to support local council scorecards
struan Apr 17, 2024
408b8df
fix constituency country importer for multiple areas
struan Apr 24, 2024
9c98b7f
utility command to run all the council importers
struan Apr 24, 2024
f983cff
fix HNH import script bugs
struan Apr 24, 2024
2 changes: 1 addition & 1 deletion hub/admin.py
@@ -52,7 +52,7 @@ class DataSetAdmin(admin.ModelAdmin):
"is_public",
)
list_editable = ("order", "featured", "is_public")
-    list_filter = ("category", "featured", "data_type", "is_public")
+    list_filter = ("category", "featured", "areas_available", "is_public", "data_type")
ordering = ("category", "order", "label")
search_fields = ["name", "label", "description", "source", "source_label"]

4 changes: 2 additions & 2 deletions hub/fixtures/areas.json
@@ -5,7 +5,7 @@
"fields": {
"name": "Constituency",
"code": "WMC",
-        "area_type": "Constituency",
+        "area_type": "Westminster Constituency",
"description": "Constituency"
}
},
@@ -15,7 +15,7 @@
"fields": {
"name": "Constituency 2023",
"code": "WMC23",
-        "area_type": "Constituency 2023",
+        "area_type": "Westminster Constituency",
"description": "Constituency 2023"
}
},
94 changes: 94 additions & 0 deletions hub/import_utils.py
@@ -0,0 +1,94 @@
from datetime import date
from functools import lru_cache

import pandas as pd
from mysoc_dataset import get_dataset_url

council_types = {"STC": ["CTY", "LBO", "MD", "SCO", "NID", "UA", "WPA"], "DIS": ["NMD"]}


@lru_cache
def get_authority_mapping() -> pd.DataFrame:
"""
Return a dataframe mapping different names to authority code
"""
url = get_dataset_url(
repo_name="uk_local_authority_names_and_codes",
package_name="uk_la_future",
version_name="1",
file_name="lookup_name_to_registry.csv",
done_survey=True,
)
return pd.read_csv(url)


@lru_cache
def get_council_df():
"""
Return a dataframe of current and historical UK councils
"""
url = get_dataset_url(
repo_name="uk_local_authority_names_and_codes",
package_name="uk_la_future",
version_name="1",
file_name="uk_local_authorities_future.csv",
done_survey=True,
)
return pd.read_csv(url)


def add_gss_codes(df: pd.DataFrame, code_column: str):
"""
Given a DataFrame and the name of the column holding the local authority code, add a column called "gss_code"
"""
authority_df = get_council_df()

rows = len(df[code_column])
df["gss_code"] = pd.Series([None] * rows, index=df.index)

for index, row in df.iterrows():
authority_code = row[code_column]
if not pd.isnull(authority_code):
authority_match = authority_df[
authority_df["local-authority-code"] == authority_code
]
df.at[index, "gss_code"] = authority_match["gss-code"].values[0]

return df


def _filter_authority_type(df: pd.DataFrame, types: list, gss_code: str):
authority_df = get_council_df()

today = date.today()

rows = len(df[gss_code])
df["type"] = pd.Series([None] * rows, index=df.index)
df["start-date"] = pd.Series([None] * rows, index=df.index)
df["end-date"] = pd.Series([None] * rows, index=df.index)
for index, row in df.iterrows():
if not pd.isnull(row[gss_code]):
authority_match = authority_df[authority_df["gss-code"] == row[gss_code]]
df.at[index, "type"] = authority_match["local-authority-type"].values[0]
df.at[index, "start-date"] = pd.to_datetime(
authority_match["start-date"].values[0]
).date()
df.at[index, "end-date"] = pd.to_datetime(
authority_match["end-date"].values[0]
).date()

df = df.loc[df["type"].isin(types)]

# only select authorities with a start date in the past
df = df.loc[(df["start-date"] < today) | df["start-date"].isna()]

# only select authorities with an end date in the future
df = df.loc[(df["end-date"] > today) | df["end-date"].isna()]

return df


def filter_authority_type(
df: pd.DataFrame, authority_type: str, gss_code: str = "gss-code"
):
return _filter_authority_type(df, council_types[authority_type], gss_code)
132 changes: 107 additions & 25 deletions hub/management/commands/base_generators.py
@@ -15,8 +15,28 @@
RateLimitException,
)

mapit_types = {
"LBO": "STC",
"UTA": "STC",
"COI": "STC",
"LGD": "STC",
"CTY": "STC",
"MTD": "STC",
"NMD": "DIS",
"DIS": "DIS",
"WMC": "WMC",
"WMCF": "WMC23",
}


class BaseLatLonGeneratorCommand(BaseCommand):
uses_gss = False
uses_postcodes = False
out_file = None
location_col = "lat_lon"
legacy_col = "area"
cols = ["WMC", "WMC23", "STC", "DIS"]

tqdm.pandas()

def get_dataframe(self):
@@ -25,53 +45,112 @@ def get_dataframe(self):

return df

def _process_lat_long(self, lat_lon=None, row_name=None):
lat = lat_lon[0]
lon = lat_lon[1]

if not pd.isna(lat) and not pd.isna(lon):
def _process_location(self, lat_lon=None, postcode=None, row_name=None):
lat, lon = None, None
if lat_lon is not None:
lat = lat_lon[0]
lon = lat_lon[1]

cols = [self.legacy_col, *self.cols]
if (self.uses_postcodes and not pd.isna(postcode)) or (
not pd.isna(lat) and not pd.isna(lon)
):
areas = {}
try:
mapit = MapIt()
gss_codes = mapit.wgs84_point_to_gss_codes(lon, lat)

area = Area.objects.filter(gss__in=gss_codes).first()
if area:
return area.name
if self.uses_postcodes:
gss_codes = mapit.postcode_point_to_gss_codes_with_type(postcode)
else:
return None
gss_codes = mapit.wgs84_point_to_gss_codes_with_type(lon, lat)

for area_type, code in gss_codes.items():
if mapit_types.get(area_type, None) is not None:
if self.uses_gss:
areas[mapit_types[area_type]] = code
else:
area = Area.objects.filter(
gss=code, area_type__code=mapit_types[area_type]
).first()
areas[mapit_types[area_type]] = area.name
else:
continue
except (
NotFoundException,
BadRequestException,
InternalServerErrorException,
ForbiddenException,
) as error:
print(f"Error fetching row {row_name} with {lat}, {lon}: {error}")
return None
location_data = lat_lon
if self.uses_postcodes:
location_data = postcode
self.stderr.write(
f"Error fetching row {row_name} with {location_data}: {error}"
)
return pd.Series([None for t in cols], index=cols)
except RateLimitException as error:
print(f"Mapit Error - {error}, waiting for a minute")
self.stderr.write(f"Mapit Error - {error}, waiting for a minute")
sleep(60)
return False

areas[self.legacy_col] = areas.get("WMC", None)
vals = [areas.get(t, None) for t in cols]
return pd.Series(vals, index=cols)
else:
print(f"missing lat or lon for row {row_name}")
return None
self.stderr.write(f"missing location data for row {row_name}")
return pd.Series([None for t in cols], index=cols)

def process_lat_long(self, lat_lon=None, row_name=None):
success = self._process_lat_long(lat_lon=lat_lon, row_name=row_name)
def process_location(self, lat_lon=None, postcode=None, row_name=None):
success = self._process_location(
lat_lon=lat_lon, postcode=postcode, row_name=row_name
)
# retry once if it fails so we can catch rate limit errors
if success is False:
return self._process_lat_long(lat_lon=lat_lon, row_name=row_name)
return self._process_location(
lat_lon=lat_lon, postcode=postcode, row_name=row_name
)
else:
return success

def get_location_from_row(self, row):
if self.uses_postcodes:
return {"postcode": row["postcode"]}
else:
return {"lat_lon": [row["lat"], row["lon"]]}

def process_data(self, df):
if not self._quiet:
self.stdout.write("Generating Area name from lat + lon values")
self.stdout.write("Generating Area details from location values")

df["area"] = df.progress_apply(
lambda row: self.process_lat_long(
self.get_lat_lon_from_row(row), row[self.row_name]
),
axis=1,
if not self._ignore and self.out_file is not None:
try:
# check that we've got all the output we're expecting before using
# the old values
old_df = pd.read_csv(self.out_file)
usecols = list(set(self.cols).intersection(df.columns))
if len(usecols) == len(self.cols):
old_df = pd.read_csv(
self.out_file, usecols=[self.lat_lon_row, *self.cols]
)
location_lookup = {
row[self.location_col]: row[self.legacy_col]
for index, row in old_df.iterrows()
}
if not self._quiet:
self.stdout.write("Reading codes from existing file")
df[self.legacy_col] = df.apply(
lambda row: location_lookup.get((row[self.location_col]), None),
axis=1,
)
except FileNotFoundError:
self.stderr.write("No existing file.")

df = df.join(
df.progress_apply(
lambda row: self.process_location(
row_name=row[self.row_name], **self.get_location_from_row(row)
),
axis=1,
)
)

return df
@@ -89,6 +168,9 @@ def add_arguments(self, parser):

def handle(self, quiet=False, ignore=False, *args, **options):
self._quiet = quiet

if not self._quiet:
self.stdout.write(self.message)
self._ignore = ignore
df = self.get_dataframe()
out_df = self.process_data(df)
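
In the diff above, `_process_location` signals a MapIt rate limit by sleeping and returning `False`, and `process_location` then retries the lookup exactly once. A standalone sketch of that retry shape — `make_flaky_lookup` and its return values are invented for illustration and are not the PR's MapIt client:

```python
def make_flaky_lookup(fail_first: bool):
    # Build a lookup that optionally rate-limits on its first call,
    # mimicking _process_location's "return False and let the caller retry".
    calls = {"n": 0}

    def lookup(postcode):
        calls["n"] += 1
        if fail_first and calls["n"] == 1:
            return False  # rate-limited; caller should retry
        return {"postcode": postcode, "gss": "E07000000"}

    return lookup


def lookup_with_retry(lookup, postcode):
    # Mirror process_location: one call, and a single retry on False.
    result = lookup(postcode)
    if result is False:
        return lookup(postcode)
    return result


flaky = make_flaky_lookup(fail_first=True)
print(lookup_with_retry(flaky, "SW1A 1AA")["gss"])  # E07000000
```

A single retry suffices here because the worker has already slept out the rate-limit window before returning `False`; a second failure propagates as-is rather than looping forever.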