Expand Mobility Data Limitations (#48)
g4brielvs authored Oct 24, 2023
1 parent 31c2677 commit dd942fa
Showing 6 changed files with 114 additions and 522 deletions.
22 changes: 11 additions & 11 deletions docs/_toc.yml
@@ -7,7 +7,7 @@ parts:
- file: docs/introduction_to_data_goods
- file: docs/foundational_datasets_and_data_products
sections:
- file: notebooks/earthquake-intensity/earthquake_intensity.ipynb
- file: notebooks/earthquake-intensity/earthquake_intensity.ipynb
- caption: Data Products
chapters:
- file: notebooks/ntl-analysis/README
@@ -18,15 +18,15 @@ parts:
sections:
- file: notebooks/mobility/stops/README
sections:
- file: notebooks/mobility/stops/01a-aoi-and-tessellation.ipynb
- file: notebooks/mobility/stops/01b-convenience-sampling.ipynb
- file: notebooks/mobility/stops/02-validate-mobility-data.ipynb
- file: notebooks/mobility/stops/03a-count-within-aoi.ipynb
- file: notebooks/mobility/stops/03b-estimate-stay-locations.ipynb
- file: notebooks/mobility/stops/01a-aoi-and-tessellation.ipynb
- file: notebooks/mobility/stops/01b-convenience-sampling.ipynb
- file: notebooks/mobility/stops/02-validate-mobility-data.ipynb
- file: notebooks/mobility/stops/03a-count-within-aoi.ipynb
- file: notebooks/mobility/stops/03b-estimate-stay-locations.ipynb
- url: https://datapartnership.org/turkiye-earthquake-impact/notebooks/mobility/activity.html
title: Estimating Activity based on Mobility Data
title: Estimating Activity through Mobility Data
- url: https://datapartnership.org/turkiye-earthquake-impact/notebooks/mobility/visits.html
title: Estimating Activity based on Visits to Points of Interest
title: Estimating Activity based on Visits to Points of Interest through Mobility Data
- file: notebooks/hsos-survey/README.md
- file: notebooks/ais-analysis/README
sections:
@@ -36,14 +36,14 @@ parts:
- file: notebooks/traffic/README
- file: notebooks/syria-forest-cover/2023-summer-tree-cover-loss.md
sections:
- file: notebooks/syria-forest-cover/syria_forest.ipynb
- file: notebooks/syria-forest-cover/syria_forest.ipynb
- file: notebooks/internet-connectivity/README
sections:
- file: notebooks/internet-connectivity/ookla-speedtest-analysis.ipynb
- file: notebooks/internet-connectivity/ookla-speedtest-analysis.ipynb
- file: notebooks/conflict/acled.ipynb
- file: notebooks/vegetation-conditions/README
sections:
- file: notebooks/vegetation-conditions/Seasonality_Parameters_Data_Extraction.md
- file: notebooks/vegetation-conditions/Seasonality_Parameters_Data_Extraction.md
- caption: Insights and Indicators
chapters:
- file: docs/insights-and-indicators
22 changes: 7 additions & 15 deletions notebooks/mobility/README.md
@@ -1,28 +1,20 @@
# Population Movement Trends
# Movement Trends

This pilot study seeks to demonstrate the potential of mobility data as a powerful tool for estimating population movement trends, particularly in the context of emergencies and data scarcity.
Understanding population movement trends is of paramount importance for various fields, including urban planning, disaster response, and public policy formulation. Traditional methods of data collection, such as census surveys, while valuable, often face limitations in terms of scale, timeliness, and granularity. Mobility data, sourced from mobile devices, GPS, and other location-based services, offers an alternative approach that can transcend these limitations. By analyzing anonymized and aggregated location data generated by a sample of individuals, we aim to uncover nuanced patterns of movement within urban and rural settings, and gauge the impact of external factors, such as public events or emergencies, on population mobility. It is crucial to emphasize that this approach, however, does not come without **significant limitations**, in particular, in terms of sample bias.

Understanding population movement trends is of paramount importance for various fields, including urban planning, disaster response, and public policy formulation. Traditional methods of data collection, such as census surveys and transportation studies, while valuable, often face limitations in terms of scale, timeliness, and granularity. Mobility data, sourced from mobile devices, GPS, and other location-based services, offers an alternative approach that can transcend these limitations. By analyzing anonymized and aggregated location data from a sample of individuals, we aim to uncover nuanced patterns of movement within urban and rural settings, and gauge the impact of external factors, such as public events or emergencies, on population mobility. It is paramount to emphasize that this approach does not come without significant limitations, in particular in terms of sample bias.

The objectives of this study are threefold:
This pilot study seeks to demonstrate the potential of mobility data as a powerful tool for estimating population movement trends, particularly in the context of emergencies and data scarcity. The objectives of this study are threefold:

- To assess the feasibility and accuracy of using mobility data for estimating population movement trends.
- To develop analytical methods and models that can extract meaningful insights from this data.
- To showcase the practical applications and relevance of such insights for informed decision-making in various domains.

This pilot study resulted in the following (working) outputs:
This pilot study resulted in the following (experimental) outputs:

- {ref}`mobility-stops`
- [Türkiye-Syria Earthquake Impact](https://datapartnership.org/turkiye-earthquake-impact/notebooks/mobility/README.html)

## Data Availability Statement

Data are available upon request through the [Development Data Partnership](https://datapartnership.org). Licensing and access information for all other datasets are included in the documentation.
## Data

## Limitations
### Data Availability Statement

```{warning}
- **Sample Bias:** The sampled population is composed of GPS-enabled devices drawn from a longitudinal mobility data panel. It is important to emphasize that the sampled population is obtained via convenience sampling and that the mobility data panel represents only a subset of the total population in an area at a time, specifically only users that turned on location tracking on their mobile device. Thus, derived metrics do not represent the total population density.
- **Incomplete Coverage:** Mobility data is typically collected from sources such as mobile phone networks, GPS devices, or transportation systems. These sources may not be representative of the entire population or all economic activities, leading to sample bias and potentially inaccurate estimations. Not all individuals or businesses have access to devices or services that generate mobility data. This can result in incomplete coverage and potential underrepresentation of certain demographic groups or economic sectors.
- **Lack of Contextual Information:** Mobility data primarily captures movement patterns and geolocation information. It may lack other crucial contextual information, such as transactional data, business types, or specific economic activities, which are essential for accurate estimation of economic activity.
```
Data are available upon request through the [Development Data Partnership](https://datapartnership.org). Licensing and access information for all other datasets are included throughout this documentation.
68 changes: 32 additions & 36 deletions notebooks/mobility/stops/01b-convenience-sampling.ipynb
@@ -5,9 +5,9 @@
"id": "dba12959-03fc-4875-9ab5-35888785c799",
"metadata": {},
"source": [
"# Convenience Sampling\n",
"# Constructing Samples\n",
"\n",
"In this step, we create sub-panels **A** (formal) and **B** (informal) as described in the [methodological notes](README.md) of this pilot study. The sub-panels are composed of longitudinal mobility data generated by GPS-enabled devices based on whether they were detected in the proximity of points of interest in [Region A or Region B](01a-aoi-and-tessellation.ipynb#regions-a-b) throughout the time horizon. \n",
"In this step, we create sub-panels **A** (formal) and **B** (informal) as described in the [methodological notes](README.md) of this pilot study. The sub-panels are composed of longitudinal mobility data generated by GPS-enabled devices based on whether they were detected within the proximity of [Region A or Region B](01a-aoi-and-tessellation.ipynb#regions-a-b) throughout the time horizon. \n",
"\n",
"The **A** (formal) and **B** (informal) sub-panels are respectively defined as follows.\n",
"\n",
@@ -32,7 +32,7 @@
"id": "404bd056-39d2-4cce-9073-402603397a1e",
"metadata": {
"tags": [
"hide-cell"
"remove-cell"
]
},
"outputs": [],
@@ -42,25 +42,6 @@
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "1081eafc-d4c7-40e5-9895-a366f1b79d08",
"metadata": {
"tags": [
"parameters"
]
},
"outputs": [],
"source": [
"# Parameters \n",
"# https://papermill.readthedocs.io/en/latest/usage-parameterize.html\n",
"DASK_SCHEDULER_ADDRESS = None\n",
"\n",
"AOI = \"id=7&name=A\"\n",
"NAME = \"A\""
]
},
{
"cell_type": "code",
"execution_count": 3,
@@ -85,6 +66,26 @@
"## Data"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "1081eafc-d4c7-40e5-9895-a366f1b79d08",
"metadata": {
"tags": [
"parameters",
"hide-cell"
]
},
"outputs": [],
"source": [
"# Parameters\n",
"# https://papermill.readthedocs.io/en/latest/usage-parameterize.html\n",
"DASK_SCHEDULER_ADDRESS = None\n",
"\n",
"AOI = \"id=7&name=A\"\n",
"NAME = \"A\""
]
},
{
"cell_type": "markdown",
"id": "30d4bb84-5a6c-487a-9f5f-079ccded8c14",
@@ -356,7 +357,11 @@
"cell_type": "code",
"execution_count": 8,
"id": "748dcf20-7ef3-4e5a-8845-ff32b52c1dff",
"metadata": {},
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [
{
"data": {
@@ -398,7 +403,7 @@
"tags": []
},
"source": [
"## Sampling\n",
"## Sampling Strategy\n",
"\n",
"We select sub-panels of devices using convenience sampling, a non-probability form of sampling. The sampling method is a **key limitation** of this approach."
]
@@ -451,20 +456,11 @@
{
"cell_type": "code",
"execution_count": null,
"id": "fc2c6c6a-df80-49ff-80e8-61f95c1b30ff",
"id": "bcf34670",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2022-12-19 20:57:05,860 - distributed.scheduler - WARNING - Received heartbeat from unregistered worker 'tcp://127.0.0.1:52557'.\n",
"2022-12-19 20:57:05,864 - distributed.scheduler - WARNING - Received heartbeat from unregistered worker 'tcp://127.0.0.1:52562'.\n"
]
}
],
"outputs": [],
"source": [
"devices = ddf[ddf[\"h3_10\"].isin(AOI[\"hex_id\"])][\"uid\"].unique().compute()\n",
"devices = ddf[ddf[\"hex_id\"].isin(AOI[\"hex_id\"])][\"uid\"].unique().compute()\n",
"devices = devices.to_frame()"
]
},
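
The last cell in this hunk keeps every device (`uid`) that has at least one record in a hexagon belonging to the area of interest. A minimal sketch of that selection follows, assuming plain pandas in place of what appears to be a Dask DataFrame (`ddf`) in the notebook. The `pings` and `aoi` frames, their values, and the `select_devices` helper are hypothetical and serve only to illustrate the filter.

```python
import pandas as pd

# Hypothetical stand-in for the mobility panel: one row per device record,
# tagged with the hexagon it falls in. Values are made up for illustration.
pings = pd.DataFrame(
    {
        "uid": ["d1", "d1", "d2", "d3", "d3"],
        "hex_id": ["hexA", "hexZ", "hexZ", "hexB", "hexA"],
    }
)

# Hypothetical area of interest: the hexagon identifiers covering the region.
aoi = pd.DataFrame({"hex_id": ["hexA", "hexB"]})


def select_devices(pings: pd.DataFrame, aoi: pd.DataFrame) -> pd.DataFrame:
    """Return the unique devices seen at least once inside the area of interest."""
    in_aoi = pings["hex_id"].isin(aoi["hex_id"])
    # pandas' unique() returns an array, so wrap it before building a frame.
    uids = pings.loc[in_aoi, "uid"].unique()
    return pd.Series(uids, name="uid").to_frame()


devices = select_devices(pings, aoi)
print(devices)
```

Because a device qualifies after a single record inside the area, this is a convenience sample in the sense the notebook describes: it reflects who happened to be observed, not a probability sample of the population.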
12 changes: 11 additions & 1 deletion notebooks/mobility/stops/03a-count-within-aoi.ipynb

Large diffs are not rendered by default.