ObservedProperty label & test data loader through ingestion #209

Merged 23 commits on Nov 19, 2024
Changes from 14 commits
b64fe73
refactor: ObservedProperty description and label generation
fjugipe Oct 16, 2024
82b4397
test: refactor integration tests to be CF units & parameters compliant
fjugipe Oct 16, 2024
c3f2b02
test: refactor id hash, add fields to mdata
fjugipe Oct 17, 2024
cb691e0
test: KNMI test data through ingest & update data-loader depencies
fjugipe Oct 17, 2024
9b269fb
refactor: data-loader Dockerfile, compose and just commands with inge…
fjugipe Oct 17, 2024
cbd22c5
ci: test-datastore to use ingest_load
fjugipe Oct 17, 2024
f73f2c3
test: Dockerize ingest unit tests & add just commands
fjugipe Oct 18, 2024
6387038
ci: run ingest tests in docker and publish results
fjugipe Oct 18, 2024
f502f45
fix: copy-proto before ingest_unit
fjugipe Oct 18, 2024
686653c
ci: re-order publish test results
fjugipe Oct 18, 2024
0d101cf
ci: add cleanup to ingest-test
fjugipe Oct 18, 2024
72ac8a7
ci: use multiple-files for a single comment
fjugipe Oct 18, 2024
9cca7ed
ci: fix artifacts path for coverage reporting
fjugipe Oct 18, 2024
395f9f0
ci: update coverage comment with correct coverage path and badges
fjugipe Oct 18, 2024
167e775
style: code formatting
fjugipe Nov 5, 2024
c5de77e
refactor: move helper functions to utilities.py
fjugipe Nov 5, 2024
5661a87
refactor: data-loaders to use utilities.py
fjugipe Nov 5, 2024
96526fe
style: move core logic to main()
fjugipe Nov 5, 2024
603ffef
refactor: use functools.partial instead of starmap with requests
fjugipe Nov 5, 2024
086e592
test: modify test data
fjugipe Nov 7, 2024
23103c4
style: consistent naming
fjugipe Nov 7, 2024
fef62df
test: adjust test data generation & test responses
fjugipe Nov 11, 2024
2347b27
Merge branch 'main' into issue_194_204-data-loader-to-ingest
fjugipe Nov 19, 2024
72 changes: 36 additions & 36 deletions .github/workflows/ci.yml
@@ -61,7 +61,7 @@ jobs:
run: just services

- name: Load the data into the database
run: just load
run: just ingest_load

- name: Run the integration test
run: just integration
@@ -113,56 +113,56 @@ jobs:
with:
python-version: ${{ matrix.python-version }}
architecture: x64
- name: Checkout Source
uses: actions/checkout@v4

- name: Install Dependencies
run: |
pip install --upgrade pip
pip install pytest-timeout
pip install pytest-cov
pip install httpx
pip install -r ./ingest/requirements.txt
pip install ./ingest
cd ./ingest && python3 api/generate_standard_name.py
- name: Install just
run: ./ci/scripts/install-just.sh

- name: Copy protobuf files
run: just copy-proto

- name: Copy Protobuf file to api directory and build
run: |
mkdir ./ingest/protobuf
cp ./protobuf/datastore.proto ./ingest/protobuf/datastore.proto
python -m grpc_tools.protoc --proto_path=./ingest/protobuf --python_out=./ingest --grpc_python_out=./ingest ./ingest/protobuf/datastore.proto
- name: Run the unit test
run: just ingest_unit

- name: Run Tests
run: |
cd ingest
mkdir -p /tmp/metrics
PROMETHEUS_MULTIPROC_DIR=/tmp/metrics python -m pytest -v --timeout=60
- name: Archive test artifacts
uses: actions/upload-artifact@v4
with:
name: ingest-test-results-artifact
path: |
ingest/test/output/pytest-coverage.txt
ingest/test/output/pytest.xml

- name: Cleanup
if: always()
run: just destroy

publish-test-results:
needs: test-datastore
needs:
- test-datastore
- test-ingest
runs-on: ubuntu-latest
if: github.event.ref_type != 'tag'
if: github.event_name != 'push' || github.event.ref_type != 'tag'
permissions:
contents: write
issues: write
pull-requests: write
steps:
- name: Download test results so that they can be published
- name: Download test-datastore results so that they can be published
uses: actions/download-artifact@v4
with:
name: test-results-artifact
path: ./artifacts/test-results

- name: Download test-ingest results so that they can be published
uses: actions/download-artifact@v4
with:
name: ingest-test-results-artifact
path: ./artifacts/ingest-results

- name: Comment coverage
uses: MishaKav/pytest-coverage-comment@main
with:
pytest-coverage-path: api/test/output/pytest-coverage.txt
coverage-path-prefix: api/test/output/
title: API Unit Test Coverage Report
hide-badge: true
hide-report: false
create-new-comment: false
hide-comment: false
report-only-changed-files: false
remove-link-from-badge: false
junitxml-path: api/test/output/pytest.xml
junitxml-title: API Unit Test Coverage Summary
title: Unit Test Coverage Report
pytest-coverage-path: ./artifacts/test-results/api/test/output/pytest-coverage.txt
multiple-files: |
API Unit Tests, ./artifacts/test-results/api/test/output/pytest-coverage.txt, ./artifacts/test-results/api/test/output/pytest.xml
Ingest Unit Tests, ./artifacts/ingest-results/pytest-coverage.txt, ./artifacts/ingest-results/pytest.xml
1 change: 1 addition & 0 deletions .gitignore
@@ -212,3 +212,4 @@ cf_standard_names_v84.txt
# API
api/test/output/
datastore/load-test/output/
ingest/test/output/
6 changes: 3 additions & 3 deletions api/formatters/covjson.py
@@ -32,6 +32,7 @@
def make_parameter(ts_mdata):
level = convert_cm_to_m(ts_mdata.level)
period = seconds_to_iso_8601_duration(ts_mdata.period)
label = " ".join(ts_mdata.standard_name.capitalize().split("_"))

custom_fields = {
"rodeo:standard_name": ts_mdata.standard_name,
@@ -40,12 +41,11 @@ def make_parameter(ts_mdata):

return Parameter(
description={
"en": f"{ts_mdata.standard_name} at {level}m, "
f"aggregated over {period} with method '{ts_mdata.function}'",
"en": f"{label} at {level}m, " f"aggregated over {period} with method '{ts_mdata.function}'",
},
observedProperty=ObservedProperty(
id=f"https://vocab.nerc.ac.uk/standard_name/{ts_mdata.standard_name}",
label={"en": ts_mdata.parameter_name},
label={"en": label},
),
measurementType=MeasurementType(
method=ts_mdata.function,
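
The label change in this file can be tried in isolation. The sketch below is illustrative only: the helper names `label_from_standard_name` and `describe_parameter` are hypothetical and do not appear in the PR, and the level is assumed to be already converted to metres (the PR does this with `convert_cm_to_m`, whose implementation is not shown here). Only the `" ".join(...capitalize().split("_"))` expression and the description template are taken from the diff.

```python
def label_from_standard_name(standard_name: str) -> str:
    """Turn a CF standard name like 'air_temperature' into 'Air temperature'."""
    # Same expression as in the diff: capitalize the first character
    # (the rest is lowercased), then replace underscores with spaces.
    return " ".join(standard_name.capitalize().split("_"))


def describe_parameter(standard_name: str, level_m: float, period: str, function: str) -> str:
    """Build the human-readable description using the template from the diff.

    Hypothetical helper: level_m is assumed to be in metres already and
    period is assumed to be an ISO 8601 duration string.
    """
    label = label_from_standard_name(standard_name)
    return f"{label} at {level_m}m, aggregated over {period} with method '{function}'"


if __name__ == "__main__":
    print(label_from_standard_name("air_pressure_at_sea_level"))
    print(describe_parameter("air_temperature", 2.0, "PT1M", "mean"))
```

With the inputs above, the results match the updated fixtures in `api/test/test_data/`, where the label no longer repeats the full `standard_name:level:function:period` key.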
5 changes: 3 additions & 2 deletions api/metadata_endpoints.py
@@ -110,17 +110,18 @@ async def get_collection_metadata(base_url: str, is_self) -> Collection:
ts = group.combo
level = convert_cm_to_m(ts.level)
period = seconds_to_iso_8601_duration(ts.period)
label = " ".join(ts.standard_name.capitalize().split("_"))

custom_fields = {
"rodeo:standard_name": ts.standard_name,
"rodeo:level": level,
}

parameter = Parameter(
description=f"{ts.standard_name} at {level}m, aggregated over {period} with method '{ts.function}'",
description=f"{label} at {level}m, aggregated over {period} with method '{ts.function}'",
observedProperty=ObservedProperty(
id=f"https://vocab.nerc.ac.uk/standard_name/{ts.standard_name}",
label=ts.parameter_name,
label=label,
),
measurementType=MeasurementType(
method=ts.function,
12 changes: 6 additions & 6 deletions api/test/test_data/test_coverages_covjson.json
@@ -49,12 +49,12 @@
"air_temperature:2.0:mean:PT1M": {
"type": "Parameter",
"description": {
"en": "air_temperature at 2.0m, aggregated over PT1M with method 'mean'"
"en": "Air temperature at 2.0m, aggregated over PT1M with method 'mean'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/air_temperature",
"label": {
"en": "air_temperature:2.0:mean:PT1M"
"en": "Air temperature"
}
},
"unit": {
@@ -139,12 +139,12 @@
"air_temperature:2.0:mean:PT1M": {
"type": "Parameter",
"description": {
"en": "air_temperature at 2.0m, aggregated over PT1M with method 'mean'"
"en": "Air temperature at 2.0m, aggregated over PT1M with method 'mean'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/air_temperature",
"label": {
"en": "air_temperature:2.0:mean:PT1M"
"en": "Air temperature"
}
},
"unit": {
@@ -186,12 +186,12 @@
"air_temperature:2.0:mean:PT1M": {
"type": "Parameter",
"description": {
"en": "air_temperature at 2.0m, aggregated over PT1M with method 'mean'"
"en": "Air temperature at 2.0m, aggregated over PT1M with method 'mean'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/air_temperature",
"label": {
"en": "air_temperature:2.0:mean:PT1M"
"en": "Air temperature"
}
},
"unit": {
8 changes: 4 additions & 4 deletions api/test/test_data/test_feature_collection.json
@@ -44,12 +44,12 @@
"air_pressure_at_sea_level:1.0:mean:PT1M": {
"type": "Parameter",
"description": {
"en": "air_pressure_at_sea_level at 1.0m, aggregated over PT1M with method 'mean'"
"en": "Air pressure at sea level at 1.0m, aggregated over PT1M with method 'mean'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/air_pressure_at_sea_level",
"label": {
"en": "air_pressure_at_sea_level:1.0:mean:PT1M"
"en": "Air pressure at sea level"
}
},
"unit": {
@@ -67,12 +67,12 @@
"air_temperature:0.1:minimum:PT10M": {
"type": "Parameter",
"description": {
"en": "air_temperature at 0.1m, aggregated over PT10M with method 'minimum'"
"en": "Air temperature at 0.1m, aggregated over PT10M with method 'minimum'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/air_temperature",
"label": {
"en": "air_temperature:0.1:minimum:PT10M"
"en": "Air temperature"
}
},
"unit": {
12 changes: 6 additions & 6 deletions api/test/test_data/test_multiple_covjson.json
@@ -48,12 +48,12 @@
"relative_humidity:2.0:mean:PT1M": {
"type": "Parameter",
"description": {
"en": "relative_humidity at 2.0m, aggregated over PT1M with method 'mean'"
"en": "Relative humidity at 2.0m, aggregated over PT1M with method 'mean'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/relative_humidity",
"label": {
"en": "relative_humidity:2.0:mean:PT1M"
"en": "Relative humidity"
}
},
"unit": {
@@ -71,12 +71,12 @@
"wind_from_direction:2.0:mean:PT10M": {
"type": "Parameter",
"description": {
"en": "wind_from_direction at 2.0m, aggregated over PT10M with method 'mean'"
"en": "Wind from direction at 2.0m, aggregated over PT10M with method 'mean'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/wind_from_direction",
"label": {
"en": "wind_from_direction:2.0:mean:PT10M"
"en": "Wind from direction"
}
},
"unit": {
@@ -94,12 +94,12 @@
"wind_speed:10.0:mean:PT10M": {
"type": "Parameter",
"description": {
"en": "wind_speed at 10.0m, aggregated over PT10M with method 'mean'"
"en": "Wind speed at 10.0m, aggregated over PT10M with method 'mean'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/wind_speed",
"label": {
"en": "wind_speed:10.0:mean:PT10M"
"en": "Wind speed"
}
},
"unit": {
4 changes: 2 additions & 2 deletions api/test/test_data/test_single_covjson.json
@@ -52,12 +52,12 @@
"wind_speed:10.0:mean:PT10M": {
"type": "Parameter",
"description": {
"en": "wind_speed at 10.0m, aggregated over PT10M with method 'mean'"
"en": "Wind speed at 10.0m, aggregated over PT10M with method 'mean'"
},
"observedProperty": {
"id": "https://vocab.nerc.ac.uk/standard_name/wind_speed",
"label": {
"en": "wind_speed:10.0:mean:PT10M"
"en": "Wind speed"
}
},
"unit": {
10 changes: 10 additions & 0 deletions datastore/data-loader/Dockerfile
@@ -2,6 +2,7 @@ FROM python:3.11-slim-bookworm

SHELL ["/bin/bash", "-eux", "-o", "pipefail", "-c"]

ARG THROUGH_INGEST=false
ENV DOCKER_PATH="/clients/python"

COPY "test-data/KNMI/20221231.nc" "${DOCKER_PATH}/test-data/KNMI/20221231.nc"
@@ -29,7 +30,16 @@ RUN python -m grpc_tools.protoc \
--grpc_python_out="${DOCKER_PATH}"

COPY "./parameters.py" "${DOCKER_PATH}/parameters.py"

# Copy both loaders and conditionally use the wanted one
COPY "./client_knmi_station_ingest.py" "${DOCKER_PATH}/client_knmi_station_ingest.py"
COPY "./client_knmi_station.py" "${DOCKER_PATH}/client_knmi_station.py"

RUN if [ "${THROUGH_INGEST}" = "true" ]; then \
mv "${DOCKER_PATH}/client_knmi_station_ingest.py" "${DOCKER_PATH}/client_knmi_station.py"; \
else \
rm "${DOCKER_PATH}/client_knmi_station_ingest.py"; \
fi

WORKDIR "${DOCKER_PATH}"
CMD ["python", "-u", "./client_knmi_station.py"]