DE-101: Re-platform application to Python #11

Merged
merged 20 commits into from
Jul 23, 2024
Changes from 13 commits
14 changes: 0 additions & 14 deletions .c8rc

This file was deleted.

26 changes: 26 additions & 0 deletions .github/workflows/run-unit-tests.yml
@@ -0,0 +1,26 @@
name: Run Python unit tests

on: [ push ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v3

      - name: Set up Python 3.9
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run linter and test suite
        run: |
          make lint
          make test
8 changes: 8 additions & 0 deletions .gitignore
@@ -2,7 +2,15 @@ node_modules

coverage

.aws-sam/
.coverage
.DS_Store
.vscode/
__pycache__/
.terraform.lock.hcl
.terraform
.pytest_cache
dist.zip
*env/
*.py[cod]
*$py.class
1 change: 0 additions & 1 deletion .nvmrc

This file was deleted.

1 change: 1 addition & 0 deletions .python-version
@@ -0,0 +1 @@
3.9.16
34 changes: 0 additions & 34 deletions .travis.yml

This file was deleted.

21 changes: 21 additions & 0 deletions Makefile
@@ -0,0 +1,21 @@
.DEFAULT: help

help:
	@echo "make help"
	@echo " display this help statement"
	@echo "make run"
	@echo " run the application in devel"
	@echo "make test"
	@echo " run associated test suite with pytest"
	@echo "make lint"
	@echo " lint project files using the black linter"

run:
	export ENVIRONMENT=devel; \
	python -c 'import lambda_function; lambda_function.lambda_handler(None, None)'

test:
	pytest tests -W ignore::DeprecationWarning

lint:
	black ./ --check --exclude="(env/)|(tests/)"
66 changes: 23 additions & 43 deletions README.md
@@ -1,68 +1,48 @@
# Kinesis Firehose Avro to Json Transformer Lambda
[![Build Status](https://travis-ci.org/NYPL/firehose-avro-to-json-transformer.svg?branch=main)](https://travis-ci.org/NYPL/firehose-avro-to-json-transformer)

This app reads from Firehose Kinesis streams, decodes the records using the appropriate Avro schema based on the stream name, and returns the resulting records as either JSON or CSV (base64 encoded). This app is responsible for decoding records immediately before ingest into the [BIC](https://github.com/NYPL/BIC).

## Version
> v1.0.1

## Installation

Install all Node dependencies via NPM

```console
nvm use
npm install
```
This Python application is responsible for Avro-decoding events immediately before ingestion into the [BIC](https://github.com/NYPL/BIC). Originally developed for the Data Warehouse, this is deployed as an AWS Lambda (["AvroToJsonTransformer-qa"](https://console.aws.amazon.com/lambda/home?region=us-east-1#/functions/AvroToJsonTransformer-qa?tab=configuration) and ["AvroToJsonTransformer-production"](https://console.aws.amazon.com/lambda/home?region=us-east-1#/functions/AvroToJsonTransformer-production?tab=configuration)). In essence, the code does the following:
- Decodes the incoming batch of records using the corresponding Avro schema, which is determined based on the name of the incoming Kinesis stream
- Converts said records into a hash with `recordId`, `result: 'Ok'`, and `data` containing a JSON or CSV serialization of the record, which is also base64 encoded
- Returns processed records in this format: `{ records: [ { recordId: '[record id]', result: 'Ok', data: 'eyJmb28iOiJiYXIifQ....' }, ... ] }`
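
The response shape described in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the repository's code: `build_response` and its list of `(record_id, payload)` pairs are hypothetical names, Avro decoding is assumed to have already happened upstream, and only the JSON output path is shown.

```python
import base64
import json


def build_response(decoded_records):
    """Wrap already-decoded records in the response shape described above.

    `decoded_records` is a hypothetical list of (record_id, dict) pairs;
    Avro decoding is assumed to have happened before this step.
    """
    records = []
    for record_id, payload in decoded_records:
        # Serialize compactly to JSON, then base64 encode for the `data` field
        json_bytes = json.dumps(payload, separators=(",", ":")).encode("utf-8")
        records.append(
            {
                "recordId": record_id,
                "result": "Ok",
                "data": base64.b64encode(json_bytes).decode("utf-8"),
            }
        )
    return {"records": records}


response = build_response([("record-1", {"foo": "bar"})])
print(response["records"][0]["data"])  # prints eyJmb28iOiJiYXIifQ==
```

Decoding the `data` value with `base64.b64decode` and `json.loads` recovers the original record, which is a quick way to check output against the expected results in the sample folder.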

## Running Locally

Use the [sam cli](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) to run the lambda on arbitrary firehose events. To process a firehose event containing 3 CircTrans records and print out the result:
Use the [sam cli](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) to run the Lambda on arbitrary Firehose events. To process a Firehose event containing 3 CircTrans records and print out the result:

```
sam local invoke --profile nypl-digital-dev -t sam.qa.yml -e sample/firehose-CircTrans-3-records-encoded.json
sam local invoke --profile nypl-digital-dev -t config/sam.qa.yml -e sample/firehose-CircTrans-3-records-encoded.json
```

## Contributing
The [sample](./sample) folder contains sample Firehose events and their expected outcomes after Lambda event handling, so you can test the efficacy of your code with various schemas.

With Python, you also have the option of using the [python-lambda-local](https://pypi.org/project/python-lambda-local/) package for local development!

## Contributing / Deployment

This repo uses the ["PRs Target Main, Merge to Deployment Branches" git workflow](https://github.com/NYPL/engineering-general/blob/main/standards/git-workflow.md#prs-target-main-merge-to-deployment-branches):
- Cut PRs from `main`
- Merge `main` > `qa`
- Merge `main` > `production`

## Deployment

This app is deployed via Travis-CI using terraform. Code in `qa` is pushed to AvroToJsonTransformer-qa. Code in `production` is pushed to AvroToJsonTransformer-production.

## Tests
This app is deployed via Travis-CI using Terraform. Code in `qa` is pushed to ["AvroToJsonTransformer-qa"](https://console.aws.amazon.com/lambda/home?region=us-east-1#/functions/AvroToJsonTransformer-qa?tab=configuration). Code in `production` is pushed to ["AvroToJsonTransformer-production"](https://console.aws.amazon.com/lambda/home?region=us-east-1#/functions/AvroToJsonTransformer-production?tab=configuration).

To run all tests found in `./test/`:

```console
npm run test
## Test Coverage
Use the Python [coverage package](https://coverage.readthedocs.io/en/7.6.0/) to measure test coverage:
```

To run a specific test for the given filename:

```console
npm run test [filename].test.js
coverage run -m pytest
```

### Test Coverage

This repo uses c8 to compute test coverage (because [Istanbul](https://github.com/istanbuljs/nyc) doesn't appear to support ESM at writing). Coverage reports are included at the end of `npm test`. For a detailed line-by-line breakdown, view the HTML report:

```console
npm run coverage-report
open coverage/index.html
To see exactly which lines are missing test coverage:
```
coverage report -m
```

### Linting

This codebase uses [Standard JS](https://www.npmjs.com/package/standard) as the JavaScript linter.
## Linting

To check for linting errors:
This codebase uses [Black](https://github.com/psf/black) as the Python linter.

```console
npm run lint
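To check the formatting of the codebase as a whole (`make lint` runs Black with `--check`, so it reports problems without rewriting files):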
```
make lint
```
6 changes: 6 additions & 0 deletions config/devel.yaml
@@ -0,0 +1,6 @@
---
PLAINTEXT_VARIABLES:
  ENVIRONMENT: devel
  NYPL_DATA_API_BASE_URL: https://qa-platform.nypl.org/api/v0.1/
  SCHEMA_PATH: current-schemas/
...
4 changes: 0 additions & 4 deletions config/production.env

This file was deleted.

6 changes: 6 additions & 0 deletions config/production.yaml
@@ -0,0 +1,6 @@
---
PLAINTEXT_VARIABLES:
  ENVIRONMENT: production
  NYPL_DATA_API_BASE_URL: https://platform.nypl.org/api/v0.1/
  SCHEMA_PATH: current-schemas/
...
4 changes: 0 additions & 4 deletions config/qa.env

This file was deleted.

6 changes: 6 additions & 0 deletions config/qa.yaml
@@ -0,0 +1,6 @@
---
PLAINTEXT_VARIABLES:
  ENVIRONMENT: qa
  NYPL_DATA_API_BASE_URL: https://qa-platform.nypl.org/api/v0.1/
  SCHEMA_PATH: current-schemas/
...
5 changes: 3 additions & 2 deletions sam.qa.yml → config/sam.qa.yml
Contributor:
Is this file used by the python lambda package you mentioned? Also remove SCHEMA_NAME var below

Contributor Author:
It can be used for sam local or the python lambda local usage! This is mainly a holdover from the NodeJS implementation, since we have plenty of sample events (in the sample folder) to use and verify when running local changes.

Contributor:
So does sam local or the python lambda package actually run the QA Lambda function or does it just mimic it? Like when you run it could I go on AWS and see the logs from the run?

Contributor Author:
sam local mimics the Lambda function locally and stores the logs locally. So this will not be visible on AWS. When you run the command, the output shows up on the terminal.
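
To illustrate the point about local runs: a handler can also be called directly from Python with a hand-built event, in which case all output stays in the terminal and nothing is sent to AWS. The sketch below rests on assumptions. `fake_handler` is a hypothetical stand-in for the real `lambda_function.lambda_handler`, `make_firehose_event` is an illustrative helper, and the event contains only the `records`, `recordId`, and `data` fields rather than everything a real Firehose event or the files in the sample folder would carry.

```python
import base64
import json


def make_firehose_event(payloads):
    """Build a minimal Firehose-style event from raw byte payloads (illustrative only)."""
    return {
        "records": [
            {
                "recordId": f"record-{i}",
                "data": base64.b64encode(payload).decode("utf-8"),
            }
            for i, payload in enumerate(payloads)
        ]
    }


def fake_handler(event, context):
    """Hypothetical stand-in for the real handler: echoes each record back as 'Ok'."""
    return {
        "records": [
            {"recordId": r["recordId"], "result": "Ok", "data": r["data"]}
            for r in event["records"]
        ]
    }


event = make_firehose_event([b"first", b"second"])
result = fake_handler(event, None)
# Output goes to the local terminal only; no logs are written on AWS
print(json.dumps(result, indent=2))
```

Swapping `fake_handler` for the imported real handler would exercise the actual decoding logic, provided the event carries whatever fields that handler reads.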

@@ -6,8 +6,9 @@ Resources:
  AvroToJsonTransformer:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs14.x
      CodeUri: .
      Handler: lambda_function.lambda_handler
      Runtime: python3.9
      Timeout: 10
      Environment:
        Variables:
1 change: 0 additions & 1 deletion context.json

This file was deleted.

10 changes: 10 additions & 0 deletions deployment_script.sh
@@ -0,0 +1,10 @@
#!/bin/zsh

rm -f -r ./package
rm -f deployment-package.zip
pip3.9 install --target ./package -r requirements.txt
cd package
zip -r ../deployment-package.zip .
cd ..
zip deployment-package.zip lambda_function.py
zip deployment-package.zip record_processor.py
1 change: 0 additions & 1 deletion event_sources.json

This file was deleted.

120 changes: 0 additions & 120 deletions index.js

This file was deleted.
