Merge branch '2693-cat2-messaging-cleanup' into fix/cat3-error-cleanup
jtimpe committed Jun 27, 2024
2 parents 3b0a1b8 + b95fb9f commit 94e93f5
Showing 55 changed files with 2,877 additions and 37,914 deletions.
9 changes: 4 additions & 5 deletions docs/How-We-Work/Team-Composition.md
@@ -8,14 +8,13 @@ Please refer to the [Team Members doc](https://hhsgov.sharepoint.com/:w:/r/sites
* Alexandra Pennington, OFA, tech lead

**Raft**
* Connor Smith, Raft, facilitator/product manager
* Miles Reiter, Raft, design lead + senior ux/ui researcher and designer
* Diana Liang, Raft, ux/ui researcher and designer
* Rob Gendron, Raft, facilitator/product manager
* Victoria Amoroso, Raft, design lead + senior ux/ui researcher and designer
* Miles Reiter, Raft, senior ux/ui researcher and designer
* Andrew Jameson, Raft, tech lead
* Cameron Smart, Raft, full stack engineer
* Jan Timpe, Raft, full stack engineer
* Mo Sohani, Raft, full stack engineer
* George Hudson, Raft, devops engineer
* Eric Lipe, Raft, full stack engineer

## Subject Matter Experts
**OFA Data Team**
26 changes: 0 additions & 26 deletions docs/Security-Compliance/File-Transfer-TDRS/README.md

This file was deleted.

1 change: 0 additions & 1 deletion docs/Security-Compliance/File-Transfer-TDRS/diagram.drawio

This file was deleted.

378 changes: 378 additions & 0 deletions docs/Technical-Documentation/diagrams/parsing.drawio

Large diffs are not rendered by default.

11 changes: 11 additions & 0 deletions docs/Technical-Documentation/parsing-flow.md
@@ -0,0 +1,11 @@
# High Level Parsing Flow

Parsing begins after a user submits one or more datafiles via the frontend. Each submission generates a new Celery task
or tasks which are enqueued to Redis. As work becomes available, the Celery workers dequeue a task from Redis and begin
working on it. The parsing task fetches the `DataFile` Django model and begins iterating over each line in the file. For
each line in the file this task: parses the line into a new record, performs category 1 - 3 validation on the record,
performs exact and partial duplicate detection, performs category 4 validation, and stores the record in a cache to be
bulk created/serialized to the database and Elasticsearch. The image below provides a high-level flow of the
aforementioned steps.

![Parsing Flow](./diagrams/parsing.png)
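
For readers who want the flow in code, below is a minimal sketch of the parsing task described above. It is illustrative only: `DataFile` is the real Django model (see `tdpservice.data_files.models`), but the helper functions (`parse_line`, `run_category_1_to_3_validators`, `detect_duplicates`, `run_category_4_validators`, `bulk_persist`) are hypothetical stand-ins for the actual parser internals.

```python
# Illustrative sketch only: the helper functions are hypothetical stand-ins,
# not the real tdpservice parser internals.
from celery import shared_task
from tdpservice.data_files.models import DataFile


@shared_task
def parse(data_file_id):
    """Walk every line of a submitted datafile, validating as we go."""
    data_file = DataFile.objects.get(id=data_file_id)
    record_cache = []
    for line in data_file.file:                 # iterate the raw lines of the upload
        record = parse_line(line)               # parse the line into a new record
        run_category_1_to_3_validators(record)  # category 1 - 3 validation
        detect_duplicates(record)               # exact + partial duplicate detection
        run_category_4_validators(record)       # category 4 (case consistency) validation
        record_cache.append(record)
    bulk_persist(record_cache)                  # bulk create in the database and index in Elasticsearch
```

Enqueuing mirrors the call made in `data_files/views.py` after a successful virus scan: `parser_task.parse.delay(data_file_id)`.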
56 changes: 0 additions & 56 deletions docs/Technical-Documentation/secret-key-rotation-steps.md
@@ -6,7 +6,6 @@ To maintain good security, we will periodically rotate the following secret keys
- CF deployer keys (_for continuous delivery_)
- JWT keys (_external user auth_)
- ACF AMS keys (_internal user auth_)
- ACF Titan server keys (_for file transfers between TDP and TDRS_)
- Django secret keys ([_cryptographic signing_](https://docs.djangoproject.com/en/4.0/topics/signing/#module-django.core.signing))

This document outlines the process for doing this for each set of keys.
@@ -154,61 +153,6 @@ Service request tickets must be submitted by Government-authorized personnel wi
2. Update environment variables in CircleCI and relevant cloud.gov backend applications after the ticket is completed by OCIO. [Restage applications](https://cloud.gov/docs/deployment/app-maintenance/#restaging-your-app).
</details>

**<details><summary>ACF Titan Server Keys</summary>**
The ACF OCIO Ops team manages these credentials for all environments (dev, staging, and prod), so we will need to submit a service request ticket whenever we need keys rotated.

Service request tickets must be submitted by Government-authorized personnel with Government computers and PIV access (e.g. Raft tech lead for lower environments and TDP sys admins for the production environment). Please follow the procedures below:

1. Generate new public/private key pair

Below is an example of how to generate a new titan public/private key pair from _Git BASH for Windows_. Two files called `filename_where_newtitan_keypair_saved` are created: one is the _private_ key and the other is the _public_ key (the latter is saved with a _.pub_ extension).
(Note: the info below is not associated with any real keys.)

```
$ ssh-keygen -t rsa -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/c/Users/username/.ssh/id_rsa): filename_where_newtitan_keypair_saved
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in filename_where_newtitan_keypair_saved
Your public key has been saved in filename_where_newtitan_keypair_saved.pub
The key fingerprint is:
SHA256:BY6Nl0hCjIrI9yZMBGH2vbDFLCTq2DsFQXQTmLydwjI
The key's randomart image is:
+---[RSA 4096]----+
| X*B*.. . |
|+ O+=+ * o |
|=oo* *+ = . |
|Eo++B .. . |
|.+=oo. S |
| = o |
| o o |
| . |
| |
+----[SHA256]-----+
```

2. Submit request tickets from a government-issued email address and use the email template located on **page 2** of [this document](https://hhsgov.sharepoint.com/:w:/r/sites/TANFDataPortalOFA/Shared%20Documents/compliance/Authentication%20%26%20Authorization/ACF%20AMS%20docs/OCIO%20OPERATIONS%20REQUEST%20TEMPLATES.docx?d=w5332585c1ecf49a4aeda17674f687154&csf=1&web=1&e=aQyIPz). Cc the OFA tech lead on lower environment requests.

The request should include:
- the titan service account name (i.e. `tanfdp` for prod; `tanfdpdev` for dev/staging)
- the newly generated public key from `filename_where_newtitan_keypair_saved.pub`

3. When OCIO confirms that the change has been made, add the private key from `filename_where_newtitan_keypair_saved` to CircleCI as an environment variable. The variable name is `ACFTITAN_KEY`. **Please note**: the value must be edited before adding it to CircleCI. It should be a one-line string with underscores ("_") replacing the newline at the end of every line. See the example below:

```
-----BEGIN OPENSSH PRIVATE KEY-----_somehashvalue_-----END OPENSSH PRIVATE KEY-----
```
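
If helpful, here is a small, hypothetical Python one-off for producing that one-line form from the key file generated in step 1 (any `tr`/`sed` equivalent works just as well):

```python
# Hypothetical convenience snippet: flatten the multi-line private key into
# the one-line, underscore-delimited form shown above before pasting it into CircleCI.
with open("filename_where_newtitan_keypair_saved") as key_file:
    print(key_file.read().strip().replace("\n", "_"))
```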

4. Re-run the deployment workflow from CircleCI and confirm that the updated key value pair has been added to the relevant cloud.gov backend application.
</details>

**<details><summary>Django secret keys</summary>**

3 changes: 0 additions & 3 deletions scripts/deploy-backend.sh
@@ -42,9 +42,6 @@ echo backend_app_name: "$backend_app_name"
set_cf_envs()
{
var_list=(
"ACFTITAN_HOST"
"ACFTITAN_KEY"
"ACFTITAN_USERNAME"
"AMS_CLIENT_ID"
"AMS_CLIENT_SECRET"
"AMS_CONFIGURATION_ENDPOINT"
3 changes: 0 additions & 3 deletions tdrs-backend/.env.example
@@ -86,6 +86,3 @@ ELASTIC_HOST=elastic:9200

# testing
CYPRESS_TOKEN=local-cypress-token

# sftp
ACFTITAN_SFTP_PYTEST=local-acftitan-key
2 changes: 0 additions & 2 deletions tdrs-backend/Pipfile
@@ -51,8 +51,6 @@ celery = "==5.2.6"
redis = "==4.1.2"
flower = "==1.1.0"
django-celery-beat = "==2.2.1"
paramiko = "==2.11.0"
pytest_sftpserver = "==1.3.0"
elasticsearch = "==7.13.4" # REQUIRED - v7.14.0 introduces breaking changes
django-elasticsearch-dsl = "==7.3"
django-elasticsearch-dsl-drf = "==0.22.5"
84 changes: 9 additions & 75 deletions tdrs-backend/Pipfile.lock

Some generated files are not rendered by default.

4 changes: 0 additions & 4 deletions tdrs-backend/docker-compose.local.yml
@@ -68,12 +68,8 @@ services:
- AMS_CLIENT_ID
- AMS_CLIENT_SECRET
- AMS_CONFIGURATION_ENDPOINT
- ACFTITAN_HOST
- ACFTITAN_KEY
- ACFTITAN_USERNAME
- REDIS_URI=redis://redis-server:6379
- REDIS_SERVER_LOCAL=TRUE
- ACFTITAN_SFTP_PYTEST
- SENDGRID_API_KEY
volumes:
- .:/tdpapp
4 changes: 0 additions & 4 deletions tdrs-backend/docker-compose.yml
@@ -91,12 +91,8 @@ services:
- AMS_CLIENT_ID
- AMS_CLIENT_SECRET
- AMS_CONFIGURATION_ENDPOINT
- ACFTITAN_HOST
- ACFTITAN_KEY
- ACFTITAN_USERNAME
- REDIS_URI=redis://redis-server:6379
- REDIS_SERVER_LOCAL=TRUE
- ACFTITAN_SFTP_PYTEST
- CYPRESS_TOKEN
- DJANGO_DEBUG
- SENDGRID_API_KEY
12 changes: 1 addition & 11 deletions tdrs-backend/tdpservice/data_files/views.py
@@ -18,7 +18,7 @@
from tdpservice.data_files.util import get_xls_serialized_file
from tdpservice.data_files.models import DataFile, get_s3_upload_path
from tdpservice.users.permissions import DataFilePermissions, IsApprovedPermission
from tdpservice.scheduling import sftp_task, parser_task
from tdpservice.scheduling import parser_task
from tdpservice.data_files.s3_client import S3Client
from tdpservice.parsers.models import ParserError
from tdpservice.parsers.serializers import ParsingErrorSerializer
@@ -59,7 +59,6 @@ def create(self, request, *args, **kwargs):

# only if the file passed the virus scan and was created successfully will we perform side-effects:
# * Send to parsing
# * Upload to ACF-TITAN
# * Send email to user

logger.debug(f"{self.__class__.__name__}: status: {response.status_code}")
@@ -74,15 +73,6 @@
parser_task.parse.delay(data_file_id)
logger.info("Submitted parse task to queue for datafile %s.", data_file_id)

sftp_task.upload.delay(
data_file_pk=data_file_id,
server_address=settings.ACFTITAN_SERVER_ADDRESS,
local_key=settings.ACFTITAN_LOCAL_KEY,
username=settings.ACFTITAN_USERNAME,
port=22
)
logger.info("Submitted upload task to redis for datafile %s.", data_file_id)

app_name = settings.APP_NAME + '/'
key = app_name + get_s3_upload_path(data_file, '')
version_id = self.get_s3_versioning_id(response.data.get('original_filename'), key)
18 changes: 11 additions & 7 deletions tdrs-backend/tdpservice/parsers/aggregates.py
@@ -1,9 +1,10 @@
"""Aggregate methods for the parsers."""
from .row_schema import SchemaManager
from .models import ParserError
from .models import ParserError, ParserErrorCategoryChoices
from .util import month_to_int, \
transform_to_months, fiscal_to_calendar, get_prog_from_section
from .schema_defs.utils import get_program_models, get_text_from_df
from django.db.models import Q as Query


def case_aggregates_by_month(df, dfs_status):
@@ -39,22 +40,25 @@ def case_aggregates_by_month(df, dfs_status):
if isinstance(schema_model, SchemaManager):
schema_model = schema_model.schemas[0]

curr_case_numbers = set(schema_model.document.Django.model.objects.filter(datafile=df)
.filter(RPT_MONTH_YEAR=rpt_month_year)
curr_case_numbers = set(schema_model.document.Django.model.objects.filter(datafile=df,
RPT_MONTH_YEAR=rpt_month_year)
.distinct("CASE_NUMBER").values_list("CASE_NUMBER", flat=True))
case_numbers = case_numbers.union(curr_case_numbers)

total += len(case_numbers)
cases_with_errors += ParserError.objects.filter(file=df).filter(
case_number__in=case_numbers).distinct('case_number').count()
cases_with_errors += ParserError.objects.filter(file=df, case_number__in=case_numbers)\
.distinct('case_number').count()
accepted = total - cases_with_errors

aggregate_data['months'].append({"month": month,
"accepted_without_errors": accepted,
"accepted_with_errors": cases_with_errors})

aggregate_data['rejected'] = ParserError.objects.filter(file=df).filter(case_number=None).distinct("row_number")\
.exclude(row_number=0).count()
error_type_query = Query(error_type=ParserErrorCategoryChoices.PRE_CHECK) | \
Query(error_type=ParserErrorCategoryChoices.CASE_CONSISTENCY)

aggregate_data['rejected'] = ParserError.objects.filter(error_type_query, file=df)\
.distinct("row_number").exclude(row_number=0).count()

return aggregate_data
