Merge branch 'develop' into 2411-metadata-parsed-datafiles
raftmsohani authored Sep 20, 2023
2 parents af49261 + 4d5ed4b commit 9a85ac5
Showing 18 changed files with 278 additions and 33 deletions.
2 changes: 1 addition & 1 deletion .circleci/base_config.yml
@@ -1,7 +1,7 @@
version: 2.1

orbs:
node: circleci/node@4.7.0
node: circleci/node@5.1.0
terraform: circleci/[email protected]
jq: circleci/[email protected]

51 changes: 51 additions & 0 deletions docs/Sprint-Review/sprint-80-summary.md
@@ -0,0 +1,51 @@
# Sprint 80 Summary

08/16/23 - 08/29/23

Velocity: Dev (20)

## Sprint Goal
* Continue parsing engine development for TANF Sections (01-04), complete decoupling backend application spike and continue integration test epic (2282).
* UX to continue regional staff research, service design blueprint (.1 and .2) and bridge onboarding to >85% of total users.
* DevOps to investigate nightlyscan issues and resolve utility images for CircleCI and container registry.


## Tickets
### Completed/Merged
* [#2369 As tech lead, we need the parsing engine to run quality checks across TANF section 1](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2369)
* [#1110 TANF (03) Parsing and Validation](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1110)
* [#2282 As tech lead, I want a file upload integration test](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2282)
* [#1784 - Email Relay](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1784)

### Ready to Merge
* N/A

### Submitted (QASP Review, OCIO Review)
* [#1109 TANF (02) Parsing and Validation](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1109)

### Closed (not merged)
* N/A

## Moved to Next Sprint (Blocked, Raft Review, In Progress)
### In Progress
* [#2116 Container Registry Creation](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2116)
* [#2429 Singular ClamAV scanner](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2429)
* [#1111 TANF (04) Parsing and Validation](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1111)


### Blocked
* N/A


### Raft Review
* [#1610 As a user, I need information about the acceptance of my data and a link for the error report](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1610)
* [#1612 Detailed case level metadata](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1612)
* [#1613 As a developer, I need parsed file meta data (TANF Section 1)](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/board)
* [#2626 (Spike) improve parsing logging](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2626)

### Demo
* Internal:
* 2369 / 1110 - TANF Sections (01 and 03) Parsing and Validation
* External:
* 2369 / 1110 - TANF Sections (01 and 03) Parsing and Validation

36 changes: 36 additions & 0 deletions docs/Technical-Documentation/Zap-Scan-HTML-Report.md
@@ -24,3 +24,39 @@ link to view the running process at CircleCI
4. Click the `owasp_report.html` link to view the report.

![image](images/report.png)

### Configuring Report Output

We use separate files to configure the ZAP scanner for the frontend and backend applications:
Backend: [tdrs-backend/reports/zap.conf](../../tdrs-backend/reports/zap.conf)
Frontend: [tdrs-frontend/reports/zap.conf](../../tdrs-frontend/reports/zap.conf)

Each file lists ZAP rule IDs and the action to take when a rule triggers (`FAIL`, `WARN`, or `IGNORE`).
Some rules are set to `IGNORE` because they do not apply to our configuration and were
producing false positives that failed the scan. Each ignored rule should have a comment
explaining why it is ignored.
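
The comment convention can be checked mechanically. The sketch below is a hypothetical helper, not something in the repository. It assumes each rule line has the whitespace-separated form `<id> <ACTION> (<description>)` and that the explanation is one or more `#`-prefixed lines directly above the rule.

```python
def uncommented_ignores(conf_text):
    """Return the IDs of IGNORE rules that have no comment line directly above them."""
    missing = []
    previous_was_comment = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line:
            previous_was_comment = False
            continue
        if line.startswith("#"):
            previous_was_comment = True
            continue
        # Assumed layout: rule id, action, then the free-text description.
        parts = line.split(None, 2)
        if len(parts) >= 2 and parts[1] == "IGNORE" and not previous_was_comment:
            missing.append(parts[0])
        previous_was_comment = False
    return missing


# Made-up excerpt in the same shape as zap.conf.
sample = """\
##### IGNORE because the 403 page is treated as a successful injection.
40018 IGNORE (SQL Injection - Active/release)
40019 FAIL (SQL Injection - MySQL - Active/beta)
40035 IGNORE (Hidden File Finder - Active/beta)
"""
print(uncommented_ignores(sample))  # ['40035']
```

A check like this could run in CI next to the scan itself, though whether that is worth the extra step is a team decision.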

Before ignoring a rule, use Postman to replicate the test parameters and confirm that the result is a false positive.
The [free version of Postman](https://www.postman.com/downloads/), either the desktop app or the web version, is sufficient for this.
Examples:
![image](images/postman_example1.png)
![image](images/postman_example2.png)

### Invoking the OWASP Zap Scanner

The [zap-scanner](../../scripts/zap-scanner.sh) script builds the command that invokes the ZAP scanner.

As part of that, the script passes additional configuration, including a list of URLs to exclude from the
scan.
`ZAP_CLI_OPTIONS` contains this list.
Do not add an exclusion that matches the frontend or backend endpoint the scanner is supposed to
test.

For example, do not include something like this in the `-config globalexcludeurl.url_list.url` configuration options:
```
-config globalexcludeurl.url_list.url\(3\).regex='^https?://.*\.hhs.gov\/.*$' \
-config globalexcludeurl.url_list.url\(3\).description='Site - acf.hhs.gov' \
-config globalexcludeurl.url_list.url\(3\).enabled=true \
```

With an exclusion like this, the scanner cannot find the endpoint under test, and its output does not make clear what went wrong.
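
To see why, it can help to test a candidate exclusion pattern against the scan target before adding it. This is a rough sketch using Python's `re` module; ZAP evaluates these patterns with its own regex engine, so treat it as an approximation. The target URLs are hypothetical.

```python
import re

# The exclusion regex from the example above, with the shell quoting removed.
EXCLUDE_PATTERN = re.compile(r"^https?://.*\.hhs.gov\/.*$")


def is_excluded(url):
    """Return True if the exclusion rule would make the scanner skip this URL."""
    return EXCLUDE_PATTERN.match(url) is not None


# Hypothetical scan target under hhs.gov: the rule excludes it, so nothing is scanned.
print(is_excluded("https://tdp-frontend.example.acf.hhs.gov/"))  # True
# A URL outside hhs.gov is unaffected by the rule.
print(is_excluded("https://www.example.com/"))  # False
```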
1 change: 1 addition & 0 deletions tdrs-backend/docker-compose.yml
@@ -100,6 +100,7 @@ services:
./gunicorn_start.sh && celery -A tdpservice.settings worker -l info"
ports:
- "5555:5555"
tty: true
depends_on:
- clamav-rest
- localstack
2 changes: 1 addition & 1 deletion tdrs-backend/gunicorn_start.sh
@@ -14,7 +14,7 @@ fi

#
echo "Applying database migrations"
#python manage.py migrate
python manage.py migrate
#python manage.py populate_stts
#python manage.py collectstatic --noinput

11 changes: 9 additions & 2 deletions tdrs-backend/reports/zap.conf
@@ -79,7 +79,11 @@
40014 FAIL (Cross Site Scripting (Persistent) - Active/release)
40016 FAIL (Cross Site Scripting (Persistent) - Prime - Active/release)
40017 FAIL (Cross Site Scripting (Persistent) - Spider - Active/release)
40018 WARN (SQL Injection - Active/release)
##### IGNORE (SQL Injection - Active/release) as it doesn't apply to us and is giving
##### false positives because it takes us to a default django page notifying us
##### of the 403 forbidden, instead of just a 403 being returned. The test is
##### treating this as though the SQL injection worked, since a page is returned.
40018 IGNORE (SQL Injection - Active/release)
40019 FAIL (SQL Injection - MySQL - Active/beta)
40020 FAIL (SQL Injection - Hypersonic SQL - Active/beta)
40021 FAIL (SQL Injection - Oracle - Active/beta)
@@ -93,7 +97,10 @@
40029 FAIL (Trace.axd Information Leak - Active/beta)
40032 FAIL (.htaccess Information Leak - Active/release)
40034 FAIL (.env Information Leak - Active/beta)
40035 FAIL (Hidden File Finder - Active/beta)
##### IGNORE (Hidden File Finder - Active/beta) due to false failing similar to SQL
##### Injection false positive above. Replicating parameters of the test
##### result in
40035 IGNORE (Hidden File Finder - Active/beta)
41 FAIL (Source Code Disclosure - Git - Active/beta)
42 FAIL (Source Code Disclosure - SVN - Active/beta)
43 FAIL (Source Code Disclosure - File Inclusion - Active/beta)
38 changes: 38 additions & 0 deletions tdrs-backend/tdpservice/core/logger.py
@@ -0,0 +1,38 @@
"""Contains core logging functionality for TDP."""

import logging

class ColorFormatter(logging.Formatter):
    """Simple formatter class to add color to log messages based on log level."""

    BLACK = '\033[0;30m'
    RED = '\033[0;31m'
    GREEN = '\033[0;32m'
    BROWN = '\033[0;33m'
    BLUE = '\033[0;34m'
    PURPLE = '\033[0;35m'
    CYAN = '\033[0;36m'
    GREY = '\033[0;37m'

    DARK_GREY = '\033[1;30m'
    LIGHT_RED = '\033[1;31m'
    LIGHT_GREEN = '\033[1;32m'
    YELLOW = '\033[1;33m'
    LIGHT_BLUE = '\033[1;34m'
    LIGHT_PURPLE = '\033[1;35m'
    LIGHT_CYAN = '\033[1;36m'
    WHITE = '\033[1;37m'

    RESET = "\033[0m"

    def __init__(self, *args, **kwargs):
        self._colors = {logging.DEBUG: self.CYAN,
                        logging.INFO: self.GREEN,
                        logging.WARNING: self.YELLOW,
                        logging.ERROR: self.LIGHT_RED,
                        logging.CRITICAL: self.RED}
        super(ColorFormatter, self).__init__(*args, **kwargs)

    def format(self, record):
        """Format the record to be colored based on the log level."""
        return self._colors.get(record.levelno, self.WHITE) + super().format(record) + self.RESET
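
The commit does not show how the formatter is attached to a handler. As a minimal sketch, a formatter of this kind could be wired up as follows. The `ColorFormatter` here is a trimmed stand-in with only two colors so the block runs on its own, and the logger name `color_demo` is made up.

```python
import io
import logging


class ColorFormatter(logging.Formatter):
    """Trimmed stand-in for the formatter in this commit (same idea, fewer colors)."""

    YELLOW = '\033[1;33m'
    WHITE = '\033[1;37m'
    RESET = '\033[0m'

    def __init__(self, *args, **kwargs):
        self._colors = {logging.WARNING: self.YELLOW}
        super().__init__(*args, **kwargs)

    def format(self, record):
        # Wrap the normally formatted message in a color code and a reset code.
        return self._colors.get(record.levelno, self.WHITE) + super().format(record) + self.RESET


stream = io.StringIO()  # stands in for the console so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(ColorFormatter("%(levelname)s %(message)s"))

demo_logger = logging.getLogger("color_demo")  # hypothetical logger name
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False
demo_logger.addHandler(handler)

demo_logger.warning("disk almost full")
print(repr(stream.getvalue()))  # '\x1b[1;33mWARNING disk almost full\x1b[0m\n'
```

In a Django project the formatter would more likely be referenced from the `LOGGING` setting. That wiring is not part of this diff, so it is not shown here.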
9 changes: 9 additions & 0 deletions tdrs-backend/tdpservice/data_files/models.py
@@ -235,6 +235,15 @@ def find_latest_version(self, year, quarter, section, stt):
version=version, year=year, quarter=quarter, section=section, stt=stt,
).first()

def __repr__(self):
"""Return a string representation of the model."""
return f"{{id: {self.id}, filename: {self.original_filename}, STT: {self.stt}, S3 location: " + \
f"{self.s3_location}}}"

def __str__(self):
"""Return a string representation of the model."""
return f"filename: {self.original_filename}"

class LegacyFileTransferManager(models.Manager):
"""Extends object manager functionality for LegacyFileTransfer model."""

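
The two methods added to the data file model serve different purposes in log output: `__str__` is used by default in f-strings and `%s`, while `__repr__` is used with `!r` or `%r`. The following sketch uses a hypothetical `FakeDataFile` class and made-up values in place of the Django model.

```python
class FakeDataFile:
    """Hypothetical stand-in for the DataFile model, with no database behind it."""

    def __init__(self, file_id, original_filename, stt, s3_location):
        self.id = file_id
        self.original_filename = original_filename
        self.stt = stt
        self.s3_location = s3_location

    def __repr__(self):
        # Detailed form, mirroring the __repr__ added in this commit.
        return (f"{{id: {self.id}, filename: {self.original_filename}, STT: {self.stt}, "
                f"S3 location: {self.s3_location}}}")

    def __str__(self):
        # Short form, mirroring the __str__ added in this commit.
        return f"filename: {self.original_filename}"


data_file = FakeDataFile(7, "section1.txt", "Alaska", "data_files/section1.txt")
print(f"{data_file}")    # filename: section1.txt
print(f"{data_file!r}")  # {id: 7, filename: section1.txt, STT: Alaska, S3 location: data_files/section1.txt}
```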
4 changes: 4 additions & 0 deletions tdrs-backend/tdpservice/data_files/views.py
@@ -66,6 +66,10 @@ def create(self, request, *args, **kwargs):
data_file_id = response.data.get('id')
data_file = DataFile.objects.get(id=data_file_id)

logger.info(f"Preparing parse task: User META -> user: {request.user}, stt: {data_file.stt}. " +
f"Datafile META -> datafile: {data_file_id}, section: {data_file.section}, " +
f"quarter {data_file.quarter}, year {data_file.year}.")

parser_task.parse.delay(data_file_id)
logger.info("Submitted parse task to queue for datafile %s.", data_file_id)
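
This hunk mixes two logging styles: an f-string for the first message and `%s` placeholders for the second. One practical difference is that placeholder arguments are only formatted if the record is actually emitted. The sketch below shows this with a made-up `Expensive` class and a hypothetical logger name. It is an illustration, not a recommendation drawn from the commit.

```python
import logging


class Expensive:
    """Counts how many times it is converted to a string."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive value"


logger = logging.getLogger("lazy_demo")  # hypothetical logger name
logger.addHandler(logging.NullHandler())
logger.propagate = False
logger.setLevel(logging.INFO)  # DEBUG records are dropped

value = Expensive()
logger.debug("lazy: %s", value)  # dropped before formatting, so __str__ is not called
logger.debug(f"eager: {value}")  # the f-string is built before debug() runs
print(value.calls)  # 1
```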

9 changes: 9 additions & 0 deletions tdrs-backend/tdpservice/parsers/fields.py
@@ -1,5 +1,9 @@
"""Datafile field representations."""

import logging

logger = logging.getLogger(__name__)

def value_is_empty(value, length):
"""Handle 'empty' values as field inputs."""
empty_values = [
@@ -36,6 +40,7 @@ def parse_value(self, line):
value = line[self.startIndex:self.endIndex]

if value_is_empty(value, self.endIndex-self.startIndex):
logger.debug(f"Field: '{self.name}' at position: [{self.startIndex}, {self.endIndex}) is empty.")
return None

match self.type:
@@ -44,9 +49,13 @@ def parse_value(self, line):
value = int(value)
return value
except ValueError:
logger.error(f"Error parsing field value: {value} to integer.")
return None
case 'string':
return value
case _:
logger.warning(f"Unknown field type: {self.type}.")
return None

class TransformField(Field):
"""Represents a field that requires some transformation before serializing."""
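
The parsing flow in `parse_value` (slice the line, treat empty values as `None`, then convert by type) can be tried in isolation. The following is a simplified, hypothetical version, not the repository's code. The body of `value_is_empty` is truncated in the diff, so the definition here is an assumption. The `match` statement is written as `if` branches so the sketch also runs on Python versions before 3.10.

```python
import logging

logger = logging.getLogger(__name__)


def value_is_empty(value, length):
    """Assumed notion of 'empty': missing, blank, or filled with '#'."""
    return value is None or value in ("", " " * length, "#" * length)


class Field:
    """Simplified, hypothetical fixed-width field definition."""

    def __init__(self, name, type, startIndex, endIndex):
        self.name = name
        self.type = type
        self.startIndex = startIndex
        self.endIndex = endIndex

    def parse_value(self, line):
        """Slice the field out of the line and convert it according to its type."""
        value = line[self.startIndex:self.endIndex]
        if value_is_empty(value, self.endIndex - self.startIndex):
            return None
        if self.type == "number":
            try:
                return int(value)
            except ValueError:
                logger.error("Error parsing field value: %s to integer.", value)
                return None
        if self.type == "string":
            return value
        logger.warning("Unknown field type: %s.", self.type)
        return None


# Made-up record: "T1" + "2023" + "01" + two blanks + "ABC".
line = "T1202301  ABC"
print(Field("record_type", "string", 0, 2).parse_value(line))  # T1
print(Field("year", "number", 2, 6).parse_value(line))         # 2023
print(Field("blank", "string", 8, 10).parse_value(line))       # None
print(Field("bad", "number", 10, 13).parse_value(line))        # None (error is logged)
```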
5 changes: 3 additions & 2 deletions tdrs-backend/tdpservice/parsers/models.py
@@ -57,11 +57,12 @@ def rpt_month_name(self):

def __repr__(self):
"""Return a string representation of the model."""
return f"ParserError {self.id} for file {self.file} and object key {self.object_id}"
return f"{{id: {self.id}, file: {self.file.id}, row: {self.row_number}, column: {self.column_number}, " + \
f"error message: {self.error_message}}}"

def __str__(self):
"""Return a string representation of the model."""
return f"ParserError {self.__dict__}"
return f"error_message: {self.error_message}"

def _get_error_message(self):
"""Return the error message."""