1613 - DataFileSummary w/ Case Aggregates (#2612)
* saving state real quick

* finishing merge with latest

* Missed old test script

* Added new test, more cleanup

* Updating unit tests in DFS, preparing for 1610

* Merging in Jan's 1610 code for parserError useful-ness

* Revert "Merging in Jan's 1610 code for parserError useful-ness"

This reverts commit c5796da.

* update to test to use dfs fixture

* saving state before new 1610 merge

* Resolving merge conflicts with 1610.

* Linting changes and comparing to 1610

* Some unit test linting but inherited 1610 issues

* Re-ordering job to run tests vs lint first.

* Updates to linting and unit tests.

* Fixing linting.

* Update tdrs-backend/setup.cfg

* updates per PR.

* Excluding trailers for rejection

* VSCode merge resolution is garbage.

* Fixing precheck for not implemented types

* Updating to error-handle not implemented schema types

* - Updated view to show latest datafiles
- Added admin filter to show newest or all datafile records
- Updated indices to allow easier elastic queries

* - Updated search indices to have parent FK

* - Fix lint errors

* - Updated submission tests
- Moved create_datafile to util

* - fix lint errors

* - removing frontend filtering

* - adding datafile to admin model

* Revert "- addding datafile to admin model"

This reverts commit 35a6f24.

* - Fixed issue where datafile FK wasn't populating
- Regenerated migration

* - Readding datafile back to admin view now that the error is resolved

* - adding datafile back

* Revert "- Readding datafile back to admin view now that the error is resolved"

This reverts commit 2807425.

* - Removed unnecessary fields
- Updated dependencies
- Updated filter

* - Updated document to include required fields

* - Fixed failing test

* add adminUrl to deployment cypress overrides

* Adding "beta" banners to relevant error report sections (#2522)

* Update views.py

* Update views.py

* Update SubmissionHistory.jsx

* Update SubmissionHistory.test.js

* Apply suggestions from code review

Co-authored-by: Miles Reiter <[email protected]>

* lint fixes

---------

Co-authored-by: Miles Reiter <[email protected]>
Co-authored-by: Alex P <[email protected]>
Co-authored-by: andrew-jameson <[email protected]>

* Create sprint-73-summary.md (#2565)

* hotfix for large file sizes (#2542)

* hotfix for large file sizes

* apply timeouts/req limits to dev

* filter identity pages from scan

* IGNORE sql injection

---------

Co-authored-by: Jan Timpe <[email protected]>
Co-authored-by: mo sohani <[email protected]>
Co-authored-by: Alex P <[email protected]>

* updating validation error language

* accidentally included coding challenge

* rm comments

* 2550 deactivation email link (#2557)

* - updated nginx buildpack

* - specifying different nginx version

* - Updating changelog

* - added script to update certain apps in cf
- added workflow for each environment in circi

* - fixed base config

* - fixing jobs

* - Updated based on feedback in OH

* - Updating defaults

* - Removing defaults

* - Fixing indent

* - Adding params to config

* test

* test

* - updating work dir

* - Adding checkout

* - adding cf check

* - logging into cf

* - update cf check to install required binary

* - removing unnecessary switch

* - Forcing plugin installation

* - test installing plugin from script also

* - Adding url to email

* - test code for sandbox

* - using my email

* Revert "Merge branch 'update-cf-os' into 2551-deactivation-email-link"

This reverts commit e963b9d, reversing
changes made to cc9cf81.

* Revert "- using my email"

This reverts commit cc9cf81.

* Revert "- test code for sandbox"

This reverts commit 0603774.

---------

Co-authored-by: Alex P <[email protected]>
Co-authored-by: Andrew <[email protected]>

* Update README.md (#2577)

Add ATO

Co-authored-by: Andrew <[email protected]>

* Create 2023, Spring - Testing CSV & Excel-based error reports.md

* Update README.md

* Updating deliverable links (#2584)

* User viewset not returning/duplicating users (#2573)

* - Fixed issue not allowing pagination to work locally with nginx
- Added ordering to user field to fix duplicates issue

* - fix lint error

* - Removing ID check since we cannot guarantee that the uuid that is generated per test run will be lexicographically consistent

---------

Co-authored-by: Alex P <[email protected]>
Co-authored-by: Andrew <[email protected]>

* Update cf os (#2523)

* - updated nginx buildpack

* - specifying different nginx version

* - Updating changelog

* - added script to update certain apps in cf
- added workflow for each environment in circi

* - fixed base config

* - fixing jobs

* - Updated based on feedback in OH

* - Updating defaults

* - Removing defaults

* - Fixing indent

* - Adding params to config

* test

* test

* - updating work dir

* - Adding checkout

* - adding cf check

* - logging into cf

* - update cf check to install required binary

* - removing unnecessary switch

* - Forcing plugin installation

* - test installing plugin from script also

* - Adding new dependencies

* - adding package

* - fixing broken install

* - fixing libs

* - using correct command

* - getting correct version of libc

* - trying to upgrade libs

* - testing

* - Updated README and script

* Revert "- Updated README and script"

This reverts commit 92697b3.

* - Removed unnecessary circi stuff
- Removed script
- Updated docs to callout updating secondary apps

* - Correct spelling error

---------

Co-authored-by: Andrew <[email protected]>

* Item Number Mismatch (#2578)

* - Updated schemas and models to reflect correct item numbers of fields

* - Revert migration

* - Updated header/trailer item numbers

* - Fixed item numbers off by one errors

---------

Co-authored-by: Andrew <[email protected]>

* pipeline filtering (#2538)

* pipeline changes that filter based on paths and branches. circle ci tracks specified branches in order to keep current functionality on HHS side.

* updated syntax to be in line with build-all.yml

* removed comma

* WIP build flow docs

* added Architecture Decision Record for the change to pipeline workflows

* corrected file type of doc to .md

---------

Co-authored-by: George Hudson <[email protected]>
Co-authored-by: Andrew <[email protected]>

* Hotfix Devops/2457 path filtering for documentation (#2597)

* pipeline changes that filter based on paths and branches. circle ci tracks specified branches in order to keep current functionality on HHS side.

* updated syntax to be in line with build-all.yml

* removed comma

* WIP build flow docs

* added Architecture Decision Record for the change to pipeline workflows

* corrected file type of doc to .md

* build and test all on PRs even for documentation

---------

Co-authored-by: George Hudson <[email protected]>

* Create sprint-74-summary.md (#2596)

Co-authored-by: Andrew <[email protected]>

* added URL filters (#2580)

* added URL filters

* allow github to trigger owasp and label deploys (#2601)

Co-authored-by: George Hudson <[email protected]>

---------

Co-authored-by: Andrew <[email protected]>
Co-authored-by: George Hudson <[email protected]>
Co-authored-by: George Hudson <[email protected]>

* Create sprint-75-summary.md (#2608)

* Create sprint-76-summary.md (#2609)

Co-authored-by: Andrew <[email protected]>

* - Resolved failing tests

* - Corrected merge thrash

* - Using randbits to generate pk to get around conflicting sequence pks

* Revert "- Using randbits to generate pk to get around confilcting sequence pks"

This reverts commit ac9b065.

* - Updating region in fixture instead of factory
- letting django handle transaction for test

* - Moved datafile reference to avoid confusion

* pushing up incomplete codebase

* Other unit tests now have passed w/ good error handling

* Working tests, need to get setup for case aggregates populating via DB

* - Updated queries
- Added helper function
- Need to merge in 2579 for queries to work

* minor improvement to month2int

* - Fixing most merge errors

* - Fixing functions

* - Updated queries based on generic relation

* - Updated queries to count by case number instead of record number

* - Added route
- Updated task to create dfs

* - updated tests to include dfs

* Cleaning up most comments that are no longer necessary and fixed lint issues.

* making minor updates, still broken tests.

* updating Pipfile.lock and rebuilding the image resolved test issues

* Reorganizing tests, still failing in test_parse.py

* deleted summary file, split into other test scripts.

* Fixed missing self reference.

* Linting fixes.

* Found reference failure in deployed env.

* Removing extra returns for missing record type.

* lint fix

* Addressed invocation of datafile for failing test

* lint update for whitespace

* Intermediary commit, broken test

* new assignments in util

* - updated rejected query to correctly count objs

* - Fixing most tests

* - Fixed user error. Swapped numbers by accident.

* - make region None to avoid PK collision

* - Fix lint errors

* - Updating to avoid warning

* vscode merge conflict resolution (#2623)

* auto-create the external network

* didn't stage commit properly

* checking diffs, matching 1613.2

* doesn't work in pipeline. must be cached local

* re-commenting in unit test

* lint failures fixed

---------

Co-authored-by: andrew-jameson <[email protected]>

* url change per me, want pipeline to run e2e

* Upgraded to querysets, fix PR comments, PE str

* missing : not caught locally

* Feat/1613 merge 2 (#2650)

* Create sprint-78-summary.md (#2645)

* Missing/unsaved parser_error for record_type

* removing redundant tests

* Hopefully resolved on unit tests and lint

---------

Co-authored-by: Smithh-Co <[email protected]>
Co-authored-by: andrew-jameson <[email protected]>

* icontains

* tests

* Changing dict structure per 1612.

* fixed tests and lint issues, parse is too complex

* schema_manager replaces schema check

* Saving state prior to merge-conflict.

* Adopting latest manager, removing old error style.

* Commented out t6 line during Office hours

* minor reference update

* Acclimating to schemaManager

* lint-fix isinstance

* syntax mistake with isinstance

* Apply suggestions from code review

* reverting search_index merge artifacts.

* adjusting for removing unused "get-schema()"

* whitespace lint

* Feedback from Jan

* Ensuring tests run/work.

* Ensure we have leading zero in rptmonthyear.

* Minor lint fix for exception logging

* resolving merge conflict problems

* fixing tests from merge conflicts.

* dumb lint fix

* reducing line length for lint

* Moving DFS migration into its own file to avoid conflicts.

---------

Co-authored-by: andrew-jameson <[email protected]>
Co-authored-by: elipe17 <[email protected]>
Co-authored-by: Jan Timpe <[email protected]>
Co-authored-by: Miles Reiter <[email protected]>
Co-authored-by: Alex P <[email protected]>
Co-authored-by: Smithh-Co <[email protected]>
Co-authored-by: mo sohani <[email protected]>
Co-authored-by: Eric Lipe <[email protected]>
Co-authored-by: Lauren Frohlich <[email protected]>
Co-authored-by: Miles Reiter <[email protected]>
Co-authored-by: George Hudson <[email protected]>
Co-authored-by: George Hudson <[email protected]>
Co-authored-by: raftmsohani <[email protected]>
14 people authored Sep 25, 2023
1 parent bf2cf12 commit 43cb33d
Showing 26 changed files with 659 additions and 224 deletions.
6 changes: 3 additions & 3 deletions .circleci/build-and-test/jobs.yml
@@ -5,14 +5,14 @@
- checkout
- docker-compose-check
- docker-compose-up-backend
- run:
name: Execute Python Linting Test
command: cd tdrs-backend; docker-compose run --rm web bash -c "flake8 ."
- run:
name: Run Unit Tests And Create Code Coverage Report
command: |
cd tdrs-backend;
docker-compose run --rm web bash -c "./wait_for_services.sh && pytest --cov-report=xml"
- run:
name: Execute Python Linting Test
command: cd tdrs-backend; docker-compose run --rm web bash -c "flake8 ."
- upload-codecov:
component: backend
coverage-report: ./tdrs-backend/coverage.xml
4 changes: 2 additions & 2 deletions .circleci/config.yml
@@ -82,5 +82,5 @@ workflows:
- develop
- main
- master
- /^release.*/
- /^release.*/

1 change: 0 additions & 1 deletion scripts/zap-scanner.sh
@@ -139,7 +139,6 @@ ZAP_CLI_OPTIONS="\
-config globalexcludeurl.url_list.url\(21\).regex='^https:\/\/.*\.identitysandbox.gov\/.*$' \
-config globalexcludeurl.url_list.url\(21\).description='Site - IdentitySandbox.gov' \
-config globalexcludeurl.url_list.url\(21\).enabled=true \
-config spider.postform=true"

# How long ZAP will crawl the app with the spider process
1 change: 1 addition & 0 deletions tdrs-backend/Pipfile.lock


6 changes: 3 additions & 3 deletions tdrs-backend/docker-compose.local.yml
@@ -80,7 +80,7 @@ services:
build: .
command: >
bash -c "./wait_for_services.sh &&
./gunicorn_start.sh &&
./gunicorn_start.sh &&
celery -A tdpservice.settings worker -l info"
ports:
- "5555:5555"
@@ -106,5 +106,5 @@ volumes:

networks:
default:
external:
name: external-net
name: external-net
external: true
2 changes: 1 addition & 1 deletion tdrs-backend/docker-compose.yml
@@ -124,5 +124,5 @@ volumes:

networks:
default:
external:
name: external-net
external: true
2 changes: 1 addition & 1 deletion tdrs-backend/tdpservice/data_files/test/factories.py
@@ -18,7 +18,7 @@ class Meta:
extension = "txt"
section = "Active Case Data"
quarter = "Q1"
year = "2020"
year = 2020
version = 1
user = factory.SubFactory(UserFactory)
stt = factory.SubFactory(STTFactory)
7 changes: 7 additions & 0 deletions tdrs-backend/tdpservice/parsers/admin.py
@@ -15,4 +15,11 @@ class ParserErrorAdmin(admin.ModelAdmin):
]


class DataFileSummaryAdmin(admin.ModelAdmin):
"""ModelAdmin class for DataFileSummary objects generated in parsing."""

list_display = ['status', 'case_aggregates', 'datafile']


admin.site.register(models.ParserError, ParserErrorAdmin)
admin.site.register(models.DataFileSummary, DataFileSummaryAdmin)
@@ -14,5 +14,5 @@ class Migration(migrations.Migration):
model_name='parsererror',
name='error_type',
field=models.TextField(choices=[('1', 'File pre-check'), ('2', 'Record value invalid'), ('3', 'Record value consistency'), ('4', 'Case consistency'), ('5', 'Section consistency'), ('6', 'Historical consistency')], max_length=128),
),
)
]
24 changes: 24 additions & 0 deletions tdrs-backend/tdpservice/parsers/migrations/0007_datafilesummary.py
@@ -0,0 +1,24 @@
# Generated by Django 3.2.15 on 2023-09-20 15:35

from django.db import migrations, models
import django.db.models.deletion


class Migration(migrations.Migration):

dependencies = [
('data_files', '0012_datafile_s3_versioning_id'),
('parsers', '0006_auto_20230810_1500'),
]

operations = [
migrations.CreateModel(
name='DataFileSummary',
fields=[
('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
('status', models.CharField(choices=[('Pending', 'Pending'), ('Accepted', 'Accepted'), ('Accepted with Errors', 'Accepted With Errors'), ('Rejected', 'Rejected')], default='Pending', max_length=50)),
('case_aggregates', models.JSONField(null=True)),
('datafile', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='data_files.datafile')),
],
),
]
45 changes: 43 additions & 2 deletions tdrs-backend/tdpservice/parsers/models.py
@@ -5,7 +5,7 @@
from django.utils.translation import gettext_lazy as _
from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType

from tdpservice.data_files.models import DataFile

class ParserErrorCategoryChoices(models.TextChoices):
"""Enum of ParserError error_type."""
@@ -62,8 +62,49 @@ def __repr__(self):

def __str__(self):
"""Return a string representation of the model."""
return f"error_message: {self.error_message}"
return f"ParserError {self.__dict__}"

def _get_error_message(self):
"""Return the error message."""
return self.error_message

class DataFileSummary(models.Model):
"""Aggregates information about a parsed file."""

class Status(models.TextChoices):
"""Enum for status of parsed file."""

PENDING = "Pending" # file has been uploaded, but not validated
ACCEPTED = "Accepted"
ACCEPTED_WITH_ERRORS = "Accepted with Errors"
REJECTED = "Rejected"

status = models.CharField(
max_length=50,
choices=Status.choices,
default=Status.PENDING,
)

datafile = models.ForeignKey(DataFile, on_delete=models.CASCADE)

case_aggregates = models.JSONField(null=True, blank=False)

def get_status(self):
"""Set and return the status field based on errors and models associated with datafile."""
errors = ParserError.objects.filter(file=self.datafile)
[print(error) for error in errors]

# excluding row-level pre-checks and trailer pre-checks.
precheck_errors = errors.filter(error_type=ParserErrorCategoryChoices.PRE_CHECK)\
.exclude(field_name="Record_Type")\
.exclude(error_message__icontains="trailer")\
.exclude(error_message__icontains="Unknown Record_Type was found.")

if errors is None:
return DataFileSummary.Status.PENDING
elif errors.count() == 0:
return DataFileSummary.Status.ACCEPTED
elif precheck_errors.count() > 0:
return DataFileSummary.Status.REJECTED
else:
return DataFileSummary.Status.ACCEPTED_WITH_ERRORS
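For context, a minimal sketch of how a parsing task could create and finalize a `DataFileSummary` using the model above. Only `DataFileSummary`, its `Status` choices, and `get_status()` come from this diff; the helper name and wiring below are illustrative assumptions, not the actual task code in `tdpservice`.

```python
# Hypothetical helper; the real task wiring is not shown in this diff.
from tdpservice.parsers.models import DataFileSummary


def summarize_datafile(datafile):
    """Create a summary for a datafile and persist its final status after parsing."""
    dfs = DataFileSummary.objects.create(
        datafile=datafile,
        status=DataFileSummary.Status.PENDING,  # default while parsing runs
    )
    # ... parsing runs here, emitting ParserError rows keyed to `datafile` ...
    dfs.status = dfs.get_status()  # one of the Status choices above
    dfs.save()
    return dfs
```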
86 changes: 20 additions & 66 deletions tdrs-backend/tdpservice/parsers/parse.py
@@ -38,8 +38,8 @@ def parse_datafile(datafile):

section_is_valid, section_error = validators.validate_header_section_matches_submission(
datafile,
program_type,
section,
util.get_section_reference(program_type, section),
util.make_generate_parser_error(datafile, 1)
)

if not section_is_valid:
@@ -123,7 +123,6 @@ def parse_datafile_lines(datafile, program_type, section, is_encrypted):
errors = {}

line_number = 0
schema_manager_options = get_schema_manager_options(program_type)

unsaved_records = {}
unsaved_parser_errors = {}
@@ -180,11 +179,9 @@ def parse_datafile_lines(datafile, program_type, section, is_encrypted):
prev_sum = header_count + trailer_count
continue

schema_manager = get_schema_manager(line, section, schema_manager_options)

schema_manager.update_encrypted_fields(is_encrypted)
schema_manager = get_schema_manager(line, section, program_type)

records = manager_parse_line(line, schema_manager, generate_error)
records = manager_parse_line(line, schema_manager, generate_error, is_encrypted)

record_number = 0
for i in range(len(records)):
@@ -236,68 +233,25 @@ def parse_datafile_lines(datafile, program_type, section, is_encrypted):
return errors


def manager_parse_line(line, schema_manager, generate_error):
def manager_parse_line(line, schema_manager, generate_error, is_encrypted=False):
"""Parse and validate a datafile line using SchemaManager."""
if schema_manager.schemas:
try:
schema_manager.update_encrypted_fields(is_encrypted)
records = schema_manager.parse_and_validate(line, generate_error)
return records
except AttributeError as e:
logging.error(e)
return [(None, False, [
generate_error(
schema=None,
error_category=ParserErrorCategoryChoices.PRE_CHECK,
error_message="Unknown Record_Type was found.",
record=None,
field="Record_Type",
)
])]

logger.debug("Record Type is missing from record.")
return [(None, False, [
generate_error(
schema=None,
error_category=ParserErrorCategoryChoices.PRE_CHECK,
error_message="Record Type is missing from record.",
record=None,
field=None
)
])]


def get_schema_manager_options(program_type):
"""Return the allowed schema options."""
match program_type:
case 'TAN':
return {
'A': {
'T1': schema_defs.tanf.t1,
'T2': schema_defs.tanf.t2,
'T3': schema_defs.tanf.t3,
},
'C': {
'T4': schema_defs.tanf.t4,
'T5': schema_defs.tanf.t5,
},
'G': {
'T6': schema_defs.tanf.t6,
},
'S': {
# 'T7': schema_options.t7,
},
}
case 'SSP':
return {
'A': {
'M1': schema_defs.ssp.m1,
'M2': schema_defs.ssp.m2,
'M3': schema_defs.ssp.m3,
},
'C': {
# 'M4': schema_options.m4,
# 'M5': schema_options.m5,
},
'G': {
# 'M6': schema_options.m6,
},
'S': {
# 'M7': schema_options.m7,
},
}
# case tribal?
return None


def get_schema_manager(line, section, schema_options):
def get_schema_manager(line, section, program_type):
"""Return the appropriate schema for the line."""
line_type = line[0:2]
return schema_options.get(section, {}).get(line_type, util.SchemaManager([]))
return util.get_program_model(program_type, section, line_type)
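The inline `get_schema_manager_options` mapping removed above moves behind `util.get_program_model`. A rough sketch of the lookup it presumably performs, assuming it mirrors the deleted TANF/SSP dictionaries — the real implementation lives in `util.py` and is not part of this hunk.

```python
# Illustrative only: assumes util.get_program_model keeps the same
# program_type -> section -> record_type mapping that parse.py used to hold.
from tdpservice.parsers import schema_defs, util

_SCHEMA_OPTIONS = {
    'TAN': {
        'A': {'T1': schema_defs.tanf.t1, 'T2': schema_defs.tanf.t2, 'T3': schema_defs.tanf.t3},
        'C': {'T4': schema_defs.tanf.t4, 'T5': schema_defs.tanf.t5},
        'G': {'T6': schema_defs.tanf.t6},
        'S': {},  # T7 not implemented yet
    },
    'SSP': {
        'A': {'M1': schema_defs.ssp.m1, 'M2': schema_defs.ssp.m2, 'M3': schema_defs.ssp.m3},
        'C': {},  # M4/M5 not implemented yet
        'G': {},
        'S': {},
    },
}


def get_program_model(program_type, section, line_type):
    """Resolve the SchemaManager for a program type, section, and record type."""
    return _SCHEMA_OPTIONS.get(program_type, {}).get(section, {}).get(line_type, util.SchemaManager([]))
```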
2 changes: 1 addition & 1 deletion tdrs-backend/tdpservice/parsers/row_schema.py
@@ -81,7 +81,7 @@ def run_preparsing_validators(self, line, generate_error):
error_category=ParserErrorCategoryChoices.PRE_CHECK,
error_message=validator_error,
record=None,
field=None
field="Record_Type"
)
)

2 changes: 1 addition & 1 deletion tdrs-backend/tdpservice/parsers/schema_defs/tanf/t1.py
@@ -1,4 +1,4 @@
"""Schema for HEADER row of all submission types."""
"""Schema for t1 record types."""

from ...util import SchemaManager
from ...fields import Field
12 changes: 11 additions & 1 deletion tdrs-backend/tdpservice/parsers/serializers.py
@@ -1,7 +1,7 @@
"""Serializers for parsing errors."""

from rest_framework import serializers
from .models import ParserError
from .models import ParserError, DataFileSummary


class ParsingErrorSerializer(serializers.ModelSerializer):
@@ -23,3 +23,13 @@ class Meta:

model = ParserError
fields = '__all__'


class DataFileSummarySerializer(serializers.ModelSerializer):
"""Serializer for Parsing Errors."""

class Meta:
"""Metadata."""

model = DataFileSummary
fields = ['status', 'case_aggregates', 'datafile']
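A quick usage sketch for the new serializer; the viewset and route added elsewhere in this PR are not shown in this hunk, so the shell-style example below is illustrative.

```python
# Illustrative: build an API payload from the latest summary.
from tdpservice.parsers.models import DataFileSummary
from tdpservice.parsers.serializers import DataFileSummarySerializer

dfs = DataFileSummary.objects.select_related('datafile').first()
if dfs is not None:
    payload = DataFileSummarySerializer(dfs).data
    # e.g. {'status': 'Accepted with Errors', 'case_aggregates': {...}, 'datafile': <pk>}
```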
@@ -1,12 +1,12 @@
HEADER20204A06 TAN1EN
T12020101111111111223003403361110213120000300000000000008730010000000000000000000000000000000000222222000000002229012
T2202010111111111121219740114WTTTTTY@W2221222222221012212110014722011400000000000000000000000000000000000000000000000000000000000000000000000000000000000291
T2202010111111111121219740114WTTTTTY@W2221222222221012212110014722011500000000000000000000000000000000000000000000000000000000000000000000000000000000000291
T320201011111111112120190127WTTTT90W022212222204398100000000
T12020101111111111524503401311110233110374300000000000005450320000000000000000000000000000000000222222000000002229021
T2202010111111111152219730113WTTTT@#Z@2221222122211012210110630023080700000000000000000000000000000000000000000000000000000000000000000000000551019700000000
T320201011111111115120160401WTTTT@BTB22212212204398100000000
T12020101111111114023001401101120213110336300000000000002910410000000000000000000000000000000000222222000000002229012
T2202010111111111401219910501WTTTT@9#T2221222222221012212210421322011400000000000000000000000000000000000000000000000000000000000000000000000000000000000000
T2202010111111111401219910501WTTTT@9#T2221222222221012212210421322011500000000000000000000000000000000000000000000000000000000000000000000000000000000000000
T320201011111111140120170423WTTTT@@T#22212222204398100000000
T12020101111111114721801401711120212110374300000000000003820060000000000000000000000000000000000222222000000002229012
T2202010111111111471219800223WTTTT@TTW2222212222221012212110065423010700000000000000000000000000000000000000000000000000000000000000000000000000000000000000