From 43cb33d537cb77d82567078f94126a6e54ac043c Mon Sep 17 00:00:00 2001
From: Andrew <84722778+andrew-jameson@users.noreply.github.com>
Date: Mon, 25 Sep 2023 14:42:27 -0400
Subject: [PATCH 1/4] 1613 - DataFileSummary w/ Case Aggregates (#2612)

* saving state real quick
* finishing merge with latest
* Missed old test script
* Added new test, more cleanup
* Updating unit tests in DFS, preparing for 1610
* Merging in Jan's 1610 code for parserError useful-ness
* Revert "Merging in Jan's 1610 code for parserError useful-ness"
  This reverts commit c5796da69d0e9a6d356057550378d536e2be5f8b.
* update to test to use dfs fixture
* saving state before new 1610 merge
* Resolving merge conflicts with 1610.
* Linting changes and comparing to 1610
* Some unit test linting but inherited 1610 issues
* Re-ordering job to run tests vs lint first.
* Updates to linting and unit tests.
* Fixing linting.
* Update tdrs-backend/setup.cfg
* updates per PR.
* Excluding trailers for rejection
* VSCode merge resolution is garbage.
* Fixing precheck for not implemented types
* Updating to error-handle not implemented schema types
* - Updated view to show latest datafiles
  - Added admin filter to show newest or all datafile records
  - Updated indices to allow easier elastic queries
* - Updated search indices to have parent FK
* - Fix lint errors
* - Updated submission tests
  - Moved create_datafile to util
* - fix lint errors
* - removing frontend filtering
* - adding datafile to admin model
* Revert "- adding datafile to admin model"
  This reverts commit 35a6f24c36c3a4c00ddcfc40f20833530b0199f4.
* - Fixed issue where datafile FK wasn't populating
  - Regenerated migration
* - Readding datafile back to admin view now that the error is resolved
* - adding datafile back
* Revert "- Readding datafile back to admin view now that the error is resolved"
  This reverts commit 2807425059fd1b5b355edfb16d30d170cf869d7b.
* - Removed unnecessary fields
  - Updated dependencies
  - Updated filter
* - Updated document to include required fields
* - Fixed failing test
* add adminUrl to deployment cypress overrides
* Adding "beta" banners to relevant error report sections (#2522)
* Update views.py
* Update views.py
* Update SubmissionHistory.jsx
* Update SubmissionHistory.test.js
* Apply suggestions from code review
  Co-authored-by: Miles Reiter
* lint fixes
---------
Co-authored-by: Miles Reiter
Co-authored-by: Alex P <63075587+ADPennington@users.noreply.github.com>
Co-authored-by: andrew-jameson
* Create sprint-73-summary.md (#2565)
* hotfix for large file sizes (#2542)
* hotfix for large file sizes
* apply timeouts/req limits to dev
* filter identity pages from scan
* IGNORE sql injection
---------
Co-authored-by: Jan Timpe
Co-authored-by: mo sohani
Co-authored-by: Alex P <63075587+ADPennington@users.noreply.github.com>
* updating validation error language
* accidentally included coding challenge
* rm comments
* 2550 deactivation email link (#2557)
* - updated nginx buildpack
* - specifying different nginx version
* - Updating changelog
* - added script to update certain apps in cf
  - added workflow for each environment in circi
* - fixed base config
* - fixing jobs
* - Updated based on feedback in OH
* - Updating defaults
* - Removing defaults
* - Fixing indent
* - Adding params to config
* test
* test
* - updating work dir
* - Adding checkout
* - adding cf check
* - logging into cf
* - update cf check to install required binary
* - removing unnecessary switch
* - Forcing plugin installation
* - test installing plugin from script also
* - Adding url to email
* - test code for sandbox
* - using my email
* Revert "Merge branch 'update-cf-os' into 2551-deactivation-email-link"
  This reverts commit e963b9df48dd1f72ca0c5b192c979bac11851d11, reversing
  changes made to cc9cf81e9d76c42f51ffd5e102f6027d3eb5e645.
* Revert "- using my email"
  This reverts commit cc9cf81e9d76c42f51ffd5e102f6027d3eb5e645.
* Revert "- test code for sandbox"
  This reverts commit 06037747197d17ed8e63b086fcfcf048ecb50dc4.
---------
Co-authored-by: Alex P <63075587+ADPennington@users.noreply.github.com>
Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
* Update README.md (#2577)
  Add ATO
  Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
* Create 2023, Spring - Testing CSV & Excel-based error reports.md
* Update README.md
* Updating deliverable links (#2584)
* User viewset not returning/duplicating users (#2573)
* - Fixed issue not allowing pagination to work locally with nginx
  - Added ordering to user field to fix duplicates issue
* - fix lint error
* - Removing ID check since we cannot guarantee that the uuid that is generated per test run will be lexicographically consistent
---------
Co-authored-by: Alex P <63075587+ADPennington@users.noreply.github.com>
Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
* Update cf os (#2523)
* - updated nginx buildpack
* - specifying different nginx version
* - Updating changelog
* - added script to update certain apps in cf
  - added workflow for each environment in circi
* - fixed base config
* - fixing jobs
* - Updated based on feedback in OH
* - Updating defaults
* - Removing defaults
* - Fixing indent
* - Adding params to config
* test
* test
* - updating work dir
* - Adding checkout
* - adding cf check
* - logging into cf
* - update cf check to install required binary
* - removing unnecessary switch
* - Forcing plugin installation
* - test installing plugin from script also
* - Adding new dependencies
* - adding package
* - fixing broken install
* - fixing libs
* - using correct command
* - getting correct version of libc
* - trying to upgrade libs
* - testing
* - Updated README and script
* Revert "- Updated README and script"
  This reverts commit 92697b3e53d1fd87b8d3e7995abb9093aa26e307.
* - Removed unnecessary circi stuff
  - Removed script
  - Updated docs to callout updating secondary apps
* - Correct spelling error
---------
Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
* Item Number Mismatch (#2578)
* - Updated schemas and models to reflect correct item numbers of fields
* - Revert migration
* - Updated header/trailer item numbers
* - Fixed item numbers off by one errors
---------
Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
* pipeline filtering (#2538)
* pipeline changes that filter based on paths and branches. circle ci tracks specified branches in order to keep current functionality on HHS side.
* updated syntax to be in line with build-all.yml
* removed comma
* WIP build flow docs
* added Architecture Decision Record for the change to pipeline workflows
* corrected file type of doc to .md
---------
Co-authored-by: George Hudson
Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
* Hotfix Devops/2457 path filtering for documentation (#2597)
* pipeline changes that filter based on paths and branches. circle ci tracks specified branches in order to keep current functionality on HHS side.
* updated syntax to be in line with build-all.yml
* removed comma
* WIP build flow docs
* added Architecture Decision Record for the change to pipeline workflows
* corrected file type of doc to .md
* build and test all on PRs even for documentation
---------
Co-authored-by: George Hudson
* Create sprint-74-summary.md (#2596)
  Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
* added URL filters (#2580)
* added URL filters
* allow github to trigger owasp and label deploys (#2601)
  Co-authored-by: George Hudson
---------
Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
Co-authored-by: George Hudson
Co-authored-by: George Hudson
* Create sprint-75-summary.md (#2608)
* Create sprint-76-summary.md (#2609)
  Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com>
* - Resolved failing tests
* - Corrected merge thrash
* - Using randbits to generate pk to get around conflicting sequence pks
* Revert "- Using randbits to generate pk to get around conflicting sequence pks"
  This reverts commit ac9b0659a62f64c4114c41faf0baa659a92be07c.
* - Updating region in fixture instead of factory
  - letting django handle transaction for test
* - Moved datafile reference to avoid confusion
* pushing up incomplete codebase
* Other unit tests now have passed w/ good error handling
* Working tests, need to get setup for case aggregates populating via DB
* - Updated queries
  - Added helper function
  - Need to merge in 2579 for queries to work
* minor improvement to month2int
* - Fixing most merge errors
* - Fixing functions
* - Updated queries based on generic relation
* - Updated queries to count by case number instead of record number
* - Added route
  - Updated task to create dfs
* - updated tests to include dfs
* Cleaning up most comments that are no longer necessary and fixed lint issues.
* making minor updates, still broken tests.
* updating pipfile.lock and rebuild image resolved test issues
* Reorganizing tests, still failing in test_parse.py
* deleted summary file, split into other test scripts.
* Fixed missing self reference.
* Linting fixes.
* Found reference failure in deployed env.
* Removing extra returns for missing record type.
* lint fix
* Addressed invocation of datafile for failing test
* lint update for whitespace
* Intermediary commit, broken test
* new assignments in util
* - updated rejected query to correctly count objs
* - Fixing most tests
* - Fixed user error. Swapped numbers by accident.
* - make region None to avoid PK collision
* - Fix lint errors
* - Updating to avoid warning
* vscode merge conflict resolution (#2623)
* auto-create the external network
* didn't stage commit properly
* checking diffs, matching 1613.2
* doesn't work in pipeline. must be cached local
* re-commenting in unit test
* lint failures fixed
---------
Co-authored-by: andrew-jameson
* url change per me, want pipeline to run e2e
* Upgraded to querysets, fix PR comments, PE str
* missing : not caught locally
* Feat/1613 merge 2 (#2650)
* Create sprint-78-summary.md (#2645)
* Missing/unsaved parser_error for record_type
* removing redundant tests
* Hopefully resolved on unit tests and lint
---------
Co-authored-by: Smithh-Co <121890311+Smithh-Co@users.noreply.github.com>
Co-authored-by: andrew-jameson
* icontains
* tests
* Changing dict structure per 1612.
* fixed tests and lint issues, parse is too complex
* schema_manager replaces schema check
* Saving state prior to merge-conflict.
* Adopting latest manager, removing old error style.
* Commented out t6 line during Office hours
* minor reference update
* Acclimating to schemaManager
* lint-fix isinstance
* syntax mistake with isinstance
* Apply suggestions from code review
* reverting search_index merge artifacts.
* adjusting for removing unused "get-schema()"
* whitespace lint
* Feedback from Jan
* Ensuring tests run/work.
* Ensure we have leading zero in rptmonthyear.
* Minor lint fix for exception logging
* resolving merge conflict problems
* fixing tests from merge conflicts.
* dumb lint fix
* reducing line length for lint
* Moving DFS migration into its own file to avoid conflicts.
---------
Co-authored-by: andrew-jameson
Co-authored-by: elipe17
Co-authored-by: Jan Timpe
Co-authored-by: Miles Reiter
Co-authored-by: Alex P <63075587+ADPennington@users.noreply.github.com>
Co-authored-by: Smithh-Co <121890311+Smithh-Co@users.noreply.github.com>
Co-authored-by: mo sohani
Co-authored-by: Eric Lipe <125676261+elipe17@users.noreply.github.com>
Co-authored-by: Lauren Frohlich <61251539+lfrohlich@users.noreply.github.com>
Co-authored-by: Miles Reiter
Co-authored-by: George Hudson
Co-authored-by: George Hudson
Co-authored-by: raftmsohani <97037188+raftmsohani@users.noreply.github.com>
---
 .circleci/build-and-test/jobs.yml | 6 +-
 .circleci/config.yml | 4 +-
 scripts/zap-scanner.sh | 1 -
 tdrs-backend/Pipfile.lock | 1 +
 tdrs-backend/docker-compose.local.yml | 6 +-
 tdrs-backend/docker-compose.yml | 2 +-
 .../tdpservice/data_files/test/factories.py | 2 +-
 tdrs-backend/tdpservice/parsers/admin.py | 7 +
 .../0002_alter_parsererror_error_type.py | 2 +-
 .../migrations/0007_datafilesummary.py | 24 ++
 tdrs-backend/tdpservice/parsers/models.py | 45 ++-
 tdrs-backend/tdpservice/parsers/parse.py | 86 ++----
 tdrs-backend/tdpservice/parsers/row_schema.py | 2 +-
 .../tdpservice/parsers/schema_defs/tanf/t1.py | 2 +-
 .../tdpservice/parsers/serializers.py | 12 +-
 .../parsers/test/data/small_tanf_section1.txt | 4 +-
 .../tdpservice/parsers/test/factories.py | 59 +++-
 .../tdpservice/parsers/test/test_models.py | 2 +-
 .../tdpservice/parsers/test/test_parse.py | 291 ++++++++++++++----
 tdrs-backend/tdpservice/parsers/urls.py | 7 +-
 tdrs-backend/tdpservice/parsers/util.py | 222 ++++++++++++-
 tdrs-backend/tdpservice/parsers/validators.py | 70 +----
 tdrs-backend/tdpservice/parsers/views.py | 12 +-
.../tdpservice/scheduling/parser_task.py | 9 +- .../tdpservice/users/test/test_permissions.py | 3 + tdrs-frontend/docker-compose.yml | 2 +- 26 files changed, 659 insertions(+), 224 deletions(-) create mode 100644 tdrs-backend/tdpservice/parsers/migrations/0007_datafilesummary.py diff --git a/.circleci/build-and-test/jobs.yml b/.circleci/build-and-test/jobs.yml index b4f5afd2f..4e32831f8 100644 --- a/.circleci/build-and-test/jobs.yml +++ b/.circleci/build-and-test/jobs.yml @@ -5,14 +5,14 @@ - checkout - docker-compose-check - docker-compose-up-backend - - run: - name: Execute Python Linting Test - command: cd tdrs-backend; docker-compose run --rm web bash -c "flake8 ." - run: name: Run Unit Tests And Create Code Coverage Report command: | cd tdrs-backend; docker-compose run --rm web bash -c "./wait_for_services.sh && pytest --cov-report=xml" + - run: + name: Execute Python Linting Test + command: cd tdrs-backend; docker-compose run --rm web bash -c "flake8 ." - upload-codecov: component: backend coverage-report: ./tdrs-backend/coverage.xml diff --git a/.circleci/config.yml b/.circleci/config.yml index 8b8a62ee7..65715debc 100755 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -82,5 +82,5 @@ workflows: - develop - main - master - - /^release.*/ - + - /^release.*/ + diff --git a/scripts/zap-scanner.sh b/scripts/zap-scanner.sh index f2e999895..c3f534b84 100755 --- a/scripts/zap-scanner.sh +++ b/scripts/zap-scanner.sh @@ -139,7 +139,6 @@ ZAP_CLI_OPTIONS="\ -config globalexcludeurl.url_list.url\(21\).regex='^https:\/\/.*\.identitysandbox.gov\/.*$' \ -config globalexcludeurl.url_list.url\(21\).description='Site - IdentitySandbox.gov' \ -config globalexcludeurl.url_list.url\(21\).enabled=true \ - -config spider.postform=true" # How long ZAP will crawl the app with the spider process diff --git a/tdrs-backend/Pipfile.lock b/tdrs-backend/Pipfile.lock index 3e049d740..bc99d280f 100644 --- a/tdrs-backend/Pipfile.lock +++ b/tdrs-backend/Pipfile.lock @@ -916,6 +916,7 @@ 
], "index": "pypi", "version": "==2022.1" + }, "redis": { "hashes": [ diff --git a/tdrs-backend/docker-compose.local.yml b/tdrs-backend/docker-compose.local.yml index ac5924e18..3c8e76317 100644 --- a/tdrs-backend/docker-compose.local.yml +++ b/tdrs-backend/docker-compose.local.yml @@ -80,7 +80,7 @@ services: build: . command: > bash -c "./wait_for_services.sh && - ./gunicorn_start.sh && + ./gunicorn_start.sh && celery -A tdpservice.settings worker -l info" ports: - "5555:5555" @@ -106,5 +106,5 @@ volumes: networks: default: - external: - name: external-net + name: external-net + external: true diff --git a/tdrs-backend/docker-compose.yml b/tdrs-backend/docker-compose.yml index d9d10d393..69e08bc64 100644 --- a/tdrs-backend/docker-compose.yml +++ b/tdrs-backend/docker-compose.yml @@ -124,5 +124,5 @@ volumes: networks: default: - external: name: external-net + external: true diff --git a/tdrs-backend/tdpservice/data_files/test/factories.py b/tdrs-backend/tdpservice/data_files/test/factories.py index 34522154c..88333f7d9 100644 --- a/tdrs-backend/tdpservice/data_files/test/factories.py +++ b/tdrs-backend/tdpservice/data_files/test/factories.py @@ -18,7 +18,7 @@ class Meta: extension = "txt" section = "Active Case Data" quarter = "Q1" - year = "2020" + year = 2020 version = 1 user = factory.SubFactory(UserFactory) stt = factory.SubFactory(STTFactory) diff --git a/tdrs-backend/tdpservice/parsers/admin.py b/tdrs-backend/tdpservice/parsers/admin.py index c98ef5d70..266fb5b26 100644 --- a/tdrs-backend/tdpservice/parsers/admin.py +++ b/tdrs-backend/tdpservice/parsers/admin.py @@ -15,4 +15,11 @@ class ParserErrorAdmin(admin.ModelAdmin): ] +class DataFileSummaryAdmin(admin.ModelAdmin): + """ModelAdmin class for DataFileSummary objects generated in parsing.""" + + list_display = ['status', 'case_aggregates', 'datafile'] + + admin.site.register(models.ParserError, ParserErrorAdmin) +admin.site.register(models.DataFileSummary, DataFileSummaryAdmin) diff --git 
a/tdrs-backend/tdpservice/parsers/migrations/0002_alter_parsererror_error_type.py b/tdrs-backend/tdpservice/parsers/migrations/0002_alter_parsererror_error_type.py index 5236b5c29..e55c856ce 100644 --- a/tdrs-backend/tdpservice/parsers/migrations/0002_alter_parsererror_error_type.py +++ b/tdrs-backend/tdpservice/parsers/migrations/0002_alter_parsererror_error_type.py @@ -14,5 +14,5 @@ class Migration(migrations.Migration): model_name='parsererror', name='error_type', field=models.TextField(choices=[('1', 'File pre-check'), ('2', 'Record value invalid'), ('3', 'Record value consistency'), ('4', 'Case consistency'), ('5', 'Section consistency'), ('6', 'Historical consistency')], max_length=128), - ), + ) ] diff --git a/tdrs-backend/tdpservice/parsers/migrations/0007_datafilesummary.py b/tdrs-backend/tdpservice/parsers/migrations/0007_datafilesummary.py new file mode 100644 index 000000000..5f5e2a9b5 --- /dev/null +++ b/tdrs-backend/tdpservice/parsers/migrations/0007_datafilesummary.py @@ -0,0 +1,24 @@ +# Generated by Django 3.2.15 on 2023-09-20 15:35 + +from django.db import migrations, models +import django.db.models.deletion + + +class Migration(migrations.Migration): + + dependencies = [ + ('data_files', '0012_datafile_s3_versioning_id'), + ('parsers', '0006_auto_20230810_1500'), + ] + + operations = [ + migrations.CreateModel( + name='DataFileSummary', + fields=[ + ('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), + ('status', models.CharField(choices=[('Pending', 'Pending'), ('Accepted', 'Accepted'), ('Accepted with Errors', 'Accepted With Errors'), ('Rejected', 'Rejected')], default='Pending', max_length=50)), + ('case_aggregates', models.JSONField(null=True)), + ('datafile', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='data_files.datafile')), + ], + ), + ] diff --git a/tdrs-backend/tdpservice/parsers/models.py b/tdrs-backend/tdpservice/parsers/models.py index 4a638e06a..0c0ccdc50 100644 
--- a/tdrs-backend/tdpservice/parsers/models.py +++ b/tdrs-backend/tdpservice/parsers/models.py @@ -5,7 +5,7 @@ from django.utils.translation import gettext_lazy as _ from django.contrib.contenttypes.fields import GenericForeignKey from django.contrib.contenttypes.models import ContentType - +from tdpservice.data_files.models import DataFile class ParserErrorCategoryChoices(models.TextChoices): """Enum of ParserError error_type.""" @@ -62,8 +62,49 @@ def __repr__(self): def __str__(self): """Return a string representation of the model.""" - return f"error_message: {self.error_message}" + return f"ParserError {self.__dict__}" def _get_error_message(self): """Return the error message.""" return self.error_message + +class DataFileSummary(models.Model): + """Aggregates information about a parsed file.""" + + class Status(models.TextChoices): + """Enum for status of parsed file.""" + + PENDING = "Pending" # file has been uploaded, but not validated + ACCEPTED = "Accepted" + ACCEPTED_WITH_ERRORS = "Accepted with Errors" + REJECTED = "Rejected" + + status = models.CharField( + max_length=50, + choices=Status.choices, + default=Status.PENDING, + ) + + datafile = models.ForeignKey(DataFile, on_delete=models.CASCADE) + + case_aggregates = models.JSONField(null=True, blank=False) + + def get_status(self): + """Set and return the status field based on errors and models associated with datafile.""" + errors = ParserError.objects.filter(file=self.datafile) + [print(error) for error in errors] + + # excluding row-level pre-checks and trailer pre-checks. 
+ precheck_errors = errors.filter(error_type=ParserErrorCategoryChoices.PRE_CHECK)\ + .exclude(field_name="Record_Type")\ + .exclude(error_message__icontains="trailer")\ + .exclude(error_message__icontains="Unknown Record_Type was found.") + + if errors is None: + return DataFileSummary.Status.PENDING + elif errors.count() == 0: + return DataFileSummary.Status.ACCEPTED + elif precheck_errors.count() > 0: + return DataFileSummary.Status.REJECTED + else: + return DataFileSummary.Status.ACCEPTED_WITH_ERRORS diff --git a/tdrs-backend/tdpservice/parsers/parse.py b/tdrs-backend/tdpservice/parsers/parse.py index 2c2183c68..e8e4a3121 100644 --- a/tdrs-backend/tdpservice/parsers/parse.py +++ b/tdrs-backend/tdpservice/parsers/parse.py @@ -38,8 +38,8 @@ def parse_datafile(datafile): section_is_valid, section_error = validators.validate_header_section_matches_submission( datafile, - program_type, - section, + util.get_section_reference(program_type, section), + util.make_generate_parser_error(datafile, 1) ) if not section_is_valid: @@ -123,7 +123,6 @@ def parse_datafile_lines(datafile, program_type, section, is_encrypted): errors = {} line_number = 0 - schema_manager_options = get_schema_manager_options(program_type) unsaved_records = {} unsaved_parser_errors = {} @@ -180,11 +179,9 @@ def parse_datafile_lines(datafile, program_type, section, is_encrypted): prev_sum = header_count + trailer_count continue - schema_manager = get_schema_manager(line, section, schema_manager_options) - - schema_manager.update_encrypted_fields(is_encrypted) + schema_manager = get_schema_manager(line, section, program_type) - records = manager_parse_line(line, schema_manager, generate_error) + records = manager_parse_line(line, schema_manager, generate_error, is_encrypted) record_number = 0 for i in range(len(records)): @@ -236,68 +233,25 @@ def parse_datafile_lines(datafile, program_type, section, is_encrypted): return errors -def manager_parse_line(line, schema_manager, generate_error): +def 
manager_parse_line(line, schema_manager, generate_error, is_encrypted=False): """Parse and validate a datafile line using SchemaManager.""" - if schema_manager.schemas: + try: + schema_manager.update_encrypted_fields(is_encrypted) records = schema_manager.parse_and_validate(line, generate_error) return records + except AttributeError as e: + logging.error(e) + return [(None, False, [ + generate_error( + schema=None, + error_category=ParserErrorCategoryChoices.PRE_CHECK, + error_message="Unknown Record_Type was found.", + record=None, + field="Record_Type", + ) + ])] - logger.debug("Record Type is missing from record.") - return [(None, False, [ - generate_error( - schema=None, - error_category=ParserErrorCategoryChoices.PRE_CHECK, - error_message="Record Type is missing from record.", - record=None, - field=None - ) - ])] - - -def get_schema_manager_options(program_type): - """Return the allowed schema options.""" - match program_type: - case 'TAN': - return { - 'A': { - 'T1': schema_defs.tanf.t1, - 'T2': schema_defs.tanf.t2, - 'T3': schema_defs.tanf.t3, - }, - 'C': { - 'T4': schema_defs.tanf.t4, - 'T5': schema_defs.tanf.t5, - }, - 'G': { - 'T6': schema_defs.tanf.t6, - }, - 'S': { - # 'T7': schema_options.t7, - }, - } - case 'SSP': - return { - 'A': { - 'M1': schema_defs.ssp.m1, - 'M2': schema_defs.ssp.m2, - 'M3': schema_defs.ssp.m3, - }, - 'C': { - # 'M4': schema_options.m4, - # 'M5': schema_options.m5, - }, - 'G': { - # 'M6': schema_options.m6, - }, - 'S': { - # 'M7': schema_options.m7, - }, - } - # case tribal? 
- return None - - -def get_schema_manager(line, section, schema_options): +def get_schema_manager(line, section, program_type): """Return the appropriate schema for the line.""" line_type = line[0:2] - return schema_options.get(section, {}).get(line_type, util.SchemaManager([])) + return util.get_program_model(program_type, section, line_type) diff --git a/tdrs-backend/tdpservice/parsers/row_schema.py b/tdrs-backend/tdpservice/parsers/row_schema.py index a4faecdf3..d19f9f5f1 100644 --- a/tdrs-backend/tdpservice/parsers/row_schema.py +++ b/tdrs-backend/tdpservice/parsers/row_schema.py @@ -81,7 +81,7 @@ def run_preparsing_validators(self, line, generate_error): error_category=ParserErrorCategoryChoices.PRE_CHECK, error_message=validator_error, record=None, - field=None + field="Record_Type" ) ) diff --git a/tdrs-backend/tdpservice/parsers/schema_defs/tanf/t1.py b/tdrs-backend/tdpservice/parsers/schema_defs/tanf/t1.py index 08e171c22..546910386 100644 --- a/tdrs-backend/tdpservice/parsers/schema_defs/tanf/t1.py +++ b/tdrs-backend/tdpservice/parsers/schema_defs/tanf/t1.py @@ -1,4 +1,4 @@ -"""Schema for HEADER row of all submission types.""" +"""Schema for t1 record types.""" from ...util import SchemaManager from ...fields import Field diff --git a/tdrs-backend/tdpservice/parsers/serializers.py b/tdrs-backend/tdpservice/parsers/serializers.py index 05a4e0d07..9b4ad734d 100644 --- a/tdrs-backend/tdpservice/parsers/serializers.py +++ b/tdrs-backend/tdpservice/parsers/serializers.py @@ -1,7 +1,7 @@ """Serializers for parsing errors.""" from rest_framework import serializers -from .models import ParserError +from .models import ParserError, DataFileSummary class ParsingErrorSerializer(serializers.ModelSerializer): @@ -23,3 +23,13 @@ class Meta: model = ParserError fields = '__all__' + + +class DataFileSummarySerializer(serializers.ModelSerializer): + """Serializer for Parsing Errors.""" + + class Meta: + """Metadata.""" + + model = DataFileSummary + fields = ['status', 
'case_aggregates', 'datafile'] diff --git a/tdrs-backend/tdpservice/parsers/test/data/small_tanf_section1.txt b/tdrs-backend/tdpservice/parsers/test/data/small_tanf_section1.txt index e906c2ed3..dc9ddae99 100644 --- a/tdrs-backend/tdpservice/parsers/test/data/small_tanf_section1.txt +++ b/tdrs-backend/tdpservice/parsers/test/data/small_tanf_section1.txt @@ -1,12 +1,12 @@ HEADER20204A06 TAN1EN T12020101111111111223003403361110213120000300000000000008730010000000000000000000000000000000000222222000000002229012 -T2202010111111111121219740114WTTTTTY@W2221222222221012212110014722011400000000000000000000000000000000000000000000000000000000000000000000000000000000000291 +T2202010111111111121219740114WTTTTTY@W2221222222221012212110014722011500000000000000000000000000000000000000000000000000000000000000000000000000000000000291 T320201011111111112120190127WTTTT90W022212222204398100000000 T12020101111111111524503401311110233110374300000000000005450320000000000000000000000000000000000222222000000002229021 T2202010111111111152219730113WTTTT@#Z@2221222122211012210110630023080700000000000000000000000000000000000000000000000000000000000000000000000551019700000000 T320201011111111115120160401WTTTT@BTB22212212204398100000000 T12020101111111114023001401101120213110336300000000000002910410000000000000000000000000000000000222222000000002229012 -T2202010111111111401219910501WTTTT@9#T2221222222221012212210421322011400000000000000000000000000000000000000000000000000000000000000000000000000000000000000 +T2202010111111111401219910501WTTTT@9#T2221222222221012212210421322011500000000000000000000000000000000000000000000000000000000000000000000000000000000000000 T320201011111111140120170423WTTTT@@T#22212222204398100000000 T12020101111111114721801401711120212110374300000000000003820060000000000000000000000000000000000222222000000002229012 T2202010111111111471219800223WTTTT@TTW2222212222221012212110065423010700000000000000000000000000000000000000000000000000000000000000000000000000000000000000 
diff --git a/tdrs-backend/tdpservice/parsers/test/factories.py b/tdrs-backend/tdpservice/parsers/test/factories.py index c9f9adc6c..8eb309b60 100644 --- a/tdrs-backend/tdpservice/parsers/test/factories.py +++ b/tdrs-backend/tdpservice/parsers/test/factories.py @@ -1,7 +1,64 @@ """Factories for generating test data for parsers.""" import factory +from tdpservice.parsers.models import DataFileSummary, ParserErrorCategoryChoices from faker import Faker from tdpservice.data_files.test.factories import DataFileFactory +from tdpservice.users.test.factories import UserFactory +from tdpservice.stts.test.factories import STTFactory + +class ParsingFileFactory(factory.django.DjangoModelFactory): + """Generate test data for data files.""" + + class Meta: + """Hardcoded meta data for data files.""" + + model = "data_files.DataFile" + + original_filename = "data_file.txt" + slug = "data_file-txt-slug" + extension = "txt" + section = "Active Case Data" + quarter = "Q1" + year = "2020" + version = 1 + user = factory.SubFactory(UserFactory) + stt = factory.SubFactory(STTFactory) + file = factory.django.FileField(data=b'test', filename='my_data_file.txt') + s3_versioning_id = 0 + +class DataFileSummaryFactory(factory.django.DjangoModelFactory): + """Generate test data for data files.""" + + class Meta: + """Hardcoded meta data for data files.""" + + model = DataFileSummary + + status = DataFileSummary.Status.PENDING + + case_aggregates = { + "rejected": 0, + "months": [ + { + "accepted_without_errors": 100, + "accepted_with_errors": 10, + "month": "Jan", + }, + { + "accepted_without_errors": 100, + "accepted_with_errors": 10, + "month": "Feb", + }, + { + "accepted_without_errors": 100, + "accepted_with_errors": 10, + "month": "Mar", + }, + ] + } + + datafile = factory.SubFactory(DataFileFactory) + fake = Faker() @@ -21,7 +78,7 @@ class Meta: case_number = '1' rpt_month_year = 202001 error_message = "test error message" - error_type = "out of range" + error_type = 
ParserErrorCategoryChoices.PRE_CHECK created_at = factory.Faker("date_time") fields_json = {"test": "test"} diff --git a/tdrs-backend/tdpservice/parsers/test/test_models.py b/tdrs-backend/tdpservice/parsers/test/test_models.py index c46532ada..783e859e7 100644 --- a/tdrs-backend/tdpservice/parsers/test/test_models.py +++ b/tdrs-backend/tdpservice/parsers/test/test_models.py @@ -2,7 +2,7 @@ import pytest from tdpservice.parsers.models import ParserError -from tdpservice.parsers.test.factories import ParserErrorFactory +from .factories import ParserErrorFactory @pytest.fixture def parser_error_instance(): diff --git a/tdrs-backend/tdpservice/parsers/test/test_parse.py b/tdrs-backend/tdpservice/parsers/test/test_parse.py index 4a538ae8a..fd794280b 100644 --- a/tdrs-backend/tdpservice/parsers/test/test_parse.py +++ b/tdrs-backend/tdpservice/parsers/test/test_parse.py @@ -2,11 +2,14 @@ import pytest -from ..util import create_test_datafile from .. import parse -from ..models import ParserError, ParserErrorCategoryChoices +from ..models import ParserError, ParserErrorCategoryChoices, DataFileSummary from tdpservice.search_indexes.models.tanf import TANF_T1, TANF_T2, TANF_T3, TANF_T4, TANF_T5, TANF_T6 from tdpservice.search_indexes.models.ssp import SSP_M1, SSP_M2, SSP_M3 +from .factories import DataFileSummaryFactory +from tdpservice.data_files.models import DataFile +from .. 
import schema_defs, util + import logging es_logger = logging.getLogger('elasticsearch') @@ -16,15 +19,30 @@ @pytest.fixture def test_datafile(stt_user, stt): """Fixture for small_correct_file.""" - return create_test_datafile('small_correct_file', stt_user, stt) + return util.create_test_datafile('small_correct_file', stt_user, stt) +@pytest.fixture +def dfs(): + """Fixture for DataFileSummary.""" + return DataFileSummaryFactory.create() -@pytest.mark.django_db() -def test_parse_small_correct_file(test_datafile): +@pytest.mark.django_db +def test_parse_small_correct_file(test_datafile, dfs): """Test parsing of small_correct_file.""" - errors = parse.parse_datafile(test_datafile) - errors = ParserError.objects.filter(file=test_datafile) - assert errors.count() == 0 + dfs.datafile = test_datafile + dfs.save() + + parse.parse_datafile(test_datafile) + dfs.status = dfs.get_status() + dfs.case_aggregates = util.case_aggregates_by_month(dfs.datafile, dfs.status) + assert dfs.case_aggregates == {'rejected': 0, + 'months': [ + {'accepted_without_errors': 1, 'accepted_with_errors': 0, 'month': 'Oct'}, + {'accepted_without_errors': 0, 'accepted_with_errors': 0, 'month': 'Nov'}, + {'accepted_without_errors': 0, 'accepted_with_errors': 0, 'month': 'Dec'} + ]} + + assert dfs.get_status() == DataFileSummary.Status.ACCEPTED assert TANF_T1.objects.count() == 1 @@ -41,16 +59,32 @@ def test_parse_small_correct_file(test_datafile): assert t1.SANC_REDUCTION_AMT == 0 assert t1.FAMILY_NEW_CHILD == 2 - -@pytest.mark.django_db() -def test_parse_section_mismatch(test_datafile): +@pytest.mark.django_db +def test_parse_section_mismatch(test_datafile, dfs): """Test parsing of small_correct_file where the DataFile section doesn't match the rawfile section.""" test_datafile.section = 'Closed Case Data' test_datafile.save() - errors = parse.parse_datafile(test_datafile) + dfs.datafile = test_datafile + dfs.save() + errors = parse.parse_datafile(test_datafile) + dfs.status = dfs.get_status() + 
assert dfs.status == DataFileSummary.Status.REJECTED parser_errors = ParserError.objects.filter(file=test_datafile) + dfs.case_aggregates = util.case_aggregates_by_month(dfs.datafile, dfs.status) + assert dfs.case_aggregates == {'rejected': 1, + 'months': [ + {'accepted_without_errors': 'N/A', + 'accepted_with_errors': 'N/A', + 'month': 'Oct'}, + {'accepted_without_errors': 'N/A', + 'accepted_with_errors': 'N/A', + 'month': 'Nov'}, + {'accepted_without_errors': 'N/A', + 'accepted_with_errors': 'N/A', + 'month': 'Dec'} + ]} assert parser_errors.count() == 1 err = parser_errors.first() @@ -65,13 +99,16 @@ def test_parse_section_mismatch(test_datafile): } -@pytest.mark.django_db() -def test_parse_wrong_program_type(test_datafile): +@pytest.mark.django_db +def test_parse_wrong_program_type(test_datafile, dfs): """Test parsing of small_correct_file where the DataFile program type doesn't match the rawfile.""" test_datafile.section = 'SSP Active Case Data' test_datafile.save() + dfs.datafile = test_datafile + dfs.save() errors = parse.parse_datafile(test_datafile) + assert dfs.get_status() == DataFileSummary.Status.REJECTED parser_errors = ParserError.objects.filter(file=test_datafile) assert parser_errors.count() == 1 @@ -91,17 +128,30 @@ def test_parse_wrong_program_type(test_datafile): @pytest.fixture def test_big_file(stt_user, stt): """Fixture for ADS.E2J.FTP1.TS06.""" - return create_test_datafile('ADS.E2J.FTP1.TS06', stt_user, stt) - + return util.create_test_datafile('ADS.E2J.FTP1.TS06', stt_user, stt) -@pytest.mark.django_db() -def test_parse_big_file(test_big_file): +@pytest.mark.django_db +@pytest.mark.skip(reason="long runtime") # big_files +def test_parse_big_file(test_big_file, dfs): """Test parsing of ADS.E2J.FTP1.TS06.""" expected_t1_record_count = 815 expected_t2_record_count = 882 expected_t3_record_count = 1376 + dfs.datafile = test_big_file + dfs.save() + parse.parse_datafile(test_big_file) + dfs.status = dfs.get_status() + assert dfs.status == 
DataFileSummary.Status.ACCEPTED_WITH_ERRORS + dfs.case_aggregates = util.case_aggregates_by_month(dfs.datafile, dfs.status) + assert dfs.case_aggregates == {'rejected': 0, + 'months': [ + {'accepted_without_errors': 171, 'accepted_with_errors': 99, 'month': 'Oct'}, + {'accepted_without_errors': 169, 'accepted_with_errors': 104, 'month': 'Nov'}, + {'accepted_without_errors': 166, 'accepted_with_errors': 106, 'month': 'Dec'} + ]} + parser_errors = ParserError.objects.filter(file=test_big_file) error_message = 'MONTHS_FED_TIME_LIMIT is required but a value was not provided.' @@ -119,10 +169,10 @@ def test_parse_big_file(test_big_file): @pytest.fixture def bad_test_file(stt_user, stt): """Fixture for bad_TANF_S2.""" - return create_test_datafile('bad_TANF_S2.txt', stt_user, stt) + return util.create_test_datafile('bad_TANF_S2.txt', stt_user, stt) -@pytest.mark.django_db() +@pytest.mark.django_db def test_parse_bad_test_file(bad_test_file): """Test parsing of bad_TANF_S2.""" errors = parse.parse_datafile(bad_test_file) @@ -145,13 +195,15 @@ def test_parse_bad_test_file(bad_test_file): @pytest.fixture def bad_file_missing_header(stt_user, stt): """Fixture for bad_missing_header.""" - return create_test_datafile('bad_missing_header.txt', stt_user, stt) - + return util.create_test_datafile('bad_missing_header.txt', stt_user, stt) -@pytest.mark.django_db() -def test_parse_bad_file_missing_header(bad_file_missing_header): +@pytest.mark.django_db +def test_parse_bad_file_missing_header(bad_file_missing_header, dfs): """Test parsing of bad_missing_header.""" errors = parse.parse_datafile(bad_file_missing_header) + dfs.datafile = bad_file_missing_header + dfs.save() + assert dfs.get_status() == DataFileSummary.Status.REJECTED parser_errors = ParserError.objects.filter(file=bad_file_missing_header) @@ -172,13 +224,16 @@ def test_parse_bad_file_missing_header(bad_file_missing_header): @pytest.fixture def bad_file_multiple_headers(stt_user, stt): """Fixture for bad_two_headers.""" 
- return create_test_datafile('bad_two_headers.txt', stt_user, stt) + return util.create_test_datafile('bad_two_headers.txt', stt_user, stt) -@pytest.mark.django_db() -def test_parse_bad_file_multiple_headers(bad_file_multiple_headers): +@pytest.mark.django_db +def test_parse_bad_file_multiple_headers(bad_file_multiple_headers, dfs): """Test parsing of bad_two_headers.""" errors = parse.parse_datafile(bad_file_multiple_headers) + dfs.datafile = bad_file_multiple_headers + dfs.save() + assert dfs.get_status() == DataFileSummary.Status.REJECTED parser_errors = ParserError.objects.filter(file=bad_file_multiple_headers) assert parser_errors.count() == 1 @@ -196,11 +251,11 @@ def test_parse_bad_file_multiple_headers(bad_file_multiple_headers): @pytest.fixture def big_bad_test_file(stt_user, stt): """Fixture for bad_TANF_S1.""" - return create_test_datafile('bad_TANF_S1.txt', stt_user, stt) + return util.create_test_datafile('bad_TANF_S1.txt', stt_user, stt) -@pytest.mark.django_db() -def test_parse_big_bad_test_file(big_bad_test_file): +@pytest.mark.django_db +def test_parse_big_bad_test_file(big_bad_test_file, dfs): """Test parsing of bad_TANF_S1.""" parse.parse_datafile(big_bad_test_file) @@ -219,12 +274,14 @@ def test_parse_big_bad_test_file(big_bad_test_file): @pytest.fixture def bad_trailer_file(stt_user, stt): """Fixture for bad_trailer_1.""" - return create_test_datafile('bad_trailer_1.txt', stt_user, stt) + return util.create_test_datafile('bad_trailer_1.txt', stt_user, stt) - -@pytest.mark.django_db() -def test_parse_bad_trailer_file(bad_trailer_file): +@pytest.mark.django_db +def test_parse_bad_trailer_file(bad_trailer_file, dfs): """Test parsing bad_trailer_1.""" + dfs.datafile = bad_trailer_file + dfs.save() + errors = parse.parse_datafile(bad_trailer_file) parser_errors = ParserError.objects.filter(file=bad_trailer_file) @@ -251,7 +308,7 @@ def test_parse_bad_trailer_file(bad_trailer_file): @pytest.fixture def bad_trailer_file_2(stt_user, stt): """Fixture 
for bad_trailer_2.""" - return create_test_datafile('bad_trailer_2.txt', stt_user, stt) + return util.create_test_datafile('bad_trailer_2.txt', stt_user, stt) @pytest.mark.django_db() @@ -298,15 +355,35 @@ def test_parse_bad_trailer_file2(bad_trailer_file_2): @pytest.fixture def empty_file(stt_user, stt): """Fixture for empty_file.""" - return create_test_datafile('empty_file', stt_user, stt) + return util.create_test_datafile('empty_file', stt_user, stt) -@pytest.mark.django_db() -def test_parse_empty_file(empty_file): +@pytest.mark.django_db +def test_parse_empty_file(empty_file, dfs): """Test parsing of empty_file.""" + dfs.datafile = empty_file + dfs.save() errors = parse.parse_datafile(empty_file) + dfs.status = dfs.get_status() + dfs.case_aggregates = util.case_aggregates_by_month(empty_file, dfs.status) + + assert dfs.status == DataFileSummary.Status.REJECTED + assert dfs.case_aggregates == {'rejected': 2, + 'months': [ + {'accepted_without_errors': 'N/A', + 'accepted_with_errors': 'N/A', + 'month': 'Oct'}, + {'accepted_without_errors': 'N/A', + 'accepted_with_errors': 'N/A', + 'month': 'Nov'}, + {'accepted_without_errors': 'N/A', + 'accepted_with_errors': 'N/A', + 'month': 'Dec'} + ]} + parser_errors = ParserError.objects.filter(file=empty_file).order_by('id') + assert parser_errors.count() == 2 err = parser_errors.first() @@ -324,18 +401,35 @@ def test_parse_empty_file(empty_file): @pytest.fixture def small_ssp_section1_datafile(stt_user, stt): """Fixture for small_ssp_section1.""" - return create_test_datafile('small_ssp_section1.txt', stt_user, stt, 'SSP Active Case Data') + return util.create_test_datafile('small_ssp_section1.txt', stt_user, stt, 'SSP Active Case Data') -@pytest.mark.django_db() -def test_parse_small_ssp_section1_datafile(small_ssp_section1_datafile): +@pytest.mark.django_db +def test_parse_small_ssp_section1_datafile(small_ssp_section1_datafile, dfs): """Test parsing small_ssp_section1_datafile.""" expected_m1_record_count = 5 
expected_m2_record_count = 6 expected_m3_record_count = 8 + small_ssp_section1_datafile.year = 2019 + small_ssp_section1_datafile.quarter = 'Q1' + small_ssp_section1_datafile.save() + + dfs.datafile = small_ssp_section1_datafile + dfs.save() + errors = parse.parse_datafile(small_ssp_section1_datafile) + dfs.status = dfs.get_status() + assert dfs.status == DataFileSummary.Status.ACCEPTED_WITH_ERRORS + dfs.case_aggregates = util.case_aggregates_by_month(dfs.datafile, dfs.status) + assert dfs.case_aggregates == {'rejected': 1, + 'months': [ + {'accepted_without_errors': 5, 'accepted_with_errors': 0, 'month': 'Oct'}, + {'accepted_without_errors': 0, 'accepted_with_errors': 0, 'month': 'Nov'}, + {'accepted_without_errors': 0, 'accepted_with_errors': 0, 'month': 'Dec'} + ]} + parser_errors = ParserError.objects.filter(file=small_ssp_section1_datafile) assert parser_errors.count() == 1 @@ -357,7 +451,7 @@ def test_parse_small_ssp_section1_datafile(small_ssp_section1_datafile): @pytest.fixture def ssp_section1_datafile(stt_user, stt): """Fixture for ssp_section1_datafile.""" - return create_test_datafile('ssp_section1_datafile.txt', stt_user, stt, 'SSP Active Case Data') + return util.create_test_datafile('ssp_section1_datafile.txt', stt_user, stt, 'SSP Active Case Data') @pytest.mark.django_db() @@ -387,13 +481,26 @@ def test_parse_ssp_section1_datafile(ssp_section1_datafile): @pytest.fixture def small_tanf_section1_datafile(stt_user, stt): """Fixture for small_tanf_section1.""" - return create_test_datafile('small_tanf_section1.txt', stt_user, stt) + return util.create_test_datafile('small_tanf_section1.txt', stt_user, stt) -@pytest.mark.django_db() -def test_parse_tanf_section1_datafile(small_tanf_section1_datafile): +@pytest.mark.django_db +def test_parse_tanf_section1_datafile(small_tanf_section1_datafile, dfs): """Test parsing of small_tanf_section1_datafile and validate T2 model data.""" + dfs.datafile = small_tanf_section1_datafile + dfs.save() + 
parse.parse_datafile(small_tanf_section1_datafile) + dfs.status = dfs.get_status() + assert dfs.status == DataFileSummary.Status.ACCEPTED + dfs.case_aggregates = util.case_aggregates_by_month(dfs.datafile, dfs.status) + assert dfs.case_aggregates == {'rejected': 0, + 'months': [ + {'accepted_without_errors': 5, 'accepted_with_errors': 0, 'month': 'Oct'}, + {'accepted_without_errors': 0, 'accepted_with_errors': 0, 'month': 'Nov'}, + {'accepted_without_errors': 0, 'accepted_with_errors': 0, 'month': 'Dec'} + ]} + assert TANF_T2.objects.count() == 5 t2_models = TANF_T2.objects.all() @@ -410,6 +517,7 @@ def test_parse_tanf_section1_datafile(small_tanf_section1_datafile): assert t2_2.FAMILY_AFFILIATION == 2 assert t2_2.OTHER_UNEARNED_INCOME == '0000' + @pytest.mark.django_db() def test_parse_tanf_section1_datafile_obj_counts(small_tanf_section1_datafile): """Test parsing of small_tanf_section1_datafile in general.""" @@ -444,7 +552,8 @@ def test_parse_tanf_section1_datafile_t3s(small_tanf_section1_datafile): @pytest.fixture def super_big_s1_file(stt_user, stt): """Fixture for ADS.E2J.NDM1.TS53_fake.""" - return create_test_datafile('ADS.E2J.NDM1.TS53_fake', stt_user, stt) + return util.create_test_datafile('ADS.E2J.NDM1.TS53_fake', stt_user, stt) + @pytest.mark.django_db() def test_parse_super_big_s1_file(super_big_s1_file): @@ -458,9 +567,10 @@ def test_parse_super_big_s1_file(super_big_s1_file): @pytest.fixture def super_big_s1_rollback_file(stt_user, stt): """Fixture for ADS.E2J.NDM1.TS53_fake.rollback.""" - return create_test_datafile('ADS.E2J.NDM1.TS53_fake.rollback', stt_user, stt) + return util.create_test_datafile('ADS.E2J.NDM1.TS53_fake.rollback', stt_user, stt) @pytest.mark.django_db() +@pytest.mark.skip(reason="cuz") # big_files def test_parse_super_big_s1_file_with_rollback(super_big_s1_rollback_file): """Test parsing of super_big_s1_rollback_file. 
@@ -487,16 +597,22 @@ def test_parse_super_big_s1_file_with_rollback(super_big_s1_rollback_file): @pytest.fixture def bad_tanf_s1__row_missing_required_field(stt_user, stt): """Fixture for small_tanf_section1.""" - return create_test_datafile('small_bad_tanf_s1', stt_user, stt) + return util.create_test_datafile('small_bad_tanf_s1', stt_user, stt) -@pytest.mark.django_db() -def test_parse_bad_tfs1_missing_required(bad_tanf_s1__row_missing_required_field): +@pytest.mark.django_db +def test_parse_bad_tfs1_missing_required(bad_tanf_s1__row_missing_required_field, dfs): """Test parsing a bad TANF Section 1 submission where a row is missing required data.""" + dfs.datafile = bad_tanf_s1__row_missing_required_field + dfs.save() + parse.parse_datafile(bad_tanf_s1__row_missing_required_field) + assert dfs.get_status() == DataFileSummary.Status.ACCEPTED_WITH_ERRORS + parser_errors = ParserError.objects.filter(file=bad_tanf_s1__row_missing_required_field) assert parser_errors.count() == 4 + [print(parser_error) for parser_error in parser_errors] error_message = 'RPT_MONTH_YEAR is required but a value was not provided.' row_2_error = parser_errors.get(row_number=2, error_message=error_message) @@ -517,7 +633,7 @@ def test_parse_bad_tfs1_missing_required(bad_tanf_s1__row_missing_required_field assert row_4_error.content_type.model == 'tanf_t3' assert row_4_error.object_id is not None - error_message = 'Record Type is missing from record.' + error_message = 'Unknown Record_Type was found.' 
row_5_error = parser_errors.get(row_number=5, error_message=error_message) assert row_5_error.error_type == ParserErrorCategoryChoices.PRE_CHECK assert row_5_error.error_message == error_message @@ -528,7 +644,7 @@ def test_parse_bad_tfs1_missing_required(bad_tanf_s1__row_missing_required_field @pytest.fixture def bad_ssp_s1__row_missing_required_field(stt_user, stt): """Fixture for ssp_section1_datafile.""" - return create_test_datafile('small_bad_ssp_s1', stt_user, stt, 'SSP Active Case Data') + return util.create_test_datafile('small_bad_ssp_s1', stt_user, stt, 'SSP Active Case Data') @pytest.mark.django_db() @@ -559,7 +675,7 @@ def test_parse_bad_ssp_s1_missing_required(bad_ssp_s1__row_missing_required_fiel row_5_error = parser_errors.get(row_number=5) assert row_5_error.error_type == ParserErrorCategoryChoices.PRE_CHECK - assert row_5_error.error_message == 'Record Type is missing from record.' + assert row_5_error.error_message == 'Unknown Record_Type was found.' assert row_5_error.content_type is None assert row_5_error.object_id is None @@ -577,10 +693,71 @@ def test_parse_bad_ssp_s1_missing_required(bad_ssp_s1__row_missing_required_fiel 'trailer': [trailer_error], } +@pytest.mark.django_db +def test_dfs_set_case_aggregates(test_datafile, dfs): + """Test that the case aggregates are set correctly.""" + test_datafile.section = 'Active Case Data' + test_datafile.save() + parse.parse_datafile(test_datafile) # this still needs to execute to create db objects to be queried + dfs.file = test_datafile + dfs.save() + dfs.status = dfs.get_status() + dfs.case_aggregates = util.case_aggregates_by_month(test_datafile, dfs.status) + dfs.save() + + for month in dfs.case_aggregates['months']: + if month['month'] == 'Oct': + assert month['accepted_without_errors'] == 1 + assert month['accepted_with_errors'] == 0 + +@pytest.mark.django_db +def test_get_schema_options(dfs): + """Test use-cases for translating strings to named object references.""" + ''' + text -> section + 
text -> models{} YES + text -> model YES + datafile -> model + ^ section -> program -> model + datafile -> text + model -> text YES + section -> text + + text**: input string from the header/file + ''' + + # from text: + schema = parse.get_schema_manager('T1xx', 'A', 'TAN') + assert isinstance(schema, util.SchemaManager) + assert schema == schema_defs.tanf.t1 + + # get model + models = util.get_program_models('TAN', 'A') + assert models == { + 'T1': schema_defs.tanf.t1, + 'T2': schema_defs.tanf.t2, + 'T3': schema_defs.tanf.t3, + } + + model = util.get_program_model('TAN', 'A', 'T1') + assert model == schema_defs.tanf.t1 + # get section + section = util.get_section_reference('TAN', 'C') + assert section == DataFile.Section.CLOSED_CASE_DATA + + # from datafile: + # get model(s) + # get section str + + # from model: + # get text + # get section str + # get ref section + @pytest.fixture def small_tanf_section2_file(stt_user, stt): - """Fixture for small_tanf_section2.""" - return create_test_datafile('small_tanf_section2.txt', stt_user, stt, 'Closed Case Data') + """Fixture for tanf section2 datafile.""" + return util.create_test_datafile('small_tanf_section2.txt', stt_user, stt, 'Closed Case Data') @pytest.mark.django_db() def test_parse_small_tanf_section2_file(small_tanf_section2_file): @@ -606,7 +783,7 @@ def test_parse_small_tanf_section2_file(small_tanf_section2_file): @pytest.fixture def tanf_section2_file(stt_user, stt): """Fixture for ADS.E2J.FTP2.TS06.""" - return create_test_datafile('ADS.E2J.FTP2.TS06', stt_user, stt, 'Closed Case Data') + return util.create_test_datafile('ADS.E2J.FTP2.TS06', stt_user, stt, 'Closed Case Data') @pytest.mark.django_db() def test_parse_tanf_section2_file(tanf_section2_file): @@ -627,7 +804,7 @@ def test_parse_tanf_section2_file(tanf_section2_file): @pytest.fixture def tanf_section3_file(stt_user, stt): """Fixture for ADS.E2J.FTP3.TS06.""" - return create_test_datafile('ADS.E2J.FTP3.TS06', stt_user, stt, "Aggregate Data") + 
return util.create_test_datafile('ADS.E2J.FTP3.TS06', stt_user, stt, "Aggregate Data") @pytest.mark.django_db() def test_parse_tanf_section3_file(tanf_section3_file): diff --git a/tdrs-backend/tdpservice/parsers/urls.py b/tdrs-backend/tdpservice/parsers/urls.py index f2226e0ab..cd1d560d3 100644 --- a/tdrs-backend/tdpservice/parsers/urls.py +++ b/tdrs-backend/tdpservice/parsers/urls.py @@ -1,12 +1,13 @@ """Routing for DataFiles.""" from django.urls import path, include from rest_framework.routers import DefaultRouter -from .views import ParsingErrorViewSet +from .views import ParsingErrorViewSet, DataFileSummaryViewSet router = DefaultRouter() -router.register("", ParsingErrorViewSet) +router.register("parsing_errors", ParsingErrorViewSet) +router.register("dfs", DataFileSummaryViewSet) urlpatterns = [ - path('parsing_errors/', include(router.urls)), + path('', include(router.urls)), ] diff --git a/tdrs-backend/tdpservice/parsers/util.py b/tdrs-backend/tdpservice/parsers/util.py index 0e50bce1b..accc36269 100644 --- a/tdrs-backend/tdpservice/parsers/util.py +++ b/tdrs-backend/tdpservice/parsers/util.py @@ -1,17 +1,22 @@ """Utility file for functions shared between all parsers even preparser.""" from .models import ParserError from django.contrib.contenttypes.models import ContentType +from . 
import schema_defs from tdpservice.data_files.models import DataFile +from datetime import datetime from pathlib import Path from .fields import TransformField -from datetime import datetime +import logging + +logger = logging.getLogger(__name__) + def create_test_datafile(filename, stt_user, stt, section='Active Case Data'): """Create a test DataFile instance with the given file attached.""" path = str(Path(__file__).parent.joinpath('test/data')) + f'/{filename}' datafile = DataFile.create_new_version({ - 'quarter': '4', - 'year': 2022, + 'quarter': 'Q1', + 'year': 2021, 'section': section, 'user': stt_user, 'stt': stt @@ -88,9 +93,158 @@ def contains_encrypted_indicator(line, encryption_field): return encryption_field.parse_value(line) == "E" return False -def month_to_int(month): - """Return the integer value of a month.""" - return datetime.strptime(month, '%b').strftime('%m') +def get_schema_options(program, section, query=None, model=None, model_name=None): + """Centralized function to return the appropriate schema for a given program, section, and query. + + TODO: need to rework this docstring as it is outdated hence the weird ';;' for some of them. + + @param program: the abbreviated program type (e.g., 'TAN') + @param section: the section of the file (e.g., 'A');; or ACTIVE_CASE_DATA + @param query: the query for section_names (e.g., 'section', 'models', etc.) 
+ @return: the appropriate references (e.g., ACTIVE_CASE_DATA or {t1,t2,t3}) ;; returning 'A' + """ + schema_options = { + 'TAN': { + 'A': { + 'section': DataFile.Section.ACTIVE_CASE_DATA, + 'models': { + 'T1': schema_defs.tanf.t1, + 'T2': schema_defs.tanf.t2, + 'T3': schema_defs.tanf.t3, + } + }, + 'C': { + 'section': DataFile.Section.CLOSED_CASE_DATA, + 'models': { + 'T4': schema_defs.tanf.t4, + 'T5': schema_defs.tanf.t5, + } + }, + 'G': { + 'section': DataFile.Section.AGGREGATE_DATA, + 'models': { + 'T6': schema_defs.tanf.t6, + } + }, + 'S': { + 'section': DataFile.Section.STRATUM_DATA, + 'models': { + # 'T7': schema_defs.tanf.t7, + } + } + }, + 'SSP': { + 'A': { + 'section': DataFile.Section.SSP_ACTIVE_CASE_DATA, + 'models': { + 'M1': schema_defs.ssp.m1, + 'M2': schema_defs.ssp.m2, + 'M3': schema_defs.ssp.m3, + } + }, + 'C': { + 'section': DataFile.Section.SSP_CLOSED_CASE_DATA, + 'models': { + # 'S4': schema_defs.ssp.m4, + # 'S5': schema_defs.ssp.m5, + } + }, + 'G': { + 'section': DataFile.Section.SSP_AGGREGATE_DATA, + 'models': { + # 'S6': schema_defs.ssp.m6, + } + }, + 'S': { + 'section': DataFile.Section.SSP_STRATUM_DATA, + 'models': { + # 'S7': schema_defs.ssp.m7, + } + } + }, + # TODO: tribal tanf + } + + if query == "text": + for prog_name, prog_dict in schema_options.items(): + for sect, val in prog_dict.items(): + if val['section'] == section: + return {'program_type': prog_name, 'section': sect} + raise ValueError("Model not found in schema_defs") + elif query == "section": + return schema_options.get(program, {}).get(section, None)[query] + elif query == "models": + links = schema_options.get(program, {}).get(section, None) + + # if query is not chosen or wrong input, return all options + # query = 'models', model = 'T1' + models = links.get(query, links) + + if model_name is None: + return models + elif model_name not in models.keys(): + logger.debug(f"Model {model_name} not found in schema_defs") + return [] # intentionally trigger the error_msg for 
unknown record type + else: + return models.get(model_name, models) + + +''' +text -> section YES +text -> models{} YES +text -> model YES +datafile -> model + ^ section -> program -> model +datafile -> text +model -> text YES +section -> text + +text**: input string from the header/file +''' + +def get_program_models(str_prog, str_section): + """Return the models dict for a given program and section.""" + return get_schema_options(program=str_prog, section=str_section, query='models') + +def get_program_model(str_prog, str_section, str_model): + """Return singular model for a given program, section, and name.""" + return get_schema_options(program=str_prog, section=str_section, query='models', model_name=str_model) + +def get_section_reference(str_prog, str_section): + """Return the named section reference for a given program and section.""" + return get_schema_options(program=str_prog, section=str_section, query='section') + +def get_text_from_df(df): + """Return the short-hand text for program, section for a given datafile.""" + return get_schema_options("", section=df.section, query='text') + +def get_prog_from_section(str_section): + """Return the program type for a given section.""" + # e.g., 'SSP Closed Case Data' + if str_section.startswith('SSP'): + return 'SSP' + elif str_section.startswith('Tribal'): + return 'TAN' # problematic, do we need to infer tribal entirely from tribe/fips code? 
+ else: + return 'TAN' + + # TODO: if given a datafile (section), we can reverse back to the program b/c the + # section string has "tribal/ssp" in it, then process of elimination we have tanf + +def get_schema(line, section, program_type): + """Return the appropriate schema for the line.""" + line_type = line[0:2] + return get_schema_options(program_type, section, query='models', model_name=line_type) + +def fiscal_to_calendar(year, fiscal_quarter): + """Decrement the input quarter text by one.""" + array = [1, 2, 3, 4] # wrapping around an array + int_qtr = int(fiscal_quarter[1:]) # remove the 'Q', e.g., 'Q1' -> '1' + if int_qtr == 1: + year = year - 1 + + ind_qtr = array.index(int_qtr) # get the index so we can easily wrap-around end of array + return year, "Q{}".format(array[ind_qtr - 1]) # return the previous quarter def transform_to_months(quarter): """Return a list of months in a quarter.""" @@ -105,3 +259,59 @@ def transform_to_months(quarter): return ["Oct", "Nov", "Dec"] case _: raise ValueError("Invalid quarter value.") + + +def month_to_int(month): + """Return the integer value of a month.""" + return datetime.strptime(month, '%b').strftime('%m') + + +def case_aggregates_by_month(df, dfs_status): + """Return case aggregates by month.""" + section = str(df.section) # section -> text + program_type = get_prog_from_section(section) # section -> program_type -> text + + # from datafile year/quarter, generate short month names for each month in quarter ala 'Jan', 'Feb', 'Mar' + calendar_year, calendar_qtr = fiscal_to_calendar(df.year, df.quarter) + month_list = transform_to_months(calendar_qtr) + + short_section = get_text_from_df(df)['section'] + schema_models_dict = get_program_models(program_type, short_section) + schema_models = [model for model in schema_models_dict.values()] + + aggregate_data = {"months": [], "rejected": 0} + for month in month_list: + total = 0 + cases_with_errors = 0 + accepted = 0 + month_int = month_to_int(month) + rpt_month_year 
= int(f"{calendar_year}{month_int}") + + if dfs_status == "Rejected": + # we need to be careful here on examples of bad headers or empty files, since no month will be found + # but we can rely on the frontend submitted year-quarter to still generate the list of months + aggregate_data["months"].append({"accepted_with_errors": "N/A", + "accepted_without_errors": "N/A", + "month": month}) + continue + + case_numbers = set() + for schema_model in schema_models: + if isinstance(schema_model, SchemaManager): + schema_model = schema_model.schemas[0] + + curr_case_numbers = set(schema_model.model.objects.filter(datafile=df).filter(RPT_MONTH_YEAR=rpt_month_year) + .distinct("CASE_NUMBER").values_list("CASE_NUMBER", flat=True)) + case_numbers = case_numbers.union(curr_case_numbers) + + total += len(case_numbers) + cases_with_errors += ParserError.objects.filter(case_number__in=case_numbers).distinct('case_number').count() + accepted = total - cases_with_errors + + aggregate_data['months'].append({"month": month, + "accepted_without_errors": accepted, + "accepted_with_errors": cases_with_errors}) + + aggregate_data['rejected'] = ParserError.objects.filter(file=df).filter(case_number=None).count() + + return aggregate_data diff --git a/tdrs-backend/tdpservice/parsers/validators.py b/tdrs-backend/tdpservice/parsers/validators.py index c811a6ef1..a8722794d 100644 --- a/tdrs-backend/tdpservice/parsers/validators.py +++ b/tdrs-backend/tdpservice/parsers/validators.py @@ -1,8 +1,6 @@ """Generic parser validator functions for use in schema definitions.""" -from .util import generate_parser_error from .models import ParserErrorCategoryChoices -from tdpservice.data_files.models import DataFile from datetime import date # higher order validator func @@ -348,76 +346,14 @@ def validate(instance): return (True, None) return lambda instance: validate(instance) -def validate_single_header_trailer(datafile): - """Validate that a raw datafile has one trailer and one footer.""" - line_number 
= 0 - headers = 0 - trailers = 0 - is_valid = True - error_message = None - - for rawline in datafile.file: - line = rawline.decode() - line_number += 1 - - if line.startswith('HEADER'): - headers += 1 - elif line.startswith('TRAILER'): - trailers += 1 - - if headers > 1: - is_valid = False - error_message = 'Multiple headers found.' - break - - if trailers > 1: - is_valid = False - error_message = 'Multiple trailers found.' - break - - if headers == 0: - is_valid = False - error_message = 'No headers found.' - error = None - if not is_valid: - error = generate_parser_error( - datafile=datafile, - line_number=line_number, - schema=None, - error_category=ParserErrorCategoryChoices.PRE_CHECK, - error_message=error_message, - record=None, - field=None - ) - - return is_valid, error - - -def validate_header_section_matches_submission(datafile, program_type, section): +def validate_header_section_matches_submission(datafile, section, generate_error): """Validate header section matches submission section.""" - section_names = { - 'TAN': { - 'A': DataFile.Section.ACTIVE_CASE_DATA, - 'C': DataFile.Section.CLOSED_CASE_DATA, - 'G': DataFile.Section.AGGREGATE_DATA, - 'S': DataFile.Section.STRATUM_DATA, - }, - 'SSP': { - 'A': DataFile.Section.SSP_ACTIVE_CASE_DATA, - 'C': DataFile.Section.SSP_CLOSED_CASE_DATA, - 'G': DataFile.Section.SSP_AGGREGATE_DATA, - 'S': DataFile.Section.SSP_STRATUM_DATA, - }, - } - - is_valid = datafile.section == section_names.get(program_type, {}).get(section) + is_valid = datafile.section == section error = None if not is_valid: - error = generate_parser_error( - datafile=datafile, - line_number=1, + error = generate_error( schema=None, error_category=ParserErrorCategoryChoices.PRE_CHECK, error_message=f"Data does not match the expected layout for {datafile.section}.", diff --git a/tdrs-backend/tdpservice/parsers/views.py b/tdrs-backend/tdpservice/parsers/views.py index d39965ee3..8e40b79e4 100644 --- a/tdrs-backend/tdpservice/parsers/views.py +++ 
b/tdrs-backend/tdpservice/parsers/views.py @@ -2,8 +2,8 @@ from tdpservice.users.permissions import IsApprovedPermission from rest_framework.viewsets import ModelViewSet from rest_framework.response import Response -from .serializers import ParsingErrorSerializer -from .models import ParserError +from .serializers import ParsingErrorSerializer, DataFileSummarySerializer +from .models import ParserError, DataFileSummary import logging import base64 from io import BytesIO @@ -69,3 +69,11 @@ def _get_xls_serialized_file(self, data): col += 1 workbook.close() return {"data": data, "xls_report": base64.b64encode(output.getvalue()).decode("utf-8")} + + +class DataFileSummaryViewSet(ModelViewSet): + """DataFileSummary file views.""" + + queryset = DataFileSummary.objects.all() + serializer_class = DataFileSummarySerializer + permission_classes = [IsApprovedPermission] diff --git a/tdrs-backend/tdpservice/scheduling/parser_task.py b/tdrs-backend/tdpservice/scheduling/parser_task.py index 4ffd91277..b1e5f8d5c 100644 --- a/tdrs-backend/tdpservice/scheduling/parser_task.py +++ b/tdrs-backend/tdpservice/scheduling/parser_task.py @@ -4,6 +4,9 @@ import logging from tdpservice.data_files.models import DataFile from tdpservice.parsers.parse import parse_datafile +from tdpservice.parsers.models import DataFileSummary +from tdpservice.parsers.util import case_aggregates_by_month + logger = logging.getLogger(__name__) @@ -17,5 +20,9 @@ def parse(data_file_id): data_file = DataFile.objects.get(id=data_file_id) logger.info(f"DataFile parsing started for file -> {repr(data_file)}") + dfs = DataFileSummary.objects.create(datafile=data_file, status=DataFileSummary.Status.PENDING) errors = parse_datafile(data_file) - logger.info(f"DataFile parsing finished with {len(errors)} errors, for file -> {repr(data_file)}.") + dfs.status = dfs.get_status() + dfs.case_aggregates = case_aggregates_by_month(data_file, dfs.status) + dfs.save() + logger.info(f"Parsing finished for file -> 
{repr(data_file)} with status {dfs.status} and {len(errors)} errors.") diff --git a/tdrs-backend/tdpservice/users/test/test_permissions.py b/tdrs-backend/tdpservice/users/test/test_permissions.py index 984e3b226..2f25347aa 100644 --- a/tdrs-backend/tdpservice/users/test/test_permissions.py +++ b/tdrs-backend/tdpservice/users/test/test_permissions.py @@ -111,6 +111,9 @@ def test_ofa_system_admin_permissions(ofa_system_admin): 'parsers.add_parsererror', 'parsers.change_parsererror', 'parsers.view_parsererror', + 'parsers.add_datafilesummary', + 'parsers.view_datafilesummary', + 'parsers.change_datafilesummary', 'search_indexes.add_ssp_m1', 'search_indexes.view_ssp_m1', 'search_indexes.change_ssp_m1', diff --git a/tdrs-frontend/docker-compose.yml b/tdrs-frontend/docker-compose.yml index 0e6a28283..d75772fa5 100644 --- a/tdrs-frontend/docker-compose.yml +++ b/tdrs-frontend/docker-compose.yml @@ -32,7 +32,7 @@ services: command: > /bin/sh -c "echo 'starting nginx' && - envsubst '$${BACK_END}' < /etc/nginx/locations.conf > /etc/nginx/locations_.conf && + envsubst '$${BACK_END}' < /etc/nginx/locations.conf > /etc/nginx/locations_.conf && rm /etc/nginx/locations.conf && cp /etc/nginx/locations_.conf /etc/nginx/locations.conf && envsubst ' From c94b8f7302733767e6cd3b69adaf92e0aa4dede7 Mon Sep 17 00:00:00 2001 From: Smithh-Co <121890311+Smithh-Co@users.noreply.github.com> Date: Thu, 28 Sep 2023 10:57:06 -0700 Subject: [PATCH 2/4] Create sprint-81-summary.md (#2715) --- docs/Sprint-Review/sprint-81-summary.md | 55 +++++++++++++++++++++++++ 1 file changed, 55 insertions(+) create mode 100644 docs/Sprint-Review/sprint-81-summary.md diff --git a/docs/Sprint-Review/sprint-81-summary.md b/docs/Sprint-Review/sprint-81-summary.md new file mode 100644 index 000000000..439a9f9c7 --- /dev/null +++ b/docs/Sprint-Review/sprint-81-summary.md @@ -0,0 +1,55 @@ + +# Sprint 81 Summary + +08/30/23 - 09/12/23 + +Velocity: Dev (13) + +## Sprint Goal +* Continue parsing engine development for 
TANF Sections (02 and 04) and close out submission history and metadata workflows (1613/12/10). +* UX to continue regional staff and in-app messaging research, errors audit approach, and bridge onboarding to >90% of total users +* DevOps to investigate singular ClamAV (2429), resolve utility images for CircleCI and evaluate CI/CD pipeline. + + +## Tickets +### Completed/Merged +* [#2626 improve parsing logging](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2626) +* [#1109 TANF (02) Parsing and Validation](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1109) +* [#2116 Container Registry Creation](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2116) + +### Ready to Merge +* N/A + +### Submitted (QASP Review, OCIO Review) +* [#1613 As a developer, I need parsed file meta data (TANF Section 1)](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/board) + +### Closed (not merged) +* N/A + +## Moved to Next Sprint (Blocked, Raft Review, In Progress, Current Sprint Backlog) +### In Progress + +* [#2429 Singular ClamAV scanner](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2429) +* [#1111 TANF (04) Parsing and Validation](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1111) +* [#2664 (bug) file extension](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2664) +* [#2695 space-filled values update (TANF (01))](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2695) +* [#2411 As system admin, I want to view metadata on parsed datafiles](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2411) +* [#2536 [spike] Cat 4
validation](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/2536) + + +### Blocked +* N/A + + +### Raft Review +* [#1610 As a user, I need information about the acceptance of my data and a link for the error report](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1610) +* [#1612 Detailed case level metadata](https://app.zenhub.com/workspaces/sprint-board-5f18ab06dfd91c000f7e682e/issues/gh/raft-tech/tanf-app/1612) + + +### Demo +* Internal: + * 1109 - TANF Sec (02) + * 2626 - Parsing logging enhancements + + + From 8289c015dc9aa9992cfc23cb731645fbbaf8fc69 Mon Sep 17 00:00:00 2001 From: raftmsohani <97037188+raftmsohani@users.noreply.github.com> Date: Thu, 28 Sep 2023 14:46:59 -0400 Subject: [PATCH 3/4] DB-drop-and-reset-write-down (#2710) * added md file for dropping DB * added section for db_backup & merged into README * Update CloudFoundry-DB-Connection.md --------- Co-authored-by: Andrew <84722778+andrew-jameson@users.noreply.github.com> --- .../CloudFoundry-DB-Connection.md | 33 +++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/docs/Technical-Documentation/CloudFoundry-DB-Connection.md b/docs/Technical-Documentation/CloudFoundry-DB-Connection.md index 3a72e28e7..c7ca0efe8 100644 --- a/docs/Technical-Documentation/CloudFoundry-DB-Connection.md +++ b/docs/Technical-Documentation/CloudFoundry-DB-Connection.md @@ -23,3 +23,36 @@ From [this github](https://github.com/cloud-gov/cf-service-connect) which has so > `cf connect-to-service tdp-backend- tdp-db-dev` + +# How to DROP existing DB and Recreate a fresh DB + +### Connecting to DB service +First step is to connect to the instance DB (see above). + +#### Optional: DB backup +Before deleting the DB and recreating a fresh DB, you might want to create a backup from the existing data in case you decide to revert the DB changes back. 
+ +For creating a DB backup, please see: `/tdpservice/scheduling/BACKUP_README.md` + +#### Drop and Recreate + +e.g: +>`cf connect-to-service tdp-backend-qasp tdp-db-dev` + +After the connection to the DB is made (the step above opens a psql connection), run the following Postgres commands: + +1. List the DBs: `\l` +2. Postgres will not _DROP_ a database while you are connected to it. As such, you will have to connect to a different DB using the command: +>`\c {a_database}` + + A good candidate is: +>`\c postgres` +3. Find the name of the DB associated with the instance, e.g. `tdp_db_dev_qasp`. +4. Use the following command to delete the DB: +>`DROP DATABASE {DB_NAME};` +5. Use the following command to create the DB: +>`CREATE DATABASE {DB_NAME};` + +After the DB is created, it is completely empty, so we will need to redeploy the app to create the tables (or alternatively restore a good backup), and then run the `populatestts` command to add STT data to the empty DB: + +>`./manage.py populatestts` From fb6cc5f847ad4c2a9e670e3a2ad361af1c9f3ef8 Mon Sep 17 00:00:00 2001 From: jtimpe <111305129+jtimpe@users.noreply.github.com> Date: Fri, 29 Sep 2023 11:56:31 -0400 Subject: [PATCH 4/4] Feature/1111 tanf section 4 (#2657) * - ADding t2 schema * - small fixes to naming - added t2.py - Updating var names to match field names * - Added new doc for T2 - Updated model fo rT2 - Added active parse function * - Added T3 schema defs * - Fixing most lint errors * - Updated T3 to multirow record * - Added unit test and data files - updated field names * - Updating var names to conform to DB max var name length * - Updating based on unit test failure * - adding datafile * - Adding unit tests for t2 and t3 * - Fixed field types - Updated test file * - Removing many migrations to consolodate into one migration * Revert "- Removing many migrations to consolodate into one migration" This reverts commit 1455ae4d334f72e250405803d61d26c0a894e886. 
* - Resolved test issues after merge * - Resolve lint errors * - Merged 1113.2 * - Removed unnecessary file * - Rename model fields - Combined migrations * - fixed spelling in test * - Fixed lint errors * rm commented validators * - Updated schema newlines to be consistent * - Updated field names - Updated tests - Updated migration * - consolodating migrations * - Updated readme and backup script * - Updated parse logic to batch create records * - Fixed lint errors * - Batching record serialization * - Updated parse logic * - Adding big s1 datafile * - fixing lint errors * - Removing test to see if its causing weird failure * - Updating based on comments - Removing big file since circi cant handle it * - fixing error in update method * - fixing error in update * - Removed extraneous seek * - updating ignore to ignore patch files * - Made evaluate_trailer pure/immutable * Revert "- Removing test to see if its causing weird failure" This reverts commit 64b78b737c97fb9bfa70217ff70ccffea8d85429. 
* - Use custom transactions while parsing - Handle transaction rollback on parse failure/error - Update tests to use transaction logic - Created new decorator to rollback database when assertion error is hit - Making elastic search log level configurable - Added test for inter parse rollback * - updated store logic to one liner - updated decorator to catch all exceptions - updated tests * - removed function - renamed test * - refactored multirecord schema to be a schema manager instead - updated parse functions to use the new layout - updated bulk create to manage batch size * - Name update for clarity * - Fix lint errors * - Changing batch size to avoid crash in circi * - Updated based on review - Updated tests to reflect line number * - Updated based on review/OH - Added extra transaction safety * - Updated view to show latest datafiles - Added admin filter to show newest or all datafile records - Updated indices to allow easier elastic queries * - Updated based on review comments * - Updated based on feedback for preparse error handling - updated tests * - Updated search indices to have parent FK * - Fix lint errors * - Updated submission tests - Moved create_datafile to util * - fix lint errors * - removing frontend filtering * - addding datafile to admin model * Revert "- addding datafile to admin model" This reverts commit 35a6f24c36c3a4c00ddcfc40f20833530b0199f4. * - Fixed issue where datafile FK wasnt populating - Regenerated migration * - Readding datafile back to admin view now that the error is resolved * - adding datafile back * Revert "- Readding datafile back to admin view now that the error is resolved" This reverts commit 2807425059fd1b5b355edfb16d30d170cf869d7b. * - Removed unnecessary fields - Updated dependencies - Updated filter * - Updated document to include required fields * - Moved datafile reference to avoid confusion * - Updating based on comments. 
- Added base class to keep things a little more DRY * - Refactored parsing to avoid custom transactions by leveraging the FK on the records. Rollbacks are a lot slower now, but hopefully that will happen much less versus successful parses. * - small optimization for the rollback * - Fix lint errors * - Removing commented code * - Installing build essentials * Revert "- Installing build essentials" This reverts commit 782a5f7d687e60965a9100520f027b9678dbac35. * - adding setup tools and wheel * Revert "- adding setup tools and wheel" This reverts commit f529728811fba242132b7c42f9e9e09d6037fa70. * - Updating dependencies to get around pep issue * - Pin factoryboy - fix lint error * - Updating manifest * - Added EncryptedField class - Updated schema's - Updated datafiles since all are encrypted * - Fix lint errors * - Added decryption for ssp * - Making the encrypted check stronger * - Added section two models, schemas, and an initial test * - Allowing fields to be nullable * - add missing field * - Consolodating migrations * - Consolodate migration - Add tests * - Added encrypted field * - Refactored fields and rowschema into seperate files - Updated encrypted field to take decryption function * - fix lint errors * - Fixed file spacing * - fix imports - fix lint errors * - Fix import error * - Fix failing test * - Revert erroneous change * - Updating item numbers * - Fixed item numbers * - Updating to accomodate item number as string * - Removed erroneous update that was causing error - Fixed whitespace in datafiles * - White space error * - Resolved error * - Fixing test * - fix lint errors * - Added field validators for T4/T5 * - Added cat3 validators * - Resolve lint and unit test errors * - small optimization to validator * - Added tests for cat3 validators for T5 record * - Fix lint errors * - updated fields to correct types * - update race/ethnicity * - updated tests post merge - added check in make_validator to just return false if the value is None * - Updated 
based on review feedback * - Fix lint error * - Updating to the correct types * - Resolve conflicts * - Moving field updates to seperate migration to see if they take place * Revert "- Moving field updates to seperate migration to see if they take place" This reverts commit a3214f23e0bd9cc4cc37ad40e63ae0c98b4ea64b. * - Revert migration 15 to original state - Generate new migration for field alterations * - Merged base branch and updated tests/factories * - Fixed test * - Fixed field names to correspond to model * - Remove duplicate function - Update ssn validator - Removed asserts with large number of parse errors * t7 model, index changes * add t7 to parser * update model mapping test * add s3-s10 * - Remove print statemtnt * - Resolved filter issue that made it seem like records werent being created - Added filter to remaining records * - fix lint errors * - correct form merge conflict * - Added specific validators to avoid duplication - Updated naming of validators * TANF Section 3 Parsing/Validation (#2649) * - Updated to support parsing section 3 data * - Added validators and tests for T6 record * - resolve lint errors * - Quick rename * - Update fields in test * - Fixed conflict * - Fix error from merge conflict * - Updated to create RPT_MONTH_YEAR from CALENDAR_QUARTER - Updated tests * - Added new validators for field change * - Remove debug code * - Genericize TransformField - Update all schemas to use TransformField - Move transforms to seperate file - Fix lint errors * - put kwargs in correct spot * - Very minor change to avoid setting field unnecessarily on all TransformFields * - Updated validator to call out name of field versus the value - Updated tests * - Fix lint * - Calling super to avoid duplicate code * - Added validators for transformed field * - Updating based on merge * - Fixed remaining merge conflicts * - Fix lint errors * - update error messages based on validator updates * - Corrected validator * - Resolved bug causing file to think it 
was encrytped after multiple submissions and changing the encryption header in between submissions * - UPdated migration - Updated test * - Added transform fields - UPdated migrations * - Stratified T7 to one record per month per section indicator and stratum. * - Updated tests * - Fix test * - Add T7 factory * - Fix lint error * - Remove test class until cat three validators exist * - fix lint errors * - Fixing colliding migration * - Generating schemas instead of defining * - Fix lint errors * - Updated based on review feedback * rm 17/18 * remake migration 18 * fix merge error * fix merge error * fix merge error * labels are stuck * rm * dont compute aggregates for section 3/4 * move save * Update tdrs-backend/tdpservice/parsers/test/factories.py Co-authored-by: Alex P. <63075587+ADPennington@users.noreply.github.com> * enhance t7 modeladmin * fix rpt_month_year off by 1 * lint * differentiate item 6a/6b/67 in t7 records --------- Co-authored-by: elipe17 Co-authored-by: Alex P <63075587+ADPennington@users.noreply.github.com> Co-authored-by: Eric Lipe <125676261+elipe17@users.noreply.github.com> --- tdrs-backend/tdpservice/parsers/parse.py | 44 ++++++++++++ .../parsers/schema_defs/tanf/__init__.py | 2 + .../tdpservice/parsers/schema_defs/tanf/t7.py | 65 +++++++++++++++++ .../parsers/test/data/ADS.E2J.FTP4.TS06 | 3 + .../tdpservice/parsers/test/factories.py | 14 ++++ .../tdpservice/parsers/test/test_parse.py | 31 +++++++- tdrs-backend/tdpservice/parsers/util.py | 2 +- .../tdpservice/scheduling/parser_task.py | 6 +- .../tdpservice/search_indexes/admin/tanf.py | 11 ++- .../search_indexes/documents/tanf.py | 13 ++-- .../migrations/0018_auto_20230920_1846.py | 71 +++++++++++++++++++ .../tdpservice/search_indexes/models/tanf.py | 16 ++--- .../search_indexes/test/test_model_mapping.py | 40 +++++------ 13 files changed, 272 insertions(+), 46 deletions(-) create mode 100644 tdrs-backend/tdpservice/parsers/schema_defs/tanf/t7.py create mode 100644 
tdrs-backend/tdpservice/parsers/test/data/ADS.E2J.FTP4.TS06 create mode 100644 tdrs-backend/tdpservice/search_indexes/migrations/0018_auto_20230920_1846.py diff --git a/tdrs-backend/tdpservice/parsers/parse.py b/tdrs-backend/tdpservice/parsers/parse.py index e8e4a3121..409d239b8 100644 --- a/tdrs-backend/tdpservice/parsers/parse.py +++ b/tdrs-backend/tdpservice/parsers/parse.py @@ -251,6 +251,50 @@ def manager_parse_line(line, schema_manager, generate_error, is_encrypted=False) ) ])] + +def get_schema_manager_options(program_type): + """Return the allowed schema options.""" + match program_type: + case 'TAN': + return { + 'A': { + 'T1': schema_defs.tanf.t1, + 'T2': schema_defs.tanf.t2, + 'T3': schema_defs.tanf.t3, + }, + 'C': { + 'T4': schema_defs.tanf.t4, + 'T5': schema_defs.tanf.t5, + }, + 'G': { + 'T6': schema_defs.tanf.t6, + }, + 'S': { + 'T7': schema_defs.tanf.t7, + }, + } + case 'SSP': + return { + 'A': { + 'M1': schema_defs.ssp.m1, + 'M2': schema_defs.ssp.m2, + 'M3': schema_defs.ssp.m3, + }, + 'C': { + # 'M4': schema_options.m4, + # 'M5': schema_options.m5, + }, + 'G': { + # 'M6': schema_options.m6, + }, + 'S': { + # 'M7': schema_options.m7, + }, + } + # case tribal? 
+ return None + + def get_schema_manager(line, section, program_type): """Return the appropriate schema for the line.""" line_type = line[0:2] diff --git a/tdrs-backend/tdpservice/parsers/schema_defs/tanf/__init__.py b/tdrs-backend/tdpservice/parsers/schema_defs/tanf/__init__.py index e82db47b6..7c1997236 100644 --- a/tdrs-backend/tdpservice/parsers/schema_defs/tanf/__init__.py +++ b/tdrs-backend/tdpservice/parsers/schema_defs/tanf/__init__.py @@ -4,6 +4,7 @@ from .t4 import t4 from .t5 import t5 from .t6 import t6 +from .t7 import t7 t1 = t1 t2 = t2 @@ -11,3 +12,4 @@ t4 = t4 t5 = t5 t6 = t6 +t7 = t7 diff --git a/tdrs-backend/tdpservice/parsers/schema_defs/tanf/t7.py b/tdrs-backend/tdpservice/parsers/schema_defs/tanf/t7.py new file mode 100644 index 000000000..2fcb4e0fd --- /dev/null +++ b/tdrs-backend/tdpservice/parsers/schema_defs/tanf/t7.py @@ -0,0 +1,65 @@ +"""Schema for TANF T7 Row.""" + +from ...util import SchemaManager +from ...fields import Field, TransformField +from ...row_schema import RowSchema +from ...transforms import calendar_quarter_to_rpt_month_year +from ... 
import validators +from tdpservice.search_indexes.models.tanf import TANF_T7 + +schemas = [] + +validator_index = 7 +section_ind_index = 7 +stratum_index = 8 +families_index = 10 +for i in range(1, 31): + month_index = (i - 1) % 3 + sub_item_labels = ['A', 'B', 'C'] + families_value_item_number = f"6{sub_item_labels[month_index]}" + + schemas.append( + RowSchema( + model=TANF_T7, + quiet_preparser_errors=i > 1, + preparsing_validators=[ + validators.notEmpty(0, 7), + validators.notEmpty(validator_index, validator_index + 24), + ], + postparsing_validators=[], + fields=[ + Field(item="0", name="RecordType", type='string', startIndex=0, endIndex=2, + required=True, validators=[]), + Field(item="3", name='CALENDAR_QUARTER', type='number', startIndex=2, endIndex=7, + required=True, validators=[validators.dateYearIsLargerThan(1998), + validators.quarterIsValid()]), + TransformField( + transform_func=calendar_quarter_to_rpt_month_year(month_index), + item="3A", + name='RPT_MONTH_YEAR', + type='number', + startIndex=2, + endIndex=7, + required=True, + validators=[ + validators.dateYearIsLargerThan(1998), + validators.dateMonthIsValid() + ] + ), + Field(item="4", name='TDRS_SECTION_IND', type='string', startIndex=section_ind_index, + endIndex=section_ind_index + 1, required=True, validators=[validators.oneOf(['1', '2'])]), + Field(item="5", name='STRATUM', type='string', startIndex=stratum_index, + endIndex=stratum_index + 2, required=True, validators=[validators.isInStringRange(1, 99)]), + Field(item=families_value_item_number, name='FAMILIES_MONTH', type='number', startIndex=families_index, + endIndex=families_index + 7, required=True, validators=[validators.isInLimits(0, 9999999)]), + ] + ) + ) + + index_offset = 0 if i % 3 != 0 else 24 + validator_index += index_offset + section_ind_index += index_offset + stratum_index += index_offset + families_index += 7 if i % 3 != 0 else 10 + +t7 = SchemaManager(schemas=schemas) diff --git 
a/tdrs-backend/tdpservice/parsers/test/data/ADS.E2J.FTP4.TS06 b/tdrs-backend/tdpservice/parsers/test/data/ADS.E2J.FTP4.TS06 new file mode 100644 index 000000000..5c344cf42 --- /dev/null +++ b/tdrs-backend/tdpservice/parsers/test/data/ADS.E2J.FTP4.TS06 @@ -0,0 +1,3 @@ +HEADER20204S06 TAN1 N +T720204101006853700680540068454103000312400037850003180104000347400036460003583106000044600004360000325299000506200036070003385202000039100002740000499 +TRAILER0000001 \ No newline at end of file diff --git a/tdrs-backend/tdpservice/parsers/test/factories.py b/tdrs-backend/tdpservice/parsers/test/factories.py index 8eb309b60..b12d3c5ad 100644 --- a/tdrs-backend/tdpservice/parsers/test/factories.py +++ b/tdrs-backend/tdpservice/parsers/test/factories.py @@ -335,3 +335,17 @@ class Meta: NUM_BIRTHS = 1 NUM_OUTWEDLOCK_BIRTHS = 1 NUM_CLOSED_CASES = 1 + +class TanfT7Factory(factory.django.DjangoModelFactory): + """Generate TANF T7 record for testing.""" + + class Meta: + """Hardcoded meta data for TANF_T7.""" + + model = "search_indexes.TANF_T7" + + CALENDAR_QUARTER = 20204 + RPT_MONTH_YEAR = 202011 + TDRS_SECTION_IND = '1' + STRATUM = '01' + FAMILIES_MONTH = 1 diff --git a/tdrs-backend/tdpservice/parsers/test/test_parse.py b/tdrs-backend/tdpservice/parsers/test/test_parse.py index fd794280b..9c785f79f 100644 --- a/tdrs-backend/tdpservice/parsers/test/test_parse.py +++ b/tdrs-backend/tdpservice/parsers/test/test_parse.py @@ -4,7 +4,7 @@ import pytest from .. 
import parse from ..models import ParserError, ParserErrorCategoryChoices, DataFileSummary -from tdpservice.search_indexes.models.tanf import TANF_T1, TANF_T2, TANF_T3, TANF_T4, TANF_T5, TANF_T6 +from tdpservice.search_indexes.models.tanf import TANF_T1, TANF_T2, TANF_T3, TANF_T4, TANF_T5, TANF_T6, TANF_T7 from tdpservice.search_indexes.models.ssp import SSP_M1, SSP_M2, SSP_M3 from .factories import DataFileSummaryFactory from tdpservice.data_files.models import DataFile @@ -833,3 +833,32 @@ def test_parse_tanf_section3_file(tanf_section3_file): assert first.NUM_CLOSED_CASES == 3884 assert second.NUM_CLOSED_CASES == 3881 assert third.NUM_CLOSED_CASES == 5453 + +@pytest.fixture +def tanf_section4_file(stt_user, stt): + """Fixture for ADS.E2J.FTP4.TS06.""" + return util.create_test_datafile('ADS.E2J.FTP4.TS06', stt_user, stt, "Stratum Data") + +@pytest.mark.django_db() +def test_parse_tanf_section4_file(tanf_section4_file): + """Test parsing TANF Section 4 submission.""" + parse.parse_datafile(tanf_section4_file) + + assert TANF_T7.objects.all().count() == 18 + + parser_errors = ParserError.objects.filter(file=tanf_section4_file) + assert parser_errors.count() == 0 + + t7_objs = TANF_T7.objects.all().order_by('FAMILIES_MONTH') + + first = t7_objs.first() + sixth = t7_objs[5] + + assert first.RPT_MONTH_YEAR == 202011 + assert sixth.RPT_MONTH_YEAR == 202012 + + assert first.TDRS_SECTION_IND == '2' + assert sixth.TDRS_SECTION_IND == '2' + + assert first.FAMILIES_MONTH == 274 + assert sixth.FAMILIES_MONTH == 499 diff --git a/tdrs-backend/tdpservice/parsers/util.py b/tdrs-backend/tdpservice/parsers/util.py index accc36269..073b7b8d8 100644 --- a/tdrs-backend/tdpservice/parsers/util.py +++ b/tdrs-backend/tdpservice/parsers/util.py @@ -129,7 +129,7 @@ def get_schema_options(program, section, query=None, model=None, model_name=None 'S': { 'section': DataFile.Section.STRATUM_DATA, 'models': { - # 'T7': schema_defs.tanf.t7, + 'T7': schema_defs.tanf.t7, } } }, diff --git 
a/tdrs-backend/tdpservice/scheduling/parser_task.py b/tdrs-backend/tdpservice/scheduling/parser_task.py index b1e5f8d5c..f9fab7f6f 100644 --- a/tdrs-backend/tdpservice/scheduling/parser_task.py +++ b/tdrs-backend/tdpservice/scheduling/parser_task.py @@ -23,6 +23,10 @@ def parse(data_file_id): dfs = DataFileSummary.objects.create(datafile=data_file, status=DataFileSummary.Status.PENDING) errors = parse_datafile(data_file) dfs.status = dfs.get_status() - dfs.case_aggregates = case_aggregates_by_month(data_file, dfs.status) + + if "Case Data" in data_file.section: + dfs.case_aggregates = case_aggregates_by_month(data_file, dfs.status) + dfs.save() + logger.info(f"Parsing finished for file -> {repr(data_file)} with status {dfs.status} and {len(errors)} errors.") diff --git a/tdrs-backend/tdpservice/search_indexes/admin/tanf.py b/tdrs-backend/tdpservice/search_indexes/admin/tanf.py index 88bce804f..af3429695 100644 --- a/tdrs-backend/tdpservice/search_indexes/admin/tanf.py +++ b/tdrs-backend/tdpservice/search_indexes/admin/tanf.py @@ -109,12 +109,17 @@ class TANF_T7Admin(admin.ModelAdmin): """ModelAdmin class for parsed T7 data files.""" list_display = [ - 'record', - 'rpt_month_year', + 'RecordType', + 'CALENDAR_QUARTER', + 'RPT_MONTH_YEAR', + 'TDRS_SECTION_IND', + 'STRATUM', + 'FAMILIES_MONTH', 'datafile', ] list_filter = [ + 'CALENDAR_QUARTER', CreationDateFilter, - 'rpt_month_year', + 'RPT_MONTH_YEAR', ] diff --git a/tdrs-backend/tdpservice/search_indexes/documents/tanf.py b/tdrs-backend/tdpservice/search_indexes/documents/tanf.py index aba080ff1..c613c50a2 100644 --- a/tdrs-backend/tdpservice/search_indexes/documents/tanf.py +++ b/tdrs-backend/tdpservice/search_indexes/documents/tanf.py @@ -343,11 +343,10 @@ class Django: model = TANF_T7 fields = [ - 'record', - 'rpt_month_year', - 'fips_code', - 'calendar_quarter', - 'tdrs_section_ind', - 'stratum', - 'families', + "RecordType", + "CALENDAR_QUARTER", + "RPT_MONTH_YEAR", + "TDRS_SECTION_IND", + "STRATUM", + 
"FAMILIES_MONTH", ] diff --git a/tdrs-backend/tdpservice/search_indexes/migrations/0018_auto_20230920_1846.py b/tdrs-backend/tdpservice/search_indexes/migrations/0018_auto_20230920_1846.py new file mode 100644 index 000000000..46390d523 --- /dev/null +++ b/tdrs-backend/tdpservice/search_indexes/migrations/0018_auto_20230920_1846.py @@ -0,0 +1,71 @@ +# Generated by Django 3.2.15 on 2023-09-20 18:46 + +from django.db import migrations, models + + +class Migration(migrations.Migration): + + dependencies = [ + ('search_indexes', '0017_auto_20230914_1720'), + ] + + operations = [ + migrations.RemoveField( + model_name='tanf_t7', + name='calendar_quarter', + ), + migrations.RemoveField( + model_name='tanf_t7', + name='families', + ), + migrations.RemoveField( + model_name='tanf_t7', + name='fips_code', + ), + migrations.RemoveField( + model_name='tanf_t7', + name='record', + ), + migrations.RemoveField( + model_name='tanf_t7', + name='rpt_month_year', + ), + migrations.RemoveField( + model_name='tanf_t7', + name='stratum', + ), + migrations.RemoveField( + model_name='tanf_t7', + name='tdrs_section_ind', + ), + migrations.AddField( + model_name='tanf_t7', + name='CALENDAR_QUARTER', + field=models.IntegerField(blank=True, null=True), + ), + migrations.AddField( + model_name='tanf_t7', + name='FAMILIES_MONTH', + field=models.IntegerField(null=True), + ), + migrations.AddField( + model_name='tanf_t7', + name='RPT_MONTH_YEAR', + field=models.IntegerField(null=True), + ), + migrations.AddField( + model_name='tanf_t7', + name='RecordType', + field=models.CharField(max_length=156, null=True), + ), + migrations.AddField( + model_name='tanf_t7', + name='STRATUM', + field=models.CharField(max_length=2, null=True), + ), + migrations.AddField( + model_name='tanf_t7', + name='TDRS_SECTION_IND', + field=models.CharField(max_length=1, null=True), + ), + ] diff --git a/tdrs-backend/tdpservice/search_indexes/models/tanf.py b/tdrs-backend/tdpservice/search_indexes/models/tanf.py index 
dd61c84c9..9e31fffe2 100644 --- a/tdrs-backend/tdpservice/search_indexes/models/tanf.py +++ b/tdrs-backend/tdpservice/search_indexes/models/tanf.py @@ -342,15 +342,13 @@ class TANF_T7(models.Model): related_name='t7_parent' ) - record = models.CharField(max_length=156, null=False, blank=False) - rpt_month_year = models.IntegerField(null=False, blank=False) - fips_code = models.CharField(max_length=100, null=False, blank=False) - - calendar_quarter = models.IntegerField(null=False, blank=False) - tdrs_section_ind = models.CharField( + RecordType = models.CharField(max_length=156, null=True, blank=False) + CALENDAR_QUARTER = models.IntegerField(null=True, blank=True) + RPT_MONTH_YEAR = models.IntegerField(null=True, blank=False) + TDRS_SECTION_IND = models.CharField( max_length=1, - null=False, + null=True, blank=False ) - stratum = models.CharField(max_length=2, null=False, blank=False) - families = models.IntegerField(null=False, blank=False) + STRATUM = models.CharField(max_length=2, null=True, blank=False) + FAMILIES_MONTH = models.IntegerField(null=True, blank=False) diff --git a/tdrs-backend/tdpservice/search_indexes/test/test_model_mapping.py b/tdrs-backend/tdpservice/search_indexes/test/test_model_mapping.py index cd762eb56..e8e9eb81f 100644 --- a/tdrs-backend/tdpservice/search_indexes/test/test_model_mapping.py +++ b/tdrs-backend/tdpservice/search_indexes/test/test_model_mapping.py @@ -2,7 +2,6 @@ import pytest from faker import Faker -from django.db.utils import IntegrityError from tdpservice.search_indexes import models from tdpservice.search_indexes import documents from tdpservice.parsers.util import create_test_datafile @@ -346,26 +345,27 @@ def test_can_create_and_index_tanf_t7_submission(test_datafile): submission = models.tanf.TANF_T7() submission.datafile = test_datafile - submission.record = record_num - submission.rpt_month_year = 1 - submission.fips_code = '2' - submission.calendar_quarter = 1 - submission.tdrs_section_ind = '1' - 
submission.stratum = '1' - submission.families = 1 + submission.RecordType = record_num + submission.CALENDAR_YEAR = 2020 + submission.CALENDAR_QUARTER = 1 + submission.TDRS_SECTION_IND = '1' + submission.STRATUM = '01' + submission.FAMILIES_MONTH_1 = 47655 + submission.FAMILIES_MONTH_2 = 81982 + submission.FAMILIES_MONTH_3 = 9999999 submission.save() # No checks her because t7 records can't be parsed currently. - # assert submission.id is not None + assert submission.id is not None - # search = documents.tanf.TANF_T7DataSubmissionDocument.search().query( - # 'match', - # record=record_num - # ) - # response = search.execute() + search = documents.tanf.TANF_T7DataSubmissionDocument.search().query( + 'match', + RecordType=record_num + ) + response = search.execute() - # assert response.hits.total.value == 1 + assert response.hits.total.value == 1 @pytest.mark.django_db @@ -373,17 +373,9 @@ def test_does_not_create_index_if_model_creation_fails(): """Index creation shouldn't happen if saving a model errors.""" record_num = fake.uuid4() - with pytest.raises(IntegrityError): - submission = models.tanf.TANF_T7.objects.create( - record=record_num - # leave out a bunch of required fields - ) - - assert submission.id is None - search = documents.tanf.TANF_T7DataSubmissionDocument.search().query( 'match', - record=record_num + RecordType=record_num ) response = search.execute()