2795 s1 duplicates #2956

Merged
merged 143 commits into from
Jun 27, 2024
Changes from 103 commits
828537c
- Initial commit for duplicate record manager for TANF section 1 records
elipe17 Apr 17, 2024
60cc275
- more efficient error generation
elipe17 Apr 17, 2024
49ead96
- Added a way to track precedence in the error messages to avoid gene…
elipe17 Apr 17, 2024
83b9d0b
- Fix lint
elipe17 Apr 18, 2024
a47a37d
Merge branch 'develop' of https://github.com/raft-tech/TANF-app into …
elipe17 Apr 18, 2024
d154dd6
- Added support to delete duplicate records
elipe17 Apr 18, 2024
7f6e9b1
- add support for bulk/raw delete
elipe17 Apr 18, 2024
2d98d3a
- Update deletes on rollbacks to raw delete since it won't hurt anything
elipe17 Apr 18, 2024
3c07f99
- Updated elastic and django to support raw/bulk deleting
elipe17 Apr 19, 2024
496484e
- Updating rollback test since it is very relevant now that we are do…
elipe17 Apr 19, 2024
66b52eb
- raise log level
elipe17 Apr 19, 2024
8d9c614
- Update test
elipe17 Apr 19, 2024
399c10c
- remove import
elipe17 Apr 19, 2024
86311cb
- Update error generation to use record and schema
elipe17 Apr 19, 2024
04f05f4
- Update error precedence logic
elipe17 Apr 19, 2024
a968d0c
- fix lint
elipe17 Apr 19, 2024
21c0614
Merge branch '2842-cat-4-remaining-s2-validators' of https://github.c…
elipe17 Apr 22, 2024
f421618
- use correct method
elipe17 Apr 22, 2024
038abe0
- remove extra call to bulk create errors
elipe17 Apr 22, 2024
3211cbe
- move duplicate manager into case consistency validator
elipe17 Apr 23, 2024
9e2b9e3
- update add_record logic to include rpt_month_year
elipe17 Apr 23, 2024
5a3cba4
- Move SortedRecordSchema pairs to util
elipe17 Apr 24, 2024
a2ee10c
- remove unused import
elipe17 Apr 24, 2024
e932df7
- Update class to store records based on hash pending the section type
elipe17 Apr 24, 2024
d842733
- Update parse to leverage new data structure
elipe17 Apr 24, 2024
c6a9f17
- clearing after bulk create
elipe17 Apr 24, 2024
d228959
- Rename to be more accurate
elipe17 Apr 24, 2024
15059f6
- Updated class name
elipe17 Apr 24, 2024
3c56c8c
- Adding infrastructure to remove records from memory or the DB pendi…
elipe17 Apr 24, 2024
75d207d
- fix logic issue
elipe17 Apr 24, 2024
a90ee08
- fix logic error
elipe17 Apr 25, 2024
85d25b6
- added back list of cases
elipe17 Apr 25, 2024
d1f88ec
- fixed test_add_record
elipe17 Apr 25, 2024
db1e6ba
- Fix all case consistency tests
elipe17 Apr 25, 2024
8e92d86
- Fix most lint errors
elipe17 Apr 25, 2024
112eeb6
- remove print
elipe17 Apr 25, 2024
503a017
- update rollback logic
elipe17 Apr 25, 2024
efff54c
- Update duplicate error messages
elipe17 Apr 25, 2024
3388709
- functionized dup manager a bit
elipe17 Apr 26, 2024
30645b1
- Add test for S1 partial duplicate detection
elipe17 Apr 26, 2024
e55e88c
- fix lint
elipe17 Apr 26, 2024
785c68e
- Add section 2 tests
elipe17 Apr 26, 2024
d19de28
- Fix reference error
elipe17 Apr 26, 2024
78fbee7
- order by id
elipe17 Apr 26, 2024
0676a4b
- parametrizing batch size
elipe17 Apr 26, 2024
2e68b32
- move creation logic in parse.py
elipe17 Apr 26, 2024
2db1222
- added section 2 duplicate tests
elipe17 Apr 26, 2024
7689a6a
- Test for partial duplicates
elipe17 Apr 26, 2024
697b99d
- fix lint
elipe17 Apr 26, 2024
4fdeb66
- Fixed logic error where last case in file wouldn't be removed if it …
elipe17 Apr 27, 2024
6ed405e
- Update SortedRecords to leverage an unsorted container to make bulk…
elipe17 Apr 29, 2024
63f39fd
- naming and doc string updates
elipe17 Apr 29, 2024
edb6b20
- renaming functions
elipe17 Apr 30, 2024
74ae160
- update doc strings
elipe17 Apr 30, 2024
fb32a9f
- Add test for family affiliation negating partial duplicity
elipe17 Apr 30, 2024
607452a
- Updated to support duplicate checking on section 3/4
elipe17 Apr 30, 2024
4b2d1f6
- fix lint
elipe17 Apr 30, 2024
12d4c2e
- Update docstring
elipe17 Apr 30, 2024
7f995c3
Merge branch '2842-cat-4-remaining-s2-validators' of https://github.c…
elipe17 May 1, 2024
52f7fd8
- Update to allow all record types to have duplicate detection
elipe17 May 1, 2024
74be2a4
- add duplicate detection unit tests for all program types and record…
elipe17 May 1, 2024
4fcdf51
- fix lint
elipe17 May 1, 2024
fdba1d4
- Add cases for ssp
elipe17 May 1, 2024
6a36d2d
- Move parser fixtures to their own conftest.py
elipe17 May 1, 2024
a5dcf04
- Move conftest.py to the test folder
elipe17 May 1, 2024
114a0a2
- removing whitespace
elipe17 May 1, 2024
f71bc04
- Remove 'test' from fixture names
elipe17 May 14, 2024
b1af752
Merge branch '2842-cat-4-remaining-s2-validators' of https://github.c…
elipe17 May 14, 2024
f657be1
- Fix failing tests due to merge
elipe17 May 14, 2024
9eeda03
Merge branch '2795-s1-duplicates' of https://github.com/raft-tech/TAN…
elipe17 May 14, 2024
9ac138e
- Update message for failing test
elipe17 May 14, 2024
eee134d
Merge branch '2795-s1-duplicates' of https://github.com/raft-tech/TAN…
elipe17 May 14, 2024
e085c62
- Remove Vars from compose file
elipe17 May 14, 2024
5656bdc
- fixed test
elipe17 May 15, 2024
a814dc2
- linting
elipe17 May 15, 2024
ab379ce
- move partial hash checking logic to schema/fields
elipe17 May 16, 2024
0df2bfc
Merge branch '2842-cat-4-remaining-s2-validators' of https://github.c…
elipe17 May 16, 2024
062a503
Merge branch '2795-s1-duplicates' of https://github.com/raft-tech/TAN…
elipe17 May 16, 2024
409d773
- Move fixtures to conftest.py
elipe17 May 16, 2024
141b677
- Fix failing tests
elipe17 May 16, 2024
4de2de8
Merge branch '2842-cat-4-remaining-s2-validators' of https://github.c…
elipe17 May 17, 2024
a51ad70
- fixed tests
elipe17 May 17, 2024
669a87c
Revert "- move partial hash checking logic to schema/fields"
elipe17 May 17, 2024
90356db
- Better way to do partial hash checking
elipe17 May 17, 2024
66243d6
- Moved hash generation into schema
elipe17 May 17, 2024
f3156b0
Merge branch 'develop' of https://github.com/raft-tech/TANF-app into …
elipe17 May 17, 2024
56b31b3
- Updated error message
elipe17 May 20, 2024
7a753af
Merge branch 'develop' of https://github.com/raft-tech/TANF-app into …
elipe17 May 20, 2024
fc084c3
Merge branch '2795-s1-duplicates' of https://github.com/raft-tech/TAN…
elipe17 May 20, 2024
13cf281
- move fixtures to conftest
elipe17 May 20, 2024
ac2fbc5
- fix lint
elipe17 May 20, 2024
57b82fd
Merge branch '2795-s1-duplicates' of https://github.com/raft-tech/TAN…
elipe17 May 20, 2024
2a2b29a
- fixed tests from merge conflict
elipe17 May 20, 2024
0e405be
Merge branch 'develop' into 2795-s1-duplicates
elipe17 May 20, 2024
e2dcad9
- Update error creation logging to track all errors
elipe17 May 20, 2024
0d315c7
- lint
elipe17 May 20, 2024
838b213
Merge branch 'develop' into 2795-s1-duplicates
elipe17 May 21, 2024
2c7e672
Merge branch '2795-s1-duplicates' of https://github.com/raft-tech/TAN…
elipe17 May 21, 2024
8180668
- move fixture
elipe17 May 21, 2024
083e0db
Merge pull request #2970 from raft-tech/2966-unify-duplicate-detection
elipe17 May 21, 2024
3505ee2
- Updated aggregates query to take into account cat4 errors
elipe17 May 21, 2024
89d6298
- lint
elipe17 May 21, 2024
fb8ebe0
Merge branch 'develop' into 2795-s1-duplicates
elipe17 May 21, 2024
c5189a6
- import as Query
elipe17 May 24, 2024
d74ffd3
Merge branch 'develop' into 2795-s1-duplicates
elipe17 May 24, 2024
ec0c736
- moved fixture to correct location
elipe17 May 24, 2024
52e43a1
- Updated cat4 duplicate errors to include fields that are duplicated
elipe17 May 24, 2024
b8e139d
- fix lint
elipe17 May 24, 2024
3263be6
- update tests
elipe17 May 24, 2024
42dac4b
Merge branch 'develop' into 2795-s1-duplicates
elipe17 May 25, 2024
1f365fb
Merge branch 'develop' of https://github.com/raft-tech/TANF-app into …
elipe17 May 30, 2024
439f82d
- update from merge conflict
elipe17 May 30, 2024
cfda581
- Add additional exclusion criteria for necessary tests
elipe17 May 30, 2024
8db2599
Merge branch 'develop' into 2795-s1-duplicates
elipe17 May 31, 2024
d4aa531
- renamed parameters and classes to make more sense with their duties.
elipe17 May 31, 2024
2997b5a
- implement TODOs that came out of OH
elipe17 Jun 3, 2024
012f5e5
- remove todo
elipe17 Jun 3, 2024
0d93223
- fix lint
elipe17 Jun 3, 2024
73c0888
Merge branch 'develop' of https://github.com/raft-tech/TANF-app into …
elipe17 Jun 5, 2024
3a56634
- adding some generic parsing flow docs
elipe17 Jun 5, 2024
cda4f90
Merge branch 'develop' into 2795-s1-duplicates
ADPennington Jun 6, 2024
b2e8aac
Merge branch 'develop' into 2795-s1-duplicates
ADPennington Jun 17, 2024
9a8c3e8
- Added useful debug logging to partial hash functions
elipe17 Jun 17, 2024
0fce22f
- Updated duplicate detection to allow "partial duplicates" of s3 and…
elipe17 Jun 17, 2024
610e0cd
- updated file status to be consistent with expectations
elipe17 Jun 18, 2024
5127bfe
- Updated case_consistency validator to track which case has been val…
elipe17 Jun 20, 2024
57d7de4
- Update tests
elipe17 Jun 20, 2024
caa6dff
- Add edge case cat4 test
elipe17 Jun 21, 2024
a4ac411
- Updated manager to correctly delete serialized records
elipe17 Jun 21, 2024
daad0a0
- Updated test
elipe17 Jun 21, 2024
89b6d66
- fix lint
elipe17 Jun 21, 2024
4a5fca9
Merge branch 'develop' into 2795-s1-duplicates
elipe17 Jun 21, 2024
cd9f441
- Add elastic specific exception handling
elipe17 Jun 21, 2024
d6ad86f
- add extra case to file for duplicate detection
elipe17 Jun 21, 2024
324266b
- Added creation of LogEntry objects in exception blocks
elipe17 Jun 21, 2024
733b991
- fix lint
elipe17 Jun 21, 2024
0eedd85
- Add line number to duplicate error ParserError objects
elipe17 Jun 24, 2024
4dcdf83
- Remove dup test class
elipe17 Jun 24, 2024
12c90c7
- fix lint
elipe17 Jun 24, 2024
9511668
- Updated based on review feedback
elipe17 Jun 26, 2024
33931d3
- Update tests
elipe17 Jun 26, 2024
c6ec73d
- fix lint
elipe17 Jun 26, 2024
c503e1d
Merge branch 'develop' into 2795-s1-duplicates
elipe17 Jun 27, 2024
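Several commits above mention storing records by hash and detecting both exact and "partial" duplicates. The following is only a minimal sketch of that general idea, not the manager class added in this PR: the class name `DuplicateTracker`, the `key_fields` tuple, and the error wording are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DuplicateTracker:
    """Toy duplicate tracker (hypothetical; not the PR's actual class)."""

    # Assumption: a "partial" duplicate is a record that matches an earlier
    # one on these fields only. The real field set is defined by the schemas.
    key_fields: tuple
    seen_lines: dict = field(default_factory=dict)    # hash(raw line) -> line number
    seen_partial: dict = field(default_factory=dict)  # hash(key values) -> line number

    def add_record(self, record: dict, raw_line: str, line_number: int) -> Optional[str]:
        """Return an error message if the record repeats an earlier one, else None."""
        line_hash = hash(raw_line)
        partial_hash = hash(tuple(record[name] for name in self.key_fields))

        # An exact duplicate takes precedence over a partial duplicate,
        # so only one error is reported per record.
        if line_hash in self.seen_lines:
            first = self.seen_lines[line_hash]
            return f"Duplicate of line {first} found on line {line_number}."
        if partial_hash in self.seen_partial:
            first = self.seen_partial[partial_hash]
            return f"Partial duplicate of line {first} found on line {line_number}."

        self.seen_lines[line_hash] = line_number
        self.seen_partial[partial_hash] = line_number
        return None
```

Keeping two dictionaries makes each lookup constant time, at the cost of holding one entry per unique record in memory until the tracker is cleared.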
9 changes: 5 additions & 4 deletions tdrs-backend/tdpservice/data_files/test/test_api.py
@@ -100,8 +100,8 @@ def assert_error_report_tanf_file_content_matches_with_friendly_names(response):
 
         assert ws.cell(row=1, column=1).value == "Error reporting in TDP is still in development.We'll" \
             + " be in touch when it's ready to use!For now please refer to the reports you receive via email"
-        assert ws.cell(row=5, column=COL_ERROR_MESSAGE).value == "if cash amount :873 validator1 passed" \
-            + " then number of months T1: 0 is not larger than 0."
+        assert ws.cell(row=5, column=COL_ERROR_MESSAGE).value == "Every T1 record should have at least one " + \
+            "corresponding T2 or T3 record with the same RPT_MONTH_YEAR and CASE_NUMBER."
 
     @staticmethod
     def assert_error_report_ssp_file_content_matches_with_friendly_names(response):
@@ -132,8 +132,9 @@ def assert_error_report_file_content_matches_without_friendly_names(response):
 
         assert ws.cell(row=1, column=1).value == "Error reporting in TDP is still in development.We'll" \
             + " be in touch when it's ready to use!For now please refer to the reports you receive via email"
-        assert ws.cell(row=5, column=COL_ERROR_MESSAGE).value == ("if CASH_AMOUNT :873 validator1 passed then "
-                                                                  "NBR_MONTHS T1: 0 is not larger than 0.")
+        assert ws.cell(row=5, column=COL_ERROR_MESSAGE).value == ("Every T1 record should have at least one "
+                                                                  "corresponding T2 or T3 record with the same "
+                                                                  "RPT_MONTH_YEAR and CASE_NUMBER.")
 
     @staticmethod
     def assert_data_file_exists(data_file_data, version, user):
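The updated assertions expect the message "Every T1 record should have at least one corresponding T2 or T3 record with the same RPT_MONTH_YEAR and CASE_NUMBER." As a rough illustration of what such a case-consistency rule checks, here is a small sketch over plain dictionaries. The function name and the `RecordType` key are assumptions made for this example; the validator in the repository works on parsed model instances and is structured differently.

```python
from collections import defaultdict


def t1_cases_missing_members(records):
    """Return sorted (RPT_MONTH_YEAR, CASE_NUMBER) keys that have a T1 but no T2/T3.

    `records` is an iterable of dicts with hypothetical keys
    "RecordType", "RPT_MONTH_YEAR" and "CASE_NUMBER".
    """
    types_by_case = defaultdict(set)
    for rec in records:
        key = (rec["RPT_MONTH_YEAR"], rec["CASE_NUMBER"])
        types_by_case[key].add(rec["RecordType"])

    # A case violates the rule when it contains a T1 and neither a T2 nor a T3.
    return sorted(
        key
        for key, types in types_by_case.items()
        if "T1" in types and not types & {"T2", "T3"}
    )


sample = [
    {"RecordType": "T1", "RPT_MONTH_YEAR": 202401, "CASE_NUMBER": "A1"},
    {"RecordType": "T2", "RPT_MONTH_YEAR": 202401, "CASE_NUMBER": "A1"},
    {"RecordType": "T1", "RPT_MONTH_YEAR": 202401, "CASE_NUMBER": "B2"},
]
missing = t1_cases_missing_members(sample)  # only case B2 lacks a T2/T3
```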
18 changes: 11 additions & 7 deletions tdrs-backend/tdpservice/parsers/aggregates.py
@@ -1,9 +1,10 @@
 """Aggregate methods for the parsers."""
 from .row_schema import SchemaManager
-from .models import ParserError
+from .models import ParserError, ParserErrorCategoryChoices
 from .util import month_to_int, \
     transform_to_months, fiscal_to_calendar, get_prog_from_section
 from .schema_defs.utils import get_program_models, get_text_from_df
+from django.db.models import Q
 
 
 def case_aggregates_by_month(df, dfs_status):
@@ -39,22 +40,25 @@ def case_aggregates_by_month(df, dfs_status):
             if isinstance(schema_model, SchemaManager):
                 schema_model = schema_model.schemas[0]
 
-            curr_case_numbers = set(schema_model.document.Django.model.objects.filter(datafile=df)
-                                    .filter(RPT_MONTH_YEAR=rpt_month_year)
+            curr_case_numbers = set(schema_model.document.Django.model.objects.filter(datafile=df,
+                                                                                      RPT_MONTH_YEAR=rpt_month_year)
                                     .distinct("CASE_NUMBER").values_list("CASE_NUMBER", flat=True))
             case_numbers = case_numbers.union(curr_case_numbers)
 
         total += len(case_numbers)
-        cases_with_errors += ParserError.objects.filter(file=df).filter(
-            case_number__in=case_numbers).distinct('case_number').count()
+        cases_with_errors += ParserError.objects.filter(file=df, case_number__in=case_numbers)\
+            .distinct('case_number').count()
         accepted = total - cases_with_errors
 
         aggregate_data['months'].append({"month": month,
                                          "accepted_without_errors": accepted,
                                          "accepted_with_errors": cases_with_errors})
 
-    aggregate_data['rejected'] = ParserError.objects.filter(file=df).filter(case_number=None).distinct("row_number")\
-        .exclude(row_number=0).count()
+    error_type_query = Q(error_type=ParserErrorCategoryChoices.PRE_CHECK) | \
+        Q(error_type=ParserErrorCategoryChoices.CASE_CONSISTENCY)
+
+    aggregate_data['rejected'] = ParserError.objects.filter(error_type_query, file=df)\
+        .distinct("row_number").exclude(row_number=0).count()
 
     return aggregate_data
 
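The change to `aggregate_data['rejected']` replaces the `case_number=None` filter with a filter on error category: an error counts toward the rejected total when its type is `PRE_CHECK` or `CASE_CONSISTENCY`, with row 0 excluded and each row counted once. The sketch below approximates that counting logic in plain Python so it can be read without a database. The constant values and the dictionary shape are placeholders assumed for this example, not the values of `ParserErrorCategoryChoices`.

```python
# Placeholder category values; the real ones live in ParserErrorCategoryChoices.
PRE_CHECK = "pre_check"
FIELD_VALUE = "field_value"
CASE_CONSISTENCY = "case_consistency"


def count_rejected(errors):
    """Count distinct non-zero rows that have a pre-check or case-consistency error.

    `errors` is an iterable of dicts with hypothetical keys
    "error_type" and "row_number".
    """
    rejected_types = {PRE_CHECK, CASE_CONSISTENCY}
    rows = {
        err["row_number"]
        for err in errors
        if err["error_type"] in rejected_types and err["row_number"] != 0
    }
    return len(rows)


errors = [
    {"error_type": PRE_CHECK, "row_number": 0},         # row 0 is excluded
    {"error_type": PRE_CHECK, "row_number": 3},
    {"error_type": CASE_CONSISTENCY, "row_number": 3},  # same row, counted once
    {"error_type": CASE_CONSISTENCY, "row_number": 7},
    {"error_type": FIELD_VALUE, "row_number": 9},       # other categories are ignored
]
rejected = count_rejected(errors)
```

Building a set of row numbers plays the role that `.distinct("row_number")` plays in the query: several errors on one row contribute a single rejected row.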