Fix: 878 permissions errors #892

Merged
merged 26 commits into from
Sep 25, 2024

Conversation

@TylerHendrickson TylerHendrickson commented Aug 27, 2024

Ticket #878

Description

This PR fixes an issue that currently prevents the Lambda handler for the SplitGrantsGovXMLDB pipeline step from storing new and revised grant opportunity XML files in the S3 "prepared data" bucket. In doing so, it changes how the Lambda function determines whether a given grant opportunity (a single XML database record) is new, modified, or unmodified. Previously, we checked the last-modified timestamp of the target S3 object (if any) and compared it to the LastUpdatedDate timestamp of the record being processed. As of this PR, we instead compare the record being processed to a corresponding DynamoDB record (if any) to implement the same conditional-write check. This approach has several benefits (outlined in Step 3 of #878 (comment) on the associated issue). The most relevant is that it decouples our ability to check whether a record has updates from the mechanism by which we store those records, which in this case lets us avoid rebuilding our S3 repository of extracted XML records from scratch.

Additionally, this PR makes it possible to limit the number of records split and processed by SplitGrantsGovXMLDB in development environments. Because the source Grants.gov XML database is so large (currently ~80k records), processing the entire database places considerable strain on developer workstations and risks exhausting available resources, especially in new environments where the pipeline is run from scratch (which is often the case when testing with LocalStack). To keep the number of processed records reasonable for testing, a check is added that short-circuits processing once a configured limit has been reached. The limit is enforced whenever the Lambda execution environment is configured with a MAX_SPLIT_RECORDS environment variable set to a number greater than its default value of -1. In LocalStack environments, this value is configured to stop processing after 10 records, which can be adjusted by editing the max_split_grantsgov_records input variable in terraform/local.tfvars.
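As a rough illustration of how such a setting could be interpreted, the sketch below parses the raw value of MAX_SPLIT_RECORDS and decides whether the limit has been reached. The function names `parseMaxSplitRecords` and `limitReached` are hypothetical and the handler's real configuration loading may differ. The only behavior taken from the description is the default of -1 and the rule that any greater value enables the limit.

```go
package main

import "strconv"

// defaultMaxSplitRecords mirrors the default described in the PR: -1 means
// "no limit".
const defaultMaxSplitRecords = -1

// parseMaxSplitRecords converts the raw environment value into a limit.
// Assumption: an unset (empty) variable falls back to the default.
func parseMaxSplitRecords(raw string) (int, error) {
	if raw == "" {
		return defaultMaxSplitRecords, nil
	}
	return strconv.Atoi(raw)
}

// limitReached reports whether processing should stop. The limit only
// applies when max is greater than the default of -1.
func limitReached(processed, max int) bool {
	return max > defaultMaxSplitRecords && processed >= max
}
```

A caller would typically pass `os.Getenv("MAX_SPLIT_RECORDS")` to `parseMaxSplitRecords` once at startup and consult `limitReached` before handling each record.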

Testing

  • Unit tests (run task test)
  • Deploy to LocalStack and run the pipeline from scratch, observing 10 records were saved to DynamoDB. Redeploy with max_split_grantsgov_records = 20, run the pipeline again, and observe that an additional 10 records were saved, but that the original 10 were not overwritten.
    • An easy way to check this is to edit the original 10 records in DynamoDB before redeployment by deleting at least one item attribute, and ensuring that the attribute does not reappear in the original 10 records.
    • Alternatively, you can check that the PersistGrantsGovXMLDB Lambda function was invoked exactly 20 times (10 on the first pipeline run, and 10 on the second) using CloudWatch metrics in LocalStack.

Automated and Unit Tests

  • Added Unit tests

Manual tests for Reviewer

  • Added steps to test feature/functionality manually

Checklist

  • Provided ticket and description
  • Provided testing information
  • Provided adequate test coverage for all new code
  • Added PR reviewers

@TylerHendrickson TylerHendrickson self-assigned this Aug 27, 2024
@TylerHendrickson TylerHendrickson added the bug Something isn't working label Aug 27, 2024
@github-actions github-actions bot added go Pull requests that update Go code terraform Pull requests that update Terraform code labels Aug 27, 2024
Comment on lines +42 to +43
# Path: <first 3 digits of grant ID>/<grant id>/grants.gov/v2.OpportunitySynopsisDetail_1_0.xml
"${data.aws_s3_bucket.prepared_data.arn}/*/*/grants.gov/v2.OpportunitySynopsisDetail_1_0.xml",

This enables the downstream PersistGrantsGovXMLDB Lambda to run without running into the same permissions issue that SplitGrantsGovXMLDB is currently experiencing in Staging.

Comment on lines 56 to 61
# Path: <first 3 digits of grant ID>/<grant id>/grants.gov/v2.xml (deprecated)
"${data.aws_s3_bucket.prepared_data.arn}/*/*/grants.gov/v2.xml",
# Path: <first 3 digits of grant ID>/<grant id>/grants.gov/v2.OpportunitySynopsisDetail_1_0.xml
"${data.aws_s3_bucket.prepared_data.arn}/*/*/grants.gov/v2.OpportunitySynopsisDetail_1_0.xml",
# Path: <first 3 digits of grant ID>/<grant id>/grants.gov/v2.OpportunityForecastDetail_1_0.xml
"${data.aws_s3_bucket.prepared_data.arn}/*/*/grants.gov/v2.OpportunityForecastDetail_1_0.xml",

This fixes the permissions issue that SplitGrantsGovXMLDB is currently experiencing in Staging.
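For reference, the object key layout named in the policy comments (`<first 3 digits of grant ID>/<grant id>/grants.gov/v2.<schema>.xml`) can be sketched as a small helper. `preparedDataKey` is a hypothetical name, not a function from this repository, and the handling of IDs shorter than three characters is an assumption.

```go
package main

import "fmt"

// preparedDataKey builds an S3 object key following the layout described in
// the IAM policy comments. schema is the record type, for example
// "OpportunitySynopsisDetail_1_0" or "OpportunityForecastDetail_1_0".
// Assumption: grant IDs are ASCII, and IDs shorter than three characters
// use the whole ID as the prefix.
func preparedDataKey(grantID, schema string) string {
	prefix := grantID
	if len(prefix) > 3 {
		prefix = prefix[:3]
	}
	return fmt.Sprintf("%s/%s/grants.gov/v2.%s.xml", prefix, grantID, schema)
}
```

Keys of this shape fall under the `/*/*/grants.gov/v2.<schema>.xml` resource patterns granted in the policy, which is why each new file name needs its own resource entry.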

github-actions bot commented Aug 27, 2024

Terraform Summary

Step Result
🖌 Terraform Format & Style
⚙️ Terraform Initialization
🤖 Terraform Validation
📖 Terraform Plan

Output

Validation Output
Success! The configuration is valid.


Plan Output
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # module.DownloadFFISSpreadsheet.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-DownloadFFISSpreadsheet"
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-DownloadFFISSpreadsheet:61" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-DownloadFFISSpreadsheet:61/invocations" -> (known after apply)
        tags                           = {}
      ~ version                        = "61" -> (known after apply)
        # (21 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                      = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:downloadffisspreadsheet" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:downloadffisspreadsheet"
              ~ "DD_VERSION"                   = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (11 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.DownloadFFISSpreadsheet.module.lambda_function.aws_lambda_permission.current_version_triggers["SQSQueueNotification"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "SQSQueueNotification" -> (known after apply)
      ~ qualifier           = "61" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (4 unchanged attributes hidden)
    }

  # module.DownloadGrantsGovDB.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-DownloadGrantsGovDB"
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-DownloadGrantsGovDB:61" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-DownloadGrantsGovDB:61/invocations" -> (known after apply)
        tags                           = {}
      ~ version                        = "61" -> (known after apply)
        # (21 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                        = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:downloadgrantsgovdb" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:downloadgrantsgovdb"
              ~ "DD_VERSION"                     = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (13 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.DownloadGrantsGovDB.module.lambda_function.aws_lambda_permission.current_version_triggers["Schedule"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "Schedule" -> (known after apply)
      ~ qualifier           = "61" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.EnqueueFFISDownload.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-EnqueueFFISDownload"
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-EnqueueFFISDownload:61" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-EnqueueFFISDownload:61/invocations" -> (known after apply)
        tags                           = {}
      ~ version                        = "61" -> (known after apply)
        # (21 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                      = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:enqueueffisdownload" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:enqueueffisdownload"
              ~ "DD_VERSION"                   = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (11 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.EnqueueFFISDownload.module.lambda_function.aws_lambda_permission.current_version_triggers["S3BucketNotification"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "S3BucketNotification" -> (known after apply)
      ~ qualifier           = "61" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.ExtractGrantsGovDBToXML.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-ExtractGrantsGovDBToXML"
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-ExtractGrantsGovDBToXML:61" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-ExtractGrantsGovDBToXML:61/invocations" -> (known after apply)
        tags                           = {}
      ~ version                        = "61" -> (known after apply)
        # (21 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                      = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:extractgrantsgovdbtoxml" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:extractgrantsgovdbtoxml"
              ~ "DD_VERSION"                   = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (11 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.ExtractGrantsGovDBToXML.module.lambda_function.aws_lambda_permission.current_version_triggers["S3BucketNotification"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "S3BucketNotification" -> (known after apply)
      ~ qualifier           = "61" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.PersistFFISData.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-PersistFFISData"
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-PersistFFISData:61" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-PersistFFISData:61/invocations" -> (known after apply)
        tags                           = {}
      ~ version                        = "61" -> (known after apply)
        # (21 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                       = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:persistffisdata" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:persistffisdata"
              ~ "DD_VERSION"                    = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (11 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.PersistFFISData.module.lambda_function.aws_lambda_permission.current_version_triggers["S3BucketNotification"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "S3BucketNotification" -> (known after apply)
      ~ qualifier           = "61" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.PersistGrantsGovXMLDB.module.lambda_function.aws_iam_policy.additional_json[0] will be updated in-place
  ~ resource "aws_iam_policy" "additional_json" {
        id               = "arn:aws:iam::357150818708:policy/grants_ingest-PersistGrantsGovXMLDB"
        name             = "grants_ingest-PersistGrantsGovXMLDB"
      ~ policy           = jsonencode(
          ~ {
              ~ Statement = [
                    # (1 unchanged element hidden)
                    {
                        Action   = [
                            "dynamodb:UpdateItem",
                            "dynamodb:ListTables",
                        ]
                        Effect   = "Allow"
                        Resource = "arn:aws:dynamodb:us-west-2:357150818708:table/grantsingest-prepareddata"
                        Sid      = "AllowDynamoDBPreparedData"
                    },
                  ~ {
                      ~ Resource = [
                            "arn:aws:s3:::grantsingest-grantsprepareddata-357150818708-us-west-2/*/*/grants.gov/v2.xml",
+                           "arn:aws:s3:::grantsingest-grantsprepareddata-357150818708-us-west-2/*/*/grants.gov/v2.OpportunitySynopsisDetail_1_0.xml",
                            "arn:aws:s3:::grantsingest-grantsprepareddata-357150818708-us-west-2",
                        ]
                        # (3 unchanged attributes hidden)
                    },
                ]
                # (1 unchanged attribute hidden)
            }
        )
        tags             = {}
        # (5 unchanged attributes hidden)
    }

  # module.PersistGrantsGovXMLDB.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-PersistGrantsGovXMLDB"
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-PersistGrantsGovXMLDB:61" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-PersistGrantsGovXMLDB:61/invocations" -> (known after apply)
        tags                           = {}
      ~ version                        = "61" -> (known after apply)
        # (21 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                       = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:persistgrantsgovxmldb" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:persistgrantsgovxmldb"
              ~ "DD_VERSION"                    = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (11 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.PersistGrantsGovXMLDB.module.lambda_function.aws_lambda_permission.current_version_triggers["S3BucketNotification"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "S3BucketNotification" -> (known after apply)
      ~ qualifier           = "61" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.PublishGrantEvents.module.lambda_artifact.aws_s3_object.lambda_function must be replaced
-/+ resource "aws_s3_object" "lambda_function" {
+       acl                    = (known after apply)
      ~ arn                    = "arn:aws:s3:::grantsingest-lambdaartifacts-357150818708-us-west-2/43b179075ee58ed4cbd4ae06f9988c73.zip" -> (known after apply)
      ~ bucket_key_enabled     = false -> (known after apply)
+       checksum_crc32         = (known after apply)
+       checksum_crc32c        = (known after apply)
+       checksum_sha1          = (known after apply)
+       checksum_sha256        = (known after apply)
      ~ content_type           = "binary/octet-stream" -> (known after apply)
      ~ etag                   = "b7e718e27d5435f07daecffc5f91c3a9-3" -> (known after apply)
      ~ id                     = "43b179075ee58ed4cbd4ae06f9988c73.zip" -> (known after apply)
      ~ key                    = "43b179075ee58ed4cbd4ae06f9988c73.zip" -> "e94b303774670ffa4b3b3e4d9828edd4.zip" # forces replacement
+       kms_key_id             = (known after apply)
-       metadata               = {} -> null
      ~ storage_class          = "STANDARD" -> (known after apply)
-       tags                   = {} -> null
      ~ version_id             = "igDyQwlt5qm6jr8nTAW.k07wJsXOgoJf" -> (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.PublishGrantEvents.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-PublishGrantEvents"
      ~ last_modified                  = "2024-09-24T21:18:28.000+0000" -> (known after apply)
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-PublishGrantEvents:62" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-PublishGrantEvents:62/invocations" -> (known after apply)
      ~ s3_key                         = "43b179075ee58ed4cbd4ae06f9988c73.zip" -> "e94b303774670ffa4b3b3e4d9828edd4.zip"
        tags                           = {}
      ~ version                        = "62" -> (known after apply)
        # (19 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                      = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:publishgrantevents" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:publishgrantevents"
              ~ "DD_VERSION"                   = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (11 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.PublishGrantEvents.module.lambda_function.aws_lambda_permission.current_version_triggers["dynamodb"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "dynamodb" -> (known after apply)
      ~ qualifier           = "62" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.ReceiveFFISEmail.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-ReceiveFFISEmail"
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-ReceiveFFISEmail:60" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-ReceiveFFISEmail:60/invocations" -> (known after apply)
        tags                           = {}
      ~ version                        = "60" -> (known after apply)
        # (21 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                        = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:receiveffisemail" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:receiveffisemail"
              ~ "DD_VERSION"                     = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (12 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.ReceiveFFISEmail.module.lambda_function.aws_lambda_permission.current_version_triggers["S3BucketNotification"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "S3BucketNotification" -> (known after apply)
      ~ qualifier           = "60" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.SplitFFISSpreadsheet.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-SplitFFISSpreadsheet"
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-SplitFFISSpreadsheet:61" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-SplitFFISSpreadsheet:61/invocations" -> (known after apply)
        tags                           = {}
      ~ version                        = "61" -> (known after apply)
        # (21 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                          = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:splitffisspreadsheet" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:splitffisspreadsheet"
              ~ "DD_VERSION"                       = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
                # (14 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.SplitFFISSpreadsheet.module.lambda_function.aws_lambda_permission.current_version_triggers["S3BucketNotification"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "S3BucketNotification" -> (known after apply)
      ~ qualifier           = "61" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.SplitGrantsGovXMLDB.module.lambda_artifact.aws_s3_object.lambda_function must be replaced
-/+ resource "aws_s3_object" "lambda_function" {
+       acl                    = (known after apply)
      ~ arn                    = "arn:aws:s3:::grantsingest-lambdaartifacts-357150818708-us-west-2/0b128d0933f346d9928cbc1a4a8a60ab.zip" -> (known after apply)
      ~ bucket_key_enabled     = false -> (known after apply)
+       checksum_crc32         = (known after apply)
+       checksum_crc32c        = (known after apply)
+       checksum_sha1          = (known after apply)
+       checksum_sha256        = (known after apply)
      ~ content_type           = "binary/octet-stream" -> (known after apply)
      ~ etag                   = "d9019a3f9040c909e216abd5b346b6ac-3" -> (known after apply)
      ~ id                     = "0b128d0933f346d9928cbc1a4a8a60ab.zip" -> (known after apply)
      ~ key                    = "0b128d0933f346d9928cbc1a4a8a60ab.zip" -> "20d1aaa6d456b437e49813e50d29c590.zip" # forces replacement
+       kms_key_id             = (known after apply)
-       metadata               = {} -> null
      ~ storage_class          = "STANDARD" -> (known after apply)
-       tags                   = {} -> null
      ~ version_id             = "340H95UAqFie_h5lIK9TcLBf8GNQTwe9" -> (known after apply)
        # (5 unchanged attributes hidden)
    }

  # module.SplitGrantsGovXMLDB.module.lambda_function.aws_iam_policy.additional_json[0] will be updated in-place
  ~ resource "aws_iam_policy" "additional_json" {
        id               = "arn:aws:iam::357150818708:policy/grants_ingest-SplitGrantsGovXMLDB"
        name             = "grants_ingest-SplitGrantsGovXMLDB"
      ~ policy           = jsonencode(
          ~ {
              ~ Statement = [
                    {
                        Action   = "secretsmanager:GetSecretValue"
                        Effect   = "Allow"
                        Resource = "arn:aws:secretsmanager:us-west-2:357150818708:secret:grants_ingest-datadog_api_key-8kMn2C"
                        Sid      = "GetDatadogAPIKeySecretValue"
                    },
                  ~ {
                      ~ Action   = [
-                           "s3:ListBucket",
-                           "s3:GetObject",
+                           "dynamodb:ListTables",
+                           "dynamodb:GetItem",
                        ]
                      ~ Resource = [
-                           "arn:aws:s3:::grantsingest-grantsprepareddata-357150818708-us-west-2/*/*/grants.gov/v2.xml",
-                           "arn:aws:s3:::grantsingest-grantsprepareddata-357150818708-us-west-2",
                        ] -> "arn:aws:dynamodb:us-west-2:357150818708:table/grantsingest-prepareddata"
                      ~ Sid      = "AllowInspectS3PreparedData" -> "AllowReadDynamoDBPreparedData"
                        # (1 unchanged attribute hidden)
                    },
                    {
                        Action   = "s3:GetObject"
                        Effect   = "Allow"
                        Resource = "arn:aws:s3:::grantsingest-grantssourcedata-357150818708-us-west-2/sources/*/*/*/grants.gov/extract.xml"
                        Sid      = "AllowS3DownloadSourceData"
                    },
                  ~ {
                      ~ Resource = "arn:aws:s3:::grantsingest-grantsprepareddata-357150818708-us-west-2/*/*/grants.gov/v2.xml" -> [
+                           "arn:aws:s3:::grantsingest-grantsprepareddata-357150818708-us-west-2/*/*/grants.gov/v2.OpportunitySynopsisDetail_1_0.xml",
+                           "arn:aws:s3:::grantsingest-grantsprepareddata-357150818708-us-west-2/*/*/grants.gov/v2.OpportunityForecastDetail_1_0.xml",
                        ]
                        # (3 unchanged attributes hidden)
                    },
                ]
                # (1 unchanged attribute hidden)
            }
        )
        tags             = {}
        # (5 unchanged attributes hidden)
    }

  # module.SplitGrantsGovXMLDB.module.lambda_function.aws_lambda_function.this[0] will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        id                             = "grants_ingest-SplitGrantsGovXMLDB"
      ~ last_modified                  = "2024-09-24T21:18:28.000+0000" -> (known after apply)
      ~ qualified_arn                  = "arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-SplitGrantsGovXMLDB:61" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-west-2:lambda:path/2015-03-31/functions/arn:aws:lambda:us-west-2:357150818708:function:grants_ingest-SplitGrantsGovXMLDB:61/invocations" -> (known after apply)
      ~ s3_key                         = "0b128d0933f346d9928cbc1a4a8a60ab.zip" -> "20d1aaa6d456b437e49813e50d29c590.zip"
        tags                           = {}
      ~ version                        = "61" -> (known after apply)
        # (19 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "DD_TAGS"                          = "git.commit.sha:8292a198eb05c9836c899c49b771bde639b12185,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:splitgrantsgovxmldb" -> "git.commit.sha:ceb6f05abbcf9e5e1983fa1f6297540e354a3239,git.repository_url:github.com/usdigitalresponse/grants-ingest,handlername:splitgrantsgovxmldb"
              ~ "DD_VERSION"                       = "8292a198eb05c9836c899c49b771bde639b12185" -> "ceb6f05abbcf9e5e1983fa1f6297540e354a3239"
+               "GRANTS_PREPARED_DATA_TABLE_NAME"  = "grantsingest-prepareddata"
+               "MAX_SPLIT_RECORDS"                = "-1"
                # (15 unchanged elements hidden)
            }
        }

        # (3 unchanged blocks hidden)
    }

  # module.SplitGrantsGovXMLDB.module.lambda_function.aws_lambda_permission.current_version_triggers["S3BucketNotification"] must be replaced
-/+ resource "aws_lambda_permission" "current_version_triggers" {
      ~ id                  = "S3BucketNotification" -> (known after apply)
      ~ qualifier           = "61" # forces replacement -> (known after apply) # forces replacement
+       statement_id_prefix = (known after apply)
        # (5 unchanged attributes hidden)
    }

Plan: 12 to add, 12 to change, 12 to destroy.

Pusher: @TylerHendrickson, Action: pull_request_target, Workflow: Continuous Integration

@TylerHendrickson TylerHendrickson requested a review from a team August 27, 2024 22:38
@TylerHendrickson TylerHendrickson changed the title Fix/878 putitem failures Fix: 878 permissions errors Aug 27, 2024
@TylerHendrickson TylerHendrickson enabled auto-merge (squash) August 27, 2024 22:40
@TylerHendrickson TylerHendrickson requested a review from a team as a code owner August 27, 2024 22:59
@github-actions github-actions bot added the github Repository automation and configuration label Aug 27, 2024
@ClaireValdivia

@joshgarza would you be able to review this PR?

@@ -129,6 +124,9 @@ func handleS3EventWithConfig(cfg aws.Config, ctx context.Context, s3Event events
func readRecords(ctx context.Context, r io.Reader, ch chan<- grantRecord) error {
Member Author

@TylerHendrickson TylerHendrickson Sep 24, 2024

Changes in this function enable developers to set a MAX_SPLIT_RECORDS environment variable in order to limit the number of records that will be extracted from the source DB during invocation. Specifically, it helps developers test on a reasonable number of records instead of needing to process >78k records from grants.gov.

Very simply: we're incrementing a counter for each extracted record and breaking out of the loop if it reaches $MAX_SPLIT_RECORDS.

@@ -138,6 +136,11 @@ func readRecords(ctx context.Context, r io.Reader, ch chan<- grantRecord) error
return err
}

// End early if we have reached any configured limit on the number of records sent to ch
if env.MaxSplitRecords > -1 && countSentRecords >= env.MaxSplitRecords {
Member Author

If MAX_SPLIT_RECORDS env var is less than zero, we ignore it.
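
For readers skimming the thread, the limit check described above can be sketched in isolation. This is a simplified illustration rather than the handler's actual code: `maxSplitRecordsFromEnv` and `sendUpTo` are hypothetical names, the records are plain strings instead of `grantRecord` values, and the fallback to -1 on a missing or unparsable variable is an assumption made for the sketch.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// maxSplitRecordsFromEnv is a hypothetical helper: it parses MAX_SPLIT_RECORDS
// and falls back to -1 (no limit) when the variable is unset or not an integer.
func maxSplitRecordsFromEnv() int {
	n, err := strconv.Atoi(os.Getenv("MAX_SPLIT_RECORDS"))
	if err != nil {
		return -1
	}
	return n
}

// sendUpTo is a simplified stand-in for readRecords: it forwards records to ch
// and stops early once the configured limit has been reached. A negative max
// disables the limit entirely.
func sendUpTo(records []string, max int, ch chan<- string) int {
	sent := 0
	for _, r := range records {
		// Same shape as the condition in the diff: only enforce limits > -1.
		if max > -1 && sent >= max {
			break
		}
		ch <- r
		sent++
	}
	return sent
}

func main() {
	records := []string{"a", "b", "c", "d", "e"}
	// Buffered so the sketch can send without a concurrent receiver.
	ch := make(chan string, len(records))
	sent := sendUpTo(records, maxSplitRecordsFromEnv(), ch)
	close(ch)
	fmt.Println("records sent:", sent)
}
```

In this sketch the check runs before each send, so a limit of 0 sends nothing and any negative value processes every record.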

@@ -112,7 +112,7 @@ func buildGrantModificationEventJSON(record events.DynamoDBEventRecord) ([]byte,
}
if err := prevVersion.Validate(); err != nil {
sendMetric("grant_data.invalid", 1, metricTag)
- log.Warn(logger, "grant data from ItemMapper is invalid", err)
+ log.Warn(logger, "grant data from ItemMapper is invalid", "error", err)
Member Author

FYI: This change is wholly unrelated to the rest of this PR; I just happened to notice that we were missing the error key for this log in Datadog (odd numbers of k/v arguments make things weird). We can probably lint for this in the future.
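
As a rough idea of what such a lint-style check could look like, here is a minimal sketch. `checkKeyvals` is a hypothetical helper, not part of this repository or of any logging library; it only assumes that key/value arguments are meant to alternate string keys and values.

```go
package main

import (
	"errors"
	"fmt"
)

// checkKeyvals is a hypothetical lint-style helper: it reports an error when
// the variadic key/value arguments are unbalanced or a key is not a string.
func checkKeyvals(keyvals ...any) error {
	if len(keyvals)%2 != 0 {
		return fmt.Errorf("odd number of key/value arguments (%d)", len(keyvals))
	}
	// Keys sit at even positions; values at odd positions may be any type.
	for i := 0; i < len(keyvals); i += 2 {
		if _, ok := keyvals[i].(string); !ok {
			return fmt.Errorf("key at position %d is %T, want string", i, keyvals[i])
		}
	}
	return nil
}

func main() {
	err := errors.New("validation failed")
	fmt.Println(checkKeyvals(err))          // reports the unbalanced call
	fmt.Println(checkKeyvals("error", err)) // prints <nil>
}
```

A runtime check like this would only catch the problem in tests; a static analyzer would be needed to flag it at review time.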

@@ -37,15 +37,17 @@ module "lambda_execution_policy" {
]
resources = [
data.aws_s3_bucket.prepared_data.arn,
# Path: <first 3 of grant id>/<grant id>/grants.gov/v2.xml
"${data.aws_s3_bucket.prepared_data.arn}/*/*/grants.gov/v2.xml"
# Path: <first 3 of grant id>/<grant id>/grants.gov/v2.xml (deprecated)
Contributor

If this is deprecated, do we still need permissions for this bucket? Same question for SplitGrantsGovXMLDB.

Member Author

@TylerHendrickson TylerHendrickson Sep 25, 2024

This is still needed for PersistGrantsGovXMLDB because the bucket has existing objects (which we aren't planning to move/recreate).

However, SplitGrantsGovXMLDB no longer has any reason to have permissions for the bucket as it will only write objects using the new key convention from now on. I'll remove the permissions for that function.

Member Author

Just to add: Even though during normal pipeline execution it should only encounter objects with the new key convention, I'd like to keep the ability for PersistGrantsGovXMLDB to read existing S3 objects in case there's a bug where we need to trigger the Lambda to reprocess an object from S3 as a recovery step, or if we need to rebuild the DynamoDB index. After this is live and confirmed stable in Production for a few weeks, we should circle back to remove the permissions here as well. I'll make a follow-up issue for that.

Labels: bug, github, go, terraform