Resource drift not being recognized #2073

Bryan-Meier · 2023-09-25T20:22:24Z

Provider Version

0.71.0
If it helps, this behavior has been present since the provider was maintained under chanzuckerberg.

Terraform Version

1.5.5

Describe the bug

For whatever reason, when we run a plan it does not pick up some instances of drift. In our case, once it stops detecting drift on a resource, it never recognizes that drift again. The only way I have been able to force alignment on the affected resources is by running "terraform apply -replace=xyz -target=abc". This is especially frustrating when there are hundreds of resources whose drift is not being detected.

Expected behavior

When I run plan I would expect that any drift that has happened between the last plan and the current would be picked up and shown as part of the current plan.

Code samples and commands

NA

Additional context

The only thing I can think of from a cause perspective is our use of modules. A simple example is a module that is responsible for creating a schema, a set of roles, and the privileges of those roles on the given schema. This module has required variables for the database name and schema name. Based on those inputs, the module creates the schema along with some roles, and grants those roles the appropriate privileges on the schema. There is an output for the schema and for each of the roles that were created.

The module mentioned is called from our main.tf, where a for_each argument is used to iterate over a list of schemas. For some reason, when the drift is a role belonging to the schema being dropped, the plan does not pick that up and attempt to re-create it. It's as if the comparison between the state and the actual infrastructure isn't happening for some objects. I would assume that if a resource was originally created by TF and then dropped manually by a user, TF would find that it's missing on the next plan, but that doesn't seem to be the case.
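To make the expectation above concrete, here is a minimal Go sketch of the comparison described: every address recorded in state that no longer exists remotely should come back as a planned create. This is only an illustration, not Terraform's or the provider's code. The function name `planCreates` and the resource addresses are hypothetical, chosen to mirror one module instance per schema via for_each.

```go
package main

import (
	"fmt"
	"sort"
)

// planCreates returns the addresses that are recorded in state but no longer
// exist remotely. A plan is expected to propose re-creating each of them.
func planCreates(state []string, remote map[string]bool) []string {
	var missing []string
	for _, addr := range state {
		if !remote[addr] {
			missing = append(missing, addr)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Hypothetical addresses: one module instance per schema via for_each.
	state := []string{
		`module.schema["sales"].snowflake_role.reader`,
		`module.schema["sales"].snowflake_role.writer`,
		`module.schema["hr"].snowflake_role.reader`,
	}
	// Assumption for the example: the "sales" writer role was dropped manually.
	remote := map[string]bool{
		`module.schema["sales"].snowflake_role.reader`: true,
		`module.schema["hr"].snowflake_role.reader`:    true,
	}
	for _, addr := range planCreates(state, remote) {
		fmt.Println("+ create", addr)
	}
}
```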

Please let me know if more information is needed and I will do my best to fill in the gaps. Thank you!

sfc-gh-jcieslak · 2024-04-26T13:54:50Z

Hey 👋
From our perspective, I think the state drift could be missed for either of two reasons:

1. An invalid Read operation. In the Read operation (which is run on terraform plan) we query Snowflake and detect whether any changes were made. If any value differs from the value in the state file, Terraform plans an Update operation on this resource to update it on the Snowflake side.
2. A diff suppress that is too broad. It is common to write a DiffSuppress function (docs) so that no Update is planned even when the field changed on the Snowflake side. As shown in the docs, this is typical for cases like ignoring differences in letter case, but it can also be used to handle more sophisticated cases. If not used correctly, it can lead to situations where an Update is needed, but the DiffSuppress function cancels it on every terraform plan.

Those are the two main cases I can think of, but there may be more complex ones that prevent a resource from producing a plan. Soon, we'll be refactoring all GA resources, which means every Read function will be refactored and every DiffSuppress checked. The refactor should result in more stable resources and get rid of the undetected drift, so this issue should gradually be fixed as we refactor resource by resource.
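The two failure modes above can be sketched in plain Go. This is a simplified illustration using only the standard library, not the provider's actual code: `resourceState`, `lookup`, `readBuggy`, `readFixed`, `suppressCase`, and `suppressTooBroad` are all hypothetical names. In the real terraform-plugin-sdk v2, a Read conventionally calls `d.SetId("")` when the object no longer exists, and a diff suppress is a `DiffSuppressFunc` with the signature `func(k, old, new string, d *schema.ResourceData) bool`; the sketch only mimics those ideas.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNotFound stands in for "object does not exist" from the remote system.
var errNotFound = errors.New("object not found")

// resourceState is a simplified stand-in for the SDK's resource data.
type resourceState struct {
	ID      string
	Comment string
}

// lookup is a hypothetical remote query; here the remote is just a map.
func lookup(remote map[string]string, id string) (string, error) {
	c, ok := remote[id]
	if !ok {
		return "", errNotFound
	}
	return c, nil
}

// readBuggy swallows the not-found error and leaves state untouched,
// so a dropped object never shows up as drift.
func readBuggy(s *resourceState, remote map[string]string) error {
	c, err := lookup(remote, s.ID)
	if err != nil {
		return nil
	}
	s.Comment = c
	return nil
}

// readFixed clears the ID when the object is gone, mirroring how a Read
// signals "no longer exists" so that the next plan can propose a create.
func readFixed(s *resourceState, remote map[string]string) error {
	c, err := lookup(remote, s.ID)
	if errors.Is(err, errNotFound) {
		s.ID = ""
		return nil
	}
	if err != nil {
		return err
	}
	s.Comment = c
	return nil
}

// suppressCase hides a diff only when the values differ in letter case.
func suppressCase(oldVal, newVal string) bool {
	return strings.EqualFold(oldVal, newVal)
}

// suppressTooBroad hides every diff once a value exists, so real changes
// are cancelled on every plan.
func suppressTooBroad(oldVal, _ string) bool {
	return oldVal != ""
}

func main() {
	remote := map[string]string{} // assumption: the role was dropped manually
	a := &resourceState{ID: "READER"}
	b := &resourceState{ID: "READER"}
	_ = readBuggy(a, remote)
	_ = readFixed(b, remote)
	fmt.Printf("buggy Read keeps ID %q, fixed Read leaves ID %q\n", a.ID, b.ID)
	fmt.Println("case-only suppressed:", suppressCase("analyst", "ANALYST"))
	fmt.Println("real change suppressed by broad func:", suppressTooBroad("analyst", "admin"))
}
```

The design point is that both bugs fail silently: the plan reports no changes, which is indistinguishable from a resource that has not drifted.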

sfc-gh-jcieslak · 2024-04-29T11:42:30Z

To add to that: in more recent versions (0.71.0 is a pretty old one) we started using a new SDK to communicate with Snowflake, which eliminated a lot of bugs that occurred in previous versions. Also, would you be able to point out which resources specifically are causing issues for you, so we can take a closer look at them?

Bryan-Meier · 2024-04-29T15:07:02Z

@sfc-gh-jcieslak, I can do some tests to determine if this is still an issue since we are now on version 0.88.0.

sfc-gh-jcieslak · 2024-05-07T05:46:42Z

Please do if you can. As you can see from the current roadmap, we'll now be focusing on refactoring GA objects with the points I listed above in mind. I expect that by the end of this refactoring there should be little to no unexplained state drift, making the provider more stable.

Bryan-Meier · 2024-05-07T15:11:30Z

Hi @sfc-gh-jcieslak, I will close this issue for now as I have done some minimal testing and everything appears to be ok. If I find an issue with drift going forward I will create a new issue and refer back to this one. Thanks for the help and attention on this one!

Bryan-Meier added the bug label (used to mark issues with the provider's incorrect behavior) Sep 25, 2023

Bryan-Meier closed this as completed May 7, 2024

sfc-gh-jcieslak added the category:other label May 20, 2024
