Resource drift not being recognized #2073

Closed
Bryan-Meier opened this issue Sep 25, 2023 · 5 comments
Labels: bug (Used to mark issues with provider's incorrect behavior), category:other

Comments

@Bryan-Meier

Bryan-Meier commented Sep 25, 2023

Provider Version

0.71.0
If it helps, this behavior has been present since the provider was maintained under chanzuckerberg.

Terraform Version

1.5.5

Describe the bug

For whatever reason, when we run a plan, it does not pick up some instances of drift. In our case, once it stops detecting the drift during plans, it never recognizes that drift again. The only way I have been able to force alignment on the resources that have this issue is by running "terraform apply -replace=xyz -target=abc". This is especially frustrating when there are hundreds of resources whose drift is not being detected.

Expected behavior

When I run a plan, I would expect any drift that has happened between the last plan and the current one to be picked up and shown as part of the current plan.

Code samples and commands

NA

Additional context

The only cause I can think of is our use of modules. A simple example: we have a module that is responsible for creating a schema, a set of roles, and the privileges of those roles for the given schema. This module has required variables for the database name and schema name. Based on those inputs, the schema is created, along with some roles whose privileges are grants on the schema. There is an output for each of the roles and for the schema that was created.

The module mentioned is called from our main.tf, where a for_each argument is used to iterate over a list of schemas. For some reason, when there is drift in the form of a role belonging to the schema being dropped, the plan does not pick that up and attempt to create the role. It's like the comparison between the state and the actual infrastructure isn't happening for some objects. I would assume that if a resource was originally created by TF and then dropped manually by a user, TF would find that it's missing on the next plan, but that doesn't seem to be the case.

Please let me know if more information is needed and I will do my best to fill in the gaps. Thank you!

@Bryan-Meier Bryan-Meier added the bug Used to mark issues with provider's incorrect behavior label Sep 25, 2023
@sfc-gh-jcieslak
Collaborator

Hey 👋
From our perspective, I think the state drift could be missed in either of two ways:

  • Having an invalid Read operation. In the Read operation (which is run on terraform plan) we query Snowflake and detect whether any changes were made. If any value is different from the value in the state file, then Terraform will plan the Update operation on this resource to update it on the Snowflake side.
  • Having too broad a diff suppress. It is common to write a DiffSuppress function (docs) to not plan the Update even when the field changed on the Snowflake side. As shown in the docs, it is common in cases like skipping case-only differences, but it could be used to handle more sophisticated cases. If not used right, it can lead to cases where an Update is needed, but the DiffSuppress function cancels it on every terraform plan.

Those are the two main cases I can think of, but there may be more complex causes that keep a resource from producing a plan. Soon, we'll be working on refactoring all GA resources, which means every Read function will be refactored and every DiffSuppress checked. The refactor should result in more stable resources and get rid of the undetected drift, so this issue should gradually be fixed as we refactor resource by resource.
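
To make the two failure modes above concrete, here is a minimal, self-contained Go sketch. It does not use the provider's code or the Terraform plugin SDK: the `remote` and `state` types and all function names are hypothetical stand-ins, and the suppress functions use a simplified two-argument signature rather than the SDK's. The only SDK behavior assumed is the convention that a Read which finds the object missing clears the resource ID (`d.SetId("")` in terraform-plugin-sdk) so that the next plan proposes a create.

```go
package main

import (
	"fmt"
	"strings"
)

// remote is a hypothetical stand-in for what Snowflake actually contains:
// role name -> comment.
type remote map[string]string

// state is a simplified stand-in for one resource entry in Terraform state.
// An empty ID means "this resource no longer exists".
type state struct {
	ID      string
	Comment string
}

// readBroken ignores the not-found case, so a role that was dropped
// manually keeps its state entry and no drift is ever planned.
func readBroken(r remote, s *state) {
	if c, ok := r[s.ID]; ok {
		s.Comment = c
	}
	// Bug: nothing happens when the role is missing.
}

// readFixed clears the ID when the object is gone, mirroring the
// d.SetId("") convention, so the next plan would recreate the role.
func readFixed(r remote, s *state) {
	c, ok := r[s.ID]
	if !ok {
		s.ID = ""
		return
	}
	s.Comment = c
}

// suppressTooBroad hides every difference, so real changes go unreported.
func suppressTooBroad(oldVal, newVal string) bool { return true }

// suppressCaseOnly hides only differences in letter case.
func suppressCaseOnly(oldVal, newVal string) bool {
	return strings.EqualFold(oldVal, newVal)
}

func main() {
	r := remote{} // the role was dropped manually on the Snowflake side
	a := state{ID: "ANALYST"}
	b := a

	readBroken(r, &a)
	readFixed(r, &b)
	fmt.Println("broken read still tracks role:", a.ID != "")
	fmt.Println("fixed read marks role missing:", b.ID == "")

	fmt.Println("too broad suppresses real change:", suppressTooBroad("READER", "WRITER"))
	fmt.Println("case-only suppresses real change:", suppressCaseOnly("READER", "WRITER"))
	fmt.Println("case-only suppresses case change:", suppressCaseOnly("reader", "READER"))
}
```

In this sketch, the broken Read matches the symptom described in the issue (a manually dropped role that never shows up in a plan), and the too-broad suppress function shows how a legitimate difference can be silently cancelled on every plan. Whether either is the actual cause in version 0.71.0 is not established in this thread.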

@sfc-gh-jcieslak
Collaborator

To add more: in more recent versions (0.71.0 is a pretty old one) we started to use a new SDK to communicate with Snowflake, which eliminated a lot of bugs that were happening in previous versions. Also, would you be able to point out which resources specifically are causing issues for you, so we could take a closer look at them?

@Bryan-Meier
Author

Bryan-Meier commented Apr 29, 2024

@sfc-gh-jcieslak, I can do some tests to determine if this is still an issue since we are now on version 0.88.0.

@sfc-gh-jcieslak
Collaborator

Please do if you can. As you can see from the current roadmap, though, we'll now be focusing on refactoring GA objects with the points I listed above in mind. I'm assuming that by the end of this refactoring there should be little to no strange state drift, making the provider more stable.

@Bryan-Meier
Author

Hi @sfc-gh-jcieslak, I will close this issue for now, as I have done some minimal testing and everything appears to be OK. If I find an issue with drift going forward, I will create a new issue and refer back to this one. Thanks for the help and attention on this one!
