Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snowflake_file_format update-in-place doesn't apply #2385

Closed
alexmaras opened this issue Jan 22, 2024 · 27 comments
Closed

snowflake_file_format update-in-place doesn't apply #2385

alexmaras opened this issue Jan 22, 2024 · 27 comments
Labels
bug Used to mark issues with provider's incorrect behavior

Comments

@alexmaras
Copy link

alexmaras commented Jan 22, 2024

Terraform CLI and Provider Versions

Terraform v1.6.5
on linux_amd64
+ provider registry.terraform.io/snowflake-labs/snowflake v0.82.0

Terraform Configuration

resource "snowflake_file_format" "csv" {
    binary_format                  = "HEX"
    compression                    = "AUTO"
    database                       = "RAW"
    date_format                    = "AUTO"
    empty_field_as_null            = true
    encoding                       = "UTF8"
    error_on_column_count_mismatch = true
    escape                         = "NONE"
    escape_unenclosed_field        = "NONE"
    field_delimiter                = ","
    field_optionally_enclosed_by   = "\""
    format_type                    = "CSV"
    id                             = "RAW|EXAMPLE_SCHEMA|PROCESS_CSV"
    name                           = "PROCESS_CSV"
    null_if                        = [
        "NULL",
    ]
    parse_header                   = false
    record_delimiter               = <<-EOT
        
    EOT
    replace_invalid_characters     = false
    schema                         = "EXAMPLE_SCHEMA"
    skip_blank_lines               = false
    skip_byte_order_mark           = true
    skip_header                    = 1
    time_format                    = "AUTO"
    timestamp_format               = "AUTO"
    trim_space                     = false
}

Expected Behavior

Updating a field in the file_format resource and re-applying updates the file format in snowflake

Actual Behavior

The update does not change anything in snowflake, and terraform picks up the change as needing to be applied every time

Steps to Reproduce

  1. Create a snowflake_file_format resource
  2. terraform apply
  3. Update the resource, change the field_optionally_enclosed_by field for instance (not specific to this field) to NONE
  4. terraform apply

How much impact is this issue causing?

Medium

Logs

No response

Additional Information

Workaround at the moment is to taint the resource in terraform and allow it to be destroyed and recreated.

@alexmaras alexmaras added the bug Used to mark issues with provider's incorrect behavior label Jan 22, 2024
@sfc-gh-asawicki
Copy link
Collaborator

Hey @alexmaras. Thanks for reporting the issue.

Could you provide the debug logs (TF_LOG=DEBUG) showing that the update is not taking place?

@justin-ramirez-gametime

We are having a similar issue:
Terraform v1.7.0 on darwin_arm64

  • provider registry.terraform.io/snowflake-labs/snowflake v0.84.0

These are included on the plan every time:
Screenshot 2024-01-22 at 1 45 46 PM

However the state does not show any change after this plan is applied:
image
image

To add to the complexity of this issue, this problem only persists in our staging and testing workspaces but not in Production. We speculated that this was because we imported these in production and the values were already set to true.

@alexmaras
Copy link
Author

hi @sfc-gh-asawicki - thanks for the response. @justin-ramirez-gametime - the problem seems identical here.

Attached are the logs from two applies that happened directly one after the other, both reporting success, with no change in snowflake.

tf-output.txt.gz
tf-output-second.txt.gz

Here is the output from terrafrom apply too.

run-1.txt
run-2.txt

@sfc-gh-asawicki
Copy link
Collaborator

Thanks for the logs @alexmaras, @justin-ramirez-gametime. I will analyze this issue this week.

@raulbonet
Copy link
Contributor

Hello,

We had the same problem; I think I found the issue; created an MR with the fix.

@justin-ramirez-gametime
Copy link

Any news on this?

sfc-gh-asawicki added a commit that referenced this issue Feb 8, 2024
ref
#2385

I detected 3 bugs:

1. `d.Get("format_type")` has to be casted first to type
`sdk.FileFormatType` in order to be compared with another
FileFormatType, if not, the comparison is always `False`.

2. The attribute `comment` was not part of the Update() function

3. The default init value of a pointer is `null`, this means we need to
account for this case


Apart from that:
My trial account expired today and I could not run the Acceptance Tests,
but I manually tested it and it seems to work

---------

Co-authored-by: Raül Bonet <[email protected]>
Co-authored-by: Artur Sawicki <[email protected]>
@sfc-gh-asawicki
Copy link
Collaborator

I have just merged the change @raulbonet mentioned in his previous comment. We plan to release the new version in the middle of the next week (Wednesday or Thursday).

@sfc-gh-asawicki
Copy link
Collaborator

sfc-gh-asawicki commented Feb 16, 2024

@alexmaras @justin-ramirez-gametime the aforementioned PR was released as part of 0.86.0. Could you please check if this solves the issue for you?

@benriou
Copy link

benriou commented Feb 21, 2024

hello

I'm still facing the same issue in 0.86.0 (tested today) where escape_unenclosed_field is always planned for update-in-place with no effect.

@raulbonet
Copy link
Contributor

Hello @benriou
Can you please post the terraform configuration and the plan so I can debug?

@benriou
Copy link

benriou commented Feb 21, 2024


resource "snowflake_file_format" "loadbalancer_logs_XXX_dev" {
  name                = "LOADBALANCER_LOGS"
  database            = "DEV"
  schema              = "PUBLIC"
  format_type         = "CSV"
  binary_format       = "HEX"
  date_format         = "AUTO"
  time_format         = "AUTO"
  timestamp_format    = "AUTO"
  empty_field_as_null = true

  error_on_column_count_mismatch = true

  escape                  = "NONE"
  escape_unenclosed_field = "\\\\"
  encoding                = "UTF8"


  field_optionally_enclosed_by = "\""
  record_delimiter             = "\n"
  field_delimiter              = " "
  skip_header                  = 0
  skip_blank_lines             = true
  skip_byte_order_mark         = true
  null_if                      = ["\\\\N"]

  compression = "AUTO"
}
  # snowflake_file_format.loadbalancer_logs_XXX_dev will be updated in-place
  ~ resource "snowflake_file_format" "loadbalancer_logs_XXX_dev" {
      ~ escape_unenclosed_field        = "\\" -> "\\\\"
        id                             = "DEV|PUBLIC|LOADBALANCER_LOGS"
        name                           = "LOADBALANCER_LOGS"
        # (32 unchanged attributes hidden)
    }

@benriou
Copy link

benriou commented Feb 21, 2024

Please let me know if you need further details @raulbonet ; thank you

@raulbonet
Copy link
Contributor

Thanks, I will have a look later today

@raulbonet
Copy link
Contributor

Hello @benriou
I had a look. I think that the problem is just that \\\\ is not a valid value for this field, but Snowflake instead of raising an exception, does nothing. If you read the documentation here , section ESCAPE_UNENCLOSED_FIELD:

A singlebyte character string used as the escape character for unenclosed field values only. An escape character invokes an alternative interpretation on subsequent characters in a character sequence. You can use the ESCAPE character to interpret instances of the FIELD_DELIMITER or RECORD_DELIMITER characters in the data as literals. The escape character can also be used to escape instances of itself in the data.

Accepts common escape sequences, octal values, or hex values.

But \\\\ cannot be expressed as a single byte sequence, nor is a "common escape sequence".

If you manually try to execute the statement in Snowflake GUI:
ALTER FILE FORMAT "DEV"."PUBLIC"."LOADBALANCER_LOGS" SET ESCAPE_UNENCLOSED_FIELD = '\\\\'

And then query the file format:
SHOW FILE FORMATS LIKE 'LOADBALANCER_LOGS' IN SCHEMA "DEV"."PUBLIC";

You will see that, indeed, the character has not changed.

I tried with other characters, like "A" or "a" and in this case, it gets replaced by \n, and still not raising an exception.

@benriou
Copy link

benriou commented Feb 26, 2024

thank you @raulbonet ;

I thought I had to escape the backslashes.

This is working like a charm since I configured the field to \\ ; the change is correctly propagated to the FILE_FORMAT.

@raulbonet
Copy link
Contributor

Glad to know it worked! :)

@sfc-gh-asawicki
Copy link
Collaborator

@alexmaras @justin-ramirez-gametime could you also confirm that it solved your issue, please?

@alexmaras
Copy link
Author

thanks @sfc-gh-asawicki - I'll check it out tomorrow and report back.

@alexmaras
Copy link
Author

Hey @sfc-gh-asawicki - The issue I'm hitting now is similar to the above, but I have no way of solving it.

I have some file formats defined in Snowflake that have a character as their optional enclosing value - in this case double quotes ("). I need to change those file formats to have no optional enclosing value - i.e. change it to NONE. If I use NONE, I now get an error - so this is an improvement - that says:

│ Error: CSVFieldOptionallyEnclosedBy must be one of [null ' "]
│ 
│   with module.ingest_seer_data["study_label_groups"].snowflake_file_format.csv[0],
│   on modules/snowpipe-integration/file_format.tf line 31, in resource "snowflake_file_format" "csv":
│   31: resource "snowflake_file_format" "csv" {
│ 

However, if I use null, terraform just doesn't report any changes at all. The file format does need updating, but it refuses to plan it because it's using the defaults.

@raulbonet
Copy link
Contributor

Thanks @alexmaras , I will have a look tomorrow (EMEA timezone)

@alexmaras
Copy link
Author

alexmaras commented Feb 28, 2024

Thanks @raulbonet - I imagine NONE will have to be added as a special option to prevent terraform from swallowing the change. I'll also have a poke in the go code and see if I can get a handle on where it happens.

This is the error I'm hitting:

validEnclosedBy := []string{"null", "'", `"`}

So I guess at the very least, NONE would have to be added to that list, but I don't know how that'll pass through to snowflake. I'll see if I can build and test a change like that in the meantime, but I imagine I might hit an issue with it being wrapped in quotes or similar.

As far as I understand it, any single character (or the keyword NONE) should be valid, so this might just be something worth making less strict in general.

@raulbonet
Copy link
Contributor

Hello @alexmaras
I had a look, and indeed you are right: the validate() should account for the NONE use-case. I am not sure if the intention of the original developer was to use null as NONE or indeed check for null, but the latter is not needed, since the validate function already has the condition valueSet(opts.CSVFieldOptionallyEnclosedBy).

Anyway, I will create a PR with this small change.

@raulbonet
Copy link
Contributor

@sfc-gh-asawicki , I notice a couple of issues here, just bringing it to your attention:

  1. NONE is a Snowflake reserved keyword that afaik, currently the SQLBuilder does not support. So, the statement executed by Snowflake is:

ALTER ...... = 'NONE', when it should be ALTER ...... = NONE

It seems that both seem to work in this case, but not sure if this is the case for absolutely all resources in Snowflake.

  1. I am noticing a non-idempotent operation here: if the resource is being created, Snowflake will default to whatever default value it has. However, if the resource is being altered, Snowflake will just keep the existing value. The only way to make this idempotent would be recreate the resource when a value is set to null, so it defaults to Snowflake's default value.

I guess we do not want to address this in this PR, but this probably affects all resources. I wonder if we want to account for this use-case or at least track it in the backlog.

@sfc-gh-asawicki
Copy link
Collaborator

@raulbonet Thanks, I will take a look tomorrow.

@sfc-gh-asawicki
Copy link
Collaborator

Hey @raulbonet. My answers below:

ad 1
I will note it down, we should run it with NONE, not 'NONE'.

ad 2
This is one of the topics we are aware of and that we will address during resources redesign (https://github.com/Snowflake-Labs/terraform-provider-snowflake/blob/main/ROADMAP.md#our-roadmap). One of the PRs that shows how we can handle this is #2502. We are of course limited only to objects where we can easily access the default snowflake value. For the cases where this is not possible, we will just encourage the user to fill in the value (or even force the user to do it by requiring it).

sfc-gh-asawicki added a commit that referenced this issue Mar 5, 2024
ref #2385 

As pointed out by https://github.com/alexmaras , the `validate()`
function does not properly validate the use-case where
field_optionally_enclosed_by is None.

---------

Co-authored-by: Artur Sawicki <[email protected]>
@alexmaras
Copy link
Author

Thanks @raulbonet for helping!

0.87.1 has solved the issue, and I can apply file format updates successfully, including "NONE"!

image

Resolved by #2575 and released in 0.87.1

@justin-ramirez-gametime

@sfc-gh-asawicki version 0.87.1 fixed the issue we were having

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Used to mark issues with provider's incorrect behavior
Projects
None yet
Development

No branches or pull requests

5 participants