Automatically cache zh and h1 hash values for all downloaded providers #27811

sergei-ivanov · 2021-02-18T01:44:57Z

Current Terraform Version

v0.14.6

Use-cases

The dependency locking mechanism quickly turned from a highly anticipated feature into a major pain point after we upgraded to Terraform 0.14. This is due to a combination of provider caching and running Terraform on multiple platforms (see attempted solutions and a list of related issues below).

Ideally we would want Terraform to automatically fetch all available hashes for each provider and add them to the lock file. But according to the explanation here it is currently not supported by the registry API and is unlikely to be fixed in 0.14.

So we would like to mitigate at least the performance impact of dependency locking while we don't have a more comprehensive solution.

Attempted Solutions

After each provider version upgrade we need to run terraform init -upgrade once and then terraform providers lock twice in each affected module. For cached providers, terraform init only populates a single h1 hash, which is calculated from the cached provider's package. Then the first run of terraform providers lock without parameters populates all zh hashes, and finally the second run of e.g. terraform providers lock -platform=windows_amd64 -platform=darwin_amd64 adds h1 hashes for the additional platforms.
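The sequence above can be scripted so that it runs the same way in every module. The following is only a minimal sketch: the helper names (lock_commands, relock_modules) and the list of module directories are hypothetical, and only the three terraform invocations themselves come from the description above.

```python
import subprocess
from pathlib import Path


def lock_commands(extra_platforms):
    """Return the three Terraform invocations described above, in order."""
    platform_flags = [f"-platform={p}" for p in extra_platforms]
    return [
        # 1. Upgrade providers; for cached providers this records a single h1 hash.
        ["terraform", "init", "-upgrade"],
        # 2. First lock run without parameters, which populates the zh hashes.
        ["terraform", "providers", "lock"],
        # 3. Second lock run, adding h1 hashes for the additional platforms.
        ["terraform", "providers", "lock", *platform_flags],
    ]


def relock_modules(module_dirs, extra_platforms, run=subprocess.run):
    """Run the full sequence in each module directory, stopping on the first failure."""
    for module_dir in module_dirs:
        for argv in lock_commands(extra_platforms):
            run(argv, cwd=Path(module_dir), check=True)
```

A call such as relock_modules(["modules/network"], ["windows_amd64", "darwin_amd64"]) would then reproduce the manual steps for one (hypothetical) module directory. It does nothing to avoid the repeated downloads described below.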

The main problem with the terraform providers lock command is that it never caches its results anywhere, and it never cooperates with the provider cache to check whether a provider has already been downloaded. Instead, it always fetches the file with SHA256 signatures (from which the zh hashes originate), then downloads the provider package for each requested platform and calculates the h1 hash for it. Finally, it discards all downloaded files and never persists the hashes anywhere apart from the .terraform.lock.hcl file. As a result, when locking providers for 3 different platforms in each of 10 modules with 3 providers per module, terraform providers lock will download and throw away provider packages 3×10×3=90 times. Given the typical size of a provider package, that can add up to several GB of needlessly fetched data, and the operation takes a long time over an average internet connection.
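To make the arithmetic concrete, here is a small sketch comparing the number of package downloads with and without a cache. The function names are made up for illustration, and the cached figure rests on an assumption that is not stated above: that all 10 modules use the same 3 providers at the same versions.

```python
def downloads_without_cache(platforms, modules, providers_per_module):
    """Every module re-downloads every provider package for every platform."""
    return platforms * modules * providers_per_module


def downloads_with_cache(platforms, distinct_providers):
    """Each distinct provider package is fetched once per platform, then reused."""
    return platforms * distinct_providers


# Numbers from the example above: 3 platforms, 10 modules, 3 providers per module.
print(downloads_without_cache(3, 10, 3))  # 90

# ASSUMPTION: all modules share the same 3 provider versions.
print(downloads_with_cache(3, 3))  # 9
```

Under that assumption the number of downloads no longer grows with the number of modules, which is the point of the proposal below.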

Unfortunately, since the new lock file feature cannot be turned off, we are facing a dilemma: either reduce the overhead somehow, or permanently ban the lock files through .gitignore, so that they are never shared.

Proposal

Since signatures and hashes, like provider packages themselves, are considered to be immutable, there does not seem to be any good reason why the signatures couldn't also be cached in some file inside ~/.terraform.d. Both terraform init and terraform providers lock should cooperate with provider cache, instead of fighting it or ignoring it. Every time init or lock runs, it should check the cached signatures first. If it needs to download (and store in a local cache) a new version of a provider (or a package for a different platform), it also needs to update the signature cache with whatever it has just downloaded and/or calculated. Even if in the future the registry protocol allows querying for h1 hashes without downloading the provider package, storing those responses in cache will still make sense for performance reasons.
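As a rough illustration of the idea, a signature cache could be as simple as a file keyed by provider, version and platform that is consulted before any download. This is a sketch only and not how Terraform works today: the HashCache class, the JSON file format, the key layout and the download_and_hash callback are all hypothetical.

```python
import json
from pathlib import Path


class HashCache:
    """Hypothetical on-disk cache of provider hashes (not an existing Terraform feature)."""

    def __init__(self, path):
        self.path = Path(path)
        # Load previously stored hashes, or start empty if the file does not exist yet.
        self.entries = json.loads(self.path.read_text()) if self.path.exists() else {}

    @staticmethod
    def _key(provider, version, platform):
        return f"{provider}/{version}/{platform}"

    def get(self, provider, version, platform):
        """Return the cached list of hashes, or None on a cache miss."""
        return self.entries.get(self._key(provider, version, platform))

    def put(self, provider, version, platform, hashes):
        """Store hashes and persist the whole cache to disk."""
        self.entries[self._key(provider, version, platform)] = sorted(hashes)
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.entries, indent=2))


def hashes_for(cache, provider, version, platform, download_and_hash):
    """Check the cache first; download and hash the package only on a miss."""
    cached = cache.get(provider, version, platform)
    if cached is not None:
        return cached
    # download_and_hash is a hypothetical callback standing in for the
    # "fetch package, compute h1/zh hashes" step.
    cache.put(provider, version, platform, download_and_hash(provider, version, platform))
    return cache.get(provider, version, platform)
```

Because hashes for a given provider, version and platform are treated as immutable, entries in such a cache would never need invalidation; the sketch only ever adds to the file.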

For backward compatibility, this feature may need to be disabled by default and enabled via a new configuration key in ~/.terraformrc. But enabling the provider cache should always imply enabling signature cache, because otherwise the entire integration with the cache is broken.

References


sergei-ivanov · 2021-02-23T02:58:04Z

A little update, because I've just seen my weekly firewall report. Running terraform providers lock across ~100 modules in order to proactively lock hashes for 3 platforms (Linux, macOS and Windows) generated about 25 GB of traffic to the HashiCorp CDN. This is insane and not sustainable in the long term (especially given that we prefer tracking the latest provider releases and upgrade often).

sergei-ivanov · 2021-09-07T12:07:35Z

@apparentlymart @jbardin sorry for tagging you directly, but this issue does not seem to be getting any traction, while the problem is still there, as evidenced by the large number of linked issues. The lack of checksum caching has grown into a painful problem, because I have to look after an internal repository of 200+ Terraform modules, and any mass provider upgrade becomes a challenge. I am working from home on a typical residential broadband connection, and when locking 200+ modules requires downloading something like 25 GB and waiting for a few hours, it's an indication that something is not right (because with proper caching it would have required only a few minutes).

I would really appreciate it even if you just left a comment on the proposal in this issue.

tmccombs · 2022-06-07T15:29:22Z

This problem doesn't require using multiple platforms. Even if you use multiple hosts with the same platform (for example, all Linux), if one person has the provider cached, running init will only generate the h1 hash, but if someone else then runs init and doesn't have that version of the provider cached, it will add all the zh hashes as well.

sergei-ivanov added the enhancement and new (new issue not yet triaged) labels Feb 18, 2021

marcoreni mentioned this issue Mar 23, 2021

Caching not usable in 0.14.x due to lock file checksums #27769

Closed

secustor mentioned this issue Apr 10, 2021

Do not download and index already mirrored providers #28333

Closed

sergei-ivanov mentioned this issue Oct 1, 2021

feat: Add new hook for terraform providers lock operation antonbabenko/pre-commit-terraform#173

Merged

apparentlymart mentioned this issue Oct 20, 2021

feature: ability to disable the dependency lockfile #29760

Closed

jbardin mentioned this issue Oct 22, 2021

terraform init does not populate .terraform.lock.hcl with hashes for all platforms #29794

Closed

acdha mentioned this issue Nov 10, 2021

Provide a way to configure platforms for Dependency lock file generation #28627

Open

tmccombs mentioned this issue Jun 6, 2022

Lock file generation is inconsistant #31194

Closed

wyardley mentioned this issue Aug 16, 2022

Ability to disable lockfiles #31533

Open

jsyrjala mentioned this issue Nov 21, 2023

terraform providers lock should use the plugin cache #33837

Closed
