Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial implementation of key-value rate limits #6947

Merged
merged 42 commits into from
Jul 21, 2023
Merged

Conversation

beautifulentropy
Copy link
Member

@beautifulentropy beautifulentropy commented Jun 20, 2023

This design seeks to reduce read-pressure on our DB by moving rate limit tabulation to a key-value datastore. This PR provides the following:

  • (README.md) a short guide to the schemas, formats, and concepts introduced in this PR
  • (source.go) an interface for storing, retrieving, and resetting a subscriber bucket
  • (name.go) an enumeration of all defined rate limits
  • (limit.go) a schema for defining default limits and per-subscriber overrides
  • (limiter.go) a high-level API for interacting with key-value rate limits
  • (gcra.go) an implementation of the Generic Cell Rate Algorithm, a leaky bucket-style scheduling algorithm, used to calculate the present or future capacity of a subscriber bucket using spend and refund operations

Note: the included source implementation is test-only and currently accomplished using a simple in-memory map protected by a mutex, implementations using Redis and potentially other data stores will follow.

Part of #5545

@beautifulentropy beautifulentropy marked this pull request as ready for review June 20, 2023 21:08
@beautifulentropy beautifulentropy requested a review from a team as a code owner June 20, 2023 21:08
ratelimits/names.go Show resolved Hide resolved
ratelimits/limit.go Show resolved Hide resolved
ratelimits/limit.go Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limiter.go Show resolved Hide resolved
ratelimits/limiter.go Outdated Show resolved Hide resolved
ratelimits/limiter.go Outdated Show resolved Hide resolved
ratelimits/limiter.go Outdated Show resolved Hide resolved
@pgporada pgporada requested a review from a team June 21, 2023 16:00
Copy link
Contributor

@aarongable aarongable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't yet reviewed gcra.go; sending comments so you have them before I disappear into other stuff for the afternoon.

ratelimits/names.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit_test.go Outdated Show resolved Hide resolved
ratelimits/limit_test.go Outdated Show resolved Hide resolved
ratelimits/limiter.go Show resolved Hide resolved
ratelimits/limiter.go Outdated Show resolved Hide resolved
ratelimits/limiter.go Outdated Show resolved Hide resolved
ratelimits/limiter.go Outdated Show resolved Hide resolved
@beautifulentropy
Copy link
Member Author

I haven't yet reviewed gcra.go; sending comments so you have them before I disappear into other stuff for the afternoon.

When you check out gcra.go use https://brandur.org/rate-limiting#gcra instead of my old (and I now realize) wrong explainer in the design doc.

ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
ratelimits/limit.go Outdated Show resolved Hide resolved
@beautifulentropy
Copy link
Member Author

Merged main so we can have working CI again (#6997).

ratelimits/README.md Outdated Show resolved Hide resolved
ratelimits/README.md Outdated Show resolved Hide resolved
ratelimits/README.md Outdated Show resolved Hide resolved
ratelimits/README.md Show resolved Hide resolved
ratelimits/README.md Outdated Show resolved Hide resolved
ratelimits/README.md Show resolved Hide resolved
ratelimits/README.md Show resolved Hide resolved
ratelimits/README.md Show resolved Hide resolved
ratelimits/README.md Show resolved Hide resolved
ratelimits/README.md Show resolved Hide resolved
)

func Test_decide(t *testing.T) {
clk := clock.NewFake()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: You could declare these variables in each specific test scope and reference them throughout the test. Where this helped me was in all the test.AssertEquals(t, d.ResetIn, burstTime) helping to hammer home the TAT related to burst concept.

	const burst = 10
	const burstTime = (time.Second * burst)

ratelimits/limit_test.go Show resolved Hide resolved
ratelimits/gcra_test.go Show resolved Hide resolved
ratelimits/names.go Outdated Show resolved Hide resolved
ratelimits/gcra_test.go Show resolved Hide resolved
ratelimits/gcra.go Outdated Show resolved Hide resolved
ratelimits/gcra_test.go Show resolved Hide resolved
ratelimits/gcra.go Outdated Show resolved Hide resolved
ratelimits/gcra.go Outdated Show resolved Hide resolved
ratelimits/limiter.go Outdated Show resolved Hide resolved
aarongable
aarongable previously approved these changes Jul 20, 2023
ratelimits/gcra.go Outdated Show resolved Hide resolved
ratelimits/README.md Show resolved Hide resolved
ratelimits/gcra.go Show resolved Hide resolved
ratelimits/gcra_test.go Show resolved Hide resolved
@beautifulentropy beautifulentropy merged commit 055f620 into main Jul 21, 2023
@beautifulentropy beautifulentropy deleted the rate-limits-v2 branch July 21, 2023 16:57
@ringerc
Copy link

ringerc commented Jul 24, 2023

The README makes it look a bit like like this covers the NewRegistrationsPerIPAddress and NewOrdersPerAccount limits but not other limits like CertificatesPerFQDNSet (limit on renewals for one unique domain set) or CertificatesPerName (limit on non-renewal certs by eTLD+1) or other limits defined in ratelimits.Limits and ratelimits.rateLimitConfig.

The new ratelimits/names.go has CertificatesPerFQDNSetPerAccount and CertificatesPerDomainPerAccount which appears to correspond to the other ratelimits code's CertificatesPerFQDNSet and CertificatesPerName respectively (with clearer names) but these names don't seem to be used elsewhere.

So this PR as-merged was still an API PoC - right? Where a storage backend for the bucket states is still TBD and none of the exiting rate-limits code refers to this new API yet?

@aarongable
Copy link
Contributor

Correct. As this PR states, it is the "initial implementation". None of the ACME API code paths invoke this new rate limit infrastructure yet, and the "inmem" backing store (which cannot share data between multiple instances of the same service, and would drop all of its data every time that service is restarted e.g. for a deploy) is obviously only intended for testing purposes, not production use. The next steps here are to implement a closer-to-production-ready backing store, and to begin calling into this code for least one actual rate limit, so that we can see how well this system performs in practice.

beautifulentropy added a commit that referenced this pull request Oct 4, 2023
Integrate the key-value rate limits from #6947 into the WFE. Rate limits
are backed by the Redis source added in #7016, and use the SRV record
shard discovery added in #7042.

Part of #5545
beautifulentropy added a commit that referenced this pull request Nov 8, 2023
…7117)

The `Limiter` API has been adjusted significantly to both improve both
safety and ergonomics and two `Limit` types have been corrected to match
the legacy implementations.

**Safety**
Previously, the key used for looking up limit overrides and for fetching
individual buckets from the key-value store was constructed within the
WFE. This posed a risk: if the key was malformed, the default limit
would still be enforced, but individual overrides would fail to function
properly. This has been addressed by the introduction of a new
`BucketId` type along with a `BucketId` constructor for each `Limit`
type. Each constructor is responsible for producing a well-formed bucket
key which undergoes the very same validation as any potentially matching
override key.

**Ergonomics**
Previously, each of the `Limiter` methods took a `Limit` name, a bucket
identifier, and a cost to be spent/ refunded. To simplify this, each
method now accepts a new `Transaction` type which provides a cost, and
wraps a `BucketId` identifying the specific bucket.

The two changes above, when taken together, make the implementation of
batched rate limit transactions considerably easier, as a batch method
can accept a slice of `Transaction`.

**Limit Corrections**
PR #6947 added all of the existing rate limits which could be made
compatible with the key-value approach. Two of these were improperly
implemented;
- `CertificatesPerDomain` and `CertificatesPerFQDNSet`, were implemented
as
- `CertificatesPerDomainPerAccount` and
`CertificatesPerFQDNSetPerAccount`.

Since we do not actually associate these limits with a particular ACME
account, the `regID` portion of each of their bucket keys has been
removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants