
call [docker] with auth #9803

Merged · 9 commits · Dec 31, 2023

Conversation


@chris48s chris48s commented Dec 11, 2023

Closes #9342

Just to summarise the reason why I am doing this:

  • DockerHub imposes a rate limit of 180 reqs/minute for anonymous usage
  • We increased the max-age on DockerHub badges to try and stay under it. We are now up to 4 hours
  • Even caching the badges for 4 hours, we are still sometimes hitting the limits
  • By using auth, we can increase our rate limit to 600 reqs/minute, which would be a big help
  • DockerHub's auth pattern is a bit complicated and none of our existing auth helpers cover it
  • Docker is one of our most popular services

So the point of this PR is to:

  • Add a JWT auth helper
  • Modify the docker badges to use it
    so we can..
  • Set DockerHub tokens in production

I have attempted to do this in a relatively general way, but at the moment only one service uses this type of auth, so there may be assumptions that turn out to be docker-specific. If another service adopts this pattern, we might find we need to modify it a bit.

Still TODO:

  • Will this also work for https://cloud.docker.com? Does it matter?
  • Write unit tests for JWT AuthHelper (might do some refactoring to support this)
  • Write specs for docker service auth
  • Create and set credentials in production/keybase

@chris48s chris48s added operations Hosting, monitoring, and reliability for the production badge servers service-badge New or updated service badge labels Dec 11, 2023

github-actions bot commented Dec 11, 2023

Warnings
⚠️ This PR modified the server but none of its tests.
That's okay so long as it's refactoring existing code.
Messages
📖 ✨ Thanks for your contribution to Shields, @chris48s!
📖 Thanks for contributing to our documentation. We ❤️ our documentarians!

Generated by 🚫 dangerJS against 5666e88

import { parseJson } from './json.js'
import validate from './validate.js'

let jwtCache = Object.create(null)
chris48s (Member Author) commented on this code:

If we don't already have a JWT, we need to obtain one. That is one round-trip on the network. Then once we've got one, we can use it to make the API call we actually want to make to get some data to put on a badge, which is a second round-trip on the network.

In order to avoid making 2 requests for every single docker badge, I have implemented a rudimentary cache using a module-level object.

After we've done the OAuth dance, DockerHub issues a JWT with an expiry date on it. At the moment, this is 1 month but in principle that could change.

Realistically, it is uncommon for us to run a single VM instance for a full month; we tend to deploy every week or two. In practice this means each VM we boot will make two round-trips on the network the first time it processes a request for a docker badge, and all subsequent requests for the lifetime of that VM can use the cached token. In case a single VM does run for a month, or Docker shortens the expiry on tokens, I have implemented proper token-expiry handling anyway.

I could have implemented this so we store a single cached JWT in postgres, shared between VMs, instead of each VM maintaining its own cache, but I think staying stateless is preferable. It also means that self-hosting users who want to use DockerHub auth to allow badges for private repos don't have to run a postgres server just to store one string. That would be annoying.
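The module-level cache described above could be sketched roughly like this. The function names (`getJwt`, `_getJwtExpiry`, `_isJwtValid`) are illustrative, not the actual Shields implementation:

```javascript
// Module-level cache: one token per upstream host, kept for the VM's lifetime.
const jwtCache = Object.create(null)

// Hypothetical helper: decode the `exp` claim from a JWT's payload segment.
function _getJwtExpiry(jwt) {
  const payload = JSON.parse(
    Buffer.from(jwt.split('.')[1], 'base64').toString('utf8'),
  )
  return payload.exp // seconds since epoch
}

function _isJwtValid(jwt, nowSeconds = Date.now() / 1000) {
  // Treat tokens expiring within 60s as already invalid, to avoid races.
  return _getJwtExpiry(jwt) - 60 > nowSeconds
}

async function getJwt(host, fetchNewJwt) {
  // Only pay the extra network round-trip when we don't hold a valid token.
  if (!(host in jwtCache) || !_isJwtValid(jwtCache[host])) {
    jwtCache[host] = await fetchNewJwt()
  }
  return jwtCache[host]
}
```

With this shape, the first badge request per VM triggers the OAuth dance and every later request reuses the cached token until it nears expiry.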


const resp = await fetch(serviceInstance, {})

expect(serviceInstance.authHelper.withJwtAuth.calledOnce).to.be.true
chris48s (Member Author) commented on this code:

All I am doing is asserting that the fetch method calls the service class's withJwtAuth method, but I think that is all we care about. I don't think more elaborate mocks actually buy us much here.

@chris48s

I still need to make credentials and set them, but the code is now ready for review

@chris48s chris48s changed the title from "WIP call [docker] with auth" to "call [docker] with auth" Dec 17, 2023
@chris48s chris48s marked this pull request as ready for review December 17, 2023 18:50
@PyvesB

PyvesB commented Dec 30, 2023

What happens if Docker decides to revoke the token before its expiry? My understanding is that we'll keep on trying to use the token until either we've restarted our server or the initially-advertised expiry date has been reached. If that's indeed the case, this doesn't seem ideal.

I'd personally even be in favour of ignoring the advertised expiry date, and only clearing the cache when we get an auth error. We'd suffer from an extra round trip once in a blue moon for not proactively requesting the new token, but on the other hand we'd save the call to _isJwtValid on every badge render.
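The lazy-invalidation idea could be sketched roughly like this (a hypothetical wrapper, not actual Shields code): ignore the advertised expiry, and only discard the cached token when a request comes back with an auth error, retrying once.

```javascript
// Hypothetical sketch: drop the cached token on an auth error and retry once.
// `doFetch` performs the upstream request with the currently cached token;
// `invalidateJwt` clears the cache so the next call fetches a fresh token.
async function fetchWithJwtRetry(doFetch, invalidateJwt) {
  let resp = await doFetch()
  if (resp.status === 401 || resp.status === 403) {
    invalidateJwt() // the token was stale or revoked
    resp = await doFetch() // one retry; a new token is obtained lazily
  }
  return resp
}
```

The extra round-trip only happens on the rare request that hits an expired or revoked token.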

@chris48s

Yeah you're right. If a token was revoked before its given expiry, we'd just keep trying it. That's an interesting point I hadn't considered.

I guess to account for that we would need to expose a way to allow the service class to invalidate the cached token (and ideally retry the request) if we get a 401 or 403 back from DockerHub (or whatever the code is - TODO: check).

That said, ideally I'd like to do that without leaking too much of the implementation details of the auth into the service class. i.e: if a contributor wants to add a new badge that gets data from dockerhub, that should ideally be abstracted away as much as possible.

Another thing we could do is set our own much shorter expiry on the token: for example 1 hour, even if the expiry on the token is 1 month. We'd re-auth more frequently, but even in the event that a token does get revoked before its expiry (which seems like an exceptional case rather than something we'd expect to see often), it would fix itself after a while with no intervention from us. That might be a less conceptually correct but more pragmatic way to cover that edge case?
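The shorter self-imposed expiry could look something like this. The 1-hour cap and function name are illustrative:

```javascript
// Cap how long we trust a cached JWT, regardless of its advertised expiry.
// A token revoked upstream then "heals" within the cap window at worst.
const MAX_JWT_AGE_SECONDS = 60 * 60 // our own cap, e.g. 1 hour

function cacheUntil(advertisedExpirySeconds, nowSeconds = Date.now() / 1000) {
  // Use whichever comes first: the token's own expiry or our cap.
  return Math.min(advertisedExpirySeconds, nowSeconds + MAX_JWT_AGE_SECONDS)
}
```

A short advertised expiry is honoured as-is; a month-long one gets clamped to the cap.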

@PyvesB

PyvesB commented Dec 30, 2023

That might be a less conceptually correct but more pragmatic way to cover that edge case?

Happy to align with your proposed alternative. :)

@PyvesB PyvesB left a comment

Looks good to me now 👍🏻

@chris48s chris48s enabled auto-merge December 31, 2023 14:53
@chris48s chris48s added this pull request to the merge queue Dec 31, 2023
Merged via the queue into badges:master with commit 880c1fb Dec 31, 2023
24 checks passed
@chris48s chris48s deleted the 9342-dockerhub-auth branch December 31, 2023 15:00
@chris48s
Copy link
Member Author

We're now running with a token in prod scoped to 'public repo read-only'. I have shared the credentials with the team.

Successfully merging this pull request may close these issues.

Docker badges rate limited by upstream service