Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add kid as param for client constructor, split out auth challenge creation into its own method #201

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

AlecTroemel
Copy link
Collaborator

What

  • add kid as param to client constructor
  • split out creation of authorization challenges into its own method

Why

Currently users of sewer as an API can only get a cert atomically, and are unable to control the execution of the multiple async steps required. It would be great to be able to execute and await each step of the cert acquisition process (creating challenge resources, checking challenge status) however they want since time.sleep is not always the most efficient way to wait (time.wait is actively discouraged in queue systems such as celery).

This MR is the beginnings of allowing that "per step" control

@mmaney
Copy link
Collaborator

mmaney commented Aug 13, 2020

@AlecTroemel I was confused by multiple async steps, because sewer has NO async steps. Given it uses the requests library for all queries, it is incompatible with async in its current form. So what is it exactly you're trying to do here?

Wait, let me make a wild guess - you want to use the pieces of sewer as a library at a finer grain. If so, I have good news and bad news and in-between news: I've got similar thoughts, as the big class full of globals is not a good design for a library. The drastic reworking I did on your inspirational HTTP addition was one small piece of that, I think. So the good news is that I'm not opposed to dis-integrating this great huge blob that sewer currently is. The bad news is that I don't see the "kid" business as having anything to do with that. You add it, making the already too-fat interface wider, but never use it that I can see!

The in-between news is that breaking up get_certificate is a reasonable place to start - that's clearly the last vestige of what began life as (or intended to be) a simple one certificate at a time program, much like cli.py is now. But this PR doesn't try nearly hard enough! create_authorization_challenges is still a concatentation of registering the account, creating the certificate order, and collecting the order's challenge descriptors. So it splits get_certificate up, but not in what I can see as a really coherent way - it's just temporal cohesion, which is why get_certificate was such an obvious candidate in the first place. The most obvious bit is registering - if the account key is already registered that's unnecessary.

My head is still stuck in trying to get the accumulated changes in shape for 0.8.3. I think it's getting close, but until it's ready I'm trying really hard not to think about "0.9". :-/ Without stirring up those thoughts too much yet, I can suggest you might look at the pieces of the proposed create_authorization_challenges and think about what parameters they would have to take if they were standalone functions instead of being mired in the soup of shared state that is a Client instance.

Oh, and a look at the various things that RFC8555 refers to as "objects" will give you some insight into the sort of refactoring which I am trying not to start thinking about again right this minute.

@AlecTroemel
Copy link
Collaborator Author

@mmaney thanks for taking a look at this.

by "multiple async steps" I meant during the full process of getting a new cert we need to wait for something external. For example we need to wait for dns or http challenge resources to be propagated, and we need to wait for the authorization status at lets encrypt to be valid. You nailed it right on the head with "you want to use the pieces of sewer as a library at a finer grain". Specifically I want to be able to execute each of these steps on my own timing (there's most likely some missing steps not listed here)

  1. create the challenge resources
  2. check the authorization status, or more generally check if the authorization is "valid"
  3. download the cert
  4. cleanup the challenge resources

and at each step I want to be able to control how I wait or retry when things aren't set up yet. Each of those steps has a method (or small set of methods) on the client that we can call. The one exception to that is the creation of the auth challenges, which is currently wrapped up into the get_certificate function.

  1. Nothing yet
  2. check_authorization_status
  3. send_csr and download_certificate
  4. provider.clear

that's the main thing this PR aims to fix. I think I'm ok tracking any sort of state needed between those methods (like the finalization url and challenges list).

The reason I added the kid as a param when instantiation the client is because as things currently stand I'll need to re-instantiate the client at each step listed above, but I only want to register with lets encrypt once. In fact I think the kid needs to be the same across all requests to lets encrypt during the cert acquisition process

I hope that clears up at least the intent of things 😅 . I'll think about what it would mean for things to be more functional, and maybe once you wrap up 0.8.3 things will have a clearer direction!

@mmaney
Copy link
Collaborator

mmaney commented Aug 14, 2020

get_certificate is the blob of code that's at the top of my hit list for changes when I'm ready to start really breaking compatibility. It was the ugly mess in its interaction with check_authorization_status that first drew my attention - that entire mess needs a vigorous refactoring. I refer to that refactoring as "0.9". :-)

As it happens, there's another PR, #203, that touches on some other "0.9" issues. The whole Client class is basically a namespace that holds globals for all its parts, many of which should probably be instead in several smaller objects and some [more] freestanding functions.

And yes, the entire tree of operations from new-order (creating a cert request) all the way through downloading that certificate MUST use the same key for signing the queries. And there are a few requests that have to have the key itself rather than the kid, though they may not be in that tree (RFC8555 may be no one's friend, but it's your really necessary companion) - sewer hasn't yet supported some ACME operations, eg. revocation.

@mmaney
Copy link
Collaborator

mmaney commented Aug 19, 2020

Got 0.8.3 out, finally. Looking at the things that were set aside... I think I understand what you want to do. How would it fit for you if you were talking to (off the pointy top of my head):

An AcmeServer object, initialized with the directory URL, can be used to register an account key, handles the messages back & forth, definitely tracks available nonces (something else I implemented "long ago" but set aside)... I have a half-baked notion that it might not directly provide all the protocol bits itself, but that's where the fingers meet the wind, as it were.

Oh, and it produces new Certificate objects for you. There's a pretty OOPy pile of possible objects here, which may or may not be the right way to handle this, but after going over the RFC so much it sounds obvious. A Certificate has one or more Authz which each have at least one Challenge which is about where we merge back in with parts of the current implementation to get the challenge published. More handwaving here, but this really just reifying the process that's embedded in Client in a different way.

Account keys and certificate keys are created from outside this, though there will be library functions for this and probably some of the other uttility crypto - that JWS header stuff, eg,

But as I said, this is all a lot of handwaving, at least as far as having any code for you to look at yet. Does this sound like something that would move things in the direction you're looking to go? :-)

@AlecTroemel
Copy link
Collaborator Author

@mmaney I'm not sure I completely follow what your describing. But in general I think what I'm hoping for is to pull some of the state out of the client and into the hands of the consumer of the client; either the cli, a monolith "get_cert" function, or someone using sewer as a library. Specifically, I just need to be able to control when each part of the cert acquisition process is done, and relatedly be able to store the state needed between steps however I want (not just relying on a monolithic Client to do that).

Maybe "more functional" is what I'm wanting? I think breaking up the clients "get a cert" process into smaller more independent objects would be a step in that direction. Jetstacks awesome cert-manager project may be a good place to draw inspiration from. From a bird's-eye view they break things up this way

issuers (they support other issuers besides lets-encrypt) => acme account
certificate => certificateRequest => orders => challenges

I'm not sure we need that much separation, maybe this would be a good place to start?

  • AcmeAccount: account for an account key
  • Order: Intent to get a cert. owns 1 or more challenges and has an account associated with it. Probably manages other stuff like the kid
  • Challenge: manage the life-cycle of a dns/https challenge needed for authorization

@mmaney
Copy link
Collaborator

mmaney commented Aug 25, 2020

I've been sketching out a couple things. This is still pretty barebones ...

AcmeService __init__(directory_url, request_timeout, process_timeout)
  def new_account(...) -> AcmeAccount
  def new_order(...) -> AcmeOrder
  def revoke_cert(...)
  def key_change(...)

AcmeAccount has the key, kid after registration, method jwk() (and caches value since it never changes)

AcmeOrder  __init__(location, json_response)  initialized from json in body and location header (only way to get that with LE)
  def update_order ?  checks order status, other fields that change or appear as the process proceeds
  def finalize(csr) 
  def get_cert()

It's been a few days since I made the sketch I've summarized above, and I've been supressing the urge to revise parts as I went. :-) Probably will make sense to add AcmeChallenge to hold the various bookkeeping and to make a nicer interface to the drivers than the list of dicts. Le plus ca change...

Now, if there were a simple way to switch from using requests to an actual async library... that's already got the One Obvious Place for such changes since I funnelled all of Client's ACME requests through GET & PUT methods. Of course there are all those existing drivers that (all? think so) use requests for their HTML traffic... There's mention of a twisted version of sewer in the discussion on #196, so apparently it can be done if one is driven enough.

But the last couple days I've been diverted by AcmeKey (somewhat working & tested) and AcmeCsr (just started, as it rurns out that's created very early in the process in Client). The motivation here was a happy collision of #203 and an old wishlist item about getting rid of the OpenSsl stuff so we only have to import one crypto library. I see that as part of the whole "Client must be dis-integrated" theme. I need to see if I can finish that, or find out that there's some unexpected problem that's more difficult than just rewriting create_csr. :-)

@mmaney
Copy link
Collaborator

mmaney commented Sep 10, 2020

0.8.4 will have the crypto refactoring, and I think that will give you the functionality you're asking for with the kid patch. Or most of it, and maybe the account key registration can be made just slightly smart. Have to think about this...

If you haven't already, look at the crypto branch [warning: sometimes rebases, often push --force for small changes]. The part that should interest you is that the account key is passed, complete except for the kid, into Client(). And the kid is cached in the AcmeKey object, so you always have that once it's done the registration. If I can convince myself it's safe for other use cases, maybe it could just skip registration if there's already a kid in the AcmeKey...

Anyway, that's what's likely to be in the mainline code for the next little while. You'll like the other AcmeXxx objects that are taking shape in my notes here, but that's not for 0.8.4. OTOH, my hope is to be using that when my certs come up for renewal next month.

@mmaney
Copy link
Collaborator

mmaney commented Sep 12, 2020

@AlecTroemel crypto (and the piece I forgot to add to let it skip registration if the AcmeKey already has its kid) has been merged to the master branch. I think that should serve as a stand-in for the kid parameter - as long as your Client-using code keeps it live, it will Just Work™. If you want to be able to persist the account key state to disk, for now you'll have to handle the kid separately.

For complete control (at this point) you'd have to re-implement get_certificate, and there are a couple things that are much too tightly wrapped up in how get_certificate uses them to be easily used by anything that isn't pretty much a clone of get_certificate. That's surely the ugliest part of sewer's body...

I was thinking, just a few days ago, that the next round of work might have to set my Acme* plan aside in favor of work I'd rather put off (it's so messy! and it stains!) on get_certificate, but the more I think about it the more I think that Acme* has to be the foundation - the glutinous mass of state that is Client has to be broken up both to facilitate, and as a side effect of, the get_certificate work.

Anyway, if you have a chance to look at, or better still test, the current master in the next few days, let me know, huh? I'm pretty sure it's okay, but as #211 recently reminded us all, I ain't perfikt!

@mmaney mmaney closed this Sep 12, 2020
@mmaney mmaney reopened this Sep 12, 2020
@mmaney
Copy link
Collaborator

mmaney commented Jul 1, 2021

@AlecTroemel Let me get one nagging question out of the way: have you actually encountered a failure due to no matching challenge type? If you have, it should already be tossing an (uncontrolled) exception, because identifier_auth won't be defined, no? And so will the test and attempt to report it, then.

But it's late, and I may easily be missing something...

@mmaney
Copy link
Collaborator

mmaney commented Jul 1, 2021

@AlecTroemel Ah yes, I was missing the description of how this may arise in #209. It's still just-before-coffee here, but the situation you described is not really is not an error (in the ACME process). Take a look at the discussion in that other PR.

And apologies for being so much absent. I finally got the bit between my teeth about what proved to be a complete rewrite of the ACME engine, then stalled out when I ran into an error in LE's implementation. (I want to say that they weren't setting the valid flag in responses that are supposed to have it?) I believe they've since fixed that and it's been rolled out into production, but while it was broken I set the work aside. I really should get back to that...

The new engine has also been tested (a little) against one or two other free cert providers, which illuminated some corner cases where the code relied on LE implementation choices. I'm not going to claim that the new engine was designed for async, but the decomposition of what sewer knows as Client into spec/protocol sized parts is a big step in that direction.

@AlecTroemel
Copy link
Collaborator Author

@mmaney

@AlecTroemel Let me get one nagging question out of the way: have you actually encountered a failure due to no matching challenge type? If you have, it should already be tossing an (uncontrolled) exception, because identifier_auth won't be defined, no? And so will the test and attempt to report it, then.

Yes I have, at least on the fork/branch I've been using (I didn't think there was a big drift in this area of the codebase). I had a domain just the other day I attempted to get a cert using DNS authentication, and sewer failed with this error

raised unexpected: UnboundLocalError("local variable 'identifier_auth' referenced before assignment")

which IMO is really misleading, and took me a while to remember what it really meant 😆 . why not throw a more specific error explaining the situation?

but the situation you described is not really is not an error (in the ACME process). Take a look at the discussion in that other PR.

I don't see how is it not an error. Sewer is trying to use a validation type Lets Encrypt says is not allowed... The code as far as I know cant continue is this situation.

@mmaney
Copy link
Collaborator

mmaney commented Jul 1, 2021

@AlecTroemel the failure you described in #209 was one where the host was already authorized - that shouldn't be an error just because this time you're looking to use a different auth method, because it doesn't need to be re-authorized. It does need to be recognized as this one's ready to move forward, which does require fixing the exception in the logging, etc., but it's the exact opposite of an error, it's how the protocol is supposed to work when the host has already been authorized (for the given key/ID).

This seems not to have shown up in testing or common uses of sewer because there seem to be few using a non-ephemeral ID (key). That blow it all up exception when it hits would be hard to overlook!

And I think the correct answer is to recognize when the host "challenge" in the response is actually the sign of an existing authorization, and then handle it as an already completed challenge. I'm pretty sure that will be a larger change than just patching get_identifier_authorization, which brings me back to my theme since last fall, "get_certificate delanda est". :-/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants