Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up pd auto-https code #3627

Closed
hdevalence opened this issue Jan 18, 2024 · 3 comments · Fixed by #3652
Closed

Clean up pd auto-https code #3627

hdevalence opened this issue Jan 18, 2024 · 3 comments · Fixed by #3652
Assignees
Labels
A-node Area: System design and implementation for node software C-chore Codebase maintenance that doesn't fix bugs or add features, and isn't urgent or blocking.

Comments

@hdevalence
Copy link
Member

hdevalence commented Jan 18, 2024

In the ~year since this code was written, there may be better options. https://docs.rs/tokio-rustls-acme/latest/tokio_rustls_acme/ seems promising

Originally posted by @hdevalence in #3522 (comment)

#3522, the initial experiment for serving HTTP content on the GRPC endpoint, is blocked by the fact that we would then need to use Axum directly, but our current auto-https support only works with Tonic specifically (which seems wrong, it's "all HTTP")

@hdevalence
Copy link
Member Author

We'll probably need to rebuild #3522 anyways, so if we could clean up the auto-https support to get something that works directly with Axum first, we would have an easier time rebuilding that work.

@cratelyn cratelyn self-assigned this Jan 18, 2024
@cratelyn
Copy link
Contributor

i going to self-assign this. @hdevalence and i spent some time talking today, getting me up to speed with the motivation here. once i have made some more headway on the mock consensus engine, #3588, i will pivot over to work on this.

@cratelyn
Copy link
Contributor

cratelyn commented Jan 22, 2024

starting work on this issue this week! i've also opened #3646 to track the original motivation (a self-hosted web interface) from #3522.

@github-project-automation github-project-automation bot moved this to 🗄️ Backlog in Penumbra Jan 23, 2024
@cratelyn cratelyn moved this from 🗄️ Backlog to In progress in Penumbra Jan 23, 2024
cratelyn added a commit that referenced this issue Jan 24, 2024
```
         /!\ this is a work in progress and will    /!\
         /!\ be force pushed until ready for review /!\
```

fixes #3627.

this reorganizes the logic in pd's startup code related to automatically
managed https functionality. in particular, this is done in such a
manner as to expose an axum `Router` that will permit later work
(#3646, #3522) to serve other additional routes from our grpc
endpoint.

Refs: #3627
Refs: #3646
Refs: #3522
cratelyn added a commit that referenced this issue Jan 24, 2024
```
         /!\ this is a work in progress and will    /!\
         /!\ be force pushed until ready for review /!\
```

 ## 👀 overview

fixes #3627.

this reorganizes the logic in pd's startup code related to automatically
managed https functionality.

 ## 🎨 background & motivation

this PR, besides cleaning up the `rustls-acme`-related auto-https logic, is
also interested in *creating a state-of-affairs that will dovetail into
pr #3522*. in particular, this expression to start the GRPC serve given a
bound listener...

```rust
tokio::task::Builder::new()
    .name("grpc_server")
    .spawn(grpc_server.serve_with_incoming(tls_incoming))
    .expect("failed to spawn grpc server")
```

...should be adjusted so as to accept an `axum::Router`.

other tertiary requirements:
- when no `grpc_auto_https` flag has been configured, serve without using an ACME resolver
- ALPN permit http/1.1 for grpc-web backwards compatibility.

 ### ⚖️  `rustls-acme` and `tokio-rustls-acme`

quoth the #3627 description, citing an earlier comment:

> In the ~year since this code was written, there may be better options.
> `tokio-rustls-acme` seems promising
\- <#3522 (comment)>

for reference, the repositories for each live here, and here:
- <https://github.com/FlorianUekermann/rustls-acme>
- <https://github.com/n0-computer/tokio-rustls-acme>

after some comparison, i have come to the opinion that `rustls-acme` will still
be adequate for our purposes. the latter is a fork of the former, but active
development appears to have continued in the former, and i did not see any
particular "_must-have_" features for us in the latter.

 ## 🚰 dropping down to `axum`

(this part is lengthy...)

as stated above, we want to switch to an `axum::Router`. this means that
we won't be able to use the `AcmeConfig::incoming` function. the
`rustls-acme` library provides some "low-level" examples we can check
out:

- <https://github.com/FlorianUekermann/rustls-acme/blob/main/examples/low_level_tokio.rs>
- <https://github.com/FlorianUekermann/rustls-acme/blob/main/examples/low_level_axum.rs>

`low_level_axum` sounds promising, given that #3522 itself drops down
from tonic into axum. a catch is that the "axum" example is referring to
`axum-server`, a library that wraps `axum`, rather than `axum` itself.
more on that momentarily.

we also use `tonic` 0.10.2 in pd, and elsewhere in the penumbra
monorepo. tonic isn't using hyper 1.x yet. this was being worked on in
hyperium/tonic#1583, continued on in hyperium/tonic#1595, and tracked in
hyperium/tonic#1579. that work also depends upon hyperium/hyper#3461.

this is complicated by the fact that `axum-server` 0.6.0 depends on
`hyper` 1.1.0. this is particularly relevant because of
hyperium/hyper#3040. so, we must be mindful that there is now a
`tower::Service` and a `hyper::Service`. `hyper-util` provides
facilities to convert between the two, and other helper glue.
additionally, see the "_upgrading"_ migration guide for more information
on the differences between pre- and post-1.0 interfaces in `hyper`.

- https://hyper.rs/guides/1/upgrading/
- <https://docs.rs/hyper-util/latest/hyper_util/service/struct.TowerToHyperService.html>

at a high level we need to take our `tonic::Router`, and turn that into
a `MakeService`, some object that can be used to _generate_ `Service`
instances to process a request.

this is additionally complicated by the fact that `axum-server` defines
its _own_ `MakeService` trait. that trait is sealed, blanket
implemented, and is a bound included in
`axum_server::server::Server::serve`.

```rust
// axum_server::serve
impl Server {
    pub async fn serve<M>(self, mut make_service: M) -> io::Result<()>
    where
        M: MakeService<SocketAddr, Request<Incoming>>,
        // ...
    { todo!() }
}
```

the `SocketAddr` parameter above is of importance.

`axum::Router::into_make_service` gives us `IntoMakeService<Router>`
- <https://docs.rs/axum/latest/axum/routing/struct.IntoMakeService.html>

```rust
impl<S, T> Service<T> for IntoMakeService<S>
where
    S: Clone,
{
    type Response = S;
    type Error = Infallible;
    type Future = IntoMakeServiceFuture<S>;
```

...which means that we are faced with this error:

```
error[E0277]: the trait bound `Router: tower_service::Service<http::request::Request<hyper::body::incoming::Incoming>>` is not satisfied
   --> crates/bin/pd/src/auto_https.rs:62:20
    |
62  |             .serve(make_svc)
    |              ----- ^^^^^^^^ the trait `tower_service::Service<http::request::Request<hyper::body::incoming::Incoming>>` is not implemented for `Router`
    |              |
    |              required by a bound introduced by this call
    |
    = help: the trait `tower_service::Service<axum::http::Request<axum::body::Body>>` is implemented for `Router`
    = help: for that trait implementation, expected `axum::http::Request<axum::body::Body>`, found `http::request::Request<hyper::body::incoming::Incoming>`
    = note: required for `IntoMakeService<Router>` to implement `axum_server::service::MakeService<std::net::SocketAddr, http::request::Request<hyper::body::incoming::Incoming>>`
```

`axum::Router::into_make_service` does not play well with
`axum_server::Server`.

...and if we specify the new `hyper::body::incoming::Incoming` as the
body type for the axum router,

```
error[E0599]: the method `into_make_service` exists for struct `Router<(), Incoming>`, but its trait bounds were not satisfied
  --> crates/bin/pd/src/auto_https.rs:57:31
   |
57 |         let make_svc = router.into_make_service();
   |                               ^^^^^^^^^^^^^^^^^ method cannot be called on `Router<(), Incoming>` due to unsatisfied trait bounds
   |
  ::: /home/katie/.cargo/registry/src/index.crates.io-6f17d22bba15001f/hyper-1.1.0/src/body/incoming.rs:47:1
   |
47 | pub struct Incoming {
   | ------------------- doesn't satisfy `hyper::body::Incoming: HttpBody`
   |
   = note: the following trait bounds were not satisfied:
           `hyper::body::Incoming: HttpBody`
```

`HttpBody` was the pre-1.0 trait, that is now `Body` post-1.0. in short, there
are some incongruencies at play as different libraries catch up to the new
breaking changes in `hyper`.

 ## 🤨 ...so now what?

TODO(kate): write up course of action for tomorrow.

---

🟥 NB: i worry about breakage due to untested load-bearing behavior
during a migration like this. see "tertiary requirements", above.
consider this a PR to be suspect of when deploying a new testnet.

Refs: #3627
Refs: #3646
Refs: #3522
cratelyn added a commit that referenced this issue Jan 25, 2024
 ## 👀 overview

fixes #3627.

this reorganizes the logic in pd's startup code related to automatically
managed https functionality.

 ## 🎨 background & motivation

this PR, besides cleaning up the `rustls-acme`-related auto-https logic, is
also interested in *creating a state-of-affairs that will dovetail into
pr #3522*. in particular, this expression to start the GRPC serve given a
bound listener...

```rust
tokio::task::Builder::new()
    .name("grpc_server")
    .spawn(grpc_server.serve_with_incoming(tls_incoming))
    .expect("failed to spawn grpc server")
```

...should be adjusted so as to work with an `axum::Router`.

 ### ⚖️  `rustls-acme` and `tokio-rustls-acme`

quoth the #3627 description, citing an earlier comment:

> In the ~year since this code was written, there may be better options.
> `tokio-rustls-acme` seems promising
\- <#3522 (comment)>

for reference, the repositories for each live here, and here:
- <https://github.com/FlorianUekermann/rustls-acme>
- <https://github.com/n0-computer/tokio-rustls-acme>

after some comparison, i have come to the opinion that `rustls-acme` will still
be adequate for our purposes. the latter is a fork of the former, but active
development appears to have continued in the former, and i did not see any
particular "_must-have_" features for us in the latter.

 ## 🎴 changes

this commit moves some of the auto-https related code from the `main`
entrypoint, into standalone functions in `pd::main`.

some constants are defined, to keep control flow clear and to help
facilitate the addition of future options e.g. a flag to control the
LetsEncrypt environment to use.

 ## 🚰 dropping down to `axum`; a brief note on future upgrades

as stated above, we want to switch to an `axum::Router`. this means that
we won't be able to use the `AcmeConfig::incoming` function. the
`rustls-acme` library provides some "low-level" examples this work is
based upon

- <https://github.com/FlorianUekermann/rustls-acme/blob/main/examples/low_level_tokio.rs>
- <https://github.com/FlorianUekermann/rustls-acme/blob/main/examples/low_level_axum.rs>

we also use `tonic` 0.10.2 in pd, and elsewhere in the penumbra
monorepo. tonic isn't using hyper 1.x yet. this was being worked on in
hyperium/tonic#1583, continued on in hyperium/tonic#1595, and tracked in
hyperium/tonic#1579. that work also depends upon hyperium/hyper#3461.

we will need to be wait for tonic to finish its migration over to hyper
1.0, see:
hyperium/tonic#1579 (comment)

this is understandable, but i make note of this situation as a
signpost for our future selves when considering a migration to
recent versions of axum-server, axum, rustls-acme, or hyper.

for now, it's easiest to stay in lock-step with tonic, and we can revisit
the upgrade path(s) at a future date.

===

Refs: #3627
Refs: #3646
Refs: #3522
@cratelyn cratelyn added A-node Area: System design and implementation for node software C-chore Codebase maintenance that doesn't fix bugs or add features, and isn't urgent or blocking. labels Jan 25, 2024
conorsch pushed a commit that referenced this issue Jan 26, 2024
 ## 👀 overview

fixes #3627.

this reorganizes the logic in pd's startup code related to automatically
managed https functionality.

 ## 🎨 background & motivation

this PR, besides cleaning up the `rustls-acme`-related auto-https logic, is
also interested in *creating a state-of-affairs that will dovetail into
pr #3522*. in particular, this expression to start the GRPC serve given a
bound listener...

```rust
tokio::task::Builder::new()
    .name("grpc_server")
    .spawn(grpc_server.serve_with_incoming(tls_incoming))
    .expect("failed to spawn grpc server")
```

...should be adjusted so as to work with an `axum::Router`.

 ### ⚖️  `rustls-acme` and `tokio-rustls-acme`

quoth the #3627 description, citing an earlier comment:

> In the ~year since this code was written, there may be better options.
> `tokio-rustls-acme` seems promising
\- <#3522 (comment)>

for reference, the repositories for each live here, and here:
- <https://github.com/FlorianUekermann/rustls-acme>
- <https://github.com/n0-computer/tokio-rustls-acme>

after some comparison, i have come to the opinion that `rustls-acme` will still
be adequate for our purposes. the latter is a fork of the former, but active
development appears to have continued in the former, and i did not see any
particular "_must-have_" features for us in the latter.

 ## 🎴 changes

this commit moves some of the auto-https related code from the `main`
entrypoint, into standalone functions in `pd::main`.

some constants are defined, to keep control flow clear and to help
facilitate the addition of future options e.g. a flag to control the
LetsEncrypt environment to use.

 ## 🚰 dropping down to `axum`; a brief note on future upgrades

as stated above, we want to switch to an `axum::Router`. this means that
we won't be able to use the `AcmeConfig::incoming` function. the
`rustls-acme` library provides some "low-level" examples this work is
based upon

- <https://github.com/FlorianUekermann/rustls-acme/blob/main/examples/low_level_tokio.rs>
- <https://github.com/FlorianUekermann/rustls-acme/blob/main/examples/low_level_axum.rs>

we also use `tonic` 0.10.2 in pd, and elsewhere in the penumbra
monorepo. tonic isn't using hyper 1.x yet. this was being worked on in
hyperium/tonic#1583, continued on in hyperium/tonic#1595, and tracked in
hyperium/tonic#1579. that work also depends upon hyperium/hyper#3461.

we will need to be wait for tonic to finish its migration over to hyper
1.0, see:
hyperium/tonic#1579 (comment)

this is understandable, but i make note of this situation as a
signpost for our future selves when considering a migration to
recent versions of axum-server, axum, rustls-acme, or hyper.

for now, it's easiest to stay in lock-step with tonic, and we can revisit
the upgrade path(s) at a future date.

===

Refs: #3627
Refs: #3646
Refs: #3522
@github-project-automation github-project-automation bot moved this from In progress to Done in Penumbra Jan 26, 2024
cratelyn added a commit that referenced this issue Jan 31, 2024
fixes #3119. see also, #1886.

this pulls the auto-https code (see #3627, #3652) into a standalone
library crate.
conorsch pushed a commit that referenced this issue Jan 31, 2024
fixes #3119. see also, #1886.

this pulls the auto-https code (see #3627, #3652) into a standalone
library crate.
TalDerei pushed a commit that referenced this issue Feb 8, 2024
fixes #3119. see also, #1886.

this pulls the auto-https code (see #3627, #3652) into a standalone
library crate.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-node Area: System design and implementation for node software C-chore Codebase maintenance that doesn't fix bugs or add features, and isn't urgent or blocking.
Projects
Archived in project
Status: No status
Development

Successfully merging a pull request may close this issue.

2 participants