
HTTP 1.1 connections not being reused #3350

Open
lzap opened this issue Dec 5, 2024 · 10 comments

@lzap

lzap commented Dec 5, 2024

Hello,

we host repositories behind an mTLS HTTP/1.1 gateway which redirects all requests to AWS S3 (also HTTP 1.1). Neither service supports HTTP/2 at the moment. I was wondering whether those HTTP 1.1 connections are being reused, and I would need some help understanding if that is the case:

# OSTREE_DEBUG_HTTP=1 ostree pull ...
* Re-using existing connection with host mtls.internal.console.stage.redhat.com
> GET /api/pulp-content/edge-integration-test-2/lzap-rhel9/config HTTP/1.1
< HTTP/1.1 302 Moved Temporarily
< Location: https://edge-pulp-test.s3.amazonaws.com/artifact/XXX
* Re-using existing connection with host edge-pulp-test.s3.amazonaws.com
> GET /artifact/XXX

So for the first few connections (6 of them), libcurl indicates that connections are being reused, both to the gateway and to S3. But after that, the rest of the whole transaction looks like this:

* Connection #7 to host mtls.internal.console.stage.redhat.com left intact
* Issue another request to this URL: 'https://edge-pulp-test.s3.amazonaws.com/artifact/XXX'
* Can not multiplex, even if we wanted to
* shutting down connection #7

And the connection number keeps increasing, which to me looks like libcurl is not reusing connections for some reason. I thought for a moment that the hard limit of 8 connections was the limiting factor, but I increased it and then removed it completely (the default is no limit), and the behavior is the same.

Pulp does support deltas and we will likely end up using them; I just want to understand what is going on. From the docs, it looks like the "dumb protocol" design was created with AWS S3 in mind, but that service does not support HTTP/2 to this day.

Can you help me understand what is wrong? Maybe I am just misreading the logs, but the output of lsof confirms that new TCP connections keep being made. Also, pmrep network.tcp.activeopens shows new TCP connections being opened at a rate of about 15 per second on my system. When paired with mTLS, the performance drops to almost zero. Thanks for the help.

Switching to the HTTP/2 protocol completely mitigates the problem; performance over HTTP/2 is very good. We currently only have HTTP 1.1 infrastructure to work with, which is why I filed this. I just wanted to let others know after I spent some time on this.

@cgwalters
Member

cgwalters commented Dec 5, 2024

AWS S3 (HTTP 1.1 as well)

Yes, though it's worth noting that CloudFront does speak HTTP/2. That's my recommendation and what other people are doing today with ostree backed by S3.

From the docs, it looks like the "dumb protocol" design was created with AWS S3 in mind, but this service does not support HTTP/2 to this day.

Not S3 specifically, but an object store, yes. It's worth noting that I think every other object store does support it; e.g. Azure Blob Storage does. For example, today when you go to a podman release and download a binary, that's a redirect to Azure Blob:

curl -L --head https://github.com/containers/podman/releases/download/v5.3.1/podman-5.3.1-setup.exe
HTTP/2 302 
server: GitHub.com
date: Thu, 05 Dec 2024 15:05:47 GMT
content-type: text/html; charset=utf-8
location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/109145553/9e44e86e-dfb8-4699-86f2-fe3c66372f55?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=
...

HTTP/2 200 
content-type: application/octet-stream
last-modified: Thu, 21 Nov 2024 15:49:09 GMT
etag: "0x8DD0A440A1328DB"
server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
...
$

Can you help me understanding what is wrong?

We can investigate this, but it's worth emphasizing that I think HTTP/2 is really worth pushing for...

Can you try replicating what happens with just a plain curl -vLf --remote-name-all for those objects?

Also of note: zstd:chunked, which was created to get a similar "dynamic per-object delta fetch" for containers, can also end up doing a lot of HTTP requests.

@lzap
Author

lzap commented Dec 6, 2024

We can investigate this

Well, it sounds like a bug to me; at least identifying the issue is worth it. I am not saying let's invest too much into fixing it.

Can you try replicating what happens with just a plain curl -vLf --remote-name-all for those objects?

Not sure how, though; curl does not support grabbing a list of URLs, and I'm not sure how many of them I could send via xargs, but wget does support it. So I extracted about 4k of the URLs, and it looks like wget behaves quite differently.

It first follows all the redirects and adds them to the back of the queue. During this phase, because no data is being downloaded, just HTTP headers, the "speed" is reported as really bad, around 200 B/s in my case. What is worth noting is that wget by default uses 5 connections concurrently, and they are reused for the whole phase.

COMMAND    PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
wget    858602 root    3u  IPv4 11405108      0t0  TCP zzzap.xxx.redhat.com:38058->a23-212-110-25.deploy.static.akamaitechnologies.com:https (ESTABLISHED)
wget    858602 root    4u  IPv4 11404011      0t0  TCP zzzap.xxx.redhat.com:38070->a23-212-110-25.deploy.static.akamaitechnologies.com:https (ESTABLISHED)
wget    858602 root    5u  IPv4 11405997      0t0  TCP zzzap.xxx.redhat.com:38072->a23-212-110-25.deploy.static.akamaitechnologies.com:https (ESTABLISHED)
wget    858602 root    6u  IPv4 11406557      0t0  TCP zzzap.xxx.redhat.com:38078->a23-212-110-25.deploy.static.akamaitechnologies.com:https (ESTABLISHED)
wget    858602 root    7u  IPv4 11404012      0t0  TCP zzzap.xxx.redhat.com:38086->a23-212-110-25.deploy.static.akamaitechnologies.com:https (ESTABLISHED)

Then it switches to the download phase, where it downloads all the files from S3 at an acceptable throughput. Connections are also reused, though from time to time (every few seconds) a connection is closed, perhaps on the AWS side:

COMMAND    PID USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
wget    856001 root    3u  IPv4 11394968      0t0  TCP zzzap.xxx.redhat.com:51516->s3-1-w.amazonaws.com:https (ESTABLISHED)
wget    856001 root    4u  IPv4 11397709      0t0  TCP zzzap.xxx.redhat.com:51538->s3-1-w.amazonaws.com:https (ESTABLISHED)
wget    856001 root    5u  IPv4 11396632      0t0  TCP zzzap.xxx.redhat.com:51534->s3-1-w.amazonaws.com:https (ESTABLISHED)
wget    856001 root    6u  IPv4 11395787      0t0  TCP zzzap.xxx.redhat.com:51532->s3-1-w.amazonaws.com:https (ESTABLISHED)
wget    856001 root    7u  IPv4 11394967      0t0  TCP zzzap.xxx.redhat.com:51510->s3-1-w.amazonaws.com:https (ESTABLISHED)

Conclusions so far:

  • For some reason, libcurl does not reuse redirected connections efficiently.
  • The bottleneck is the initial request to get the S3 URL (mTLS, token creation).
  • CloudFront HTTP/2 could possibly help, but it needs to be enabled both on the Pulp API content endpoint and on S3.

For the record, this is how I stored the list of URLs:

OSTREE_DEBUG_HTTP=1 ostree pull --repo=repo3 pulp rhel/9/x86_64/edge 2>&1 | grep "> GET /api" | awk '{print "https://mtls.internal.console.stage.redhat.com" $3}' > ostree-list

And this is how I downloaded the contents:

wget -i ostree-list --ca-certificate crc-ca.crt --certificate crc-stage.crt --private-key crc-stage.key

@lzap
Author

lzap commented Dec 11, 2024

For the record, I conducted a test with redirects over HTTP/2 and did not see any connections being closed. The whole transaction ran at very good speed over a single HTTP/2 TCP connection.

@cgwalters
Member

Not sure how tho, curl does not support grabbing a list of URLs,

It does. If you want to download a bunch of URLs with curl, the most convenient way is --remote-name-all, e.g. curl -L --remote-name-all https://dl.fedoraproject.org/pub/fedora/linux/releases/41/Everything/x86_64/os/EFI/BOOT/{BOOTIA32.EFI,BOOTX64.EFI,grubia32.efi,mmx64.efi} to fetch some EFI binaries (chosen as they're not small, but not too large either).

So I think we should be able to reproduce the behavior with the curl binary outside of ostree in this scenario by manually passing the objects to fetch, right?

Let's not involve wget as it's an entirely different codebase.

@lzap
Author

lzap commented Dec 12, 2024

Yeah, I knew I could expand the shell like that, but it did not occur to me to use bigger files. So here is an example of downloading a few big files from EPEL9 on the main Fedora download site. With dl.fedoraproject.org it resolves to one of the main servers, and then curl keeps connections open (I am using a dumb command to watch this: watch -n0.3 lsof -a -itcp -p $(pidof curl)):

curl -L --remote-name-all --http1.1 -Z --parallel-max 3 https://dl.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/{a/arm-none-eabi-gcc-cs-12.4.0-1.el9.x86_64.rpm,b/bullet-devel-doc-3.08-6.el9.x86_64.rpm,c/clamav-data-1.0.7-1.el9.noarch.rpm,e/enlightenment-data-0.26.0-1.el9.noarch.rpm,i/i3-devel-doc-4.20.1-3.el9.noarch.rpm,j/java-latest-openjdk-portable-devel-fastdebug-21.0.1.0.12-2.rolling.el9.x86_64.rpm,j/java-latest-openjdk-portable-unstripped-21.0.1.0.12-2.rolling.el9.x86_64.rpm,j/java-latest-openjdk-portable-devel-slowdebug-21.0.1.0.12-2.rolling.el9.x86_64.rpm,j/java-latest-openjdk-static-libs-fastdebug-23.0.1.0.11-1.rolling.el9.x86_64.rpm,j/java-latest-openjdk-jmods-fastdebug-23.0.1.0.11-1.rolling.el9.x86_64.rpm}

Now, when I try to use download.fedoraproject.org, it behaves just like Pulp, the system we are working with: it is an HTTP redirect to one of the geographically close mirrors. The behavior is completely different; even though I set a maximum of 3 connections, curl goes ahead and opens as many as 10 of them and downloads everything pretty quickly:

curl -L --remote-name-all --http1.1 -Z --parallel-max 3 https://download.fedoraproject.org/pub/epel/9/Everything/x86_64/Packages/{a/arm-none-eabi-gcc-cs-12.4.0-1.el9.x86_64.rpm,b/bullet-devel-doc-3.08-6.el9.x86_64.rpm,c/clamav-data-1.0.7-1.el9.noarch.rpm,e/enlightenment-data-0.26.0-1.el9.noarch.rpm,i/i3-devel-doc-4.20.1-3.el9.noarch.rpm,j/java-latest-openjdk-portable-devel-fastdebug-21.0.1.0.12-2.rolling.el9.x86_64.rpm,j/java-latest-openjdk-portable-unstripped-21.0.1.0.12-2.rolling.el9.x86_64.rpm,j/java-latest-openjdk-portable-devel-slowdebug-21.0.1.0.12-2.rolling.el9.x86_64.rpm,j/java-latest-openjdk-static-libs-fastdebug-23.0.1.0.11-1.rolling.el9.x86_64.rpm,j/java-latest-openjdk-jmods-fastdebug-23.0.1.0.11-1.rolling.el9.x86_64.rpm}

It even reports "live=3" and "xfers=10":

DL% UL%  Dled  Uled  Xfers  Live Total     Current  Left    Speed
--  --  2491M     0    10     3  --:--:--  0:00:35 --:--:-- 56.9M 

I tried both --no-parallel-immediate and --parallel-immediate without any effect. I am out of ideas; you can copy and paste these commands and try them yourself to see if you can figure out something else. I was using curl from the latest stable Fedora:

# curl --version
curl 8.9.1 (x86_64-redhat-linux-gnu) libcurl/8.9.1 OpenSSL/3.2.2 zlib/1.3.1.zlib-ng brotli/1.1.0 libidn2/2.3.7 libpsl/0.21.5 libssh/0.10.6/openssl/zlib nghttp2/1.62.1 OpenLDAP/2.6.8
Release-Date: 2024-07-31
Protocols: dict file ftp ftps gopher gophers http https imap imaps ipfs ipns ldap ldaps mqtt pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp ws wss
Features: alt-svc AsynchDNS brotli GSS-API HSTS HTTP2 HTTPS-proxy IDN IPv6 Kerberos Largefile libz NTLM PSL SPNEGO SSL threadsafe TLS-SRP UnixSockets

My conclusions so far:

  • curl behaves differently from ostree in that ostree is somehow capping the maximum number of connections, whereas curl goes above the --parallel-max setting
  • but even when I tried to remove the cap from the ostree codebase, I did not see any difference
  • I am currently not testing against Pulp; I want to set up an environment and retest
  • Fedora very often returns a bunch of different hostnames across redirects, which of course prevents the client from reusing connections
  • I want to check whether Pulp does the same (I think it returns the same S3 hostname, but DNS resolves it to different IPs "randomly"); that might be the culprit
  • I would be interested to know if there is a way to unlimit the ostree client so it can open as many connections as it needs to cover all these redirected servers

@lzap
Author

lzap commented Dec 12, 2024

So the hostname Pulp returns is always the same; it resolves randomly in my region:

# host s3.amazonaws.com
s3.amazonaws.com has address 52.217.80.70
s3.amazonaws.com has address 16.15.193.108
s3.amazonaws.com has address 52.217.126.32
s3.amazonaws.com has address 52.217.128.32
s3.amazonaws.com has address 52.217.123.48
s3.amazonaws.com has address 52.216.209.16
s3.amazonaws.com has address 52.217.83.142
s3.amazonaws.com has address 52.216.37.168

I tried to create an entry in /etc/hosts with a single IP, but that did nothing. An ostree pull still opens about 20 new TCP connections per second. So this theory is not correct.

@cgwalters
Member

So /usr/bin/curl and /usr/bin/ostree both use libcurl, so it should be possible to configure them to do the same thing.

One thing here that seems quite possible is that /bin/curl configures some options on by default that the library doesn't.

It's also possible (but IMO unlikely) that there is some logic specifically in the curl binary.

but even when I tried to remove the cap from the ostree codebase I did not see any difference

https://curl.se/libcurl/c/CURLMOPT_MAX_HOST_CONNECTIONS.html documents the behavior there and it's pretty clear what it does...I don't think it's the problem here.

One thing to remember here (I don't know if you hit this): if you're changing the ostree codebase, it's not enough to just update /bin/ostree; it's libostree-1.so.1 that needs to be updated.

One discovery here is that curl/curl@1be704e changed a bunch of the relevant logging strings; I was confused why I couldn't find any of the messages using GitHub code search. So when debugging the curl code specifically, it will be important to look at the specific libcurl git tag/commit.

@lzap lzap changed the title Connections not being reused after redirects HTTP 1.1 connections not being reused after redirects Dec 13, 2024
@lzap lzap changed the title HTTP 1.1 connections not being reused after redirects HTTP 1.1 connections not being reused Dec 13, 2024
@lzap
Author

lzap commented Dec 13, 2024

https://curl.se/libcurl/c/CURLMOPT_MAX_HOST_CONNECTIONS.html documents the behavior there and it's pretty clear what it does...I don't think it's the problem here.

Ostree does not appear to use this value, though; it just sets these:

# grep -r CURLMOPT .
grep: ./src/libostree/.libs/libostree_1_la-ostree-fetcher-curl.o: binary file matches
./src/libostree/ostree-fetcher-curl.c:  rc = curl_multi_setopt (self->multi, CURLMOPT_SOCKETFUNCTION, sock_cb);
./src/libostree/ostree-fetcher-curl.c:  rc = curl_multi_setopt (self->multi, CURLMOPT_SOCKETDATA, self);
./src/libostree/ostree-fetcher-curl.c:  rc = curl_multi_setopt (self->multi, CURLMOPT_TIMERFUNCTION, update_timeout_cb);
./src/libostree/ostree-fetcher-curl.c:  rc = curl_multi_setopt (self->multi, CURLMOPT_TIMERDATA, self);
./src/libostree/ostree-fetcher-curl.c:  //rc = curl_multi_setopt (self->multi, CURLMOPT_MAX_TOTAL_CONNECTIONS, 16);
./src/libostree/ostree-fetcher-curl.c:  rc = curl_multi_setopt (self->multi, CURLMOPT_PIPELINING, CURLPIPE_MULTIPLEX);
./src/libostree/ostree-fetcher-curl.c:/* CURLMOPT_SOCKETFUNCTION */

One thing to remember here, I don't know if you hit this - if you're changing the ostree codebase, it's not enough to just update /bin/ostree for this, it's libostree-1.so.1 that needs to be updated.

Ha, good point, thanks for sharing. I will validate; what I did was just the dumb approach:

  • Changed the MAX_HOST_CONNECTIONS
  • ./autoconf && configure --with-curl && make install
  • /usr/local/bin/curl ...

So let me see:

[root@zzzap ostree-src]# ldd /usr/local/bin/ostree | grep ostree
	libostree-1.so.1 => /usr/local/lib/libostree-1.so.1 (0x00007ff2a632c000)

[root@zzzap ostree-src]# ldd /usr/local/lib/libostree-1.so.1 | grep curl
	libcurl.so.4 => /lib64/libcurl.so.4 (0x00007fe63a9ae000)

Looks good to me. Anyway, I have noticed that the HTTP redirect is not what causes this; it looks like HTTP 1.1 connections are not kept alive even against a plain and simple HTTP server. Looks like a broader bug to me, unless I am doing something terribly wrong. Maybe see for yourself; I have prepared two repos, one without any redirects:

ostree --repo=without init
ostree --repo=without remote add repo --set="gpg-verify=false" --set="http2=false" https://home.zapletalovi.com/ostree/fedora40
ostree pull --repo=without repo fedora/40/x86_64/iot

And one with a redirect for each and every file:

ostree --repo=with init
ostree --repo=with remote add repo --set="gpg-verify=false" --set="http2=false" https://home.zapletalovi.com/ostree-fedora40
ostree pull --repo=with repo fedora/40/x86_64/iot

Watching new TCP connections is perhaps most comfortable with this command:

sudo tcpdump "dst port 443 and tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn"

Now, I know TCP needs two parties and it can be the server that is closing those connections, but this is a pretty standard Apache httpd from Fedora 41; there is no rate limiting or MaxConnPerIP set. I tried to play around with various libcurl settings without any luck. Of course, the moment one enables HTTP/2, the performance skyrockets and only a single connection is ever used.

Options I tried to increase:

  • CURLMOPT_MAXCONNECTS
  • CURLMOPT_MAX_TOTAL_CONNECTIONS
  • CURLMOPT_MAX_PIPELINE_LENGTH

@bmbouter

I just wanted to chime in here as an interested party. I'm involved in the server-side hosting of ostree content. I hear you on the HTTP/2 desire, and we are pursuing that. Separately though, not having HTTP 1.1 connection reuse does seem like an issue when a pull initiates something like 44K TCP connections.

Can someone confirm for me, is the current thinking that the issue is in libcurl?

@cgwalters
Member

My suggestion is basically:

  • Get the desired behavior out of curl with the desired web server(s)
  • Figure out what (if anything) is different with ostree

Now, I know TCP needs two parties and it can be server who is closing those connections, but this is pretty standard Apache httpd from Fedora 41, there is no rate limiting or MaxConnPerIP set.

While yes, the Apache docs do indicate KeepAlive is on by default, it's worth verifying on the running server; maybe it was patched downstream, etc.

I also did some debugging of related issues in the past by comparing the libostree curl settings with https://github.com/rpm-software-management/librepo FWIW.
