Node 100x faster writing to AWS S3 (PUT's) than Bun v1.0.15 (in same region) #7428

Closed
asilvas opened this issue Dec 3, 2023 · 31 comments · Fixed by #13434
Comments

@asilvas
Contributor

asilvas commented Dec 3, 2023

What version of Bun is running?

1.0.15+b3bdf22eb

What platform is your computer?

Linux 6.1.0-13-cloud-amd64 x86_64 unknown

What steps can reproduce the bug?

import { S3Client, PutObjectCommand, GetObjectCommand } from '@aws-sdk/client-s3';

const BUCKETS = ['myS3ExpressBucket', 'myS3StandardBucket'];
const CONCURRENCY = 10;
const TIME_PER_TEST_MS = 10_000;

const s3 = new S3Client({
  region: process.env.AWS_REGION || 'us-east-1',
});

for (const Bucket of BUCKETS) {
  await testBucket(Bucket);
}

async function testBucket(Bucket) {
  console.log(`Write testing ${Bucket}...`);
  const writes = await testWrites(Bucket);
  console.log(`Read testing ${Bucket}...`);
  const reads = await testReads(Bucket);

  console.log(`Write rate: ${writes.rate.toFixed(1)} ops/sec, latency: ${writes.latency.toFixed(1)} ms/op`);
  console.log(`Read rate: ${reads.rate.toFixed(1)} ops/sec, latency: ${reads.latency.toFixed(1)} ms/op`);
}

async function testWrites(Bucket) {
  const start = Date.now();
  let count = 0;
  while (Date.now() - start < TIME_PER_TEST_MS) {
    await Promise.all(Array.from({ length: CONCURRENCY }, async (_, i) => {
      await s3.send(new PutObjectCommand({
        Bucket,
        Key: `rand/test-${i}.txt`,
        Body: 'Hello World!',
      }));
    }));
    count += CONCURRENCY;
  }

  const elapsed = Date.now() - start;
  const rate = count / (elapsed / 1000);
  const latency = (elapsed / count) * 1000; // note: elapsed is already in ms, so this over-reports latency by 1000x

  return { count, rate, latency };
}

async function testReads(Bucket) {
  const start = Date.now();
  let count = 0;
  while (Date.now() - start < TIME_PER_TEST_MS) {
    await Promise.all(Array.from({ length: CONCURRENCY }, async (_, i) => {
      await s3.send(new GetObjectCommand({
        Bucket,
        Key: `rand/test-${i}.txt`,
      }));
    }));
    count += CONCURRENCY;
  }

  const elapsed = Date.now() - start;
  const rate = count / (elapsed / 1000);
  const latency = (elapsed / count) * 1000; // note: elapsed is already in ms, so this over-reports latency by 1000x

  return { count, rate, latency };
}

What is the expected behavior?

  1. I should be able to increase concurrency until I saturate the CPU
  2. Performance should be comparable or better than Node
  3. Should be able to easily saturate at least one CPU core (per process)

What do you see instead?

> node s3-test.js
Write testing myS3ExpressBucket...
Read testing myS3ExpressBucket...
Write rate: 991.5 ops/sec, latency: 1008.6 ms/op
Read rate: 9.1 ops/sec, latency: 109818.2 ms/op
Write testing myS3StandardBucket...
Read testing myS3StandardBucket...
Write rate: 94.2 ops/sec, latency: 10611.6 ms/op
Read rate: 9.0 ops/sec, latency: 111709.1 ms/op

> bun s3-test.js
Write testing myS3ExpressBucket...
Read testing myS3ExpressBucket...
Write rate: 9.6 ops/sec, latency: 103770.0 ms/op
Read rate: 1489.6 ops/sec, latency: 671.3 ms/op
Write testing myS3StandardBucket...
Read testing myS3StandardBucket...
Write rate: 8.9 ops/sec, latency: 112833.3 ms/op
Read rate: 234.3 ops/sec, latency: 4267.7 ms/op

Oddly enough Node & Bun seem almost inverse between reads and writes. But I'm not worried about read performance in this issue.

Additional information

  1. This test has been run several times on dedicated VM's in AWS, in the same region as the buckets.
  2. S3 Express is very new so not super relevant to this issue, but calling it out to clarify why performance is 5-10x faster between bucket types
  3. Using the very latest AWS SDK (have tried a few older versions as well)
  4. This is blocking a pretty big project that is heavily invested in Bun. I'll continue to troubleshoot but as of now am blocked
  5. S3 is bottlenecked at the partition level, so yes, you can increase the number of unique objects -- the above example is just the simplest way to show how bad the situation is with low (10) concurrency. It's equally slow at 100 concurrency due to this bug.
  6. CPU usage is very low (< 10%) while Bun is doing write tests -- it's waiting on something.
@asilvas asilvas added the bug Something isn't working label Dec 3, 2023
@asilvas asilvas changed the title Node 100x faster writing to S3 (PUT's) than Bun v1.0.15 Node 100x faster writing to AWS S3 (PUT's) than Bun v1.0.15 (in same region) Dec 3, 2023
@Jarred-Sumner
Collaborator

Thank you for the detailed report. I don't know what's causing this but we will look into it.

As a stopgap, you might be able to make it work with just fetch. node:http internally is currently implemented as a wrapper around fetch.

It would also be interesting to see if it's using HTTP2. If it's using HTTP2, that would potentially explain it because that code is very new.

@asilvas
Contributor Author

asilvas commented Dec 3, 2023

Thank you for the detailed report. I don't know what's causing this but we will look into it.

As a stopgap, you might be able to make it work with just fetch. node:http internally is currently implemented as a wrapper around fetch.

It would also be interesting to see if it's using HTTP2. If it's using HTTP2, that would potentially explain it because that code is very new.

Thanks for the quick response Jarred. Unless it's changed recently the major AWS API's don't support H2 (on the client side, anyway) but I'll report any findings here.

@asilvas
Contributor Author

asilvas commented Dec 3, 2023

Verified their fetch handler FetchHttpHandler does not work either, since it was designed for the browser. I could probably build a new handler, but I'm not sure that's the best use of time since this is likely a bug. Just haven't narrowed it down.

@asilvas
Contributor Author

asilvas commented Dec 3, 2023

Looks like I identified the root cause.

The NodeHttpHandler is waiting on a continue event on the request that never fires, and as a result times out after 1 second (by default). I verified this is what's happening.

https://github.com/smithy-lang/smithy-typescript/blob/main/packages/node-http-handler/src/write-request-body.ts#L29
https://github.com/smithy-lang/smithy-typescript/blob/main/packages/node-http-handler/src/node-http-handler.ts#L178

Doesn't look like there is an option to work around this since it is tied to requestTimeout, and if I set that low the request will just fail due to timeout. I'll attempt to work around it with a custom handler but this seems like a pretty big bug?
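
For anyone else hitting this in the meantime, here is a rough sketch of what such a workaround might look like, assuming the Expect: 100-continue header the SDK adds is what triggers that wait (untested; the middleware name is made up):

import { S3Client } from '@aws-sdk/client-s3';

const s3 = new S3Client({ region: process.env.AWS_REGION || 'us-east-1' });

// Hypothetical workaround: drop the Expect header at the "build" step so the
// node-http-handler never waits on a 'continue' event before writing the body.
s3.middlewareStack.add(
  (next) => async (args) => {
    if (args.request?.headers) {
      delete args.request.headers['Expect'];
      delete args.request.headers['expect'];
    }
    return next(args);
  },
  { step: 'build', name: 'stripExpectContinueMiddleware' },
);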

@asilvas
Contributor Author

asilvas commented Dec 3, 2023

Not resolved, but with a hacky workaround I was at least able to demonstrate the real performance:

bun tools/s3-test.ts
Write testing myS3ExpressBucket...
Read testing myS3ExpressBucket...
Write rate: 1036.6 ops/sec, latency: 964.7 ms/op
Read rate: 1533.7 ops/sec, latency: 652.0 ms/op
Write testing myS3StandardBucket...
Read testing myS3StandardBucket...
Write rate: 95.3 ops/sec, latency: 10495.9 ms/op
Read rate: 230.1 ops/sec, latency: 4346.8 ms/op

About 4.5% faster than Node. Ignore the invalid latency stats in my quick test.

@asilvas
Contributor Author

asilvas commented Dec 5, 2023

This bug has turned into quite the nightmare. The custom handler workaround skips the 100-continue and works for most cases, but hangs indefinitely when deployed into an AWS Lambda with the custom Bun runtime layer.

@cirospaciari cirospaciari self-assigned this Dec 5, 2023
@asilvas
Contributor Author

asilvas commented Dec 10, 2023

The linked PR might resolve this.

@asilvas
Contributor Author

asilvas commented Dec 24, 2023

I've spent dozens of hours working around this issue, but none of the workarounds have proven reliable across all AWS v3 SDK clients. I'm trying to avoid porting a major project back to Node over this issue. Any idea if this will be on your radar anytime soon? I'm a little shocked more people haven't reported this, considering it's the largest cloud provider and all.

@Tirke

Tirke commented Jan 25, 2024

@asilvas I tried to use Bun with simple scripts (Kinesis and S3) and found that Bun is very buggy with both.

S3:

import { S3Client, ListObjectsV2Command, GetObjectCommand } from '@aws-sdk/client-s3';

const s3Client = new S3Client({});
const bucketName = 'bucket-name';

// Fetch all files in the bucket
const listCommand = new ListObjectsV2Command({ Bucket: bucketName });
const listResponse = await s3Client.send(listCommand);
const objects = listResponse.Contents || [];

// Iterate over the files
for (const object of objects) {
  try {
    const getCommand = new GetObjectCommand({ Bucket: bucketName, Key: object.Key });
    const getResponse = await s3Client.send(getCommand);
    const content = await getResponse?.Body?.transformToString(); // Bun freezes randomly (but rapidly) on that line
    console.log(content);
  } catch (error) {
    console.error(error);
  }
}

And Kinesis doesn't work at all; I get:
RangeError: Maximum call stack size exceeded.
error: Unexpected error: http2 request did not get a response

import { Kinesis, PutRecordsCommand } from '@aws-sdk/client-kinesis'

const streamName = 'stream-name'
const kinesisClient = new Kinesis({})
const textEncoder = new TextEncoder()

const putRecordsCommand = new PutRecordsCommand({
  Records: [
    {
      Data: textEncoder.encode(JSON.stringify({ message: 'Hello, msg1!' })),
      PartitionKey: 'partitionKey1',
    },
    {
      Data: textEncoder.encode(JSON.stringify({ message: 'Hello, msg2!' })),
      PartitionKey: 'partitionKey1',
    },
  ],
  StreamName: streamName,
})

try {
  const result = await kinesisClient.send(putRecordsCommand) // this doesn't work at all in Bun
  console.log(result)
} catch (error) {
  console.error(error)
}

@Berndy

Berndy commented Jan 26, 2024

@asilvas @Tirke I'm having a very similar issue, specifically with the ListObjectsV2Command. I've seen it maybe once on my local Ubuntu 22 machine running bun 1.0.25, and pretty consistently on my EC2 instance with docker images based on oven/bun:alpine, oven/bun:slim and oven/bun:latest.

I have the following listPaged function, where sometimes the ListObjectsV2Command just never returns; it doesn't even throw a timeout error.

// imports assumed (Observable from rxjs) and client constructed elsewhere in the original module
import { Observable } from 'rxjs';
import { S3Client, ListObjectsV2Command } from '@aws-sdk/client-s3';

const client = new S3Client({});

export const s3listPaged = (bucket, prefix) => new Observable(subscriber => {
    try {
        const _FETCH_PAGE = async continuationToken => {
            const command = new ListObjectsV2Command({
                Bucket: bucket,
                Prefix: prefix,
                ContinuationToken: continuationToken,
            });

            const ContinuationToken = await client.send(command).then(({ Contents, NextContinuationToken }) => {
                subscriber.next(Contents || []);
                return NextContinuationToken;
            });

            ContinuationToken ? _FETCH_PAGE(ContinuationToken) : subscriber.complete();
        };
        _FETCH_PAGE();
    } catch (err) {
        subscriber.error(err);
    }
});

I can work around this by implementing my own timeout and some retries, but that does not seem like a clean solution. I'm worried this would just leave those connections open and block newer ones down the line?
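
One way to avoid leaving a hung request's connection open (a rough, untested sketch with an arbitrary timeout) is to pass an abortSignal to send() so the stuck request is actually torn down when the timeout fires:

import { S3Client, ListObjectsV2Command } from '@aws-sdk/client-s3';

const client = new S3Client({});

// Abort the in-flight request on timeout instead of racing a bare timer,
// so a hung ListObjectsV2Command doesn't keep its connection open.
async function sendWithTimeout(command, timeoutMs = 10_000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await client.send(command, { abortSignal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}

const page = await sendWithTimeout(new ListObjectsV2Command({ Bucket: 'bucket-name' }));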

@asilvas
Contributor Author

asilvas commented Jan 26, 2024

Anyone waiting on a fix should thumbs up the original post to get it more priority.

@jotanarciso

👀

@raduconst06

Encountering the same problems.
It seems there is a whole range of issues regarding the AWS S3 SDK and gpc.

@ScreamZ

ScreamZ commented Feb 18, 2024

I've no idea if bun supports it yet, but have you tried to use https://clinicjs.org/ to diagnose?

@samuelAndalon

samuelAndalon commented Mar 1, 2024

I just got bit by this using the S3Client. Locally, even with Docker, it works perfectly, but when deployed it randomly hangs forever. In my case I just need to watch a file for changes, and this is the only issue stopping me from using Bun in production; based on some benchmarks, Bun is 600x faster for what I need. For now I'll switch to using the AWS CLI to watch the file via a bash script, and Bun's fs for the local file system side.

@asilvas-godaddy

Keep thumbs upping the original post to give it more attention

@Jarred-Sumner
Collaborator

This PR #8456 might fix it

@samuelAndalon

@Jarred-Sumner just tried 1.0.33; it seems the AWS SDK hanging issue got fixed. Will keep an eye on it.

@samuelAndalon

can confirm this is fixed.

@par5ul1

par5ul1 commented Mar 26, 2024

On 1.0.35 I still have issues. A simple 'Hello world!' PUT hangs for >5 mins (I didn't wait long enough to find out if it ever finishes).

@marcosrjjunior

There is still a noticeable delay on this request.

Here is an example to help with testing, using Content-Type: multipart/form-data.

You can test using the simple put function or using Upload.

.env

AWS_REGION=
AWS_BUCKET=
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
import { fromEnv } from '@aws-sdk/credential-providers'
import { Upload } from '@aws-sdk/lib-storage'
import * as S3 from '@aws-sdk/client-s3'

const s3 = new S3.S3Client({
  region: process.env.AWS_REGION,
  credentials: fromEnv(),
})


const putObject = async ({
  key,
  file
}: {
  key: string
  file: File | Blob
}) => {
  const arrayBufferTest = await file.arrayBuffer()
  const content = Buffer.from(arrayBufferTest)

  const command = new S3.PutObjectCommand({
    Bucket: process.env.AWS_BUCKET,
    Key: key,
    Body: content,
  })

  // const parallelUploads3 = new Upload({
  //   client: s3,
  //   params: { Bucket: process.env.AWS_BUCKET, Key: key, Body: content },
  //   // tags: [
  //   //   /*...*/
  //   // ], // optional tags
  //   queueSize: 4, // optional concurrency configuration
  //   partSize: 1024 * 1024 * 20, // optional size of each part, in bytes, at least 5MB
  //   leavePartsOnError: true, // optional manually handle dropped parts
  // })

  try {
    const response = await s3.send(command)

    // const response = await parallelUploads3.done()

    return response
  } catch (error) {
    console.log(error)
    throw error
  }
}

const response = await putObject({
  key: 'file-key.jpg',
  file: params.photo,
})

console.log('putObject: response', response)

Dependencies (Bun v1.1.0):

"@aws-sdk/client-s3": "^3.540.0",
"@aws-sdk/credential-providers": "^3.540.0",
"@aws-sdk/lib-storage": "^3.540.0",
"@aws-sdk/s3-request-presigner": "^3.540.0",

last tested: 02/04

@mrherickz

@marcosrjjunior same here; I was able to work around it and unblock myself for now by just signing the URL and uploading the file via fetch.

import { getSignedUrl } from "@aws-sdk/s3-request-presigner";
import { PutObjectCommand } from "@aws-sdk/client-s3";

// TODO: this is a workaround for a performance issue with the S3 client
const url = await getSignedUrl(
  s3Client,
  new PutObjectCommand({
    Bucket: "awesome-bucket",
    Key: "ricardo-milos.png"
  })
);

await fetch(url, {
  method: "PUT",
  body
});

Previously a PUT to my local machine's S3 bucket would take more than a second; using this brings it down to a few ms.

Not sure if there's an easier approach, but that works.

Hopefully the Bun team will get that solved.

@rossanmol

Experiencing same issue.

@lnlife

lnlife commented May 20, 2024

Experiencing the same issue. When I use AWS SDK v2, it uploads the file normally. If I switch to v3, it is 1 second slower.

@FeldrinH

Verified their fetch handler FetchHttpHandler does not work either, since it was designed for browser. I could probably build a new handler but I'm not sure that's the best use of time since this is likely a bug. Just haven't narrowed it down.

AWS SDK has since fixed some compatibility issues with FetchHttpHandler (see aws/aws-sdk-js-v3#4619). Might be worth checking again, in case FetchHttpHandler is now compatible with Bun.
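
If anyone wants to retest, wiring it up should just be a matter of swapping the request handler, roughly like this (untested sketch; the package name assumes the newer @smithy/fetch-http-handler, older SDK versions exported it from @aws-sdk/fetch-http-handler):

import { S3Client } from '@aws-sdk/client-s3';
import { FetchHttpHandler } from '@smithy/fetch-http-handler';

// Route SDK requests through fetch instead of the node:http-based NodeHttpHandler.
const s3 = new S3Client({
  region: process.env.AWS_REGION || 'us-east-1',
  requestHandler: new FetchHttpHandler({ requestTimeout: 30_000 }),
});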

@marcosrjjunior

same response on my end

@Jarred-Sumner
Collaborator

We haven't implemented HTTP request body streaming, so it's probably buffering. This likely won't be fixed until that is implemented.
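
For anyone unfamiliar, "request body streaming" here means fetch sending a ReadableStream body as it is produced instead of buffering the whole thing first, roughly like the sketch below (the URL and file name are placeholders; without streaming support the runtime buffers the body before sending):

// Placeholder presigned URL and file, just to illustrate a streamed (non-buffered) PUT body.
const presignedUrl = 'https://example-bucket.s3.amazonaws.com/big-upload.bin?X-Amz-Signature=...';
const file = Bun.file('big-upload.bin');

await fetch(presignedUrl, {
  method: 'PUT',
  body: file.stream(), // a ReadableStream; only streamed if the runtime supports it
  // Node's fetch requires duplex: 'half' for streamed request bodies.
  duplex: 'half',
});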

@kravetsone

We haven't implemented HTTP request body streaming, so it's probably buffering. This likely won't be fixed until that is implemented.

This is incredibly important for backend development.

@Jarred-Sumner
Collaborator

This was fixed in Bun v1.1.25

@par5ul1

par5ul1 commented Sep 4, 2024

I have updated both @aws-sdk and bun to latest. The slowdown is still there. I am happy to provide debug info; just don't know what would be helpful. This is the relevant snippet:

const params: PutObjectCommandInput = {
  Bucket: bucket,
  Key: filePath,
  Body: data, // buffer
  ContentType: contentType,
};

const uploadCommand = new PutObjectCommand(params);

try {
  await Promise.race([
    s3Client.send(uploadCommand),
    new Promise((_, reject) => {
      setTimeout(() => reject(new Error('Upload timeout')), 60000 * 5); // 5 minutes timeout
    }),
  ]);
  // ...

bun (package.json and machine): 1.1.26
@aws-sdk/client-s3: 3.637.0

Machine Specs: M2, MacOS 14.0

@marcosrjjunior

@par5ul1 the issue was solved for me.
(the project is using the exact versions mentioned for @aws-sdk/client-s3 and bun)

here is the example I'm using
