
add support parallel builds #11984

Merged · 28 commits · Oct 9, 2024
Conversation

chaegumi
Contributor

Changes

  • add support parallel builds

Testing

When `concurrency` is set to a value greater than 1, pages are built in parallel during the build.

{
  build: {
    concurrency: 1
  }
}

pnpm run build

Docs

Adds a `concurrency` configuration option to the `build` options.
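
For reference, the option is set in the Astro config (a minimal sketch; 10 is just an illustrative value):

// astro.config.mjs
import { defineConfig } from 'astro/config';

export default defineConfig({
  build: {
    // Number of pages to build in parallel; 1 (the default) keeps the
    // previous sequential behavior.
    concurrency: 10,
  },
});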


changeset-bot bot commented Sep 13, 2024

🦋 Changeset detected

Latest commit: 775b042

The changes in this PR will be included in the next version bump.


@github-actions bot added labels pkg: astro, docs pr on Sep 13, 2024
@github-actions bot added labels pkg: solid, pkg: integration on Sep 13, 2024
@ematipico
Member

ematipico commented Sep 13, 2024

At first I thought you were implementing the solution using worker threads, but your solution still uses the main thread. It feels more like a queue instead. Is my analysis incorrect?

@matthewp
Contributor

Thanks! As we're in the middle of releasing the Astro 5 beta, we likely won't get to this right away, but we'll circle back in a couple of weeks and take a look.

Note that we have had a few attempts to bring this in, but it never quite worked out. It might be worth looking at old PRs to see what went wrong in the past. Also, generating pages isn't usually the slow part of builds, so I'd love to see some examples of where this has big benefits. cc @bluwy who might have thoughts.

@chaegumi
Contributor Author

chaegumi commented Sep 13, 2024

My code may not be optimal, but this change is really important to me, and I believe it will also benefit Astro. My project is connected to a CMS API with more than 9,000 URLs. The current Astro build finishes one page before generating the next. My modification adds a setting to generate 10 or more URLs at the same time. Below are the results from my project:

Before:

12:40:24 [@astrojs/sitemap] sitemap-index.xml created at dist
12:40:24 [build] 9377 page(s) built in 7158.37s
12:40:24 [build] Complete!
Done in 7289.08s.

After:

21:45:59 [build] Waiting for integration "@astrojs/sitemap", hook "astro:build:done"...
21:46:01 [@astrojs/sitemap] sitemap-index.xml created at dist
21:46:01 [build] 9372 page(s) built in 4003.43s
21:46:01 [build] Complete!
Done in 4063.53s.

@bluwy
Member

bluwy commented Sep 16, 2024

Yeah this seems to still run on the main thread, but as I understand it, it allows multiple fetch calls from different pages to be executed in parallel? Usually for cases like this, we suggest doing all the fetches beforehand (e.g. from a shared file, during build only); that way you're not blocked by the page rendering order and can cache things better if needed.

Maybe that's the better approach here for your app? I think there's some value in a concurrency option, but it's hard to tell users what the ideal value is. If you set it too high, you can slow down the rendering process too.
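
The prefetch-and-share pattern suggested above could look something like this sketch (the module path, API URL, and getPosts helper are all hypothetical):

// src/data/posts.mjs: fetch once per build; every page that imports
// getPosts() awaits the same in-flight request, so rendering order
// no longer matters.
let postsPromise;

export function getPosts() {
  postsPromise ??= fetch('https://example.com/api/posts').then((res) => {
    if (!res.ok) throw new Error(`API responded with ${res.status}`);
    return res.json();
  });
  return postsPromise;
}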

@chaegumi
Contributor Author

chaegumi commented Sep 16, 2024

Assuming I have 10 pages, from page 1 to page 10: the current program generates page 2 after page 1 is completed, page 3 after page 2 is completed, and so on until page 10 is generated.
My requirement is simple. I want to generate two pages at a time (or however many the concurrency parameter I added specifies), such as page 1 and page 2, then page 3 and page 4. Is that okay?
So I used https://lodash.com/docs/4.17.15#chunk to split the pages that need to be generated, and then Promise.all.
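
In code, that batching approach reads roughly like this (a sketch; generatePage is a hypothetical stand-in for Astro's per-page build step):

import chunk from 'lodash/chunk.js';

// Build pages in fixed-size batches: each batch renders in parallel,
// and the next batch starts only after the whole previous one finishes.
async function buildInChunks(pages, concurrency, generatePage) {
  for (const batch of chunk(pages, concurrency)) {
    await Promise.all(batch.map((page) => generatePage(page)));
  }
}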

@gacek1123

@bluwy could we maybe get an experimental release for this?

I have an app that does all fetching beforehand and then renders pages from a shared cache of sorts, so I could maybe see if this makes any difference.

It is quite a bit smaller (around 800 pages) but we should still be able to see if there's any improvement.


@bluwy
Member

bluwy commented Sep 19, 2024

My requirement is simple. I want to generate two pages at a time (or however many the concurrency parameter I added specifies), such as page 1 and page 2, then page 3 and page 4. Is that okay?
So I used lodash.com/docs/4.17.15#chunk to split the pages that need to be generated, and then Promise.all

I think using p-limit would be better, so that we're always generating two pages at a time rather than having pages wait for the rest of their chunk. Before making any changes to that though, I think the team still needs to discuss whether this is something we want.
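
For comparison, a p-limit version (same hypothetical generatePage) keeps exactly `concurrency` pages in flight at all times, so one slow page never stalls the rest of its batch:

import pLimit from 'p-limit';

async function buildWithLimit(pages, concurrency, generatePage) {
  const limit = pLimit(concurrency);
  // Every page is scheduled up front, but at most `concurrency`
  // generatePage calls run at the same time.
  await Promise.all(pages.map((page) => limit(() => generatePage(page))));
}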

Like I mentioned, if you perform all the fetch calls beforehand, cache them, and share them between pages, does that help with the generation speed? That should make page generation even faster than a concurrency option would.

@bluwy could we maybe get an experimental release for this?

If you've not tested by linking locally like @chaegumi mentioned, I can definitely cut a preview release for this. But it looks like the lockfile is borked as it was installed with pnpm v8, so I'm not sure publishing will pass. We'd have to fix that first.

@chaegumi
Contributor Author

chaegumi commented Sep 19, 2024

if you perform all the fetch calls beforehand, cache them, and share them between pages, does that help with the generation speed?

My program was designed to do this. I started with https://github.com/11ty/eleventy-fetch for request caching, but once the .cache directory grew to a few hundred MB, the program failed to start. Later I switched to cacache, and I'm now using keyv and keyv-lru to cache my requests. I have also implemented Incremental Static Regeneration (ISR), but none of these solve my problem. I have no problem obtaining API data; I just want to control the number of pages generated simultaneously. One page takes 1 second, and I want to generate 10 pages per second instead.

But it looks like the lockfile is borked as it was installed with pnpm v8, so I'm not sure publishing will pass. We'd have to fix that first.

ok

@bluwy
Member

bluwy commented Sep 19, 2024

!snapshot build-concurrency

@bluwy
Member

bluwy commented Sep 19, 2024

It looks like preview releases don't work on forks :(

but none of these solve my problem. I have no problem obtaining API data; I just want to control the number of pages generated simultaneously. One page takes 1 second, and I want to generate 10 pages per second instead.

Ah, so you've already optimized network fetching, that's good to know. I'm curious what's still making page rendering slow. Anyway, I can't seem to cut a preview release, so we might have to wait for the rest of the team to discuss this feature.

@chaegumi
Contributor Author

English is not my native language, so my wording may not be accurate and may have caused a misunderstanding. My page rendering is not slow: 200ms, 300ms, 500ms, 600ms, and some pages take about 1 second.
The current build program generates pages one after another.
I want multiple at once, without queuing, because the order doesn't matter during generation.

Member

@sarah11918 sarah11918 left a comment


Thank you for this helpful feature, @chaegumi! 🙌

I am just reviewing the documentation part of the PR, and I've made some language suggestions. But I will need your knowledge to make sure it's still correct! Can you please make sure nothing is wrong with the docs in astro.ts, and then I will review the changeset to make the CHANGELOG message match?

Review comments (resolved) on packages/astro/src/@types/astro.ts, .changeset/ten-emus-heal.md, and packages/astro/src/core/config/schema.ts
Co-authored-by: Sarah Rainsberger <[email protected]>
@bluwy
Member

bluwy commented Oct 8, 2024

Thanks @sarah11918 those suggestions look great!

Member

@sarah11918 sarah11918 left a comment


Approving for docs! Let's go! 🚀

@ematipico ematipico merged commit 3ac2263 into withastro:main Oct 9, 2024
14 checks passed
@astrobot-houston astrobot-houston mentioned this pull request Oct 9, 2024
@chaegumi
Contributor Author

chaegumi commented Oct 9, 2024

Thank you all for your help.

@slorber

slorber commented Oct 14, 2024

Hey 👋 Docusaurus maintainer here.

I'm curious, what's the reason for Astro to use concurrency: 1 as a default and discourage its usage? I found it surprising.


To give some context: for Docusaurus, we went the other way around. We started with a webpack SSG plugin we didn't build, which used Promise.all(paths.map(generate)) under the hood: https://github.com/markdalgleish/static-site-generator-webpack-plugin

That worked well for us from 2020 to 2022, but some users reported memory spikes:
facebook/docusaurus#4765 (comment)

This led us to implement a similar solution (using p-map, which is more or less the same as p-limit but allows collecting results): slorber/static-site-generator-webpack-plugin@03b6fab

Today we have internalized this code but still use 32 as the default value, and nobody has complained so far:

const Concurrency = process.env.DOCUSAURUS_SSR_CONCURRENCY
  ? parseInt(process.env.DOCUSAURUS_SSR_CONCURRENCY, 10)
  : // Not easy to define a reasonable option default
    // See also https://github.com/sindresorhus/p-map/issues/24
    32;

Note that this code does both CPU work (React renderToString, HTML minification) and IO (writing HTML files to disk).
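
The p-map call has roughly this shape (a sketch; paths, renderPathToHtml, and outputFileFor are hypothetical stand-ins for the real Docusaurus internals):

import pMap from 'p-map';
import fs from 'node:fs/promises';

await pMap(
  paths,
  async (path) => {
    const html = renderPathToHtml(path); // CPU: React render + minification
    await fs.writeFile(outputFileFor(path), html); // IO: write HTML to disk
  },
  { concurrency: Concurrency },
);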


I'm curious, did I miss something that prevents you from adding more concurrency by default?

In the future I'm also planning to introduce a thread pool for SSG, so I'm curious about your past experiments and the lessons you learned here. Now that we've migrated to Rspack, the React SSG render has become a bottleneck for us.

@chaegumi
Contributor Author

My idea: 1 preserves the original behavior, and setting a larger value is up to the user. Set it if necessary; if not, don't.
The original program ran well, but I had this requirement because the API data I used was too slow. And this value cannot be too high, because my backend PHP API server cannot handle it. It is completely optional.

@slorber

slorber commented Oct 14, 2024

Thanks @chaegumi

A value of 1 is a safe/conservative default.

What makes me curious is that, from the outside, it seems you discourage users from using it and aren't really planning to increase that default. I was wondering why.

@bluwy
Member

bluwy commented Oct 14, 2024

@slorber I'm probably the main proponent of keeping this behaviour, so I'll try to explain:

  1. We found that higher concurrency doesn't necessarily mean faster builds. I took the Starlight docs (544 static pages) for a spin with concurrency 1, 4, and 32, and all of them built in around 6-7 seconds, with no clear winner among them. (macOS, M1 Pro)
  2. We found optimizing towards batching/caching fetch calls or data access to be more effective. A higher concurrency hides that concern, but the same total rendering work is still needed, so we'd like to nudge users towards that practice instead of bumping the concurrency as an easy fix.
  3. We'd like to reuse this option for threading / worker threads in the future. If it's set too high, it might break or degrade many sites unknowingly, so a low number now helps guard against that, especially if we later switch to os.cpus().length.

Btw, I noticed that Docusaurus uses renderToPipeableStream to render the pages. I don't know if you've tried it before, but we found rendering directly to a string (React's renderToString API in this case) to be slightly faster than using a stream. Maybe it helps in your case too. #10796
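
For reference, the two render shapes being compared look roughly like this (a sketch; app is the page's React element):

import { renderToString, renderToPipeableStream } from 'react-dom/server';
import { Writable } from 'node:stream';

// Direct: one synchronous call that returns the full HTML string.
export function renderDirect(app) {
  return renderToString(app);
}

// Streaming collected into a string: wait for the whole tree to be
// ready, then pipe it into an in-memory sink.
export function renderViaStream(app) {
  return new Promise((resolve, reject) => {
    let html = '';
    const sink = new Writable({
      write(chunk, _encoding, callback) {
        html += chunk;
        callback();
      },
      final(callback) {
        resolve(html);
        callback();
      },
    });
    const { pipe } = renderToPipeableStream(app, {
      onAllReady: () => pipe(sink),
      onError: reject,
    });
  });
}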

@slorber

slorber commented Oct 15, 2024

Thanks for the explanations @bluwy, that makes sense.

We don't have the same constraint on Docusaurus: we don't support async/await (yet) in components, and users can only load data before the SSG phase.


Regarding renderToPipeableStream, I agree we don't really need it. We still have a "shell" template that is not managed by React, but maybe someday we'll be able to stream SSG directly to disk (not sure how this could work, but I have hope 😅 ).

https://19.react.dev/reference/react-dom/server/renderToString

[screenshot of the React renderToString documentation]

In our case I didn't measure precisely, but streaming didn't seem much slower, and it looked like the newly recommended way, particularly if I later want to experiment with RSCs run at build time using async/await/Suspense (although, similarly, we may not want to encourage users to do that; it remains useful for flexibility).

@slorber

slorber commented Oct 24, 2024

@bluwy I tested on Docusaurus and saw no noticeable difference moving back to renderToString.

My intuition is that it was only faster for Astro because you didn't have concurrent builds in the first place. When rendering sequentially, a synchronous call will always be a bit faster than an asynchronous one, but this gets mitigated by parallelism.

@bluwy
Member

bluwy commented Oct 24, 2024

Interesting, thanks for checking that! Yeah, that could be it. I wonder if reducing or removing the parallelism in Docusaurus might reveal it better, but I won't bother you to test it 😅 If Node is able to manage async queues well enough, there might not be a significant perf difference.

@slorber

slorber commented Oct 24, 2024

Can confirm that a small local benchmark shows renderToString as faster with concurrency = 1:

  // A/B toggle: sync renderToString vs the streaming renderToHtml path
  const html =
    process.env.DOCUSAURUS_AB_BENCHMARK === 'true'
      ? renderToString(app)
      : await renderToHtml(app);
hyperfine --prepare 'yarn clear:website' --runs 5 'DOCUSAURUS_SSR_CONCURRENCY=1 DOCUSAURUS_AB_BENCHMARK=true yarn build:website --locale en' 'DOCUSAURUS_SSR_CONCURRENCY=1 DOCUSAURUS_AB_BENCHMARK=false yarn build:website --locale en'
Benchmark 1: DOCUSAURUS_SSR_CONCURRENCY=1 DOCUSAURUS_AB_BENCHMARK=true yarn build:website --locale en
  Time (mean ± σ):     42.539 s ±  0.371 s    [User: 78.937 s, System: 15.350 s]
  Range (min … max):   42.001 s … 42.844 s    5 runs

Benchmark 2: DOCUSAURUS_SSR_CONCURRENCY=1 DOCUSAURUS_AB_BENCHMARK=false yarn build:website --locale en
  Time (mean ± σ):     45.323 s ±  2.018 s    [User: 80.985 s, System: 14.029 s]
  Range (min … max):   42.832 s … 48.423 s    5 runs

Summary
  DOCUSAURUS_SSR_CONCURRENCY=1 DOCUSAURUS_AB_BENCHMARK=true yarn build:website --locale en ran
    1.07 ± 0.05 times faster than DOCUSAURUS_SSR_CONCURRENCY=1 DOCUSAURUS_AB_BENCHMARK=false yarn build:website --locale en

Note: this is the full website build, not just the SSG phase (which is ~15-20s with concurrency=1). If I isolated the SSG phase alone, the percentage would be higher.

@bluwy
Member

bluwy commented Oct 24, 2024

Very interesting! Thanks for testing it. Good to know that non-streaming is generally a little faster than streaming (for non-parallel rendering).

Labels: docs pr · pkg: astro · semver: minor

7 participants