fix: use version of htmlrewriter which does not make use of asyncify, which looks to have a potential memory leak under high load #2721
📊 Package size report 10%↑
Unchanged files
🤖 This report was automatically generated by pkg-size-action |
4a359c6 to b76dfb1
import { decode as _base64Decode } from './edge-runtime/vendor/deno.land/[email protected]/encoding/base64.ts';
import { init as htmlRewriterInit } from './edge-runtime/vendor/deno.land/x/[email protected]/src/index.ts'
import {handleMiddleware} from './edge-runtime/middleware.ts';
import handler from './server/${name}.js';

await htmlRewriterInit({ module_or_path: _base64Decode(${JSON.stringify(
  htmlRewriterWasm.toString('base64'),
)}).buffer });
Bundling the wasm file as-is doesn't seem to work, so instead we inline it using the same method we use for wasm modules used by users in middleware
opennextjs-netlify/src/build/functions/edge.ts
Lines 125 to 133 in 6b56128
parts.push(`import { decode as _base64Decode } from "${base64ModulePathRelativeToOutputFile}";`)
for (const wasmChunk of wasm ?? []) {
  const data = await readFile(join(srcDir, wasmChunk.filePath))
  parts.push(
    `const ${wasmChunk.name} = _base64Decode(${JSON.stringify(
      data.toString('base64'),
    )}).buffer`,
  )
}
and pass the module to the init function of htmlrewriter (argumentless init doesn't work after vendoring)
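The inlining described above can be sketched in isolation. This is only an illustration under stated assumptions, not the code in edge.ts: inlineWasmAsBase64, decodeBase64 and sampleWasmHeader are hypothetical names, Node's Buffer stands in for the Deno std decoder that the generated file imports as _base64Decode, and an 8-byte wasm header stands in for the real module.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: emits one line of JavaScript source that embeds
// `bytes` as a base64 string literal. `decodeFnName` is whatever decoder the
// generated module has in scope (the PR imports one as `_base64Decode`).
function inlineWasmAsBase64(
  constName: string,
  bytes: Uint8Array,
  decodeFnName = "_base64Decode",
): string {
  const base64 = Buffer.from(bytes).toString("base64");
  // JSON.stringify produces a safely quoted string literal.
  return `const ${constName} = ${decodeFnName}(${JSON.stringify(base64)}).buffer`;
}

// Stand-in for the runtime decoder (an assumption, not the Deno std function).
// Copying into a fresh Uint8Array ensures `.buffer` is exactly the decoded
// bytes rather than a larger pooled ArrayBuffer.
function decodeBase64(base64: string): Uint8Array {
  return new Uint8Array(Buffer.from(base64, "base64"));
}

// The 8-byte header of a wasm binary: "\0asm" followed by version 1.
const sampleWasmHeader = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
]);

const generatedLine = inlineWasmAsBase64("htmlRewriterWasm", sampleWasmHeader);
console.log(generatedLine);
```

Because the bytes travel as a plain string literal, the bundler never has to resolve a .wasm file; the cost is that the encoded text is larger than the binary it carries.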
I'm gonna have to just trust you on this one 😄
https://gist.github.com/pieh/b205a685e518ef62afaab392e61e064f this is a standalone script using this method that can be tested.
You can also test that passing garbage as the first argument of init, like:
await htmlRewriterInit({
  module_or_path: 'wat',
})
results in errors:
error: Uncaught (in promise) TypeError: Invalid URL: 'wat'
module_or_path = fetch(module_or_path);
^
at getSerialization (ext:deno_url/00_url.js:98:11)
at new URL (ext:deno_url/00_url.js:405:27)
at new Request (ext:deno_fetch/23_request.js:329:25)
at ext:deno_fetch/26_fetch.js:320:27
at new Promise (<anonymous>)
at fetch (ext:deno_fetch/26_fetch.js:316:18)
at __wbg_init (https://deno.land/x/[email protected]/pkg/htmlrewriter.js:1207:22)
at file:///Users/misiek/dev/pgs-next-runtime/edge-runtime/test.ts:11:7
so the argument has an effect.
Commenting out the init call also results in errors:
error: Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'htmlrewriter_new')
const ret = wasm.htmlrewriter_new(
^
at new HTMLRewriter (https://deno.land/x/[email protected]/pkg/htmlrewriter.js:781:22)
at Object.start (https://deno.land/x/[email protected]/src/index.ts:53:20)
at Module.invokeCallbackFunction (ext:deno_webidl/00_webidl.js:981:16)
at new TransformStream (ext:deno_web/06_streams.js:6214:16)
at HTMLRewriter.transform (https://deno.land/x/[email protected]/src/index.ts:48:29)
at rewriter (file:///Users/misiek/dev/pgs-next-runtime/edge-runtime/test.ts:39:6)
at file:///Users/misiek/dev/pgs-next-runtime/edge-runtime/test.ts:43:27
// htmlrewriter contains wasm files and those don't currently work great with vendoring
// see https://github.com/denoland/deno/issues/14123
// to workaround this we copy the wasm files manually
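A rough sketch of where such a manually copied file would need to land, assuming the vendor directory mirrors the host and path of the remote URL, as the vendored import paths in this PR suggest. vendoredPathFor is a hypothetical helper, and the version (v0.0.0) and wasm file name in the usage line are placeholders, not values taken from the PR.

```typescript
// Maps a remote module URL to a path under `vendorDir` that mirrors the
// URL's host and path (the layout seen in the vendored imports in this PR).
// Hypothetical helper; it only handles plain https URLs without a query string.
function vendoredPathFor(remoteUrl: string, vendorDir: string): string {
  const match = /^https:\/\/([^/?#]+)\/([^?#]+)$/.exec(remoteUrl);
  if (match === null) {
    throw new Error(`expected a plain https URL, got: ${remoteUrl}`);
  }
  const [, host, path] = match;
  return `${vendorDir}/${host}/${path}`;
}

// Placeholder version and file name, for illustration only.
const wasmDestination = vendoredPathFor(
  "https://deno.land/x/htmlrewriter@v0.0.0/pkg/htmlrewriter_bg.wasm",
  "edge-runtime/vendor",
);
console.log(wasmDestination);
```

A build script could then download the remote file and write it to that path after vendoring; that step is left out here because it depends on the build environment.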
😢 I left a comment on that issue, as it seems like it should be reopened since Deno landed WASM support
Deno landed it, but only in 2.1 - https://deno.com/blog/v2.1#first-class-wasm-support, so support is not guaranteed :( Otherwise we could migrate to wasm imports in https://github.com/netlify/htmlrewriter/blob/edb477af3d08359c8f25edd8d301df69ffcc8e4b/pkg/htmlrewriter.js#L1184 to make use of it
… which looks to have a potential memory leak under high load
We noticed the memory issue with Netlify's CSP plugin, which used the same htmlrewriter library. We've built a new htmlrewriter library which uses the latest version of lol-html and removes the ability to use async handlers, which is what required asyncify to be included.
Co-authored-by: Philippe Serhal <[email protected]>
af28413 to fdb11ed
await htmlRewriterInit({ module_or_path: _base64Decode(${JSON.stringify(
  htmlRewriterWasm.toString('base64'),
)}).buffer });
We could use a Uint8Array instead of a base64 string here; it would save on memory usage by 33%.
I did this in a similar package I made specifically for CSP support - I could do the same for the htmlrewriter package if we want:
await Deno.writeTextFile(
  "./pkg/embedded-wasm.ts",
  `export const wasmBinary = Uint8Array.from(${
    JSON.stringify(
      Array.from(
        await Deno.readFile(
          `./pkg/${wasmFile}`,
        ),
      ),
    )
  });`,
);
https://github.com/netlify/csp_nonce_html_transformer/blob/main/scripts/build.ts#L13C3-L25
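The 33% figure presumably refers to base64's encoding overhead. A small arithmetic sketch of that overhead, under assumptions: it counts characters of padded base64 text against raw bytes, says nothing about how a particular engine stores strings, and the 3 MiB size is a made-up example rather than the size of the real module.

```typescript
// Length of padded base64 text for `byteLength` raw bytes:
// every group of up to 3 bytes becomes 4 characters.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Overhead of the base64 text relative to the raw bytes, as a fraction.
function base64Overhead(byteLength: number): number {
  return base64Length(byteLength) / byteLength - 1;
}

// Hypothetical module size, chosen only for illustration.
const rawBytes = 3 * 1024 * 1024;
const encodedChars = base64Length(rawBytes);

console.log(`raw bytes: ${rawBytes}`);
console.log(`base64 characters: ${encodedChars}`);
console.log(`overhead: ${(base64Overhead(rawBytes) * 100).toFixed(1)}%`);
```

Put differently, the base64 text is about one third larger than the bytes it encodes, so holding only the decoded bytes is about one quarter smaller than holding the text.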
Description
This was based on #2707 (new PR, primarily due to problems with running tests against forks).
Need to use a bit of hacks and indirection due to:
deno vendor not playing well with wasm files (Support vendoring modules which read "static" files, like .wasm denoland/deno#14123), so I added a manual pull for this file after deno vendor in our build script
fetch attempt against file:// url which fails:
init function, but Netlify bundling also seems to skip .wasm files, so I inline the wasm module using the same method we currently use to inline wasm modules used by users in middleware
Documentation
Tests
Any test that makes use of middleware already tests this implicitly
Relevant links (GitHub issues, etc.) or a picture of cute animal
Fixes https://linear.app/netlify/issue/FRB-1523/nextjs-runtime-uses-the-same-html-rewriter-as-the-csp-plugin-that