Provide an API to serialize with the "require well-formed" parameter set to true #84

Open
geofft opened this issue Aug 26, 2024 · 0 comments

geofft commented Aug 26, 2024

The serializeToString method of XMLSerializer is specified to "produce an XML serialization of root passing a value of false for the require well-formed parameter, and return the result." It's a little confusing that something called XMLSerializer can return output that isn't well-formed XML, but I understand that this can't be changed for backwards compatibility. Still, it would be useful to have a mechanism that sets the "require well-formed" parameter to true, i.e., one that throws if the node cannot be serialized to XML.

Background: I'm trying to use the technique in this blog post to render HTML to an image by creating an SVG with a <foreignObject> containing the HTML. As noted on the page, because SVG is XML, the contents of <foreignObject> need to be well-formed XML. Doing this with serializeToString, which the post suggests, works for most documents, but not for certain less-than-well-formed HTML documents that the browser nevertheless parses successfully. The specific case I ran into was an attribute with unescaped quotation marks in its value:

<meta property="og:description" content="I forgot to "escape" this value">

which gets parsed as

<meta property="og:description" content="I forgot to " escape"="" this="" value"="">

i.e., it picks up some attributes whose names have a quotation mark in them. (You can see this by setting an element's innerHTML to the first string and then reading innerHTML again.) This can't be represented in XML, but serializeToString successfully returns an "XML" document with this syntax, which the browser cannot deserialize as XML (e.g., in an <img> with SVG source, or with new DOMParser().parseFromString(xml, "text/xml")).
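
For illustration, this particular failure mode can be caught before serializing by checking attribute names against the XML Name production. This is a minimal sketch, assuming the XML 1.0 (Fifth Edition) character ranges. findNonXmlNames is a hypothetical helper, not an existing API, and it covers only this one way a DOM can be unserializable:

```javascript
// Character ranges for the XML 1.0 (Fifth Edition) NameStartChar and NameChar productions.
const NAME_START = String.raw`:A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\u{10000}-\u{EFFFF}`;
const NAME_CHAR = NAME_START + String.raw`\-.0-9\u00B7\u0300-\u036F\u203F\u2040`;
const XML_NAME = new RegExp(`^[${NAME_START}][${NAME_CHAR}]*$`, "u");

// Hypothetical helper: returns the names that do not match the XML Name production.
function findNonXmlNames(names) {
  return Array.from(names).filter((name) => !XML_NAME.test(name));
}

// Attribute names produced by parsing the <meta> example above.
const bad = findNonXmlNames(["property", "content", 'escape"', "this", 'value"']);
console.log(bad); // the two names that contain a quotation mark
```

In a browser, the names could be collected with something like Array.from(element.attributes, (a) => a.name) for each element in the tree.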

I can try to see if DOMParser succeeds and throw away the parse if successful, or catch the error event from the <img>, but it would be cleanest if I could just get serializeToString to fail in the first place. Is it possible to add an optional boolean parameter serializeToString(document, requireWellFormed) that defaults to false, or a property of the XMLSerializer, or something?
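
A minimal sketch of the DOMParser workaround, for comparison with what a built-in option would give. serializeStrict is a hypothetical name, and the serializer and parser are passed in as arguments (in a browser, new XMLSerializer() and new DOMParser()). It relies on parseFromString reporting failure through a parsererror element rather than by throwing, so a document that legitimately contains an element with that name would be a false positive:

```javascript
// Hypothetical wrapper: serialize, re-parse as XML, and throw if the re-parse fails.
// Intended for Document or Element nodes, which serialize to a single root element.
function serializeStrict(node, serializer, parser) {
  const xml = serializer.serializeToString(node);
  // The result of this parse is only inspected for an error marker, then discarded.
  const doc = parser.parseFromString(xml, "text/xml");
  if (doc.getElementsByTagName("parsererror").length > 0) {
    throw new Error("Node cannot be serialized to well-formed XML");
  }
  return xml;
}
```

Usage in a browser might look like serializeStrict(document.documentElement, new XMLSerializer(), new DOMParser()). The cost is a full extra parse of the output, which is what a requireWellFormed option on the serializer would avoid.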

(Originally reported as https://bugzilla.mozilla.org/1914813 because I didn't realize the spec requires this, but it does, and the behavior is the same in Firefox, Safari, and Chrome. See also mdn/content#35585.)
