Add option for encode-marcXml and encode-xml for escaping unicode characters #529
Comments
I had a similar issue with Catmandu some years ago. |
Two things are mixed up here, I think:
The issue pointed out by @TobiasNx belongs to 2. See also https://stackoverflow.com/questions/730133/what-are-invalid-characters-in-xml |
What makes you think that? Syntax markers (1) are still escaped. AFAIUI, we're only talking about escaping invalid characters (2) here. |
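To make the distinction between (1) and (2) concrete, here is a minimal Java sketch. It is not Metafacture's implementation; the class and method names are made up for illustration. The first method escapes syntax markers, the second checks a code point against the `Char` production of XML 1.0. Note that escaping syntax markers leaves a character such as U+000B untouched, so the output can still be invalid XML.

```java
// Hypothetical helper for illustration only; not part of Metafacture.
final class XmlEscapingDemo {

    // (1) Escape XML syntax markers so they are not parsed as markup.
    static String escapeSyntaxMarkers(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c); // everything else passes through unchanged
            }
        }
        return out.toString();
    }

    // (2) XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }
}
```

With these helpers, `escapeSyntaxMarkers` turns `&` into `&amp;` but passes a vertical tab (U+000B) through as-is, while `isValidXmlChar(0x0B)` returns `false`.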
The docs of
That's why I think it does not do 2. (remove or replace invalid characters). As a side note, from the logs of QA catalogue:
I think the title of this issue is misleading, too. |
Now I see what you mean, thanks. I haven't actually looked at the character values in question ;) This implies that the behaviour in Metafacture hasn't changed, but rather that it's always been this way (with regard to invalid XML characters below This also means that simply passing |
Yep.
Looking at |
I would suggest the following:
https://metafacture.org/playground/?flux=inputFile%0A%7Copen-file%0A%7Cas-lines%0A%7Cdecode-json%0A%7Cencode-xml%28rootTag%3D%22collection%22%29%0A%7Cprint%0A%3B&data=%7B%22name%22%3A+%22Open+Educational+Resources+-+\u000Beine+kritische+Einf%C3%BChrung%22%7D
When special Unicode characters are part of the transformed metadata, encode-xml and encode-marcXml might produce invalid XML.
There should be a way to create valid XML, at least by adding an option for escaping Unicode characters.
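As a rough sketch of what such an option could do, the following Java snippet removes or replaces every code point that XML 1.0 does not allow. This is an assumption about possible behaviour, not the actual encode-xml or encode-marcXml code; `XmlCharSanitizer`, `sanitize`, and the `replacement` parameter are hypothetical names.

```java
// Hypothetical sketch; names and behaviour are assumptions, not Metafacture API.
final class XmlCharSanitizer {

    // XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // Replaces each invalid code point with `replacement`
    // (pass an empty string to drop invalid characters entirely).
    static String sanitize(String text, String replacement) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append(replacement);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Same value as in the playground example: contains U+000B (vertical tab).
        String name = "Open Educational Resources - " + (char) 0x0B
            + "eine kritische Einf\u00FChrung";
        System.out.println(sanitize(name, ""));       // drop invalid characters
        System.out.println(sanitize(name, "\uFFFD")); // or mark them with U+FFFD
    }
}
```

Whether invalid characters should be dropped, replaced, or reported as an error is a design decision for the encoder option; the sketch only shows that the check itself is cheap to apply per value.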