"Results" sections of developer.mozilla.org (MDN) are not showing up #327

Open
benoit74 opened this issue Jun 21, 2024 · 5 comments

Comments

@benoit74
Collaborator

See webrecorder/wombat#156

This is not specific to Zimit2: the problem also happens in Zimit1 and on replayweb.page.

@benoit74 benoit74 added bug Something isn't working upstream labels Jun 21, 2024
@benoit74 benoit74 added this to the 2.1.0 milestone Jun 21, 2024
@benoit74 benoit74 changed the title "Results" sections of developer.mozilla.org are not showing up "Results" sections of developer.mozilla.org (MDN) are not showing up Jun 21, 2024
@benoit74 benoit74 self-assigned this Jun 25, 2024
@kelson42 kelson42 pinned this issue Jun 27, 2024
@benoit74
Collaborator Author

Despite being solved upstream, this is still not working within warc2zim: webrecorder/wombat#156 (comment)

@ikreymer
Collaborator

The wombat issue is fixed; the remaining issue is that warc2zim adds an unneeded %-encoding, converting from:

<iframe src="https://live.mdnplay.dev/en-US/docs/Web/HTML/Element/section/runner.html?id=before" ... >

to:

<iframe src="../../../../../../live.mdnplay.dev/en-US/docs/Web/HTML/Element/section/runner.html%3Fid%3Dbefore" ... >

The code on the page checks for the id query param, and is unable to find it.
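
A minimal Python sketch of why the lookup fails, using the two `src` values quoted above. It is only an illustration: the page's real check runs in client-side JS, and the helper name `id_param` is hypothetical. Once `?` and `=` are written as `%3F` and `%3D`, they belong to the path, so the URL no longer has a query string to parse.

```python
from urllib.parse import parse_qs, quote, urlsplit


def id_param(url: str):
    """Return the value of the `id` query parameter, or None if absent.

    Hypothetical helper standing in for the page's client-side check.
    """
    values = parse_qs(urlsplit(url).query).get("id")
    return values[0] if values else None


# The two iframe src values quoted in this thread.
live_src = (
    "https://live.mdnplay.dev/en-US/docs/Web/HTML/Element/section/"
    "runner.html?id=before"
)
rewritten_src = (
    "../../../../../../live.mdnplay.dev/en-US/docs/Web/HTML/Element/section/"
    "runner.html%3Fid%3Dbefore"
)

print(id_param(live_src))       # before
print(id_param(rewritten_src))  # None: "%3F" is part of the path, not a delimiter

# Percent-encoding the last segment with no safe characters reproduces
# the over-encoded form seen in the rewritten src.
print(quote("runner.html?id=before", safe=""))  # runner.html%3Fid%3Dbefore
```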

@benoit74
Collaborator Author

Unfortunately, the %-encoding is not unneeded. The URL is indeed over-encoded, but this is mostly mandatory, or at least it is the best tradeoff found so far (see #206 for lengthy discussions on this).

I agree this is what causes the problem here, so this is indeed purely a warc2zim issue.

Thank you for the analysis!

I don't know yet how we are supposed to handle this kind of situation ... but at least this is a Kiwix team problem ^^

I remember that we also faced the same kind of problem somewhere else, but I do not recall where.

It is not that common anyway, since query parameters are usually meant to be interpreted by the web server rather than by client-side JS.

@benoit74
Collaborator Author

One idea: add a regex of URLs for which we want to ignore the query parameter.

To be tested, but I think that in most cases (at least it is the case here on MDN), the server simply ignores the query parameter, which is only used client-side.

We can hence:

  • store the WARC response without the query parameter in the ZIM
  • rewrite the URL and keep the query parameter as-is
  • at read time, the query parameter should/will be dropped in most readers (to be tested as well, big uncertainty on this)
  • and since the query parameter is intact, it will be "natural" to process by the JS client-side
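
The steps above could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the rule list, the function names, and the simplified host-rooted paths are hypothetical and are not warc2zim's actual code or configuration (the real rewriter also emits relative links, which is omitted here).

```python
import re
from urllib.parse import quote, urlsplit

# Hypothetical rule list: URLs whose query string is only read client-side,
# so the server response does not depend on it.
IGNORE_QUERY_RULES = [
    re.compile(r"^https://live\.mdnplay\.dev/.*/runner\.html\?"),
]


def ignores_query(url: str) -> bool:
    """True if the URL matches one of the (hypothetical) ignore rules."""
    return any(rule.search(url) for rule in IGNORE_QUERY_RULES)


def zim_entry_path(url: str) -> str:
    """Path under which the WARC response would be stored in the ZIM."""
    parts = urlsplit(url)
    path = parts.netloc + parts.path
    if parts.query and not ignores_query(url):
        path += "?" + parts.query
    return path


def rewritten_link(url: str) -> str:
    """Link written into the HTML (relative prefix omitted for brevity)."""
    parts = urlsplit(url)
    base = parts.netloc + quote(parts.path)
    if not parts.query:
        return base
    if ignores_query(url):
        # Keep "?" literal so client-side JS still sees a query string.
        return base + "?" + parts.query
    # Default behaviour discussed in this thread: encode "?" into the path.
    return base + quote("?" + parts.query, safe="")


url = (
    "https://live.mdnplay.dev/en-US/docs/Web/HTML/Element/section/"
    "runner.html?id=before"
)
print(zim_entry_path(url))
print(rewritten_link(url))
```

With this sketch the entry is stored once, without the query, while the link keeps `?id=before` intact. Whether readers then drop the query when resolving the entry is the open question raised in the list above.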

@benoit74
Collaborator Author

benoit74 commented Nov 14, 2024

Here is a test ZIM demoing what I proposed above: tests_eng_mdn-page_2024-11.zim.zip (remove the .zip extension, which was added to please GitHub)

Note that the trick to rewrite the URL without escaping the ? had to be done in wombatSetup.js because the URL is built dynamically on the JS side ... which adds a bit of a problem ^^

At least the ZIM works well in kiwix-serve, mostly OK in kiwix-apple (there is a very different problem, see kiwix/kiwix-apple#1027), mostly OK in kiwix-android (again a very different problem, see kiwix/kiwix-android#4084), and OK on Kiwix PWA (Firefox on macOS and on Android) and Kiwix JS (Firefox on macOS).

That being said, it is going to be pretty ugly to integrate these changes into the codebase in a generic manner (especially since we need to pass this information to JS at runtime). All thoughts are welcome ^^ (and in the meantime, I will create a new WARC of https://farm.openzim.org/recipes/developer.mozilla.org_en and build the ZIM manually with my hacks, if that is OK with you, at least to play with in dev).
