thalesdoc_en_all is failing #1214

Open
kelson42 opened this issue Nov 22, 2024 · 1 comment
Labels
Bug (something isn't working), Upstream (for tickets which are waiting for an upstream modification, typically in the scraper or the target website)

Comments

@kelson42
Collaborator

Recipe URL

https://farm.openzim.org/recipes/thalesdoc_en_all/

Last log lines

[zimit::2024-11-22 06:34:07,976] INFO:----------
[zimit::2024-11-22 06:34:07,976] INFO:Processing WARC files in/at /output/.tmpsx6dc3yt/collections/crawl-20241116163210663/archive
[zimit::2024-11-22 06:34:07,976] INFO:Calling warc2zim with these args: ['--name=thalesdoc_en_all', '--tags=HSM CipherTrust', '--favicon=https://drive.farm.openzim.org/thalesdoc_en_all/favicon.png', '--zim-file=thalesdoc_en_all_{period}.zim', '--publisher=openZIM', '--scraper-suffix', 'zimit 2.1.6', '--output', '/output', '--url', 'https://thalesdocs.com/', '--custom-css', 'https://drive.farm.openzim.org/thalesdoc_en_all/custom.css', '--title', 'Thales CPL Documentation Hub', '--description', 'Home to all your Cloud Protection and Licensing product documentation needs', '--lang', 'eng', '-v', '--progress-file', '/output/warc2zim.json', '/output/.tmpsx6dc3yt/collections/crawl-20241116163210663/archive']
[warc2zim::2024-11-22 06:34:07,978] DEBUG:Attempting to confirm output is writable in directory /output
[warc2zim::2024-11-22 06:34:07,978] DEBUG:Output is writable. Temporary file used for test: /output/tmp2k9xk5h_
[warc2zim::2024-11-22 06:34:07,978] DEBUG:Confirming ZIM file can be created using name: thalesdoc_en_all_2024-11.zim
[warc2zim::2024-11-22 06:34:07,979] DEBUG:4 WARC files found
[warc2zim::2024-11-22 06:34:07,999] DEBUG:Title: Thales CPL Documentation Hub
[warc2zim::2024-11-22 06:34:07,999] DEBUG:Language: eng
[warc2zim::2024-11-22 06:34:07,999] DEBUG:Favicons to consider: https://drive.farm.openzim.org/thalesdoc_en_all/favicon.png
[warc2zim::2024-11-22 06:34:08,020] ERROR:Main URL returned an unprocessable HTTP code: 403
[zimit::2024-11-22 06:34:08,021] INFO:
[zimit::2024-11-22 06:34:08,021] INFO:
[zimit::2024-11-22 06:34:08,021] INFO:SIGINT/SIGTERM received, stopping zimit
[zimit::2024-11-22 06:34:08,021] INFO:
[zimit::2024-11-22 06:34:08,021] INFO:
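
When triaging a log like the one above, it can help to pull out only the ERROR entries. The following is a minimal sketch, assuming every line follows the `[tool::timestamp] LEVEL:message` shape seen in this excerpt; the `find_errors` helper and the `LOG_LINE` pattern are hypothetical names, not part of zimit or warc2zim.

```python
import re

# Assumed line shape, taken from the excerpt above:
# "[tool::YYYY-MM-DD HH:MM:SS,mmm] LEVEL:message"
LOG_LINE = re.compile(
    r"^\[(?P<tool>[\w-]+)::(?P<timestamp>[^\]]+)\] (?P<level>[A-Z]+):(?P<message>.*)$"
)


def find_errors(log_text):
    """Return (tool, timestamp, message) for every ERROR line in log_text."""
    errors = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            errors.append(
                (match.group("tool"), match.group("timestamp"), match.group("message"))
            )
    return errors


# Three lines copied from the log above.
sample = """\
[warc2zim::2024-11-22 06:34:07,999] DEBUG:Language: eng
[warc2zim::2024-11-22 06:34:08,020] ERROR:Main URL returned an unprocessable HTTP code: 403
[zimit::2024-11-22 06:34:08,021] INFO:SIGINT/SIGTERM received, stopping zimit
"""

for tool, timestamp, message in find_errors(sample):
    print(f"{tool} @ {timestamp}: {message}")
```

On this excerpt the only ERROR entry is the warc2zim line about the main URL, logged about a millisecond before zimit reports that it is stopping.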

How many times has the recipe failed in a row?

Once

How many ZIMs were produced before the failure?

Many

Which actions have you undertaken so far?

None; I have no idea what to do.

What's next?

I don't know

More details

A really late "crash" for a very unclear reason. Maybe it is a bug; if not, the error message would benefit from being clearer, IMHO.
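
To illustrate what a clearer message could look like, here is a minimal sketch. It is not warc2zim's code: `describe_main_url_failure` is a hypothetical helper, and the wording of the message is only a suggestion. It relies solely on Python's standard `http.HTTPStatus` enum to turn the numeric code into its reason phrase.

```python
from http import HTTPStatus


def describe_main_url_failure(url, status_code):
    """Build a hypothetical, more explicit message for a rejected main URL."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # Codes outside the standard registry have no known reason phrase.
        phrase = "unknown status"
    return (
        f"Main URL {url} returned HTTP {status_code} ({phrase}), "
        "which cannot be processed; stopping."
    )


# Values taken from the log in this issue.
print(describe_main_url_failure("https://thalesdocs.com/", 403))
```

Naming the URL and the reason phrase next to the numeric code makes it easier to tell at a glance that the target website refused the request.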

kelson42 added the Bug and Question labels on Nov 22, 2024
benoit74 added the Upstream label and removed the Question label on Nov 25, 2024
@benoit74
Contributor

Upstream issue: openzim/warc2zim#424
