Localization of scikit-image website content. #7296
Thanks for reaching out @steppi and for the overview! A few thoughts / comments from my side:
And a few questions:
Thanks @lagru.
That seems helpful. It could simplify the process of generating the
If this was something already planned, it would definitely simplify adding translations. Let me know if there's anything I could do to help out with that.
Yes, this is what we're doing for https://numpy.org, except in a few cases where the translations were added together with the English language update (for example, the announcement that translations have been added). After a change is merged into main, new strings are uploaded to Crowdin for translation, and translators are notified that there's more work to do. I guess it's not ideal, but having to coordinate with translators before making updates seems like it would add a lot of overhead for maintainers.
I don't think there have been any issues for numpy.org or scipy.org. Someone has to stay on top of things to make sure the website isn't neglected when changes are being made, and although I haven't really been involved in that side of things, my impression is that it isn't too bad. Updating the website needs to be part of the process of producing a new release. At the least, an announcement with release highlights should always be added to the news section, and making any necessary changes to the installation guide should be part of the checklist when updating the website. Let me know if you have any other questions.
Okay, I am glad then that this will probably not slow down updating documentation directly. If source version and translated version are out of sync, I guess there isn't some mechanism to make this transparent on the website?
I am more worried about something else. If we split into sphinx-generated documentation for which we maintain previous versions but put some documents on a static website without a version switcher, then we lose access to older versions of those documents. E.g. if we moved our user guide there, users wouldn't have easy access to guides for previous versions. This isn't necessarily a blocker for me, but maybe a trade-off in a few cases. How does NumPy address this? It seems they solve this problem by duplicating certain parts?
Ah, I see your point. I agree there's a trade-off here. There are things which clearly need to be versioned, and these should go in the documentation. Ideally, the brochure website should contain content which is unlikely to change frequently. There is common info between the brochure website and the documentation, but instead of thinking of this as complete duplication, I think it's more that the brochure website should contain broad summaries and the documentation fleshes things out. Specific details are liable to change, but I think the website should just give a general idea, which should be more static. I think past versions of numpy.org may be interesting for historical reasons, but typically information drops off when it's no longer relevant, e.g. links to tutorials which no longer exist, installation info which is out of date, links to communication channels which no longer exist. For historical research, the Internet Archive seems sufficient: https://web.archive.org/web/20240115000000*/numpy.org. Any information which is tied to specific versions, and will remain relevant for those versions into the future, should go in the documentation though. In any case, I think this discussion about the website is separate from the translation issue. We can still set up the translation infrastructure for the current website as is, and any heavy lifting I'd need to do, I'll need to do anyway for other projects which will continue to generate their websites with sphinx and host the code on the primary repo.
I've created an FAQ with more information on the translation project here: https://scientific-python-translations.github.io/faq/. Feel free to take a look and let me know if you'd be interested in participating or if you have any other questions. We're going to start moving forward only with the more enthusiastic projects, so if you're still unsure, you can wait to see how things work out for other project websites before making a final decision.
Hi,
I'm working for Quansight Labs, helping set up infrastructure for translating content from the websites for core Scientific Python packages as part of the CZI Scientific Python Community & Communications Infrastructure. @jarrodmillman and @stefanv were authors one and two on the grant proposal, but I'll give an overview for the sake of everyone else reading this.
The goal is to translate the brochure websites of at least 8 of the Scientific Python core projects into at least 3 commonly used languages. The list of them can be found here. By "brochure website", I mean the project website that gives a general overview of the package, as distinct from technical documentation like API references, examples, and tutorials. For scikit-image this is https://scikit-image.org/.
So far translations have been completed and published for https://numpy.org. I've recently reached out to Pandas (pandas-dev/pandas#56301 (comment)) and scikit-learn (scikit-learn/scikit-learn#28105), and plan to reach out to maintainers from the remaining core projects over the next week. There's a lot of work involved in setting up translation infrastructure, finding and coordinating with qualified translators, and approving and publishing translated content. The hope is that a cross-functional team including employees from Quansight together with volunteer translators and reviewers could take on much of the burden, minimizing the effort needed from core project maintainers themselves.
For translation management, we've been using Crowdin Enterprise. Crowdin have generously offered a free supported enterprise organization we can use for managing translations across the different projects. So far the support has been excellent. Crowdin can be synced with a GitHub repo containing content, with segmented strings of content being uploaded to Crowdin for translation, and translations sent back to the repo as commits to a running PR. For numpy.org, Crowdin was synced directly to the repo https://github.com/numpy/numpy.org hosting the website content. Based on things that have come up in the discussions with Pandas and scikit-learn maintainers, it seems it would be better to have a separate repo for managing translated content.
I'm just interested in getting the ball rolling here, and will give more info as things develop over the coming weeks. Here's a summary of the steps I think would be involved:
Set up a repository for managing content that should be translated, with an automated process to get the latest content whenever changes are made. There may be multiple repos where content needs to be taken from. (For scikit-image much of it is in the docs folder from the primary repo, but I think at the least the index is in https://github.com/scikit-image/skimage-web.)
Set up Crowdin integration with this repository. Markdown files can be segmented automatically, and GNU gettext can be used for sphinx .rst files to generate .po files as described here: https://www.sphinx-doc.org/en/master/usage/advanced/intl.html.
Colleagues from Quansight and/or I will help take care of finding and vetting interested and qualified translators, and there will hopefully be large overlap between the translators for different projects.
Publish translations on the core project website, with a dropdown selector to choose between languages. How this is done will depend on the static site generator used. For sites using the Scientific Python Hugo theme (thanks @jarrodmillman and @stefanv) like numpy.org, setting this up is almost automatic. I've found that scikit-image is using the pydata-sphinx theme. There, I think the version selector could be used, or code could be copied from it to make a separate language selector.
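To make "segmented strings" a bit more concrete, here is a minimal Python sketch of the general idea: split a Markdown source into paragraph-level strings and write them out as untranslated gettext-style .po entries. This is only an illustration under simplifying assumptions, not how Crowdin or sphinx actually segment content. The function names (`segment_markdown`, `escape_po`, `to_po`) and the sample text are made up for this example.

```python
import re


def segment_markdown(text: str) -> list[str]:
    """Split Markdown into paragraph-level segments (hypothetical helper).

    Blank lines separate segments, and line breaks inside a paragraph are
    collapsed to single spaces. Real tools segment far more carefully.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return [" ".join(p.split()) for p in paragraphs if p.strip()]


def escape_po(s: str) -> str:
    """Escape backslashes and double quotes for a .po string literal."""
    return s.replace("\\", "\\\\").replace('"', '\\"')


def to_po(segments: list[str]) -> str:
    """Render segments as untranslated .po entries (empty msgstr)."""
    # A .po catalog should not repeat a msgid, so drop duplicates
    # while keeping the original order.
    unique = list(dict.fromkeys(segments))
    entries = [f'msgid "{escape_po(s)}"\nmsgstr ""' for s in unique]
    return "\n\n".join(entries) + "\n"


# Sample content standing in for a page of a brochure website.
source = """# Example project

This is the first paragraph,
wrapped over two lines.

This is the second paragraph.
"""

print(to_po(segment_markdown(source)))
```

Translators would then fill in each `msgstr`, and the site generator would pick the translated string by its `msgid`. The point of the sketch is only that translation happens per string rather than per file, which is why new or changed strings can be uploaded after each merge without retranslating whole pages.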
Please let me know if you have any questions. I'd especially like to hear from those who are much more knowledgeable than me about much of this stuff, and who would probably like to hear more specifics.