Making Unicode supported LaTeX template as the default #7
So the proceedings template contains these lines, which are really specific to pdfLaTeX and shouldn't be used with the newer engines:
\usepackage{times}
\usepackage{latexsym}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
If I compile that on Overleaf, download the PDF, and check the fonts that are used with
So at least Overleaf uses the "Nimbus" fonts when including the "times" package. That makes me think that with XeLaTeX or LuaLaTeX, the above lines in the template should be replaced with:
\usepackage{fontspec}
\setmainfont{Nimbus Roman}
\setsansfont{Nimbus Sans}
\setmonofont{Nimbus Mono}
(EDIT: TeX Gyre Termes is probably better, since it's supposed to be the same but with more features.) I think it would make sense to check in the .sty file which TeX engine is used, and modify the font-related commands accordingly. @davidweichiang Would it make sense if I tried to prepare a pull request for something like this? |
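The engine check proposed above could look roughly like the following. This is only a sketch, not the template's actual code: it assumes the `iftex` package is available, and the TeX Gyre Heros and TeX Gyre Cursor choices for sans-serif and monospaced text are assumptions that were not discussed in this thread. Whether fontspec finds these fonts by name also depends on the local TeX installation.

```latex
% Hypothetical fragment for the .sty file: choose the font setup per engine.
\RequirePackage{iftex}
\ifPDFTeX
  % pdfLaTeX: keep the existing 8-bit font setup unchanged.
  \RequirePackage{times}
  \RequirePackage{latexsym}
  \RequirePackage[T1]{fontenc}
  \RequirePackage[utf8]{inputenc}
\else
  % XeLaTeX or LuaLaTeX: load OpenType fonts through fontspec instead.
  \RequirePackage{fontspec}
  \setmainfont{TeX Gyre Termes}  % Times-like, as suggested in the thread
  \setsansfont{TeX Gyre Heros}   % assumption: Helvetica-like counterpart
  \setmonofont{TeX Gyre Cursor}  % assumption: Courier-like counterpart
  \RequirePackage{latexsym}
\fi
```

The output of both branches would still need to be compared side by side, since the goal is for the PDF to look the same with either engine.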
This all sounds great. But we need to set it up so that it looks the same either way. |
Also related if any pub chairs are still using it: yz-joey/ACLPUB#7 |
XeLaTeX has a major disadvantage, which is that arXiv does not support it. So I don't think it can be made the default (yet). But I definitely agree with making it an option. @thammegowda In your example, the one on the right is set in Computer Modern, not Times Roman. So something is wrong with the font setup. |
The modifications I made to add some Unicode text were:
\usepackage[english]{babel} % English as the main language
\babelprovide[import]{hindi}
\babelprovide[import]{arabic}
\babelprovide[import]{kannada}
\babelfont[*devanagari]{rm}{Lohit Devanagari}
\babelfont[*arabic]{rm}{Noto Sans Arabic}
Hindi: \foreignlanguage{hindi}{मानव अधिकारों की सार्वभौम घोषणा} Arabic: \foreignlanguage{arabic}{الإعلان العالمي لحقوق الإنسان}
I didn't explicitly modify fonts for English/Latin. Is |
Well, I would say arXiv has a major disadvantage in that it doesn't support XeLaTeX/LuaLaTeX, but I can see how we should make sure to support it ;) @thammegowda The default font is Computer Modern, to get the correct font for the current *ACL template, both |
@mbollmann I agree, and I hope arXiv realizes this shortcoming and makes an update. Also, I have these two lines:
\usepackage{times}
\usepackage[T1]{fontenc}
I didn't remove these two, but is XeLaTeX still using Computer Modern? That's surprising! |
@thammegowda Ah, maybe it is overwritten by something else in your preamble then. I can't access your Overleaf project, it's restricted. Try to move the "times" import further down maybe? |
@mbollmann \usepackage[english]{babel} % English as the main language
\babelprovide[import]{hindi}
\babelprovide[import]{arabic}
\babelprovide[import]{kannada}
\babelfont[*devanagari]{rm}{Lohit Devanagari}
\babelfont[*arabic]{rm}{Noto Sans Arabic}
\usepackage{times}
\usepackage[T1]{fontenc}
\usepackage{microtype}
Here is an Overleaf link: https://www.overleaf.com/read/vbyhzmssdkkb (worked for me in private/incognito)
|
@thammegowda Not an expert with Babel, but I think as soon as you use a \babelfont{rm}{TeX Gyre Termes} before you load the other, language-specific fonts, you get something virtually indistinguishable from it. |
That works! Thanks. |
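For reference, the ordering that resolved this can be put together as a minimal test document. It is a sketch assembled from the snippets in this thread, assuming compilation with XeLaTeX or LuaLaTeX and that TeX Gyre Termes, Lohit Devanagari, and Noto Sans Arabic are installed. The `article` class is a stand-in for the real template, and right-to-left layout for Arabic may additionally need babel's `bidi` option, which is not shown here.

```latex
% Compile with XeLaTeX or LuaLaTeX (babel loads fontspec for \babelfont).
\documentclass{article}  % stand-in class, not the *ACL template
\usepackage[english]{babel}  % English as the main language
\babelprovide[import]{hindi}
\babelprovide[import]{arabic}
% Set the default roman font first, so Latin text is Times-like ...
\babelfont{rm}{TeX Gyre Termes}
% ... then add the script-specific fonts.
\babelfont[*devanagari]{rm}{Lohit Devanagari}
\babelfont[*arabic]{rm}{Noto Sans Arabic}
\begin{document}
English text should now be set in TeX Gyre Termes.

Hindi: \foreignlanguage{hindi}{मानव अधिकारों की सार्वभौम घोषणा}

Arabic: \foreignlanguage{arabic}{الإعلان العالمي لحقوق الإنسان}
\end{document}
```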
I was just looking into whether there were efforts to move away from
Further decisions probably need to be made about sans-serif and monospaced fonts, but none that can't be solved with some research. |
We have been using the pdfLaTeX compiler/engine as the default, but as we know, it isn't Unicode (non-Latin) friendly.
Though the instructions suggest using XeLaTeX, the PDF it generates differs from pdfLaTeX's in many ways.
For example (left: pdfLaTeX, right: XeLaTeX): look at the nuances in the fonts; the section headings on the right aren't as bold as pdfLaTeX's on the left. I believe the font weight isn't exactly the same.
My request/suggestion:
Move towards a Unicode-supported template as a way of encouraging NLP in non-Latin languages.
Researchers working on non-Latin languages should also be able to paste qualitative examples as text (without resorting to non-vector images), right? So, how about making the Unicode-supported template (i.e., XeLaTeX) the default?
If anyone is interested in testing the Unicode support of LaTeX templates, here is a file with UDHR titles in hundreds of languages:
udhr-title.txt
Thanks,