Name starting with non-ascii uppercase letter is not treated as constant #12791

Open
kolesar-andras opened this issue Nov 28, 2022 · 2 comments · May be fixed by #12792 or #15148
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:parser

Comments

@kolesar-andras

The following code does not compile:

class ÁrvíztűrőTükörfúrógép
end

Error: expecting token 'CONST', not 'ÁrvíztűrőTükörfúrógép'

Some corner cases to narrow the problem:

  • Á fails
  • A works
  • Aőű works
  • AőűŐű also works

When a name starts with a non-ASCII uppercase letter (for example Á), it is currently not treated as a constant.
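The corner cases above are consistent with a first-character check that only accepts A-Z. Here is a minimal sketch of the difference, assuming the decision depends only on the first character; `ascii_const_start?` and `unicode_const_start?` are hypothetical helper names for illustration, not the lexer's actual code:

```crystal
# Hypothetical helpers for illustration; this is not the Crystal lexer's code.

# ASCII-only check: accepts only A-Z as the first character,
# which matches the behaviour reported in this issue.
def ascii_const_start?(name : String) : Bool
  name[0].ascii_uppercase?
end

# Unicode-aware check: accepts any uppercase letter as the first character.
def unicode_const_start?(name : String) : Bool
  name[0].uppercase?
end

# The four names from the corner cases above.
%w(Á A Aőű AőűŐű).each do |name|
  puts "#{name}: ascii=#{ascii_const_start?(name)} unicode=#{unicode_const_start?(name)}"
end
```

Only `Á` should differ between the two checks, since the other names start with an ASCII `A`.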

This behaviour also causes a bug:

Á = 1
Á = 2 # no error (Á is treated as a variable, not constant)

A = 1
A = 2 # Error: already initialized constant A

Running on Ubuntu 21.10:

$ crystal -v
Crystal 1.6.2 [879691b2e] (2022-11-03)

LLVM: 13.0.1
Default target: x86_64-unknown-linux-gnu
@kolesar-andras kolesar-andras added the kind:bug A bug in the code. Does not apply to documentation, specs, etc. label Nov 28, 2022
kolesar-andras added a commit to kolesar-andras/crystal that referenced this issue Nov 28, 2022
treat names starting with any uppercase letter as constant
not only ascii uppercase letters (A-Z)
for example Á (Latin Capital Letter A with Acute)

Fixes crystal-lang#12791
@straight-shoota
Member

straight-shoota commented Nov 28, 2022

This might have been discussed somewhere before, but I'm not sure.

An uppercase first character indicates a constant, and that is an important distinction from a lowercase variable. That means the two cases need to be easily discernible. There might be some trouble with that when following the Unicode character classes.

That being said, I think it's generally a good idea to allow using non-ASCII characters in identifiers the same way as ASCII characters.

The above example with class ÁrvíztűrőTükörfúrógép works in Ruby, btw.

@HertzDevil
Contributor

The same can also be said about titlecase characters, i.e. Chars that belong to the Unicode Lt (letter titlecase) general category: Dž Lj Nj Dz ᾈ ᾉ ᾊ ᾋ ᾌ ᾍ ᾎ ᾏ ᾘ ᾙ ᾚ ᾛ ᾜ ᾝ ᾞ ᾟ ᾨ ᾩ ᾪ ᾫ ᾬ ᾭ ᾮ ᾯ ᾼ ῌ ῼ

Ruby also treats identifiers starting with titlecase characters as constants rather than variables.
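A rough sketch of a predicate that would cover both cases, assuming `Char#uppercase?` and `Char#titlecase?` are available in the Crystal version in use; `const_start?` is a hypothetical name for illustration, not something taken from the compiler or the linked PRs:

```crystal
# Hypothetical predicate for illustration; not the compiler's implementation.
# Treats a character as a valid constant start if it is an uppercase
# letter or a titlecase letter (Unicode general category Lt).
def const_start?(char : Char) : Bool
  char.uppercase? || char.titlecase?
end

# 'A' and 'Á' are uppercase letters, '\u01C5' is the titlecase digraph
# from the list above, and 'á' is lowercase.
['A', 'Á', '\u01C5', 'á'].each do |char|
  puts "#{char}: #{const_start?(char)}"
end
```

Whether titlecase characters should count as "easily discernible" from lowercase ones is still the open design question raised earlier in the thread.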

nanobowers added a commit to nanobowers/crystal that referenced this issue Nov 2, 2024