Unicode Problems? #267

Open
codinuum opened this issue Feb 8, 2022 · 3 comments

codinuum commented Feb 8, 2022

Gumtree (2656040) failed to parse the following file:
DeleteMessage.java
It seems that malformed input in the above source caused the failure.

Error while running client 'parse'.
java.nio.charset.MalformedInputException: Input length = 1
at java.base/java.nio.charset.CoderResult.throwException(CoderResult.java:274)
at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
at java.base/java.io.InputStreamReader.read(InputStreamReader.java:181)
at java.base/java.io.BufferedReader.read1(BufferedReader.java:210)
at java.base/java.io.BufferedReader.read(BufferedReader.java:287)
at java.base/java.io.BufferedReader.fill(BufferedReader.java:161)
at java.base/java.io.BufferedReader.read1(BufferedReader.java:212)
at java.base/java.io.BufferedReader.read(BufferedReader.java:287)
at java.base/java.io.Reader.read(Reader.java:229)
at com.github.gumtreediff.gen.jdt.AbstractJdtTreeGenerator.readerToCharArray(AbstractJdtTreeGenerator.java:44)
at com.github.gumtreediff.gen.jdt.AbstractJdtTreeGenerator.generate(AbstractJdtTreeGenerator.java:64)
at com.github.gumtreediff.gen.TreeGenerator.generateTree(TreeGenerator.java:41)
at com.github.gumtreediff.gen.TreeGenerator$ReaderConfigurator.reader(TreeGenerator.java:119)
at com.github.gumtreediff.gen.TreeGenerator$ReaderConfigurator.file(TreeGenerator.java:90)
at com.github.gumtreediff.gen.TreeGenerator$ReaderConfigurator.file(TreeGenerator.java:100)
at com.github.gumtreediff.gen.TreeGenerators.getTree(TreeGenerators.java:58)
at com.github.gumtreediff.gen.TreeGenerators.getTree(TreeGenerators.java:70)
at com.github.gumtreediff.client.ParseClient.getTreeContext(ParseClient.java:63)
at com.github.gumtreediff.client.ParseClient.run(ParseClient.java:54)
at com.github.gumtreediff.client.Run.startClient(Run.java:94)
at com.github.gumtreediff.client.Run.main(Run.java:128)
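
For reference, the same exception type can be reproduced outside Gumtree by reading a non-UTF-8 file through a strict UTF-8 reader. This is only a minimal sketch, not Gumtree's code: the class name, the helper `failsAsUtf8`, and the sample file content are hypothetical, and it assumes the offending file contains at least one ISO-8859-1 byte such as 0xE9.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class StrictDecodeDemo {
    // Hypothetical helper: returns true if reading the file as UTF-8
    // fails with MalformedInputException.
    static boolean failsAsUtf8(Path file) throws IOException {
        // Files.newBufferedReader reports malformed bytes instead of
        // silently replacing them.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (reader.read() != -1) {
                // Drain the reader; decoding happens as we read.
            }
            return false;
        } catch (MalformedInputException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample: a Java file saved as ISO-8859-1 with an accented comment.
        Path file = Files.createTempFile("Sample", ".java");
        Files.write(file, "// caf\u00e9\nclass Sample {}\n".getBytes(StandardCharsets.ISO_8859_1));
        System.out.println("fails as UTF-8: " + failsAsUtf8(file));
        Files.delete(file);
    }
}
```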

codinuum commented Feb 8, 2022

A possible workaround is attached:
gumtree-unicode-fix.patch.txt

jrfaller commented Feb 8, 2022

Hi @codinuum! Thanks for reporting this, and for the tentative patch.

Just to be sure: when I run file on it, I get

file DeleteMessage.java
DeleteMessage.java: Java source, ISO-8859 text

Did you try parsing the source with this charset instead of UTF-8, and does it work? In this case it might be better to have an option for the charset, no?
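
To make the charset-option idea concrete, here is a rough sketch of reading a source file with a caller-supplied charset. The class name and `readSource` are hypothetical and not part of Gumtree's API; the sketch only assumes that the caller knows the file's encoding (for example ISO-8859-1, as reported by file) and falls back to UTF-8 otherwise.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CharsetOptionSketch {
    // Hypothetical helper: reads a whole source file using the given charset.
    static char[] readSource(Path file, Charset charset) throws IOException {
        try (Reader reader = Files.newBufferedReader(file, charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString().toCharArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // First argument: file to read. Optional second argument: charset name.
        Path file = Paths.get(args[0]);
        Charset charset = args.length > 1 ? Charset.forName(args[1]) : StandardCharsets.UTF_8;
        System.out.println(readSource(file, charset).length + " chars read as " + charset);
    }
}
```

Once compiled, it could be run as `java CharsetOptionSketch DeleteMessage.java ISO-8859-1`. Every byte is a valid ISO-8859-1 character, so decoding with that charset cannot raise MalformedInputException, although the text will only be correct if the file really uses that encoding.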

codinuum commented Feb 8, 2022

I tried only UTF-8 for some batch jobs.

jrfaller self-assigned this Nov 27, 2023