Has trouble with international (UTF) source #1221

Open
simonjwright opened this issue Nov 16, 2024 · 10 comments

@simonjwright
Contributor

simonjwright commented Nov 16, 2024

Running this UTF-8 file through ALS (Alire release 25.0.0) under Emacs ada-ts-mode:

with ada.text_io; use ada.text_io;
procedure utf is
   à_variable : integer := 42;
begin
   put_line ("À_variable: " & à_variable'image);
end utf;

I get this message:

Error running timer: (error "utf.adb:1:1: Error formatting node (CompilationUnit). Keeping the initial input selection unchanged
pp-formatting.adb:1264")

In VS Code it also fails; the log file says:

[ALS.MAIN] in GNATformat Format
[ALS.MAIN] raised CONSTRAINT_ERROR : erroneous memory access
_ALS.MAIN_ 0x0000000108647D68 0x0000000108647DA0 0x00000001086EA6AC 0x00000001086EB5F4 0x00000001079292C5 0x0000000107936C5C 0x000000010788C0A0 0x000000010783C29C 0x00000001077DB52C 0x0000000107803844 0x0000000107802DE8 0x0000000107803058 0x0000000107803C0C 0x0000000107803BB8 0x0000000107803240 0x0000000107803854 0x0000000107803884 0x0000000107802DE8 0x0000000107804F8C 0x00000001077996E8 0x00000001063BD344 0x00000001063BD620 0x00000001062AFBC8 0x00000001062962DC 0x000000010634930C 0x000000010633D33C 0x000000010532074C 0x000000010532D25C 0x000000010630A53C 0x0000000108631980 0x00000001951972E0

The stack trace occurs twice. I may be able to decode it, but it’ll be much easier if you do it, and will make more sense! (see this gnatformat issue).

simonjwright changed the title from "Has trouble with UTF source" to "Has trouble with international (UTF) source" on Nov 16, 2024
@joaopsazevedo
Contributor

Hello @simonjwright.

I tried to reproduce the issue on VS Code with the following project and VS Code configuration:

-- utf.gpr

project Utf is
   for Source_Dirs use (".");
   for Main use ("utf.adb");
end Utf;

-- utf.adb

with ada.text_io; use ada.text_io;
procedure utf is
   à_variable : integer := 42;
begin
   put_line ("À_variable: " & à_variable'image);
end utf;

-- .vscode/settings.json

{
   "ada.projectFile": "utf.gpr",
   "ada.defaultCharset": "utf8"
}

And it worked. So I assume the problem is that Emacs ada-ts-mode is not sending the "ada.defaultCharset": "utf8" setting in the initialize request or in didChangeConfiguration notifications. I'm not familiar with Emacs ada-ts-mode, but the ada.defaultCharset setting can be set in exactly the same way as ada.projectFile. Hopefully this helps, but let us know if it doesn't.
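
For anyone checking what their client actually puts on the wire, here is a minimal Python sketch of a workspace/didChangeConfiguration notification carrying those two settings, framed with the Content-Length header of the LSP base protocol. The helper name frame_lsp_message is made up for illustration; this is not how VS Code or ada-ts-mode build the message internally, only the shape a client is expected to produce.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Serialize a JSON-RPC payload and prepend the LSP Content-Length header.

    frame_lsp_message is a hypothetical helper name used only in this sketch.
    """
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # Content-Length counts bytes of the encoded body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# The settings mirror the .vscode/settings.json shown above,
# nested under the "ada" key.
notification = {
    "jsonrpc": "2.0",
    "method": "workspace/didChangeConfiguration",
    "params": {
        "settings": {
            "ada": {"projectFile": "utf.gpr", "defaultCharset": "utf8"}
        }
    },
}

message = frame_lsp_message(notification)
print(message.decode("utf-8"))
```

Comparing this against the [ALS.IN] lines in the ALS log should show whether defaultCharset is reaching the server.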

@simonjwright
Contributor Author

This is from the log, is that OK?

[ALS.IN] {"jsonrpc":"2.0","method":"initialized","params":{}}
[ALS.MAIN] 'initialized' Params : (NULL RECORD)
[ALS.IN] {"jsonrpc":"2.0","method":"workspace/didChangeConfiguration","params":{"settings":{"ada":{"defaultCharset":"utf8","enableDiagnostics":true}}}}
[ALS.MAIN] 'workspace/didChangeConfiguration' Params : 
_ALS.MAIN_ (SETTINGS => 
_ALS.MAIN_  [
_ALS.MAIN_   (KIND => START_OBJECT), 
_ALS.MAIN_   (KIND => KEY_NAME,
_ALS.MAIN_    KEY_NAME => "ada"), 
_ALS.MAIN_   (KIND => START_OBJECT), 
_ALS.MAIN_   (KIND => KEY_NAME,
_ALS.MAIN_    KEY_NAME => "defaultCharset"), 
_ALS.MAIN_   (KIND => STRING_VALUE,
_ALS.MAIN_    STRING_VALUE => "utf8"), 
_ALS.MAIN_   (KIND => KEY_NAME,
_ALS.MAIN_    KEY_NAME => "enableDiagnostics"), 
_ALS.MAIN_   (KIND => BOOLEAN_VALUE,
_ALS.MAIN_    BOOLEAN_VALUE => TRUE), 
_ALS.MAIN_   (KIND => END_OBJECT), 
_ALS.MAIN_   (KIND => END_OBJECT)])

Previously we were sending "UTF-8". This is actually the default with Emacs lsp-mode, which is what I’ve set ada-ts-mode to use to communicate with ALS; the alternative is eglot.

@brownts

brownts commented Nov 20, 2024

While I couldn't duplicate the same error as above, I stumbled across another error during completion using these same files which seems to be UTF-8 related. I've observed this with ALS 25.0.20240915 and 26.0.20241117 and also both with Emacs and VSCode, although it seems very repeatable with Emacs.

Exception: PROGRAM_ERROR (vss-implementation-utf8_normalization.adb:2497 explicit raise)

Here is the ALS log for this exception:

[ALS.IN] {"jsonrpc":"2.0","method":"initialize","params":{"processId":1138618,"rootPath":"/home/troy/ada/utf","clientInfo":{"name":"emacs","version":"GNU Emacs 30.0.90 (build 2, x86_64-pc-linux-gnu, GTK+ Version 3.24.33, cairo version 1.16.0)\n of 2024-09-07"},"rootUri":"file:///home/troy/ada/utf","capabilities":{"general":{"positionEncodings":["utf-32","utf-16"]},"workspace":{"workspaceEdit":{"documentChanges":true,"resourceOperations":["create","rename","delete"]},"applyEdit":true,"symbol":{"symbolKind":{"valueSet":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]}},"executeCommand":{"dynamicRegistration":false},"didChangeWatchedFiles":{"dynamicRegistration":true},"workspaceFolders":true,"configuration":true,"codeLens":{"refreshSupport":true},"fileOperations":{"didCreate":false,"willCreate":false,"didRename":true,"willRename":true,"didDelete":false,"willDelete":false}},"textDocument":{"declaration":{"dynamicRegistration":true,"linkSupport":true},"definition":{"dynamicRegistration":true,"linkSupport":true},"references":{"dynamicRegistration":true},"implementation":{"dynamicRegistration":true,"linkSupport":true},"typeDefinition":{"dynamicRegistration":true,"linkSupport":true},"synchronization":{"willSave":true,"didSave":true,"willSaveWaitUntil":true},"documentSymbol":{"symbolKind":{"valueSet":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]},"hierarchicalDocumentSymbolSupport":true},"formatting":{"dynamicRegistration":true},"rangeFormatting":{"dynamicRegistration":true},"onTypeFormatting":{"dynamicRegistration":true},"rename":{"dynamicRegistration":true,"prepareSupport":true},"codeAction":{"dynamicRegistration":true,"isPreferredSupport":true,"codeActionLiteralSupport":{"codeActionKind":{"valueSet":["","quickfix","refactor","refactor.extract","refactor.inline","refactor.rewrite","source","source.organizeImports"]}},"resolveSupport":{"properties":["edit","command"]},"dataSupport":true},"completion":{"completionItem":{"snippetSupport":true,"documentationFormat":["markdown","plaintext"],"resolveAdditionalTextEditsSupport":true,"insertReplaceSupport":true,"deprecatedSupport":true,"resolveSupport":{"properties":["documentation","detail","additionalTextEdits","command"]},"insertTextModeSupport":{"valueSet":[1,2]}},"contextSupport":true,"dynamicRegistration":true},"signatureHelp":{"signatureInformation":{"parameterInformation":{"labelOffsetSupport":true}},"dynamicRegistration":true},"documentLink":{"dynamicRegistration":true,"tooltipSupport":true},"hover":{"contentFormat":["markdown","plaintext"],"dynamicRegistration":true},"foldingRange":{"dynamicRegistration":true},"selectionRange":{"dynamicRegistration":true},"callHierarchy":{"dynamicRegistration":false},"typeHierarchy":{"dynamicRegistration":true},"publishDiagnostics":{"relatedInformation":true,"tagSupport":{"valueSet":[1,2]},"versionSupport":true},"diagnostic":{"dynamicRegistration":false,"relatedDocumentSupport":false},"linkedEditingRange":{"dynamicRegistration":true}},"window":{"workDoneProgress":true,"showDocument":{"support":true}}},"initializationOptions":null,"workDoneToken":"1"},"id":1}
[ALS.IN] {"jsonrpc":"2.0","method":"workspace/didChangeConfiguration","params":{"settings":{"ada":{"projectFile":"utf.gpr","defaultCharset":"utf8","enableDiagnostics":true}}}}
[ALS.OUT] {"jsonrpc":"2.0","id":1,"result":{"capabilities":{"textDocumentSync":2,"completionProvider":{"triggerCharacters":[".",",","'","("],"resolveProvider":true},"hoverProvider":true,"signatureHelpProvider":{"triggerCharacters":[",","("],"retriggerCharacters":["\b"]},"declarationProvider":true,"definitionProvider":true,"typeDefinitionProvider":true,"implementationProvider":true,"referencesProvider":true,"documentHighlightProvider":true,"documentSymbolProvider":true,"codeActionProvider":{"workDoneProgress":false,"codeActionKinds":["quickfix","refactor.rewrite"],"resolveProvider":false},"workspaceSymbolProvider":true,"documentFormattingProvider":true,"documentRangeFormattingProvider":true,"documentOnTypeFormattingProvider":{"firstTriggerCharacter":"\n"},"renameProvider":{"prepareProvider":true},"foldingRangeProvider":true,"executeCommandProvider":{"commands":["als-other-file","als-suspend-execution","als-reload-project","als-show-dependencies","als-source-dirs","als-executables","als-mains","als-project-file","als-object-dir","als-named-parameters","als-auto-import","als-suppress-separate","als-refactor-extract-subprogram","als-refactor-introduce-parameter","als-refactor-pull_up_declaration","als-refactor-replace-type","als-refactor-sort_dependencies","als-refactor-add-parameters","als-refactor-remove-parameters","als-refactor-move-parameter","als-refactor-change-parameter-mode","als-refactor-change_parameters_type","als-refactor-change_parameters_default_value"]},"callHierarchyProvider":true,"semanticTokensProvider":{"legend":{"tokenTypes":[],"tokenModifiers":[]},"range":true,"full":true},"workspace":{},"alsReferenceKinds":["reference","access","write","call","dispatching call","parent","child","overriding"]}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"window/logMessage","params":{"type":4,"message":"Log directory is /home/troy/.als/als.2024-11-20T174619.1138815.log"}}
[ALS.IN] {"jsonrpc":"2.0","method":"initialized","params":{}}
[ALS.IN] {"jsonrpc":"2.0","method":"workspace/didChangeConfiguration","params":{"settings":{"ada":{"projectFile":"utf.gpr","defaultCharset":"utf8","enableDiagnostics":true}}}}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/didOpen","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb","languageId":"ada","version":0,"text":"with ada.text_io; use ada.text_io;\nprocedure utf is\n   à_variable : integer := 42;\nbegin\n   put_line (\"À_variable: \" & à_variable'image);\nend utf;\n"}}}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/codeAction","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"range":{"start":{"line":1,"character":3},"end":{"line":1,"character":3}},"context":{"diagnostics":[]}},"id":2}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/documentHighlight","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":1,"character":3}},"id":3}
[ALS.OUT] {"jsonrpc":"2.0","id":1,"method":"client/registerCapability","params":{"registrations":[{"id":"1","method":"workspace/didChangeWatchedFiles","registerOptions":{"watchers":[{"globPattern":"/home/troy/ada/utf/*","kind":7}]}}]}}
[ALS.OUT] {"jsonrpc":"2.0","id":3,"method":"window/workDoneProgress/create","params":{"token":"ada_ls-1138815-indexing-1"}}
[ALS.IN] {"jsonrpc":"2.0","method":"$/cancelRequest","params":{"id":2}}
[ALS.IN] {"jsonrpc":"2.0","id":3,"result":null}
[ALS.OUT] {"jsonrpc":"2.0","method":"textDocument/publishDiagnostics","params":{"uri":"file:///home/troy/ada/utf/utf.adb","diagnostics":[]}}
[ALS.OUT] {"jsonrpc":"2.0","id":2,"error":{"code":-32800,"message":"Request was canceled"}}
[ALS.OUT] {"jsonrpc":"2.0","id":3,"result":null}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-1","value":{"kind":"begin","title":"Indexing","percentage":0}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-1","value":{"kind":"report","message":"144/1564 files","percentage":9}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-1","value":{"kind":"report","message":"334/1564 files","percentage":21}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-1","value":{"kind":"report","message":"747/1564 files","percentage":47}}}
[ALS.IN] {"jsonrpc":"2.0","method":"workspace/didChangeConfiguration","params":{"settings":{"ada":{"projectFile":"utf.gpr","defaultCharset":"UTF-8","enableDiagnostics":true}}}}
[ALS.IN] {"jsonrpc":"2.0","method":"workspace/didChangeConfiguration","params":{"settings":{"ada":{"projectFile":"utf.gpr","defaultCharset":"utf8","enableDiagnostics":true}}}}
[ALS.IN] {"jsonrpc":"2.0","id":1,"result":null}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-1","value":{"kind":"report","message":"1019/1564 files","percentage":65}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"textDocument/publishDiagnostics","params":{"uri":"file:///home/troy/ada/utf/utf.adb","diagnostics":[]}}
[ALS.OUT] {"jsonrpc":"2.0","id":5,"method":"client/unregisterCapability","params":{"unregisterations":[{"id":"1","method":"workspace/didChangeWatchedFiles"}]}}
[ALS.OUT] {"jsonrpc":"2.0","id":4,"method":"client/registerCapability","params":{"registrations":[{"id":"4","method":"workspace/didChangeWatchedFiles","registerOptions":{"watchers":[{"globPattern":"/home/troy/ada/utf/*","kind":7}]}}]}}
[ALS.OUT] {"jsonrpc":"2.0","id":6,"method":"window/workDoneProgress/create","params":{"token":"ada_ls-1138815-indexing-2"}}
[ALS.IN] {"jsonrpc":"2.0","id":5,"result":null}
[ALS.OUT] {"jsonrpc":"2.0","method":"textDocument/publishDiagnostics","params":{"uri":"file:///home/troy/ada/utf/utf.adb","diagnostics":[]}}
[ALS.OUT] {"jsonrpc":"2.0","id":8,"method":"client/unregisterCapability","params":{"unregisterations":[{"id":"4","method":"workspace/didChangeWatchedFiles"}]}}
[ALS.OUT] {"jsonrpc":"2.0","id":7,"method":"client/registerCapability","params":{"registrations":[{"id":"7","method":"workspace/didChangeWatchedFiles","registerOptions":{"watchers":[{"globPattern":"/home/troy/ada/utf/*","kind":7}]}}]}}
[ALS.OUT] {"jsonrpc":"2.0","id":9,"method":"window/workDoneProgress/create","params":{"token":"ada_ls-1138815-indexing-3"}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-1","value":{"kind":"end"}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-2","value":{"kind":"begin","title":"Indexing","percentage":0}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-2","value":{"kind":"end"}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-3","value":{"kind":"begin","title":"Indexing","percentage":0}}}
[ALS.IN] {"jsonrpc":"2.0","id":8,"result":null}
[ALS.IN] {"jsonrpc":"2.0","id":4,"result":null}
[ALS.IN] {"jsonrpc":"2.0","id":6,"result":null}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-3","value":{"kind":"report","message":"143/1564 files","percentage":9}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-3","value":{"kind":"report","message":"334/1564 files","percentage":21}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-3","value":{"kind":"report","message":"783/1564 files","percentage":50}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-3","value":{"kind":"report","message":"1057/1564 files","percentage":67}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-3","value":{"kind":"report","message":"1388/1564 files","percentage":88}}}
[ALS.OUT] {"jsonrpc":"2.0","method":"$/progress","params":{"token":"ada_ls-1138815-indexing-3","value":{"kind":"end"}}}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/hover","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":1,"character":3}},"id":4}
[ALS.OUT] {"jsonrpc":"2.0","id":4,"result":null}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/codeAction","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"range":{"start":{"line":1,"character":3},"end":{"line":1,"character":3}},"context":{"diagnostics":[]}},"id":5}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/documentHighlight","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":1,"character":3}},"id":6}
[ALS.OUT] {"jsonrpc":"2.0","id":5,"result":null}
[ALS.OUT] {"jsonrpc":"2.0","id":6,"result":null}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/rangeFormatting","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"options":{"tabSize":3,"insertSpaces":true,"trimTrailingWhitespace":true,"insertFinalNewline":true,"trimFinalNewlines":true},"range":{"start":{"line":1,"character":0},"end":{"line":1,"character":16}}},"id":7}
[ALS.OUT] {"jsonrpc":"2.0","id":7,"result":[{"range":{"start":{"line":1,"character":0},"end":{"line":5,"character":8}},"newText":"procedure utf is\n   à_variable : Integer := 42;\nbegin\n   Put_Line (\"À_variable: \" & à_variable'Image);\nend utf;"}]}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/didChange","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb","version":1},"contentChanges":[{"range":{"start":{"line":1,"character":0},"end":{"line":5,"character":8}},"rangeLength":111,"text":"procedure utf is\n   à_variable : Integer := 42;\nbegin\n   Put_Line (\"À_variable: \" & à_variable'Image);\nend utf;"}]}}
[ALS.OUT] {"jsonrpc":"2.0","method":"textDocument/publishDiagnostics","params":{"uri":"file:///home/troy/ada/utf/utf.adb","diagnostics":[]}}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/hover","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":1,"character":3}},"id":8}
[ALS.OUT] {"jsonrpc":"2.0","id":8,"result":null}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/codeAction","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"range":{"start":{"line":1,"character":3},"end":{"line":1,"character":3}},"context":{"diagnostics":[]}},"id":9}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/documentHighlight","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":1,"character":3}},"id":10}
[ALS.OUT] {"jsonrpc":"2.0","id":9,"result":null}
[ALS.OUT] {"jsonrpc":"2.0","id":10,"result":null}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/rangeFormatting","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"options":{"tabSize":3,"insertSpaces":true,"trimTrailingWhitespace":true,"insertFinalNewline":true,"trimFinalNewlines":true},"range":{"start":{"line":1,"character":0},"end":{"line":1,"character":16}}},"id":11}
[ALS.OUT] {"jsonrpc":"2.0","id":11,"result":[{"range":{"start":{"line":1,"character":0},"end":{"line":5,"character":8}},"newText":"procedure utf is\n   à_variable : Integer := 42;\nbegin\n   Put_Line (\"À_variable: \" & à_variable'Image);\nend utf;"}]}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/didChange","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb","version":2},"contentChanges":[{"range":{"start":{"line":1,"character":0},"end":{"line":5,"character":8}},"rangeLength":111,"text":"procedure utf is\n   à_variable : Integer := 42;\nbegin\n   Put_Line (\"À_variable: \" & à_variable'Image);\nend utf;"}]}}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/completion","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":1,"character":3},"context":{"triggerKind":1}},"id":12}
[ALS.OUT] {"jsonrpc":"2.0","method":"textDocument/publishDiagnostics","params":{"uri":"file:///home/troy/ada/utf/utf.adb","diagnostics":[]}}
[ALS.OUT] {"jsonrpc":"2.0","id":12,"error":{"code":-32603,"message":"Exception: PROGRAM_ERROR (vss-implementation-utf8_normalization.adb:2497 explicit raise)"}}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/hover","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":1,"character":3}},"id":13}
[ALS.OUT] {"jsonrpc":"2.0","id":13,"result":null}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/codeAction","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"range":{"start":{"line":1,"character":3},"end":{"line":1,"character":3}},"context":{"diagnostics":[]}},"id":14}
[ALS.IN] {"jsonrpc":"2.0","method":"textDocument/documentHighlight","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":1,"character":3}},"id":15}
[ALS.OUT] {"jsonrpc":"2.0","id":14,"result":null}
[ALS.OUT] {"jsonrpc":"2.0","id":15,"result":null}
[ALS.IN] {"jsonrpc":"2.0","method":"shutdown","params":null,"id":16}
[ALS.OUT] {"jsonrpc":"2.0","id":16,"result":null}
[ALS.IN] {"jsonrpc":"2.0","method":"exit","params":null}

@simonjwright
Contributor Author

> While I couldn't duplicate the same error as above, I stumbled across another error during completion using these same files which seems to be UTF-8 related. I've observed this with ALS 25.0.20240915 and 26.0.20241117 and also both with Emacs and VSCode, although it seems very repeatable with Emacs.
>
> Exception: PROGRAM_ERROR (vss-implementation-utf8_normalization.adb:2497 explicit raise)

I just ran the VSS (25.0.0) test suite and got exactly this error.
macOS Sequoia 15.1.1, CLT 15.3, GCC 14.2.0-aarch64.

.objs/tests/test_transformer data/ucd testsuite/text/w3c-i18n-tests-casing/*.txt
Loading NormalizationTest.txt...
.objs/tests/test_transformer:
  Normalization Transformation:
    UCD NormalizationTest.txt: ERRORED	vss-implementation-utf8_normalization.adb:2497 explicit raise
raised PROGRAM_ERROR : vss-implementation-utf8_normalization.adb:2497 explicit raise
Load address: 0x100674000
Call stack traceback locations:
0x100731dcc 0x1007321a4 0x10073210c 0x100732420 0x1006f5afc 0x1006ed928 0x1006f1b44 0x1006f3470 0x1006f2c70 0x10071e904 0x10067853c 0x1006be5f8 0x100677d10 0x1006beb14 0x100675380 0x10067b5f4

Sorry about the undecorated traceback, see my comment in GCC PR 117538.

This is what atos makes of it:

$ atos -o .objs/tests/test_transformer -l 0x100674000 0x100731dcc 0x1007321a4 0x10073210c 0x100732420 0x1006f5afc 0x1006ed928 0x1006f1b44 0x1006f3470 0x1006f2c70 0x10071e904 0x10067853c 0x1006be5f8 0x100677d10 0x1006beb14 0x100675380 0x10067b5f4
ada__exceptions__complete_and_propagate_occurrence (in test_transformer) (a-except.adb:1128)
ada__exceptions__raise_with_location_and_msg (in test_transformer) (a-except.adb:1339)
__gnat_raise_program_error_msg (in test_transformer) (a-except.adb:1295)
__gnat_rcheck_PE_Explicit_Raise (in test_transformer) (a-except.adb:1533)
vss__implementation__utf8_normalization__unchecked_replace (in test_transformer) (vss-implementation-utf8_normalization.adb:2497)
vss__implementation__utf8_normalization__decompose_and_compose__apply_canonical_composition.6 (in test_transformer) (vss-implementation-utf8_normalization.adb:673)
vss__implementation__utf8_normalization__decompose_and_compose (in test_transformer) (vss-implementation-utf8_normalization.adb:1892)
vss__implementation__utf8_normalization__normalize__3 (in test_transformer) (vss-implementation-utf8_normalization.adb:2092)
vss__implementation__utf8_normalization__normalize (in test_transformer) (vss-implementation-utf8_normalization.adb:1991)
vss__transformers__normalization__transform__3 (in test_transformer) (vss-transformers-normalization.adb:25)
test_transformer__test_ucd_normalizationtest.0 (in test_transformer) (test_transformer-test_ucd_normalizationtest.adb:81)
test_support__run_testcase (in test_transformer) (test_support.adb:293)
test_transformer__test_normalization.13 (in test_transformer) (test_transformer.adb:75)
test_support__run_testsuite (in test_transformer) (test_support.adb:335)
_ada_test_transformer (in test_transformer) (test_transformer.adb:87)
main (in test_transformer) (b__test_transformer.adb:585)

The reason we’re getting PROGRAM_ERROR is that the entire body of that subprogram, except the raise, was commented out in commit 2be9f513 on 2024-04-02.

@godunko
Contributor

godunko commented Nov 21, 2024

Thank you for the detailed report. This is an unimplemented feature in VSS; we will implement it in a future version of VSS.

Note, most probably, your editor doesn't use Normalization Form C.

@simonjwright
Contributor Author

> Note, most probably, your editor doesn't use Normalization Form C.

In my test file, lower-case a-grave (à) is encoded as c3 a0 and upper-case A-grave (À) as c3 80, which I think is NFC?

After converting with iconv(1) to "utf8-mac", which I believe to be NFD (available on Macs because, at least for the HFS filesystem, that was how files were stored on disk; it could have been NFKD?), à is encoded as 61 cc 80 and À as 41 cc 80.

gnatformat failed in the same way for either source.
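
The byte sequences quoted above can be cross-checked with a few lines of Python. This is only a sketch using the standard unicodedata module; it is independent of ALS, VSS and gnatformat, and the helper name utf8_hex is made up for illustration.

```python
import unicodedata


def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return text.encode("utf-8").hex(" ")


# U+00E0 is lower-case a-grave, U+00C0 is upper-case A-grave.
for ch in ("\u00e0", "\u00c0"):
    nfc = unicodedata.normalize("NFC", ch)  # precomposed form
    nfd = unicodedata.normalize("NFD", ch)  # base letter + U+0300 combining grave
    print(f"U+{ord(ch):04X}: NFC = {utf8_hex(nfc)}, NFD = {utf8_hex(nfd)}")

# Prints:
# U+00E0: NFC = c3 a0, NFD = 61 cc 80
# U+00C0: NFC = c3 80, NFD = 41 cc 80
```

So c3 a0 / c3 80 is indeed the NFC encoding, and 61 cc 80 / 41 cc 80 is the decomposed one.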

@brownts

brownts commented Nov 21, 2024

> Note, most probably, your editor doesn't use Normalization Form C.

@godunko, does VSCode not support NFC? As I mentioned above, I was able to duplicate this on VSCode too. The following is a snippet of the VSCode ALS log (see attached for the full log):

[ALS.IN] {"jsonrpc":"2.0","id":65,"method":"textDocument/completion","params":{"textDocument":{"uri":"file:///home/troy/ada/utf/utf.adb"},"position":{"line":2,"character":4},"context":{"triggerKind":1}}} (14:55:18.531)
[ALS.MAIN] Getting completions, Pos = ( 3,  5) Node = <Id "g\xe0_variable" utf.adb:3:4-3:15> (14:55:18.532)
[ALS.MAIN] On_Server_Request (14:55:18.607)
[ALS.MAIN] raised PROGRAM_ERROR : vss-implementation-utf8_normalization.adb:2497 explicit raise
_ALS.MAIN_ Load address: 0x560caa08f000
_ALS.MAIN_ [/home/troy/.vscode/extensions/adacore.ada-26.0.202411173-linux-x64/x64/linux/ada_language_server]
_ALS.MAIN_ 0x560cad7bc943 Vss.Implementation.Utf8_Normalization.Unchecked_Replace at vss-implementation-utf8_normalization.adb:2497
_ALS.MAIN_ 0x560cad7b4c63 Vss.Implementation.Utf8_Normalization.Decompose_And_Compose.Apply_Canonical_Composition at vss-implementation-utf8_normalization.adb:673
_ALS.MAIN_ 0x560cad7b92a2 Vss.Implementation.Utf8_Normalization.Decompose_And_Compose at vss-implementation-utf8_normalization.adb:1892
_ALS.MAIN_ 0x560cad7baa02 Vss.Implementation.Utf8_Normalization.Normalize at vss-implementation-utf8_normalization.adb:2092
_ALS.MAIN_ 0x560cad7ba2b1 Vss.Implementation.Utf8_Normalization.Normalize at vss-implementation-utf8_normalization.adb:1991
_ALS.MAIN_ 0x560cad7adff6 Vss.Transformers.Caseless.Transform at vss-transformers-caseless.adb:36
_ALS.MAIN_ 0x560cad7a5d5a Vss.Strings.Starts_With at vss-strings.adb:1130
_ALS.MAIN_ 0x560cac205c98 Lsp.Ada_Completions.Keywords.Propose_Completion at lsp-ada_completions-keywords.adb:76
_ALS.MAIN_ 0x560cac2600f2 Lsp.Ada_Documents.Get_Completions_At at lsp-ada_documents.adb:940
_ALS.MAIN_ 0x560cac2e56cd Lsp.Ada_Handlers.On_Completion_Request at lsp-ada_handlers.adb:1470
_ALS.MAIN_ 0x560cac2d530e Lsp.Ada_Handlers.On_Server_Request at lsp-ada_handlers.adb:3129
_ALS.MAIN_ 0x560cab521c64 Lsp.Default_Message_Handlers.Execute at lsp-default_message_handlers.adb:85
_ALS.MAIN_ 0x560cab52c6df Lsp.Job_Schedulers.Process_Job at lsp-job_schedulers.adb:191
_ALS.MAIN_ 0x560cac2afbd2 Lsp.Servers.Processing_Task_TypeT at lsp-servers.adb:848
_ALS.MAIN_ 0x560cae50b74c system__tasking__stages__task_wrapper at ???
_ALS.MAIN_ [/lib/x86_64-linux-gnu/libc.so.6]
_ALS.MAIN_ 0x7fda2f44bac1
_ALS.MAIN_ 0x7fda2f4dd84e
_ALS.MAIN_ 0xfffffffffffffffe (14:55:18.675)

ada_ls_log.2024-11-20T144922.log

@godunko
Contributor

godunko commented Nov 26, 2024

@simonjwright The Ada standard "requires" the use of NFC-normalized text; the use of other normalization forms is implementation-defined. The normalization algorithm is quite complex, and therefore "slow". VSS runs it only when it detects that the text is not normalized, which is why this exception is not raised in the test suite.

You might need to check your Emacs settings so that text is sent in NFC form in all ALS requests. For open files, the text is not loaded from disk but sent by Emacs.

PS. We are working on a proper implementation of normalization in VSS.
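
The "check first, normalize only if needed" idea described above can be sketched on the client side in Python. This assumes Python 3.8 or later for unicodedata.is_normalized; ensure_nfc is a hypothetical helper name and says nothing about how VSS or Emacs implement it.

```python
import unicodedata


def ensure_nfc(text: str) -> str:
    """Return text in NFC, running the full normalization only when needed.

    ensure_nfc is a hypothetical helper for this sketch.
    """
    # Cheap check first: already-normalized text is returned unchanged.
    if unicodedata.is_normalized("NFC", text):
        return text
    # Slow path: decompose and recompose into NFC.
    return unicodedata.normalize("NFC", text)


composed = "\u00e0_variable"     # a-grave as one code point (NFC)
decomposed = "a\u0300_variable"  # 'a' followed by combining grave (NFD)

print(ensure_nfc(composed) == composed)    # True
print(ensure_nfc(decomposed) == composed)  # True
```

A client could apply something like this to buffer text before sending didOpen or didChange, so the server only ever sees NFC.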

joaopsazevedo reopened this on Nov 26, 2024
@joaopsazevedo
Contributor

@simonjwright

> This is from the log, is that OK?
>
> [ALS.IN] {"jsonrpc":"2.0","method":"initialized","params":{}}
> [ALS.MAIN] 'initialized' Params : (NULL RECORD)
> [ALS.IN] {"jsonrpc":"2.0","method":"workspace/didChangeConfiguration","params":{"settings":{"ada":{"defaultCharset":"utf8","enableDiagnostics":true}}}}

Yes, this looks fine.

> Note, most probably, your editor doesn't use Normalization Form C.
>
> In my test file, lower-case a-grave (à) is encoded as c3 a0 and upper-case A-grave (À) as c3 80, which I think is NFC?
>
> After converting with iconv(1) to "utf8-mac", which I believe to be NFD (available on Macs because, at least for the HFS filesystem, that was how files were stored on disk; it could have been NFKD?), à is encoded as 61 cc 80 and À as 41 cc 80.
>
> gnatformat failed in the same way for either source.

With GNATformat, could you share how you reproduced the issue? Was the --charset utf8 switch used?

@simonjwright
Contributor Author

> With GNATformat, could you share how you reproduced the issue? Was the --charset utf8 switch used?

Many apologies: when I use that switch, the program completes successfully.

My excuse, such as it is, is that in the output of gnatformat -h the switch appears mixed in with the other switches, so it’s easy to miss. I’m used to one switch per line.

Just a slight point: if, as @godunko says, the 'Ada standard "requires" use of NFC normalized text', couldn't gnatformat's default charset be utf8? It’d be good if gnatformat made the default dependent on the platform, perhaps with an environment variable? Here I have $LANG = en_GB.UTF-8.
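
To illustrate the kind of platform-dependent default I mean, here is a rough Python sketch that derives a charset from a POSIX-style locale string such as $LANG. The function name charset_from_locale and the fallback value are placeholders invented for this example; neither reflects gnatformat's actual behaviour or documented default.

```python
import os

# Placeholder fallback for this sketch only; not gnatformat's documented default.
FALLBACK_CHARSET = "iso-8859-1"


def charset_from_locale(lang):
    """Extract the codeset from a locale string.

    POSIX locale names look like language[_territory][.codeset][@modifier].
    charset_from_locale is a hypothetical helper name.
    """
    if not lang or "." not in lang:
        return FALLBACK_CHARSET
    codeset = lang.split(".", 1)[1]     # drop language and territory
    codeset = codeset.split("@", 1)[0]  # drop any @modifier
    return codeset or FALLBACK_CHARSET


print(charset_from_locale("en_GB.UTF-8"))  # UTF-8
print(charset_from_locale("C"))            # iso-8859-1 (the placeholder fallback)
print(charset_from_locale(os.environ.get("LANG")))
```

An explicit --charset switch would of course still override whatever is derived this way.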
