Ruby LSP Indexing is very slow on version 0.19.0 #2671

Closed
jmschp opened this issue Oct 4, 2024 · 16 comments · Fixed by #2704
Labels: bug, vscode

Comments

@jmschp

jmschp commented Oct 4, 2024

Description

Ruby LSP Information

VS Code Version

1.91.1

Ruby LSP Extension Version

0.8.2

Ruby LSP Server Version

0.19.0

Ruby LSP Addons

  • Ruby LSP Rails

Ruby Version

3.3.5

Ruby Version Manager

asdf

Installed Extensions

  • better-comments (3.0.2)
  • cucumberautocomplete (3.0.5)
  • vscode-erb-beautify (0.5.0)
  • pyright (1.1.327)
  • vscode-zipfs (3.0.0)
  • vscode-markdownlint (0.56.0)
  • vscode-eslint (3.0.10)
  • dotenv-vscode (0.28.1)
  • gitlens (15.5.1)
  • prettier-vscode (11.0.0)
  • vscode-pull-request-github (0.92.0)
  • todo-tree (0.0.226)
  • vscode-drawio (1.6.6)
  • prettier-sql-vscode (1.6.0)
  • haml (1.4.1)
  • mongodb-vscode (1.7.0)
  • vscode-docker (1.29.2)
  • debugpy (2024.6.0)
  • python (2024.12.3)
  • vscode-pylance (2024.8.1)
  • cpptools (1.21.6)
  • cpptools-extension-pack (1.3.0)
  • cpptools-themes (2.0.0)
  • vscode-versionlens (1.14.2)
  • material-icon-theme (5.11.1)
  • platformio-ide (3.3.3)
  • vscode-xml (0.27.1)
  • vscode-yaml (1.15.0)
  • trailing-spaces (0.4.1)
  • ruby-lsp (0.8.2)
  • code-spell-checker (4.0.13)
  • code-spell-checker-portuguese-brazilian (2.2.1)
  • vscode-stylelint (1.4.0)
  • highlight-matching-tag (0.11.0)
  • markdown-all-in-one (3.6.2)

Ruby LSP Settings

Workspace
{}
User
{
  "enableExperimentalFeatures": false,
  "enabledFeatures": {
    "codeActions": true,
    "diagnostics": true,
    "documentHighlights": true,
    "documentLink": true,
    "documentSymbols": true,
    "foldingRanges": true,
    "formatting": true,
    "hover": true,
    "inlayHint": true,
    "onTypeFormatting": true,
    "selectionRanges": true,
    "semanticHighlighting": true,
    "completion": true,
    "codeLens": true,
    "definition": true,
    "workspaceSymbol": true,
    "signatureHelp": true,
    "typeHierarchy": true
  },
  "featuresConfiguration": {},
  "addonSettings": {},
  "rubyVersionManager": {
    "identifier": "asdf"
  },
  "customRubyCommand": "",
  "formatter": "auto",
  "linters": null,
  "bundleGemfile": "",
  "testTimeout": 30,
  "branch": "",
  "pullDiagnosticsOn": "both",
  "useBundlerCompose": false,
  "bypassTypechecker": false,
  "rubyExecutablePath": "",
  "indexing": {},
  "erbSupport": true
}

Reproduction steps

  1. Open VS Code
  2. Ruby LSP starts indexing
  3. On this new version (0.19.0), indexing is a whole lot slower! 😢 It starts to slow down at 30% in my case.

Code snippet or error message

jmschp added the bug and vscode labels on Oct 4, 2024
@vinistock
Member

vinistock commented Oct 4, 2024

Thank you for the report. I think this is likely related to the fact that we started handling multibyte characters correctly in #2619.

While I understand that the slowdown is not ideal, it's a cost we have to pay to handle non-ASCII characters used in Ruby appropriately. For ASCII-only sources, there are optimizations we apply to ensure better performance, but calculating the appropriate code unit locations for multibyte sources is expensive by nature.

And we must do it during indexing; otherwise, users of multibyte characters, such as developers using Japanese characters in their Ruby code, will get features offset by an incorrect number of bytes.

Are you using a lot of multibyte characters in your codebase? For example, characters with accents like â, Japanese characters or emojis (even in comments)?
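
To make the offset problem concrete, here is a minimal Ruby sketch (not the Ruby LSP implementation) that measures the same text in bytes, characters, and UTF-16 code units. The `measure` helper and the sample strings are made up for illustration. LSP positions default to UTF-16 code units, while parsers typically report byte offsets, so the two only agree for ASCII-only text.

```ruby
# Hypothetical helper: measure a piece of source text three different ways.
def measure(text)
  {
    bytes: text.bytesize,                                      # what a byte-based parser reports
    chars: text.length,                                        # Unicode characters
    utf16_units: text.encode(Encoding::UTF_16LE).bytesize / 2, # UTF-16 code units
  }
end

ascii    = "name = 1"
accented = "\u00E2ge = 1"       # "â" takes 2 bytes in UTF-8 but 1 UTF-16 code unit
emoji    = "# \u{1F622} slow"   # "😢" takes 4 bytes in UTF-8 and 2 UTF-16 code units

[ascii, accented, emoji].each { |text| puts "#{text.inspect}: #{measure(text)}" }
```

For the ASCII string all three numbers match, which is why an ASCII-only fast path is possible; for the other two, a byte offset used directly as an editor column would point at the wrong place.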

@jmschp
Author

jmschp commented Oct 4, 2024

This is a new codebase I am working on. It is big and I don't know it very well yet, but I don't think we have a lot of non-ASCII characters. It's an application for the USA market only, and everything is in English.

It is taking a bit more than 4 minutes to index.

@andyw8
Contributor

andyw8 commented Oct 4, 2024

@jmschp I'd be curious to see the times for v0.18.4 vs v0.19.0.

(You can temporarily add ruby-lsp to your Gemfile to use a specific version).

@jmschp
Author

jmschp commented Oct 4, 2024

@andyw8

I used the following Regular Expression [^\x00-\x7f] to search the code base, and found 64 results in 9 Ruby files.

Times:

  • 0.18.4 -> 0′44″
  • 0.19.0 -> 4′12″ 😢
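
For anyone who wants to repeat the search outside an editor, the same regular expression can be applied from a small Ruby script. This is only a sketch: `non_ascii_lines` and `scan` are hypothetical helper names, and the `**/*.rb` glob assumes the script is run from the project root.

```ruby
# The pattern quoted above: any character outside the 7-bit ASCII range.
NON_ASCII = /[^\x00-\x7f]/

# Return the 1-based numbers of the lines that contain non-ASCII characters.
def non_ascii_lines(source)
  source.each_line.with_index(1)
        .select { |line, _number| line.match?(NON_ASCII) }
        .map { |_line, number| number }
end

# Map each matching file under the glob to its offending line numbers.
def scan(glob)
  Dir.glob(glob).each_with_object({}) do |path, hits|
    next unless File.file?(path)

    source = File.read(path, encoding: Encoding::UTF_8)
    next unless source.valid_encoding? # skip files that are not valid UTF-8

    lines = non_ascii_lines(source)
    hits[path] = lines unless lines.empty?
  end
end

scan("**/*.rb").each { |path, lines| puts "#{path}: #{lines.join(", ")}" }
```

Note that this only covers files inside the project. Installed gems and the Ruby standard library are indexed as well, so a clean result here does not rule out non-ASCII sources elsewhere.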

@vinistock
Member

That's a crazy increase. You're not indexing tests by any chance? Do you use test or spec for the directory?

@jmschp
Author

jmschp commented Oct 4, 2024

@vinistock

I use spec.

I have the ruby-lsp-rspec (0.1.14) installed, but it is not specified in the Gemfile.

@vinistock
Member

Did you already exclude the spec directory from indexing?

If not, can you please try this and let us know if indexing is faster? We should probably exclude spec by default like we do for test.

{
  "rubyLsp.indexing": {
    "excludedPatterns": ["**/spec/**.rb"],
  },
}

@epoberezhny
Contributor

epoberezhny commented Oct 5, 2024

Hey! I faced the same issue: indexing time increased from 7 seconds on v0.18.4 to 32 seconds on v0.19.1 😞 I excluded the spec directory, but it didn't help. There are no non-ASCII characters in my project itself; they come from installed gems (the AWS SDKs contain some huge files).

@Earlopain
Contributor

Earlopain commented Oct 6, 2024

There's definitely some time being spent on something. I added some logging and here are my worst offenders:

/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/3.3.0/prism/node.rb
38.29374
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/prism-1.1.0/lib/prism/inspect_visitor.rb
0.57227
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/3.3.0/bundler/vendor/fileutils/lib/fileutils.rb
0.53581
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/addressable-2.8.7/lib/addressable/uri.rb
0.4459
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/3.3.0/fileutils.rb
0.38042
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/lexer-F0.rb
0.16495
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/ruby32.rb
0.15025
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/prism-1.1.0/lib/prism/node.rb
0.08192
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/prism-1.1.0/lib/prism/serialize.rb
0.06388
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/ruby31.rb
0.05073

Most are done very quickly, but there's definitely at least one outlier, taking 38 (!) seconds. With 0.18.4:

/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/lexer-F1.rb
0.13788
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/3.3.0/rubygems/vendor/resolv/lib/resolv.rb
0.07309
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/lexer-F0.rb
0.05923
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/ruby23.rb
0.04744
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/ruby31.rb
0.04695
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/ruby19.rb
0.04603
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/ruby33.rb
0.04545
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/prism-1.1.0/lib/prism/node.rb
0.04175
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/3.3.0/prism/node.rb
0.04161
/home/earlopain/.rbenv/versions/3.3.5/lib/ruby/gems/3.3.0/gems/parser-3.3.5.0/lib/parser/ruby26.rb
0.03907

It's still in the top 10 but nowhere near as bad. Overall, things seem to be taking longer. I guess prism/node.rb contains some special characters, but the file is so large that I didn't find any by scrolling through.
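
A per-file timing log like the one above can be produced with a few lines of Ruby. The sketch below is an assumption about how such logging might look, not the code that was actually used: `slowest_files` is a hypothetical helper, and `File.read` stands in for the real per-file indexing step.

```ruby
require "benchmark"

# Time the given block once per path and return the `limit` slowest
# [path, seconds] pairs, slowest first.
def slowest_files(paths, limit: 10)
  timings = paths.map { |path| [path, Benchmark.realtime { yield(path) }] }
  timings.max_by(limit) { |_path, seconds| seconds }
end

paths = Dir.glob("**/*.rb").select { |path| File.file?(path) }

# Replace File.read with the work you actually want to measure.
slowest_files(paths) { |path| File.read(path) }.each do |path, seconds|
  puts path
  puts seconds.round(5)
end
```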

Edit: I missed this initially: the file taking so long is the one from the prism shipped with Ruby by default. The one from the gem (I have it in my dependencies) finishes very quickly in both cases. It used to contain a bunch of non-ASCII characters for pretty-printing the AST, which is now part of a different file. ruby/prism@0f21f5b

If I read the code right, it will dup the source code from the beginning all the way to the token it is currently looking at, which is just horrible for performance, and gets worse the larger the file is.
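
The cost pattern described here, re-copying the source from the start for every token, can be reproduced in isolation. The following is a simplified sketch and not Prism's or Ruby LSP's actual code: `slow_offsets` and `fast_offsets` are hypothetical names, and the incremental variant is just one possible way to avoid the repeated work, assuming sorted byte offsets that fall on character boundaries.

```ruby
require "benchmark"

# Naive: for every offset, copy and re-encode everything from byte 0 up to it.
# Total work grows roughly with the square of the file size.
def slow_offsets(source, byte_offsets)
  byte_offsets.map do |offset|
    source.byteslice(0, offset).encode(Encoding::UTF_16LE).bytesize / 2
  end
end

# Incremental: only convert the bytes between the previous offset and the
# current one, and keep a running total of UTF-16 code units.
def fast_offsets(source, byte_offsets)
  previous = 0
  total = 0
  byte_offsets.map do |offset|
    chunk = source.byteslice(previous, offset - previous)
    total += chunk.encode(Encoding::UTF_16LE).bytesize / 2
    previous = offset
    total
  end
end

# A synthetic "large file" with multibyte characters on every line.
source = "x = \"\u00E2\" # \u{1F622}\n" * 2_000

# Use the start of each line as an offset; line starts are character boundaries.
offsets = []
position = 0
source.each_line do |line|
  offsets << position
  position += line.bytesize
end

slow = Benchmark.realtime { slow_offsets(source, offsets) }
fast = Benchmark.realtime { fast_offsets(source, offsets) }
puts format("slow: %.3fs, fast: %.3fs", slow, fast)
```

Both methods return the same offsets; only the amount of copying differs. Doubling the number of lines should roughly quadruple the time of the naive version while only doubling the incremental one.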

@jmschp
Author

jmschp commented Oct 6, 2024

Did you already exclude the spec directory from indexing?

If not, can you please try this and let us know if indexing is faster? We should probably exclude spec by default like we do for test.

{
  "rubyLsp.indexing": {
    "excludedPatterns": ["**/spec/**.rb"],
  },
}

@vinistock

I had not excluded the spec folder before. I did it now and it still takes a long time. I did not time it exactly, but it feels the same.

@dgmora

dgmora commented Oct 7, 2024

For me on 0.19 it gets stuck always at 66% for several minutes (10+), although it eventually finished. I tried again with 0.18.4 and ruby-lsp-rails (0.3.16) and it finishes indexing again normally in 10 seconds.

ruby-lsp-check finished in both cases, although it felt slower with 0.19.
Is there some additional info I could provide to help?

The lsp log did not seem to provide much information, just:

[DEBUG][2024-10-07 11:37:43] .../vim/lsp/rpc.lua:408    "rpc.receive"  {
  jsonrpc = "2.0",
  method = "$/progress",
  params = {
    token = "indexing-progress",
    value = {
      kind = "report",
      message = "66% completed",
      percentage = 66
    }
  }
}

vinistock assigned vinistock and unassigned st0012 on Oct 7, 2024
@vinistock
Member

Just an update: we're looking into how we can compute code unit lengths in a more performant way. I'd really like to avoid reverting the multibyte support since it allows developers who write comments in other languages (like Japanese) to have a correct experience using the Ruby LSP.

If we can't fix it in a timely manner, we will revert and re-evaluate.

@NotFounds
Contributor

Hi, I'm sorry to hear that the performance has deteriorated due to the multi-byte character support.

I haven't been able to reproduce the issue in my local codebase and thus haven't been able to fully verify it, but would it be possible to modify it as follows?

The Prism::Location (which is actually Prism::Source) used during Index generation performs encoding conversion multiple times internally to calculate the correct position in code containing multi-byte characters.
https://github.com/ruby/prism/blob/c1a27a5f7f9c608e420fdb774c5dd2a8af9a1eb6/lib/prism/parse_result.rb#L92-L106

Currently, because it converts from the beginning of the file to the relevant position, the processing efficiency deteriorates as the file size increases.
On the other hand, it seems that by specifying the encoding during Prism.parse, we can convert the encoding of the source in advance. (If the comment is correct)
https://github.com/ruby/prism/blob/c1a27a5f7f9c608e420fdb774c5dd2a8af9a1eb6/lib/prism/parse_result.rb#L36-L40

I thought we might be able to achieve speedup by comparing the encoding of the source with the encoding of the argument, and skipping the conversion if they are the same.
Please ignore this if I'm saying something off the mark.
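
The idea of skipping the conversion when the encodings already match could look roughly like the sketch below. It is written against plain Ruby strings rather than Prism's API: `code_units_before` and `CODE_UNIT_BYTES` are hypothetical names, and counting code units as bytes divided by the unit width is an assumption of this sketch.

```ruby
# Width in bytes of one code unit for the encodings this sketch supports.
CODE_UNIT_BYTES = {
  Encoding::UTF_8 => 1,
  Encoding::UTF_16LE => 2,
  Encoding::UTF_16BE => 2,
  Encoding::UTF_32LE => 4,
  Encoding::UTF_32BE => 4,
}.freeze

# Hypothetical helper: number of code units, measured in `encoding`, that
# precede `byte_offset` in `source`.
def code_units_before(source, byte_offset, encoding)
  unit = CODE_UNIT_BYTES.fetch(encoding)
  prefix = source.byteslice(0, byte_offset)

  # Skip the conversion entirely when the source already uses the requested encoding.
  converted = prefix.encoding == encoding ? prefix : prefix.encode(encoding)
  converted.bytesize / unit
end

utf8 = "a\u00E2\u{1F622}"               # 7 bytes in UTF-8
utf16 = utf8.encode(Encoding::UTF_16LE) # 8 bytes in UTF-16LE

puts code_units_before(utf8, utf8.bytesize, Encoding::UTF_16LE)   # converts the prefix
puts code_units_before(utf16, utf16.bytesize, Encoding::UTF_16LE) # no conversion needed
```

This only removes the transcoding step. The prefix is still sliced from the start of the source on every call, so on its own it would not address the growth with file size discussed earlier in the thread.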

@dgmora

dgmora commented Oct 12, 2024

For me on 0.19 it gets stuck always at 66% for several minutes (10+)...

Not happening anymore with 0.20.0 🎉

@vinistock
Member

Awesome!

@jmschp
Author

jmschp commented Oct 14, 2024

I can also confirm that it is much faster in 0.20.0.

Thank you!


8 participants