This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

scip-ctags: returns different kinds to universal-ctags #57659

Closed
keegancsmith opened this issue Oct 17, 2023 · 3 comments

@keegancsmith
Member

keegancsmith commented Oct 17, 2023

We just noticed a pretty major regression in ranking for Go files in Sourcegraph, and it is likely more widespread than Go. It is caused by scip-ctags returning different kinds than universal-ctags, so the Zoekt ranking code assigns incorrect weights to results.

Right now we will work around it, but ideally we get the same data. Here is what we have noticed so far (only testing Go). The comparison is confusing because the universal-ctags kind names are bad:

  • For types, the kind is always "type". We can't tell struct, interface and type alias apart.
  • Functions are called "function" instead of "func".
  • methodSpec is called "method". In universal-ctags, methodSpec is a method in an interface declaration.
  • Methods and functions are both "func" in universal-ctags. In scip-ctags they are "method" and "function".
  • member is called "variable".
  • var is called "variable".
  • const is called "constant".

Given we are moving to scip-ctags, it would be great if we had a well-defined spec here. All current uses of kinds were worked out by running universal-ctags and seeing what it outputs. Additionally, we should be using scip-ctags in the zoekt repo.
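
To make the differences listed above concrete, here is a minimal sketch of a lookup table from the scip-ctags kind names to the universal-ctags names they may correspond to. It is built only from the observations in this comment. The names `universalKindsFor` and `toUniversalKind` are hypothetical, and the kind strings are copied from the list above rather than checked against either tool.

```go
package main

import "fmt"

// universalKindsFor lists, for each Go kind name that scip-ctags was observed
// to emit, the universal-ctags kind names it may correspond to.
// Assumption: this table reflects only the observations in this issue.
var universalKindsFor = map[string][]string{
	"type":     {"struct", "interface", "type alias"},
	"function": {"func"},
	"method":   {"func", "methodSpec"},
	"variable": {"member", "var"},
	"constant": {"const"},
}

// toUniversalKind returns the single universal-ctags name for a scip-ctags
// kind. It returns ok=false when the kind is unknown or maps to more than
// one universal-ctags name.
func toUniversalKind(scipKind string) (kind string, ok bool) {
	candidates := universalKindsFor[scipKind]
	if len(candidates) != 1 {
		return "", false
	}
	return candidates[0], true
}

func main() {
	for _, k := range []string{"function", "constant", "type", "variable"} {
		if u, ok := toUniversalKind(k); ok {
			fmt.Printf("%s -> %s\n", k, u)
		} else {
			fmt.Printf("%s -> ambiguous or unknown\n", k)
		}
	}
}
```

Only "function" and "constant" translate unambiguously here. The other names lose information, which is why a rename alone cannot recover the universal-ctags data.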

@keegancsmith
Member Author

The workaround just for go is at sourcegraph/zoekt#655. Gonna target getting that in for the patch release.

cc @sourcegraph/search-platform

@jtibshirani
Member

jtibshirani commented Oct 26, 2023

Here's our plan to tackle this.

Improve 'kind' output for SCIP ctags:

Make Zoekt robust to differences in 'kind' output from SCIP and universal-ctags:

Also spun off some issues:

@jtibshirani
Member

Closing this out, as remaining work is tracked in follow-up issues:

jtibshirani referenced this issue Nov 1, 2023
Added more fine-grained kinds for C#, Python, and Ruby to better match the universal-ctags output.

Addresses https://github.com/sourcegraph/sourcegraph/issues/57659
keegancsmith referenced this issue Nov 2, 2023
* SCIP syntax: add more kinds for go ctags (#57806)

This change adds fine-grained kinds to the Go ctags output. Specific changes:
* Split `type` into `interface`, `struct`, and `typealias`
* Map struct members to `field` instead of `variable`

Universal ctags does not have a clear spec, and some languages use different
names for the same kind. So my strategy is not to match universal ctags
exactly, but just to capture the correct SCIP kinds. Clients need to handle the
fact that the kind names can be different.
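
One way a client could handle differing kind names is to group the spellings into categories before assigning a ranking weight. The sketch below illustrates that idea only. The alias table, the category names, the weights, and the `weightFor` function are all hypothetical and are not taken from Zoekt.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalKind maps lower-cased kind names from either tool to one category.
// Assumption: the aliases come from the names mentioned in this issue.
var canonicalKind = map[string]string{
	"func":       "function",
	"function":   "function",
	"method":     "function",
	"methodspec": "function",
	"type":       "type",
	"struct":     "type",
	"interface":  "type",
	"typealias":  "type",
	"member":     "field",
	"field":      "field",
	"var":        "variable",
	"variable":   "variable",
	"const":      "constant",
	"constant":   "constant",
}

// categoryWeight holds made-up weights, used only for illustration.
var categoryWeight = map[string]float64{
	"type":     10,
	"function": 8,
	"constant": 5,
	"field":    4,
	"variable": 3,
}

// defaultWeight is used for kinds the client does not recognize.
const defaultWeight = 1.0

// weightFor returns the weight for a kind name, ignoring case and treating
// the scip-ctags and universal-ctags spellings as equivalent.
func weightFor(kind string) float64 {
	category, ok := canonicalKind[strings.ToLower(kind)]
	if !ok {
		return defaultWeight
	}
	return categoryWeight[category]
}

func main() {
	for _, k := range []string{"func", "function", "struct", "member", "mystery"} {
		fmt.Printf("%s: %v\n", k, weightFor(k))
	}
}
```

Falling back to a default weight means a kind name the client has never seen lowers precision instead of silently dropping the symbol from ranking.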

* SCIP ctags: add kinds for C#, Python, Ruby (#57879)

Added more fine-grained kinds for C#, Python, and Ruby to better match the universal-ctags output.

Addresses https://github.com/sourcegraph/sourcegraph/issues/57659

* Add scip-ctags kinds for js, ts, and rust (#57899)

* SCIP ctags: use MethodSpec kind for Go (#57929)

Now that MethodSpec is available in SCIP, we can use it in the Go SCIP ctags
output.

* SCIP ctags: add kinds for Kotlin (#57998)

Improved ctags kind output for Kotlin:
* Split type into class, interface, object, and enum
* Split variable into enumMember, constant, and property
* Add type alias

* Fix snapshot tests by adding back trailing whitespace

Our precommit hook removes trailing whitespace, but the generated snapshots
include it.

---------

Co-authored-by: Auguste Rame
vovakulikov referenced this issue Dec 12, 2023
Added more fine-grained kinds for C#, Python, and Ruby to better match the universal-ctags output.

Addresses https://github.com/sourcegraph/sourcegraph/issues/57659