Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve analysis of files with consistent indentation where offset is guessed wrongly due to merging #68

Open
NateEag opened this issue Jun 14, 2023 · 2 comments

Comments

@NateEag
Copy link

NateEag commented Jun 14, 2023

dtrt-indent (version 20230302.2151) guesses the following file has an offset of 4, when 2 is what I'd expect:

declare namespace NodeJS {
  interface ProcessEnv {
    NODE_ENV: string;

    FOOBAR: string;
    BAR: string;
    BAZ: string;

    TEST?: string;
    TEST2?: string;
    TEST3?: string;

    OPTION: string;
    ANOTHER_OPTION: string;
    MORE_OPTIONS: string;
  }
}

Output of dtrt-indent-diagnosis:

Guessing offset for /Users/nathaneagleson/schedule-engine/servicechannel-integration/test.ts

Elapsed time for analysis: 0.000 seconds

Total relevant lines: 12 out of 18 (limit: 5000)

Histogram:

     2x   2 spaces
    10x   4 spaces

Analysis:

  offset 2 works for 100.00% of relevant lines, matching 2 distinct offsets - merged with offset 4 (16.67% deviation, limit 20.00%)
  offset 4 works for  83.33% of relevant lines, matching 1 distinct offsets - CONSIDERED
  offset 3 works for   0.00% of relevant lines, matching 0 distinct offsets - rejected: too few distinct matching offsets (1 required)
  offset 5 works for   0.00% of relevant lines, matching 0 distinct offsets - rejected: too few distinct matching offsets (1 required)
  offset 6 works for   0.00% of relevant lines, matching 0 distinct offsets - rejected: too few distinct matching offsets (1 required)
  offset 7 works for   0.00% of relevant lines, matching 0 distinct offsets - rejected: too few distinct matching offsets (1 required)
  offset 8 works for   0.00% of relevant lines, matching 0 distinct offsets - rejected: too few distinct matching offsets (1 required)

Summary:

  Best guess is offset 4 with 83.33% matching lines (80.00% required)
  Hard tab percentage: 0.00% (0 lines), -100.00% superior to soft tabs (threshold 300.00%)
  Soft tab percentage: 100.00% (12 lines), inf% superior to hard tabs (threshold 300.00%)

Conclusion:

  Guessed offset 4 with 83% confidence.
  Change indent-tab-setting: yes, to nil
@rrthomas rrthomas changed the title dtrt-indent-mode guesses offset of 4 for file where offset of 2 is better Improve analysis of files with consistent indentation where offset is guessed wrongly due to merging Jun 15, 2023
@rrthomas
Copy link
Collaborator

This is dtrt-indent working as intended (and the diagnosis tells you what's going on). It can't distinguish between the case where there are a few odd 2-indents somewhere in the middle of a long file (where an indent of 4 is what you really want), and the case where there are a few symmetric ones at each end (as here).

I thought this issue seemed familiar, and indeed, see #29. I'd rather not make dtrt-indent cleverer. On the other hand, decreasing the value of dtrt-indent-max-merge-deviation only works up to a certain point: for arbitrarily long JSON files you'd have to set it to zero.

I can't think of a simple analytical solution to this problem in general that doesn't involve understanding the syntax of the file.

One way to do it by trial and error: run indent-buffer with the first-guessed setting. If the buffer doesn't change, then keep the setting. This could be done in dtrt-indent-analyze when considering whether to merge two settings.

@NateEag
Copy link
Author

NateEag commented Jun 15, 2023

Thanks for the explanation, @rrthomas . That makes sense, and it seemed like a maybe-expected result of the design, but it also surprised me, hence my filing the issue.

I didn't find dtrt-indent-max-merge-deviation when I skimmed for something I could tweak that might help, so thanks, that looks useful.

I don't see a robust, correct analytical answer to the problem either.

Maybe an (opt-in?) dumb heuristic could be good enough. Before merging a lower indent level into a compatible higher indent level, check whether the file's first and last indents match the lower level?

If they do, on average, I bet the lower indent level would be correct.

Your trial-and-error suggestion also makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants