Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent intermediate state when sequencing gazelle_cabal and gazelle_haskell_modules #40

Open
GuillaumeGen opened this issue Oct 4, 2022 · 10 comments

Comments

@GuillaumeGen
Copy link
Contributor

When a project uses gazelle_cabal and gazelle_haskell_modules, adding Haskell files requires to modify the xxx.cabal file, hence to run gazelle_cabal to propagate those modifications to Bazel. But running gazelle_cabal alone on a "haskell_modules" Bazel project leads to an inconsistent state where the haskell_library rules have both the srcs and the modules fields, whereas the 2 are incompatible.
Fortunately, running gazelle_haskell_modules solves this and bring the project in the expected state.

@GuillaumeGen
Copy link
Contributor Author

I see 3 options:

  • We consider that it is not a real issue. And then we simply have to document it better.
  • gazelle_cabal could completely remove all the modules fields. This makes the intermediate step usable, but it is not very nice as it looks like gazelle_cabal makes a step backward and then gazelle_haskell_modules "rewrites" what has just been erased.
  • Sequencing the 2 in only one gazelle_binary is made possible. Currently there are some type inconsistencies, (see Internal error when trying to use gazelle_cabal and gazelle_haskell_modules together #39 ) but if gazelle_haskell_modules imported gazelle_cabal, it would probably possible to do a case analysis on the type provided to the "interface argument" of the Resolve function of gazelle_haskell_modules and then plug the 2 more smoothly.

@facundominguez
Copy link
Member

facundominguez commented Oct 4, 2022

We consider that it is not a real issue. And then we simply have to document it better.

While we don't have a concrete usability concern to address, this looks attractive if other fixes are laborious.

gazelle_cabal could completely remove all the modules fields.

... and keep the haskell_module rules? That could work, though I guess it would need to keep the modules attribute if there are any "#keep" comments in the attribute or its entries (I don't have any concrete use cases in mind, though).

Sequencing the 2 in only one gazelle_binary is made possible.

Maybe this is possible, and perhaps it would be the ideal solution. At some point I thought that dependencies between gazelle phases would make it difficult, but I don't have clarity about this anymore. If anything, it is worth writing down what are the difficulties of running them in a single binary.

@GuillaumeGen
Copy link
Contributor Author

After a discussion with Arnaud about the fact that solving this issue takes too much time, we agreed that I should make very regular comments on what has been achieved / has been understood / remain to do.

@GuillaumeGen
Copy link
Contributor Author

The current status (e02fbbc) is that the file generated on the test is

load("@rules_haskell//haskell:defs.bzl", "haskell_library", "haskell_test")
load("@rules_haskell//haskell/experimental:defs.bzl", "haskell_module")

# rule generated from test-project/bundled-gazelle-test.cabal by gazelle_cabal
haskell_library(
    name = "bundled-gazelle-test",
    ghcopts = ["-DVERSION_bundled_gazelle_test=\"0.0.0\""],
    modules = [
        ":bundled-gazelle-test.Fibo",
        ":bundled-gazelle-test.Power",
    ],
    version = "0.0.0",
    visibility = ["//visibility:public"],
    deps = [
        "@stackage//:base",
        "@stackage//:transformers",
    ],
)

# rule generated from test-project/bundled-gazelle-test.cabal by gazelle_cabal
haskell_test(
    name = "spec",
    ghcopts = ["-DVERSION_bundled_gazelle_test=\"0.0.0\""],
    modules = [":spec.Main"],
    narrowed_deps = [":bundled-gazelle-test"],
    version = "0.0.0",
    visibility = ["//visibility:public"],
    deps = [
        "@stackage//:base",
        "@stackage//:tasty",
        "@stackage//:tasty-hunit",
        "@stackage//:transformers",
    ],
)

# rule generated by gazelle_haskell_modules
haskell_module(
    name = "bundled-gazelle-test.Fibo",
    src = "src/Fibo.hs",
    src_strip_prefix = "src",
    visibility = ["//visibility:public"],
)

# rule generated by gazelle_haskell_modules
haskell_module(
    name = "bundled-gazelle-test.Power",
    src = "src/Power.hs",
    src_strip_prefix = "src",
    visibility = ["//visibility:public"],
)

# rule generated by gazelle_haskell_modules
haskell_module(
    name = "spec.Main",
    src = "tests/Main.hs",
    src_strip_prefix = "tests",
    visibility = ["//visibility:public"],
)

It is almost what is expected, but the dependencies of spec.Main (on bundled-gazelle-test.Fibo and bundled-gazelle.Power) are missing. The first investigations let me see that the RuleIndex used by the Resolve function does not contain the is_dep_of:... fields, which are the one used to fill the dependencies of a module.

@GuillaumeGen
Copy link
Contributor Author

GuillaumeGen commented Oct 17, 2022

As far as I understand, it is not possible to add those is_dep_of:... fields in the RuleIndex because this is filled during the Import phase, and this function does not have access to the imports associated to each rules, which precisely contains this information.
One could think, let's just run 'Resolve' (from gazelle_cabal) on the rules to have this information, but this requires to already have a RuleIndex, hence it is not possible to do so during the Import phase.
I do not understand why the Resolve phase of gazelle_cabal is not ran before the Import phase of gazelle_haskell_modules. I will ask it in the Gazelle on Bazel Slack.
I see 2 possibilities:

  1. Make gazelle_cabal provides a more complete rule before the Resolve phase. What I mean by it, is that currently, the GenerateRules phase generates the rules and the import data without any redundancies, which seems quite sensible, and then the resolve phase fills the missing fields in the rules, using both the import data and the RuleIndex. It might be possible to introduce a bit of redundancy between the rules and the import data, so that the Import phase of gazelle_haskell_modules can use it. This is not very elegant and might imply some works, I will wait until having an answer on why the Resolve phase of previous extensions are not ran before going this way.
  2. Store the information on the imports required in a separate map enrichRuleIndex and fill it while running Resolve on the rules generated by gazelle_haskell_modules. This is not elegant neither, as it relies on the order in which rules are resolved, but it works and is implemented in current commit : 6958aae.

@GuillaumeGen
Copy link
Contributor Author

The previous version was working, but a few setup_deps were missing for GHC 9.2.x (see https://github.com/tweag/gazelle_haskell_modules#using-ghc-version-92). This is solved in 12ac604.

@facundominguez
Copy link
Member

facundominguez commented Oct 19, 2022

Let me see if I understand.

gazelle_haskell_modules needs the rules generated by gazelle_cabal in order to generated haskell_module rules for them.
This would force gazelle_cabal to run in advance.

But what would happen if we tried to combine the callbacks of both extensions in a single gazelle run? Presumably calls would happen roughly in the following order:

  • gazelle_cabal.GenerateRules would be invoked on every subdirectory.
  • gazelle_haskell_modules.GenerateRules would be invoked on every subdirectory.
  • gazelle_cabal.Imports would be called for each rule generated by any of the extensions in gazelle_cabal's domain (haskell_binary, haskell_library, haskell_test)
  • gazelle_haskell_modules.Imports would be called for each rule generated by any of the extensions
  • gazelle_cabal.Resolve would be invoked for each rule generated by any of the extensions in gazelle_cabal's domain (haskell_binary, haskell_library, haskell_test)
  • gazelle_haskell_modules.Resolve would be invoked for each rule generated by any of the extensions

This is a conjecture, I don't really know how it works, but I think we should find it out ahead of discussing a solution.

@GuillaumeGen
Copy link
Contributor Author

This would be what I expected too. But when adding a simple print in the relevant functions of the extensions, the result is:

CABAL Loads
HASK_M Loads
CABAL GenerateRules
HASK_M GenerateRules
CABAL GenerateRules
HASK_M GenerateRules
CABAL GenerateRules
HASK_M GenerateRules
HASK_M Imports
HASK_M Imports
HASK_M Imports
HASK_M Imports
HASK_M Imports
CABAL Fix
HASK_M Fix
CABAL GenerateRules
HASK_M GenerateRules
HASK_M Embeds
HASK_M Embeds
HASK_M Embeds
HASK_M Embeds
HASK_M Embeds
HASK_M Resolve
HASK_M Resolve
HASK_M Resolve
HASK_M Resolve
HASK_M Resolve

No CABAL Imports and no CABAL Resolve.

@facundominguez
Copy link
Member

facundominguez commented Oct 31, 2022

Given the limitations of gazelle to combine these extensions, and given the complexities that #41 needs to introduce [1], shall we go by

We consider that it is not a real issue. And then we simply have to document it better.

?

[1]: The complexities I understand are:

  1. We need to make assumptions undocumented in gazelle about which callbacks run and in which order.
  2. We need to make gazelle_haskell_modules depend on gazelle_cabal
  3. We need to make gazelle_haskell_modules depend on gazelle_cabal even when the user doesn't intend to usegazelle_cabal (or introduce configuration machinery to avoid this).

@GuillaumeGen
Copy link
Contributor Author

I agree with your diagnosis of the complexity. It is the diagnosis (and the overcoming) of those complexity which made this issue quite long to solve.
According to https://bazelbuild.slack.com/archives/C01HMGN77Q8/p1667238537529459?thread_ts=1667212307.476989&cid=C01HMGN77Q8 to my question asked in the Bazel Slack, there does not seem to be any way to do modify the deps field of a rule generated by another extension than what I did (namely to not bring the Resolve part into the last called extension).

About what should we do now, that is a great question. I think that the first question to ask is "Do we expect to have users of gazelle_haskell_modules but not gazelle_cabal?" If the answer is "yes" then I think that adding to some user the annoyance of having to depend of gazelle_cabal to relieve some others of running 2 separate gazelle commands is not worth the maintenance of #41

Otherwise, it is more debatable question.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants