Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interface review for resource API #1124

Open
vsoch opened this issue Dec 21, 2023 · 0 comments
Open

Interface review for resource API #1124

vsoch opened this issue Dec 21, 2023 · 0 comments

Comments

@vsoch
Copy link
Member

vsoch commented Dec 21, 2023

This issue is intended to review the exposed interfaces for the resource API, with the intention to understand what needs to be exposed (and what does not). I'll tackle this from the standpoint of our primary use case (at least for now) - the fluence plugin that uses the Go bindings and other TBA out of tree plugins that use fluxion.

Level 1: Go Modules

The entrypoint for a Go plugin using the Fluxion Go bindings would be these modules. I don't know the subtle distinction between the cli and module.

reapi_module

fluxmodule

reapi_module imports reapi_module.h, and uses these functions / structures from it:

  • struct_reapi_module_ctx_: the context
  • reapi_module_new used in Init
  • reapi_module_destroy used in Destroy
  • reapi_module_match_allocate used in MatchAllocate
  • reapi_module_info for Info
  • reapi_module_cancel for Cancel

reapi_cli

fluxcli

reapi_cli imports reapi_cli.h, and uses these functions / structures from it:

  • struct_reapi_cli_ctx_t: for the context
  • reapi_cli_new create the new client (context)
  • reapi_cli_destroy for Destroy
  • reapi_cli_initialize to init Fluxion with resource graph
  • reapi_cli_match_allocate for MatchAllocate
  • reapi_cli_update_allocate for UpdateAllocate
  • reapi_cli_cancel for Cancel
  • reapi_cli_info for Info
  • reapi_cli_stat for Stat
  • reapi_cli_get_err_msg for GetErrMsg
  • reapi_cli_clear_err_msg for ClearErrMsg

Level 2: C/C++ bindings

The Go modules above use the following header files and associated c code. It looks like the C code actually imports some C++ headers / code as well! I didn't know you could do that, TIL! I am guessing this is because we are using cgo? And cgo needs to use C, and fluxion is in C++, so we created the c bindings as an intermediate interface?

reapi_cli.h

The TLDR here is that the reapi_cli.h imports some of the C++ bindings, and it's these C++ bindings that bring in different components from resource (higher up) along with jobinfo and this is why we need to include those shared libraries.

Imports:

I'll stop there because it starts to look like everything is using everything else in resource!

reapi_module.h

TLDR: my impression here is that the distinction might be that this is indeed intended to be a module, meaning it doesn't bring in all the libraries from libresource.so. Was it the case that this module was started but it couldn't meet all the needs that we wanted, so it was left (and then the fluxcli started?)

Imports:

  • flux/core.h
  • resource/reapi/bindings/c++/reapi_module.hpp
    • imports resource/reapi/bindings/c++/reapi.hpp
  • resource/reapi/bindings/c++/reapi_module_impl.hpp
    • imports resource/reapi/bindings/c++/reapi_module.hpp

Next Steps

To step back - let's have a discussion about what our goals are (e.g., to create more separation between the API via more scoped functions? To simplify logic / types to be shared between libresource.so and libreapi_cli.so so they can use the same thing (but we expose a much smaller interface?) Let me know if you want me to dig deeper into any of the above - I'm hoping the links to the top level files help you explore as I did.

Example Use Case

Right now to build an out of tree plugin using fluxion (with the Go bindings) I need to both compile and then have the paths to the (non system installed) .so files exposed via LD_LIBRARY_PATH. Here are those things, and I'll show as a diff for what I have to do currently (red) vs what I'd like to do (green).

Compiling

- -L/opt/flux-sched/resource -lfluxion-resource -L/opt/flux-sched/resource/libjobspec -ljobspec_conv -L/opt/flux-sched/resource/reapi/bindings -lreapi_cli -lflux-idset -lstdc++ -lczmq -ljansson -lhwloc -lboost_system -lflux-hostlist -lboost_graph -lyaml-cpp" go build -ldflags '-w' -o bin/icecream src/cmd/main.go
+ -lfluxion-resource -lreapi_cli -lflux-idset -lstdc++ -lczmq -ljansson -lhwloc -lboost_system -lflux-hostlist -lboost_graph -lyaml-cpp" go build -ldflags '-w' -o bin/icecream src/cmd/main.go

And maybe fluxion-resource and reapi_cli are differently named / combined, I'm not sure. But the distinction is that I don't need to tell the linker about paths in the source code of a built flux-sched. They are in default paths somewhere on my system (likely in a container).

Runtime

# These are the flags
# This is what I need to export before I run my binary
+ This should be entirely unnecessary if flux (with the reapi_cli) is installed to a system location like `/lusr/lib`
- export LD_LIBRARY_PATH=/usr/lib:/opt/flux-sched/resource:/opt/flux-sched/resource/reapi/bindings:/opt/flux-sched/resource/libjobspec

# Running the binary!
./bin/icecream -spec icecream.yaml

Questions

And some questions that I have:

  • I think we should document somewhere when I'd want to use reapi_cli vs reapi_module
  • Given the existence of the two, is it the case that we really want one thing (that looks more like the module in terms of what it brings it) but that was impossible to implement?
  • Consistent function naming - e.g., full names vs not. It looks like we are using full names "MatchAllocate" so I think "GetErrMsg" should be "GetErrorMessage" (as an example).

Almost forgot! Ping @trws and @milroy from discussion today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant