-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use pluggy for plugin management #146
Conversation
Builds on top of @benbovy 's work in building router factories in xpublish-community#89 to build a plugin system. The plugin system uses entry points, which are most commonly used for console or GUI scripts. The entry_point group is `xpublish.plugin` Right now plugins can provide dataset specific and general (app) routes, with default prefixes and tags for both. Xpublish will by default load plugins via the entry point. Additionally, plugins can also be loaded directly via the init, as well as being disabled, or configured. The existing dataset router pattern also still works, so that folks aren't forced into using plugins Entry point reference: - https://setuptools.pypa.io/en/latest/userguide/entry_point.html - https://packaging.python.org/en/latest/specifications/entry-points/ - https://amir.rachum.com/amp/blog/2017/07/28/python-entry-points.html
# Conflicts: # .github/workflows/main.yaml # .pre-commit-config.yaml # setup.py
for more information, see https://pre-commit.ci
Another variation on xpublish-community#140 with a few of the ideas from the discussion there and xpublish-community#139. Plugin routers are now nested under a parent `Plugin` class which now acts as a way to combine multiple related pieces of functionality together (say db management routes and a CLI). This allows new plugin functionality to be added in other plugins or Xpublish related libraries without requiring the parent `Plugin` class to define everything. Plugins are loaded from the `xpublish.plugin` entrypoint group. Plugins can be manually configured via the `plugins` argument to `xpublish.Rest`. The specifics of plugin loading can be changed by overriding the `.setup_plugins()` method. Some other `xpublish.Rest` functionality has been refactored out into separate methods to allow easier overriding for instance making a `SingleDatasetRest` class that will allow simplifying `xpublish.Rest`. The `ds.rest` accessor has been move out into it's own file.
Pluggy is the core of py.test's ability to be extended, and it also is the plugin manager for Tox, Datasette, Jupyter-FPS, and Conda, among others. In Xpublish a set of hooks is defined that plugins can implement, and a `pluggy.PluginManager` (as `xpublish.Rest.pm`) proxies requests to the plugins that have implemented the hooks and aggregates the results. Hooks define a set of possible kwargs that can be passed to downstream implementations, but not all implementations need to implement all of them (which makes it easy to add new kwargs without disrupting existing implementations). Hooks can also be defined as only returning the first response, or wrapping other hooks (dataset middleware?). So far I've defined a handful of hooks in `xpublish.plugin.hooks:PluginSpec`: - app_router() - dataset_router() - get_datasets() - get_dataset(dataset_id: str) - register_hookspec() - Which allows plugins to register new hook types `get_datasets` and `get_dataset` allow plugins to provide datasets without loading them on launch, or overriding `Rest._get_dataset_fn` or `Rest.setup_datasets()`. I've kept the kwargs relatively minimal right now on the hooks as it's easier to expand the kwargs later, than it is to reduce them. I've additionally refactored the single dataset usage into it's own class `xpublish.SingleDatasetRest` to simplify some of the conditional logic which the accessor uses. Pluggy references: - https://pluggy.readthedocs.io/en/stable/ - https://docs.pytest.org/en/latest/how-to/writing_plugins.html - https://docs.datasette.io/en/latest/writing_plugins.html - https://docs.conda.io/projects/conda/en/latest/dev-guide/plugins/index.html#
PRs for the routers that I've made to work as pluggy based plugins: |
Other hooks that might be worth adding to Xpublish itself:
Hooks to delegate to plugins:
|
After some pondering (powered by Future Island and a long commute in the dark), it seems like setting dependencies at the Plugin level is probably the wrong move. It makes it harder to override (say When I get a chance, I'll explore passing in dependencies as arguments to |
In a conversation with @mpiannucci I got to pondering how Dask fits into the plugin equation. As far as I understand, xarray finds any dask client that's been initialized. If that's the case, it might be possible to add a new general If xarray can't take advantage of a dask client that's was initialized in another plugin due to namespace issues, then we could use a more explicit I think this could help take care of #54 in that the specifics of a Dask cluster are determined by the specifics of the loaded plugin. Hence we could have |
Refactored dependency injection into plugin routers so that dependencies can be overridden when routers are called, rather than when plugins are instantiated. This makes it so that routers can be reused and adapted by other plugins.
# "dataset_id" parameter should be absent in all paths | ||
assert len(openapi_schema['paths']['/']['get']['parameters']) == 0 | ||
|
||
# test cached value | ||
openapi_schema = airtemp_rest.app.openapi() | ||
assert len(openapi_schema['paths']['/']['get']['parameters']) == 0 | ||
with pytest.raises(KeyError): | ||
# "dataset_id" parameter should be absent in all paths | ||
# parameters is no longer generated when plugins use passed in deps | ||
# and get_dataset is replaced | ||
assert len(openapi_schema['paths']['/']['get']['parameters']) == 0 | ||
|
||
with pytest.raises(KeyError): | ||
# test cached value | ||
# parameters is no longer generated when plugins use passed in deps | ||
# and get_dataset is replaced | ||
openapi_schema = airtemp_rest.app.openapi() | ||
assert len(openapi_schema['paths']['/']['get']['parameters']) == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the change in dependency loading for built in routes (via plugins), get_dataset
doesn't take an argument for single datasets, and thus the parameters
key doesn't get added to the openapi spec.
It still would get added (and need to be overridden) for any non-plugin routers, so the modified spec generator is still needed for single datasets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome @abkfenris!!! Apologies for such a delayed review.
A few questions below but they are mostly about just getting me caught up on what you (and @benbovy) have done here.
xpublish/plugins/hooks.py
Outdated
"""Plugin extension points""" | ||
|
||
@hookspec | ||
def app_router(self, deps: Dependencies) -> APIRouter: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my learning, all these methods are optional right? I.e. a plugin could implement one or more of these hooks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, they are all optional. I'm thinking I might explore mix-ins for common plugin patterns, in addition to a whole lot of documentation around plugins and how the whole ecosystem can work.
@@ -0,0 +1,75 @@ | |||
from typing import List |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question about file naming. Do we need the /included/
part of the file path for the default plugins?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wavering between /default/
and /included/
for the plugins that come with xpublish. I did want to keep them in the /plugins/
directory, but I wanted to differentiate them from the plugin machinery. I initially had /included_plugins/
or similar at the same level as /plugins/
but that seemed like overkill and more confusing.
I'm also thinking that as part of documenting things, I'll probably make a few demo plugins that won't be automatically loaded via entrypoints, but can show off the different parts of the functionality (for example an dataset plugin that provides the xarray tutorial datasets), so I was thinking those would end up in xpublish/plugins/demo/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a bit overkill at this point but I'm fine with it.
Ok, @jhamman as soon as you are happy, this is ready for you to merge. |
Super excited by this @abkfenris! |
Well done @abkfenris, this is great! Sorry I didn't take the chance to look at it but happy to see it merged. I'll open new issues if I have any comment. |
@benbovy there is definitely more work to be done around plugins (see comment above), so there will be further iteration on them that I'd love feedback on. I'm hoping to work on the docs soon to at least give a rough intro to plugins, and explain the vision behind them/where Xpublish is going, so I'd love some feedback on that. |
Pluggy is the core of py.test's ability to be extended, and it also is the plugin manager for Tox, Datasette, Jupyter-FPS, Napari, and Conda, among others.
In Xpublish a set of hooks is defined that plugins can implement, and a
pluggy.PluginManager
(asxpublish.Rest.pm
) proxies requests to the plugins that have implemented the hooks and aggregates the results.Hooks define a set of possible kwargs that can be passed to downstream implementations, but not all implementations need to implement all of them (which makes it easy to add new kwargs without disrupting existing implementations). Hooks can also be defined as only returning the first response, or wrapping other hooks (dataset middleware?).
So far I've defined a handful of hooks in
xpublish.plugin.hooks:PluginSpec
:app_router()
dataset_router()
get_datasets()
get_dataset(dataset_id: str)
register_hookspec()
- Which allows plugins to register new hook typesget_datasets
andget_dataset
allow plugins to provide datasets without loading them on launch, or overridingRest._get_dataset_fn
orRest.setup_datasets()
.I've kept the kwargs relatively minimal right now on the hooks as it's easier to expand the kwargs later, than it is to reduce them.
Other potential hooks include middleware, tabular or datatree datasets, a more end user focused UI.
I've additionally refactored the single dataset usage into it's own class
xpublish.SingleDatasetRest
to simplify some of the conditional logic which the accessor uses.Example dataset provider plugin:
Pluggy references:
Replaces #145