MRG: add plugin support for new command-line subcommands #2438

63 commits merged on Mar 1, 2023

Commits
9f5dc06
generalize index loading, try supporting plugins?
ctb Dec 31, 2022
69448b9
some cleanup
ctb Dec 31, 2022
32d853f
add -d/--debug to sig cat
ctb Dec 31, 2022
af67315
refactor overengineered signature output mechanisms
ctb Dec 31, 2022
caa48ec
add save_to plugin mechanism
ctb Dec 31, 2022
eea5546
fix sort issue
ctb Jan 2, 2023
7634a7c
beginnings of tests
ctb Jan 2, 2023
8653d8a
a test, a test, my kingdom for a test
ctb Jan 3, 2023
e8eb329
test the save functionality too
ctb Jan 3, 2023
9d388c9
add some docs
ctb Jan 3, 2023
95179eb
tests of priority
ctb Jan 3, 2023
44228a2
refactor save/load stuff into new sourmash.save_load module
ctb Jan 4, 2023
c059b7a
fix module reference
ctb Jan 4, 2023
1f78d9d
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Jan 4, 2023
96325ee
add custom exception IndexNotLoaded
ctb Jan 4, 2023
b58dc9a
cleanup and refactor
ctb Jan 4, 2023
82faf82
minor cleanup
ctb Jan 4, 2023
4330bdb
plugins
ctb Jan 4, 2023
9c54a46
upd docs
ctb Jan 4, 2023
27e85c2
more doc
ctb Jan 5, 2023
9f48a66
more plugin docs
ctb Jan 5, 2023
f4a18c1
fix docs
ctb Jan 5, 2023
c670c68
add plugin hook for CLI plugins
ctb Jan 6, 2023
f169421
switch up mechanisms
ctb Jan 7, 2023
ececc0f
factor out common code; cache
ctb Jan 7, 2023
7e5af49
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Jan 7, 2023
bc61586
add some minimal docs
ctb Jan 7, 2023
bcd8a41
list all plugins sourmash info -v
ctb Jan 7, 2023
a6caafc
update for no plugins detected
ctb Jan 7, 2023
524ebca
fix sourmash info -v
ctb Jan 8, 2023
9c05ee8
regularize scripts as a module
ctb Jan 8, 2023
77ab5c0
fix space
ctb Jan 8, 2023
ae33ee3
first set of tests for command line stuff
ctb Jan 8, 2023
1fd6fc0
fix exit test
ctb Jan 8, 2023
ae3493d
force a 'raise' on error exit
ctb Jan 8, 2023
268100c
test multiple commands
ctb Jan 8, 2023
6e75d62
simplify? test stuff
ctb Jan 8, 2023
d08ac16
test the getattr
ctb Jan 9, 2023
495944f
improve coverage
ctb Jan 9, 2023
b3846b6
remove unnecessary comment ;)
ctb Jan 9, 2023
ac308a9
cleanup
ctb Jan 10, 2023
f888463
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Jan 15, 2023
d3ad9c6
update tests to improve coverage
ctb Jan 15, 2023
e60b9db
provide a base CLI class with some useful features.
ctb Jan 15, 2023
853de7a
cleanup & testing
ctb Jan 16, 2023
e74073f
more tests
ctb Jan 16, 2023
e405436
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Jan 17, 2023
037efd2
minor cleanup
ctb Jan 18, 2023
697c597
commentary
ctb Jan 23, 2023
87d202b
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Feb 7, 2023
29cd697
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Feb 20, 2023
04759ff
update docstring
ctb Feb 20, 2023
1d53f25
add error message
ctb Feb 20, 2023
3a820c7
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Feb 20, 2023
b13d7d8
minor edits/cleanup
ctb Feb 20, 2023
b28e890
put in some more checks
ctb Feb 20, 2023
206b20f
test failed import code
ctb Feb 20, 2023
afaa6ee
minor update of wording
ctb Feb 21, 2023
72e9e9c
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Feb 21, 2023
635c8f7
Merge branch 'latest' of https://github.com/sourmash-bio/sourmash int…
ctb Feb 21, 2023
7fa809e
update text
ctb Mar 1, 2023
6b034ae
describe priorities better
ctb Mar 1, 2023
0813a01
add 'ext' as an alias for 'scripts'
ctb Mar 1, 2023
16 changes: 16 additions & 0 deletions doc/command-line.md
@@ -1966,3 +1966,19 @@ situations where you have a **very large** collection of signatures
in the collection (as you would have to, with a zipfile). This can be
useful if you want to refer to different subsets of the collection
without making multiple copies in a zip file.

### Using sourmash plugins

As of sourmash v4.7.0, sourmash has an experimental plugins interface!
The plugin interface supports extending sourmash to load and save
signatures in new ways, and also supports the addition of sourmash
subcommands via `sourmash scripts`.

In order to use a plugin with sourmash, you will need to use `pip`
or `conda` to install the plugin in the same environment that sourmash
is installed in.

In the future, we will include a list of available sourmash plugins in
the documentation, and also provide a way to list available plugins.

You can list all installed plugins with `sourmash info -v`.
38 changes: 29 additions & 9 deletions doc/dev_plugins.md
@@ -2,7 +2,8 @@

As of version 4.7.0, sourmash has experimental support for Python
plugins to load and save signatures in different ways (e.g. file
formats, RPC servers, databases, etc.). This support is provided via
formats, RPC servers, databases, etc.) and to run additional commands
via the command-line. This support is provided via
the "entry points" mechanism supplied by
[`importlib.metadata`](https://docs.python.org/3/library/importlib.metadata.html)
and documented
@@ -24,23 +25,38 @@ a_reader = "module_name:load_sketches"

[project.entry-points."sourmash.save_to"]
a_writer = "module_name:SaveSignatures_WriteFile"

[project.entry-points."sourmash.cli_script"]
new_cli = "module_name:Command_NewCommand"
```

Here, `module_name` should be the name of the module to import.
`load_sketches` should be a function that takes a location along with

* `load_sketches` should be a function that takes a location along with
arbitrary keyword arguments and returns an `Index` object
(e.g. `LinearIndex` for a collection of in-memory
signatures). `SaveSignatures_WriteFile` should be a class that
signatures).
* `SaveSignatures_WriteFile` should be a class that
subclasses `BaseSave_SignaturesToLocation` and implements its own
mechanisms of saving signatures. See the `sourmash.save_load` module
for saving and loading code already used in sourmash.

Note that if the function or class has a `priority` attribute, this will
be used to determine the order in which the plugins are called.

The `name` attribute of the plugin (`a_reader` and `a_writer` in
* `Command_NewCommand` should be a class that subclasses
`plugins.CommandLinePlugin` and provides an `__init__` and
`main` method.

Note that if the reader function or writer class has a `priority`
attribute, this will be used to determine the order in which the
plugins are called. Priorities lower than 10 will get called before
any internal load or save function, while priorities greater than 80
will get called after almost all internal load/save functions; see
`src/sourmash/save_load.py` for details and the current priorities.
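
The priority ordering described above can be sketched as follows. This is a minimal illustration rather than sourmash's actual dispatch code: `load_plugin_early`, `load_builtin`, `load_plugin_late`, and `try_loaders` are hypothetical names, the builtin priority of 50 is an assumed value, and "return `None` to decline a location" is an assumed convention. Only the default of 99 comes from `DEFAULT_LOAD_FROM_PRIORITY` in `src/sourmash/plugins.py`.

```python
DEFAULT_LOAD_FROM_PRIORITY = 99  # default used in src/sourmash/plugins.py


def load_plugin_early(location, **kwargs):
    "Hypothetical plugin loader that only handles '.demo' locations."
    if location.endswith(".demo"):
        return f"early:{location}"
    return None  # assumed convention: None means "not handled here"


load_plugin_early.priority = 5  # lower than 10: tried before internal loaders


def load_builtin(location, **kwargs):
    "Hypothetical stand-in for an internal sourmash loader."
    return f"builtin:{location}"


load_builtin.priority = 50  # assumed value, for illustration only


def load_plugin_late(location, **kwargs):
    "Hypothetical plugin loader with no 'priority' attribute (defaults to 99)."
    return f"late:{location}"


def try_loaders(location, loaders):
    "Try loaders in ascending priority order and return the first result."
    def priority_of(fn):
        return getattr(fn, "priority", DEFAULT_LOAD_FROM_PRIORITY)

    for fn in sorted(loaders, key=priority_of):
        result = fn(location)
        if result is not None:
            return result
    return None


loaders = [load_plugin_late, load_builtin, load_plugin_early]
print(try_loaders("sketches.demo", loaders))  # early:sketches.demo
print(try_loaders("sketches.sig", loaders))   # builtin:sketches.sig
```

A plugin that should only act as a fallback can simply omit `priority`, which leaves it at the default of 99.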

The `name` attribute of the plugin (`a_reader`, `a_writer`, and `new_cli` in
`pyproject.toml`, above) is only used in debugging.

You can provide zero or more plugins, and you can define just a reader, or
just a writer, or just a CLI plugin.
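
As a rough sketch of what a CLI plugin class might look like: the `CommandLinePlugin` class below is a local stand-in that mirrors the base class added in `src/sourmash/plugins.py`, so that the example runs without sourmash installed. A real plugin would subclass `sourmash.plugins.CommandLinePlugin` instead. The command name `newcmd`, its description, and the `name` argument are hypothetical.

```python
import argparse


class CommandLinePlugin:
    "Local stand-in for sourmash.plugins.CommandLinePlugin (adds -q and -d)."
    command = None
    description = None

    def __init__(self, parser):
        parser.add_argument("-q", "--quiet", action="store_true")
        parser.add_argument("-d", "--debug", action="store_true")

    def main(self, args):
        # The real base class calls set_quiet(args.quiet, args.debug) here.
        pass


class Command_NewCommand(CommandLinePlugin):
    command = "newcmd"                # hypothetical subcommand name
    description = "print a greeting"  # hypothetical description

    def __init__(self, parser):
        super().__init__(parser)
        parser.add_argument("name")

    def main(self, args):
        super().main(args)
        if not args.quiet:
            print(f"hello, {args.name}")
        return 0


# Wire the plugin into a parser, roughly the way add_cli_scripts does.
parser = argparse.ArgumentParser(prog="sourmash scripts")
subparsers = parser.add_subparsers(dest="subcmd")
plugin = Command_NewCommand(subparsers.add_parser(Command_NewCommand.command))

args = parser.parse_args(["newcmd", "world"])
exit_code = plugin.main(args)
```

Setting `command` as a class attribute matters: a plugin class without it is skipped with an error message when the CLI plugins are collected.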

## Templates and examples

If you want to create your own plug-in, you can start with the
@@ -53,15 +69,19 @@ Some (early stage) plugins are also available as examples:

## Debugging plugins

`sourmash info -v` will list all installed plugins.

`sourmash sig cat <input sig> -o <output sig>` is a simple way to
invoke a `save_to` plugin. Use `-d` to turn on debugging output.

`sourmash sig describe <input location>` is a simple way to invoke
a `load_from` plugin. Use `-d` to turn on debugging output.

`sourmash scripts` will list available command-line plugins.
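
When a plugin does not show up in any of these listings, it can help to check whether its entry points are registered at all. The following sketch queries `importlib.metadata` directly for the three group names used by sourmash; `select_group` is a hypothetical helper, and the fallback branch assumes the older dict-style return value of `entry_points()` on Python versions before 3.10.

```python
from importlib.metadata import entry_points

GROUPS = ("sourmash.load_from", "sourmash.save_to", "sourmash.cli_script")


def select_group(group):
    "Return the entry points registered under 'group' as a list."
    try:
        # Python 3.10+ supports selection by keyword.
        return list(entry_points(group=group))
    except TypeError:
        # Older versions take no arguments and return a dict keyed by group.
        return list(entry_points().get(group, []))


for group in GROUPS:
    for ep in select_group(group):
        print(f"{group:<20s} {ep.name:<20s} {ep.value}")
```

If nothing is printed, no plugin entry points are visible in the current environment, which usually means the plugin was installed somewhere else.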

## Semantic versioning and listing sourmash as a dependency

Plugins should probably list sourmash as a dependency for installation.
Plugins should generally list sourmash as a dependency for installation.

Once plugins are officially supported by sourmash, the plugin API will
be under [semantic versioning constraints](https://semver.org/). That
11 changes: 9 additions & 2 deletions src/sourmash/__main__.py
@@ -1,7 +1,12 @@
import sourmash
"""
The main entry point for sourmash, defined in pyproject.toml.

Can also be executed as 'python -m sourmash'.
"""


def main(arglist=None):
import sourmash
args = sourmash.cli.get_parser().parse_args(arglist)
if hasattr(args, 'subcmd'):
mod = getattr(sourmash.cli, args.cmd)
@@ -10,7 +15,9 @@ def main(arglist=None):
else:
mod = getattr(sourmash.cli, args.cmd)
mainmethod = getattr(mod, 'main')
return mainmethod(args)

retval = mainmethod(args)
raise SystemExit(retval)


if __name__ == '__main__':
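
Note on the `raise SystemExit(retval)` change in this file: the sketch below shows how a return value from a subcommand's `main` becomes an exit code that callers (for example, tests) can catch and inspect. The `run`, `failing_main`, and `succeeding_main` names are hypothetical and only illustrate the pattern.

```python
def run(mainmethod, args):
    "Call a subcommand's main function and exit with its return value."
    retval = mainmethod(args)
    raise SystemExit(retval)


def failing_main(args):
    return 1      # hypothetical subcommand reporting failure


def succeeding_main(args):
    return None   # a main() that returns nothing exits with status 0


codes = {}
for fn in (failing_main, succeeding_main):
    try:
        run(fn, [])
    except SystemExit as exc:
        # exc.code holds whatever was passed to SystemExit.
        codes[fn.__name__] = exc.code

print(codes)
```

A `SystemExit` whose code is `None` is treated by the interpreter as a successful exit, so subcommands that return nothing keep working unchanged.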
10 changes: 8 additions & 2 deletions src/sourmash/cli/__init__.py
@@ -38,6 +38,7 @@
from . import sketch
from . import storage
from . import tax
from . import scripts


class SourmashParser(ArgumentParser):
@@ -98,20 +99,25 @@ def get_parser():
'sketch': 'Create signatures',
'sig': 'Manipulate signature files',
'storage': 'Operations on storage',
'scripts': "Plug-ins",
}
alias = {
"sig": "signature"
"sig": "signature",
"ext": "scripts",
}
expert = set(['categorize', 'import_csv', 'migrate', 'multigather', 'sbt_combine', 'watch'])

clidir = os.path.dirname(__file__)
basic_ops = utils.command_list(clidir)
user_ops = [op for op in basic_ops if op not in expert]

# provide a list of the basic operations - not expert, not submodules.
user_ops = [op for op in basic_ops if op not in expert and op not in module_descs]
usage = ' Basic operations\n'
for op in user_ops:
docstring = getattr(sys.modules[__name__], op).__doc__
helpstring = 'sourmash {op:s} --help'.format(op=op)
usage += ' {hs:25s} {ds:s}\n'.format(hs=helpstring, ds=docstring)
# next, all the subcommand ones - dive into subdirectories.
cmd_group_dirs = next(os.walk(clidir))[1]
cmd_group_dirs = filter(utils.opfilter, cmd_group_dirs)
cmd_group_dirs = sorted(cmd_group_dirs)
3 changes: 3 additions & 0 deletions src/sourmash/cli/info.py
@@ -4,6 +4,7 @@
import screed
import sourmash
from sourmash.logging import notify
from sourmash.plugins import list_all_plugins

def subparser(subparsers):
subparser = subparsers.add_parser('info')
@@ -26,6 +27,8 @@ def info(verbose=False):
notify(f'screed version {screed.__version__}')
notify(f'- loaded from path: {os.path.dirname(screed.__file__)}')

list_all_plugins()


def main(args):
info(verbose=args.verbose)
48 changes: 48 additions & 0 deletions src/sourmash/cli/scripts/__init__.py
@@ -0,0 +1,48 @@
"""Provide a mechanism to add CLI plugins to sourmash.

See https://sourmash.readthedocs.io/en/latest/dev_plugins.html for docs,
src/sourmash/plugins.py for core sourmash implementation code, and
https://github.com/sourmash-bio/sourmash_plugin_template for a template repo
for making new plugins.
"""

# CTB TODO:
# * provide suggestions for documentation & metadata for authors:
# * provide guidance on how to test your CLI plugin at the CLI
# (minimal testing regime: sourmash scripts, look for description etc.)

import argparse
import sourmash

# Here, we decorate this module with the various extension objects
# e.g. 'sourmash scripts foo' will look up attribute 'scripts.foo'
# and we will return the extension class object, which will then
# be run by sourmash.__main__. This dictionary is loaded below
# by sourmash.plugins.add_cli_scripts.
_extension_dict = {}

def __getattr__(name):
    if name in _extension_dict:
        return _extension_dict[name]
    raise AttributeError(name)

def subparser(subparsers):
    subparser = subparsers.add_parser('scripts',
                                      usage=argparse.SUPPRESS,
                                      formatter_class=argparse.RawDescriptionHelpFormatter,
                                      aliases=['ext'])

    # get individual help strings:
    descrs = list(sourmash.plugins.get_cli_scripts_descriptions())
    if descrs:
        description = "\n".join(descrs)
    else:
        description = "(No script plugins detected!)"

    s = subparser.add_subparsers(title="available plugin/extension commands",
                                 dest='subcmd',
                                 metavar='subcmd',
                                 help=argparse.SUPPRESS,
                                 description=description)

    _extension_dict.update(sourmash.plugins.add_cli_scripts(s))
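
The module-level `__getattr__` used in this file relies on PEP 562 (Python 3.7+): when normal attribute lookup on a module fails, Python calls the module's `__getattr__` with the attribute name. The sketch below reproduces the idea on a throwaway module object; `scripts_demo` and the `foo` entry are hypothetical stand-ins for `sourmash.cli.scripts` and a registered plugin command.

```python
import types

# Hypothetical stand-in for the sourmash.cli.scripts module.
scripts = types.ModuleType("scripts_demo")
_extension_dict = {}


def _module_getattr(name):
    "Module-level __getattr__ (PEP 562): called only when normal lookup fails."
    if name in _extension_dict:
        return _extension_dict[name]
    raise AttributeError(name)


# Placing the function in the module namespace under '__getattr__' enables it.
scripts.__getattr__ = _module_getattr

# Registering a command makes it reachable as an attribute of the module.
_extension_dict["foo"] = "plugin object for 'foo'"

print(getattr(scripts, "foo"))   # plugin object for 'foo'
print(hasattr(scripts, "bar"))   # False
```

This is what lets `sourmash.__main__` keep using a plain `getattr` on the CLI module to find the object to run, without knowing whether a command is built in or supplied by a plugin.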
124 changes: 120 additions & 4 deletions src/sourmash/plugins.py
@@ -4,7 +4,7 @@
Plugin entry point names:
* 'sourmash.load_from' - Index class loading.
* 'sourmash.save_to' - Signature saving.
* 'sourmash.picklist_filters' - extended Picklist functionality.
* 'sourmash.cli_script' - command-line extension.

CTB TODO:

@@ -15,7 +15,9 @@
DEFAULT_LOAD_FROM_PRIORITY = 99
DEFAULT_SAVE_TO_PRIORITY = 99

from .logging import debug_literal
import itertools

from .logging import (debug_literal, error, notify, set_quiet)

# cover for older versions of Python that don't support selection on load
# (the 'group=' below).
@@ -31,14 +33,23 @@
# load 'save_to' entry points as well.
_plugin_save_to = entry_points(group='sourmash.save_to')

# aaaaand CLI entry points:
_plugin_cli = entry_points(group='sourmash.cli_script')
_plugin_cli_once = False

###

def get_load_from_functions():
"Load the 'load_from' plugins and yield tuples (priority, name, fn)."
debug_literal(f"load_from plugins: {_plugin_load_from}")

# Load each plugin,
for plugin in _plugin_load_from:
loader_fn = plugin.load()
try:
loader_fn = plugin.load()
except (ModuleNotFoundError, AttributeError) as e:
debug_literal(f"plugins.load_from_functions: got error loading {plugin.name}: {str(e)}")
continue

# get 'priority' if it is available
priority = getattr(loader_fn, 'priority', DEFAULT_LOAD_FROM_PRIORITY)
@@ -55,7 +66,11 @@ def get_save_to_functions():

# Load each plugin,
for plugin in _plugin_save_to:
save_cls = plugin.load()
try:
save_cls = plugin.load()
except (ModuleNotFoundError, AttributeError) as e:
debug_literal(f"plugins.save_to_functions: got error loading {plugin.name}: {str(e)}")
continue

# get 'priority' if it is available
priority = getattr(save_cls, 'priority', DEFAULT_SAVE_TO_PRIORITY)
@@ -64,3 +79,104 @@ def get_save_to_functions():
name = plugin.name
debug_literal(f"plugins.save_to_functions: got '{name}', priority={priority}")
yield priority, save_cls


class CommandLinePlugin:
    """
    Provide some minimal common CLI functionality - -q and -d.

    Subclasses should call super().__init__(parser) and super().main(args).
    """
    command = None
    description = None

    def __init__(self, parser):
        parser.add_argument(
            '-q', '--quiet', action='store_true',
            help='suppress non-error output'
        )
        parser.add_argument(
            '-d', '--debug', action='store_true',
            help='provide debugging output'
        )

    def main(self, args):
        set_quiet(args.quiet, args.debug)


def get_cli_script_plugins():
    global _plugin_cli_once

    x = []
    for plugin in _plugin_cli:
        name = plugin.name
        mod = plugin.module
        try:
            script_cls = plugin.load()
        except (ModuleNotFoundError, AttributeError):
            if _plugin_cli_once is False:
                error(f"ERROR: cannot find or load module for cli_script plugin '{name}'")
            continue

        command = getattr(script_cls, 'command', None)
        if command is None:
            # print error message only once...
            if _plugin_cli_once is False:
                error(f"ERROR: no command provided by cli_script plugin '{name}' from {mod}; skipping")
        else:
            x.append(plugin)

    _plugin_cli_once = True
    return x


def get_cli_scripts_descriptions():
    "Build the descriptions for command-line plugins."
    for plugin in get_cli_script_plugins():
        name = plugin.name
        script_cls = plugin.load()

        command = getattr(script_cls, 'command')
        description = getattr(script_cls, 'description',
                              f"(no description provided by plugin '{name}')")
        yield f"sourmash scripts {command:16s} - {description}"


def add_cli_scripts(parser):
    "Configure parsing for command-line plugins."
    d = {}

    for plugin in get_cli_script_plugins():
        name = plugin.name
        script_cls = plugin.load()

        subparser = parser.add_parser(script_cls.command)
        debug_literal(f"cli_script plugin '{name}' adding command '{script_cls.command}'")
        obj = script_cls(subparser)
        d[script_cls.command] = obj

    return d


def list_all_plugins():
    "Print a table of all installed plugins, or a short note if there are none."
    plugins = itertools.chain(_plugin_load_from,
                              _plugin_save_to,
                              _plugin_cli)
    plugins = list(plugins)

    if not plugins:
        notify("\n(no plugins detected)\n")
        # nothing to list, so skip the (empty) table below.
        return

    notify("")
    notify("the following plugins are installed:")
    notify("")
    notify(f"{'plugin type':<20s} {'from python module':<30s} {'v':<5s} {'entry point name':<20s}")
    notify(f"{'-'*20} {'-'*30} {'-'*5} {'-'*20}")

    for plugin in plugins:
        name = plugin.name
        mod = plugin.module
        version = plugin.dist.version
        group = plugin.group

        notify(f"{group:<20s} {mod:<30s} {version:<5s} {name:<20s}")
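
The `try`/`except (ModuleNotFoundError, AttributeError)` guards added around `plugin.load()` in this file can be exercised without installing a broken plugin. The sketch below builds `EntryPoint` objects by hand; the entry point names and the `load_all` helper are hypothetical, and `json:dumps` is used only because it is always importable.

```python
from importlib.metadata import EntryPoint

# Hypothetical entry points; only the first refers to something importable.
candidates = [
    EntryPoint(name="good", value="json:dumps", group="sourmash.cli_script"),
    EntryPoint(name="no_module", value="no_such_module_xyz:thing",
               group="sourmash.cli_script"),
    EntryPoint(name="no_attr", value="json:no_such_attribute",
               group="sourmash.cli_script"),
]


def load_all(plugins):
    "Return (loaded, failed): loaded objects by name, and names that failed."
    loaded, failed = {}, []
    for plugin in plugins:
        try:
            loaded[plugin.name] = plugin.load()
        except (ModuleNotFoundError, AttributeError):
            # A missing module or missing attribute should not stop
            # the remaining plugins from loading.
            failed.append(plugin.name)
    return loaded, failed


loaded, failed = load_all(candidates)
print(sorted(loaded), failed)
```

A missing module raises `ModuleNotFoundError` and a missing object inside an importable module raises `AttributeError`, which is why both are caught.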