Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLI: Collections support #839

Merged
merged 24 commits into from
Dec 11, 2024
Merged

CLI: Collections support #839

merged 24 commits into from
Dec 11, 2024

Conversation

josephjclark
Copy link
Collaborator

@josephjclark josephjclark commented Dec 6, 2024

Short Description

Adds collections support to the CLI

Closes #819

Docs are in OpenFn/docs#613

Usage Examples

openfn collections get my-collection \*

openfn collections get my-collection some-key

openfn collections get my-collection some-key --token $MY_OPENFN_PAT

openfn collections set my-collection some-key path/to/value.json

openfn collections set my-collection --items path/to/many-values.json

openfn collections remove my-collection 2024*

openfn collections remove my-collection 2024* --dry-run

Note: escaping * in the CLI is super important!

Implementation Details

The most important things to note are these:

  • OPENFN_PAT is used by default to load the user's PAT
  • Otherwise, pass --token with a different PAT
  • Set is deliberately split into "set one" and "set many", with the --items flag. It's a little cumbersome but designed, believe it or not, to reduce confusion.
  • If getting some items, you should be able to directly set them from the result. So the formats have to be compatible
  • Uploaded values are always loaded from a file
  • If getting or setting a single item (no pattern), that item will be uploaded and downloaded verbatim, as a value. This works GREAT for mappings, which is the allow cron #1 use-case for CLI IMO
  • If getting or setting multiple items, we use a JSON obejct to capture key/value pairs (not the [{ key, value }] format used on the wire)
  • remove --dry-run is cool, but crudely implemented. Would be improved by Collections: allow me to get keys only (or support HEAD with a count header) lightning#2758
  • Gets will fetch all items in batches (controlled by --page-size), unless --limit is passed

TODO

  • Error handling
  • Add a suite of unit tests
  • Support created-since etc flags
  • Proper support for limit

Other commands like keys and count will be implemented much later, when we have support form the backend

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

  • Code generation (copilot but not intellisense)
  • Learning or fact checking
  • Strategy / design
  • Optimisation / refactoring
  • Translation / spellchecking / doc gen
  • Other
  • I have not used AI

You can read more details in our Responsible AI Policy

@josephjclark
Copy link
Collaborator Author

Maybe get collection-name should get everything if you don't pass a key name? That dodges around the * problem a bit

@josephjclark
Copy link
Collaborator Author

The API is very strongly built around JSON at the moment - all values are assume to JSON.

That's probably fine, but shouldn't we also be able to handle non-json values? I think so? We do say the collection stores string values.

It's probably reasonable to try and parse all value as JSON, and if they fail, just return the verbatim value.

@josephjclark josephjclark marked this pull request as ready for review December 9, 2024 09:37
@doc-han
Copy link
Collaborator

doc-han commented Dec 9, 2024

The API is very strongly built around JSON at the moment - all values are assume to JSON.

That's probably fine, but shouldn't we also be able to handle non-json values? I think so? We do say the collection stores string values.

It's probably reasonable to try and parse all value as JSON, and if they fail, just return the verbatim value.

Hmmm, for non-json values(binary data like images, ...) it would be a bit tricky to just encode them and store in the same place.

@doc-han
Copy link
Collaborator

doc-han commented Dec 9, 2024

Maybe get collection-name should get everything if you don't pass a key name? That dodges around the * problem a bit

I personally side with returning nothing when nothing is specified. give exactly the data they asked for. so if they asked for nothing, we give nothing.

No unit tests becuase we don't have mock support yet
@josephjclark
Copy link
Collaborator Author

@doc-han Well 90% of our use-cases are going to be JSON based. I'm not sure why anyone would be saving binary data on our system - I'm not even sure why you'd store an arbitrary string really.

I might raise a wontfix issue to add support for --json (default true) and --no-json flags. This would give users and override and force JSON parsing off.

Meanwhile I also ought to add a catch for when JSON parsing fails, and we'll fallback to the raw value (perhaps with a warning)

Regarding the key thing, I'll leave it as it is now: a required field

Thanks for the input!

@josephjclark josephjclark merged commit 535da76 into main Dec 11, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

CLI: Support collections
2 participants