Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Import post-processing #420

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

sh0rez
Copy link
Contributor

@sh0rez sh0rez commented Jun 4, 2020

Adds a second interface for importing, ImportProcessor. If set on a
VM (non-nil), it is asked to perhaps convert the Importer result before
importCache stores it.

This can for example be used to enhance what import is capable of, so
that it also reads .yaml files, etc.

This is a feature we once had in Tanka, but had to disable as we were
not able to differentiate between import and importstr based on the
Importer interface. This change allows us to do that, as
ImportProcessor is only ever considered for import.

grafana/tanka#135 (comment)

Adds a second `interface` for importing, `ImportProcessor`. If set on a
VM (non-nil), it is asked to perhaps convert the Importer result before
importCache stores it.

This can for example be used to enhance what `import` is capable of, so
that it also reads `.yaml` files, etc.

This is a feature we once had in Tanka, but had to disable as we were
not able to differentiate between `import` and `importstr` based on the
`Importer` interface. This change allows us to do that, as
`ImportProcessor` is only ever considered for `import`.
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 77.785% when pulling 4677e66 on sh0rez:import-process into d1c1457 on google:master.

2 similar comments
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 77.785% when pulling 4677e66 on sh0rez:import-process into d1c1457 on google:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 77.785% when pulling 4677e66 on sh0rez:import-process into d1c1457 on google:master.

@sh0rez
Copy link
Contributor Author

sh0rez commented Jun 9, 2020

@sbarzowski could you give this a look?

@sbarzowski
Copy link
Collaborator

I looked at it before, but I'm not sure how to go about it. I see the point, but I'm also worried about fragmentation with tools offering incompatible post-processing. Moreover, any solution will need to be ported to C++ implementation and supported by C and Python bindings. I need more time to think about this.

@sh0rez
Copy link
Contributor Author

sh0rez commented Jun 9, 2020

worried about fragmentation with tools offering incompatible post-processing

Thought about this as well, this is why it's done as an non-implemented interface{}, so that other downstream Jsonnet tools could experiment with what's needed and in the long run combine those findings into a reference implementation like jsonnet.FileImporter is one.

@sh0rez
Copy link
Contributor Author

sh0rez commented Aug 15, 2020

@sbarzowski for context, this was intended to be used to solve grafana/tanka#177.

I'm happy to use a totally different approach if it makes sense.

Maybe even just bring .yml support to Jsonnet when it comes to importing? Given it seems to gain YAML features anyways (#339), this wouldn't seem too out of scope anymore

@sbarzowski
Copy link
Collaborator

sbarzowski commented Aug 15, 2020

I was thinking a lot about this since you opened the PR.

Whatever we decide to do, we want to bring the "post-processing" capabilities to stdlib. We don't want to put a user in a situation when they can do some pure processing only during the import. So std.parseYaml is top priority for all solutions.

I see a few ways forward. First is to push std.parseYaml and piping syntax. This would allow the following code:

local myYaml = importstr "foo.yaml" |> std.parseYaml;

This is IMO a bit more readable than what we can do without piping syntax:

local myYaml = std.parseYaml(importstr "foo.yaml");

It's still much more typing than being able to just do the magic import, but the "noise" is at the end of the line, so it doesn't matter much IMO. This solution doesn't require any work downstream (maybe besides some documentation). All of the upstream work was planned anyway, so in a sense it doesn't add any work upstream either. It also generalizes nicely to other formats (e.g. ini, toml), including user-specified ones and doesn't require automagic recognition of file extensions.

The second way it to just add importjson and importyaml which are equivalent to std.parseXXX(importstr 'yyy.xxx'). Technically it's a breaking change, but I don't think these are common identifiers. It doesn't generalize at all to other formats (we won't add a new keyword for every imaginable format :-)). It requires more typing than automagic solution, but less than any explicit parsing. It might be a good pragmatic choice and it's certainly less work than the latter two.

The third way is to extend the importer to report a type-of-file to some well-known list (we can start with jsonnet, json, yaml). Then we can automatically parse it on the Jsonnet side. There is some downstream work to report these types with custom importers. May require some ugliness to avoid breaking changes. Still, changing importers might become nontrivial. A big advantage of this approach is that naively importing JSON files is no longer a potential security issue.

A fourth way is to do the postprocessing, but require the postprocessing to be a Jsonnet program. This way the path to change the importer is clear (if not necessarily easy). It is quite similar to the previous solution, but there's no global set of recognized files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants