Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplfy compiler patterns #5425

Merged
merged 2 commits into from
Nov 3, 2024
Merged

simplfy compiler patterns #5425

merged 2 commits into from
Nov 3, 2024

Conversation

mccanne
Copy link
Collaborator

@mccanne mccanne commented Nov 3, 2024

This commit simplifies the compiler so the code paths are easier to understand and the API is easier to use. compiler.Job goes away as does the variations implementation of the runtime.Compiler. runtime.Compiler now just has one query method for both local and lake cases. We also moved compiler/data.Source to runtime/exec.Environment as that much more accurately potrays the role of this struct as representing the execution environment (lake or local) of the query. Down the road, we may want to store config options here (e.g., parallelism, strict compilation, etc). We also added the flag -files to "super compile" to ask it to compile the query assuming there are command-line input files.

This commit simplifies the compiler so the code paths are easier to
understand and the API is easier to use.  compiler.Job goes away
as does the variations implementation of the runtime.Compiler.
runtime.Compiler now just has one query method for both local and
lake cases.  We also moved compiler/data.Source to
runtime/exec.Environment as that much more accurately potrays
the role of this struct as representing the execution environment
(lake or local) of the query.  Down the road, we may want to
store config options here (e.g., parallelism, strict compilation,
etc).  We also added the flag -files to "super compile" to ask it
to compile the query assuming there are command-line input files.
@mccanne mccanne requested a review from a team November 3, 2024 02:33
@@ -62,11 +61,16 @@ func (s *Shared) Run(ctx context.Context, args []string, lakeFlags *lakeflags.Fl
if len(args) == 1 {
query = args[0]
}
ast, err := parser.ParseQuery(query, s.includes...)
rctx := runtime.DefaultContext()
env := exec.NewEnvironment(nil, lk)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Declare this closer to its use.

Comment on lines 28 to 29
func Analyze(rctx *runtime.Context, ast *parser.AST, env *exec.Environment, extInput bool) (dag.Seq, error) {
return semantic.Analyze(rctx.Context, ast, env, extInput)
Copy link
Member

@nwt nwt Nov 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
func Analyze(rctx *runtime.Context, ast *parser.AST, env *exec.Environment, extInput bool) (dag.Seq, error) {
return semantic.Analyze(rctx.Context, ast, env, extInput)
func Analyze(ctx context.Context, ast *parser.AST, env *exec.Environment, extInput bool) (dag.Seq, error) {
return semantic.Analyze(ctx, ast, env, extInput)

Comment on lines 32 to 35
func Optimize(rctx *runtime.Context, seq dag.Seq, env *exec.Environment, parallel int) (dag.Seq, error) {
// Call optimize to possible push down a filter predicate into the
// kernel.Reader so that the zng scanner can do boyer-moore.
o := optimizer.New(rctx.Context, env)
Copy link
Member

@nwt nwt Nov 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
func Optimize(rctx *runtime.Context, seq dag.Seq, env *exec.Environment, parallel int) (dag.Seq, error) {
// Call optimize to possible push down a filter predicate into the
// kernel.Reader so that the zng scanner can do boyer-moore.
o := optimizer.New(rctx.Context, env)
func Optimize(ctx context.Context, seq dag.Seq, env *exec.Environment, parallel int) (dag.Seq, error) {
// Call optimize to possible push down a filter predicate into the
// kernel.Reader so that the zng scanner can do boyer-moore.
o := optimizer.New(ctx, env)

Comment on lines 36 to 37
var err error
seq, err = o.Optimize(seq)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
var err error
seq, err = o.Optimize(seq)
seq, err := o.Optimize(seq)

if err != nil {
return nil, err
}
seq, err = o.Vectorize(seq) // XXX why is this here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how we engage the vector runtime on lake queries when CSUP is available.

)

// Analyze performs a semantic analysis of the AST, translating it from AST
// to DAG form, resolving syntax ambiguities, and performing constant propagation.
// After semantic analysis, the DAG is ready for either optimization or compilation.
func Analyze(ctx context.Context, ast *parser.AST, source *data.Source, extInput bool) (dag.Seq, error) {
func Analyze(ctx context.Context, ast *parser.AST, source *exec.Environment, extInput bool) (dag.Seq, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit:

Suggested change
func Analyze(ctx context.Context, ast *parser.AST, source *exec.Environment, extInput bool) (dag.Seq, error) {
func Analyze(ctx context.Context, ast *parser.AST, env *exec.Environment, extInput bool) (dag.Seq, error) {

engine: engine,
lake: lake,
}
}

func (s *Source) IsLake() bool {
func (s *Environment) IsLake() bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Update receiver name on all these methods.

Suggested change
func (s *Environment) IsLake() bool {
func (e *Environment) IsLake() bool {

zed_test.go Outdated
@@ -143,7 +144,7 @@ func runOneBoomerang(t *testing.T, format, data string) {
ast, err := parser.ParseQuery("fuse")
require.NoError(t, err)
rctx := runtime.NewContext(context.Background(), zctx)
q, err := compiler.NewCompiler().NewQuery(rctx, ast, []zio.Reader{dataReadCloser})
q, err := compiler.NewCompiler(storage.NewLocalEngine()).NewQuery(rctx, ast, []zio.Reader{dataReadCloser}, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This path doesn't use the engine.

Suggested change
q, err := compiler.NewCompiler(storage.NewLocalEngine()).NewQuery(rctx, ast, []zio.Reader{dataReadCloser}, 0)
q, err := compiler.NewCompiler(nil).NewQuery(rctx, ast, []zio.Reader{dataReadCloser}, 0)

ztest/ztest.go Outdated
@@ -523,7 +524,7 @@ func runzq(path, zedProgram, input string, outputFlags []string, inputFlags []st
if err != nil {
return "", err
}
q, err := runtime.CompileQuery(context.Background(), zctx, compiler.NewCompiler(), ast, []zio.Reader{zrc})
q, err := runtime.CompileQuery(context.Background(), zctx, compiler.NewCompiler(storage.NewLocalEngine()), ast, []zio.Reader{zrc})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This path doesn't use the engine.

Suggested change
q, err := runtime.CompileQuery(context.Background(), zctx, compiler.NewCompiler(storage.NewLocalEngine()), ast, []zio.Reader{zrc})
q, err := runtime.CompileQuery(context.Background(), zctx, compiler.NewCompiler(nil), ast, []zio.Reader{zrc})

@mccanne mccanne merged commit ca0cad8 into main Nov 3, 2024
3 checks passed
@mccanne mccanne deleted the simplify-compiler branch November 3, 2024 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants