Feat: support emit on window close for window join and interval join #18445

Open
Tracked by #8348
chenzl25 opened this issue Sep 6, 2024 · 0 comments
chenzl25 commented Sep 6, 2024

Is your feature request related to a problem? Please describe.

Currently, interval joins and window joins emit rows on every update. We could support an emit-on-window-close mode for them, as aggregation already does.

CREATE SOURCE s1 (
 id int,
 value int,
 ts TIMESTAMP,
 WATERMARK FOR ts AS ts - INTERVAL '20' SECOND
) WITH (
    connector = 'datagen',
    fields.ts.kind = 'random',
    fields.ts.max_past = '20 seconds',
    fields.ts.max_past_mode = 'relative',
    datagen.rows.per.second='1',
);

CREATE SOURCE s2 (
 id int,
 value int,
 ts TIMESTAMP,
 WATERMARK FOR ts AS ts - INTERVAL '20' SECOND
) WITH (
    connector = 'datagen',
    fields.ts.kind = 'random',
    fields.ts.max_past = '20 seconds',
    fields.ts.max_past_mode = 'relative',
    datagen.rows.per.second='1',
);
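
For comparison, the aggregation case that this request refers to looks roughly like the following. This is a minimal sketch assuming the `s1` source defined above; the view name `agg_eowc` and the alias `cnt` are hypothetical and only for illustration.

```sql
-- Hypothetical view name; counts the rows of s1 per one-minute tumbling window.
-- With EMIT ON WINDOW CLOSE, a window's result is emitted once, after the
-- watermark on ts has passed the end of that window, not on every update.
CREATE MATERIALIZED VIEW agg_eowc AS
SELECT window_start, COUNT(*) AS cnt
FROM TUMBLE(s1, ts, INTERVAL '1' MINUTE)
GROUP BY window_start
EMIT ON WINDOW CLOSE;
```

The two joins below use the same watermarked sources but are currently rejected.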

Window join:

dev=> explain CREATE MATERIALIZED VIEW mv1 AS
SELECT s1.window_start as window_start, s1.id as id, s1.value AS value1, s2.value AS value2
FROM tumble(s1, s1.ts, interval '1 minute'),
     tumble(s2, s2.ts, interval '1 minute')
WHERE s1.window_start = s2.window_start and s1.id = s2.id
emit on window close;
NOTICE:  EMIT ON WINDOW CLOSE is currently an experimental feature. Please use it with caution.
ERROR:  Failed to run the query

Caused by:
  Not supported: The query cannot be executed in Emit-On-Window-Close mode.
HINT: Try define a watermark column in the source, or avoid aggregation without GROUP BY

Interval join:

dev=> CREATE MATERIALIZED VIEW interval_join AS
SELECT s1.id AS id1,
       s1.value AS value1,
       s2.id AS id2,
       s2.value AS value2
FROM s1 JOIN s2
ON s1.id = s2.id and s1.ts between s2.ts and s2.ts + INTERVAL '1' MINUTE
EMIT ON WINDOW CLOSE;
NOTICE:  EMIT ON WINDOW CLOSE is currently an experimental feature. Please use it with caution.
ERROR:  Failed to run the query

Caused by:
  Not supported: The query cannot be executed in Emit-On-Window-Close mode.
HINT: Try define a watermark column in the source, or avoid aggregation without GROUP BY

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

@github-actions github-actions bot added this to the release-2.1 milestone Sep 6, 2024
@fuyufjh fuyufjh modified the milestones: release-2.1, release-2.2 Oct 17, 2024
@st1page st1page self-assigned this Oct 21, 2024