Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter addFiles between hooks #370

Open
wants to merge 2 commits into
base: v3
Choose a base branch
from

Conversation

ilessiivi
Copy link

  • Makes addFiles call files-added hook with a file list filtered by every individual file-added hook
  • Adds a test to make sure that files-added hook won't receive files that file-added has deleted
  • Cleanup: removed the intermediary state objects and array, reducing the times the file list is looped through from 3 to 2
  • Code styling: unify variable names (flowfiles -> flowFiles, e -> flowFile)

@ilessiivi ilessiivi changed the title Always filter addFiles Filter addFiles between hooks Oct 11, 2021
* Make addFiles call files-added hook with a file list filtered by every individual file-added like in v2 and old synchronous v3 functions
* Cleanup: removed the intermediary state objects and array
* Cleanup: rename flowfiles array to flowFiles
@ilessiivi ilessiivi temporarily deployed to open-env October 11, 2021 16:31 Inactive
var flowfiles = await Promise.all(states);
for (let ff of flowfiles) {
await this.hook('file-added', ff, event);
if(flowFile && flowFile.file) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting is off, space should be placed after an if

Copy link
Collaborator

@drzraf drzraf Oct 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've a problem with this. When multiple files are added this is expected to be a (mostly) independent process for each of them.
Accumulating the (async) bootstrap (and possibly async initFileFn) then waiting for all of them at once is superior as it allows for parallel non-blocking initialization processes to occurs (FlowFile.bootstrap()).
Then, only file-added hooks were run sequentially (awaiting in a loop for each of them).

With this change, we'd now wait twice in a loop. It's inefficient and contrary to parallelization abilities offered by Promises.

(While talking about this, would you mind having a look at #368?)

it('should validate file-added filtering before files-added', async function() {
var valid = false;
flow.on('file-added', (flowFile) => {
if(flowFile.name === 'f2') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting is off, space should be placed after an if

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants