
Fix erroneous additional batch execution #11113

Merged

Conversation

Contributor

@QMalcolm QMalcolm commented Dec 9, 2024

Resolves #11112

Problem

Microbatch models that only ran one batch (either due to lookback=0 or to setting --event-time-start + --event-time-end) would raise an unhandled exception like the following:
[Screenshot: unhandled exception output, 2024-12-09 12:43]

Solution

Ensure we skip the last batch execution path when only one batch needs to be run in total for a microbatch model
[Screenshot: run output after the fix, 2024-12-09 13:50]
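
As a rough, self-contained illustration of the off-by-one (simplified and hypothetical; run_batches and its string results are stand-ins, not dbt-core's actual code): the run loop submits the first batch, then the middle batches, then a dedicated last batch. Without the new guard, a single-batch model would still reach that final submission and index past the end of the batch list.

# Hypothetical, simplified sketch of the batch-execution flow -- not dbt-core's
# actual implementation. It shows why a single-batch run needs the added guard.
from typing import List


def run_batches(batches: List[str]) -> List[str]:
    results: List[str] = []
    batch_idx = 0

    # First batch always runs on its own.
    results.append(f"ran {batches[batch_idx]}")
    batch_idx += 1

    # Middle batches: everything except the first and last.
    while batch_idx < len(batches) - 1:
        results.append(f"ran {batches[batch_idx]}")
        batch_idx += 1

    # The fix: only submit a dedicated "last" batch when more than one batch exists.
    # Without this guard, a single-batch run would evaluate batches[1]; in this
    # simplified sketch that surfaces as an IndexError (the real failure mode in
    # dbt-core may differ in detail).
    if len(batches) != 1:
        results.append(f"ran {batches[batch_idx]}")

    return results


print(run_batches(["batch-0"]))             # ['ran batch-0']
print(run_batches(["batch-0", "batch-1"]))  # ['ran batch-0', 'ran batch-1']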

Checklist

  • I have read the contributing guide and understand what's expected of me.
  • I have run this code in development, and it appears to resolve the stated issue.
  • This PR includes tests, or tests are not required or relevant for this PR.
  • This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
  • This PR includes type annotations for new and modified functions.

Previously, if there was only one batch, we would try to execute _two_
batches: the first batch, and a nonexistent "last" batch. This would
result in an unhandled exception.
@QMalcolm QMalcolm requested a review from a team as a code owner December 9, 2024 19:51
@cla-bot cla-bot bot added the cla:yes label Dec 9, 2024

codecov bot commented Dec 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.94%. Comparing base (03fdb4c) to head (5942aa5).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #11113      +/-   ##
==========================================
- Coverage   88.96%   88.94%   -0.03%     
==========================================
  Files         183      183              
  Lines       23933    23934       +1     
==========================================
- Hits        21291    21287       -4     
- Misses       2642     2647       +5     
Flag                Coverage Δ
integration         86.31% <100.00%> (-0.03%) ⬇️
unit                61.96% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

Components          Coverage Δ
Unit Tests          61.96% <0.00%> (-0.01%) ⬇️
Integration Tests   86.31% <100.00%> (-0.03%) ⬇️

@QMalcolm QMalcolm changed the title from "Qmalcolm 11112 fix erroneous additional batch execution" to "Fix erroneous additional batch execution" on Dec 9, 2024
Contributor

@ChenyuLInx ChenyuLInx left a comment


The change looks mostly good!

From a code-organization perspective, what are your thoughts on hiding all of this logic in the _submit_batch function? It seems like we also have some special logic to handle the last batch in that function. I think it's good to have all of this in a single logic layer rather than multiple. WDYT?

relation_exists = self._submit_batch(
    node=node,
    adapter=runner.adapter,
    relation_exists=relation_exists,
    batches=batches,
    batch_idx=batch_idx,
    batch_results=batch_results,
    pool=pool,
    force_sequential_run=True,
)
batch_idx += 1
skip_batches = batch_results[0].status != RunStatus.Success
# Run all batches except first and last batch, in parallel if possible
while batch_idx < len(runner.batches) - 1:
    relation_exists = self._submit_batch(
        node=node,
        adapter=runner.adapter,
        relation_exists=relation_exists,
        batches=batches,
        batch_idx=batch_idx,
        batch_results=batch_results,
        pool=pool,
        skip=skip_batches,
    )
    batch_idx += 1
# Wait until all submitted batches have completed
while len(batch_results) != batch_idx:
    pass
# Only run "last" batch if there is more than one batch
if len(batches) != 1:
    # Final batch runs once all others complete to ensure post_hook runs at the end
    self._submit_batch(
        node=node,
        adapter=runner.adapter,
        relation_exists=relation_exists,
        batches=batches,
        batch_idx=batch_idx,
        batch_results=batch_results,
        pool=pool,
        force_sequential_run=True,
        skip=skip_batches,
    )
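
For illustration only, one possible shape of this suggestion: a simplified, hypothetical stand-in helper (submit_batch below is not dbt-core's actual _submit_batch) that treats a "last" batch index past the end of the batch list as a no-op, so the single-batch guard no longer needs to live at the call site.

# Hypothetical sketch -- not dbt-core's actual implementation. It illustrates the
# idea of letting the submission helper handle the single-batch case itself, so
# the caller no longer needs the `if len(batches) != 1` guard.
from typing import Any, List


def submit_batch(batches: List[Any], batch_idx: int, batch_results: List[str]) -> None:
    if batch_idx >= len(batches):
        # Single-batch run: the only batch has already executed, so the trailing
        # "last batch" submission simply does nothing instead of indexing past
        # the end of `batches`.
        return
    batch_results.append(f"submitted {batches[batch_idx]}")


results: List[str] = []
submit_batch(["batch-0"], batch_idx=1, batch_results=results)  # no-op, no exception
print(results)  # []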

@QMalcolm
Contributor Author

QMalcolm commented Dec 9, 2024

> The change looks mostly good!
>
> From a code-organization perspective, what are your thoughts on hiding all of this logic in the _submit_batch function? It seems like we also have some special logic to handle the last batch in that function. I think it's good to have all of this in a single logic layer rather than multiple. WDYT?


@ChenyuLInx I'm down to refactor this code 🙂 I'd definitely like to simplify things here a fair bit. However, this change is going to need to be backported to 1.9.latest for 1.9.1, so I think a refactor here is out of scope; that kind of work should probably be forward-facing.

@ChenyuLInx
Contributor

My preference is to do the refactor as you change the code, if it is not too complex.

> going to need to be backported to 1.9.latest

Why does this change whether we should do the refactor?

@QMalcolm
Contributor Author

> My preference is to do the refactor as you change the code, if it is not too complex.
>
> > going to need to be backported to 1.9.latest
>
> Why does this change whether we should do the refactor?

My preference is to keep refactors as separate as possible from changes in logic. I've found it's desirable to not do a refactor as part of work which is being backported. This is because:

  1. The refactor is irrelevant to the fix, and thus doesn't need to be backported
  2. The more changes we introduce, the greater the likelihood of introducing a new edge case or bug

Now, I'm generally of the opinion that refactors should be separated from logical changes or bug fixes unless it is impossible to separate the two. I'm a little looser on that when a change is forward-looking only (i.e., solely going into an alpha/beta). With backports, though, I take it a bit more seriously.

If the worry is that the refactor won't get done unless it comes alongside the bug fix / logic change, then I'll promise to do the refactor as the next PR I open 🙂

@ChenyuLInx
Contributor

I am happy for this to go in as is, with a fast follow-up PR for the refactor.

One thing to note: if you don't do the refactor as part of this change and it gets backported, then when other fixes in this area need to go in, you will have to backport the refactor and then the fix (or write a custom fix).

@QMalcolm QMalcolm merged commit c9582c2 into main Dec 10, 2024
62 of 63 checks passed
@QMalcolm QMalcolm deleted the qmalcolm--11112-fix-erroneous-additional-batch-execution branch December 10, 2024 15:28
github-actions bot pushed a commit that referenced this pull request Dec 10, 2024
* Update single batch test case to check for generic exceptions

* Explicitly skip last final batch execution when there is only one batch

Previously, if there was only one batch, we would try to execute _two_
batches: the first batch, and a nonexistent "last" batch. This would
result in an unhandled exception.

* Changie doc

(cherry picked from commit c9582c2)
QMalcolm added a commit that referenced this pull request Dec 10, 2024
* Update single batch test case to check for generic exceptions

* Explicitly skip last final batch execution when there is only one batch

Previously, if there was only one batch, we would try to execute _two_
batches: the first batch, and a nonexistent "last" batch. This would
result in an unhandled exception.

* Changie doc

(cherry picked from commit c9582c2)

Co-authored-by: Quigley Malcolm <[email protected]>
Development

Successfully merging this pull request may close these issues.

[Bug] Microbatch models with only one batch raise unhandled error