Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt to skip saved query processing when no semantic manifest changes #10784

Merged
merged 3 commits into from
Sep 26, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions .changes/unreleased/Fixes-20240926-101220.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
kind: Fixes
body: Attempt to skip saved query processing when no semantic manifest changes
time: 2024-09-26T10:12:20.193453-04:00
custom:
Author: gshank
Issue: "10563"
1 change: 1 addition & 0 deletions core/dbt/contracts/graph/nodes.py
Original file line number Diff line number Diff line change
Expand Up @@ -1693,6 +1693,7 @@ class ParsedSingularTestPatch(ParsedPatch):

TestNode = Union[SingularTestNode, GenericTestNode]

SemanticManifestNode = Union[SavedQuery, SemanticModel, Metric]

RESOURCE_CLASS_TO_NODE_CLASS: Dict[Type[BaseResource], Type[BaseNode]] = {
node_class.resource_class(): node_class
Expand Down
18 changes: 18 additions & 0 deletions core/dbt/parser/manifest.py
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@
ResultNode,
SavedQuery,
SeedNode,
SemanticManifestNode,
SemanticModel,
SourceDefinition,
)
Expand Down Expand Up @@ -1141,6 +1142,23 @@ def process_metrics(self, config: RuntimeConfig):

def process_saved_queries(self, config: RuntimeConfig):
"""Processes SavedQuery nodes to populate their `depends_on`."""
# Note: This will also capture various nodes which have been re-parsed
# because they refer to some other changed node, so there will be
# false positives. Ideally we would compare actual changes.
semantic_manifest_changed = False
semantic_manifest_nodes: chain[SemanticManifestNode] = chain(
self.manifest.saved_queries.values(),
self.manifest.semantic_models.values(),
self.manifest.metrics.values(),
)
for node in semantic_manifest_nodes:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This can be simplified to

if any(node.created_at > self.started_at for node in semantic_manifest_nodes):
    return

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic of "any" is backward, we want to execute the process metrics for node code instead of returning, but doing it that way would make merging into dbt-mantle more muddy because it executes a different function, so I'm going to leave this as it is.

# Check if this node has been modified in this parsing run
if node.created_at > self.started_at:
semantic_manifest_changed = True
break # as soon as we run into one changed node we can stop
if semantic_manifest_changed is False:
return

current_project = config.project_name
for saved_query in self.manifest.saved_queries.values():
# TODO:
Expand Down
Loading