copy approved plans from 3.x to 3.5
andygrove committed Jun 18, 2024
1 parent 76a331f commit b822c71
Showing 270 changed files with 57,894 additions and 0 deletions.
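The files below are approved ("golden") plan files: the expected physical plans for the TPC-DS benchmark queries, copied so the plan-stability checks also cover Spark 3.5. Each query contributes a formatted explain plan and a simplified plan tree, like the pair shown below. As a minimal sketch (not part of this commit), output in the same "formatted" explain style can be produced as follows; the Comet plugin class and config key are assumptions about a typical Comet-enabled session, and only the TPC-DS tables need to be registered beforehand.

```scala
import org.apache.spark.sql.SparkSession

object ShowFormattedPlan {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("show-formatted-plan")
      .config("spark.plugins", "org.apache.spark.CometPlugin") // assumed Comet plugin class
      .config("spark.comet.enabled", "true")                   // assumed Comet enable flag
      .getOrCreate()

    // args(0): path to a TPC-DS query file, e.g. the query behind the plan below
    val querySql = scala.io.Source.fromFile(args(0)).mkString

    // Prints the numbered-operator layout seen in these files:
    // "== Physical Plan ==", "(1) Scan parquet ...", "(2) CometFilter", ...
    spark.sql(querySql).explain("formatted")
  }
}
```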
@@ -0,0 +1,287 @@
== Physical Plan ==
TakeOrderedAndProject (43)
+- * Project (42)
   +- * BroadcastHashJoin Inner BuildRight (41)
      :- * Project (36)
      :  +- * BroadcastHashJoin Inner BuildRight (35)
      :     :- * Project (29)
      :     :  +- * BroadcastHashJoin Inner BuildRight (28)
      :     :     :- * Filter (13)
      :     :     :  +- * HashAggregate (12)
      :     :     :     +- Exchange (11)
      :     :     :        +- * ColumnarToRow (10)
      :     :     :           +- CometHashAggregate (9)
      :     :     :              +- CometProject (8)
      :     :     :                 +- CometBroadcastHashJoin (7)
      :     :     :                    :- CometFilter (2)
      :     :     :                    :  +- CometScan parquet spark_catalog.default.store_returns (1)
      :     :     :                    +- CometBroadcastExchange (6)
      :     :     :                       +- CometProject (5)
      :     :     :                          +- CometFilter (4)
      :     :     :                             +- CometScan parquet spark_catalog.default.date_dim (3)
      :     :     +- BroadcastExchange (27)
      :     :        +- * Filter (26)
      :     :           +- * HashAggregate (25)
      :     :              +- Exchange (24)
      :     :                 +- * HashAggregate (23)
      :     :                    +- * HashAggregate (22)
      :     :                       +- Exchange (21)
      :     :                          +- * ColumnarToRow (20)
      :     :                             +- CometHashAggregate (19)
      :     :                                +- CometProject (18)
      :     :                                   +- CometBroadcastHashJoin (17)
      :     :                                      :- CometFilter (15)
      :     :                                      :  +- CometScan parquet spark_catalog.default.store_returns (14)
      :     :                                      +- ReusedExchange (16)
      :     +- BroadcastExchange (34)
      :        +- * ColumnarToRow (33)
      :           +- CometProject (32)
      :              +- CometFilter (31)
      :                 +- CometScan parquet spark_catalog.default.store (30)
      +- BroadcastExchange (40)
         +- * ColumnarToRow (39)
            +- CometFilter (38)
               +- CometScan parquet spark_catalog.default.customer (37)


(1) Scan parquet spark_catalog.default.store_returns
Output [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Batched: true
Location: InMemoryFileIndex []
PartitionFilters: [isnotnull(sr_returned_date_sk#4), dynamicpruningexpression(sr_returned_date_sk#4 IN dynamicpruning#5)]
PushedFilters: [IsNotNull(sr_store_sk), IsNotNull(sr_customer_sk)]
ReadSchema: struct<sr_customer_sk:int,sr_store_sk:int,sr_return_amt:decimal(7,2)>

(2) CometFilter
Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Condition : (isnotnull(sr_store_sk#2) AND isnotnull(sr_customer_sk#1))

(3) Scan parquet spark_catalog.default.date_dim
Output [2]: [d_date_sk#6, d_year#7]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>

(4) CometFilter
Input [2]: [d_date_sk#6, d_year#7]
Condition : ((isnotnull(d_year#7) AND (d_year#7 = 2000)) AND isnotnull(d_date_sk#6))

(5) CometProject
Input [2]: [d_date_sk#6, d_year#7]
Arguments: [d_date_sk#6], [d_date_sk#6]

(6) CometBroadcastExchange
Input [1]: [d_date_sk#6]
Arguments: [d_date_sk#6]

(7) CometBroadcastHashJoin
Left output [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Right output [1]: [d_date_sk#6]
Arguments: [sr_returned_date_sk#4], [d_date_sk#6], Inner, BuildRight

(8) CometProject
Input [5]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4, d_date_sk#6]
Arguments: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3], [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]

(9) CometHashAggregate
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#3))]

(10) ColumnarToRow [codegen id : 1]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]

(11) Exchange
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
Arguments: hashpartitioning(sr_customer_sk#1, sr_store_sk#2, 5), ENSURE_REQUIREMENTS, [plan_id=1]

(12) HashAggregate [codegen id : 7]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#8]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [sum(UnscaledValue(sr_return_amt#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#3))#9]
Results [3]: [sr_customer_sk#1 AS ctr_customer_sk#10, sr_store_sk#2 AS ctr_store_sk#11, MakeDecimal(sum(UnscaledValue(sr_return_amt#3))#9,17,2) AS ctr_total_return#12]

(13) Filter [codegen id : 7]
Input [3]: [ctr_customer_sk#10, ctr_store_sk#11, ctr_total_return#12]
Condition : isnotnull(ctr_total_return#12)

(14) Scan parquet spark_catalog.default.store_returns
Output [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Batched: true
Location: InMemoryFileIndex []
PartitionFilters: [isnotnull(sr_returned_date_sk#4), dynamicpruningexpression(sr_returned_date_sk#4 IN dynamicpruning#13)]
PushedFilters: [IsNotNull(sr_store_sk)]
ReadSchema: struct<sr_customer_sk:int,sr_store_sk:int,sr_return_amt:decimal(7,2)>

(15) CometFilter
Input [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Condition : isnotnull(sr_store_sk#2)

(16) ReusedExchange [Reuses operator id: 6]
Output [1]: [d_date_sk#6]

(17) CometBroadcastHashJoin
Left output [4]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4]
Right output [1]: [d_date_sk#6]
Arguments: [sr_returned_date_sk#4], [d_date_sk#6], Inner, BuildRight

(18) CometProject
Input [5]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3, sr_returned_date_sk#4, d_date_sk#6]
Arguments: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3], [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]

(19) CometHashAggregate
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sr_return_amt#3]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#3))]

(20) ColumnarToRow [codegen id : 2]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#14]

(21) Exchange
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#14]
Arguments: hashpartitioning(sr_customer_sk#1, sr_store_sk#2, 5), ENSURE_REQUIREMENTS, [plan_id=2]

(22) HashAggregate [codegen id : 3]
Input [3]: [sr_customer_sk#1, sr_store_sk#2, sum#14]
Keys [2]: [sr_customer_sk#1, sr_store_sk#2]
Functions [1]: [sum(UnscaledValue(sr_return_amt#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#3))#9]
Results [2]: [sr_store_sk#2 AS ctr_store_sk#11, MakeDecimal(sum(UnscaledValue(sr_return_amt#3))#9,17,2) AS ctr_total_return#12]

(23) HashAggregate [codegen id : 3]
Input [2]: [ctr_store_sk#11, ctr_total_return#12]
Keys [1]: [ctr_store_sk#11]
Functions [1]: [partial_avg(ctr_total_return#12)]
Aggregate Attributes [2]: [sum#15, count#16]
Results [3]: [ctr_store_sk#11, sum#17, count#18]

(24) Exchange
Input [3]: [ctr_store_sk#11, sum#17, count#18]
Arguments: hashpartitioning(ctr_store_sk#11, 5), ENSURE_REQUIREMENTS, [plan_id=3]

(25) HashAggregate [codegen id : 4]
Input [3]: [ctr_store_sk#11, sum#17, count#18]
Keys [1]: [ctr_store_sk#11]
Functions [1]: [avg(ctr_total_return#12)]
Aggregate Attributes [1]: [avg(ctr_total_return#12)#19]
Results [2]: [(avg(ctr_total_return#12)#19 * 1.2) AS (avg(ctr_total_return) * 1.2)#20, ctr_store_sk#11 AS ctr_store_sk#11#21]

(26) Filter [codegen id : 4]
Input [2]: [(avg(ctr_total_return) * 1.2)#20, ctr_store_sk#11#21]
Condition : isnotnull((avg(ctr_total_return) * 1.2)#20)

(27) BroadcastExchange
Input [2]: [(avg(ctr_total_return) * 1.2)#20, ctr_store_sk#11#21]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, true] as bigint)),false), [plan_id=4]

(28) BroadcastHashJoin [codegen id : 7]
Left keys [1]: [ctr_store_sk#11]
Right keys [1]: [ctr_store_sk#11#21]
Join type: Inner
Join condition: (cast(ctr_total_return#12 as decimal(24,7)) > (avg(ctr_total_return) * 1.2)#20)

(29) Project [codegen id : 7]
Output [2]: [ctr_customer_sk#10, ctr_store_sk#11]
Input [5]: [ctr_customer_sk#10, ctr_store_sk#11, ctr_total_return#12, (avg(ctr_total_return) * 1.2)#20, ctr_store_sk#11#21]

(30) Scan parquet spark_catalog.default.store
Output [2]: [s_store_sk#22, s_state#23]
Batched: true
Location [not included in comparison]/{warehouse_dir}/store]
PushedFilters: [IsNotNull(s_state), EqualTo(s_state,TN), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_state:string>

(31) CometFilter
Input [2]: [s_store_sk#22, s_state#23]
Condition : ((isnotnull(s_state#23) AND (s_state#23 = TN)) AND isnotnull(s_store_sk#22))

(32) CometProject
Input [2]: [s_store_sk#22, s_state#23]
Arguments: [s_store_sk#22], [s_store_sk#22]

(33) ColumnarToRow [codegen id : 5]
Input [1]: [s_store_sk#22]

(34) BroadcastExchange
Input [1]: [s_store_sk#22]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=5]

(35) BroadcastHashJoin [codegen id : 7]
Left keys [1]: [ctr_store_sk#11]
Right keys [1]: [s_store_sk#22]
Join type: Inner
Join condition: None

(36) Project [codegen id : 7]
Output [1]: [ctr_customer_sk#10]
Input [3]: [ctr_customer_sk#10, ctr_store_sk#11, s_store_sk#22]

(37) Scan parquet spark_catalog.default.customer
Output [2]: [c_customer_sk#24, c_customer_id#25]
Batched: true
Location [not included in comparison]/{warehouse_dir}/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string>

(38) CometFilter
Input [2]: [c_customer_sk#24, c_customer_id#25]
Condition : isnotnull(c_customer_sk#24)

(39) ColumnarToRow [codegen id : 6]
Input [2]: [c_customer_sk#24, c_customer_id#25]

(40) BroadcastExchange
Input [2]: [c_customer_sk#24, c_customer_id#25]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [plan_id=6]

(41) BroadcastHashJoin [codegen id : 7]
Left keys [1]: [ctr_customer_sk#10]
Right keys [1]: [c_customer_sk#24]
Join type: Inner
Join condition: None

(42) Project [codegen id : 7]
Output [1]: [c_customer_id#25]
Input [3]: [ctr_customer_sk#10, c_customer_sk#24, c_customer_id#25]

(43) TakeOrderedAndProject
Input [1]: [c_customer_id#25]
Arguments: 100, [c_customer_id#25 ASC NULLS FIRST], [c_customer_id#25]

===== Subqueries =====

Subquery:1 Hosting operator id = 1 Hosting Expression = sr_returned_date_sk#4 IN dynamicpruning#5
BroadcastExchange (48)
+- * ColumnarToRow (47)
   +- CometProject (46)
      +- CometFilter (45)
         +- CometScan parquet spark_catalog.default.date_dim (44)


(44) Scan parquet spark_catalog.default.date_dim
Output [2]: [d_date_sk#6, d_year#7]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>

(45) CometFilter
Input [2]: [d_date_sk#6, d_year#7]
Condition : ((isnotnull(d_year#7) AND (d_year#7 = 2000)) AND isnotnull(d_date_sk#6))

(46) CometProject
Input [2]: [d_date_sk#6, d_year#7]
Arguments: [d_date_sk#6], [d_date_sk#6]

(47) ColumnarToRow [codegen id : 1]
Input [1]: [d_date_sk#6]

(48) BroadcastExchange
Input [1]: [d_date_sk#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=7]

Subquery:2 Hosting operator id = 14 Hosting Expression = sr_returned_date_sk#4 IN dynamicpruning#5


@@ -0,0 +1,69 @@
TakeOrderedAndProject [c_customer_id]
  WholeStageCodegen (7)
    Project [c_customer_id]
      BroadcastHashJoin [ctr_customer_sk,c_customer_sk]
        Project [ctr_customer_sk]
          BroadcastHashJoin [ctr_store_sk,s_store_sk]
            Project [ctr_customer_sk,ctr_store_sk]
              BroadcastHashJoin [ctr_store_sk,ctr_store_sk,ctr_total_return,(avg(ctr_total_return) * 1.2)]
                Filter [ctr_total_return]
                  HashAggregate [sr_customer_sk,sr_store_sk,sum] [sum(UnscaledValue(sr_return_amt)),ctr_customer_sk,ctr_store_sk,ctr_total_return,sum]
                    InputAdapter
                      Exchange [sr_customer_sk,sr_store_sk] #1
                        WholeStageCodegen (1)
                          ColumnarToRow
                            InputAdapter
                              CometHashAggregate [sr_customer_sk,sr_store_sk,sr_return_amt]
                                CometProject [sr_customer_sk,sr_store_sk,sr_return_amt]
                                  CometBroadcastHashJoin [sr_returned_date_sk,d_date_sk]
                                    CometFilter [sr_store_sk,sr_customer_sk]
                                      CometScan parquet spark_catalog.default.store_returns [sr_customer_sk,sr_store_sk,sr_return_amt,sr_returned_date_sk]
                                        SubqueryBroadcast [d_date_sk] #1
                                          BroadcastExchange #2
                                            WholeStageCodegen (1)
                                              ColumnarToRow
                                                InputAdapter
                                                  CometProject [d_date_sk]
                                                    CometFilter [d_year,d_date_sk]
                                                      CometScan parquet spark_catalog.default.date_dim [d_date_sk,d_year]
                                    CometBroadcastExchange #3
                                      CometProject [d_date_sk]
                                        CometFilter [d_year,d_date_sk]
                                          CometScan parquet spark_catalog.default.date_dim [d_date_sk,d_year]
                InputAdapter
                  BroadcastExchange #4
                    WholeStageCodegen (4)
                      Filter [(avg(ctr_total_return) * 1.2)]
                        HashAggregate [ctr_store_sk,sum,count] [avg(ctr_total_return),(avg(ctr_total_return) * 1.2),ctr_store_sk,sum,count]
                          InputAdapter
                            Exchange [ctr_store_sk] #5
                              WholeStageCodegen (3)
                                HashAggregate [ctr_store_sk,ctr_total_return] [sum,count,sum,count]
                                  HashAggregate [sr_customer_sk,sr_store_sk,sum] [sum(UnscaledValue(sr_return_amt)),ctr_store_sk,ctr_total_return,sum]
                                    InputAdapter
                                      Exchange [sr_customer_sk,sr_store_sk] #6
                                        WholeStageCodegen (2)
                                          ColumnarToRow
                                            InputAdapter
                                              CometHashAggregate [sr_customer_sk,sr_store_sk,sr_return_amt]
                                                CometProject [sr_customer_sk,sr_store_sk,sr_return_amt]
                                                  CometBroadcastHashJoin [sr_returned_date_sk,d_date_sk]
                                                    CometFilter [sr_store_sk]
                                                      CometScan parquet spark_catalog.default.store_returns [sr_customer_sk,sr_store_sk,sr_return_amt,sr_returned_date_sk]
                                                        ReusedSubquery [d_date_sk] #1
                                                    ReusedExchange [d_date_sk] #3
              InputAdapter
                BroadcastExchange #7
                  WholeStageCodegen (5)
                    ColumnarToRow
                      InputAdapter
                        CometProject [s_store_sk]
                          CometFilter [s_state,s_store_sk]
                            CometScan parquet spark_catalog.default.store [s_store_sk,s_state]
            InputAdapter
              BroadcastExchange #8
                WholeStageCodegen (6)
                  ColumnarToRow
                    InputAdapter
                      CometFilter [c_customer_sk]
                        CometScan parquet spark_catalog.default.customer [c_customer_sk,c_customer_id]
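Approved plan files like the two above are typically consumed by a plan-stability test: the suite renders the plan for each query and diffs it against the checked-in file, so unintended plan changes fail the build. The sketch below is hypothetical and only illustrates that idea; the helper name and file paths are invented, and a real suite also normalizes expression IDs (#1, #2, ...) and file locations before comparing.

```scala
import java.nio.file.{Files, Paths}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.FormattedMode

object PlanStabilityCheck {
  // Compare the current formatted plan for `sql` against an approved plan file.
  def assertPlanMatchesApproved(spark: SparkSession, sql: String, approvedPath: String): Unit = {
    val actual   = spark.sql(sql).queryExecution.explainString(FormattedMode)
    val approved = new String(Files.readAllBytes(Paths.get(approvedPath)), "UTF-8")
    assert(actual.trim == approved.trim,
      s"Plan differs from $approvedPath; regenerate the approved plans if the change is intentional")
  }
}
```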