chore(query): refactor decimal scalar functions #14057

Merged on Dec 19, 2023 (7 commits)

sundy-li
Member

@sundy-li sundy-li commented Dec 18, 2023

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

The registration code for the decimal scalar functions is messy; this PR cleans it up.

Fixes #[Link the issue here]

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-chore this PR only has small changes that no need to record, like coding styles. label Dec 18, 2023
@BohuTANG
Member

Better to add some unit tests.

@sundy-li sundy-li marked this pull request as draft December 18, 2023 08:19
@sundy-li sundy-li marked this pull request as ready for review December 18, 2023 13:29
@BohuTANG BohuTANG added the ci-cloud Build docker image for cloud test label Dec 18, 2023
Contributor

Docker Image for PR

  • tag: pr-14057-882f336

Note: this image tag is available for internal use only;
see the internal doc for more details.

@BohuTANG
Member

BohuTANG commented Dec 18, 2023

This PR passed the wizard test: 👍

python3 checksb.py --database selects --case selects --run-check-only
Preparing to run SELECT-BASE-1...
Executing command: bendsql --query=-- SELECT-BASE-1: Top 5 customers by total spending, including only active customers
SELECT c.customer_id, c.customer_name, SUM(s.net_paid) AS total_spent
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
WHERE c.active = TRUE
GROUP BY c.customer_id, c.customer_name
ORDER BY total_spent DESC
    LIMIT 5 -D selects
Command executed successfully. Output:
984624	Customer 984624	96.16
737912	Customer 737912	95.83
277109	Customer 277109	95.70
559180	Customer 559180	95.61
690478	Customer 690478	95.56

Executing command: snowsql --query -- SELECT-BASE-1: Top 5 customers by total spending, including only active customers
SELECT c.customer_id, c.customer_name, SUM(s.net_paid) AS total_spent
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
WHERE c.active = TRUE
GROUP BY c.customer_id, c.customer_name
ORDER BY total_spent DESC
    LIMIT 5 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
984624	Customer 984624	96.16
737912	Customer 737912	95.83
277109	Customer 277109	95.70
559180	Customer 559180	95.61
690478	Customer 690478	95.56

OK - SELECT-BASE-1
984624	Customer 984624	96.16
737912	Customer 737912	95.83
277109	Customer 277109	95.70
559180	Customer 559180	95.61
690478	Customer 690478	95.56

Preparing to run SELECT-BASE-2...
Executing command: bendsql --query=

-- SELECT-BASE-2: Total sales per category for the first quarter of 2021
SELECT p.category, SUM(s.net_paid) AS total_sales
FROM sales s
         JOIN products p ON s.product_id = p.product_id
WHERE s.sale_date BETWEEN '2021-01-01' AND '2021-03-31'
GROUP BY p.category
ORDER BY total_sales DESC -D selects
Command executed successfully. Output:
Furniture	5137120.60
Electronics	3093445.84
Clothing	2294013.57
Grocery	1734867.89

Executing command: snowsql --query

-- SELECT-BASE-2: Total sales per category for the first quarter of 2021
SELECT p.category, SUM(s.net_paid) AS total_sales
FROM sales s
         JOIN products p ON s.product_id = p.product_id
WHERE s.sale_date BETWEEN '2021-01-01' AND '2021-03-31'
GROUP BY p.category
ORDER BY total_sales DESC --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Furniture	5137120.60
Electronics	3093445.84
Clothing	2294013.57
Grocery	1734867.89

OK - SELECT-BASE-2
Furniture	5137120.60
Electronics	3093445.84
Clothing	2294013.57
Grocery	1734867.89

Preparing to run SELECT-BASE-3...
Executing command: bendsql --query=

-- SELECT-BASE-3: Average sale amount per customer segment
SELECT c.segment, TRUNCATE(AVG(s.net_paid), 7) AS avg_sale_amount
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
GROUP BY c.segment
ORDER BY avg_sale_amount DESC -D selects
Command executed successfully. Output:
Small	10.0056284
Medium	10.0006918
Large	9.9974745

Executing command: snowsql --query

-- SELECT-BASE-3: Average sale amount per customer segment
SELECT c.segment, TRUNCATE(AVG(s.net_paid), 7) AS avg_sale_amount
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
GROUP BY c.segment
ORDER BY avg_sale_amount DESC --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Small	10.0056284
Medium	10.0006918
Large	9.9974745

OK - SELECT-BASE-3
Small	10.0056284
Medium	10.0006918
Large	9.9974745

Preparing to run SELECT-BASE-5...
Executing command: bendsql --query=

-- SELECT-BASE-5: Sales trend by month in 2021
SELECT dd.month, SUM(s.net_paid) AS monthly_sales
FROM sales s
         JOIN date_dim dd ON s.sale_date = dd.date_key
WHERE dd.year = 2021
GROUP BY dd.month
ORDER BY dd.month -D selects
Command executed successfully. Output:
1	4088267.35
2	3813672.13
3	4357508.42
4	4341026.35
5	4556100.58
6	4026130.65
7	3934729.51
8	4068240.37
9	4053746.47
10	4327055.51
11	4312773.77
12	4127065.89

Executing command: snowsql --query

-- SELECT-BASE-5: Sales trend by month in 2021
SELECT dd.month, SUM(s.net_paid) AS monthly_sales
FROM sales s
         JOIN date_dim dd ON s.sale_date = dd.date_key
WHERE dd.year = 2021
GROUP BY dd.month
ORDER BY dd.month --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
1	4088267.35
2	3813672.13
3	4357508.42
4	4341026.35
5	4556100.58
6	4026130.65
7	3934729.51
8	4068240.37
9	4053746.47
10	4327055.51
11	4312773.77
12	4127065.89

OK - SELECT-BASE-5
1	4088267.35
2	3813672.13
3	4357508.42
4	4341026.35
5	4556100.58
6	4026130.65
7	3934729.51
8	4068240.37
9	4053746.47
10	4327055.51
11	4312773.77
12	4127065.89

Preparing to run SELECT-BASE-6...
Executing command: bendsql --query=

-- SELECT-BASE-6: Detailed view of customers with the highest number of transactions in 2021
SELECT
    c.customer_id,
    c.customer_name,
    c.segment,
    MIN(s.sale_date) AS first_transaction_date,
    MAX(s.sale_date) AS last_transaction_date,
    SUM(s.net_paid) AS total_spent,
    COUNT(*) AS transaction_count
FROM
    sales s
        JOIN
    customers c ON s.customer_id = c.customer_id
WHERE
        s.sale_date >= '2021-01-01' AND s.sale_date < '2022-01-01'
GROUP BY
    c.customer_id, c.customer_name, c.segment
ORDER BY
    transaction_count DESC, total_spent DESC, c.customer_id
    LIMIT 5 -D selects
Command executed successfully. Output:
984624	Customer 984624	Large	2021-04-24	2021-12-12	96.16	5
737912	Customer 737912	Large	2021-01-24	2021-11-29	95.83	5
277109	Customer 277109	Medium	2021-02-26	2021-10-28	95.70	5
559180	Customer 559180	Large	2021-03-24	2021-10-14	95.61	5
690478	Customer 690478	Medium	2021-04-06	2021-11-21	95.56	5

Executing command: snowsql --query

-- SELECT-BASE-6: Detailed view of customers with the highest number of transactions in 2021
SELECT
    c.customer_id,
    c.customer_name,
    c.segment,
    MIN(s.sale_date) AS first_transaction_date,
    MAX(s.sale_date) AS last_transaction_date,
    SUM(s.net_paid) AS total_spent,
    COUNT(*) AS transaction_count
FROM
    sales s
        JOIN
    customers c ON s.customer_id = c.customer_id
WHERE
        s.sale_date >= '2021-01-01' AND s.sale_date < '2022-01-01'
GROUP BY
    c.customer_id, c.customer_name, c.segment
ORDER BY
    transaction_count DESC, total_spent DESC, c.customer_id
    LIMIT 5 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
984624	Customer 984624	Large	2021-04-24	2021-12-12	96.16	5
737912	Customer 737912	Large	2021-01-24	2021-11-29	95.83	5
277109	Customer 277109	Medium	2021-02-26	2021-10-28	95.70	5
559180	Customer 559180	Large	2021-03-24	2021-10-14	95.61	5
690478	Customer 690478	Medium	2021-04-06	2021-11-21	95.56	5

OK - SELECT-BASE-6
984624	Customer 984624	Large	2021-04-24	2021-12-12	96.16	5
737912	Customer 737912	Large	2021-01-24	2021-11-29	95.83	5
277109	Customer 277109	Medium	2021-02-26	2021-10-28	95.70	5
559180	Customer 559180	Large	2021-03-24	2021-10-14	95.61	5
690478	Customer 690478	Medium	2021-04-06	2021-11-21	95.56	5

Preparing to run SELECT-BASE-7...
Executing command: bendsql --query=

-- SELECT-BASE-7: Average product price by category
SELECT p.category, TRUNCATE(AVG(p.price), 7) AS avg_price
FROM products p
GROUP BY p.category
ORDER BY avg_price DESC -D selects
Command executed successfully. Output:
Electronics	10.0503993
Clothing	10.0153535
Grocery	10.0033790
Furniture	9.9829794

Executing command: snowsql --query

-- SELECT-BASE-7: Average product price by category
SELECT p.category, TRUNCATE(AVG(p.price), 7) AS avg_price
FROM products p
GROUP BY p.category
ORDER BY avg_price DESC --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Electronics	10.0503993
Clothing	10.0153535
Grocery	10.0033790
Furniture	9.9829794

OK - SELECT-BASE-7
Electronics	10.0503993
Clothing	10.0153535
Grocery	10.0033790
Furniture	9.9829794

Preparing to run SELECT-BASE-8...
Executing command: bendsql --query=

-- SELECT-BASE-8: Customer ranking by total spending using window function
SELECT
    c.customer_id,
    c.customer_name,
    SUM(s.net_paid) AS total_spent,
    RANK() OVER (ORDER BY SUM(s.net_paid) DESC) AS spending_rank
FROM
    sales s
        JOIN
    customers c ON s.customer_id = c.customer_id
GROUP BY
    c.customer_id, c.customer_name
ORDER BY
    spending_rank
    LIMIT 10 -D selects
Command executed successfully. Output:
984624	Customer 984624	96.16	1
737912	Customer 737912	95.83	2
277109	Customer 277109	95.70	3
559180	Customer 559180	95.61	4
690478	Customer 690478	95.56	5
756159	Customer 756159	95.52	6
914163	Customer 914163	95.39	7
798797	Customer 798797	95.33	8
574783	Customer 574783	95.10	9
447017	Customer 447017	94.99	10

Executing command: snowsql --query

-- SELECT-BASE-8: Customer ranking by total spending using window function
SELECT
    c.customer_id,
    c.customer_name,
    SUM(s.net_paid) AS total_spent,
    RANK() OVER (ORDER BY SUM(s.net_paid) DESC) AS spending_rank
FROM
    sales s
        JOIN
    customers c ON s.customer_id = c.customer_id
GROUP BY
    c.customer_id, c.customer_name
ORDER BY
    spending_rank
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
984624	Customer 984624	96.16	1
737912	Customer 737912	95.83	2
277109	Customer 277109	95.70	3
559180	Customer 559180	95.61	4
690478	Customer 690478	95.56	5
756159	Customer 756159	95.52	6
914163	Customer 914163	95.39	7
798797	Customer 798797	95.33	8
574783	Customer 574783	95.10	9
447017	Customer 447017	94.99	10

OK - SELECT-BASE-8
984624	Customer 984624	96.16	1
737912	Customer 737912	95.83	2
277109	Customer 277109	95.70	3
559180	Customer 559180	95.61	4
690478	Customer 690478	95.56	5
756159	Customer 756159	95.52	6
914163	Customer 914163	95.39	7
798797	Customer 798797	95.33	8
574783	Customer 574783	95.10	9
447017	Customer 447017	94.99	10

Preparing to run SELECT-BASE-9...
Executing command: bendsql --query=

-- SELECT-BASE-9: Count of active and inactive customers per segment
SELECT c.segment, SUM(CASE WHEN c.active = TRUE THEN 1 ELSE 0 END) AS active_customers,
       SUM(CASE WHEN c.active = FALSE THEN 1 ELSE 0 END) AS inactive_customers
FROM customers c
GROUP BY c.segment
ORDER BY c.segment -D selects
Command executed successfully. Output:
Large	166977	166292
Medium	166539	167209
Small	166748	166235

Executing command: snowsql --query

-- SELECT-BASE-9: Count of active and inactive customers per segment
SELECT c.segment, SUM(CASE WHEN c.active = TRUE THEN 1 ELSE 0 END) AS active_customers,
       SUM(CASE WHEN c.active = FALSE THEN 1 ELSE 0 END) AS inactive_customers
FROM customers c
GROUP BY c.segment
ORDER BY c.segment --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Large	166977	166292
Medium	166539	167209
Small	166748	166235

OK - SELECT-BASE-9
Large	166977	166292
Medium	166539	167209
Small	166748	166235

Preparing to run SELECT-BASE-10...
Executing command: bendsql --query=

-- SELECT-BASE-10: Total sales per day for the first week of 2021
SELECT dd.date_key, SUM(s.net_paid) AS daily_sales
FROM sales s
         JOIN date_dim dd ON s.sale_date = dd.date_key
WHERE dd.date_key BETWEEN '2021-01-01' AND '2021-01-07'
GROUP BY dd.date_key
ORDER BY dd.date_key -D selects
Command executed successfully. Output:
2021-01-01	131088.23
2021-01-02	130964.05
2021-01-03	129599.82
2021-01-04	130219.86
2021-01-05	130719.25
2021-01-06	132098.51
2021-01-07	130679.53

Executing command: snowsql --query

-- SELECT-BASE-10: Total sales per day for the first week of 2021
SELECT dd.date_key, SUM(s.net_paid) AS daily_sales
FROM sales s
         JOIN date_dim dd ON s.sale_date = dd.date_key
WHERE dd.date_key BETWEEN '2021-01-01' AND '2021-01-07'
GROUP BY dd.date_key
ORDER BY dd.date_key --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
2021-01-01	131088.23
2021-01-02	130964.05
2021-01-03	129599.82
2021-01-04	130219.86
2021-01-05	130719.25
2021-01-06	132098.51
2021-01-07	130679.53

OK - SELECT-BASE-10
2021-01-01	131088.23
2021-01-02	130964.05
2021-01-03	129599.82
2021-01-04	130219.86
2021-01-05	130719.25
2021-01-06	132098.51
2021-01-07	130679.53

Preparing to run SELECT-BASE-11...
Executing command: bendsql --query=

-- SELECT-BASE-11: Top 5 customers with the least spending in Electronics
SELECT c.customer_id, c.customer_name, SUM(s.net_paid) AS total_spent
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
         JOIN products p ON s.product_id = p.product_id
WHERE p.category = 'Electronics'
GROUP BY c.customer_id, c.customer_name
ORDER BY total_spent ASC
    LIMIT 5 -D selects
Command executed successfully. Output:
114117	Customer 114117	3.35
274146	Customer 274146	4.55
637130	Customer 637130	4.69
806431	Customer 806431	4.87
998302	Customer 998302	5.09

Executing command: snowsql --query

-- SELECT-BASE-11: Top 5 customers with the least spending in Electronics
SELECT c.customer_id, c.customer_name, SUM(s.net_paid) AS total_spent
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
         JOIN products p ON s.product_id = p.product_id
WHERE p.category = 'Electronics'
GROUP BY c.customer_id, c.customer_name
ORDER BY total_spent ASC
    LIMIT 5 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
114117	Customer 114117	3.35
274146	Customer 274146	4.55
637130	Customer 637130	4.69
806431	Customer 806431	4.87
998302	Customer 998302	5.09

OK - SELECT-BASE-11
114117	Customer 114117	3.35
274146	Customer 274146	4.55
637130	Customer 637130	4.69
806431	Customer 806431	4.87
998302	Customer 998302	5.09

Preparing to run SELECT-BASE-12...
Executing command: bendsql --query=

-- SELECT-BASE-12: Number of products sold per category
SELECT p.category, COUNT(*) AS products_sold
FROM sales s
         JOIN products p ON s.product_id = p.product_id
GROUP BY p.category
ORDER BY products_sold DESC -D selects
Command executed successfully. Output:
Furniture	2092350
Electronics	1263400
Clothing	936950
Grocery	707300

Executing command: snowsql --query

-- SELECT-BASE-12: Number of products sold per category
SELECT p.category, COUNT(*) AS products_sold
FROM sales s
         JOIN products p ON s.product_id = p.product_id
GROUP BY p.category
ORDER BY products_sold DESC --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Furniture	2092350
Electronics	1263400
Clothing	936950
Grocery	707300

OK - SELECT-BASE-12
Furniture	2092350
Electronics	1263400
Clothing	936950
Grocery	707300

Preparing to run SELECT-BASE-13...
Executing command: bendsql --query=

-- SELECT-BASE-13: Total sales and average quantity sold per product
SELECT p.product_id, p.product_name, SUM(s.net_paid) AS total_sales, AVG(s.quantity) AS avg_quantity_sold
FROM sales s
         JOIN products p ON s.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_sales DESC
    LIMIT 10 -D selects
Command executed successfully. Output:
82031	Product 82031	672.61	10.22
82500	Product 82500	665.14	10.88
86712	Product 86712	663.19	11.78
62901	Product 62901	660.80	11
44761	Product 44761	657.28	10.76
31449	Product 31449	654.57	10.94
73600	Product 73600	653.89	11.18
89327	Product 89327	653.71	9.62
81897	Product 81897	653.59	11.38
22793	Product 22793	651.41	8.62

Executing command: snowsql --query

-- SELECT-BASE-13: Total sales and average quantity sold per product
SELECT p.product_id, p.product_name, SUM(s.net_paid) AS total_sales, AVG(s.quantity) AS avg_quantity_sold
FROM sales s
         JOIN products p ON s.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_sales DESC
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
82031	Product 82031	672.61	10.220000
82500	Product 82500	665.14	10.880000
86712	Product 86712	663.19	11.780000
62901	Product 62901	660.80	11.000000
44761	Product 44761	657.28	10.760000
31449	Product 31449	654.57	10.940000
73600	Product 73600	653.89	11.180000
89327	Product 89327	653.71	9.620000
81897	Product 81897	653.59	11.380000
22793	Product 22793	651.41	8.620000

DIFFERENCE FOUND

SELECT-BASE-13:


-- SELECT-BASE-13: Total sales and average quantity sold per product
SELECT p.product_id, p.product_name, SUM(s.net_paid) AS total_sales, AVG(s.quantity) AS avg_quantity_sold
FROM sales s
         JOIN products p ON s.product_id = p.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_sales DESC
    LIMIT 10
Differences:

bendsql:
82031	Product 82031	672.61	10.22
82500	Product 82500	665.14	10.88
86712	Product 86712	663.19	11.78
62901	Product 62901	660.80	11
44761	Product 44761	657.28	10.76
31449	Product 31449	654.57	10.94
73600	Product 73600	653.89	11.18
89327	Product 89327	653.71	9.62
81897	Product 81897	653.59	11.38
22793	Product 22793	651.41	8.62

snowsql:
82031	Product 82031	672.61	10.220000
82500	Product 82500	665.14	10.880000
86712	Product 86712	663.19	11.780000
62901	Product 62901	660.80	11.000000
44761	Product 44761	657.28	10.760000
31449	Product 31449	654.57	10.940000
73600	Product 73600	653.89	11.180000
89327	Product 89327	653.71	9.620000
81897	Product 81897	653.59	11.380000
22793	Product 22793	651.41	8.620000
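
The two outputs reported for SELECT-BASE-13 above differ only in the trailing zeros printed for `AVG(s.quantity)` (for example `10.22` vs `10.220000`). A minimal sketch of a comparison that ignores such scale differences, assuming tab-separated rows. `normalize_field` and `rows_match` are hypothetical helpers for illustration; they are not taken from `checksb.py`, whose actual comparison logic is not shown in this log.

```python
from decimal import Decimal, InvalidOperation


def normalize_field(field: str) -> str:
    """Return a canonical form for numeric fields; leave other text untouched."""
    try:
        value = Decimal(field)
    except InvalidOperation:
        # Not a number (e.g. "Product 82031" or "2021-04-24"): keep as-is.
        return field
    if not value.is_finite():
        # NaN / Infinity parse as Decimal but have no fixed-point form worth normalizing.
        return field
    # Strip trailing zeros so 10.220000 and 10.22 compare equal,
    # then format in fixed-point to avoid exponent notation such as 1E+2.
    return format(value.normalize(), "f")


def rows_match(left: str, right: str) -> bool:
    """Compare two tab-separated rows after normalizing each field."""
    left_fields = [normalize_field(f) for f in left.split("\t")]
    right_fields = [normalize_field(f) for f in right.split("\t")]
    return left_fields == right_fields
```

With this kind of normalization, a row such as `62901	Product 62901	660.80	11` would match `62901	Product 62901	660.80	11.000000`, while a genuine value difference would still be reported. Whether the scale difference itself should count as a mismatch (it may reflect different result types for `AVG` in the two systems) is a separate decision for the test harness.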

Preparing to run SELECT-BASE-14...
Executing command: bendsql --query=

-- SELECT-BASE-14: Customers with the most transactions, top 10, with stable results
SELECT
    c.customer_id,
    c.customer_name,
    COUNT(*) AS transaction_count
FROM
    sales s
        JOIN
    customers c ON s.customer_id = c.customer_id
GROUP BY
    c.customer_id, c.customer_name
ORDER BY
    transaction_count DESC, c.customer_id
    LIMIT 10 -D selects
Command executed successfully. Output:
0	Customer 0	5
1	Customer 1	5
2	Customer 2	5
3	Customer 3	5
4	Customer 4	5
5	Customer 5	5
6	Customer 6	5
7	Customer 7	5
8	Customer 8	5
9	Customer 9	5

Executing command: snowsql --query

-- SELECT-BASE-14: Customers with the most transactions, top 10, with stable results
SELECT
    c.customer_id,
    c.customer_name,
    COUNT(*) AS transaction_count
FROM
    sales s
        JOIN
    customers c ON s.customer_id = c.customer_id
GROUP BY
    c.customer_id, c.customer_name
ORDER BY
    transaction_count DESC, c.customer_id
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0	Customer 0	5
1	Customer 1	5
2	Customer 2	5
3	Customer 3	5
4	Customer 4	5
5	Customer 5	5
6	Customer 6	5
7	Customer 7	5
8	Customer 8	5
9	Customer 9	5

OK - SELECT-BASE-14
0	Customer 0	5
1	Customer 1	5
2	Customer 2	5
3	Customer 3	5
4	Customer 4	5
5	Customer 5	5
6	Customer 6	5
7	Customer 7	5
8	Customer 8	5
9	Customer 9	5

Preparing to run SELECT-BASE-15...
Executing command: bendsql --query=

-- SELECT-BASE-15: Sales comparison between Small and Large segment customers
SELECT c.segment, SUM(s.net_paid) AS total_sales
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
WHERE c.segment IN ('Small', 'Large')
GROUP BY c.segment
ORDER BY c.segment -D selects
Command executed successfully. Output:
Large	16659241.69
Small	16658520.85

Executing command: snowsql --query

-- SELECT-BASE-15: Sales comparison between Small and Large segment customers
SELECT c.segment, SUM(s.net_paid) AS total_sales
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
WHERE c.segment IN ('Small', 'Large')
GROUP BY c.segment
ORDER BY c.segment --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Large	16659241.69
Small	16658520.85

OK - SELECT-BASE-15
Large	16659241.69
Small	16658520.85

Preparing to run SELECT-BASE-16...
Executing command: bendsql --query=

-- SELECT-BASE-16: Customers' last purchase date and the total number of purchases, with stable results
WITH CustomerPurchases AS (
    SELECT
        c.customer_id,
        c.customer_name,
        MAX(s.sale_date) AS last_purchase_date,
        COUNT(*) AS total_purchases
    FROM
        sales s
            JOIN
        customers c ON s.customer_id = c.customer_id
    GROUP BY
        c.customer_id, c.customer_name
)
SELECT *
FROM CustomerPurchases
ORDER BY
    last_purchase_date DESC, total_purchases DESC, customer_id
    LIMIT 10 -D selects
Command executed successfully. Output:
41	Customer 41	2021-12-31	5
56	Customer 56	2021-12-31	5
242	Customer 242	2021-12-31	5
278	Customer 278	2021-12-31	5
303	Customer 303	2021-12-31	5
460	Customer 460	2021-12-31	5
476	Customer 476	2021-12-31	5
581	Customer 581	2021-12-31	5
587	Customer 587	2021-12-31	5
724	Customer 724	2021-12-31	5

Executing command: snowsql --query

-- SELECT-BASE-16: Customers' last purchase date and the total number of purchases, with stable results
WITH CustomerPurchases AS (
    SELECT
        c.customer_id,
        c.customer_name,
        MAX(s.sale_date) AS last_purchase_date,
        COUNT(*) AS total_purchases
    FROM
        sales s
            JOIN
        customers c ON s.customer_id = c.customer_id
    GROUP BY
        c.customer_id, c.customer_name
)
SELECT *
FROM CustomerPurchases
ORDER BY
    last_purchase_date DESC, total_purchases DESC, customer_id
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
41	Customer 41	2021-12-31	5
56	Customer 56	2021-12-31	5
242	Customer 242	2021-12-31	5
278	Customer 278	2021-12-31	5
303	Customer 303	2021-12-31	5
460	Customer 460	2021-12-31	5
476	Customer 476	2021-12-31	5
581	Customer 581	2021-12-31	5
587	Customer 587	2021-12-31	5
724	Customer 724	2021-12-31	5

OK - SELECT-BASE-16
41	Customer 41	2021-12-31	5
56	Customer 56	2021-12-31	5
242	Customer 242	2021-12-31	5
278	Customer 278	2021-12-31	5
303	Customer 303	2021-12-31	5
460	Customer 460	2021-12-31	5
476	Customer 476	2021-12-31	5
581	Customer 581	2021-12-31	5
587	Customer 587	2021-12-31	5
724	Customer 724	2021-12-31	5

Preparing to run SELECT-BASE-18...
Executing command: bendsql --query=

-- SELECT-BASE-18: Customers with average spending higher than the overall average
WITH AverageSpending AS (
    SELECT
        AVG(s.net_paid) AS avg_spending
    FROM
        sales s
)
SELECT
    c.customer_id,
    c.customer_name,
    AVG(s.net_paid) AS customer_avg_spending
FROM
    sales s
        JOIN
    customers c ON s.customer_id = c.customer_id
GROUP BY
    c.customer_id, c.customer_name
HAVING
        AVG(s.net_paid) > (SELECT avg_spending FROM AverageSpending)
ORDER BY
    customer_avg_spending DESC
    LIMIT 10 -D selects
Command executed successfully. Output:
984624	Customer 984624	19.23200000
737912	Customer 737912	19.16600000
277109	Customer 277109	19.14000000
559180	Customer 559180	19.12200000
690478	Customer 690478	19.11200000
756159	Customer 756159	19.10400000
914163	Customer 914163	19.07800000
798797	Customer 798797	19.06600000
574783	Customer 574783	19.02000000
447017	Customer 447017	18.99800000

Executing command: snowsql --query

-- SELECT-BASE-18: Customers with average spending higher than the overall average
WITH AverageSpending AS (
    SELECT
        AVG(s.net_paid) AS avg_spending
    FROM
        sales s
)
SELECT
    c.customer_id,
    c.customer_name,
    AVG(s.net_paid) AS customer_avg_spending
FROM
    sales s
        JOIN
    customers c ON s.customer_id = c.customer_id
GROUP BY
    c.customer_id, c.customer_name
HAVING
        AVG(s.net_paid) > (SELECT avg_spending FROM AverageSpending)
ORDER BY
    customer_avg_spending DESC
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
984624	Customer 984624	19.23200000
737912	Customer 737912	19.16600000
277109	Customer 277109	19.14000000
559180	Customer 559180	19.12200000
690478	Customer 690478	19.11200000
756159	Customer 756159	19.10400000
914163	Customer 914163	19.07800000
798797	Customer 798797	19.06600000
574783	Customer 574783	19.02000000
447017	Customer 447017	18.99800000

OK - SELECT-BASE-18
984624	Customer 984624	19.23200000
737912	Customer 737912	19.16600000
277109	Customer 277109	19.14000000
559180	Customer 559180	19.12200000
690478	Customer 690478	19.11200000
756159	Customer 756159	19.10400000
914163	Customer 914163	19.07800000
798797	Customer 798797	19.06600000
574783	Customer 574783	19.02000000
447017	Customer 447017	18.99800000

Preparing to run SELECT-BASE-19...
Executing command: bendsql --query=

-- SELECT-BASE-19: Total sales per category in each year
SELECT
    p.category,
    dd.year,
    SUM(s.net_paid) AS total_sales
FROM
    sales s
        JOIN
    products p ON s.product_id = p.product_id
        JOIN
    date_dim dd ON s.sale_date = dd.date_key
GROUP BY
    p.category, dd.year
ORDER BY
    p.category, dd.year -D selects
Command executed successfully. Output:
Clothing	2021	9365802.78
Electronics	2021	12621986.37
Furniture	2021	20938846.82
Grocery	2021	7079681.03

Executing command: snowsql --query

-- SELECT-BASE-19: Total sales per category in each year
SELECT
    p.category,
    dd.year,
    SUM(s.net_paid) AS total_sales
FROM
    sales s
        JOIN
    products p ON s.product_id = p.product_id
        JOIN
    date_dim dd ON s.sale_date = dd.date_key
GROUP BY
    p.category, dd.year
ORDER BY
    p.category, dd.year --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Clothing	2021	9365802.78
Electronics	2021	12621986.37
Furniture	2021	20938846.82
Grocery	2021	7079681.03

OK - SELECT-BASE-19
Clothing	2021	9365802.78
Electronics	2021	12621986.37
Furniture	2021	20938846.82
Grocery	2021	7079681.03

Preparing to run SELECT-BASE-21...
Executing command: bendsql --query=

-- SELECT-BASE-21: Products with sales above average in their category
WITH CategoryAverage AS (
    SELECT
        p.category,
        AVG(s.net_paid) AS avg_sales
    FROM
        sales s
            JOIN
        products p ON s.product_id = p.product_id
    GROUP BY
        p.category
)
SELECT
    p.product_id,
    p.product_name,
    p.category,
    SUM(s.net_paid) AS total_sales,
    TRUNCATE(ca.avg_sales, 7)
FROM
    sales s
        JOIN
    products p ON s.product_id = p.product_id
        JOIN
    CategoryAverage ca ON p.category = ca.category
GROUP BY
    p.product_id, p.product_name, p.category, ca.avg_sales
HAVING
        SUM(s.net_paid) > ca.avg_sales
ORDER BY
    p.product_id, total_sales DESC LIMIT 10 -D selects
Command executed successfully. Output:
0	Product 0	Grocery	464.31	10.0094458
1	Product 1	Furniture	537.36	10.0073347
2	Product 2	Electronics	493.15	9.9904910
3	Product 3	Furniture	458.03	10.0073347
4	Product 4	Grocery	515.89	10.0094458
5	Product 5	Furniture	489.04	10.0073347
6	Product 6	Furniture	503.13	10.0073347
7	Product 7	Furniture	485.33	10.0073347
8	Product 8	Clothing	543.00	9.9960539
9	Product 9	Clothing	490.70	9.9960539

Executing command: snowsql --query

-- SELECT-BASE-21: Products with sales above average in their category
WITH CategoryAverage AS (
    SELECT
        p.category,
        AVG(s.net_paid) AS avg_sales
    FROM
        sales s
            JOIN
        products p ON s.product_id = p.product_id
    GROUP BY
        p.category
)
SELECT
    p.product_id,
    p.product_name,
    p.category,
    SUM(s.net_paid) AS total_sales,
    TRUNCATE(ca.avg_sales, 7)
FROM
    sales s
        JOIN
    products p ON s.product_id = p.product_id
        JOIN
    CategoryAverage ca ON p.category = ca.category
GROUP BY
    p.product_id, p.product_name, p.category, ca.avg_sales
HAVING
        SUM(s.net_paid) > ca.avg_sales
ORDER BY
    p.product_id, total_sales DESC LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0	Product 0	Grocery	464.31	10.0094458
1	Product 1	Furniture	537.36	10.0073347
2	Product 2	Electronics	493.15	9.9904910
3	Product 3	Furniture	458.03	10.0073347
4	Product 4	Grocery	515.89	10.0094458
5	Product 5	Furniture	489.04	10.0073347
6	Product 6	Furniture	503.13	10.0073347
7	Product 7	Furniture	485.33	10.0073347
8	Product 8	Clothing	543.00	9.9960539
9	Product 9	Clothing	490.70	9.9960539

OK - SELECT-BASE-21
0	Product 0	Grocery	464.31	10.0094458
1	Product 1	Furniture	537.36	10.0073347
2	Product 2	Electronics	493.15	9.9904910
3	Product 3	Furniture	458.03	10.0073347
4	Product 4	Grocery	515.89	10.0094458
5	Product 5	Furniture	489.04	10.0073347
6	Product 6	Furniture	503.13	10.0073347
7	Product 7	Furniture	485.33	10.0073347
8	Product 8	Clothing	543.00	9.9960539
9	Product 9	Clothing	490.70	9.9960539

Preparing to run SELECT-J01...
Executing command: bendsql --query=

-- SELECT-J01: LEFT JOIN with COUNT - Find top 3 customers with the least purchases
SELECT c.customer_id, c.customer_name, COALESCE(COUNT(s.sale_id), 0) AS purchase_count
FROM customers c
         LEFT JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.customer_id, c.customer_name
ORDER BY purchase_count ASC, c.customer_id ASC
    LIMIT 3 -D selects
Command executed successfully. Output:
0	Customer 0	5
1	Customer 1	5
2	Customer 2	5

Executing command: snowsql --query

-- SELECT-J01: LEFT JOIN with COUNT - Find top 3 customers with the least purchases
SELECT c.customer_id, c.customer_name, COALESCE(COUNT(s.sale_id), 0) AS purchase_count
FROM customers c
         LEFT JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.customer_id, c.customer_name
ORDER BY purchase_count ASC, c.customer_id ASC
    LIMIT 3 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0	Customer 0	5
1	Customer 1	5
2	Customer 2	5

OK - SELECT-J01
0	Customer 0	5
1	Customer 1	5
2	Customer 2	5

Preparing to run SELECT-J02...
Executing command: bendsql --query=


-- SELECT-J02: INNER JOIN with SUM - Top 3 products by total sales value
SELECT p.product_id, p.product_name, SUM(s.net_paid) AS total_sales_value
FROM products p
         INNER JOIN sales s ON p.product_id = s.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_sales_value DESC, p.product_id ASC
    LIMIT 3 -D selects
Command executed successfully. Output:
82031	Product 82031	672.61
82500	Product 82500	665.14
86712	Product 86712	663.19

Executing command: snowsql --query


-- SELECT-J02: INNER JOIN with SUM - Top 3 products by total sales value
SELECT p.product_id, p.product_name, SUM(s.net_paid) AS total_sales_value
FROM products p
         INNER JOIN sales s ON p.product_id = s.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_sales_value DESC, p.product_id ASC
    LIMIT 3 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
82031	Product 82031	672.61
82500	Product 82500	665.14
86712	Product 86712	663.19

OK - SELECT-J02
82031	Product 82031	672.61
82500	Product 82500	665.14
86712	Product 86712	663.19

Preparing to run SELECT-J03...
Executing command: bendsql --query=

-- SELECT-J03: INNER JOIN with AVG - Top 3 product categories by average product price
SELECT p.category, TRUNCATE(AVG(p.price), 7) AS avg_price
FROM products p
         INNER JOIN sales s ON p.product_id = s.product_id
GROUP BY p.category
ORDER BY avg_price DESC, p.category
    LIMIT 3 -D selects
Command executed successfully. Output:
Electronics	10.0503993
Clothing	10.0153535
Grocery	10.0033790

Executing command: snowsql --query

-- SELECT-J03: INNER JOIN with AVG - Top 3 product categories by average product price
SELECT p.category, TRUNCATE(AVG(p.price), 7) AS avg_price
FROM products p
         INNER JOIN sales s ON p.product_id = s.product_id
GROUP BY p.category
ORDER BY avg_price DESC, p.category
    LIMIT 3 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Electronics	10.0503993
Clothing	10.0153535
Grocery	10.0033790

OK - SELECT-J03
Electronics	10.0503993
Clothing	10.0153535
Grocery	10.0033790

Preparing to run SELECT-J04...
Executing command: bendsql --query=

-- SELECT-J04: RIGHT JOIN with COUNT - Count of sales for products not sold to 'Large' segment customers
SELECT p.product_id, p.product_name, COUNT(s.sale_id) AS sales_count
FROM products p
         RIGHT JOIN sales s ON p.product_id = s.product_id
         LEFT JOIN customers c ON s.customer_id = c.customer_id AND c.segment != 'Large'
GROUP BY p.product_id, p.product_name
ORDER BY sales_count DESC, p.product_id ASC
    LIMIT 3 -D selects
Command executed successfully. Output:
0	Product 0	50
1	Product 1	50
2	Product 2	50

Executing command: snowsql --query

-- SELECT-J04: RIGHT JOIN with COUNT - Count of sales for products not sold to 'Large' segment customers
SELECT p.product_id, p.product_name, COUNT(s.sale_id) AS sales_count
FROM products p
         RIGHT JOIN sales s ON p.product_id = s.product_id
         LEFT JOIN customers c ON s.customer_id = c.customer_id AND c.segment != 'Large'
GROUP BY p.product_id, p.product_name
ORDER BY sales_count DESC, p.product_id ASC
    LIMIT 3 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0	Product 0	50
1	Product 1	50
2	Product 2	50

OK - SELECT-J04
0	Product 0	50
1	Product 1	50
2	Product 2	50

Preparing to run SELECT-J05...
Executing command: bendsql --query=

-- SELECT-J05: Join all tables, aggregate data, and use window functions to rank products within each customer segment based on their net paid amount
SELECT
    c.customer_id,
    c.customer_name,
    c.segment,
    p.product_name,
    p.category,
    s.sale_date,
    SUM(s.net_paid) as total_net_paid,
    RANK() OVER (PARTITION BY c.segment ORDER BY SUM(s.net_paid) DESC) as rank_in_segment
FROM customers c
         JOIN sales s ON c.customer_id = s.customer_id
         JOIN products p ON s.product_id = p.product_id
         JOIN date_dim d ON s.sale_date = d.date_key
GROUP BY c.customer_id, c.customer_name, c.segment, p.product_name, p.category, s.sale_date
ORDER BY c.segment, rank_in_segment, c.customer_id, s.sale_date
    LIMIT 10 -D selects
Command executed successfully. Output:
704505	Customer 704505	Large	Product 4505	Furniture	2021-11-22	52.28	1
749323	Customer 749323	Large	Product 49323	Furniture	2021-03-29	48.93	2
163339	Customer 163339	Large	Product 63339	Electronics	2021-01-05	44.38	3
684555	Customer 684555	Large	Product 84555	Furniture	2021-04-27	42.50	4
795061	Customer 795061	Large	Product 95061	Electronics	2021-08-01	41.12	5
640172	Customer 640172	Large	Product 40172	Clothing	2021-11-30	39.96	6
35275	Customer 35275	Large	Product 35275	Furniture	2021-12-13	39.93	7
771397	Customer 771397	Large	Product 71397	Grocery	2021-09-21	39.75	8
113298	Customer 113298	Large	Product 13298	Clothing	2021-06-09	39.69	9
874734	Customer 874734	Large	Product 74734	Furniture	2021-05-25	39.69	9

Executing command: snowsql --query

-- SELECT-J05: Join all tables, aggregate data, and use window functions to rank products within each customer segment based on their net paid amount
SELECT
    c.customer_id,
    c.customer_name,
    c.segment,
    p.product_name,
    p.category,
    s.sale_date,
    SUM(s.net_paid) as total_net_paid,
    RANK() OVER (PARTITION BY c.segment ORDER BY SUM(s.net_paid) DESC) as rank_in_segment
FROM customers c
         JOIN sales s ON c.customer_id = s.customer_id
         JOIN products p ON s.product_id = p.product_id
         JOIN date_dim d ON s.sale_date = d.date_key
GROUP BY c.customer_id, c.customer_name, c.segment, p.product_name, p.category, s.sale_date
ORDER BY c.segment, rank_in_segment, c.customer_id, s.sale_date
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
704505	Customer 704505	Large	Product 4505	Furniture	2021-11-22	52.28	1
749323	Customer 749323	Large	Product 49323	Furniture	2021-03-29	48.93	2
163339	Customer 163339	Large	Product 63339	Electronics	2021-01-05	44.38	3
684555	Customer 684555	Large	Product 84555	Furniture	2021-04-27	42.50	4
795061	Customer 795061	Large	Product 95061	Electronics	2021-08-01	41.12	5
640172	Customer 640172	Large	Product 40172	Clothing	2021-11-30	39.96	6
35275	Customer 35275	Large	Product 35275	Furniture	2021-12-13	39.93	7
771397	Customer 771397	Large	Product 71397	Grocery	2021-09-21	39.75	8
113298	Customer 113298	Large	Product 13298	Clothing	2021-06-09	39.69	9
874734	Customer 874734	Large	Product 74734	Furniture	2021-05-25	39.69	9

OK - SELECT-J05
704505	Customer 704505	Large	Product 4505	Furniture	2021-11-22	52.28	1
749323	Customer 749323	Large	Product 49323	Furniture	2021-03-29	48.93	2
163339	Customer 163339	Large	Product 63339	Electronics	2021-01-05	44.38	3
684555	Customer 684555	Large	Product 84555	Furniture	2021-04-27	42.50	4
795061	Customer 795061	Large	Product 95061	Electronics	2021-08-01	41.12	5
640172	Customer 640172	Large	Product 40172	Clothing	2021-11-30	39.96	6
35275	Customer 35275	Large	Product 35275	Furniture	2021-12-13	39.93	7
771397	Customer 771397	Large	Product 71397	Grocery	2021-09-21	39.75	8
113298	Customer 113298	Large	Product 13298	Clothing	2021-06-09	39.69	9
874734	Customer 874734	Large	Product 74734	Furniture	2021-05-25	39.69	9

Preparing to run SELECT-J06...
Executing command: bendsql --query=

-- SELECT-J06: Aggregate sales data by product category and month, and find top selling categories each month
SELECT
    p.category, d.month, d.year,
    SUM(s.quantity) as total_quantity_sold,
    ROW_NUMBER() OVER (PARTITION BY d.month, d.year ORDER BY SUM(s.quantity) DESC) as rank
FROM sales s
         JOIN products p ON s.product_id = p.product_id
         JOIN date_dim d ON s.sale_date = d.date_key
GROUP BY p.category, d.month, d.year
ORDER BY d.year, d.month, rank
    LIMIT 10 -D selects
Command executed successfully. Output:
Furniture	1	2021	1986839	1
Electronics	1	2021	1192843	2
Clothing	1	2021	891124	3
Grocery	1	2021	673659	4
Furniture	2	2021	1806332	1
Electronics	2	2021	1097343	2
Clothing	2	2021	803719	3
Grocery	2	2021	606319	4
Furniture	3	2021	2017537	1
Electronics	3	2021	1216713	2

Executing command: snowsql --query

-- SELECT-J06: Aggregate sales data by product category and month, and find top selling categories each month
SELECT
    p.category, d.month, d.year,
    SUM(s.quantity) as total_quantity_sold,
    ROW_NUMBER() OVER (PARTITION BY d.month, d.year ORDER BY SUM(s.quantity) DESC) as rank
FROM sales s
         JOIN products p ON s.product_id = p.product_id
         JOIN date_dim d ON s.sale_date = d.date_key
GROUP BY p.category, d.month, d.year
ORDER BY d.year, d.month, rank
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Furniture	1	2021	1986839	1
Electronics	1	2021	1192843	2
Clothing	1	2021	891124	3
Grocery	1	2021	673659	4
Furniture	2	2021	1806332	1
Electronics	2	2021	1097343	2
Clothing	2	2021	803719	3
Grocery	2	2021	606319	4
Furniture	3	2021	2017537	1
Electronics	3	2021	1216713	2

OK - SELECT-J06
Furniture	1	2021	1986839	1
Electronics	1	2021	1192843	2
Clothing	1	2021	891124	3
Grocery	1	2021	673659	4
Furniture	2	2021	1806332	1
Electronics	2	2021	1097343	2
Clothing	2	2021	803719	3
Grocery	2	2021	606319	4
Furniture	3	2021	2017537	1
Electronics	3	2021	1216713	2

Preparing to run SELECT-J07...
Executing command: bendsql --query=


-- SELECT-J07: Check the distribution of product categories purchased per customer
SELECT
    c.customer_id,
    c.customer_name,
    COUNT(DISTINCT p.category) as categories_purchased,
    SUM(s.net_paid) as total_spent
FROM customers c
         JOIN sales s ON c.customer_id = s.customer_id
         JOIN products p ON s.product_id = p.product_id
GROUP BY c.customer_id, c.customer_name
ORDER BY categories_purchased DESC, total_spent DESC
    LIMIT 10 -D selects
Command executed successfully. Output:
984624	Customer 984624	1	96.16
737912	Customer 737912	1	95.83
277109	Customer 277109	1	95.70
559180	Customer 559180	1	95.61
690478	Customer 690478	1	95.56
756159	Customer 756159	1	95.52
914163	Customer 914163	1	95.39
798797	Customer 798797	1	95.33
574783	Customer 574783	1	95.10
447017	Customer 447017	1	94.99

Executing command: snowsql --query


-- SELECT-J07: Check the distribution of product categories purchased per customer
SELECT
    c.customer_id,
    c.customer_name,
    COUNT(DISTINCT p.category) as categories_purchased,
    SUM(s.net_paid) as total_spent
FROM customers c
         JOIN sales s ON c.customer_id = s.customer_id
         JOIN products p ON s.product_id = p.product_id
GROUP BY c.customer_id, c.customer_name
ORDER BY categories_purchased DESC, total_spent DESC
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
984624	Customer 984624	1	96.16
737912	Customer 737912	1	95.83
277109	Customer 277109	1	95.70
559180	Customer 559180	1	95.61
690478	Customer 690478	1	95.56
756159	Customer 756159	1	95.52
914163	Customer 914163	1	95.39
798797	Customer 798797	1	95.33
574783	Customer 574783	1	95.10
447017	Customer 447017	1	94.99

OK - SELECT-J07
984624	Customer 984624	1	96.16
737912	Customer 737912	1	95.83
277109	Customer 277109	1	95.70
559180	Customer 559180	1	95.61
690478	Customer 690478	1	95.56
756159	Customer 756159	1	95.52
914163	Customer 914163	1	95.39
798797	Customer 798797	1	95.33
574783	Customer 574783	1	95.10
447017	Customer 447017	1	94.99

Preparing to run SELECT-J08...
Executing command: bendsql --query=

-- SELECT-J08: List sales where customers bought more than one item, ranked by the number of items bought and the total net paid in each sale, with sale_id ensuring stable order
SELECT
    s.sale_id,
    s.customer_id,
    c.customer_name,
    s.product_id,
    p.product_name,
    s.quantity,
    DENSE_RANK() OVER (ORDER BY s.quantity DESC, s.net_paid DESC, s.sale_id) as quantity_rank
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
         JOIN products p ON s.product_id = p.product_id
WHERE s.quantity > 1
ORDER BY quantity_rank, s.sale_id
    LIMIT 10 -D selects
Command executed successfully. Output:
8816	8816	Customer 8816	8816	Product 8816	21	1
10243	10243	Customer 10243	10243	Product 10243	21	2
10770	10770	Customer 10770	10770	Product 10770	21	3
14114	14114	Customer 14114	14114	Product 14114	21	4
14541	14541	Customer 14541	14541	Product 14541	21	5
16799	16799	Customer 16799	16799	Product 16799	21	6
28288	28288	Customer 28288	28288	Product 28288	21	7
30548	30548	Customer 30548	30548	Product 30548	21	8
32481	32481	Customer 32481	32481	Product 32481	21	9
44367	44367	Customer 44367	44367	Product 44367	21	10

Executing command: snowsql --query

-- SELECT-J08: List sales where customers bought more than one item, ranked by the number of items bought and the total net paid in each sale, with sale_id ensuring stable order
SELECT
    s.sale_id,
    s.customer_id,
    c.customer_name,
    s.product_id,
    p.product_name,
    s.quantity,
    DENSE_RANK() OVER (ORDER BY s.quantity DESC, s.net_paid DESC, s.sale_id) as quantity_rank
FROM sales s
         JOIN customers c ON s.customer_id = c.customer_id
         JOIN products p ON s.product_id = p.product_id
WHERE s.quantity > 1
ORDER BY quantity_rank, s.sale_id
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
8816	8816	Customer 8816	8816	Product 8816	21	1
10243	10243	Customer 10243	10243	Product 10243	21	2
10770	10770	Customer 10770	10770	Product 10770	21	3
14114	14114	Customer 14114	14114	Product 14114	21	4
14541	14541	Customer 14541	14541	Product 14541	21	5
16799	16799	Customer 16799	16799	Product 16799	21	6
28288	28288	Customer 28288	28288	Product 28288	21	7
30548	30548	Customer 30548	30548	Product 30548	21	8
32481	32481	Customer 32481	32481	Product 32481	21	9
44367	44367	Customer 44367	44367	Product 44367	21	10

OK - SELECT-J08
8816	8816	Customer 8816	8816	Product 8816	21	1
10243	10243	Customer 10243	10243	Product 10243	21	2
10770	10770	Customer 10770	10770	Product 10770	21	3
14114	14114	Customer 14114	14114	Product 14114	21	4
14541	14541	Customer 14541	14541	Product 14541	21	5
16799	16799	Customer 16799	16799	Product 16799	21	6
28288	28288	Customer 28288	28288	Product 28288	21	7
30548	30548	Customer 30548	30548	Product 30548	21	8
32481	32481	Customer 32481	32481	Product 32481	21	9
44367	44367	Customer 44367	44367	Product 44367	21	10

Preparing to run SELECT-J09...
Executing command: bendsql --query=


-- SELECT-J09: Aggregate sales and customer data to find the average sale amount per customer segment, ranked by average sale amount
SELECT
    c.segment,
    TRUNCATE(AVG(s.net_paid), 7) as avg_sale_amount,
    COUNT(s.sale_id) as number_of_sales,
    RANK() OVER (ORDER BY AVG(s.net_paid) DESC) as avg_sale_rank
FROM customers c
         JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.segment
ORDER BY avg_sale_rank
    LIMIT 10 -D selects
Command executed successfully. Output:
Small	10.0056284	1664915	1
Medium	10.0006918	1668740	2
Large	9.9974745	1666345	3

Executing command: snowsql --query


-- SELECT-J09: Aggregate sales and customer data to find the average sale amount per customer segment, ranked by average sale amount
SELECT
    c.segment,
    TRUNCATE(AVG(s.net_paid), 7) as avg_sale_amount,
    COUNT(s.sale_id) as number_of_sales,
    RANK() OVER (ORDER BY AVG(s.net_paid) DESC) as avg_sale_rank
FROM customers c
         JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.segment
ORDER BY avg_sale_rank
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
Small	10.0056284	1664915	1
Medium	10.0006918	1668740	2
Large	9.9974745	1666345	3

OK - SELECT-J09
Small	10.0056284	1664915	1
Medium	10.0006918	1668740	2
Large	9.9974745	1666345	3

Preparing to run SELECT-W1...
Executing command: bendsql --query=

-- SELECT-W1: Rank customers by total spending within each segment and show their average purchase value, limited to top 10
SELECT
    sub.customer_id,
    sub.customer_name,
    sub.segment,
    sub.total_spending,
    RANK() OVER (PARTITION BY sub.segment ORDER BY sub.total_spending DESC) AS rank_in_segment
FROM (
         SELECT
             c.customer_id,
             c.customer_name,
             c.segment,
             SUM(s.net_paid) AS total_spending
         FROM
             customers c
                 JOIN
             sales s ON c.customer_id = s.customer_id
         GROUP BY
             c.customer_id, c.customer_name, c.segment
     ) AS sub
ORDER BY
    sub.segment, rank_in_segment
    LIMIT 10 -D selects
Command executed successfully. Output:
984624	Customer 984624	Large	96.16	1
737912	Customer 737912	Large	95.83	2
559180	Customer 559180	Large	95.61	3
574783	Customer 574783	Large	95.10	4
447017	Customer 447017	Large	94.99	5
990362	Customer 990362	Large	94.61	6
961247	Customer 961247	Large	93.91	7
120441	Customer 120441	Large	93.90	8
583152	Customer 583152	Large	93.84	9
31987	Customer 31987	Large	93.77	10

Executing command: snowsql --query

-- SELECT-W1: Rank customers by total spending within each segment and show their average purchase value, limited to top 10
SELECT
    sub.customer_id,
    sub.customer_name,
    sub.segment,
    sub.total_spending,
    RANK() OVER (PARTITION BY sub.segment ORDER BY sub.total_spending DESC) AS rank_in_segment
FROM (
         SELECT
             c.customer_id,
             c.customer_name,
             c.segment,
             SUM(s.net_paid) AS total_spending
         FROM
             customers c
                 JOIN
             sales s ON c.customer_id = s.customer_id
         GROUP BY
             c.customer_id, c.customer_name, c.segment
     ) AS sub
ORDER BY
    sub.segment, rank_in_segment
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
984624	Customer 984624	Large	96.16	1
737912	Customer 737912	Large	95.83	2
559180	Customer 559180	Large	95.61	3
574783	Customer 574783	Large	95.10	4
447017	Customer 447017	Large	94.99	5
990362	Customer 990362	Large	94.61	6
961247	Customer 961247	Large	93.91	7
120441	Customer 120441	Large	93.90	8
583152	Customer 583152	Large	93.84	9
31987	Customer 31987	Large	93.77	10

OK - SELECT-W1
984624	Customer 984624	Large	96.16	1
737912	Customer 737912	Large	95.83	2
559180	Customer 559180	Large	95.61	3
574783	Customer 574783	Large	95.10	4
447017	Customer 447017	Large	94.99	5
990362	Customer 990362	Large	94.61	6
961247	Customer 961247	Large	93.91	7
120441	Customer 120441	Large	93.90	8
583152	Customer 583152	Large	93.84	9
31987	Customer 31987	Large	93.77	10

Preparing to run SELECT-W3...
Executing command: bendsql --query=


-- SELECT-W3: Determine the growth in sales quantity for each product from the first sale to the latest sale
SELECT product_id,
       first_sale_quantity,
       last_sale_quantity,
       last_sale_quantity - first_sale_quantity AS growth
FROM (
         SELECT product_id,
                FIRST_VALUE(quantity) OVER (PARTITION BY product_id ORDER BY sale_date ASC, sale_id ASC) AS first_sale_quantity,
                LAST_VALUE(quantity) OVER (PARTITION BY product_id ORDER BY sale_date ASC, sale_id ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_sale_quantity
         FROM sales
     ) AS sub
ORDER BY growth DESC, product_id ASC
    LIMIT 10 -D selects
Command executed successfully. Output:
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20

Executing command: snowsql --query


-- SELECT-W3: Determine the growth in sales quantity for each product from the first sale to the latest sale
SELECT product_id,
       first_sale_quantity,
       last_sale_quantity,
       last_sale_quantity - first_sale_quantity AS growth
FROM (
         SELECT product_id,
                FIRST_VALUE(quantity) OVER (PARTITION BY product_id ORDER BY sale_date ASC, sale_id ASC) AS first_sale_quantity,
                LAST_VALUE(quantity) OVER (PARTITION BY product_id ORDER BY sale_date ASC, sale_id ASC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_sale_quantity
         FROM sales
     ) AS sub
ORDER BY growth DESC, product_id ASC
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20

OK - SELECT-W3
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20
1406	1	21	20

Preparing to run SELECT-W5...
Executing command: bendsql --query=


-- SELECT-W5: Show the first 10 sales with a running total and running average of net_paid per customer
SELECT customer_id, sale_id, net_paid,
       SUM(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date) AS running_total,
       AVG(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date) AS running_avg
FROM sales
ORDER BY customer_id, sale_date
    LIMIT 10 -D selects
Command executed successfully. Output:
0	3000000	14.14	14.14	14.1400
0	1000000	8.60	22.74	11.3700
0	4000000	12.37	35.11	11.7033
0	0	11.40	46.51	11.6275
0	2000000	8.22	54.73	10.9460
1	1	7.36	7.36	7.3600
1	2000001	15.42	22.78	11.3900
1	1000001	19.67	42.45	14.1500
1	4000001	8.90	51.35	12.8375
1	3000001	14.51	65.86	13.1720

Executing command: snowsql --query


-- SELECT-W5: Show the first 10 sales with a running total and running average of net_paid per customer
SELECT customer_id, sale_id, net_paid,
       SUM(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date) AS running_total,
       AVG(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date) AS running_avg
FROM sales
ORDER BY customer_id, sale_date
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0	3000000	14.14	14.14	14.14000000
0	1000000	8.60	22.74	11.37000000
0	4000000	12.37	35.11	11.70333333
0	0	11.40	46.51	11.62750000
0	2000000	8.22	54.73	10.94600000
1	1	7.36	7.36	7.36000000
1	2000001	15.42	22.78	11.39000000
1	1000001	19.67	42.45	14.15000000
1	4000001	8.90	51.35	12.83750000
1	3000001	14.51	65.86	13.17200000

DIFFERENCE FOUND

SELECT-W5:



-- SELECT-W5: Show the first 10 sales with a running total and running average of net_paid per customer
SELECT customer_id, sale_id, net_paid,
       SUM(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date) AS running_total,
       AVG(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date) AS running_avg
FROM sales
ORDER BY customer_id, sale_date
    LIMIT 10
Differences:

bendsql:
0	3000000	14.14	14.14	14.1400
0	1000000	8.60	22.74	11.3700
0	4000000	12.37	35.11	11.7033
0	0	11.40	46.51	11.6275
0	2000000	8.22	54.73	10.9460
1	1	7.36	7.36	7.3600
1	2000001	15.42	22.78	11.3900
1	1000001	19.67	42.45	14.1500
1	4000001	8.90	51.35	12.8375
1	3000001	14.51	65.86	13.1720

snowsql:
0	3000000	14.14	14.14	14.14000000
0	1000000	8.60	22.74	11.37000000
0	4000000	12.37	35.11	11.70333333
0	0	11.40	46.51	11.62750000
0	2000000	8.22	54.73	10.94600000
1	1	7.36	7.36	7.36000000
1	2000001	15.42	22.78	11.39000000
1	1000001	19.67	42.45	14.15000000
1	4000001	8.90	51.35	12.83750000
1	3000001	14.51	65.86	13.17200000

Preparing to run SELECT-W6...
Executing command: bendsql --query=

-- SELECT-W6: Find the top 10 sales with the highest net_paid, including their percentage contribution to total sales, with secondary sorting for unique order
SELECT sale_id, product_id, customer_id, net_paid,
       net_paid / SUM(net_paid) OVER () AS percent_of_total_sales
FROM sales
ORDER BY net_paid DESC, sale_id ASC
    LIMIT 10 -D selects
Command executed successfully. Output:
8816	8816	8816	20.00	0.00000039
10243	10243	10243	20.00	0.00000039
10770	10770	10770	20.00	0.00000039
14114	14114	14114	20.00	0.00000039
14541	14541	14541	20.00	0.00000039
16799	16799	16799	20.00	0.00000039
28288	28288	28288	20.00	0.00000039
30548	30548	30548	20.00	0.00000039
32481	32481	32481	20.00	0.00000039
44367	44367	44367	20.00	0.00000039

Executing command: snowsql --query

-- SELECT-W6: Find the top 10 sales with the highest net_paid, including their percentage contribution to total sales, with secondary sorting for unique order
SELECT sale_id, product_id, customer_id, net_paid,
       net_paid / SUM(net_paid) OVER () AS percent_of_total_sales
FROM sales
ORDER BY net_paid DESC, sale_id ASC
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
8816	8816	8816	20.00	0.00000040
10243	10243	10243	20.00	0.00000040
10770	10770	10770	20.00	0.00000040
14114	14114	14114	20.00	0.00000040
14541	14541	14541	20.00	0.00000040
16799	16799	16799	20.00	0.00000040
28288	28288	28288	20.00	0.00000040
30548	30548	30548	20.00	0.00000040
32481	32481	32481	20.00	0.00000040
44367	44367	44367	20.00	0.00000040

DIFFERENCE FOUND

SELECT-W6:


-- SELECT-W6: Find the top 10 sales with the highest net_paid, including their percentage contribution to total sales, with secondary sorting for unique order
SELECT sale_id, product_id, customer_id, net_paid,
       net_paid / SUM(net_paid) OVER () AS percent_of_total_sales
FROM sales
ORDER BY net_paid DESC, sale_id ASC
    LIMIT 10
Differences:

bendsql:
8816	8816	8816	20.00	0.00000039
10243	10243	10243	20.00	0.00000039
10770	10770	10770	20.00	0.00000039
14114	14114	14114	20.00	0.00000039
14541	14541	14541	20.00	0.00000039
16799	16799	16799	20.00	0.00000039
28288	28288	28288	20.00	0.00000039
30548	30548	30548	20.00	0.00000039
32481	32481	32481	20.00	0.00000039
44367	44367	44367	20.00	0.00000039

snowsql:
8816	8816	8816	20.00	0.00000040
10243	10243	10243	20.00	0.00000040
10770	10770	10770	20.00	0.00000040
14114	14114	14114	20.00	0.00000040
14541	14541	14541	20.00	0.00000040
16799	16799	16799	20.00	0.00000040
28288	28288	28288	20.00	0.00000040
30548	30548	30548	20.00	0.00000040
32481	32481	32481	20.00	0.00000040
44367	44367	44367	20.00	0.00000040

Preparing to run SELECT-W8...
Executing command: bendsql --query=

-- SELECT-W8: Calculate the average sale value for each customer, compared to the overall average, top 10 customers
SELECT
    customer_id,
    AVG(net_paid) OVER (PARTITION BY customer_id) AS customer_avg,
    AVG(net_paid) OVER () - AVG(net_paid) OVER (PARTITION BY customer_id) AS diff_from_overall_avg
FROM
    sales
ORDER BY
    diff_from_overall_avg DESC, customer_id ASC
    LIMIT 10 -D selects
Command executed successfully. Output:
160211	0.5600	9.4412
160211	0.5600	9.4412
160211	0.5600	9.4412
160211	0.5600	9.4412
160211	0.5600	9.4412
114117	0.6700	9.3312
114117	0.6700	9.3312
114117	0.6700	9.3312
114117	0.6700	9.3312
114117	0.6700	9.3312

Executing command: snowsql --query

-- SELECT-W8: Calculate the average sale value for each customer, compared to the overall average, top 10 customers
SELECT
    customer_id,
    AVG(net_paid) OVER (PARTITION BY customer_id) AS customer_avg,
    AVG(net_paid) OVER () - AVG(net_paid) OVER (PARTITION BY customer_id) AS diff_from_overall_avg
FROM
    sales
ORDER BY
    diff_from_overall_avg DESC, customer_id ASC
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
160211	0.56000	9.44126
160211	0.56000	9.44126
160211	0.56000	9.44126
160211	0.56000	9.44126
160211	0.56000	9.44126
114117	0.67000	9.33126
114117	0.67000	9.33126
114117	0.67000	9.33126
114117	0.67000	9.33126
114117	0.67000	9.33126

DIFFERENCE FOUND

SELECT-W8:


-- SELECT-W8: Calculate the average sale value for each customer, compared to the overall average, top 10 customers
SELECT
    customer_id,
    AVG(net_paid) OVER (PARTITION BY customer_id) AS customer_avg,
    AVG(net_paid) OVER () - AVG(net_paid) OVER (PARTITION BY customer_id) AS diff_from_overall_avg
FROM
    sales
ORDER BY
    diff_from_overall_avg DESC, customer_id ASC
    LIMIT 10
Differences:

bendsql:
160211	0.5600	9.4412
160211	0.5600	9.4412
160211	0.5600	9.4412
160211	0.5600	9.4412
160211	0.5600	9.4412
114117	0.6700	9.3312
114117	0.6700	9.3312
114117	0.6700	9.3312
114117	0.6700	9.3312
114117	0.6700	9.3312

snowsql:
160211	0.56000	9.44126
160211	0.56000	9.44126
160211	0.56000	9.44126
160211	0.56000	9.44126
160211	0.56000	9.44126
114117	0.67000	9.33126
114117	0.67000	9.33126
114117	0.67000	9.33126
114117	0.67000	9.33126
114117	0.67000	9.33126

Preparing to run SELECT-W9...
Executing command: bendsql --query=


-- SELECT-W9: Top 10 sales with the most recent previous sale date for each product
SELECT sale_id, product_id, sale_date, LAG(sale_date, 1) OVER (PARTITION BY product_id ORDER BY sale_date) AS previous_sale_date
FROM sales
ORDER BY product_id, sale_date
    LIMIT 10 -D selects
Command executed successfully. Output:
2800000	0	2021-01-03	NULL
3600000	0	2021-01-04	2021-01-03
4500000	0	2021-02-06	2021-01-04
700000	0	2021-02-11	2021-02-06
800000	0	2021-02-22	2021-02-11
300000	0	2021-03-27	2021-02-22
4700000	0	2021-03-29	2021-03-27
3800000	0	2021-04-02	2021-03-29
600000	0	2021-04-16	2021-04-02
1400000	0	2021-04-17	2021-04-16

Executing command: snowsql --query


-- SELECT-W9: Top 10 sales with the most recent previous sale date for each product
SELECT sale_id, product_id, sale_date, LAG(sale_date, 1) OVER (PARTITION BY product_id ORDER BY sale_date) AS previous_sale_date
FROM sales
ORDER BY product_id, sale_date
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
2800000	0	2021-01-03	None
3600000	0	2021-01-04	2021-01-03
4500000	0	2021-02-06	2021-01-04
700000	0	2021-02-11	2021-02-06
800000	0	2021-02-22	2021-02-11
300000	0	2021-03-27	2021-02-22
4700000	0	2021-03-29	2021-03-27
3800000	0	2021-04-02	2021-03-29
600000	0	2021-04-16	2021-04-02
1400000	0	2021-04-17	2021-04-16

DIFFERENCE FOUND

SELECT-W9:



-- SELECT-W9: Top 10 sales with the most recent previous sale date for each product
SELECT sale_id, product_id, sale_date, LAG(sale_date, 1) OVER (PARTITION BY product_id ORDER BY sale_date) AS previous_sale_date
FROM sales
ORDER BY product_id, sale_date
    LIMIT 10
Differences:

bendsql:
2800000	0	2021-01-03	NULL
3600000	0	2021-01-04	2021-01-03
4500000	0	2021-02-06	2021-01-04
700000	0	2021-02-11	2021-02-06
800000	0	2021-02-22	2021-02-11
300000	0	2021-03-27	2021-02-22
4700000	0	2021-03-29	2021-03-27
3800000	0	2021-04-02	2021-03-29
600000	0	2021-04-16	2021-04-02
1400000	0	2021-04-17	2021-04-16

snowsql:
2800000	0	2021-01-03	None
3600000	0	2021-01-04	2021-01-03
4500000	0	2021-02-06	2021-01-04
700000	0	2021-02-11	2021-02-06
800000	0	2021-02-22	2021-02-11
300000	0	2021-03-27	2021-02-22
4700000	0	2021-03-29	2021-03-27
3800000	0	2021-04-02	2021-03-29
600000	0	2021-04-16	2021-04-02
1400000	0	2021-04-17	2021-04-16
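
In the SELECT-W9 output above, the two result sets appear to differ only in how SQL NULL is rendered (`NULL` from bendsql, `None` from snowsql). A minimal sketch of a comparison that treats both spellings as the same value is shown below. The names `NULL_TOKENS`, `normalize_row` and `same_output` are hypothetical and are not taken from the comparison script used in this log.

```python
# Spellings of SQL NULL seen in the TSV output of the two clients (assumed set).
NULL_TOKENS = {"NULL", "None"}


def normalize_row(line: str) -> tuple:
    """Split one TSV line and map every NULL spelling to Python None."""
    return tuple(None if cell in NULL_TOKENS else cell for cell in line.split("\t"))


def same_output(left: str, right: str) -> bool:
    """Compare two TSV outputs row by row after NULL normalization."""
    left_rows = [normalize_row(row) for row in left.strip().splitlines()]
    right_rows = [normalize_row(row) for row in right.strip().splitlines()]
    return left_rows == right_rows


# First two rows of the SELECT-W9 output from each client.
bendsql_out = "2800000\t0\t2021-01-03\tNULL\n3600000\t0\t2021-01-04\t2021-01-03"
snowsql_out = "2800000\t0\t2021-01-03\tNone\n3600000\t0\t2021-01-04\t2021-01-03"

print(same_output(bendsql_out, snowsql_out))
```

One caveat of this sketch: a string column whose value is literally `None` or `NULL` would also be treated as SQL NULL.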

Preparing to run SELECT-W10...
Executing command: bendsql --query=

-- SELECT-W10: Display the top 10 customers by the number of distinct products they have purchased
SELECT customer_id, COUNT(DISTINCT product_id) OVER (PARTITION BY customer_id) AS distinct_product_count
FROM sales
ORDER BY distinct_product_count DESC, customer_id
    LIMIT 10 -D selects
Command executed successfully. Output:
0	1
0	1
0	1
0	1
0	1
1	1
1	1
1	1
1	1
1	1

Executing command: snowsql --query

-- SELECT-W10: Display the top 10 customers by the number of distinct products they have purchased
SELECT customer_id, COUNT(DISTINCT product_id) OVER (PARTITION BY customer_id) AS distinct_product_count
FROM sales
ORDER BY distinct_product_count DESC, customer_id
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0	1
0	1
0	1
0	1
0	1
1	1
1	1
1	1
1	1
1	1

OK - SELECT-W10
0	1
0	1
0	1
0	1
0	1
1	1
1	1
1	1
1	1
1	1

Preparing to run SELECT-W11...
Executing command: bendsql --query=

-- SELECT-W11: Calculate each customer's average sale value and rank these averages within each customer segment
WITH CustomerAverage AS (
    SELECT
        c.customer_id,
        c.customer_name,
        c.segment,
        AVG(s.net_paid) AS avg_sale_value
    FROM
        customers c
            JOIN
        sales s ON c.customer_id = s.customer_id
    GROUP BY
        c.customer_id, c.customer_name, c.segment
)
SELECT
    customer_id,
    customer_name,
    segment,
    avg_sale_value,
    RANK() OVER (PARTITION BY segment ORDER BY avg_sale_value DESC) AS rank_in_segment
FROM
    CustomerAverage
ORDER BY
    segment, rank_in_segment
    LIMIT 10 -D selects
Command executed successfully. Output:
984624	Customer 984624	Large	19.23200000	1
737912	Customer 737912	Large	19.16600000	2
559180	Customer 559180	Large	19.12200000	3
574783	Customer 574783	Large	19.02000000	4
447017	Customer 447017	Large	18.99800000	5
990362	Customer 990362	Large	18.92200000	6
961247	Customer 961247	Large	18.78200000	7
120441	Customer 120441	Large	18.78000000	8
583152	Customer 583152	Large	18.76800000	9
31987	Customer 31987	Large	18.75400000	10

Executing command: snowsql --query

-- SELECT-W11: Calculate each customer's average sale value and rank these averages within each customer segment
WITH CustomerAverage AS (
    SELECT
        c.customer_id,
        c.customer_name,
        c.segment,
        AVG(s.net_paid) AS avg_sale_value
    FROM
        customers c
            JOIN
        sales s ON c.customer_id = s.customer_id
    GROUP BY
        c.customer_id, c.customer_name, c.segment
)
SELECT
    customer_id,
    customer_name,
    segment,
    avg_sale_value,
    RANK() OVER (PARTITION BY segment ORDER BY avg_sale_value DESC) AS rank_in_segment
FROM
    CustomerAverage
ORDER BY
    segment, rank_in_segment
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
984624	Customer 984624	Large	19.23200000	1
737912	Customer 737912	Large	19.16600000	2
559180	Customer 559180	Large	19.12200000	3
574783	Customer 574783	Large	19.02000000	4
447017	Customer 447017	Large	18.99800000	5
990362	Customer 990362	Large	18.92200000	6
961247	Customer 961247	Large	18.78200000	7
120441	Customer 120441	Large	18.78000000	8
583152	Customer 583152	Large	18.76800000	9
31987	Customer 31987	Large	18.75400000	10

OK - SELECT-W11
984624	Customer 984624	Large	19.23200000	1
737912	Customer 737912	Large	19.16600000	2
559180	Customer 559180	Large	19.12200000	3
574783	Customer 574783	Large	19.02000000	4
447017	Customer 447017	Large	18.99800000	5
990362	Customer 990362	Large	18.92200000	6
961247	Customer 961247	Large	18.78200000	7
120441	Customer 120441	Large	18.78000000	8
583152	Customer 583152	Large	18.76800000	9
31987	Customer 31987	Large	18.75400000	10

Preparing to run SELECT-W12...
Executing command: bendsql --query=

-- SELECT-W12: Display the top 5 products with the highest average sales quantity, along with their rank across all categories
WITH ProductAverage AS (
    SELECT
        p.product_id,
        p.product_name,
        AVG(s.quantity) AS avg_quantity
    FROM
        products p
            JOIN
        sales s ON p.product_id = s.product_id
    GROUP BY
        p.product_id, p.product_name
)
SELECT
    product_id,
    product_name,
    TRUNCATE(avg_quantity, 2),
    RANK() OVER (ORDER BY avg_quantity DESC) AS overall_rank
FROM
    ProductAverage
ORDER BY
    overall_rank
    LIMIT 5 -D selects
Command executed successfully. Output:
29952	Product 29952	14.18	1
18738	Product 18738	14.08	2
32378	Product 32378	14.08	2
30774	Product 30774	14.06	4
26567	Product 26567	14.06	4

Executing command: snowsql --query

-- SELECT-W12: Display the top 5 products with the highest average sales quantity, along with their rank across all categories
WITH ProductAverage AS (
    SELECT
        p.product_id,
        p.product_name,
        AVG(s.quantity) AS avg_quantity
    FROM
        products p
            JOIN
        sales s ON p.product_id = s.product_id
    GROUP BY
        p.product_id, p.product_name
)
SELECT
    product_id,
    product_name,
    TRUNCATE(avg_quantity, 2),
    RANK() OVER (ORDER BY avg_quantity DESC) AS overall_rank
FROM
    ProductAverage
ORDER BY
    overall_rank
    LIMIT 5 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
29952	Product 29952	14.18	1
32378	Product 32378	14.08	2
18738	Product 18738	14.08	2
26567	Product 26567	14.06	4
30774	Product 30774	14.06	4

DIFFERENCE FOUND

SELECT-W12:


-- SELECT-W12: Display the top 5 products with the highest average sales quantity, along with their rank across all categories
WITH ProductAverage AS (
    SELECT
        p.product_id,
        p.product_name,
        AVG(s.quantity) AS avg_quantity
    FROM
        products p
            JOIN
        sales s ON p.product_id = s.product_id
    GROUP BY
        p.product_id, p.product_name
)
SELECT
    product_id,
    product_name,
    TRUNCATE(avg_quantity, 2),
    RANK() OVER (ORDER BY avg_quantity DESC) AS overall_rank
FROM
    ProductAverage
ORDER BY
    overall_rank
    LIMIT 5
Differences:

bendsql:
29952	Product 29952	14.18	1
18738	Product 18738	14.08	2
32378	Product 32378	14.08	2
30774	Product 30774	14.06	4
26567	Product 26567	14.06	4

snowsql:
29952	Product 29952	14.18	1
32378	Product 32378	14.08	2
18738	Product 18738	14.08	2
26567	Product 26567	14.06	4
30774	Product 30774	14.06	4
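
In the SELECT-W12 output above, both clients return the same five rows; only the order of rows that share a rank (2 and 4) differs. The query orders by `overall_rank` alone, so the order of tied rows is not fixed by the query. A minimal sketch of an order-insensitive comparison follows. The helper name `same_rows_any_order` is hypothetical; an alternative, not shown here, would be to add a tie-breaker such as `product_id` to the query's `ORDER BY`.

```python
from collections import Counter


def same_rows_any_order(left: str, right: str) -> bool:
    """True when both TSV outputs contain the same rows, ignoring row order."""
    return Counter(left.strip().splitlines()) == Counter(right.strip().splitlines())


# SELECT-W12 rows as printed by each client in the log above.
bendsql_out = "\n".join([
    "29952\tProduct 29952\t14.18\t1",
    "18738\tProduct 18738\t14.08\t2",
    "32378\tProduct 32378\t14.08\t2",
    "30774\tProduct 30774\t14.06\t4",
    "26567\tProduct 26567\t14.06\t4",
])
snowsql_out = "\n".join([
    "29952\tProduct 29952\t14.18\t1",
    "32378\tProduct 32378\t14.08\t2",
    "18738\tProduct 18738\t14.08\t2",
    "26567\tProduct 26567\t14.06\t4",
    "30774\tProduct 30774\t14.06\t4",
])

print(same_rows_any_order(bendsql_out, snowsql_out))
```

Ignoring order entirely is a deliberate simplification: it would also accept rows that are out of order across different ranks, so it only suits queries where ties are the expected source of reordering.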

Preparing to run SELECT-W13...
Executing command: bendsql --query=

-- SELECT-W13: Calculate a cumulative total of sales and a running three-month average, then rank these by customer
WITH SalesData AS (
    SELECT
        customer_id,
        sale_date,
        net_paid,
        sale_id, -- assuming sale_id is a unique identifier for each sale
        SUM(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date, sale_id) AS cumulative_sales,
            AVG(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date, sale_id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS running_3m_avg
    FROM
        sales
)
SELECT
    customer_id,
    sale_date,
    cumulative_sales,
    TRUNCATE(running_3m_avg, 4),
    RANK() OVER (ORDER BY running_3m_avg DESC, cumulative_sales DESC, customer_id, sale_date, sale_id) AS sales_rank
FROM
    SalesData
ORDER BY
    customer_id, sale_date
    LIMIT 10 -D selects
Command executed successfully. Output:
0	2021-04-24	14.14	14.1400	800131
0	2021-04-25	22.74	11.3700	1843988
0	2021-05-07	35.11	11.7033	1678552
0	2021-07-25	46.51	10.7900	2117116
0	2021-09-24	54.73	10.6633	2172956
1	2021-01-07	7.36	7.3600	3691423
1	2021-03-23	22.78	11.3900	1833683
1	2021-04-23	42.45	14.1500	797034
1	2021-11-16	51.35	14.6633	678057
1	2021-11-22	65.86	14.3600	745852

Executing command: snowsql --query

-- SELECT-W13: Calculate a cumulative total of sales and a running three-month average, then rank these by customer
WITH SalesData AS (
    SELECT
        customer_id,
        sale_date,
        net_paid,
        sale_id, -- assuming sale_id is a unique identifier for each sale
        SUM(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date, sale_id) AS cumulative_sales,
            AVG(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date, sale_id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS running_3m_avg
    FROM
        sales
)
SELECT
    customer_id,
    sale_date,
    cumulative_sales,
    TRUNCATE(running_3m_avg, 4),
    RANK() OVER (ORDER BY running_3m_avg DESC, cumulative_sales DESC, customer_id, sale_date, sale_id) AS sales_rank
FROM
    SalesData
ORDER BY
    customer_id, sale_date
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0	2021-04-24	14.14	14.1400	800131
0	2021-04-25	22.74	11.3700	1843988
0	2021-05-07	35.11	11.7033	1678552
0	2021-07-25	46.51	10.7900	2117116
0	2021-09-24	54.73	10.6633	2172956
1	2021-01-07	7.36	7.3600	3691423
1	2021-03-23	22.78	11.3900	1833683
1	2021-04-23	42.45	14.1500	797034
1	2021-11-16	51.35	14.6633	678057
1	2021-11-22	65.86	14.3600	745852

OK - SELECT-W13
0	2021-04-24	14.14	14.1400	800131
0	2021-04-25	22.74	11.3700	1843988
0	2021-05-07	35.11	11.7033	1678552
0	2021-07-25	46.51	10.7900	2117116
0	2021-09-24	54.73	10.6633	2172956
1	2021-01-07	7.36	7.3600	3691423
1	2021-03-23	22.78	11.3900	1833683
1	2021-04-23	42.45	14.1500	797034
1	2021-11-16	51.35	14.6633	678057
1	2021-11-22	65.86	14.3600	745852

Preparing to run SELECT-W14...
Executing command: bendsql --query=

-- SELECT-W14: Find the top 5 days with the highest sales, along with a row number indicating their rank ordered by date
SELECT
    sale_date,
    daily_total,
    ROW_NUMBER() OVER (ORDER BY sale_date) AS date_rank
FROM (
         SELECT
             sale_date,
             SUM(net_paid) AS daily_total
         FROM
             sales
         GROUP BY
             sale_date
     ) AS DailySales
ORDER BY
    daily_total DESC
    LIMIT 5 -D selects
Command executed successfully. Output:
2021-05-25	152184.19	145
2021-05-24	150972.64	144
2021-05-21	150842.78	141
2021-05-19	150320.68	139
2021-05-16	150297.39	136

Executing command: snowsql --query

-- SELECT-W14: Find the top 5 days with the highest sales, along with a row number indicating their rank ordered by date
SELECT
    sale_date,
    daily_total,
    ROW_NUMBER() OVER (ORDER BY sale_date) AS date_rank
FROM (
         SELECT
             sale_date,
             SUM(net_paid) AS daily_total
         FROM
             sales
         GROUP BY
             sale_date
     ) AS DailySales
ORDER BY
    daily_total DESC
    LIMIT 5 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
2021-05-25	152184.19	145
2021-05-24	150972.64	144
2021-05-21	150842.78	141
2021-05-19	150320.68	139
2021-05-16	150297.39	136

OK - SELECT-W14
2021-05-25	152184.19	145
2021-05-24	150972.64	144
2021-05-21	150842.78	141
2021-05-19	150320.68	139
2021-05-16	150297.39	136

Preparing to run SELECT-W16...
Executing command: bendsql --query=

-- SELECT-W16: Compare each sale's net_paid to the average of the previous 5 sales of the same customer
SELECT
    customer_id,
    sale_id,
    net_paid,
    AVG(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING) AS prev_5_avg
FROM
    sales
ORDER BY
    customer_id, sale_id
    LIMIT 10 -D selects
Command executed successfully. Output:
0	0	11.40	11.7033
0	1000000	8.60	14.1400
0	2000000	8.22	11.6275
0	3000000	14.14	NULL
0	4000000	12.37	11.3700
1	1	7.36	NULL
1	1000001	19.67	11.3900
1	2000001	15.42	7.3600
1	3000001	14.51	12.8375
1	4000001	8.90	14.1500

Executing command: snowsql --query

-- SELECT-W16: Compare each sale's net_paid to the average of the previous 5 sales of the same customer
SELECT
    customer_id,
    sale_id,
    net_paid,
    AVG(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING) AS prev_5_avg
FROM
    sales
ORDER BY
    customer_id, sale_id
    LIMIT 10 --dbname selects --schemaname PUBLIC -o output_format=tsv -o header=false -o timing=false -o friendly=false --warehouse COMPUTE_WH
Command executed successfully. Output:
0	0	11.40	11.70333
0	1000000	8.60	14.14000
0	2000000	8.22	11.62750
0	3000000	14.14	None
0	4000000	12.37	11.37000
1	1	7.36	None
1	1000001	19.67	11.39000
1	2000001	15.42	7.36000
1	3000001	14.51	12.83750
1	4000001	8.90	14.15000

DIFFERENCE FOUND

SELECT-W16:


-- SELECT-W16: Compare each sale's net_paid to the average of the previous 5 sales of the same customer
SELECT
    customer_id,
    sale_id,
    net_paid,
    AVG(net_paid) OVER (PARTITION BY customer_id ORDER BY sale_date ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING) AS prev_5_avg
FROM
    sales
ORDER BY
    customer_id, sale_id
    LIMIT 10
Differences:

bendsql:
0	0	11.40	11.7033
0	1000000	8.60	14.1400
0	2000000	8.22	11.6275
0	3000000	14.14	NULL
0	4000000	12.37	11.3700
1	1	7.36	NULL
1	1000001	19.67	11.3900
1	2000001	15.42	7.3600
1	3000001	14.51	12.8375
1	4000001	8.90	14.1500

snowsql:
0	0	11.40	11.70333
0	1000000	8.60	14.14000
0	2000000	8.22	11.62750
0	3000000	14.14	None
0	4000000	12.37	11.37000
1	1	7.36	None
1	1000001	19.67	11.39000
1	2000001	15.42	7.36000
1	3000001	14.51	12.83750
1	4000001	8.90	14.15000
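
In the SELECT-W16 output above, the `prev_5_avg` column has four decimal places from bendsql and five from snowsql, and NULL is again rendered as `NULL` versus `None`. The digits the two clients share agree in every row shown. A minimal sketch that compares decimal cells at a common scale is given below. It assumes truncation (`ROUND_DOWN`) to the smaller scale is an acceptable way to align the values; whether either engine truncates or rounds the final digit is not stated in this log. All helper names are hypothetical.

```python
from decimal import Decimal, InvalidOperation, ROUND_DOWN

# Spellings of SQL NULL seen in the TSV output of the two clients (assumed set).
NULL_TOKENS = {"NULL", "None"}


def canonical_cell(cell: str, scale: int):
    """Map NULL spellings to None and truncate decimal cells to `scale` digits."""
    if cell in NULL_TOKENS:
        return None
    if "." in cell:
        try:
            unit = Decimal(10) ** -scale  # e.g. Decimal("0.0001") for scale 4
            return Decimal(cell).quantize(unit, rounding=ROUND_DOWN)
        except InvalidOperation:
            return cell  # not a number after all; compare as plain text
    return cell


def same_output(left: str, right: str, scale: int) -> bool:
    """Compare two TSV outputs cell by cell at a common decimal scale."""
    def rows(text: str):
        return [
            tuple(canonical_cell(cell, scale) for cell in line.split("\t"))
            for line in text.strip().splitlines()
        ]
    return rows(left) == rows(right)


# Three SELECT-W16 rows from each client, taken from the log above.
bendsql_out = "0\t0\t11.40\t11.7033\n0\t2000000\t8.22\t11.6275\n0\t3000000\t14.14\tNULL"
snowsql_out = "0\t0\t11.40\t11.70333\n0\t2000000\t8.22\t11.62750\n0\t3000000\t14.14\tNone"

print(same_output(bendsql_out, snowsql_out, scale=4))
```

If one engine rounds where the other truncates, the last compared digit can still differ by one, so a tolerance-based comparison may be a better fit in that case.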

@BohuTANG BohuTANG merged commit a827842 into databendlabs:main Dec 19, 2023
68 checks passed