Add logging #323

Closed
wants to merge 23 commits into from
23 commits
003f950
Cluster base table by stream_id to save cost
JumboDVDH0 Apr 18, 2024
02cb8d9
Added client_id to staging table
JumboDVDH0 Apr 18, 2024
fa1b0d0
Fixed type-o
JumboDVDH0 Apr 18, 2024
c970c6a
Forgot one reference
JumboDVDH0 Apr 18, 2024
ca16d32
Fixed grouping issue:
JumboDVDH0 Apr 18, 2024
ac57735
Added Client ID to user_id_mapping
JumboDVDH0 Apr 19, 2024
22901a3
Fixed grouping
JumboDVDH0 Apr 19, 2024
bb3c341
Merge pull request #1 from Springbok-Agency/stream-id-stg_ga4__user_i…
DVDH-000 Apr 23, 2024
a0adf6e
Added Client ID to Derived User Properties
JumboDVDH0 Apr 23, 2024
e9f37e6
Merge pull request #2 from Springbok-Agency/client-id-user-derived-pr…
DVDH-000 Apr 23, 2024
aaf9563
Added stream id
JumboDVDH0 Apr 24, 2024
9811f8f
Added stream_id
JumboDVDH0 Apr 26, 2024
bbf50d9
date range
tessa-beijloos May 6, 2024
f209e0b
Merge pull request #5 from Springbok-Agency/date-filter
tessa-beijloos May 6, 2024
715339c
Update base_ga4__events.sql
tessa-beijloos May 6, 2024
9c80574
Merge pull request #6 from Springbok-Agency/fix-for-base
DVDH-000 May 6, 2024
6e5c212
Update base_ga4__events.sql
tessa-beijloos May 6, 2024
391a432
Merge pull request #7 from Springbok-Agency/base-fix-v2
DVDH-000 May 6, 2024
a7f9885
Update base_ga4__events.sql
tessa-beijloos May 6, 2024
0ee8419
Merge pull request #8 from Springbok-Agency/fix-3-base-table
DVDH-000 May 6, 2024
33f1f2c
Update schema.yml
tessa-beijloos May 6, 2024
190e86e
Merge pull request #9 from Springbok-Agency/change-macro-name
DVDH-000 May 6, 2024
3ba4ed7
Added logging
JumboDVDH0 May 6, 2024
7 changes: 7 additions & 0 deletions README.md
@@ -260,6 +260,13 @@ vars:
ga4:
session_attribution_lookback_window_days: 90
```
# Select Date Range

To filter a model on a date range, use the `select_date_range` macro in a `where` clause. Append it to an existing condition, or to `TRUE` if there is none. When `start_date` or `end_date` is `none`, the macro falls back to the `lookback_window` variable.

```
WHERE <condition> AND {{ ga4.select_date_range(start_date, end_date, date_column) }}
```
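
As a rough sketch of how this could look inside a model: the model name `stg_ga4__events` and the `event_date_dt` column (taken to be a `DATE`) are assumptions for illustration, and the example assumes the macro renders its arguments into the SQL as given, so the bounds are passed as SQL expressions that evaluate to dates.

```sql
-- Hypothetical model; `stg_ga4__events` and `event_date_dt` are assumed names.
-- The bounds are SQL expressions, on the assumption that the macro
-- interpolates its arguments verbatim into the rendered query.
{% set start_date = "parse_date('%Y%m%d', '20240401')" %}
{% set end_date = "parse_date('%Y%m%d', '20240430')" %}

select
    event_name,
    count(*) as event_count
from {{ ref('stg_ga4__events') }}
where true
    and {{ ga4.select_date_range(start_date, end_date, 'event_date_dt') }}
group by event_name
```

Passing `none` for either bound would instead exercise the lookback branch of the macro.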



# Custom Events

15 changes: 15 additions & 0 deletions macros/schema.yml
@@ -0,0 +1,15 @@
version: 2

macros:
  - name: select_date_range
    description: Returns a where-clause condition that restricts date_column to a date range; falls back to the lookback_window variable when start_date or end_date is none
    arguments:
      - name: start_date
        type: string
        description: The start date used to filter date_column (format 'YYYYMMDD')
      - name: end_date
        type: string
        description: The end date used to filter date_column (format 'YYYYMMDD')
      - name: date_column
        type: string
        description: The name of the date column to filter on
7 changes: 7 additions & 0 deletions macros/select_date_range.sql
@@ -0,0 +1,7 @@
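{% macro select_date_range(start_date, end_date, date_column) %}
    {# Arguments are rendered as-is, so pass SQL expressions comparable to date_column. #}
    {% if start_date is not none and end_date is not none %}
        {{ date_column }} >= {{ start_date }} and {{ date_column }} <= {{ end_date }}
    {% else %}
        {# No explicit range: keep the last `lookback_window` days (project variable). #}
        {{ date_column }} >= CURRENT_DATE - {{ var("lookback_window") }}
    {% endif %}
{% endmacro %}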
10 changes: 9 additions & 1 deletion models/staging/base/base_ga4__events.sql
@@ -3,6 +3,11 @@
{% set partitions_to_replace = partitions_to_replace.append('date_sub(current_date, interval ' + (i+1)|string + ' day)') %}
{% endfor %}

{{ log("Running with start_date: " ~ var('start_date'), info=True) }}
{% if var('end_date', none) is not none %}
{{ log("Running with end_date: " ~ var('end_date'), info=True) }}
{% endif %}

{{
config(
pre_hook="{{ ga4.combine_property_data() }}" if var('combined_dataset', false) else "",
@@ -13,7 +18,7 @@
"data_type": "date",
},
partitions = partitions_to_replace,
- cluster_by=['event_name']
+ cluster_by=['event_name', 'stream_id']
)
}}

@@ -22,6 +27,9 @@ with source as (
{{ ga4.base_select_source() }}
from {{ source('ga4', 'events') }}
where cast(left(replace(_table_suffix, 'intraday_', ''), 8) as int64) >= {{var('start_date')}}
{% if var('end_date', none) is not none %}
and cast(left(replace(_table_suffix, 'intraday_', ''), 8) as int64) <= {{ var('end_date') }}
{% endif %}
{% if is_incremental() %}
and parse_date('%Y%m%d', left(replace(_table_suffix, 'intraday_', ''), 8)) in ({{ partitions_to_replace | join(',') }})
{% endif %}
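
For illustration only: assuming `start_date` is set to `20240401` and `end_date` to `20240430` (hypothetical values) on a non-incremental run, the suffix filter above would compile to roughly the following. The table name is a placeholder and the select list is abbreviated.

```sql
-- Hypothetical project and dataset; the literals stand in for
-- var('start_date') and var('end_date').
select *
from `my-project.analytics_123456.events_*`
where cast(left(replace(_table_suffix, 'intraday_', ''), 8) as int64) >= 20240401
  and cast(left(replace(_table_suffix, 'intraday_', ''), 8) as int64) <= 20240430
```

Both daily (`events_YYYYMMDD`) and intraday (`events_intraday_YYYYMMDD`) shards pass through the same comparison, because the `intraday_` prefix is stripped before the cast.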
2 changes: 2 additions & 0 deletions models/staging/stg_ga4__client_key_first_last_pageviews.sql
@@ -4,6 +4,7 @@

with page_views_first_last as (
select
stream_id,
client_key,
FIRST_VALUE(event_key) OVER (PARTITION BY client_key ORDER BY event_timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS first_page_view_event_key,
LAST_VALUE(event_key) OVER (PARTITION BY client_key ORDER BY event_timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_page_view_event_key
@@ -12,6 +13,7 @@ with page_views_first_last as (
),
page_views_by_client_key as (
select distinct
stream_id,
client_key,
first_page_view_event_key,
last_page_view_event_key
2 changes: 2 additions & 0 deletions models/staging/stg_ga4__derived_user_properties.sql
@@ -11,6 +11,7 @@ with events_from_valid_users as (
unnest_user_properties as
(
select
stream_id,
client_key,
event_timestamp
{% for up in var('derived_user_properties', []) %}
@@ -20,6 +21,7 @@ unnest_user_properties as
)

SELECT DISTINCT
stream_id,
client_key
{% for up in var('derived_user_properties', []) %}
, LAST_VALUE({{ up.event_parameter }} IGNORE NULLS) OVER (user_window) AS {{ up.user_property_name }}
6 changes: 4 additions & 2 deletions models/staging/stg_ga4__sessions_traffic_sources.sql
@@ -1,6 +1,7 @@
with session_events as (
select
- session_key
+ stream_id
+ ,session_key
,event_timestamp
,events.event_source
,event_medium
@@ -22,7 +23,8 @@ set_default_channel_grouping as (
),
session_source as (
select
- session_key
+ stream_id
+ ,session_key
,COALESCE(FIRST_VALUE((CASE WHEN event_source <> '(direct)' THEN event_source END) IGNORE NULLS) OVER (session_window), '(direct)') AS session_source
,COALESCE(FIRST_VALUE((CASE WHEN event_source <> '(direct)' THEN COALESCE(event_medium, '(none)') END) IGNORE NULLS) OVER (session_window), '(none)') AS session_medium
,COALESCE(FIRST_VALUE((CASE WHEN event_source <> '(direct)' THEN COALESCE(source_category, '(none)') END) IGNORE NULLS) OVER (session_window), '(none)') AS session_source_category
12 changes: 7 additions & 5 deletions models/staging/stg_ga4__sessions_traffic_sources_daily.sql
@@ -19,7 +19,8 @@

with session_events as (
select
- client_key
+ stream_id
+ ,client_key
,session_partition_key
,event_date_dt as session_partition_date
,event_timestamp
@@ -47,7 +48,8 @@ set_default_channel_grouping as (
),
first_session_source as (
select
- client_key
+ stream_id
+ ,client_key
,session_partition_key
,session_partition_date
,event_timestamp
@@ -69,8 +71,8 @@ find_non_direct_session_partition_key as (
from first_session_source
)

- select
- client_key
+ select stream_id
+ ,client_key
,session_partition_key
,session_partition_date
,session_source
@@ -83,4 +85,4 @@ select
,non_direct_session_partition_key
,min(event_timestamp) as session_partition_timestamp
from find_non_direct_session_partition_key
- group by 1,2,3,4,5,6,7,8,9,10,11
+ group by all
5 changes: 4 additions & 1 deletion models/staging/stg_ga4__user_id_mapping.sql
@@ -1,5 +1,6 @@
with events_with_user_id as (
select
stream_id,
user_id,
client_key,
event_timestamp
@@ -9,14 +10,16 @@ with events_with_user_id as (
),
include_last_seen_timestamp as (
select
stream_id,
user_id,
client_key,
max(event_timestamp) as last_seen_user_id_timestamp
from events_with_user_id
- group by 1,2
+ group by 1,2,3
),
pick_latest_timestamp as (
select
stream_id,
user_id as last_seen_user_id,
client_key,
last_seen_user_id_timestamp