Reproducible-example.Rmd

---
title: "Novel Randomized M&V Examples"
author:
  - name: Aoyu Zou
    email: aoyuzou@berkeley.edu
    affiliation: CBE
  - name: Paul Raftery
    email: p.raftery@berkeley.edu
    affiliation: CBE
address:
  - code: CBE
    address: Center for the Built Environment, UC Berkeley, 390 Wurster Hall, Berkeley, CA, 94720, USA
date: "`r Sys.Date()`"
output:
  #prettydoc::html_pretty:
  bookdown::word_document2:
    toc: true
    toc_depth: 2
    toc_float: true
    number_sections: true
    theme: paper
    highlight: tango
editor_options: 
  markdown: 
    wrap: 72
knit: (function(input, ...) {
    rmarkdown::render(
      input,
      output_dir = "./docs"
    )
  })
---

```{r setup, include = FALSE, cache = FALSE}

# knitr setup
knitr::opts_chunk$set(message = F,
                      cache = F,
                      echo = F,
                      warning = F,
                      dev = "jpeg",
                      dpi = 300,
                      fig.show = "hold",
                      fig.path = "figs/",
                      fig.pos = "b")

# str(knitr::opts_chunk$get()) # see for all options
```

# Objective

The purpose of this document is to provide open-source code for analysts
who wish to apply the novel measurement and verification (M&V) procedure
described in the associated paper to their own projects or datasets. By
following the guidance below, users should be able to:

1.  Replicate the novel M&V method proposed in the manuscript:
    -   Design a randomized and balanced switchback control
        implementation schedule consisting of a baseline strategy and an
        intervention strategy,
    -   Conduct sequential analysis to infer intervention effect on a
        target metric (e.g. energy consumption, carbon emissions, or
        thermal comfort).
2.  Demonstrate the reliability of the novel randomized method and
    compare it with the conventional method under:
    -   Scenario A: Discrete non-routine events
    -   Scenario B: Continuous baseline change

By replicating the method proposed in the manuscript, we aim to
demonstrate that the novel M&V approach can reach a conclusion faster
than the conventional method and provide a robust uncertainty
quantification. Additionally, by evaluating savings estimations under
various common scenarios, we showcase the reliability of this new method
in overcoming the limitations of conventional approaches (e.g. the
impact of changes in building performance that are wholely unrelated to
the intervention). This example code follows the flowchart depicted in
the manuscript and provides a minimal example for demonstration
purposes. The analyst should make reasonable adjustments to address
their specific use case.

```{r theme, message=FALSE}
require(pacman)

# load packages using pacman
pacman::p_load(tidyverse, lubridate, here, rmarkdown,  
               scales, patchwork, magrittr, janitor, qpcR, knitr, # general
               ggpmisc, # linear regression
               ggpubr,
               sjstats, pwr, # anova results
               nmecr, # m&v modeling package for TOWT
               slider, # moving averages
               sprtt, # sequential testing
               base, blocksdesign, # blocking
               rstatix, #pipe friendly stats
               overlapping, # distribution overlapping percentage
               effsize, bootES, dabestr) #effect size

# turn off scientific notation
options(scipen = 999, digits = 15)

# set directory
here::i_am("Reproducible-example.rmd")

# set default theme for ggplot
theme_set(theme_minimal())

# define base ggplot theme
theme_update(plot.title = element_text(size = 14, colour = "grey20", face = "bold", hjust = 0.5),
             plot.subtitle = element_text(size = 10, colour = "grey20", face = "italic", hjust = 0.5, margin = margin(b = 10)),
             plot.caption = element_text(size = 10, colour = "grey20", face = "italic", hjust = 0.5),
             plot.background = element_rect(fill = "white", colour = NA),
             panel.grid.minor = element_blank(),
             panel.grid.major = element_blank(),
             axis.text = element_text(size = 10),
             strip.text = element_text(size = 10, color = "grey20", face = "bold"),
             strip.background = element_blank())

# define global color theme (from palatte "Set2")
ls_colours <- c("Baseline" = "#99d8c9",
                "Measured baseline" = "#99d8c9", 
                "Adjusted baseline" = "#99d8c9",
                "Projected baseline\n(no change)" = "#1b9e77",
                "Intervention" = "#fdbb84",
                "Measured interv" = "#fdbb84", 
                "True savings" = "black", 
                "True normalized savings" = "black")

# Get case study example data
readfile_path <- "./readfiles/"

# Load defined functions
function_path <- "./functions/"
```

# M&V use case

We apply the same use case described in the manuscript where a
software-as-a-service control company offers a more efficient chiller
control algorithm. Here, the analyst aims to determine the amount of
HVAC energy saved by this control intervention. The overall M&V protocol
regarding sampling ratio, carryover effect, blocking design and stopping
criteria remain the same as the manuscript. To replicate the analysis,
we also fitted a time-of-week and temperature model and normalized the
savings using a typical meterological year weather file for the same
location. We compared the randomized method with the conventional method
using the average treatment effect (ATE) by calculating the mean
difference between baseline samples and intervention samples to estimate
the energy-saved by the control algorithm. For both the TMY-normalized,
and actual-weather savings, we also represent these as fractional
savings (or percent savings) by normalizing using the mean of the
baseline measurements.

# Use case measurement dataset

For this demonstration, we simply create a hypothetical baseline and
intervention measurement set by using a linear function including
outdoor temperature ($temp$), peak and off-peak hour indicator
($\delta_{peak}$, $\delta_{off-peak}$) as independent variables:

$$
Energy = \beta_0 + \beta_1*temp + \beta_2 * \delta_{peak} + \beta_3 * \delta_{off-peak} + \epsilon
$$

We apply that to two years of measured, actual weather data in Chicago,
USA. The $\epsilon$ term accounts for random noise sampled from a normal
distribution $N(0, 10^2)$. The created measurement set for baseline and
intervention controls is shown in figure \@ref(fig:plotdata). If the
analyst follows the conventional M&V method described in ASHRAE
Guideline 14 (G14) or the International Performance Measurement and
Verification Protocol (IPMVP), then only baseline measurements are
measurable/observable during the pre-retrofit period and only
intervention measurements are measurable during the post-retrofit
period. In other words, the control strategy switches between baseline
and intervention one time in the entire study period.

```{r readfiles, message=FALSE, warning=FALSE}

# load holidays
list_holidays <- read_csv(paste0(readfile_path, "us_holidays_2022.csv"),
                          col_names = c("date", "weekday", "holiday")) %>%
  mutate(date = mdy(date))

# load tmy weather file
df_tmy <- read_csv(paste0(readfile_path, "USA_IL_Chicago.Midway.Intl.AP.725340_TMY3.epw"),
                   skip = 8, col_types = "ddddd-d---------------------------------",
                   col_names = c("year", "month", "day", "hour", "min", "tmy")) %>%
  mutate(year = 2022,
         time = ymd_h(paste(paste(year, month, day, sep = "-"), hour, sep = " ")),
         temp = tmy) %>%
  dplyr::select(time, temp)

# load temperature from local weather station
df_weather <- read_csv(paste0(readfile_path, "weather.csv")) %>%
  mutate(datetime = with_tz(datetime_UTC, tz = "America/Chicago")) %>% 
  select(c(datetime, t_out)) %>% 
  mutate(across(t_out, ~ zoo::na.approx(., na.rm = FALSE))) %>%
  filter(datetime >= as.Date("2021-01-01")) %>% 
  filter(datetime < as.Date("2023-01-01")) %>% 
  unique() %>% 
  arrange(datetime)

# Create a sequence of hourly timestamps for one year
timestamps <- seq(from = as.POSIXct("2021-01-01 00:00:00"),
                  to = as.POSIXct("2022-12-31 23:00:00"),
                  by = "hour")

# Define peak hours (e.g., 12 PM to 6 PM)
peak_hours <- as.integer(format(timestamps, "%H") %in% c("12", "13", "14", "15", "16", "17"))

# Calculate non-peak hours
non_peak_hours <- 1 - peak_hours

# Baseline model parameters
intercept <- 45
beta_peak <- 20
beta_temp <- 0.9
beta_non_peak <- 5

# Compute the energy consumption
energy_consumption <- intercept +
                      beta_temp * df_weather$t_out +
                      beta_peak * peak_hours +
                      beta_non_peak * non_peak_hours +
                      rnorm(length(timestamps), mean = 0, sd = sqrt(100))

# Create a data frame
df_base <- data.frame(datetime = timestamps,
                      t_out = df_weather$t_out ,
                      power = energy_consumption) 

# Output the first few rows of the dataset
kable(head(df_base))

# Intervention model parameters
intercept <- 38
beta_temp <- 0.4
beta_peak <- 12
beta_non_peak <- 10

# Compute the energy consumption
energy_consumption <- intercept +
                      beta_temp * df_weather$t_out +
                      beta_peak * peak_hours +
                      beta_non_peak * non_peak_hours +
                      rnorm(length(timestamps), mean = 0, sd = sqrt(100))

# Create a data frame
df_interv <- data.frame(datetime = timestamps,
                        t_out = df_weather$t_out,
                        power = energy_consumption)

```

```{r plotdata, fig.cap="Generated hypothetical baseline and intervention measurements across the whole study period using a linear function (data points show hourly energy consumption and line fitted using the loess function)."}
df_base %>% 
  rename(Baseline = power, 
         base_t_out = t_out) %>% 
  left_join(df_interv %>% rename(Intervention = power), by = "datetime") %>% 
  pivot_longer(-c(datetime, base_t_out, t_out), names_to = "parameter", values_to = "value") %>% 
  ggplot(aes(x = t_out, y = value, color = parameter)) +
  geom_point(size = 0.1, alpha = 0.1) +
  geom_smooth() +
  scale_x_continuous(expand = c(0, 0), 
                     limits = c(-15, 35),
                     breaks = breaks_pretty(n = 5),
                     labels = number_format(suffix = " °C")) +
  scale_y_continuous(expand = c(0, 0), 
                     labels = number_format(suffix = " kW")) +
  scale_color_manual(values = ls_colours) +
  labs(x = NULL, 
       y = NULL,
       color = NULL, 
       title = "Created baseline and intervention measurements") +
  theme(panel.grid.major.y = element_line(colour = "grey80", size = 0.25),
        legend.direction = "horizontal",
        legend.position = "bottom",
        plot.margin = margin(t = 2, r = 7, b = 2, l = 2, unit = "mm"))
```

# Novel randomized M&V analysis

## Switchback experimental design

As most control intervention strategies can be implemented
interchangeably with the building baseline control, we argue in the
paper that randomizing the implementation sequence between intervention
and baseline control strategies can balance the impact of confounding
variables when estimating the intervention effect, find a result faster,
and be more robust to the occurrence of changes in performance that are
unrelated to the intervention. A simple way to get started is using a
built-in function '*sample()'* in R package *'base'* to randomly select
at each sampling interval which strategy to implement. The advantage of
this function is easy to implement for a short test run (e.g. 1 week) on
a single unit. But it is not recommended for more complicated tests, and
we will address these limitations below.

```{r simple_schedule, include=TRUE, message=FALSE}
# Define strategies and number of days
strategies <- c("Base", "Intervention")
num_weeks <- 4
days_per_week <- 7
total_days <- num_weeks * days_per_week

# Randomly assign strategies to each day
assigned_strategies <- sample(strategies, total_days, replace = TRUE)

# Convert the linear structure into a weekly structure for easier interpretation
weekly_assignment <- matrix(assigned_strategies, nrow = num_weeks, ncol = days_per_week)

# print(weekly_assignment[1, ])
```

In statistics, average treatment effect (ATE) estimation depends on the
statistical power of unbiased randomization and balanced sample size.
This means to accurately estimate the energy-saving effect of the
intervention control strategy, we should design an unbiased
randomization that has an equal amount of measurements for each level of
the confounding variables. Therefore, a simple randomization can still
be biased, for instance, by yielding a poor randomization where the
baseline - by chance - is sampled more often on weekends and vice versa.
This can be avoided by designing a block randomization, which is a type
of experimental design that first divides the testing units into blocks
and then randomly assigns which strategy to implement for each unit
within each block. Thus in practice, each block type corresponds to one
confounding variable, and since strategies are sampled at the block
level, this means each confounding variable is covered by all testing
strategies. For example, building energy consumption is known to be
affected by occupancy, time, and outdoor weather conditions. If the
control service company wishes to bill the customer based on the savings
solely from the control retrofit itself, the M&V analyst should
decompose the measured energy savings and adjust for the influence from
differences in weather.

For example in this document, we assume two strategies are being
compared over a blocking period of 12 weeks. To do block design, we use
a function in R called *'blocks()'* from the *'blocksdesign'* package.
We further bonded other functions and inputs to make the package more
versatile (see code in .rmd file). Once successfully called, the
function will return a list consisting of schedule output and schedule
summaries. The detailed schedule can be viewed by calling the `schedule`
key name.

```{r block, include=TRUE, message=FALSE}
source(paste0(function_path, "blocking.R")) # block design schedule generation function
schedule <- blocking(start_date = "2021-01-01",
                     n_weeks = 48, 
                     n_seasons = 4, 
                     seed = 390, 
                     searches = 20, 
                     jumps = 20, 
                     treatments = 2, 
                     consec = 1)

kable(head(schedule$schedule, n = 14))
```

To check whether the blocking experimental design is balanced, we can
call three built-in summary table outputs:

1.  Weekday summary: calculates how many days are sampled for each
    season weekday and strategy combination.
2.  Consecutive days summary: calculates how many consecutive days are
    sampled for each strategy.

```{r summaryweek}
kable(head(schedule$weekday_summary))
```

```{r summaryconsec}
kable(head(schedule$consec_summary))
```

## Sequential analysis of the intervention effect

This section replicates the sequential analysis described in the
manuscript using the newly created measurement dataset. The results are
shown in figure \@ref(fig:sprtresults).

```{r sprtdata}
# source(paste0(function_path, "saving_pred.R")) # annual saving estimation function
source(paste0(function_path, "ol_est.R")) # independent variable overlapping calculation
source(paste0(function_path, "saving_pred.R")) # annual saving estimation function
# Combine random sampling
df_schedule <- data.frame(strategy = schedule$schedule$strategy,
                          datetime = schedule$schedule$date)

df_base_sch <- df_base %>% 
  left_join(df_schedule, by = "datetime") %>% 
  fill(strategy, .direction = "down") %>% 
  filter(strategy == 1)

df_interv_sch <- df_interv %>% 
  left_join(df_schedule, by = "datetime") %>% 
  fill(strategy, .direction = "down") %>% 
  filter(strategy == 2)

df_hourly <- bind_rows(df_base_sch, df_interv_sch) %>% 
  arrange(datetime)

# make daily totals
df_daily <- df_hourly %>%
  group_by(datetime = floor_date(datetime, unit = "day")) %>%
  summarise(strategy = unique(strategy),
            power_ave = mean(power, na.rm = TRUE),
            power_peak = max(power, na.rm = TRUE),
            t_out = mean(t_out, na.rm = TRUE)) %>%
  ungroup()
```

```{r, sprtrun, cache=F}
# prepare sequential test dataset
sprt_hourly <- df_hourly %>%
  mutate(week = interval(min(datetime), datetime) %>% as.numeric('weeks') %>% floor())

# set analysis parameters
param_sprt <- list(baseline = "Baseline",
                   strategy = "Intervention",
                   parameter = "power_ave", # choice: power_ave; mmoer; co2
                   label = "power", # choice: power; emissions
                   n_weeks = 48)

# prepare dataframe for analysis
df_sprt <- df_daily %>%
  mutate(strategy = as.factor(strategy),
         strategy = recode_factor(strategy, "1" = "Baseline", "2" = "Intervention")) %>%
  filter(strategy %in% c(param_sprt$baseline, param_sprt$strategy)) %>%
  pivot_longer(cols = -c(datetime, strategy), 
               names_to = "parameter", 
               values_to = "value") %>%
  mutate(week = interval(min(datetime), datetime) %>% as.numeric('weeks') %>% floor(),
         value = value) %>%
  filter(str_detect(parameter, param_sprt$parameter)) %>%
  droplevels()


# define list to store weekly means
df_means <- list()

# calculate weekly means
for (i in 1:param_sprt$n_weeks) {
  
  # subset data by week
  df_means[[i]] <- df_sprt %>%
    filter(week <= i) %>%
    group_by(strategy, 
             parameter) %>%
    summarise(week = i,
              value_ave = mean(value, na.rm = TRUE),
              value_sd = sd(value, na.rm = TRUE), 
              .groups = "keep") %>%
    ungroup()

}

# combine list into df
df_means <- bind_rows(df_means)

# define lists to store stopping criteria results
## SPRT results
sprt_res <- list()

## 80% independent variable
overlap_base <- list()
overlap_s2 <- list()
quantile_tmy <- quantile(df_tmy$temp, na.rm = TRUE, c(0, 0.99))

## annual saving estimation
annual_saving <- list()

# do sample
for (i in 2:param_sprt$n_weeks) {
  
  # subset data by week
  df_seq <- df_sprt %>%
    filter(week <= i) %>%
    droplevels()
  
 # Calculate overlapping temperature range
  overlap_base[[i]] <- tibble("n_weeks" = i, 
                              overlap_base = sprt_hourly %>%
                                filter(week <= i & strategy == 1) %>% 
                                ol_est(., quantile_tmy))
  
  overlap_s2[[i]] <- tibble("n_weeks" = i, 
                            overlap_s2 = sprt_hourly %>%
                              filter(week <= i & strategy == 2) %>% 
                              ol_est(., quantile_tmy))
  
 # start estimation after 2 months for complete ftow profile
  if (i/4 > 2) {
    
    # Update user on the week of update
    # print(paste0("updating TOWT model in week ", i))
    annual_saving[[i]] <- tibble("n_weeks" = i,
                                 annual_saving = saving_pred(sprt_hourly %>% filter(week <= i)))

  }
  
  # do test
  results_seq <- seq_ttest(x = value ~ strategy, 
                           data = df_seq,
                           d = 0.5, 
                           power = 0.9, 
                           alternative = "less", # greater
                           paired = FALSE,
                           verbose = TRUE)
  
  # calculate effect size
  results_ci <- effsize::cohen.d(d = df_seq$value,
                                 f = df_seq$strategy,
                                 conf.level = 0.90,
                                 paired = FALSE,
                                 na.rm = TRUE)
  
  # update user
  # print(paste0("Calculating effect size for week ", i))
  
  # bootstrap effect size
  results_bs <- bootES::bootES(data = df_seq, R = 1000, 
                               contrast = c(param_sprt$baseline, param_sprt$strategy),
                               data.col = "value", group.col = "strategy")
  
  # extract test statistic
  sprt_res[[i]] <- tibble("n_weeks" = i, 
                          "threshold_lower" = exp(results_seq@B_boundary_log),
                          "threshold_upper" = exp(results_seq@A_boundary_log),
                          "statistic" = results_seq@likelihood_ratio,
                          "decision" = results_seq@decision,
                          "cohens_d" = round(results_ci$estimate, digits = 2),
                          "ci_low" = round(results_ci$conf.int[[1]], digits = 2),
                          "ci_high" = round(results_ci$conf.int[[2]], digits = 2),
                          "ns_stat" = round(results_bs$t0, digits = 2),
                          "ns_ci_low" = round(results_bs$bounds[[1]], digits = 2),
                          "ns_ci_high" = round(results_bs$bounds[[2]], digits = 2))
  
}    

# combine list and vectors into df
annual_saving <- bind_rows(annual_saving)

# work out when threshold is reached
sprt_res <- bind_rows(sprt_res)

sprt_res <- sprt_res %>%
  left_join(., annual_saving, by = "n_weeks") %>%
  mutate(flag = ifelse(str_detect(decision, "accept"), 1, 0))

sprt_overlap_base <- bind_rows(overlap_base) %>% 
  left_join(., annual_saving, by = "n_weeks") %>% 
  mutate(flag = ifelse(overlap_base >= 0.8, 1, 0),
         flag = ifelse(flag == lag(flag, 1), 0, flag))

sprt_overlap_s2 <- bind_rows(overlap_s2) %>% 
  left_join(., annual_saving, by = "n_weeks") %>% 
  mutate(flag = ifelse(overlap_s2 >= 0.8, 1, 0),
         flag = ifelse(flag == lag(flag, 1), 0, flag))

```

```{r truesaving}
df_week <- df_base %>% 
  select(datetime, 
         base_power = power) %>% 
  left_join(df_interv, by = "datetime") %>% 
  mutate(savings = power - base_power) %>% 
  mutate(week = interval(min(datetime), datetime) %>% as.numeric('weeks') %>% floor()) %>% 
  filter(week <= 48)

true_saving <- list()

for (i in 2:48){
  saving <- df_week %>% 
    filter(week <= i)
  
  true_saving[[i]] <- tibble("n_weeks" = i, 
                             savings = mean(saving %>% .$saving))
}

true_saving <- bind_rows(true_saving)

# Create a sequence of hourly timestamps for one year
timestamps <- seq(from = as.POSIXct("2022-01-01 01:00:00"),
                  to = as.POSIXct("2023-01-01 00:00:00"),
                  by = "hour")

# Define peak hours (e.g., 12 PM to 6 PM)
peak_hours <- as.integer(format(timestamps, "%H") %in% c("12", "13", "14", "15", "16", "17"))

# Calculate non-peak hours
non_peak_hours <- 1 - peak_hours

# Baseline model parameters
intercept <- 45
beta_peak <- 20
beta_temp <- 0.9
beta_non_peak <- 5

# Compute the energy consumption
energy_consumption <- intercept +
                      beta_temp * df_tmy$temp +
                      beta_peak * peak_hours +
                      beta_non_peak * non_peak_hours

# Create a data frame
df_base_tmy <- data.frame(datetime = timestamps,
                      t_out = df_tmy$temp,
                      power = energy_consumption) 

# Intervention model parameters
intercept <- 38
beta_temp <- 0.4
beta_peak <- 12
beta_non_peak <- 10

# Compute the energy consumption
energy_consumption <- intercept +
                      beta_temp * df_tmy$temp +
                      beta_peak * peak_hours +
                      beta_non_peak * non_peak_hours

# Create a data frame
df_interv_tmy <- data.frame(datetime = timestamps,
                        t_out = df_tmy$temp,
                        power = energy_consumption)

df_hourly_tmy <- df_base_tmy %>%
  select(datetime,
         base_power = power) %>% 
  left_join(df_interv_tmy, by = "datetime")

tmy_saving <- (mean(df_hourly_tmy %>% .$base_power) - 
               mean(df_hourly_tmy %>% .$power)) /
               mean(df_hourly_tmy %>% .$base_power)
```

```{r sprtresults, fig.height=10, fig.width=7.5, fig.cap="Comprehensive results summary of the proposed M&V method applied to the case study throughout a total of 48 weeks"}
p2 <- ggplot(sprt_res, aes(x = n_weeks, y = ns_stat)) +
  geom_ribbon(aes(ymin = ns_ci_low, ymax = ns_ci_high), alpha = 0.5, fill = "#D0E3F1") +
  geom_line(size = 1.25, colour = "grey20", alpha = 0.4) +
  geom_line(data = true_saving, aes(x = n_weeks, y = savings, color = "True savings"), 
             linetype = "dashed") +
  geom_text(data = sprt_res[2, ], 
            aes(x = n_weeks, y = ns_ci_high, label = "Upper 90% CI"),
            position = position_nudge(x = 1), 
            colour = "grey20", 
            size = 2.5, 
            fontface = "italic") +
  geom_text(data = sprt_res[2, ], 
            aes(x = n_weeks, y = ns_ci_low, label = "Lower 90% CI"),
            position = position_nudge(x = 1), 
            colour = "grey20", 
            size = 2.5, 
            fontface = "italic") +
  geom_point(data = sprt_res %>% filter(n_weeks%%12 == 0), 
             size = 8, 
             shape = 16, 
             colour = "#347EB3", 
             alpha = 0.5) +
  geom_point(data = sprt_res %>% filter(n_weeks%%12 == 0), 
             size = 4, 
             shape = 16, 
             colour = "#347EB3", 
             alpha = 0.9) +
  geom_text(data = sprt_res %>% filter(n_weeks%%12 == 0), 
            aes(x = n_weeks, y = ns_stat, label = paste0(round(ns_stat, 1L), " kW")), 
            position = position_nudge(x = 2.5), 
            colour = "grey20", 
            size = 3.0, 
            fontface = "italic") +
  geom_errorbar(data = sprt_res %>% filter(n_weeks%%12 == 0), 
                aes(x = n_weeks, ymin = ns_ci_low, ymax = ns_ci_high), 
                width = 1, 
                colour = "#347EB3", 
                alpha = 0.5, 
                size = 0.8) +
  geom_text(data = sprt_res %>% filter(n_weeks%%12 == 0), 
            aes(x = n_weeks, y = ns_ci_high, label = paste0(round(ns_ci_high, 1L), " kW")), 
            position = position_nudge(x = 2.5), 
            colour = "grey20", 
            size = 2.5, 
            fontface = "italic") +
  geom_text(data = sprt_res %>% filter(n_weeks%%12 == 0), 
            aes(x = n_weeks, y = ns_ci_low, label = paste0(round(ns_ci_low, 1L), " kW")), 
            position = position_nudge(x = 2.5), 
            colour = "grey20", 
            size = 2.5, 
            fontface = "italic") +
  geom_text(data = sprt_res %>% filter(n_weeks%%12 == 0 & flag == 1),
            aes(x = n_weeks, y = -16, label = "Difference\nfound"), 
            colour = "grey20", 
            size = 3.0, 
            fontface = "italic") +
  geom_text(data = slice_tail(sprt_res, n = 1), 
            aes(x = n_weeks, y = ns_stat, label = paste0(round(ns_stat, 1L), " kW")), 
            position = position_nudge(x = 0.5), 
            colour = "grey20", 
            size = 3.0, 
            check_overlap = TRUE, 
            hjust = 0) +
  geom_text(data = slice_tail(sprt_res, n = 1), 
            aes(x = ifelse(is.na(first(flag)), 0, n_weeks), 
                y = ns_ci_high, 
                label = ifelse(is.na(first(flag)), NULL, paste0(round(ns_ci_high, 1L), " kW"))), 
            position = position_nudge(x = 0.5), 
            colour = "grey20", 
            size = 2.5, 
            check_overlap = TRUE, 
            hjust = 0) +
  geom_text(data = slice_tail(sprt_res, n = 1), 
            aes(x = ifelse(is.na(first(flag)), 0, n_weeks), 
                y = ns_ci_low, 
                label = ifelse(is.na(first(flag)), NULL, paste0(round(ns_ci_low, 1L), " kW"))), 
            position = position_nudge(x = 0.5), 
            colour = "grey20", 
            size = 2.5, 
            check_overlap = TRUE, 
            hjust = 0) +
  geom_vline(xintercept = seq(0, 50, by = 12), 
             linetype = "dashed", 
             color = "grey20", 
             alpha = 0.3, 
             size = 0.5) +
  annotate(geom = "text", 
           x = seq(5, 50, by = 12), 
           y = 0, 
           label = paste0("12-week block"), 
           alpha = 0.5, 
           size = 3) +
  scale_x_continuous(expand = c(0, 0), 
                     limits = c(1, param_sprt$n_weeks + 0.5),
                     labels = number_format(accuracy = 1L, suffix = "\nweeks")) +
  scale_y_continuous(expand = c(0, 0), 
                     limits = c(-16, 0),
                     breaks = pretty_breaks(n = 3),
                     labels = number_format(suffix = " kW")) +
  labs(title = NULL, 
       subtitle = "SPRT results and estimated difference in power consumption without weather normalization",
       x = NULL, 
       y = NULL, 
       color = NULL) +
  scale_color_manual(values = ls_colours) +
  guides(alpha = "none") +
  coord_cartesian(clip = "off") +
  theme(panel.grid.major.y = element_line(colour = "grey80", size = 0.25),
        legend.position = "bottom",
        axis.text.x = element_blank(),
        plot.margin = margin(3, 15, 3, 3, unit = "mm"))

p4 <- ggplot(sprt_overlap_base) +
  geom_line(aes(x = n_weeks, y = overlap_base, color = "Baseline"), 
            size = 1.25) +
  geom_line(data = sprt_overlap_s2, 
            aes(x = n_weeks, y = overlap_s2, color = "Intervention"),
            size = 1.25) +
  geom_text(data = sprt_overlap_base[nrow(sprt_overlap_base), ],
            aes(x = n_weeks, y = 0.8, label = "80% of\nTMY range\nthreshold"),
            position = position_nudge(x = 0.5), 
            color = "red", 
            size = 3.0,
            check_overlap = TRUE,
            hjust = 0) +
  geom_hline(yintercept = 0.8, linetype = "dashed", color = "red") +
  scale_color_manual(values = ls_colours) +
  scale_x_continuous(expand = c(0, 0), 
                     limits = c(1, param_sprt$n_weeks + 0.5),
                     labels = number_format(accuracy = 1L, suffix = "\nweeks")) +
  scale_y_continuous(breaks = seq(0.2, 1.0, by = 0.2), 
                     labels = c("20%", "40%", "60%", "80%", "100%")) +
  labs(title = NULL, 
       subtitle = "Confounding variable (outdoor drybulb temperature) range coverage",
       x = NULL, 
       y = NULL)  +
  coord_cartesian(clip = "off") +
  theme(panel.grid.major.y = element_line(),
        axis.text.x = element_blank(),
        legend.position = "none",
        plot.margin = margin(3, 15, 3, 3, unit = "mm"))

# do running mean plot
p1 <- ggplot(df_means, aes(x = week, y = value_ave)) +
  geom_line(aes(colour = strategy), size = 1.25) +
  geom_text(data = filter(df_means, 
                          week == param_sprt$n_weeks, 
                          parameter == param_sprt$parameter), 
            aes(x = param_sprt$n_weeks, 
                y = value_ave, 
                colour = strategy, 
                label = strategy), 
            position = position_nudge(x = 0.5), 
            size = 3.0, 
            check_overlap = TRUE, 
            hjust = 0) +
  scale_x_continuous(expand = c(0, 0), 
                     limits = c(1, param_sprt$n_weeks + 0.5)) +
  scale_y_continuous(expand = c(0, 0), 
                     breaks = breaks_pretty(n = 4),
                     labels = number_format(suffix = " kW")) +
  scale_color_manual(values = ls_colours) +
  labs(title = NULL, 
       subtitle = "Running average power consumption of the case study building",
       x = NULL, 
       y = NULL) +
  guides(alpha = "none", colour = "none") +
  coord_cartesian(clip = "off") +
  theme(panel.grid.major.y = element_line(colour = "grey80", size = 0.25),
        axis.text.x = element_blank(),
        plot.margin = margin(3, 15, 3, 3, unit = "mm"))

p3 <- ggplot(annual_saving) +
  geom_line(aes(x = n_weeks, y = annual_saving), 
            size = 1.25, color="darkgrey") +
  geom_hline(aes(yintercept = tmy_saving * 100, color = "True normalized savings"), 
             linetype = "dashed") +
  geom_point(data = annual_saving[1, ], 
             aes(x = n_weeks, y = annual_saving), 
             size = 4,
             alpha = 0.8, 
             shape = 16, 
             colour = "#347EB3") +
  geom_text(data = annual_saving[1, ], 
            aes(x = n_weeks - 3, 
                y = annual_saving, 
                label = paste0("Estimation start\nweek ", n_weeks)), 
            colour = "grey20", size = 2.5, fontface = "italic") +
  geom_point(data = sprt_res %>% filter(flag == 1), 
             aes(x = ifelse(is.null(first(flag)), max(sprt_res$n_weeks), n_weeks), 
                 y = ifelse(is.null(first(flag)), max(sprt_res$annual_saving), annual_saving)), 
             size = 4,
             alpha = 0.8, 
             shape = 16, 
             colour = "#347EB3") +
  geom_point(data = sprt_overlap_base %>% filter(flag == 1), 
             aes(x = ifelse(is.null(first(flag)), max(sprt_overlap_base$n_weeks), n_weeks), 
                 y = ifelse(is.null(first(flag)), max(sprt_overlap_base$annual_saving), annual_saving)), 
             size = 4,
             alpha = 0.8, 
             shape = 16, 
             colour = "#347EB3") +
  geom_text(data = sprt_overlap_base %>% filter(flag == 1), 
            aes(x = ifelse(is.null(first(flag)), max(sprt_overlap_base$n_weeks), n_weeks), 
                y = ifelse(is.null(first(flag)), max(sprt_overlap_base$annual_saving), annual_saving) - 3, 
                label = paste0("Temperature check\n(Baseline)\nweek ", n_weeks)), 
            colour = "grey20", 
            size = 2.5, 
            fontface = "italic",
            check_overlap = TRUE) +
  geom_point(data = sprt_overlap_s2 %>% filter(flag == 1), 
             aes(x = ifelse(is.null(first(flag)), max(sprt_overlap_s2$n_weeks), n_weeks), 
                 y = ifelse(is.null(first(flag)), max(sprt_overlap_s2$annual_saving), annual_saving)), 
             size = 4,
             alpha = 0.8, 
             shape = 16, 
             colour = "#347EB3") +
  geom_text(data = sprt_overlap_s2 %>% filter(flag == 1), 
            aes(x = ifelse(is.null(first(flag)), max(sprt_overlap_s2$n_weeks), n_weeks), 
                y = ifelse(is.null(first(flag)), max(sprt_overlap_s2$annual_saving), annual_saving) + 3, 
                label = paste0("Temperature check\n(Intervention)\nweek ", n_weeks)), 
            colour = "grey20", size = 2.5, fontface = "italic") +
  geom_point(data = annual_saving %>% filter(n_weeks == 12), 
             aes(x = n_weeks, y = annual_saving), 
             size = 4,
             alpha = 0.8, 
             shape = 16, 
             colour = "#347EB3") +
  geom_text(data = annual_saving %>% filter(n_weeks == 12),
            aes(x = n_weeks, y = annual_saving + 4,label = paste0("Minimum\ntest length check\nweek ", n_weeks)),
            colour = "grey20", 
            size = 2.5, 
            fontface = "italic")+
  geom_point(data = annual_saving %>% filter(n_weeks == 24), 
             aes(x = n_weeks, y = annual_saving), 
             size = 4,
             alpha = 0.8, 
             shape = 16, 
             colour = "#347EB3") +
  geom_text(data = annual_saving %>% filter(n_weeks == 24),
            aes(x = n_weeks, y = annual_saving - 3, label = paste0("End of block period\nall stopping criteria met\nweek ", n_weeks)),
            colour = "grey20", 
            size = 2.5, 
            fontface = "italic")+
  geom_text(data = sprt_res %>% filter(n_weeks%%12 == 0 & flag == 1) %>% slice(1),
            aes(x = n_weeks, y = annual_saving - 4, label = paste0("SPRT check\nweek ", n_weeks)), 
            colour = "grey20", 
            size = 2.5, 
            fontface = "italic") +
  geom_vline(xintercept = seq(0, 50, by = 12), 
             linetype = "dashed", 
             color = "grey20", 
             alpha = 0.3, 
             size = 0.5) +
  annotate(geom = "text", 
           x = seq(5, 50, by = 12), 
           y = 24, 
           label = paste0("12-week block"), 
           alpha = 0.5, 
           size = 3) +
   geom_text(data = annual_saving[nrow(annual_saving), ], 
            aes(x = n_weeks, y = annual_saving, label = paste0(round(annual_saving, 0), "%")), 
            position = position_nudge(x = 0.5), 
            colour = "grey20", 
            size = 3.0, 
            check_overlap = TRUE, 
            hjust = 0) +
  scale_x_continuous(expand = c(0, 0), 
                     breaks = seq(0, 48, by = 12), 
                     limits = c(1, param_sprt$n_weeks + 0.5),
                     labels = number_format(accuracy = 1L, suffix = "\nweeks")) +
  scale_y_continuous(breaks = seq(0, 25, by = 5), 
                     limits = c(10, 24),
                     labels = c("0%", "5%", "10%", "15%", "20%", "25%")) +
  scale_color_manual(values = ls_colours) +
  coord_cartesian(clip = "off") +
  labs(title = NULL, 
       subtitle = "Normalized annual energy savings estimate using TOWT model predictions for a typical meteorological year",
       x = NULL, 
       y = NULL, 
       color = NULL)  +
  theme(panel.grid.major.y = element_line(),
        legend.position = "bottom",
        plot.margin = margin(3, 15, 3, 3, unit = "mm"))

# Overall result
p1 / p2 / p4 / p3 +
  plot_annotation(title = "Overall sequential evaluation results using pre-defined stopping criteria") +
  plot_annotation(tag_levels = c('a'), tag_suffix = ')') &
  theme(plot.tag.position = c(0, 1),
        plot.tag = element_text(color="black"))
```

As figure \@ref(fig:sprtresults) indicates, the running average of power
consumption in subplot a) shows more savings from the intervention
strategy during warmer outdoor weather in general. This also corresponds
to the sequential probability ratio test (SPRT) results shown in subplot
b). The energy savings as the difference between baseline samples and
intervention samples were estimated over actual weather encountered
during the study. Subplot b) also shows a minimal detectable saving was
observed early in week 12. It also highlights how consistently close the
estimated interventon effect is to the true effect (without the effect
of noise) throughout the study period. However, subplot c) indicates
that outdoor weather has not reached the 80% threshold range calculated
with reference to the typical meteorological year (TMY) weather file, we
need to continue the M&V process in order to be able to normalize to a
full range of weather conditions. At week 16, the accumulated
temperature measured in sampled baseline days covered 80% range and at
week 24, sampled intervention days satisfied the criterion. Therefore,
we report a 10 kW savings at the end of this blocking period (i.e. week
24). If the analyst decides to continue sampling, the plot further shows
the uncertainty associated with saving estimation further decreased
throughout the remaining 24 weeks. Subplot d) plots the sequentially
estimated normalized savings on the TMY weather conditions and outlines
the check-point for satisfying each critical stopping criterion. It
shows although the saving was detected early by the statistical test at
week 12, the normalized annual estimation still largely fluctuates at
the point. After covered sufficient weather conditions, the annualized
savings stabilize at 18% which corresponds to the true TMY normalized
savings as shown in subplot d).

Considering now the scenario where the analyst had applied conventional
M&V following ASHRAE Guideline 14 or IPMVP, it would take 2 years to
reach the same result (12 month baseline, 12 month intervention). The
point at which the new method was able to accurately estimate TMY
normalized savings was week 24. Using conventional M&V methods, this
would merely be the mid-point of baseline collection assuming no
interruptions or delays and they would need to wait for another one year
and half to estimate weather normalized savings. Even if the analyst
performed conventional M&V using a 6 month baseline and 6 month
intervention period (1 year total), the result would be less accurate
and still take longer than the method proposed above. Attempting to
perform conventional M&V using less than that (e.g., 24 weeks) would
yield nonsensical data as the baseline and intervention periods would
span very different weather data ranges. Yet this randomized M&V method
was able to achieve a valid, robust result in this time.

## Scenario A: A discrete non-routine event

One common and typical non-routine event in an energy-saving M&V project
is a change in occupancy, or a substantial change in occupant behavior.
This is because occupancy has a substantial influence on building energy
consumption but it is typically not closely related to intervention. In
reality, it is very rare to have occupancy measurements (compared to
other independent variables such as outdoor temperature and therefore)
when a analyst fits a regression model (e.g. TOWT), thus, the analyst
normally must assume that occupancy remains unchanged throughout the
study. ASHRAE G14 and IPMVP only require the analyst to check for the
coefficient of the variation of the root mean square error (CVRMSE) and
normalized mean square error (NMSE) of a fitted baseline model. Thus, if
occupancy changes after the retrofit implementation, the analyst can
still obtain an 'accurate' baseline model with qualified CV(RMSE) but it
is biased due to the unrelated change event being overlooked. The novel
M&V method, on the other hand, resolves this by sampling randomly and
frequently throughout the full distribution of occupancy over the full
study period, and thus the results are robust to the change in
occupancy.

Consider a scenario (Scenario A), where a tenant in a high rise office
building moves out during the M&V study period, vacating one floor of
the building. We expect the impact of non-routine events such as these
to be balanced in sampled baseline days and intervention days if we
sample randomly with equal probability. This means if occupancy is
notably different over say, a 4 week period, the randomized M&V will
sample approximately 2 weeks with baseline strategy and 2 weeks with
intervention strategy. Therefore, the calculated energy savings can
still represent the intervention's true effect. Under conventional M&V,
either the baseline or the intervention estimate will be affected by
this difference in occupancy. To illustrate one example of this
scenario, we modeled a situation where the 12-month baseline occurs
under stable occupancy, and after the retrofit implementation, the
energy consumption decreased by 10 kW for each metered interval from
April to September because of decreased occupancy (i.e. tenants moving
out leaving one floor vacant) and that the analyst was not aware of this
change in occupancy.

```{r, SA}

# Generate schedule for the post-intervention period
schedule <- blocking(start_date = "2022-01-01",
                     n_weeks = 60, 
                     n_seasons = 5, 
                     seed = 390, 
                     searches = 20, 
                     jumps = 20, 
                     treatments = 2, 
                     consec = 1)

df_schedule <- data.frame(strategy = schedule$schedule$strategy,
                          datetime = schedule$schedule$date)

df_base_sch <- df_base %>% 
  left_join(df_schedule, by = "datetime") %>% 
  fill(strategy, .direction = "down") %>%
  filter(datetime >= as.Date("2022-01-01")) 

base_pre <- df_base %>% 
  filter(datetime < as.Date("2022-01-01"))

df_interv_sch <- df_interv %>% 
  left_join(df_schedule, by = "datetime") %>% 
  fill(strategy, .direction = "down") %>%
  filter(datetime >= as.Date("2022-01-01")) 

# Apply occupancy changes
df_base_alt <- df_base_sch %>%
  mutate(power = ifelse(month(datetime) >= 4 & month(datetime) < 10, power - 10, power))

df_interv_alt <- df_interv_sch %>% 
  mutate(power = ifelse(month(datetime) >= 4 & month(datetime) < 10, power - 10, power))

df_hourly_sch <- bind_rows(df_base_alt %>% filter(strategy == 1), df_interv_alt %>% filter(strategy == 2)) %>% 
  arrange(datetime) %>% 
  mutate(power = ifelse(month(datetime) >= 4 & month(datetime) < 10, power - 10, power))

# Baseline projection
model_input_options <- nmecr::assign_model_inputs(regression_type = "TOWT", occupancy_threshold = 0.65)

baseline_train <- base_pre %>%
  select(time = datetime, 
         eload = power, 
         temp = t_out)

df_towt <- df_weather %>% 
  filter(datetime >= as.Date("2022-01-01")) %>% 
  rename(time = datetime, 
         temp = t_out) %>% 
  mutate(eload = NA)

# do baseline model to estimate occupancy
towt_base <- nmecr::model_with_TOWT(training_data = baseline_train,
                                    model_input_options = model_input_options)

# Add occupancy info to model specification
base_model_input_options <- map(model_input_options, ~.) %>% 
  append(list(find_occ_unocc(training_data = baseline_train, model_input_options = towt_base$model_input_options))) %>% 
  set_names(., c(names(model_input_options), "occupancy_info"))

# re-estimate baseline with occupancy
towt_base <- nmecr::model_with_TOWT(training_data = baseline_train,
                                    model_input_options = base_model_input_options)

baseline_results <- model_pred(amod = towt_base$model_occupied, 
                               bmod = towt_base$model_unoccupied,
                               time = df_towt$time, 
                               temp = df_towt$temp,
                               temp_knots = towt_base$model_input_options$calculated_temp_knots, 
                               dataframe = df_towt,
                               occ_info = towt_base$model_input_options$occupancy_info,
                               interval_minutes = towt_base$model_input_options$interval_minutes) %>%
  rename("datetime" = "time") %>% 
  mutate("datetime" = force_tz(datetime, tzone = "America/Chicago"))
```

```{r SAplot, fig.cap="Energy measurement and modeling results for a 2-year conventional M&V project. Left: 500 random samples of measured baseline. Right: 500 random samples of measured intervention, projected baseline from modeling results without considering occupancy change and adjusted baseline measurement for occupancy change. (All lines fitted using loess function)", fig.height=4, fig.width=8}

# Visualization

p1 <- ggplot() +
  geom_point(data = base_pre %>% 
               slice_sample(n = 500), 
             aes(x = datetime, y = power, color = "Measured baseline"), 
             size = 0.5, 
             alpha = 0.3) +
  geom_smooth(data = base_pre, 
              aes(x = datetime, y = power, color = "Measured baseline")) +
  scale_x_datetime(date_breaks = "2 months",
                   date_labels = "%b")  +
  scale_y_continuous(expand = c(0, 0), 
                     breaks = breaks_pretty(n = 3),
                     labels = number_format(suffix = " kW")) +
  coord_cartesian(clip = "on", ylim = c(30, 90)) +
  scale_color_manual(values = ls_colours) +
  labs(x = NULL, 
       y = NULL,
       color = NULL, 
       title = NULL,
       subtitle = "Pre-retrofit period") +
  theme(panel.grid.major.y = element_line(colour = "grey80", size = 0.25),
        legend.direction = "horizontal",
        legend.position = "bottom",
        plot.margin = margin(t = 2, r = 7, b = 2, l = 2, unit = "mm"))

p2 <- ggplot() +
  geom_point(data = baseline_results %>% 
               slice_sample(n = 500),
             aes(x = datetime, y = towt, color = "Projected baseline\n(no change)"), 
             size = 0.5, 
             alpha = 0.3) +
  geom_smooth(data = baseline_results, 
              aes(x = datetime, y = towt, color = "Projected baseline\n(no change)")) +
  geom_point(data = df_base_alt %>% 
               slice_sample(n = 500), 
             aes(x = datetime, y = power, color = "Adjusted baseline"), 
             size = 0.5, 
             alpha = 0.3) +
  geom_smooth(data = df_base_alt, 
              aes(x = datetime, y = power, color = "Adjusted baseline")) +
  geom_point(data = df_interv_alt %>% 
               slice_sample(n = 500),
             aes(x = datetime, y = power, color = "Measured interv"), 
             size = 0.5, 
             alpha = 0.3) +
  geom_smooth(data = df_interv_alt, 
              aes(x = datetime, y = power, color = "Measured interv")) +
  scale_x_datetime(date_breaks = "2 months",
                   date_labels = "%b")  +
  scale_y_continuous(expand = c(0, 0), 
                     breaks = breaks_pretty(n = 3),
                     labels = number_format(suffix = " kW")) +
  coord_cartesian(clip = "on", ylim = c(30, 90)) +
  scale_color_manual(values = ls_colours) +
  labs(x = NULL, 
       y = NULL,
       color = NULL, 
       title = NULL,
       subtitle = "post-retrofit period") +
  theme(panel.grid.major.y = element_line(colour = "grey80", size = 0.25),
        axis.text.y = element_blank(),
        legend.direction = "horizontal",
        legend.position = "bottom",
        plot.margin = margin(t = 2, r = 7, b = 2, l = 2, unit = "mm"))

p1 + p2 +
  plot_annotation(title = "Building energy consumption over a 2-year period") +
  plot_annotation(tag_levels = c('a'), tag_suffix = ')') &
  theme(plot.tag.position = c(0, 1),
        plot.tag = element_text(color="black"))
```

Figure \@ref(fig:SAplot) shows that if the analyst follows the steps
described in G14 and IPMVP, they would start baseline measurements 12
months prior to the control retrofit implementation (subplot a) and fit
an outdoor weather and time-dependent regression model using the
collected metered data as the baseline. In other words, the model is
only valid for stable occupancy. Then, during the intervention period,
an unexpected change in occupancy occurs, which makes the baseline model
no longer representative, but without a reasonable means of adjusting
it. Here, conventional M&V would overestimate the savings by the
intervention.

Instead, if the analyst follows the steps of the novel randomized M&V
described in the manuscript, they would configure both control
strategies in the BAS and switch back and forth between them depending
on a pre-determined, randomly selected schedule. Figure
\@ref(fig:SArand) shows that even if the M&V test period spans the
period during which occupancy changes, and the analyst has no prior
knowledge about when and how long the occupancy has changed, this method
will still find a reasonable estimate of the true effect of the
intervention. This is because it is randomly and frequently sampling
over the full distribution of occupancy experienced by the building.

```{r SArand, fig.cap="Averaged energy measurements at sampling interval (daily) using novel M&V method", fig.width=8, fig.height=4}
# make daily totals
df_daily <- df_hourly_sch %>%
  group_by(datetime = floor_date(datetime, unit = "day")) %>%
  summarise(strategy = unique(strategy),
            power_ave = mean(power, na.rm = TRUE),
            power_peak = max(power, na.rm = TRUE),
            t_out = mean(t_out, na.rm = TRUE)) %>%
  ungroup()

df_daily %>%
  mutate(strategy = as.factor(strategy),
         strategy = recode_factor(strategy, "1" = "Baseline", "2" = "Intervention")) %>% 
  ggplot(aes(x = datetime, y = power_ave)) + 
  geom_col(aes(fill = strategy), alpha = 0.3) +
  geom_line(aes(colour = strategy, group = 1)) +
  scale_x_datetime(expand = c(0, 0), 
                   date_breaks = "3 months", 
                   labels = label_date_short()) +
  scale_y_continuous(expand = c(0, 0), 
                     labels = number_format(suffix =" kW")) +
  scale_color_manual(values = ls_colours) +
  scale_fill_manual(values = ls_colours) +
  labs(title = "Building energy consumption measured\nusing randomized M&V method",
       x = NULL, 
       y = NULL, 
       colour = NULL, 
       fill = NULL) +
  guides(fill = guide_legend(label.position = "bottom", label.hjust = 0.5, keywidth = 2.5, keyheight = 0.5, nrow = 1)) +
  coord_cartesian(clip = "off") +
  theme(panel.grid.major.y = element_line(colour = "grey80", linewidth = 0.25),
        legend.position = "bottom",
        axis.text = element_text(size = 12))
```

```{r SAresults, fig.cap="Comparison of savings estimation between conventional method and novel method proposed in the paper.", fig.height=4, fig.width=8}

# saving calculation

true <- (mean(df_base_alt %>% .$power) - 
         mean(df_interv_alt %>% .$power))

new <- (mean(df_hourly_sch %>% filter(strategy == 1) %>% .$power) - 
        mean(df_hourly_sch %>% filter(strategy == 2) %>% .$power))

conv <- (mean(baseline_results %>% .$towt) - 
         mean(df_interv_alt %>% .$power))

df_FS <- data.frame(
  values = c(true, conv, new),
  category = factor(c("True", "Conventional M&V", "Randomized M&V"), 
                    levels = c("True", "Conventional M&V", "Randomized M&V"))
)

df_FS %>% 
  ggplot() +
  geom_col(aes(x = category, y = values), 
           alpha = 0.6, 
           position = "identity") +
  geom_text(aes(x = category, y = values, label = paste0(round(values, digits = 1), " kW")), 
            position = position_nudge(y = -0.5),
            size = 4) +
  scale_y_continuous(expand = c(0, 0), 
                     breaks = breaks_pretty(n = 3),
                     label = number_format(suffix = "kW")) +
  labs(title = "Estimated savings by different methods", 
       x = NULL, 
       y = NULL) +
  theme(panel.grid.major.y = element_line(colour = "grey80", linewidth = 0.25),
        plot.margin = margin(t = 2, r = 7, b = 2, l = 2, unit = "mm"))
```

## Scenario B: Continuous baseline change due to HVAC component degradation

Building performance can also change intermittently or continuously over
time, which can be more difficult to detect. For example, gradually
replacing older, lower performance lights as they fail with LEDs.
Another example is a filter becoming increasingly clogged in an air
handling unit, increasing fan power due to increased static pressure to
overcome the increased resistance of air passing through the filter. In
this scenario, if the building manager changes the air filter before
starting the M&V and the analyst chooses to use the conventional
pre-/post-type analysis, the energy measured 12 months later in the
post-retrofit period should correspond to a higher baseline due to
increased fan energy as the particle accumulates in the filter. Here,
the conventional M&V method will underestimate the intervention
energy-saving effect. If the analyst chooses to use the novel random
sampling M&V, the results will be far more robust to this issue.

In this scenario, we simulated a situation where the fan power gradually
increased due to accumulated particles in the filter. We assume the fan
power accounts for 25% of the HVAC electricity consumption and due to
clogging, system static pressure increases gradually by 50% over the
course of the 2 year study period. The relationship between fan power
and static pressure can be roughly estimated as $P \sim\ Q^3*\Delta P$,
which means the fan power would also increase from 100% to 150%.

```{r SB}
# Generate schedule for the post-intervention period
schedule <- blocking(start_date = "2021-01-01",
                     n_weeks = 108, 
                     n_seasons = 9, 
                     seed = 390, 
                     searches = 20, 
                     jumps = 20, 
                     treatments = 2, 
                     consec = 1)

df_schedule <- data.frame(strategy = schedule$schedule$strategy,
                          datetime = schedule$schedule$date)

df_base_sch <- df_base %>% 
  left_join(df_schedule, by = "datetime") %>% 
  fill(strategy, .direction = "down") %>% 
  filter(strategy == 1) 

df_interv_sch <- df_interv %>% 
  left_join(df_schedule, by = "datetime") %>% 
  fill(strategy, .direction = "down") %>% 
  filter(strategy == 2)

df_hourly_sch <- bind_rows(df_base_sch, df_interv_sch) %>% 
  arrange(datetime)

# Add gradual change
gradi_week <- 1 / 106
df_base_alt <- df_base %>%
  mutate(week = interval(min(datetime), datetime) %>% as.numeric('weeks') %>% floor()) %>%
  mutate(ratio = gradi_week * week) %>%
  mutate(delta_fan = power * 0.25 * 0.5 * ratio) %>%
  mutate(power = power + delta_fan)

df_interv_alt <- df_interv %>%
  mutate(week = interval(min(datetime), datetime) %>% as.numeric('weeks') %>% floor()) %>%
  mutate(ratio = gradi_week * week) %>%
  mutate(delta_fan = power * 0.25 * 0.5 * ratio) %>%
  mutate(power = power + delta_fan)

df_hourly_alt <- df_hourly_sch %>% 
  mutate(week = interval(min(datetime), datetime) %>% as.numeric('weeks') %>% floor()) %>%
  mutate(ratio = gradi_week * week) %>%
  mutate(delta_fan = power * 0.25 * 0.5 * ratio) %>%
  mutate(power = power + delta_fan)

# Baseline projection
model_input_options <- nmecr::assign_model_inputs(regression_type = "TOWT", occupancy_threshold = 0.65)

baseline_train <- df_base_alt %>%
  filter(datetime < as.Date("2022-01-01")) %>% 
  select(time = datetime, 
         eload = power, 
         temp = t_out)

# do baseline model to estimate occupancy
towt_base <- nmecr::model_with_TOWT(training_data = baseline_train,
                                    model_input_options = model_input_options)

# Add occupancy info to model specification
base_model_input_options <- map(model_input_options, ~.) %>% 
  append(list(find_occ_unocc(training_data = baseline_train, model_input_options = towt_base$model_input_options))) %>% 
  set_names(., c(names(model_input_options), "occupancy_info"))

# re-estimate baseline with occupancy
towt_base <- nmecr::model_with_TOWT(training_data = baseline_train,
                                    model_input_options = base_model_input_options)

baseline_results <- model_pred(amod = towt_base$model_occupied, 
                               bmod = towt_base$model_unoccupied,
                               time = df_towt$time, 
                               temp = df_towt$temp,
                               temp_knots = towt_base$model_input_options$calculated_temp_knots, 
                               dataframe = df_towt,
                               occ_info = towt_base$model_input_options$occupancy_info,
                               interval_minutes = towt_base$model_input_options$interval_minutes) %>%
  rename("datetime" = "time") %>% 
  mutate("datetime" = force_tz(datetime, tzone = "America/Chicago"))

```

```{r SBdata, fig.cap="Energy measurement and modeling results for a 2-year conventional M&V project. Left: 500 random samples of measured baseline. Right: 500 random samples of measured intervention, projected baseline from modeling results without considering filter clogging and adjusted baseline measurement for filter clogging. (All lines fitted using loess function)", fig.width=8, fig.height=4}

p1 <- ggplot() +
   geom_point(data = df_base_alt %>% 
               filter(datetime < as.Date("2022-01-01")) %>% 
               slice_sample(n = 500), 
             aes(x = datetime, y = power, color = "Measured baseline"), 
             size = 0.5, 
             alpha = 0.3) +
  geom_smooth(data = df_base_alt %>% 
               filter(datetime < as.Date("2022-01-01")), 
              aes(x = datetime, y = power, color = "Measured baseline")) +
  scale_x_datetime(date_breaks = "2 months",
               date_labels = "%b")  +
  scale_y_continuous(expand = c(0, 0), 
                     breaks = breaks_pretty(n = 3),
                     labels = number_format(suffix = " kW")) +
  coord_cartesian(clip = "on", ylim = c(30, 90)) +
  scale_color_manual(values = ls_colours) +
  labs(x = NULL, 
       y = NULL,
       color = NULL, 
       title = NULL,
       subtitle = "Pre-retrofit period") +
  theme(panel.grid.major.y = element_line(colour = "grey80", size = 0.25),
        legend.direction = "horizontal",
        legend.position = "bottom",
        plot.margin = margin(t = 2, r = 7, b = 2, l = 2, unit = "mm"))

p2 <- ggplot() +
  geom_point(data = baseline_results %>% 
               slice_sample(n = 500), 
             aes(x = datetime, y = towt, color = "Projected baseline\n(no change)"), 
             size = 0.5, 
             alpha = 0.3) +
  geom_smooth(data = baseline_results, 
              aes(x = datetime, y = towt, color = "Projected baseline\n(no filter loading change)")) +

  geom_point(data = df_base_alt %>% 
               filter(datetime >= as.Date("2022-01-01")) %>% 
               slice_sample(n = 500), 
             aes(x = datetime, y = power, color = "Adjusted baseline"), 
             size = 0.5, 
             alpha = 0.3) +
  geom_smooth(data = df_base_alt %>% 
               filter(datetime >= as.Date("2022-01-01")), 
              aes(x = datetime, y = power, color = "Adjusted baseline")) +
  geom_point(data = df_interv_alt %>% 
               filter(datetime >= as.Date("2022-01-01")) %>% 
               slice_sample(n = 500), 
             aes(x = datetime, y = power, color = "Measured interv"), 
             size = 0.5, 
             alpha = 0.3) +
  geom_smooth(data = df_interv_alt %>% filter(datetime >= as.Date("2022-01-01")), 
              aes(x = datetime, y = power, color = "Measured interv")) +
  scale_x_datetime(date_breaks = "2 months",
               date_labels = "%b")  +
  scale_y_continuous(expand = c(0, 0), 
                     breaks = breaks_pretty(n = 3),
                     labels = number_format(suffix = " kW")) +
  coord_cartesian(clip = "on", ylim = c(30, 90)) +
  scale_color_manual(values = ls_colours) +
  labs(x = NULL, 
       y = NULL,
       color = NULL, 
       title = NULL, 
       subtitle = "Post-retrofit period") +
  theme(panel.grid.major.y = element_line(colour = "grey80", size = 0.25),
        axis.text.y = element_blank(),
        legend.direction = "horizontal",
        legend.position = "bottom",
        plot.margin = margin(t = 2, r = 7, b = 2, l = 2, unit = "mm"))

p1 + p2 +
  plot_annotation(title = "Building energy consumption over a 2-year period") +
  plot_annotation(tag_levels = c('a'), tag_suffix = ')') &
  theme(plot.tag.position = c(0, 1),
        plot.tag = element_text(color="black"))
```

The above figure \@ref(fig:SBdata) shows both metered energy consumption
and modeled baseline for a 2-year M&V project in this scenario. The
figure shows that if using the conventional method, the analyst would
start a 12-month baseline and fit a regression model using the collected
measurements. Subplot b) shows that the projected baseline model will
under-estimate the fan energy increase compared to a correctly adjusted
baseline in the post-retrofit period. This will bias the energy savings
estimate associated with the intervention, yielding a lower energy
savings. However, if the analyst chose to apply the random switching
method shown in figure \@ref(fig:SBrand), it will still accurately
estimate the intervention effect.

```{r SBrand, fig.cap="Averaged energy measurements at sampling interval (daily) using novel M&V method", fig.width=8, fig.height=4}
# make daily totals
df_daily <- df_hourly_alt %>%
  group_by(datetime = floor_date(datetime, unit = "day")) %>%
  summarise(strategy = unique(strategy),
            power_ave = mean(power, na.rm = TRUE),
            power_peak = max(power, na.rm = TRUE),
            t_out = mean(t_out, na.rm = TRUE)) %>%
  ungroup() %>% 
  filter(datetime >= as.Date("2022-01-01"))

df_daily %>%
  mutate(strategy = as.factor(strategy),
         strategy = recode_factor(strategy, "1" = "Baseline", "2" = "Intervention")) %>% 
  ggplot(aes(x = datetime, y = power_ave)) + 
  geom_col(aes(fill = strategy), alpha = 0.3) +
  geom_line(aes(colour = strategy, group = 1)) +
  scale_x_datetime(expand = c(0, 0), 
                   date_breaks = "3 months", 
                   labels = label_date_short()) +
  scale_y_continuous(expand = c(0, 0), 
                     labels = number_format(suffix =" kW")) +
  scale_color_manual(values = ls_colours) +
  scale_fill_manual(values = ls_colours) +
  labs(title = "Building energy consumption measured\nusing randomized M&V method",
       x = NULL, 
       y = NULL, 
       colour = NULL, 
       fill = NULL) +
  guides(fill = guide_legend(label.position = "bottom", label.hjust = 0.5, keywidth = 2.5, keyheight = 0.5, nrow = 1)) +
  coord_cartesian(clip = "off") +
  theme(panel.grid.major.y = element_line(colour = "grey80", linewidth = 0.25),
        legend.position = "bottom",
        axis.text = element_text(size = 12))
```

Figure \@ref(fig:SBresults) implies that conventional M&V significantly
underestimates the intervention energy-saving effect due to a lack of
baseline measurements in the post-retrofit duration. Additionally as
expected, when using the randomized M&V, savings are estimated much more
accurately.

```{r SBresults, fig.cap="Saving estimation comparison between true savings and the two methods (conventional M&V method: no prior knowledge of filter clogging and does not measure baseline,  and the novel randomized M&V method: no prior knowledge of filter clogging but measures updated baseline).", fig.height=4, fig.width=8}
# True M&V savings
true <- (mean(df_base_alt %>% .$power) - 
         mean(df_interv_alt %>% .$power))

# Conventional M&V savings (biased)
conv <- (mean(baseline_results %>% .$towt) - 
         mean(df_interv_alt %>% filter(datetime >= as.Date("2022-01-01")) %>% .$power))

# Noevl M&V savings 

new <- (mean(df_hourly_alt %>% filter(strategy == 1) %>% .$power) - 
        mean(df_hourly_alt %>% filter(strategy == 2) %>% .$power))

df_FS <- data.frame(
  values = c(true, conv, new),
  category = factor(c("True", "Conventional M&V", "Randomized M&V"), 
                    levels = c("True", "Conventional M&V", "Randomized M&V"))
)

df_FS %>% 
  ggplot() +
  geom_col(aes(x = category, y = values), 
           alpha = 0.6, 
           position = "identity") +
  geom_text(aes(x = category, y = values, label = paste0(round(values, digits = 1), " kW")), 
            position = position_nudge(y = -0.5),
            size = 4) +
  scale_y_continuous(expand = c(0, 0), 
                     breaks = breaks_pretty(n = 3),
                     label = number_format(suffix = " kW")) +
  labs(title = "Estimated savings by different methods", 
       x = NULL, 
       y = NULL) +
  theme(panel.grid.major.y = element_line(colour = "grey80", linewidth = 0.25),
        plot.margin = margin(t = 2, r = 7, b = 2, l = 2, unit = "mm"))
```