Experimentation

This repository demonstrates an analysis approach for experimentation, including A/B testing, A/A testing, and quasi-experimental testing, for data-driven product decision making.
I use one of my own experimentation analyses as an example, although the data cannot be shared due to confidentiality.

I am open to hearing about your approach; let me know if I should go into more detail on any topic.

Essential steps of experimentation:

1. Sanity check and data quality check

This is the core step of experimentation, as technical issues or test design defects can occur during an experiment.
If the data is not correct, any further analysis has no value.

Check the latency of the new features, whether the data are recorded in the database, and whether data definitions are consistent across all tables
Check that the buckets you designed with JavaScript are equal in both control and treatment. This includes sub-buckets such as geography, device, environment, marketing channel, or any other user segment
Check that the (user-defined) invariant metrics are equal in both control and treatment
Check for extreme outliers, missing values, data types, and the number of unique values
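
The bucket check above can be sketched as a sample ratio mismatch test: a chi-square goodness-of-fit test on the user counts per variant, repeated for each sub-bucket. This is a minimal sketch using only the standard library, not the code from the notebooks. The function name `srm_check`, the 50/50 `expected_ratio`, the `alpha` threshold, and the counts are all hypothetical.

```python
import math

def srm_check(n_control, n_treatment, expected_ratio=0.5, alpha=0.001):
    """Chi-square goodness-of-fit test (1 degree of freedom) for bucket sizes.

    expected_ratio is the designed share of traffic in control (assumed 50/50 here).
    Returns (chi2, p_value, mismatch_flag).
    """
    total = n_control + n_treatment
    exp_control = total * expected_ratio
    exp_treatment = total * (1 - expected_ratio)
    chi2 = ((n_control - exp_control) ** 2 / exp_control
            + (n_treatment - exp_treatment) ** 2 / exp_treatment)
    # With 1 degree of freedom, the chi-square tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < alpha

# Hypothetical (control, treatment) user counts per device sub-bucket.
counts = {"desktop": (5020, 4980), "mobile": (5300, 4700)}
results = {segment: srm_check(c, t) for segment, (c, t) in counts.items()}
for segment, (chi2, p_value, mismatch) in results.items():
    print(f"{segment}: chi2={chi2:.2f}, p={p_value:.3g}, mismatch={mismatch}")
```

A flagged sub-bucket suggests an assignment or logging problem that should be investigated before any KPI is compared.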

2. EDA

Before jumping into statistical tests, visualize the KPIs first to get a sense of the result!
This matters both for presenting the result to stakeholders and for detecting faulty statistical test results.

Check the mean and median, grouped by subgroup
Plot boxplots, distributions, and bar charts
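
As an illustration of the mean and median check, here is a minimal standard-library sketch that groups a KPI by variant and subgroup. The helper `summarize`, the column names, and the rows are hypothetical. In a notebook, a pandas `groupby` would do the same job.

```python
from collections import defaultdict
from statistics import mean, median

def summarize(rows, kpi, by):
    """Return {group_key: {"n", "mean", "median"}} for `kpi`, grouped by the columns in `by`."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(kpi)
        if value is None:  # skip missing values instead of treating them as zero
            continue
        groups[tuple(row[col] for col in by)].append(value)
    return {
        key: {"n": len(vals), "mean": mean(vals), "median": median(vals)}
        for key, vals in sorted(groups.items())
    }

# Hypothetical per-user revenue; one control user is an extreme outlier.
rows = [
    {"variant": "control", "device": "mobile", "revenue": 0.0},
    {"variant": "control", "device": "mobile", "revenue": 10.0},
    {"variant": "control", "device": "mobile", "revenue": 200.0},
    {"variant": "treatment", "device": "mobile", "revenue": 5.0},
    {"variant": "treatment", "device": "mobile", "revenue": 12.0},
    {"variant": "treatment", "device": "mobile", "revenue": 13.0},
]
summary = summarize(rows, kpi="revenue", by=("variant", "device"))
for key, stats in summary.items():
    print(key, stats)
```

In this made-up data the control mean is higher than the treatment mean while the control median is lower, which is the kind of disagreement that points to outliers and is worth plotting before running a test.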

3. Explanatory modeling

This step can be skipped if you are testing conventional, one-directional KPIs such as CVR, CTR, or OR, and no causality is found.

4. Statistical tests

There are two types of statistical tests, Frequentist and Bayesian. Many companies use third-party tools to run the tests, and the preference between the two varies among companies. However, I recommend testing with both methods whenever possible to cross-validate, especially when the delta is small or the result goes against business sense, as both methodologies have pitfalls.

Frequentist: risk of committing Type I and Type II errors
Bayesian: risk of choosing the wrong prior
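
To illustrate cross-validating the two approaches on a conversion-rate KPI, the sketch below runs a two-proportion z-test (Frequentist) and a Beta-Binomial Monte Carlo comparison (Bayesian) on the same counts. It uses only the standard library and is not the code from the notebooks. The function names, the uniform Beta(1, 1) prior, and the conversion counts are assumptions made for illustration.

```python
import math
import random

def two_proportion_ztest(conv_c, n_c, conv_t, n_t):
    """Two-sided z-test for a difference in conversion rates, using the pooled standard error."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    z = (p_t - p_c) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p_value

def prob_treatment_better(conv_c, n_c, conv_t, n_t, prior=(1.0, 1.0), draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_treatment > rate_control) under independent Beta priors."""
    rng = random.Random(seed)
    a, b = prior  # assumed prior; Beta(1, 1) is uniform on [0, 1]
    wins = 0
    for _ in range(draws):
        rate_c = rng.betavariate(a + conv_c, b + n_c - conv_c)
        rate_t = rng.betavariate(a + conv_t, b + n_t - conv_t)
        wins += rate_t > rate_c
    return wins / draws

# Hypothetical counts: 200/2000 conversions in control, 250/2000 in treatment.
z, p_value = two_proportion_ztest(200, 2000, 250, 2000)
p_better = prob_treatment_better(200, 2000, 250, 2000)
print(f"z={z:.3f}, p-value={p_value:.4f}, P(treatment > control)={p_better:.3f}")
```

If the two methods disagree, or the Bayesian answer moves a lot when `prior` changes, that is a signal to look more closely at the data before deciding.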

5. Causal inference

If you detect so-called confounders, you have to analyze the data differently.
There are several methodologies to approach causality problems:

Propensity score matching
Propensity score stratification
Difference-in-differences
Synthetic control
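
As one example from the list above, a difference-in-differences point estimate can be sketched in a few lines. The function name and the numbers are hypothetical, and the estimate is only meaningful if the treated and control groups would have followed parallel trends without the treatment.

```python
from statistics import mean

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences point estimate from four lists of outcomes."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control group's change stands in for what would have happened without treatment.
    return treated_change - control_change

# Hypothetical KPI values before and after a launch, for a treated and a control region.
effect = diff_in_diff(
    treated_pre=[10.0, 12.0, 11.0],
    treated_post=[15.0, 17.0, 16.0],
    control_pre=[9.0, 11.0, 10.0],
    control_post=[11.0, 13.0, 12.0],
)
print(f"Estimated effect: {effect}")
```

This gives only the point estimate. A regression with an interaction term between group and period is the usual way to also obtain a standard error.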

cnai-ds/Experimentation-Product-Analysis

Folders and files

Latest commit

History

Repository files navigation

Experimentation

Essential step of Experimentation:

1. Sanity check and data quality check

2. EDA

3. Explanatory modeling

4. Statistical tests

5. Causal inference

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages