Commit

Merge pull request #68 from Sage-Bionetworks/SNOW-103-streamlit-template
[SNOW-103] Create a `streamlit` app template
jaymedina authored Aug 13, 2024
2 parents 54534b0 + d96317e commit 1e72767
Showing 18 changed files with 733 additions and 1 deletion.
2 changes: 1 addition & 1 deletion .gitignore
@@ -2,4 +2,4 @@
.terraform*
terraform.tfstate*
*.csv
.DS_Store
.DS_Store
2 changes: 2 additions & 0 deletions streamlit_template/.dockerignore
@@ -0,0 +1,2 @@
# .dockerignore
.streamlit/secrets.toml
2 changes: 2 additions & 0 deletions streamlit_template/.gitignore
@@ -0,0 +1,2 @@
.streamlit/secrets.toml
__pycache__/
12 changes: 12 additions & 0 deletions streamlit_template/.pre-commit-config.yaml
@@ -0,0 +1,12 @@
repos:
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort
        name: isort (python)

  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
        language_version: python3
7 changes: 7 additions & 0 deletions streamlit_template/.streamlit/config.toml
@@ -0,0 +1,7 @@
[theme]
#primaryColor="#F63366"
primaryColor="#47C7DA"
backgroundColor="#FFFFFF"
secondaryBackgroundColor="#F0F2F6"
textColor="#262730"
font="sans serif"
4 changes: 4 additions & 0 deletions streamlit_template/.streamlit/example_secrets.toml
@@ -0,0 +1,4 @@
[snowflake]
user = "EXAMPLE_USER"
password = "EXAMPLE_PASSWORD"
account = "example-0000000"
22 changes: 22 additions & 0 deletions streamlit_template/Dockerfile
@@ -0,0 +1,22 @@
# Use the official Python base image
FROM python:3.11-slim

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1

# Copy requirements file
COPY requirements.txt .

# Install dependencies
RUN pip install --upgrade pip \
    && pip install -r requirements.txt

# Copy the rest of the application code
COPY . .

# Expose the port Streamlit runs on
EXPOSE 8501

# Command to run the app
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
136 changes: 136 additions & 0 deletions streamlit_template/README.md
@@ -0,0 +1,136 @@
## Introduction
This area of the repository serves as a template for developing your own Streamlit application for internal use within Sage Bionetworks.
The template is designed to source data from the databases in Snowflake and compose a dashboard using the tools provided by [Streamlit](https://docs.streamlit.io/) and plotly.

Below is the directory structure of `streamlit_template`. The following sections break down the purpose of
each component and explain how to use these components to design your own application and deploy it on an AWS EC2 instance.

```
streamlit_template/
├── .streamlit/
│   ├── config.toml
│   └── example_secrets.toml
├── tests/
│   ├── __init__.py
│   └── test_app.py
├── toolkit/
│   ├── __init__.py
│   ├── queries.py
│   ├── utils.py
│   └── widgets.py
├── Dockerfile
├── app.py
├── requirements.txt
└── style.css
```

## Create your own Streamlit application

### 1. Setup and Enable Access to Snowflake

- Create a fork of this repository under your GitHub user account.
- Within the `.streamlit` folder, create a file called `secrets.toml`, which Streamlit reads before connecting to Snowflake.
Use the contents of `example_secrets.toml` as a syntax guide for `secrets.toml`. See the [Snowflake documentation](https://docs.snowflake.com/en/user-guide/admin-account-identifier#using-an-account-name-as-an-identifier) for how to find your
account name.
- Test your connection to Snowflake by running the example Streamlit app at the base of this directory. This will launch the application on port 8501, the default port for Streamlit applications.

```
streamlit run app.py
```

> [!CAUTION]
> Do not commit your `secrets.toml` file to your forked repository. Keep your credentials secure and do not expose them to the public.
### 2. Build your Queries

Once you've completed the setup above, you can begin working on your SQL queries.
- Navigate to `queries.py` under the `toolkit/` folder.
- Your queries will be string objects. Assign each of them an easy-to-remember variable name, as they will be imported into `app.py` later on.
- We encourage you to test these queries in a SQL worksheet in Snowflake's Snowsight before running them in your application.

Example:
```
QUERY_NUMBER_OF_FILES = """
select
    count(*) as number_of_files
from
    synapse_data_warehouse.synapse.node_latest
where
    project_id = '53214489'
and
    node_type = 'file';
"""
```
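
If several queries differ only in project or node type, one option is to generate them from a shared template. The sketch below is an assumption layered on the example above, not code from `queries.py`: `build_node_count_query` and `ALLOWED_NODE_TYPES` are hypothetical names, and the allowed node types are illustrative. Inputs are validated before interpolation because the result is a plain SQL string.

```python
import re

# Illustrative allow-list; adjust to the node types present in your data (assumption).
ALLOWED_NODE_TYPES = {"file", "folder"}

_QUERY_TEMPLATE = """
select
    count(*) as number_of_nodes
from
    synapse_data_warehouse.synapse.node_latest
where
    project_id = '{project_id}'
and
    node_type = '{node_type}';
"""


def build_node_count_query(project_id: str, node_type: str) -> str:
    """Return a count query for one project and node type (hypothetical helper)."""
    # Reject anything that is not a purely numeric ID before it reaches the SQL text.
    if not re.fullmatch(r"\d+", project_id):
        raise ValueError("project_id must contain digits only")
    if node_type not in ALLOWED_NODE_TYPES:
        raise ValueError(f"unsupported node_type: {node_type!r}")
    return _QUERY_TEMPLATE.format(project_id=project_id, node_type=node_type)


QUERY_NUMBER_OF_FILES = build_node_count_query("53214489", "file")
print(QUERY_NUMBER_OF_FILES)
```

Keeping the generated strings as module-level constants preserves the import style used elsewhere in the template.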

### 3. Build your Widgets

Your widgets will be the main visual component of your Streamlit application.

- Navigate to `widgets.py` under the `toolkit/` folder.
- Modify the imports as necessary. By default we are using `plotly` to design our widgets.
- Create a function for each widget. For guidance, follow one of the examples in `widgets.py`.

### 4. Build your Application

Here is where all your work on `queries.py` and `widgets.py` comes together.
- Navigate to `app.py` to begin developing.
- Import the queries you developed in Step 2.
- Import the widgets you developed in Step 3.
- Begin developing! Use the pre-existing `app.py` in the template as a guide for structuring your application.
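
The transform stage of `app.py` mostly converts raw byte counts into display units. A minimal, dependency-free sketch of that step follows; `bytes_to_gib` and `summarize_sizes` are hypothetical helper names, and the sample sizes are made up for illustration. Note that dividing by 1024³ yields GiB rather than GB.

```python
from statistics import mean

BYTES_PER_GIB = 1024 ** 3  # 1 GiB = 2**30 bytes


def bytes_to_gib(size_in_bytes: float) -> float:
    """Convert a byte count to GiB, rounded to two decimals."""
    return round(size_in_bytes / BYTES_PER_GIB, 2)


def summarize_sizes(sizes_in_bytes: list[int]) -> dict[str, float]:
    """Return the total and average size in GiB for a list of byte counts."""
    if not sizes_in_bytes:
        return {"total_gib": 0.0, "average_gib": 0.0}
    return {
        "total_gib": bytes_to_gib(sum(sizes_in_bytes)),
        "average_gib": bytes_to_gib(mean(sizes_in_bytes)),
    }


# Made-up sample: two projects of 1 GiB and 3 GiB.
summary = summarize_sizes([1 * BYTES_PER_GIB, 3 * BYTES_PER_GIB])
print(summary)
```

The resulting numbers can then be formatted into the metric labels in the first row of the dashboard.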

> [!TIP]
> `utils.py` houses the functions used to connect to Snowflake and run your SQL queries. Make sure to reserve an area
> in the script for calling `get_data_from_snowflake` with your queries from Step 2.
>
> Example:
>
> ```
> from toolkit.queries import (QUERY_ENTITY_DISTRIBUTION, QUERY_PROJECT_SIZES,
>                              QUERY_PROJECT_DOWNLOADS, QUERY_UNIQUE_USERS)
> from toolkit.utils import get_data_from_snowflake
>
> entity_distribution_df = get_data_from_snowflake(QUERY_ENTITY_DISTRIBUTION)
> project_sizes_df = get_data_from_snowflake(QUERY_PROJECT_SIZES)
> project_downloads_df = get_data_from_snowflake(QUERY_PROJECT_DOWNLOADS)
> unique_users_df = get_data_from_snowflake(QUERY_UNIQUE_USERS)
> ```
### 5. Test your Application
We encourage implementing unit and regression tests in your application, particularly for components that involve interacting with the application
to display and/or transform data (e.g., buttons, dropdown menus, sliders, and so on).
- Navigate to `tests/test_app.py` to modify the existing script.
- The default tests use [Streamlit's AppTest tool](https://docs.streamlit.io/develop/api-reference/app-testing/st.testing.v1.apptest#run-an-apptest-script) to launch the application and retrieve its components. Please modify these existing tests or create brand new ones
as you see fit.
> [!TIP]
> Make sure to launch the test suite from the base directory of `streamlit_template/` (i.e., `pytest tests/test_app.py`)
> to avoid import issues.
### 6. Dockerize your Application
- Update the `requirements.txt` file with the packages used in any of the scripts above.
- Ensure you have pushed all your changes to your fork of the repository that you are working in (remember not to commit your `secrets.toml` file).
- **_(Optional)_** You can choose to push a Docker image to the GitHub Container Registry to pull it directly from the container registry when ready to deploy.
For instructions on how to deploy your Docker image to the GitHub Container Registry, [see here](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry).
### 7. Launch your Application on AWS EC2
- Create an EC2: Linux Docker product from the Sage Service Catalog.
- Go to _Provisioned Products_ in the menu on the left-hand-side.
- Once your EC2 product's `status` is set to `Available`, click it and navigate to the _Events_ tab.
- Click the URL next to `ConnectionURI` to launch a shell session in your instance.
- Navigate to your home directory (`cd ~`).
- Clone your repository in your desired working directory.
- Create your `secrets.toml` file again. The Docker image of your Streamlit application will not have the `secrets.toml` for security reasons.
- Build your Docker image from the Dockerfile in the repository.
- Run your Docker container from the image, mounting your `secrets.toml` and publishing port 8501, like so:
```
docker run -p 8501:8501 \
  -v $PWD/secrets.toml:/.streamlit/secrets.toml \
  <image name>
```
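
To make the mount and port flags harder to mistype, the command can also be assembled programmatically. This is only a sketch: `build_docker_run_command` is a hypothetical helper, the image name is a placeholder, and the default container path `/.streamlit/secrets.toml` assumes the image keeps the base image's default working directory (`/`), since the Dockerfile sets no `WORKDIR`.

```python
import shlex
from pathlib import Path


def build_docker_run_command(
    image: str,
    secrets_path: str,
    container_secrets_path: str = "/.streamlit/secrets.toml",  # assumption: workdir is /
    port: int = 8501,
    detach: bool = False,
) -> list[str]:
    """Build the argument list for `docker run` (hypothetical helper)."""
    # Resolve to an absolute host path so Docker treats it as a bind mount.
    host_path = Path(secrets_path).resolve()
    command = ["docker", "run"]
    if detach:
        command.append("-d")  # keep the container running after the shell closes
    command += ["-p", f"{port}:{port}"]
    command += ["-v", f"{host_path}:{container_secrets_path}"]
    command.append(image)
    return command


# "my-streamlit-app" is a placeholder image name.
cmd = build_docker_run_command("my-streamlit-app", "secrets.toml", detach=True)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` or printed and pasted into the shell session.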
> [!TIP]
> If you would like to leave the app running after you close your shell session, run the container detached (i.e., include `-d` in the `docker run` command).
Empty file added streamlit_template/__init__.py
Empty file.
60 changes: 60 additions & 0 deletions streamlit_template/app.py
@@ -0,0 +1,60 @@
import numpy as np
import streamlit as st
from toolkit.queries import (
QUERY_ENTITY_DISTRIBUTION,
QUERY_PROJECT_DOWNLOADS,
QUERY_PROJECT_SIZES,
QUERY_UNIQUE_USERS,
)
from toolkit.utils import get_data_from_snowflake
from toolkit.widgets import plot_download_sizes, plot_unique_users_trend

# Custom CSS for styling
with open("style.css") as f:
    st.markdown(f"<style>{f.read()}</style>", unsafe_allow_html=True)


def main():

    # 1. Retrieve the data using your queries in queries.py
    entity_distribution_df = get_data_from_snowflake(QUERY_ENTITY_DISTRIBUTION)
    project_sizes_df = get_data_from_snowflake(QUERY_PROJECT_SIZES)
    project_downloads_df = get_data_from_snowflake(QUERY_PROJECT_DOWNLOADS)
    unique_users_df = get_data_from_snowflake(QUERY_UNIQUE_USERS)

    # 2. Transform the data as needed
    convert_to_gib = 1024 * 1024 * 1024
    project_sizes = dict(
        PROJECT_ID=list(project_sizes_df["PROJECT_ID"]),
        TOTAL_CONTENT_SIZE=list(project_sizes_df["TOTAL_CONTENT_SIZE"]),
    )
    # Convert the total from bytes so it matches the unit shown in the metric label
    total_data_size = round(
        sum(project_sizes["TOTAL_CONTENT_SIZE"]) / convert_to_gib, 2
    )
    average_project_size = round(
        np.mean(project_sizes["TOTAL_CONTENT_SIZE"]) / convert_to_gib, 2
    )

    # 3. Format the app, and visualize the data with your widgets in widgets.py
    # -------------------------------------------------------------------------
    # Row 1 -------------------------------------------------------------------
    st.markdown("### Monthly Overview :calendar:")
    col1, col2, col3 = st.columns([1, 1, 1])
    col1.metric("Total Storage Occupied", f"{total_data_size} GB", "7.2 GB")
    col2.metric("Avg. Project Size", f"{average_project_size} GB", "8.0 GB")
    col3.metric("Annual Cost", "102,000 USD", "10,000 USD")

    # Row 2 -------------------------------------------------------------------
    st.markdown("### Unique Users Report :bar_chart:")
    st.plotly_chart(plot_unique_users_trend(unique_users_df))

    # Row 3 -------------------------------------------------------------------
    st.plotly_chart(plot_download_sizes(project_downloads_df, project_sizes_df))

    # Row 4 -------------------------------------------------------------------
    st.markdown("### Entity Trends :pencil:")
    st.dataframe(entity_distribution_df)


if __name__ == "__main__":
    main()
10 changes: 10 additions & 0 deletions streamlit_template/requirements.txt
@@ -0,0 +1,10 @@
black==24.3.0
isort==5.13.2
numpy==1.26.3
streamlit==1.36.0
pandas==2.2.2
plotly==5.22.0
pytest==8.3.2
pre-commit==3.6.0
snowflake-connector-python==3.9.1
snowflake-snowpark-python==1.15.0
52 changes: 52 additions & 0 deletions streamlit_template/style.css
@@ -0,0 +1,52 @@
/* Logo */
/* Adapted from Zachary Blackwood */
/* [data-testid="stSidebar"] {
background-image: url(https://streamlit.io/images/brand/streamlit-logo-secondary-colormark-darktext.png);
background-size: 200px;
background-repeat: no-repeat;
background-position: 4px 20px;
} */


/* Card */
/* Adapted from https://startbootstrap.com/theme/sb-admin-2 */
div.css-1r6slb0.e1tzin5v2 {
background-color: #FFFFFF;
border: 1px solid #CCCCCC;
padding: 5% 5% 5% 10%;
border-radius: 5px;

border-left: 0.5rem solid #9AD8E1 !important;
box-shadow: 0 0.15rem 1.75rem 0 rgba(58, 59, 69, 0.15) !important;

}

label.css-mkogse.e16fv1kl2 {
color: #36b9cc !important;
font-weight: 700 !important;
text-transform: uppercase !important;
}


/* Move block container higher */
div.block-container.css-18e3th9.egzxvld2 {
margin-top: -5em;
}


/* Hide hamburger menu and footer */
div.css-r698ls.e8zbici2 {
display: none;
}

footer.css-ipbk5a.egzxvld4 {
display: none;
}

footer.css-12gp8ed.eknhn3m4 {
display: none;
}

div.vg-tooltip-element {
display: none;
}
Empty file.
67 changes: 67 additions & 0 deletions streamlit_template/tests/test_app.py
@@ -0,0 +1,67 @@
"""A suite of unit tests for the streamlit app in the base directory. We encourage
adding tests into this suite to ensure functionality within your streamlit app, particularly
for the components that allow users to interact with the app (buttons, dropdown menus, etc).
These tests were all written using Streamlit's AppTest class. See here for more details:
https://docs.streamlit.io/develop/api-reference/app-testing/st.testing.v1.apptest#run-an-apptest-script
A few considerations:
1. This suite is meant to be run from the base directory, not from the tests directory.
2. The streamlit app is meant to be run from the base directory.
3. The streamlit app is assumed to be called ``app.py``.
"""

import os
import sys

import pytest
from streamlit.testing.v1 import AppTest

# Ensure that the base directory is in PYTHONPATH so ``toolkit`` and other tools can be found
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..")))

# The timeout limit to wait for the app to load before shutdown (in seconds)
DEFAULT_TIMEOUT = 30


@pytest.fixture(scope="module")
def app():
    return AppTest.from_file(
        "app.py", default_timeout=DEFAULT_TIMEOUT
    ).run()  # Point to your main Streamlit app file


def test_monthly_overview(app):
    """
    Ensure that the Monthly Overview section is being displayed
    with the appropriate labels in the right order.
    """

    # Access the Monthly Overview columns in Row 1
    total_storage_occupied = app.columns[0].children[0]
    avg_project_size = app.columns[1].children[0]
    annual_cost = app.columns[2].children[0]

    # Check that the labels are correct for each metric
    assert total_storage_occupied.label == "Total Storage Occupied"
    assert avg_project_size.label == "Avg. Project Size"
    assert annual_cost.label == "Annual Cost"


def test_plotly_charts(app):
    """Ensure both plotly charts are being displayed."""

    plotly_charts = app.get("plotly_chart")

    assert plotly_charts is not None
    assert len(plotly_charts) == 2


def test_dataframe(app):
    """Ensure that the dataframe is being displayed."""

    dataframe = app.dataframe

    assert dataframe is not None
    assert len(dataframe) == 1
1 change: 1 addition & 0 deletions streamlit_template/toolkit/__init__.py
@@ -0,0 +1 @@
import toolkit