diff --git a/README.md b/README.md index 9acc97cb1..5e37fc3a9 100644 --- a/README.md +++ b/README.md @@ -20,69 +20,53 @@ You can submit your own SFGuide to be published on Snowflake's website by submit * Remembers where the student left off when returning to a sfguide * Mobile friendly user experience -## How to get started +## Getting Started - 1. nodejs & npm ([Install NodeJS & npm](https://nodejs.org/en/download/)) - 2. install gulp-cli: - ````bash - npm install --global gulp-cli - ```` - 3. install Golang ([Install Go](https://golang.org/doc/install)) - 4. add `/usr/local/go/bin` to the `PATH` environment variable. You can do this by adding the following line to your profile (`.bashrc` or `.zshrc`): +### Prerequisites + 1. [Install Node](https://nodejs.org/en/download/); Homebrew installed? `brew install node` + - Install gulp-cli `npm i -g gulp-cli` + 2. [Install Go](https://golang.org/doc/install); Homebrew installed? `brew install golang` + - Install claat `go get github.com/googlecodelabs/tools/claat` + +## Run locally + + 1. Fork this repository to your personal github account (top right of webpage, `fork` button) + 2. Clone your new fork `git clone git@github.com:/sfguides.git sfguides` + 3. Navigate to the site directory `cd sfguides/site` + 4. Install node dependencies `npm install` + 5. Run the site `npm run serve` + +Congratulations! You now have the Snowflake Guides landing page running. + +##### Common environment errors: +1. You get a `claat not found` error + - Make sure Go is properly in your `PATH`. Add the following lines to your profile (`~/.profile`): ````bash #adding Golang to path export PATH=$PATH:/usr/local/go/bin export PATH=$PATH:$HOME/go/bin ```` -***Note: Changes made to a profile file may not apply until the next time you log into your computer. To apply the changes immediately, just run the shell commands directly or execute them from the profile using a command such as `source $HOME/.zshrc`.*** - - 5. 
install claat: - ````bash - go get github.com/googlecodelabs/tools/claat - ```` - 6. navigate to the site directory: - ````bash - cd site/ - ```` - 7. install dependencies: - ````bash - npm install - ```` - 8. run the site locally - ````bash - gulp serve - ```` + ***Note:** After adding Go to your `PATH`, be sure to apply your new profile: `source ~/.profile`* -Congratulations! You now have the Snowflake Guides landing page running. +2. You get an `EACCES` error when installing gulp-cli + - This means that your npm location needs to be updated. Follow the steps here: [Resolve EACCES permissions](https://docs.npmjs.com/resolving-eacces-permissions-errors-when-installing-packages-globally#manually-change-npms-default-directory) + +#### Write Your First SFGuide: + + 1. Terminate the running server with `ctrl C` and navigate to the `sfguides` source directory `cd sfguides/src` + - In this directory, you will see all existing guides and their markdown files. + 2. Generate a new guide from the guide template `npm run template ` + - Don't use spaces in the name of your guide; use underscores instead. + 3. Navigate to the newly generated guide (`cd sfguides/src/`) and edit your guide in a tool like VS Code. + 4. Run the website again `npm run serve` + 5. As you edit and save changes, your changes will automatically load in the browser. -#### Now lets add our first SFGuide: - - 1. Terminate the running gulp server with `ctrl C` and navigate to the sfguide directory - ````bash - cd site/sfguides - ```` - The sfguides directory is where to store all SFGuide content, written in markdown. - - 2. Use the claat tool to convert the markdown file to HTML - ````bash - claat export sample.md - ```` - - You should see `ok sample` as the response. This means claat has successfully converted your .md file to HTML and created a new directory named `sample`. - - 3. 
Now lets run our server again, this time specifying our sfguides directory of content - ````bash - gulp serve --codelabs-dir=sfguides - ```` -You can now navigate to the landing page in your browser to see your new sfguide! - -You can use the [sample SFGuide](site/sfguides/sample.md) as a template, just change the name of the file and the id listed in the header. +You can always read the [sample SFGuide](site/sfguides/sample.md) online. ### Tips - Review the [sample.md](site/sfguides/sample.md) file to learn more about how to structure your SFGuide for the claat tool. -- You can also see more formating information in the [claat documentation](claat/README.md), and use the command `claat -h` - You can see the supported SFGuide categories [here](site/app/styles/_overrides.scss). If you want to suggest a new category please create a github issue! - Check out [how to use VS Code to write markdown files](https://code.visualstudio.com/docs/languages/markdown) - If you want to learn more about SFGuides, check out this [excellent tutorial](https://medium.com/@zarinlo/publish-technical-tutorials-in-google-codelab-format-b07ef76972cd) diff --git a/site/app/styles/_overrides.scss b/site/app/styles/_overrides.scss index 0e4f512d3..7101192ec 100644 --- a/site/app/styles/_overrides.scss +++ b/site/app/styles/_overrides.scss @@ -25,13 +25,14 @@ /* Snowflake specific category classes */ $color-resource-optimization: #134369; -$color-demos2: #21a2e0; $color-getting-started: #5bc6d1; $color-demos: #c7477a; +$color-architecture-patterns: #097e89; $color-data-applications: #282A72; $color-data-exchange: #282A72; @include codelab-card(['getting-started'], $color-getting-started, ''); @include codelab-card(['resource-optimization'], $color-resource-optimization, ''); +@include codelab-card(['architecture-patterns'], $color-architecture-patterns, ''); @include codelab-card(['demos'], $color-demos, ''); @include codelab-card(['data-engineering'], $snowflake-blue, 'data-engineering.svg'); 
@include codelab-card(['data-lake'], $star-blue, 'data-lake.svg'); diff --git a/site/sfguides/src/getting_started_with_snowsql/getting_started_with_snowsql.md b/site/sfguides/src/getting_started_with_snowsql/getting_started_with_snowsql.md index 1378fa710..fa57b3f2f 100644 --- a/site/sfguides/src/getting_started_with_snowsql/getting_started_with_snowsql.md +++ b/site/sfguides/src/getting_started_with_snowsql/getting_started_with_snowsql.md @@ -227,7 +227,7 @@ Here is an example command to `select` everything on the `emp_basic` table. ![Snowflake_SELECT_image](assets/Snowflake_SELECT.png) Sifting through everything on your table may not be the best use of your time. Getting specific -results is simple, with a few functions and some query syntax. +results is simple with a few functions and some query syntax. - [WHERE](https://docs.snowflake.com/en/sql-reference/constructs/where.html#where) is an additional clause you can add to your select query. @@ -307,4 +307,4 @@ Continue by [developing an application](https://docs.snowflake.com/en/develop - SnowSQL setup - Uploading data using SnowSQL - Querying data using SnowSQL -- Managing and deleting data using SnowSQL \ No newline at end of file +- Managing and deleting data using SnowSQL diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/_shared_assets b/site/sfguides/src/getting_started_with_user_defined_functions/_shared_assets new file mode 120000 index 000000000..55f7fe191 --- /dev/null +++ b/site/sfguides/src/getting_started_with_user_defined_functions/_shared_assets @@ -0,0 +1 @@ +../_shared_assets \ No newline at end of file diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_DROP.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_DROP.png new file mode 100644 index 000000000..71ca01f10 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_DROP.png differ diff
--git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_SwitchRole_DemoUser.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_SwitchRole_DemoUser.png new file mode 100644 index 000000000..86cafd2f2 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_SwitchRole_DemoUser.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_select_udf_max.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_select_udf_max.png new file mode 100644 index 000000000..12ba0add9 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_select_udf_max.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_select_udtf.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_select_udtf.png new file mode 100644 index 000000000..79dc294bc Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_select_udtf.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateDB.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateDB.png new file mode 100644 index 000000000..2b6f0fa8b Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateDB.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateSchema.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateSchema.png new file mode 100644 index 000000000..470def6a7 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateSchema.png differ diff --git 
a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateTable.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateTable.png new file mode 100644 index 000000000..dc79fd287 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_CreateTable.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_DropDB.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_DropDB.png new file mode 100644 index 000000000..8958c9ed4 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_DropDB.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_DropTable.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_DropTable.png new file mode 100644 index 000000000..c0d9df705 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_DropTable.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_max.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_max.png new file mode 100644 index 000000000..70b8a6260 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_max.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_schema_public_drop.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_schema_public_drop.png new file mode 100644 index 000000000..b04f694c6 Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udf_schema_public_drop.png differ diff --git 
a/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udtf.png b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udtf.png new file mode 100644 index 000000000..3e2e553be Binary files /dev/null and b/site/sfguides/src/getting_started_with_user_defined_functions/assets/Snowflake_udtf.png differ diff --git a/site/sfguides/src/getting_started_with_user_defined_functions/getting_started_with_user_defined_functions.md b/site/sfguides/src/getting_started_with_user_defined_functions/getting_started_with_user_defined_functions.md new file mode 100644 index 000000000..4b9816f28 --- /dev/null +++ b/site/sfguides/src/getting_started_with_user_defined_functions/getting_started_with_user_defined_functions.md @@ -0,0 +1,263 @@ +summary: Guide to getting started with user-defined functions +id: getting_started_with_user_defined_functions +categories: Getting Started, UDF +environments: Web +status: Published +feedback link: https://github.com/Snowflake-Labs/devlabs/issues +tags: Getting Started, SQL + +# Getting Started With User-Defined Functions + +## Overview + +Duration: 0:03:00 + +Sometimes the built-in system functions don't answer the specific questions your organization has. In those cases, custom functions are necessary for managing and analyzing data. Snowflake provides a way to create diverse functions on the fly with user-defined functions. + +This guide will walk you through getting set up with Snowflake and becoming familiar with creating and executing user-defined functions (UDFs) and user-defined table functions (UDTFs). + +Review the material below and start with the essentials in the following section. + +### Prerequisites + +- Quick Video [Introduction to Snowflake](https://www.youtube.com/watch?v=fEtoYweBNQ4&ab_channel=SnowflakeInc.) 
+ +### What You’ll Learn + +- Snowflake account and user permissions +- Make database objects +- Query with a user-defined scalar function +- Query with a user-defined table function +- Delete database objects +- Review secure user-defined functions + +### What You’ll Need + +- [Snowflake](https://signup.snowflake.com/) Account + +### What You’ll Build + +- Database objects and user-defined functions to query those objects. + + + +## Begin With the Basics + +Duration: 0:03:00 + +First, we'll go over how to create your Snowflake account and manage user permissions. + +1. Create a Snowflake Account + +Snowflake lets you try out its services for free with a [trial account](https://signup.snowflake.com/). Follow the prompts to activate your account via email. + +2. Access Snowflake’s Web Console + +`https://.snowflakecomputing.com/console/login` + +Log in to the [web interface](https://docs.snowflake.com/en/user-guide/connecting.html#logging-in-using-the-web-interface) from your browser. The URL contains your [account name](https://docs.snowflake.com/en/user-guide/connecting.html#your-snowflake-account-name) and potentially the region. + +3. Increase Your Account Permissions + +![Snowflake_SwitchRole_DemoUser-image](assets/Snowflake_SwitchRole_DemoUser.png) + +Switch the account role from the default SYSADMIN to ACCOUNTADMIN. + +With your new account created and the role configured, you're ready to begin creating database objects in the following section. + + + +## Generate Database Objects + +Duration: 0:05:00 + +With your Snowflake account at your fingertips, it's time to create the database objects. + +Within the Snowflake web console, navigate to **Worksheets** and use a fresh worksheet to run the following commands. + +1. **Create Database** + +```SQL +create or replace database udf_db; +``` + +Build your new database named `udf_db` with the command above.
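If you'd like to confirm the database exists before moving on, a quick optional check in the same worksheet (not part of the original steps) is:

```SQL
-- Optional: list the new database and make it the active one for the session
show databases like 'udf_db';
use database udf_db;
```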
+ +![Snowflake_udf_CreateDB-image](assets/Snowflake_udf_CreateDB.png) + +The **Results** display a status message of `Database UDF_DB successfully created` if all went as planned. + +2. **Make Schema** + +```SQL +create schema if not exists udf_schema_public; +``` + +Use the above command to whip up a schema called `udf_schema_public`. + +![Snowflake_udf_CreateSchema-image](assets/Snowflake_udf_CreateSchema.png) + +The **Results** show a status message of `Schema UDF_SCHEMA_PUBLIC successfully created`. + +3. **Copy Sample Data Into New Table** + +```SQL +create or replace table udf_db.udf_schema_public.sales as +(select * from snowflake_sample_data.TPCDS_SF10TCL.store_sales + sample block (1)); +``` + +Create a table named `sales` and import the sales data with this command. Bear in mind that importing the sample data will take longer to execute than the previous steps. + +![Snowflake_udf_CreateTable-image](assets/Snowflake_udf_CreateTable.png) + +The **Results** will display a status of `Table SALES successfully created` if the sample data and table made it. + +With the necessary database objects created, it’s time to move on to the main course of working with a UDF in the next section. + + + +## Execute Scalar User-Defined Function + +Duration: 0:06:00 + +With the database primed with sample sales data, we're _almost_ ready to try creating a scalar UDF. Before diving in, let’s first understand more about UDF naming conventions. + +If the function name doesn't specify the database and schema (e.g. `udf_db.udf_schema_public.udf_name`), then it defaults to the active session. Since UDFs are database objects, it's better to follow their [naming conventions](https://docs.snowflake.com/en/sql-reference/udf-overview.html#naming-conventions-for-udfs). For this quick practice, we'll rely on our active session. + +1. 
**Create UDF** + +```SQL +create function udf_max() + returns NUMBER(7,2) + as + $$ + select max(SS_LIST_PRICE) from udf_db.udf_schema_public.sales + $$ + ; +``` + +The [SQL function](https://docs.snowflake.com/en/sql-reference/functions/min.html#min-max) `max` returns the highest value in the column `SS_LIST_PRICE`. + +![Snowflake_udf_max-image](assets/Snowflake_udf_max.png) + +The image shows the successful creation of the function `udf_max`. + +2. **Call the UDF** + +```SQL +select udf_max(); +``` + +Summon your new UDF with the [SQL command](https://docs.snowflake.com/en/sql-reference/sql/select.html) `select`. + +![Snowflake_select_udf_max-image](assets/Snowflake_select_udf_max.png) + +Pictured above are the returned **Results**. + +Now that you've practiced the basics of creating a UDF, we'll kick it up a notch in the next section by creating a UDF that returns a new table. + + + +## Query With User-Defined Table Function + +Duration: 0:06:00 + +After creating a successful scalar UDF, move on to making a function that returns a table with a UDTF (user-defined table function). + +1. **Create a UDTF** + +```SQL +create or replace function +udf_db.udf_schema_public.get_market_basket(input_item_sk number(38)) +returns table (input_item NUMBER(38,0), basket_item_sk NUMBER(38,0), +num_baskets NUMBER(38,0)) +as + 'select input_item_sk, ss_item_sk basket_Item, count(distinct +ss_ticket_number) baskets +from udf_db.udf_schema_public.sales +where ss_ticket_number in (select ss_ticket_number from udf_db.udf_schema_public.sales where ss_item_sk = input_item_sk) +group by ss_item_sk +order by 3 desc, 2'; +``` + +The code snippet above creates a function that returns a table with a market basket analysis. + +![Snowflake_udtf-image](assets/Snowflake_udtf.png) + +2. **Run the UDTF** + +```SQL +select * from table(udf_db.udf_schema_public.get_market_basket(6139)); +``` + +Just like for the scalar UDF, this will execute your function. 
+ +![Snowflake_select_udtf-image](assets/Snowflake_select_udtf.png) + +Returned is the market basket analysis table based on the sample sales data. + +You've practiced making UDTFs and have become familiar with UDFs. In the last section, we'll delete our unneeded database objects. + + + +## Cleanup + +Duration: 0:03:00 + +We've covered a lot of ground! Before we wrap up, drop the practice database objects created in this guide. + +1. **Drop Table** + +```SQL +drop table if exists sales; +``` + +Always drop child objects before dropping their parent database objects. Use the command above to remove the table first. + +![Snowflake_udf_DropTable-image](assets/Snowflake_udf_DropTable.png) + +Ensure you've successfully dropped the table in the **Results** section. + +2. **Drop Schema** + +```SQL +drop schema if exists udf_schema_public; +``` + +The command above drops the schema `udf_schema_public`. + +![Snowflake_udf_schema_public_drop-image](assets/Snowflake_udf_schema_public_drop.png) + +The **Results** should display `UDF_SCHEMA_PUBLIC successfully dropped`. + +3. **Drop Database** + +```SQL +drop database if exists udf_db; +``` + +Complete the process by dropping the parent object `udf_db`. + +![Snowflake_udf_DropDB-image](assets/Snowflake_udf_DropDB.png) + +Verify the database is entirely gone by checking the **Results** for `UDF_DB successfully dropped`. + + + +## Conclusion and Next Steps + +Duration: 0:02:00 + +You now have a good handle on UDFs, having practiced both scalar and table functions. With our database objects cleared, it's time to look ahead. + +Consider the potential of a shareable and [secure](https://docs.snowflake.com/en/sql-reference/udf-secure.html#secure-udfs) user-defined function. You can learn how to share user-defined functions, such as the market basket analysis table, by following this post about [the power of secure UDFs](https://www.snowflake.com/blog/the-power-of-secure-user-defined-functions-for-protecting-shared-data/). 
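As a rough sketch of what that next step could look like (assuming you recreate the practice objects dropped above; the `secure` keyword is the only change from this guide's `udf_max`):

```SQL
-- Sketch: the earlier scalar UDF declared as a secure UDF, which hides
-- its definition from non-owners and is required for safe data sharing
create or replace secure function udf_max()
  returns NUMBER(7,2)
  as
  $$
    select max(SS_LIST_PRICE) from udf_db.udf_schema_public.sales
  $$
  ;
```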
+ +### What we've covered + +- Registered a Snowflake account +- Configured role permissions +- Produced database objects +- Queried with a custom UDF +- Composed a table to analyze data with a UDTF +- Eliminated database objects diff --git a/site/sfguides/src/sample/sample.md b/site/sfguides/src/sample/sample.md index d4390d982..03c9cc544 100644 --- a/site/sfguides/src/sample/sample.md +++ b/site/sfguides/src/sample/sample.md @@ -14,7 +14,7 @@ Duration: 1 Please use [this markdown file](https://raw.githubusercontent.com/Snowflake-Labs/sfguides/master/site/sfguides/sample.md) as a template for writing your own Snowflake Guides. This example guide has elements that you will use when writing your own guides, including: code snippet highlighting, downloading files, inserting photos, and more. -It is important to include on the first page of your guide the following sections: Prerequisites, What you'll learn, What you'll need, and What you'll build. Remember, part of the purpose of a Snowflake Guide is that the reader will have **built** something by the end of the tutorial; this means that actual code needs to be included (not just pseudo-code). +It is important to include on the first page of your guide the following sections: Prerequisites, What you'll learn, What you'll need, and What you'll build. Remember, part of the purpose of a Snowflake Guide is that the reader will have **built** something by the end of the tutorial; this means that actual code needs to be included (not just pseudo-code or concepts). The rest of this Snowflake Guide explains the steps of writing your own guide. @@ -171,15 +171,26 @@ Duration: 1 -## Conclusion +## Conclusion & Next Steps Duration: 1 -At the end of your Snowflake Guide, always have a clear call to action (CTA). This CTA could be a link to the docs pages, links to videos on youtube, a GitHub repo link, etc. +The Conclusion and Next Steps section is one of the most important parts of a guide. 
This last section helps to sum up all the information the reader has gone through, and in many ways should read like a [TLDR summary](https://www.howtogeek.com/435266/what-does-tldr-mean-and-how-do-you-use-it/#post-435266:~:text=How%20Do%20You%20Use%20TLDR%3F,you%E2%80%99re%20the%20author%20or%20commenter.%20Using). -If you want to learn more about Snowflake Guide formatting, checkout the official documentation here: [Formatting Guide](https://github.com/googlecodelabs/tools/blob/master/FORMAT-GUIDE.md) +There are three main sub-headers in a Conclusion step: -### What we've covered +1. a general conclusion paragraph (what you are reading now!) +2. "What We've Covered" section with a bulleted list of things +3. "Related Resources" with links to various other resources, other guides, docs, videos, GitHub source code, etc. + +It's also important to remember that by the time a reader has completed a Guide, the goal is that they have actually built something! Guides teach through hands-on examples -- not just explaining concepts. 
+ +### What We've Covered - creating steps and setting duration - adding code snippets - embedding images, videos, and surveys -- importing other markdown files \ No newline at end of file +- importing other markdown files + +### Related Resources +- [SFGuides on GitHub](https://github.com/Snowflake-Labs/sfguides) +- [Learn the GitHub Flow](https://guides.github.com/introduction/flow/) +- [Learn How to Fork a project on GitHub](https://guides.github.com/activities/forking/) \ No newline at end of file diff --git a/site/sfguides/src/security_access_to_sensitive_objects/_shared_assets b/site/sfguides/src/security_access_to_sensitive_objects/_shared_assets new file mode 120000 index 000000000..55f7fe191 --- /dev/null +++ b/site/sfguides/src/security_access_to_sensitive_objects/_shared_assets @@ -0,0 +1 @@ +../_shared_assets \ No newline at end of file diff --git a/site/sfguides/src/security_access_to_sensitive_objects/assets/Snowflake_RBAC_Suggested.png b/site/sfguides/src/security_access_to_sensitive_objects/assets/Snowflake_RBAC_Suggested.png new file mode 100644 index 000000000..9785543d7 Binary files /dev/null and b/site/sfguides/src/security_access_to_sensitive_objects/assets/Snowflake_RBAC_Suggested.png differ diff --git a/site/sfguides/src/security_access_to_sensitive_objects/assets/Snowflake_RBAC_Traditional.png b/site/sfguides/src/security_access_to_sensitive_objects/assets/Snowflake_RBAC_Traditional.png new file mode 100644 index 000000000..082a4a731 Binary files /dev/null and b/site/sfguides/src/security_access_to_sensitive_objects/assets/Snowflake_RBAC_Traditional.png differ diff --git a/site/sfguides/src/security_access_to_sensitive_objects/security_access_to_sensitive_objects.md b/site/sfguides/src/security_access_to_sensitive_objects/security_access_to_sensitive_objects.md new file mode 100644 index 000000000..b94fc412a --- /dev/null +++ b/site/sfguides/src/security_access_to_sensitive_objects/security_access_to_sensitive_objects.md @@ -0,0 +1,264 @@ 
+summary: Security - Access to Sensitive Objects +id: security_access_to_sensitive_objects +categories: Architecture Patterns +tags: patterns, security, rbac, objects, access +status: Published + +# Architecture Pattern: Security - Access to Sensitive Objects + +## Overview + +This pattern provides an +approach for granting access to schemas containing sensitive data +without creating a fork in the RBAC role hierarchy. Forking the RBAC +hierarchy is commonly prescribed in order to provide one role set which +grants access to non-sensitive data and another with sensitive data +access. This privileged role must then be properly inherited and/or +activated by the end user, and it results in a duplication of the +privilege set: one for non-sensitive data and one for sensitive data. + +This pattern instead proposes granting +a privilege set to all objects in a database regardless of their +sensitivity. It is then only the USAGE privilege, controlled +by a separate database-specific sensitive role, that would be inherited +by the top-level role. This effectively eliminates the fork in the +hierarchy and reduces the number of roles a user must request access +to. Instead of the user having to request a sensitive role with its own +access privileges, they can simply request that sensitive +data access be enabled. + +This pattern does not prescribe how to populate these objects, perform +row- or column-level security, or grant roles to users, each of which may +also be required. The scope of this pattern is simply how to provide +visibility to the objects themselves. + +### Pattern Series: Security + +This guide is part of a series on Security. 
The guides are: +- [Access to sensitive objects](../security_access_to_sensitive_objects/index.html) +- [Authentication](../security_authentication_pattern/index.html) +- [Network Architecture](../security_network_architecture_pattern/index.html) + +### Intended Audience + +This document is for Enterprise and Solution Architects who want to understand the connectivity capabilities and best practices of Snowflake and Snowflake Partner technologies. This document is not intended for use by implementation teams, although an implementation example is provided. + +### When To Use This Pattern + +This pattern works well when the following +conditions are true: + +1. Within a database, datasets are grouped by schema, + by which access must be controlled +2. Access to these schemas is controlled by an + identity governance & access management system +3. Users request access to specific data sets, and those requests + must be approved +4. Access roles are inherited by some level of + functional role. The functional role could be at a group or + individual level. + +### What You'll Learn + +1. Snowflake's Role Based Access Control enables complex access requirements to be developed through access and functional roles +2. Snowflake can integrate with Enterprise permissions management systems +3. Sensitive data access can be managed simply and clearly for users + + +## Pattern Details + +Objects in Snowflake are contained in a hierarchy of containers. + Databases contain schemas, which contain objects. Each level of the +hierarchy has its own set of privileges. In order to +successfully access data in the lowest-level objects which contain data - such as a table or a view - the role must have the appropriate +privileges on all objects higher in the hierarchy. A role must first +have the privilege to view a database, commonly granted with the +database usage privilege. Then the role can only see schemas for +which the schema usage privilege has been granted.
 Finally, the user must have privileges on the underlying objects. + + +Although the object containers - meaning database, schema and tables +(for example) - are hierarchical, the privileges can be granted out of +order, which is what this pattern suggests. A role inherits a +certain privilege set on all objects in a database - +this privilege set can be any combination of CRUD privileges. The role +is then granted usage on the database. At this point the role can see +the database, and has common privileges on objects - but is unable to +view the underlying objects because no schema-level privileges have been +granted. Now, a user can request permissions to specific schemas. The +only privilege the security admin must grant is the usage privilege. + Once that usage is granted and properly inherited by a functional role +that aggregates the object-level privileges along with the usage privilege, + the user will be able to access the data set. + +The granting of these schema-level roles is commonly managed by an +enterprise identity governance and access management system. Within +this enterprise system, a user requests access to specific data sets, +which then follows an approval process. Once the proper approvals have +occurred, the role containing the usage privilege on the approved schema +is assigned to the requesting user's functional role. This granting and +inheritance can be implemented using the SCIM 2.0 API, JDBC calls to +customer stored procedures, or procedures or SQL executed +directly in Snowflake. + +#### Key Points + +1. Even if a role has privileges on an object, if it does not have the + USAGE privilege on the database and schema containing the object, it + will not be visible to that role. +2. For each schema containing sensitive data, a role + is created and granted the USAGE privilege on that schema. +3. This sensitive role is then granted to the + functional role which has been approved to access the sensitive + data. 
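The key points above can be sketched in SQL along these lines (all object and role names here are hypothetical placeholders, not part of the pattern itself):

```SQL
-- Sketch: a usage-only role for a sensitive schema (hypothetical names)
create role sales_db_sensitive;
grant usage on schema sales_db.sensitive_schema to role sales_db_sensitive;

-- After approval, the functional role inherits the usage-only role;
-- object privileges (e.g. SELECT) come from the separate access role
grant role sales_db_sensitive to role analyst_functional_role;
```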
+ +## Pattern Example - Sensitive RBAC Hierarchy + +This is a working example of how this pattern could be +implemented within a particular context. + +### Business Scenario + +1. Snowflake will integrate with an enterprise + permissions management catalog system. All roles in Snowflake which + a customer will be granted need to be listed in this catalog. Given + the volume of databases and schemas for the project, an emphasis on + role reduction must be made. +2. The data set in Snowflake will include two + sensitivity classifications: Sensitive, which will have limited + access, and non-sensitive, which all users will have access + to. + +### Pattern Details + +1. Database `PROD_DB` contains two schemas, + `PUBLIC_SCHEMA` and `SENSITIVE_SCHEMA`. +2. A `PROD_DB_RO` role is created. The following + privileges are granted to the role: + 1. `USAGE` on `PROD_DB` + 2. `USAGE` on `PUBLIC_SCHEMA` + 3. `SELECT` on all `TABLES` in `PROD_DB` +3. A `PROD_DB_RW` role is created. The following + privileges are granted to the role: + 1. `INSERT` & `UPDATE` on all `TABLES` in database + `PROD_DB` + 2. `PROD_DB_RO` is granted to `PROD_DB_RW` +4. A `PROD_DB_SENSITIVE` role is created. The following + privileges are granted to the role: + 1. `USAGE` on schema `SENSITIVE_SCHEMA` + 2. Note there are no lower-level object grants to the + `SENSITIVE` schema role. It also is not inherited, nor does it inherit + other object access roles. +5. A functional role, `IT_ANALYTICS_ROLE`, is + created. This role will inherit the access-level roles and be + granted to users. This role will be activated by the user. +6. Within the enterprise identity governance and + access management solution, the following roles will be listed for a + user to request, with a user required to select at least one from + each category: + 1. Access roles: + 1. `PROD_DB_RO` + 2. `PROD_DB_RW` + 3. `PROD_DB_SENSITIVE` + 2. Functional Roles + 1. `IT_ANALYTICS_ROLE` +7. 
Scenario 1: Bill, an IT Business Analyst, requires read/write access
+   to non-sensitive data in `PROD_DB`.
+   1. Bill already has the `IT_ANALYTICS_ROLE` granted to his user.
+   2. Bill requests `PROD_DB_RW`.
+   3. After the approval process, `PROD_DB_RW` is granted to the
+      `IT_ANALYTICS_ROLE` role. Bill now has read/write access to all
+      objects in the public schema.
+8. Scenario 2: Alice, an HR Business Analyst, requires read access to
+   `PROD_DB` but also requires access to payroll data kept within the
+   sensitive schema.
+   1. Alice already has the `HR_ANALYSTS` functional role granted to
+      her user.
+   2. Alice requests the `PROD_DB_RO` role.
+   3. Alice requests the `PROD_DB_SENSITIVE` role.
+   4. After the appropriate approval process, the roles are granted to
+      the `HR_ANALYSTS` role, and Alice can now read all tables in both
+      the `PUBLIC_SCHEMA` and `SENSITIVE_SCHEMA` schemas.
+
+![Snowflake_RBAC_Suggested-image](assets/Snowflake_RBAC_Suggested.png)
+
+Fig 1.0 Suggested Approach
+
+![Snowflake_RBAC_Traditional-image](assets/Snowflake_RBAC_Traditional.png)
+
+Fig 2.0 Traditional Pattern
+
+## Conclusion
+
+### What We've Covered
+
+1. Snowflake's Role Based Access Control enables complex access requirements to be met through access and functional roles
+2. Snowflake can integrate with enterprise permissions management systems
+3. Sensitive data access can be managed simply and clearly for users
+
+### Guidance
+
+#### Incompatibilities
+
+1. This pattern assumes a user should have the same access-level
+   permissions on all objects in a database. If the user does require
+   separate permission levels for schemas contained within the same
+   database, the model may need to be extended or a different model
+   used.
+
+#### Other Implications
+
+1. Some applications that integrate with SCIM may not support all
+   functionality required to properly manage this approach, requiring a
+   custom SCIM or JDBC integration.
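+The role hierarchy in the example above can be sketched in Snowflake
+SQL, using the role and schema names from the pattern (a hedged sketch,
+not a complete script):
+
+```sql
+CREATE ROLE PROD_DB_RO;
+GRANT USAGE ON DATABASE PROD_DB TO ROLE PROD_DB_RO;
+GRANT USAGE ON SCHEMA PROD_DB.PUBLIC_SCHEMA TO ROLE PROD_DB_RO;
+GRANT SELECT ON ALL TABLES IN DATABASE PROD_DB TO ROLE PROD_DB_RO;
+
+CREATE ROLE PROD_DB_RW;
+GRANT INSERT, UPDATE ON ALL TABLES IN DATABASE PROD_DB TO ROLE PROD_DB_RW;
+GRANT ROLE PROD_DB_RO TO ROLE PROD_DB_RW;
+
+-- The sensitive role carries only schema USAGE; no object grants:
+CREATE ROLE PROD_DB_SENSITIVE;
+GRANT USAGE ON SCHEMA PROD_DB.SENSITIVE_SCHEMA TO ROLE PROD_DB_SENSITIVE;
+
+-- Scenario 2: after approval, Alice's functional role receives both:
+GRANT ROLE PROD_DB_RO TO ROLE HR_ANALYSTS;
+GRANT ROLE PROD_DB_SENSITIVE TO ROLE HR_ANALYSTS;
+```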
+
+### Design Principles Enabled by this Pattern
+
+With a traditional approach of having non-sensitive and sensitive
+versions of RBAC roles for a database and/or schema, the user must
+determine both which data set they should have access to and which
+level of access they should have to that data, and then request the
+matching role. This may not be intuitive to users who are not trained
+and experienced with Snowflake RBAC. With the model proposed in this
+pattern, the access level has already been determined, likely based on
+the organizational role of the user. The only request the user makes is
+which data sets they should be able to view.
+
+### Key Benefits of this Pattern
+
+The benefit of this pattern is that when a user reviews the possible
+roles to request, they see only three roles and must decide 1) what
+privilege level do I need, and 2) do I need access to sensitive data.
+These decisions are made independently of each other. In a typical
+model, this same hierarchy would require at least 4 roles, and each
+role would be a distinct set of combined privileges. More importantly,
+a legacy model would require at least 9 grants of privileges to roles,
+whereas the suggested pattern requires only 5. These numbers may seem
+insignificant; however, as implementations of Snowflake grow and
+evolve, simplification of RBAC hierarchies will be critical to
+successful extensibility and ease of management.
+
+1. Simplified RBAC hierarchy
+2. Simplified enterprise catalog of available roles
+3. More intuitive access selections for common users
+4. 
Simplified integration with IAM (Identity and + Access Management) or IGA (Identity Governance and Administration) + tools + +### Related Resources + +- Snowflake community posts + - [Role Inheritance and Role Composition in Snowflake](https://community.snowflake.com/s/article/snowflake-rbac-security-prefers-role-inheritance-to-role-composition) +- Snowflake Documentation + - [Access Control in Snowflake](https://docs.snowflake.com/en/user-guide/security-access-control.html) + - [Overview of Access Control](https://docs.snowflake.com/en/user-guide/security-access-control-overview.html) + - [Access Control Considerations](https://docs.snowflake.com/en/user-guide/security-access-control-considerations.html) + - [Access Control Privileges](https://docs.snowflake.com/en/user-guide/security-access-control-privileges.html) + - [Configuring Access Control](https://docs.snowflake.com/en/user-guide/security-access-control-configure.html) + - [User Management](https://docs.snowflake.com/en/user-guide/admin-user-management.html) + - [User & Security DDL](https://docs.snowflake.com/en/sql-reference/ddl-user-security.html) diff --git a/site/sfguides/src/security_authentication_pattern/_shared_assets b/site/sfguides/src/security_authentication_pattern/_shared_assets new file mode 120000 index 000000000..55f7fe191 --- /dev/null +++ b/site/sfguides/src/security_authentication_pattern/_shared_assets @@ -0,0 +1 @@ +../_shared_assets \ No newline at end of file diff --git a/site/sfguides/src/security_authentication_pattern/security_authentication_pattern.md b/site/sfguides/src/security_authentication_pattern/security_authentication_pattern.md new file mode 100644 index 000000000..553b32520 --- /dev/null +++ b/site/sfguides/src/security_authentication_pattern/security_authentication_pattern.md @@ -0,0 +1,300 @@ +summary: Security - Authentication Pattern +id: security_authentication_pattern +categories: Architecture Patterns +tags: patterns, authentication, security +status: Published 
+
+# Architecture Pattern: Security - Authentication
+
+## Overview
+
+Snowflake supports authentication methods that cover a range of
+scenarios, from interactive human use to programmatic service-account
+use cases. Client applications that connect to data sources like
+Snowflake typically have their own specifically supported
+authentication methods that vary from application to application.
+Consider BI tools, for example: some support SAML 2.0 for single
+sign-on, while others don't.
+
+This document spotlights authentication patterns that support the
+following scenarios:
+
+1. Interactive, SSO authentication for humans
+2. Non-interactive authentication for non-human users, such as
+   programmatic accounts and service accounts
+
+### Pattern Series: Security
+
+This guide is part of a series on Security. The guides are:
+- [Access to sensitive objects](../security_access_to_sensitive_objects/index.html)
+- [Authentication](../security_authentication_pattern/index.html)
+- [Network Architecture](../security_network_architecture_pattern/index.html)
+
+### Intended Audience
+
+This document is for Enterprise and Solution Architects who want to understand the connectivity capabilities and best practices of Snowflake and Snowflake Partner technologies. This document is not intended for use by implementation teams, although an implementation example is provided.
+
+### When To Use This Pattern
+
+The patterns in this document satisfy one or more of the following
+requirements:
+
+1. The organization has both cloud and on-prem tools that need to
+   authenticate to Snowflake.
+2. The organization uses an IdP to manage authentication, which
+   eliminates the need for users to maintain multiple passwords for
+   different systems.
+3. The organization has service accounts that require an authentication
+   method that's more secure than username and password.
+4. 
Legal or contractual agreements require the organization to
+   implement specific authentication methods.
+
+### What You'll Learn
+
+How to apply two techniques for authenticating access to Snowflake:
+- Federated authentication
+- Key pairs
+
+## Pattern Details
+
+Across the three authentication patterns, there are five ways to
+authenticate to Snowflake:
+
+1. Built-in username/password authentication. The password is stored in
+   the Snowflake USER object, and the user authenticates directly with
+   Snowflake, supplying the password in a connection string or typing
+   it in. This option is not as secure as the alternatives.
+2. Built-in username/password authentication with multi-factor
+   authentication (MFA). Adds a second factor for security. Note that
+   this option only supports Duo MFA.
+3. SSO powered by SAML 2.0. Best suited to human, interactive use
+   cases.
+4. Key pair authentication. Best suited to non-human users, such as
+   programmatic access or service accounts.
+5. OAuth 2.0 code grant flow. Allows delegated access to Snowflake
+   data.
+
+Snowflake does not recommend basic, built-in username/password
+authentication (option 1) because the alternatives offer better
+security. In situations where the only way to define a connection to
+an application is a username and password on the connection screen,
+Snowflake recommends option 2, multi-factor authentication implemented
+through Duo.
+
+There are human, interactive use cases where federated authentication
+is the best supported method. Any SAML 2.0 compatible IdP can achieve
+this. Some partner applications also deliver federated SSO experiences
+leveraging OAuth 2.0; however, client application support for federated
+authentication varies. SAML 2.0 is an option in some cases through
+Snowflake's "External Browser" mode on the desktop. When a desktop
+application is configured to use External Browser mode, a
+Snowflake-provided driver opens a new browser tab/window so the user
+can authenticate with their IdP credentials.
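+The federated (SAML) option described above is configured in Snowflake
+as a SAML2 security integration. The sketch below is illustrative only;
+the integration name, issuer, URL, and certificate values are
+placeholders you would replace with values from your IdP.
+
+```sql
+-- Hypothetical values shown; take the real ones from your IdP's
+-- Snowflake application configuration.
+CREATE SECURITY INTEGRATION my_idp_sso
+  TYPE = SAML2
+  ENABLED = TRUE
+  SAML2_ISSUER = 'https://idp.example.com'
+  SAML2_SSO_URL = 'https://idp.example.com/sso/saml'
+  SAML2_PROVIDER = 'OKTA'
+  SAML2_X509_CERT = 'MIIC...';
+```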
+
+Programmatic (non-human) use cases can use built-in service account
+passwords to authenticate, but use this method only as a last resort.
+Instead, consider key pair authentication combined with a secrets
+management solution, where the client signs with its private key and
+Snowflake uses the registered public key to verify the signature and
+authenticate. A third option is External OAuth, which is the only
+method that allows for an SSO-based user credential in the programmatic
+scenario.
+
+## Pattern Example - OAuth
+
+Now let's look at two applications, Tableau and Microsoft Power BI,
+that support OAuth. The type of OAuth supported differs, and in each
+case the client application determines which technology you should use.
+We will start with Tableau.
+
+When connecting to Snowflake, Tableau Desktop/Server/Online supports an
+SSO-like experience through Snowflake OAuth. This example is
+appropriate when customers want to provide an SSO-like experience to
+their Tableau user population when accessing Snowflake data. This
+method prompts the user to grant Tableau access to Snowflake. An
+external authorization server is not required in this scenario:
+Snowflake acts as both the authorization server and the resource
+server. Customers that manage identity centrally and want a more
+secure method of accessing Snowflake than username and password will
+want to use this pattern.
+
+When connecting to Snowflake, Microsoft Power BI supports an SSO-like
+experience through External OAuth. This scenario is appropriate when
+customers want to allow Microsoft Power BI users to connect to
+Snowflake using identity provider credentials and an OAuth 2.0
+implementation. Customers should take into account that this approach
+requires Azure AD, because that is what issues the token for access to
+Snowflake on behalf of the user. This approach is still appropriate to
+consider if a customer's primary IdP is not Azure AD.
For example, if the customer IdP is Okta or
+Active Directory Federation Services (ADFS), Azure AD takes the user
+through the Security Assertion Markup Language (SAML) authentication
+process with the IdP before logging the user into the Power BI service.
+This scenario is appropriate when a customer wants to manage identity
+centrally and provide a more secure method of accessing Snowflake than
+username and password.
+
+SAML SSO may not be appropriate in cases that involve a customer's
+Snowflake administrators. An outage at the IdP could prevent Snowflake
+administrators whose passwords are stored in the IdP from logging in to
+Snowflake. For this reason, customers typically maintain at least one
+administrator with a Snowflake password to manage federated
+authentication and troubleshoot any issues that occur. SAML SSO is
+also not appropriate if the client applications do not support that
+method; in that case, evaluate whether the application supports an
+SSO-like experience similar to Power BI and Tableau.
+
+Some customers combine Cloud Service Provider (CSP) private networking
+technologies, such as AWS PrivateLink and Azure Private Link, with SAML
+SSO. This is an appropriate scenario but requires a choice: currently,
+customers can configure single sign-on to work with either their
+regular, non-PrivateLink URL or their PrivateLink URL, but not both.
+
+### Guidance
+
+#### Incompatibilities
+
+1. As of March 2021, SAML-based SSO can only be used on either public
+   or private Snowflake endpoints at one time. This will be addressed
+   in future releases.
+2. As of March 2021, Snowflake only supports a single IdP at a time
+   for each Snowflake account for SSO.
+3. For SSO, the web UI only supports SAML 2.0.
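+Returning to the Tableau example above: because Snowflake acts as both
+the authorization server and the resource server, enabling Snowflake
+OAuth for Tableau is a single security integration. A minimal sketch
+(the integration name is hypothetical):
+
+```sql
+-- Tableau Desktop is a built-in OAuth client; no external
+-- authorization server is required.
+CREATE SECURITY INTEGRATION tableau_desktop_oauth
+  TYPE = OAUTH
+  ENABLED = TRUE
+  OAUTH_CLIENT = TABLEAU_DESKTOP;
+```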
+
+### Design Principles & Benefits Enabled by this Pattern
+
+The benefit of configuring Snowflake for federated authentication is
+that users need to log in just once, with one set of credentials, to
+get access to all corporate apps, websites, and data for which they
+have permission. Users benefit from this simplicity because they do
+not have to manage multiple passwords for access to the SaaS
+applications, data, or websites required to perform their job duties.
+
+The unification of user access management means there is a central
+directory in which to provision and deprovision users. Configuring SSO
+for Snowflake with any SAML 2.0 compliant IdP lets customers treat
+Snowflake like the many other SaaS applications that use this common
+protocol.
+
+SSO for human, interactive use cases must consider the capabilities
+supported by Snowflake together with the authentication capabilities of
+the systems users employ to authenticate to Snowflake. This matrix
+needs to be considered along with the three scenarios described, to
+enable SSO for as many systems as possible.
+
+## Pattern Example - Key Pair Authentication
+
+This pattern example compares when you should use key pair
+authentication for non-human users versus when you should use External
+OAuth for secure, programmatic access to Snowflake data. It is
+relevant to programmatic or service account requirements for access to
+Snowflake. Evaluating the authentication methods supported by your
+service accounts and programmatic clients helps determine which
+non-human authentication methods to use with Snowflake.
+
+Key pair authentication of service accounts to Snowflake is
+appropriate when customers have requirements not to rely on a third
+party, or not to have the secret travel over the wire, when
+authenticating service accounts. The private key can be managed
+internally by a customer without relying on cloud-based IdPs such as
+Azure AD or Okta.
Secrets not traveling over the wire is a key
+benefit of key pair authentication: the private key stays with the
+customer and only the public key lives in Snowflake.
+
+Key pair authentication is also appropriate if the customer wants to
+remove the management of the secret from the service account
+authenticating to Snowflake. With key pair authentication, the private
+key does not need to be in the possession of the user; it can be
+managed by code, so the service account itself never holds the key.
+
+This example also allows for aggressive rotation of keys without
+disrupting connectivity, since Snowflake allows two active public key
+values at any time; consider this as part of the pattern. This example
+is best used with a secrets management platform, such as HashiCorp
+Vault, AWS Secrets Manager, or Azure Key Vault, to manage the private
+key.
+
+Key pair authentication is not appropriate in scenarios where no
+existing key infrastructure is in place to protect private keys. It
+may also not be appropriate in large environments where distributing
+and managing keys becomes more administrative overhead than the
+customer is willing to accept.
+
+External OAuth 2.0 is a supported method for non-human users to access
+Snowflake. Customers that want to allow SSO-based user credentials in
+the programmatic scenario should consider this option. OAuth 2.0 is
+appropriate for customers that want to centralize the management of
+tokens issued to Snowflake for service accounts, ensuring that
+programmatic or service account access to Snowflake data must go
+through the External OAuth configured service. Customers with this
+requirement may also have requirements to centralize the monitoring of
+authorizations across a number of applications.
Customers
+that do not wish to pass credentials over the wire or store secrets in
+Snowflake will find this method useful.
+
+More specific examples where OAuth is appropriate include embedding
+Snowflake into your application, where the application requires access
+to Snowflake, or cases where SAML is not feasible because a program,
+rather than a person, requires access to Snowflake.
+
+Customers with centralized monitoring requirements should consider
+OAuth. With OAuth, customers can see delegated access to Snowflake and
+other applications, such as Salesforce, in one place. This differs
+from key pair authentication, where the audit trail of user
+authentication lives in Snowflake. This model also supports cases
+where customers want to deprovision service identities from a
+centralized place.
+
+### Guidance
+
+#### Incompatibilities
+
+1. Snowflake OAuth is not applicable in the programmatic scenario.
+   External OAuth should be used instead.
+
+### Design Principles & Benefits Enabled by this Pattern
+
+The benefits of configuring Snowflake for key pair authentication
+include:
+
+1. The secret does not travel over the network.
+2. The user does not need the private key.
+3. Snowflake allows for the aggressive rotation of key pairs.
+
+The outcome is that key pair authentication is more secure than
+username and password.
+
+The result of External OAuth authentication is centralized management
+of tokens issued to Snowflake: service accounts, or users used
+exclusively for programmatic access, can only use Snowflake data when
+going through the External OAuth configured service. Customers benefit
+because sessions initiated with Snowflake do not require a password and
+are only initiated through External OAuth.
+
+## Conclusion
+
+### What We've Covered
+
+1. Snowflake allows multiple authentication methods
+2. Single Sign-On and OAuth can be used
+3. 
Service accounts have special considerations in how they should be used, with key pair authentication providing an option
+
+### Related Resources
+
+The following related information is available.
+
+- Snowflake Community Posts
+  - [Using SSO between Power BI and Snowflake](https://www.snowflake.com/blog/using-sso-between-power-bi-and-snowflake/)
+  - [Using OAuth 2.0 with Snowflake](https://www.snowflake.com/blog/using-oauth-2-0-with-snowflake/)
+  - [Snowflake Service Account Security Part 1](https://www.snowflake.com/blog/snowflake-service-account-securitypart-1/)
+  - [Snowflake Service Account Security Part 2](https://www.snowflake.com/blog/snowflake-service-account-security-part-2/)
+- Snowflake Documentation
+  - [Snowflake Password Policy](https://docs.snowflake.com/en/user-guide/admin-user-management.html#snowflake-password-policy)
+  - [Federated Authentication & SSO](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth.html)
+  - [Using Key Pair Authentication](https://docs.snowflake.com/en/user-guide/odbc-parameters.html)
+ - [Snowflake OAuth](https://docs.snowflake.com/en/user-guide/oauth-snowflake.html) + - [External OAuth](https://docs.snowflake.com/en/user-guide/oauth-external.html) + - [Summary of Security Features](https://docs.snowflake.com/en/user-guide/admin-security.html) +- Partner Documentation + - [Configure SSO - Azure AD and Snowflake](https://docs.microsoft.com/en-us/azure/active-directory/saas-apps/snowflake-tutorial) + - [Configure SSO - Okta and Snowflake](https://saml-doc.okta.com/SAML_Docs/How-to-Configure-SAML-2.0-for-Snowflake.html) diff --git a/site/sfguides/src/security_network_architecture_pattern/_shared_assets b/site/sfguides/src/security_network_architecture_pattern/_shared_assets new file mode 120000 index 000000000..55f7fe191 --- /dev/null +++ b/site/sfguides/src/security_network_architecture_pattern/_shared_assets @@ -0,0 +1 @@ +../_shared_assets \ No newline at end of file diff --git a/site/sfguides/src/security_network_architecture_pattern/assets/Snowflake_NAP_Connectivity.png b/site/sfguides/src/security_network_architecture_pattern/assets/Snowflake_NAP_Connectivity.png new file mode 100644 index 000000000..dda6ec1ab Binary files /dev/null and b/site/sfguides/src/security_network_architecture_pattern/assets/Snowflake_NAP_Connectivity.png differ diff --git a/site/sfguides/src/security_network_architecture_pattern/assets/Snowflake_NAP_PrivateLink.png b/site/sfguides/src/security_network_architecture_pattern/assets/Snowflake_NAP_PrivateLink.png new file mode 100644 index 000000000..a2067fe01 Binary files /dev/null and b/site/sfguides/src/security_network_architecture_pattern/assets/Snowflake_NAP_PrivateLink.png differ diff --git a/site/sfguides/src/security_network_architecture_pattern/security_network_architecture_pattern.md b/site/sfguides/src/security_network_architecture_pattern/security_network_architecture_pattern.md new file mode 100644 index 000000000..63fc2f02a --- /dev/null +++ 
b/site/sfguides/src/security_network_architecture_pattern/security_network_architecture_pattern.md @@ -0,0 +1,302 @@ +summary: Security - Network Architecture Pattern +id: security_network_architecture_pattern +categories: Architecture Patterns +tags: patterns, security, network +status: Published + +# Architecture Pattern: Security - Network +## Overview + +SaaS-style cloud data platforms present a number of network +connectivity challenges. This is especially true with regards to +security. In the past, traditional data platforms lived in the deepest, + most interior parts of the organization's networks and benefited from +layered security defenses that built up over time. With data workloads +migrating to the cloud, architects have had to rethink data security. +"Zero Trust" security models are popular for applications, but are they +the right choice to protect the data itself? Understandably, architects +are wary. + +This document seeks to allay Enterprise and Solution Architects' +concerns. In the following pages, we present connectivity patterns that +architects can combine as needed to address different security +requirements. + +Three security measures comprise these connectivity patterns. These +include: + +1. **Leveraging Snowflake's out-of-the-box network security**. All + Snowflake communications have multiple layers of built-in security + (described below). This is the default, + "do-nothing-extra" security option, which is the most common + choice among Snowflake customers. +2. **Layering on built-in Network Policies**. In addition to + out-of-the-box security, security-sensitive organizations typically + implement Network Policies to specify which IP addresses can connect + to the Snowflake data platform. +3. **Integrating CSP capabilities that may add more security to network connectivity**. A smaller number of organizations choose to use cloud service provider features such as private networking if they determine it's appropriate. 
+
+Alone or in combination, these measures comprise network connectivity
+patterns that are extremely secure.
+
+### Pattern Series: Security
+
+This guide is part of a series on Security. The guides are:
+- [Access to sensitive objects](../security_access_to_sensitive_objects/index.html)
+- [Authentication](../security_authentication_pattern/index.html)
+- [Network Architecture](../security_network_architecture_pattern/index.html)
+
+### Intended Audience
+
+This document is for Enterprise and Solution Architects who want to
+understand the connectivity capabilities and best practices of Snowflake
+and Snowflake Partner technologies. This document is
+not intended for use by implementation teams, although an
+implementation example is provided.
+
+### When To Use This Pattern
+
+Consider patterns that incorporate Network Policies or private
+networking if any of the following requirements describe your
+organization:
+
+1. Your organization requires SaaS and other cloud platforms to
+   restrict communications to only your organization's authorized
+   networks.
+2. Your organization is bound by third-party regulatory guidance to
+   only allow specific kinds of ingress and egress communications with
+   cloud providers.
+3. Your organization wants to run sensitive workloads that require
+   more stringent security controls, including controls specific to
+   networking.
+4. Your organization is bound by legal or contractual agreements that
+   require specific network controls.
+
+If your organization does not have one or more of these
+requirements, then the out-of-the-box Snowflake network security
+controls are likely more than sufficient to meet your needs.
+
+### What You'll Learn
+
+1. The different components of the Snowflake service have different types of network access
+2. Network policies allow an additional layer of access control to the Snowflake Service
+3. 
CSP provider features such as PrivateLink can also provide an additional layer of security
+
+## Pattern Details
+
+All Snowflake network connectivity architectures include five basic
+connections:
+
+1. The connection between the Snowflake driver/connector and
+   the Snowflake account URL, e.g. `acme.us-east-1.snowflakecomputing.com`
+2. The connection between the Snowflake driver/connector and one or
+   more OCSP providers, e.g. `ocsp.digicert.com`
+3. The connection between the Snowflake driver/connector and the
+   Snowflake Internal Stage,
+   e.g. `randomname1stg.blob.core.windows.net`
+4. The connection between the Snowflake service and the customer-owned
+   cloud storage, e.g. a customer's GCS bucket
+5. The connection between the users' browsers and the Snowflake Apps
+   layer, e.g. `apps.snowflake.com`
+
+This document describes the first three connections in detail. The
+first and fifth connections are functionally equivalent in this
+context. A full discussion of the fourth connection, which is
+connectivity from Snowflake to your organization's resources, will be
+covered in a separate article.
+
+There are two types of data flowing on these network paths:
+
+- The first is the organization's data (aka customer data),
+  which is the information the organization is interested in
+  protecting.
+- The second is OCSP (Online Certificate Status Protocol)
+  information, which is used to validate the certificates used to
+  establish TLS 1.2 tunnels for network communications. Only the OCSP
+  traffic uses an unencrypted channel, over port 80. There are patterns
+  where your organization may use TLS inspection of some kind, which
+  may make this OCSP communication moot, but those discussions are out
+  of scope for this document.
+
+The first and most common pattern is to leverage Snowflake's
+out-of-the-box connectivity. This uses TLS 1.2 encryption for all
+customer data communications.
It also leverages OCSP checking to ensure
+the validity and integrity of the certificates used to establish the
+TLS tunnels. Figure 1 shows a diagram of connections 1, 2, 3, and 4.
+(As 4 and 5 are out of scope, we will focus on parts 1, 2, and 3.)
+
+![Snowflake_NAP_Connectivity-image](assets/Snowflake_NAP_Connectivity.png)
+
+Figure 1. Diagram of Snowflake's out-of-the-box network
+connectivity. Azure is shown, but the pattern is the same for AWS,
+Azure, and GCP. The only things that change are the CSP component
+names.
+
+A common misconception is that connectivity to the Snowflake Internal
+Stage is optional; it is not. However, using an External Stage,
+which leverages the customer's cloud storage for use with Snowflake,
+is optional. (The External Stage connection is labeled as number four
+in Figure 1.) No action is needed to use this pattern of
+communication other than to not actively block any aspect of the
+connectivity.
+
+The second pattern is to add Snowflake Network Policies to the
+out-of-the-box connectivity shown in Figure 1. The full scope of
+options for [Network Policies is discussed at length in the Snowflake
+documentation](https://docs.snowflake.com/en/user-guide/network-policies.html).
+For this discussion we will only note a few details that could affect
+architectural considerations:
+
+1. Network Policies use IP CIDR ranges as inputs, and contain both
+   Allowed and Blocked IP lists.
+2. One can apply a Network Policy to the entire Snowflake account, to
+   specific integrations that have endpoints for network communications
+   exposed on channel 1, e.g.
+   [SCIM](https://docs.snowflake.com/en/user-guide/scim-okta.html#managing-scim-network-policies),
+   or to specific Snowflake Users. The most specific Policy always
+   wins.
+3. There can be only one active Network Policy in any given context at
+   one time (e.g. only one per account, integration, or
+   user).
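+The Network Policy details above can be sketched in SQL. The policy
+name, user name, and CIDR ranges here are hypothetical:
+
+```sql
+-- Allowed and Blocked lists are both IP CIDR ranges:
+CREATE NETWORK POLICY corp_only
+  ALLOWED_IP_LIST = ('192.168.1.0/24', '10.0.0.0/8')
+  BLOCKED_IP_LIST = ('192.168.1.99');
+
+-- A policy can apply account-wide or to a specific user;
+-- the most specific policy wins:
+ALTER ACCOUNT SET NETWORK_POLICY = corp_only;
+ALTER USER etl_svc SET NETWORK_POLICY = corp_only;
+```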
+
+The third pattern incorporates integration with Cloud Service Provider
+(CSP) private networking options. Currently, the available integrations
+are with [AWS
+PrivateLink](https://docs.snowflake.com/en/user-guide/admin-security-privatelink.html) and
+[Azure Private
+Link](https://docs.snowflake.com/en/user-guide/privatelink-azure.html).
+These CSP offerings provide a point-to-point, client-side-initiated
+private network channel for communications. They do not have the
+downsides of VPN or peering, but still offer a point-to-point network
+channel. Figure 2 offers a view of how this alters the connectivity.
+
+![Snowflake_NAP_PrivateLink-image](assets/Snowflake_NAP_PrivateLink.png)
+
+Figure 2. Diagram showing Snowflake network connectivity integrated
+with CSP private networking. AWS is shown, but the pattern is the same
+for Azure and GCP. The only things that change are the CSP component
+names.
+
+Figure 2 shifts the origin of communication from a network outside the
+CSP to one inside the CSP, but it is also possible to accomplish this
+integration when communicating from outside the CSP network
+(e.g. from on premises). This uses more network components to
+form a connection between the organization's network and the CSP's
+(e.g. AWS Direct Connect and BGP settings), but those components
+are not Snowflake specific and therefore out of scope for this
+discussion.
+
+From a technical point of view, the communications happening for parts
+1, 2, and 3 of the Snowflake network pattern do not change in the
+PrivateLink integration pictured in Figure 2. What changes is that the
+Snowflake driver or connector is told to connect to a privately hosted
+DNS address in the CSP infrastructure, which then points to a private
+endpoint in the organization's CSP private network. The Snowflake
+driver or connector makes an altered DNS call because the settings
+tell it to, and everything that follows is the normal functioning
+of DNS and routing.
The Snowflake driver or connector is unaware of all
+the DNS and routing work, and therefore this is fully supported on
+every platform for which Snowflake has drivers or connectors.
+
+## Pattern Example - Pharmaceuticals
+
+Let's look at a real-world example of how a customer applied all of
+these approaches in the same deployment. This customer is in the
+pharmaceutical industry and had many different workloads on Snowflake
+and, as a result, many different users consuming data from the
+platform. The customer's main challenge was to simultaneously serve
+the many employees in the field consuming reports and other ad hoc
+information, and protect the highly regulated information being
+loaded, unloaded, and used in complicated server-side analytics to
+produce the data those users would consume. The core of the challenge
+was serving the users without forcing them to use cumbersome VPN
+connections or other client-side infrastructure to get their laptops
+and other devices onto the private networks where the CSP private
+networking integrations would live.
+
+The solution was to set up multiple paths to the Snowflake Data
+Platform to be used in different conditions. The key insight came when
+the architects realized that the users who were outside the corporate
+networks never handled sensitive information in an unmasked form.
+Since the real information, and therefore the real risk, was not being
+exposed, the out-of-the-box Snowflake networking protections more than
+met the customer's minimum network security policies for non-protected
+information. Most of their users were able to use default networking
+for consuming reports and other information in the field.
+
+Where data loading, unloading, and large-scale analytics did handle
+sensitive information in its unmasked form, the organization leveraged
+two extra controls in combination.
First, they
+used Snowflake's built-in Network Policies to lock down all the service
+account users doing this work so that they could communicate only from
+the cloud private networks where all the processing was taking place.
+Second, they integrated these communications with their CSP's private
+networking features to ensure that the channels were all point-to-point
+and client-side initiated.
+
+Certain users had access that straddled the barrier between information
+in its masked and unmasked forms. The most challenging group was the
+growing number of data scientists this organization supported. These
+users would design models and do engineering on masked data, but then
+run large-scale training and apply their work using unmasked data.
+Because they needed to move between two network contexts, they would use
+their personal accounts with the same network connectivity as all other
+users in the field. But they also had to access service accounts that
+were restricted to running server-side over the CSP private networking
+integration. To access these accounts, they used VPN connectivity to
+connect to systems running in their CSP private networks. Other users
+who needed both sets of access used similar approaches.
+
+## Conclusion
+
+Network access control is an important tool for securing access to Snowflake.
+
+### What We've Covered
+
+1. The different components of the Snowflake service have different types of network access
+2. Network policies provide an additional layer of access control to the Snowflake service
+3. CSP features such as PrivateLink can also provide an additional layer of security
+
+### Guidance
+
+#### Misapplications to avoid
+
+1. As of March 2021, any design that applies a Network Policy to every
+   user is likely on the wrong path.
+2. Many organizations attempt to apply CSP private networking
+   technologies to communication channels where they provide little
+   additional security but add significant operational overhead.
+   Consider CSP private networking only where large volumes of data or
+   extremely sensitive data are flowing.
+
+#### Incompatibilities
+1. As of March 2021, SAML-based SSO can be used with either the public
+   URL or the private URL (for CSP private networking integration)
+   Snowflake endpoint, but not both at the same time. This will be
+   addressed in future releases.
+
+### Related Resources
+
+- Snowflake community posts
+  - [Setup Considerations When Integrating AWS PrivateLink with Snowflake](https://community.snowflake.com/s/article/Setup-Considerations-When-Integrating-AWS-PrivateLink-With-Snowflake)
+  - [HOWTO: Troubleshoot PrivateLink Configuration for Snowflake](https://community.snowflake.com/s/article/HowTo-Troubleshoot-Privatelink-configuration-for-Snowflake)
+  - [HOWTO: Block a Specific IP in Snowflake](https://community.snowflake.com/s/article/How-to-block-a-specific-IP-address-using-Network-Policies-in-Snowflake)
+- Snowflake documentation
+  - [Snowflake Network Policies](https://docs.snowflake.com/en/user-guide/network-policies.html)
+  - [Applying a Network Policy to a specific SCIM integration (e.g. Okta)](https://docs.snowflake.com/en/user-guide/scim-okta.html#managing-scim-network-policies)
+  - [Snowflake AWS PrivateLink Integration](https://docs.snowflake.com/en/user-guide/admin-security-privatelink.html)
+  - [Snowflake Azure Private Link Integration](https://docs.snowflake.com/en/user-guide/privatelink-azure.html)
+  - [Summary of Snowflake Security Features](https://docs.snowflake.com/en/user-guide/admin-security.html)
+- Partner documentation
+  - [AWS PrivateLink Product Page](https://aws.amazon.com/privatelink/)
+  - [AWS PrivateLink Documentation](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html#what-is-privatelink)
+  - [Azure Private Link Product Page](https://azure.microsoft.com/en-us/services/private-link/)
+  - [Azure Private Link Documentation](https://docs.microsoft.com/en-us/azure/private-link/private-link-overview)