The primary purpose of this stage is to enable critical organization-level functionalities that depend on broad administrative permissions, and prepare the prerequisites needed to enable automation in this and future stages.
It is intentionally simple, to minimize usage of administrative-level permissions and enable simple auditing and troubleshooting, and only deals with three sets of resources:
- project, service accounts, and GCS buckets for automation
- projects, BQ datasets, and sinks for audit log and billing exports
- IAM bindings on the organization
Use the following diagram as a simple high level reference for the following sections, which describe the stage and its possible customizations in detail.
As mentioned above, this stage only does the bare minimum required to bootstrap automation, and ensure that base audit and billing exports are in place from the start to provide some measure of accountability, even before the security configurations are applied in a later stage.
It also sets up organization-level IAM bindings so the Organization Administrator role is only used here, trading off some design freedom for ease of auditing and troubleshooting, and reducing the risk of costly security mistakes down the line. The only exception to this rule is for the Resource Management stage service account, described below.
User groups are important, not only here but throughout the whole automation process. They provide a stable frame of reference that allows decoupling the final set of permissions for each group, from the stage where entities and resources are created and their IAM bindings defined. For example, the final set of roles for the networking group is contributed by this stage at the organization level (XPN Admin, Cloud Asset Viewer, etc.), and by the Resource Management stage at the folder level.
We have standardized the initial set of groups on those outlined in the GCP Enterprise Setup Checklist to simplify adoption. They provide a comprehensive and flexible starting point that can suit most users. Adding new groups, or deviating from the initial setup is possible and reasonably simple, and it's briefly outlined in the customization section below.
The service account used in the Resource Management stage needs to be able to grant specific permissions at the organizational level, to enable specific functionality for subsequent stages that deal with network or security resources, or billing-related activities.
In order to be able to assign those roles without having the full authority of the Organization Admin role, this stage defines a custom role that only allows setting IAM policies on the organization, and grants it via a delegated role grant that only allows it to be used to grant a limited subset of roles.
In this way, the Resource Management service account can effectively act as an Organization Admin, but only to grant the specific roles it needs to control.
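As a rough illustration of the delegated role grant described above, the sketch below builds the kind of IAM condition expression that limits which roles a principal may grant. The function name and the two example roles are assumptions made for illustration; the actual list of delegated roles is defined in the stage code.

```shell
# Build a CEL expression that only allows granting/revoking the listed roles.
# Hypothetical helper: the name and arguments are not part of FAST.
delegated_grant_condition() {
  local roles=""
  local role
  for role in "$@"; do
    # append each role as a quoted, comma-separated list item
    roles="${roles}${roles:+, }'${role}'"
  done
  printf "api.getAttribute('iam.googleapis.com/modifiedGrantsByRole', []).hasOnly([%s])\n" "$roles"
}

# Example roles only, chosen for illustration
delegated_grant_condition roles/compute.xpnAdmin roles/billing.user
```

The resulting string is what would go in the expression of the IAM condition attached to the custom role binding, so that the role can set IAM policies on the organization only for the roles in the list.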
One consequence of the above setup is the need to configure IAM bindings that can be assigned via the condition as non-authoritative, since those same roles are effectively under the control of two stages: this one and Resource Management. Using authoritative bindings for these roles (instead of non-authoritative ones) would generate potential conflicts, where each stage could try to overwrite and negate the bindings applied by the other at each `apply` cycle.
A full reference of IAM roles managed by this stage is available here.
One other design choice worth mentioning here is using a single automation project for all foundational stages. We trade off some complexity on the API side (single source for usage quota, multiple service activation) for increased flexibility and simpler operations, while still effectively providing the same degree of separation via resource-level IAM.
We support three use cases with regard to billing:
- the billing account is part of this same organization: IAM bindings will be set at the organization level
- the billing account is part of a different organization: billing IAM bindings will be set at the organization level in the organization that owns the billing account
- the billing account is not considered part of an organization (even though it might be): billing IAM bindings are set on the billing account itself
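The three cases can be summarized as a small decision function. This is only a sketch of the reasoning: the function name, its arguments and its output strings are hypothetical and do not correspond to anything in the stage code.

```shell
# Print where billing IAM bindings end up for a given configuration.
# $1: id of the organization managed by this stage
# $2: id of the organization owning the billing account (empty if none)
billing_iam_scope() {
  local org_id="$1"
  local billing_org_id="${2:-}"
  if [ -z "$billing_org_id" ]; then
    # billing account used in isolation: bindings on the account itself
    echo "billing-account"
  elif [ "$billing_org_id" = "$org_id" ]; then
    # same organization: bindings at the organization level
    echo "organization:${org_id}"
  else
    # different organization: bindings on the billing organization
    echo "organization:${billing_org_id}"
  fi
}

billing_iam_scope 1234567890 1234567890
billing_iam_scope 1234567890 9876543210
billing_iam_scope 1234567890
```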
For same-organization billing, we configure a custom organization role that can set IAM bindings, via a delegated role grant to limit its scope to the relevant roles.
For details on configuring the different billing account modes, refer to the How to run this stage section below.
Because of API availability limitations, billing export to the `billing_export` BigQuery dataset in the billing project has to be enabled manually. The dataset itself is created as part of the bootstrap stage. The process to share billing data is outlined here.
We create organization-level log sinks early in the bootstrap process to ensure a proper audit trail is in place from the very beginning. By default, we provide log filters to capture Cloud Audit Logs and VPC Service Controls violations into a BigQuery dataset in the top-level audit project.
The Customizations section explains how to change the logs captured and their destination.
We are intentionally not supporting random prefixes/suffixes for names, as that is an antipattern typically only used in development. It does not map to our customers' actual production usage, where they always adopt a fixed naming convention.
What is implemented here is a fairly common convention, composed of tokens ordered by relative importance:
- a static prefix of at most 9 characters (e.g. `myco` or `myco-gcp`)
- an environment identifier (e.g. `prod`)
- a team/owner identifier (e.g. `sec` for Security)
- a context identifier (e.g. `core` or `kms`)
- an arbitrary identifier used to distinguish similar resources (e.g. `0`, `1`)
Tokens are joined by a `-` character, making it easy to separate the individual tokens visually, and to programmatically split them in billing exports to derive initial high-level groupings for cost attribution.
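The joining and splitting described above can be sketched in a few lines of shell. The `fast_name` helper is hypothetical and only mirrors the convention for illustration; the token values are the examples used in the list above.

```shell
# Join naming tokens with "-", rejecting prefixes longer than 9 characters.
# Hypothetical helper, not part of FAST.
fast_name() {
  local prefix="$1"
  if [ "${#prefix}" -gt 9 ]; then
    echo "prefix '${prefix}' is longer than 9 characters" >&2
    return 1
  fi
  # "$*" joins all arguments with the first character of IFS
  local IFS=-
  echo "$*"
}

name="$(fast_name myco prod sec core 0)"
echo "$name"

# split the name back into its tokens, e.g. when processing billing exports
IFS=- read -r prefix env team context id <<EOF
$name
EOF
echo "env=${env} team=${team} context=${context}"
```

Note that this simple split assumes a prefix without dashes; a prefix such as `myco-gcp` would need the tokens to be counted from the right instead.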
The convention is used in its full form only for specific resources with globally unique names (projects, GCS buckets). Other resources adopt a shorter version for legibility, as the full context can always be derived from their project.
The Customizations section on names below explains how to configure tokens, or implement a different naming convention.
This stage also implements initial support for two interrelated features:
- configuration of Workload Identity Federation pools and providers
- configuration of CI/CD repositories to allow impersonation via Workload Identity Federation, and stage running via provided workflow templates
Workload Identity Federation support allows configuring external providers independently from CI/CD, and offers predefined attributes for a few well known ones (more can be easily added by editing the `identity-providers.tf` file). Once providers have been configured, their names are passed to the following stages via interface outputs, and can be leveraged to set up access or impersonation in IAM bindings.
CI/CD support is fully implemented for GitHub, Gitlab, and Cloud Source Repositories / Cloud Build. For GitHub, we also offer a separate supporting setup to quickly create / configure repositories.
For details on how to configure both features, refer to the Customizations sections below on Workload Identity Federation and CI/CD repositories.
These features are optional and only enabled if the relevant variables have been populated.
This stage has straightforward initial requirements, as it is designed to work on newly created GCP organizations. Four steps are needed to bring up this stage:
- an Organization Admin self-assigns the required roles listed below
- the same administrator runs the first `init/apply` sequence passing a special variable to `apply`
- the providers configuration file is derived from the Terraform output or linked from the generated file
- a second `init` is run to migrate state, and from then on, the stage is run via impersonation
The roles that the Organization Admin used in the first `apply` needs to self-grant are:
- Billing Account Administrator (`roles/billing.admin`) either on the organization or the billing account (see the following section for details)
- Logging Admin (`roles/logging.admin`)
- Organization Role Administrator (`roles/iam.organizationRoleAdmin`)
- Organization Administrator (`roles/resourcemanager.organizationAdmin`)
- Project Creator (`roles/resourcemanager.projectCreator`)
To quickly self-grant the above roles, run the following code snippet as the initial Organization Admin:
# set variable for current logged in user
export FAST_BU=$(gcloud config list --format 'value(core.account)')

# find and set your org id ($partofyourdomain is a placeholder for part of your domain name)
gcloud organizations list --filter display_name:$partofyourdomain
export FAST_ORG_ID=123456

# set needed roles
export FAST_ROLES="roles/billing.admin roles/logging.admin \
  roles/iam.organizationRoleAdmin roles/resourcemanager.projectCreator"

for role in $FAST_ROLES; do
  gcloud organizations add-iam-policy-binding $FAST_ORG_ID \
    --member user:$FAST_BU --role $role
done
Then make sure the same user is also part of the `gcp-organization-admins` group so that impersonating the automation service account later on will be possible.
If you are using a billing account belonging to a different organization (e.g. in multiple organization setups), some initial configurations are needed to ensure the identities running this stage can assign billing-related roles.
If the billing organization is managed by another version of this stage, we leverage the `organizationIamAdmin` role created there, to allow restricted granting of billing roles at the organization level.
If that's not the case, an equivalent role needs to exist, or the predefined `resourcemanager.organizationAdmin` role can be used if not managed authoritatively. The role name then needs to be manually changed in the `billing.tf` file, in the `google_organization_iam_binding` resource.
The identity applying this stage for the first time also needs two roles in the billing organization. They can be removed after the first `apply` completes successfully:
export FAST_BILLING_ORG_ID=789012
# roles are kept in an array and expanded element by element,
# so the loop visits both of them
FAST_ROLES=(roles/billing.admin roles/resourcemanager.organizationAdmin)
for role in "${FAST_ROLES[@]}"; do
  gcloud organizations add-iam-policy-binding $FAST_BILLING_ORG_ID \
    --member user:$FAST_BU --role $role
done
If you are using a standalone billing account, the identity applying this stage for the first time needs to be a billing account administrator:
export FAST_BILLING_ACCOUNT_ID=ABCD-01234-ABCD
gcloud beta billing accounts add-iam-policy-binding $FAST_BILLING_ACCOUNT_ID \
--member user:$FAST_BU --role roles/billing.admin
Before the first run, the following IAM groups must exist to allow IAM bindings to be created (actual names are flexible, see the Customization section):
- `gcp-billing-admins`
- `gcp-devops`
- `gcp-network-admins`
- `gcp-organization-admins`
- `gcp-security-admins`
You can refer to this animated image for a step by step on group creation.
Please note that FAST also supports an additional group for users with permissions to create support tickets and view logging and monitoring data. To remain consistent with the Google Cloud Enterprise Checklist we map these permissions to the `gcp-devops` group by default. However, we recommend creating a dedicated `gcp-support` group and updating the `groups` variable with the right value.
Then make sure you have configured the correct values for the following variables by providing a `terraform.tfvars` file:
- `billing_account`: an object containing `id` as the id of your billing account, derived from the Cloud Console UI or by running `gcloud beta billing accounts list`, and `organization_id` as the id of the organization owning it, or `null` to use the billing account in isolation
- `groups`: the name mappings for your groups; if you're following the default convention you can leave this to the provided default
- `organization.id`, `organization.domain`, `organization.customer_id`: the id, domain and customer id of your organization, derived from the Cloud Console UI or by running `gcloud organizations list`
- `prefix`: the fixed prefix used in your naming, maximum 9 characters long
You can also adapt the example that follows to your needs:
# use `gcloud beta billing accounts list`
# if you have too many accounts, check the Cloud Console :)
billing_account = {
id = "012345-67890A-BCDEF0"
organization_id = 1234567890
}
# use `gcloud organizations list`
organization = {
domain = "example.org"
id = 1234567890
customer_id = "C000001"
}
outputs_location = "~/fast-config"
# use something unique and no longer than 9 characters
prefix = "abcd"
Each foundational FAST stage generates provider configurations and variable files that can be consumed by the following stages, and saves them in a dedicated GCS bucket in the automation project. These files are a handy way to simplify stage configuration, and are also used by our CI/CD workflows to configure the repository files in the pipelines that validate and apply the code.
Alongside the files stored in GCS, you can also configure a second copy to be saved on the local filesystem, as a convenience when developing or bringing up the infrastructure before a proper CI/CD setup is in place.
This second set of files is disabled by default. You can enable it by setting the `outputs_location` variable to a valid path on a local filesystem, e.g.
outputs_location = "~/fast-config"
Once the variable is set, `apply` will generate and manage providers and variables files, including the initial one used for this stage after the first run. You can then link these files in the relevant stages, instead of manually transferring outputs from one stage to Terraform variables in another.
Below is the outline of the output files generated by all stages, which is identical for both the GCS and local filesystem copies:
[path specified in outputs_location]
├── providers
│   ├── 00-bootstrap-providers.tf
│   ├── 01-resman-providers.tf
│   ├── 02-networking-providers.tf
│   ├── 02-security-providers.tf
│   ├── 03-project-factory-dev-providers.tf
│   ├── 03-project-factory-prod-providers.tf
│   └── 99-sandbox-providers.tf
├── tfvars
│   ├── 00-bootstrap.auto.tfvars.json
│   ├── 01-resman.auto.tfvars.json
│   ├── 02-networking.auto.tfvars.json
│   └── 02-security.auto.tfvars.json
└── workflows
    └── [optional depending on the configured CI/CD repositories]
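Given the layout above, linking the files for a later stage could look like the following sketch. The `link_stage_files` helper is hypothetical, and which tfvars files a given stage needs is an assumption here; check each stage's own documentation for the actual list.

```shell
# Link the providers file for a stage, plus any tfvars files it consumes,
# from the local outputs folder into the current directory.
# Hypothetical helper, not part of FAST.
# $1: outputs folder, $2: stage name, remaining args: tfvars file stems
link_stage_files() {
  local outputs_dir="$1"
  local stage="$2"
  shift 2
  local src="${outputs_dir}/providers/${stage}-providers.tf"
  [ -f "$src" ] || { echo "missing ${src}" >&2; return 1; }
  ln -sf "$src" ./
  local stem
  for stem in "$@"; do
    src="${outputs_dir}/tfvars/${stem}.auto.tfvars.json"
    [ -f "$src" ] || { echo "missing ${src}" >&2; return 1; }
    ln -sf "$src" ./
  done
}

# Example (assumes outputs_location was set to ~/fast-config, and that the
# resman stage consumes the bootstrap tfvars):
# link_stage_files ~/fast-config 01-resman 00-bootstrap
```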
Before running `init` and `apply`, check your environment so no extra variables that might influence authentication are present (e.g. `GOOGLE_IMPERSONATE_SERVICE_ACCOUNT`). In general you should use user application credentials, and FAST will then take care to provision automation identities and configure impersonation for you.
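A quick way to run this check is sketched below. Only `GOOGLE_IMPERSONATE_SERVICE_ACCOUNT` comes from the text above; the other variable names are an assumption based on the environment variables read by the Terraform Google provider, and the list is not meant to be exhaustive.

```shell
# Report environment variables that can influence provider authentication.
# Returns non-zero if at least one of them is set.
check_auth_env() {
  local found=0
  local name
  local value
  for name in GOOGLE_IMPERSONATE_SERVICE_ACCOUNT GOOGLE_APPLICATION_CREDENTIALS \
              GOOGLE_CREDENTIALS GOOGLE_OAUTH_ACCESS_TOKEN; do
    # indirect lookup of the variable named in $name
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "set: ${name}"
      found=1
    fi
  done
  return "$found"
}

check_auth_env || echo "unset the variables listed above before running terraform"
```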
When running the first `apply` as a user, you need to pass a special runtime variable so that the user roles are preserved when setting IAM bindings.
terraform init
terraform apply \
-var bootstrap_user=$(gcloud config list --format 'value(core.account)')
If you see an error stating that the project name already exists, make sure the project name is unique and that a project with the same name was not deleted recently.
Once the initial `apply` completes successfully, configure a remote backend using the new GCS bucket, and impersonation on the automation service account for this stage. To do this you can use the generated `providers.tf` file if you have configured output files as described above, or extract its contents from Terraform's output, then migrate state with `terraform init`:
# if using output files via the outputs_location and set to `~/fast-config`
ln -s ~/fast-config/providers/00-bootstrap* ./
# or from outputs if not using output files
terraform output -json providers | jq -r '.["00-bootstrap"]' \
> providers.tf
# migrate state to GCS bucket configured in providers file
terraform init -migrate-state
# run terraform apply to remove the bootstrap_user iam binding
terraform apply
Make sure the user you're logged in with is a member of the `gcp-organization-admins` group or impersonation will not be possible.
Most variables (e.g. `billing_account` and `organization`) are only used to input actual values and should be self-explanatory. The only meaningful customizations that apply here are groups and IAM roles.
As we mentioned above, groups reflect the convention used in the GCP Enterprise Setup Checklist, with an added level of indirection: the `groups` variable maps logical names to actual names, so that you don't need to delve into the code if your group names do not comply with the checklist convention.
For example, if your network admins group address is `net-rockstars` at your organization's domain, simply set that name in the variable without the domain part, which is interpolated internally from the organization domain:
variable "groups" {
  description = "Group names to grant organization-level permissions."
  type        = map(string)
  default = {
    gcp-network-admins = "net-rockstars"
    # [...]
  }
}
If your groups layout differs substantially from the checklist, define all relevant groups in the `groups` variable, then rearrange IAM roles in the code to match your setup.
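To make the indirection concrete, the sketch below shows how a group name and the organization domain combine into an IAM principal. The helper is hypothetical and only mimics the interpolation; `example.org` is the sample domain used in the tfvars example earlier.

```shell
# Compose the IAM member string for a group from its short name
# and the organization domain. Hypothetical helper, not part of FAST.
group_member() {
  local name="$1"
  local domain="$2"
  printf 'group:%s@%s\n' "$name" "$domain"
}

# the logical gcp-network-admins group, mapped to "net-rockstars"
group_member net-rockstars example.org
```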
One other area where we directly support customizations is IAM. The code here, as in all stages, follows a simple pattern derived from best practices:
- operational roles for humans are assigned to groups
- any other principal is a service account
In code, the distinction above is reflected in how IAM bindings are specified in the underlying module variables:
- group roles "for humans" always use `iam_groups` variables
- service account roles always use `iam` variables
This makes it easy to tweak user roles by adding mappings to the `iam_groups` variables of the relevant resources, without having to understand and deal with the details of service account roles.
In those cases where roles need to be assigned to end-user service accounts (e.g. an application or pipeline service account), we offer a stage-level `iam` variable that allows pinpointing individual role/members pairs, without having to touch the code internals, to avoid the risk of breaking a critical role for a robot account. The variable can also be used to assign roles to specific users or to groups external to the organization, e.g. to support external suppliers.
The one exception to this convention is for roles which are part of the delegated grant condition described above, and which can then be assigned from other stages. In this case, use the `iam_additive` variable, as they are implemented with non-authoritative resources. Using non-authoritative bindings ensures that re-executing this stage will not override any bindings set in downstream stages.
A full reference of IAM roles managed by this stage is available here.
You can customize organization-level logs through the `log_sinks` variable in two ways:
- creating additional log sinks to capture more logs
- changing the destination of captured logs
By default, all logs are exported to BigQuery, but FAST can create sinks to Cloud Logging Buckets, GCS, or Pub/Sub.
If you need to capture additional logs, please refer to GCP's documentation on scenarios for exporting logging data, where you can find ready-made filter expressions for different use cases.
Configuring the individual tokens for the naming convention described above has varying degrees of complexity:
- the static prefix can be set via the `prefix` variable once
- the environment identifier is set to `prod` as resources here influence production and are considered as such, and can be changed in `main.tf` locals
All other tokens are set directly in resource names, as providing abstractions to manage them would have added too much complexity to the code, making it less readable and more fragile.
If a different convention is needed, identify names via search/grep (e.g. with `^\s+name\s+=\s+"`) and change them in an editor: it should take a couple of minutes at most, as there's just a handful of modules and resources to change.
Names used in internal references (e.g. `module.foo-prod.id`) are only used by Terraform and do not influence resource naming, so they are best left untouched to avoid having to debug complex errors.
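The search mentioned above can be run from the command line as in this sketch. It uses the POSIX `[[:space:]]` class in place of `\s` for portability across grep implementations; the function name is made up for the example.

```shell
# List "name = ..." assignments in Terraform files under a folder,
# with file names and line numbers. Hypothetical helper, not part of FAST.
find_resource_names() {
  local dir="${1:-.}"
  grep -rnE --include='*.tf' \
    '^[[:space:]]+name[[:space:]]+=[[:space:]]+"' "$dir"
}

# Example: run from the stage folder
# find_resource_names .
```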
At any time during this stage's lifecycle you can configure a Workload Identity Federation pool, and one or more providers. These are part of this stage's interface, included in the automatically generated `.tfvars` files and accepted by the Resource Management stage that follows.
The `federated_identity_providers` variable maps each provider's `issuer` attribute to the definitions in the `identity-providers.tf` file. We currently support GitHub and Gitlab directly, and extending the definitions to support more providers is trivial (send us a PR if you do!).
Provider key names are used by the `cicd_repositories` variable to configure authentication for CI/CD repositories, and generally from your Terraform code whenever you need to configure IAM access or impersonation for federated identities.
This is a sample configuration of a GitHub and a Gitlab provider. The `attribute_condition` attribute can use any of the mapped attributes for the provider (refer to the `identity-providers.tf` file for the full list), or be set to `null` if not needed:
federated_identity_providers = {
  github-sample = {
    attribute_condition = "attribute.repository_owner==\"my-github-org\""
    issuer              = "github"
    custom_settings     = null
  }
  gitlab-sample = {
    attribute_condition = "attribute.namespace_path==\"my-gitlab-org\""
    issuer              = "gitlab"
    custom_settings     = null
  }
  gitlab-ce-sample = {
    attribute_condition = "attribute.namespace_path==\"my-gitlab-org\""
    issuer              = "gitlab"
    custom_settings = {
      issuer_uri        = "https://gitlab.fast.example.com"
      allowed_audiences = ["https://gitlab.fast.example.com"]
    }
  }
}
FAST is designed to directly support running in automated workflows from separate repositories for each stage. The `cicd_repositories` variable allows you to configure impersonation from external repositories leveraging Workload Identity Federation, and pre-configures a FAST workflow file that can be used to validate and apply the code in each repository.
The repository design we support is fairly simple, with a repository for modules that enables centralization and versioning, and one repository for each stage optionally configured from the previous stage.
This is an example of configuring the bootstrap and resource management repositories in this stage. CI/CD configuration is optional, so the entire variable or any of its attributes can be set to null if not needed.
cicd_repositories = {
  bootstrap = {
    branch            = null
    identity_provider = "github-sample"
    name              = "my-gh-org/fast-bootstrap"
    type              = "github"
  }
  cicd = {
    branch            = null
    identity_provider = "github-sample"
    name              = "my-gh-org/fast-cicd"
    type              = "github"
  }
  resman = {
    branch            = "main"
    identity_provider = "github-sample"
    name              = "my-gh-org/fast-resman"
    type              = "github"
  }
}
The `type` attribute can be set to one of the supported repository types: `github`, `gitlab`, or `sourcerepo`.
Once the stage is applied the generated output files will contain pre-configured workflow files for each repository, that will use Workload Identity Federation via a dedicated service account for each repository to impersonate the automation service account for the stage.
You can use Terraform to automate creation of the repositories using the extra stage defined in `fast/extras/0-cicd-github` (only for GitHub for now).
The remaining configuration is manual, as it regards the repositories themselves:
- create a repository for modules
  - clone and populate it with the Fabric modules
  - configure authentication to the modules repository
    - for GitHub
      - create a key pair
      - create a deploy key in the modules repository with the public key
      - create a `CICD_MODULES_KEY` secret with the private key in each of the repositories that need to access modules (for Gitlab, please Base64 encode the private key for masking)
    - for Gitlab
      - TODO
    - for Source Repositories
      - assign the reader role to the CI/CD service accounts
- create one repository for each stage
  - clone and populate them with the stage source
  - edit the modules source to match your modules repository
    - a simple way is using the "Replace in files" function of your editor
      - search for `source\s*= "../../../modules/([^"]+)"`
      - replace with `source = "git@github.com:my-org/fast-modules.git//$1?ref=v1.0"`
  - copy the generated workflow file for the stage from the GCS output files bucket or from the local clone if enabled
    - for GitHub, place it in a `.github/workflows` folder in the repository root
    - for Gitlab, rename it to `.gitlab-ci.yml` and place it in the repository root
    - for Source Repositories, place it in `.cloudbuild/workflow.yaml`
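If you prefer the command line to an editor for the module source replacement step, the same search and replace can be sketched with `sed`. The function name, the repository URL and the ref used in the example are placeholders chosen for illustration; substitute your own modules repository.

```shell
# Rewrite relative module sources read from stdin so that they point
# to a remote modules repository. Hypothetical helper, not part of FAST.
# $1: modules repository URL (must not contain "|" or "&"), $2: git ref
rewrite_module_sources() {
  local repo="$1"
  local ref="$2"
  sed -E "s|source[[:space:]]*= \"\.\./\.\./\.\./modules/([^\"]+)\"|source = \"${repo}//\1?ref=${ref}\"|"
}

# Example: preview the change for a single file
# rewrite_module_sources "git@example.com:my-org/fast-modules.git" v1.0 < main.tf
```

Previewing the output before editing files in place makes it easier to spot sources that use a different relative path and were therefore not matched.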
name | description | modules | resources |
---|---|---|---|
automation.tf | Automation project and resources. | gcs · iam-service-account · project | |
billing.tf | Billing export project and dataset. | bigquery-dataset · project | google_billing_account_iam_member |
cicd.tf | Workload Identity Federation configurations for CI/CD. | iam-service-account · source-repository | |
identity-providers.tf | Workload Identity Federation provider definitions. | | google_iam_workload_identity_pool · google_iam_workload_identity_pool_provider |
log-export.tf | Audit log project and sink. | bigquery-dataset · gcs · logging-bucket · project · pubsub | |
main.tf | Module-level locals and resources. | | |
organization.tf | Organization-level IAM. | organization | google_organization_iam_binding |
outputs-files.tf | Output files persistence to local filesystem. | | local_file |
outputs-gcs.tf | Output files persistence to automation GCS bucket. | | google_storage_bucket_object |
outputs.tf | Module outputs. | | |
variables.tf | Module variables. | | |

name | description | type | required | default | producer |
---|---|---|---|---|---|
billing_account | Billing account id. If billing account is not part of the same org set is_org_level to false. | object({…}) | ✓ | | |
organization | Organization details. | object({…}) | ✓ | | |
prefix | Prefix used for resources that need unique names. Use 9 characters or less. | string | ✓ | | |
bootstrap_user | Email of the nominal user running this stage for the first time. | string | | null | |
cicd_repositories | CI/CD repository configuration. Identity providers reference keys in the federated_identity_providers variable. Set to null to disable, or set individual repositories to null if not needed. | object({…}) | | null | |
custom_role_names | Names of custom roles defined at the org level. | object({…}) | | {…} | |
fast_features | Selective control for top-level FAST features. | object({…}) | | {} | |
federated_identity_providers | Workload Identity Federation pools. The cicd_repositories variable references keys here. | map(object({…})) | | {} | |
groups | Group names to grant organization-level permissions. | map(string) | | {…} | |
iam | Organization-level custom IAM settings in role => [principal] format. | map(list(string)) | | {} | |
iam_additive | Organization-level custom IAM settings in role => [principal] format for non-authoritative bindings. | map(list(string)) | | {} | |
locations | Optional locations for GCS, BigQuery, and logging buckets created here. | object({…}) | | {…} | |
log_sinks | Org-level log sinks, in name => {type, filter} format. | map(object({…})) | | {…} | |
outputs_location | Enable writing provider, tfvars and CI/CD workflow files to local filesystem. Leave null to disable. | string | | null | |
project_parent_ids | Optional parents for projects created here in folders/nnnnnnn format. Null values will use the organization as parent. | object({…}) | | {…} | |

name | description | sensitive | consumers |
---|---|---|---|
automation | Automation resources. | | |
billing_dataset | BigQuery dataset prepared for billing export. | | |
cicd_repositories | CI/CD repository configurations. | | |
custom_roles | Organization-level custom roles. | | |
federated_identity | Workload Identity Federation pool and providers. | | |
outputs_bucket | GCS bucket where generated output files are stored. | | |
project_ids | Projects created by this stage. | | |
providers | Terraform provider files for this stage and dependent stages. | ✓ | stage-01 |
service_accounts | Automation service accounts created by this stage. | | |
tfvars | Terraform variable files for the following stages. | ✓ | |