- Glossary
- Useful links
- Issue description
- Out of scope
- Solution
- Security
- Test Plan
- Logs
- Documentation
- Automation Design
- Open questions
- Implementation plan
Term | Description |
---|---|
GCP | Google Cloud Platform |
Google compute engine (GCP) | Virtual machine in google (similar to AWS EC2) |
Environment-provided service account | An identity for accessing private data on behalf of a service account inside Google Cloud environments |
Service account key | App can self sign a google token with the service account key in order to accessing private data on behalf of a service account outside Google Cloud environments |
- Feature doc
- GCP authentication overview
- Google metadata server
- Type of service accounts
- Creating short-lived service account credentials
Enable GCP hosts to authenticate to Conjur using their Google identity as credentials in order to issue a Conjur token that will allow them to access secrets in Conjur.
-
The only authentication strategy we are supporting is Google Service Account in Google Compute Engine. Other strategies are not supported: Service account keys, OAuth 2.0 client, API keys. Other google components are not supported: App Engine, GKE, Cloud Run and Cloud Functions.
-
Google-managed service accounts are out of scoped, you can find more info about different service accounts, here
A new GCP authenticator will be added to Conjur, GCP hosts will authenticate with this authenticator by presenting their host Conjur identity and Google identity in a signed JWT token.
We will support only the following service accounts types:
- Default service accounts - automatically granted by google
- User-managed service accounts - created by the user
As mentioned in the feature doc (Confluence - Conjur GCP Authenticator), before any authentication request is sent to Conjur, the admin will load the authenticator policy:
The YAML snippet below depicts the Google authenticator policy in Conjur:
Note: The service-id
is not allowed in this authenticator since its redundant, we will have only one authn-gcp
in Conjur.
# policy id needs to match the convention `conjur/authn-gcp`
- !policy
id: conjur/authn-gcp
body:
- !webservice
- !group apps
- !permit
role: !group apps
privilege: [ read, authenticate ]
resource: !webservice
All of the following configurations will be hardcoded in Conjur: Note: Those configuration are not exposed to the customer, in order to have a better UX, the ideal way is to expose them and to have a default values, but currently we do not have the infrastructure where and how to define system internal configuration or default values.
- provider-uri:
https://www.googleapis.com/oauth2/v1/certs
#represnts the address to fetch the public key in order to validate the JWT token. - mandatory-claims:
iss, exp, iat
#the list of the mandatory claims to validate in the JWT token. - iss:
https://accounts.google.com
#the issuer claim we should validate in the JWT token
To authenticate with Conjur, GCP-specific fields will need to be provided in the host annotations of the Conjur host identity with the following options:
- project-id - A customizable unique identifier of the Google Cloud Platform project.
- instance-name - The name of the GCP machine.
- service-account-id - The GCE instance identity (Service Account).
- service-account-email - The email of the instance identity (Service Account).
At least one of the above should be provided, if more than 1 is provided, we will authenticate them with AND
logic.
A Conjur host will be defined as follows with their annotations holding GCP-specific identification attributes.
The YAML snippet below depicts Conjur host policy that can be identified using GCP identifiers:
- !policy
id: <policy-id>
body:
- !group
- &hosts
- !host
id: myapp
annotations:
authn-gcp/instance-name: <instance-name>
authn-gcp/project-id: <project-id>
authn-gcp/service-account-id: <service-account-id>
authn-gcp/service-account-email: <service-account-email>
- !grant
role: !group
members: *hosts
- !grant
role: !group /conjur/authn-gcp/apps
member: !group <policy-id>
In order to authenticate with Conjur GCP authenticator, a GCE instance first needs to fetch its environment-provided identity.
Example: Run the following from the GCP machine:
curl \ --header Metadata-Flavor: Google" \ --get \ --data-urlencode "audience=conjur/<account-name>/<host-id>" \ --data-urlencode "format=full" \ "http://metadata/computeMetadata/v1/instance/service-accounts/default/identity
Note: the format=full
query string parameter is mandatory otherwise the output token will not include the compute
engine metadata claim and Conjur will not be able to identify the caller.
The audience value must to be in following format: conjur/<account-name>/<host-id>
For example: conjur/myorg/myapp
The JSON snippet below depicts a GCP identity JWT token issued by Google meta data service:
{
"aud": "conjur/myorg/myapp",
"azp": "110987294251917851298",
"email": "[email protected]",
"email_verified": true,
"exp": 1595160638,
"google": {
"compute_engine": {
"instance_creation_timestamp": 1595155766,
"instance_id": "4340508760561261530",
"instance_name": "vm-for-gcp",
"project_id": "eng-serenity-231813",
"project_number": 716149158341,
"zone": "us-central1-a"
}
},
"iat": 1595157038,
"iss": "https://accounts.google.com",
"sub": "110987294251917851298"
}
The following is a mapping of Host annotations with the values we will be extracting in the google JWT token:
Host annotation | GCP JWT token |
---|---|
authn-gcp/project_id |
google/compute_engine/project_id |
authn-gcp/instance_name |
google/compute_engine/instance_name |
authn-gcp/service_account_id |
sub claim |
authn-gcp/service_account_email |
email field |
Send the following POST request:
https://<DAP-server-hostname>/authn-gcp/<account>/authenticate
The URL will not include the following:
- The host id as it is written in the token’s audience claim
- There is no service-id in this authenticator
Header | Content-Type: application/x-www-form-urlencoded |
Body | The body must include the GCP jwt token for GCE instance. jwt=eyJhbGciOiJSUzI1NiIs......uTonCA |
When we got the requirement to implement GCP authenticator the first thought was: if we had a generic JWT authenticator we could support it, maybe not with the best UX but surly we could present a proof of concept.
So the goal of this deign is to present a GCP implementation plan with a small effort to support JWT authenticator as well. While the main differences between them is UX and configuration options: GCP authenticator - should focus on UX in Google Cloud Platform. JWT authenticator - should be generic as possible to support the JWT standard and any Vendor in the future.
The following table represents all the configuration options i want to have in JWT and its comparison to GCP
JWT | GCP | |
---|---|---|
api | authn-jwt/ | authn-gcp/ |
Mandatory base claims to validate | Iss, exp, nbf (Configurable) | iss, exp, iat (Constants) |
Optional base claims to validate (if exists in the token) | aud,sub,iat (Configurable) | - |
provider-uri | Configurable | Constant |
Public-certificate | Configurable (In case of self sign) | - |
iss value | Configurable (list?) | Constant |
aud value | Configurable (list?) | Constant |
sub value | Configurable (list?) | - |
Host annotation prefix | authn-jwt | authn-gcp |
Permitted annotation keys | any | instance-name OR project-id OR service-account-id OR service-account-email |
Mapping host annotation logic | - | Map each above key to his location in the google JWT token |
The following diagram represents JWT authenticator with a lot of small responsibility units, according to the above table and past experience, in order to have the flexibility to support future vendors that uses JWT token for their authentication process, like Google, GCP.
Note: in this feature we will support only GCP
GCP & JWT will have 3 dependencies which will change their behavior by dependency injection
-
Fetch Authenticator configuration
1.1 Fetch provider uri
- GCP implementation - return an hardcoded
- Authenticator policy implementation - fetch from provider uri value in the policy, this logic is relevant for: OIDC, Azure, JWT
1.2 Fetch claims
1.2.1 Fetch mandatory claims keys to validate - list of claims that must exist in token
- GCP implementation - return an hardcoded list (Iss, exp, iat)
- OIDC needs refactor - return an hardcoded list (Iss, exp, nbf)
- Azure needs refactor - return an hardcoded list (Iss, exp, nbf)
- JWT design phase - return an list which defined in a variable under the authenticator policy
1.2.2 Fetch mandatory claims values - values of claims to validate which can be reside in different location: hardcoded, policy, hostname, uri of the request
- GCP implementation - return an hardcoded values (Iss = "https://www.googleapis.com/oauth2/v1/certs", aud = "conjur"
- OIDC needs refactor - fetch iss value from the provider uri
- Azure needs refactor - fetch iss value from the provider uri
- JWT design phase - return the values of mandatory claims from variable in policy
1.2.3 Fetch optional claims to validate - list of claims that we should validate only if they appears in the token
- GCP - not relevant
- OIDC - not relevant
- Azure - not relevant
- JWT design phase - return the values of optional claims from variable in policy
1.2.4 Fetch optional claims values
- JWT design phase - return the values of optional claims from variable in policy
1.2.5 authn prefix parameter - the authenticator prefix which will later be use the fetch a the host annotations
-
Validate and decode - same code a today with small refactoring of sending also the claims values if necessary
- Fetch public key
- Validate signature
- Validate claims
- Save key in cache
-
Validate resource restrictions - Responsible to validate the resource restrictions which defined in conjur, the idea is to refactor Azure and K8S to work with this pattern
3.1 Extract Source Restrictions
- Host annotations implementation - Fetch host annotations values, this logic is relevant to GCP, JWT, Azure, K8s
3.2 Validate Source Restrictions - Each authenticator will have his special logic of which annotation are permitted (In perfect world this logic should be trigger in load policy stage)
- GCP implementation - validate GCP annotations logic (instance-name OR project-id OR service-account-id OR service-account-email)
- Azure needs refactor - same Azure logic
- K8s needs refactor - same K8S logic
- JWT design phase - any key is permitted
3.3 Validate destination Restrictions - In GCP,JWT,Azure we are validating against JWT token, in K8S against api
3.3.1 Key Adapter - Responsible of mapping source key to its corresponding destination key value
- GCP implementation - use a key adapter to find the corresponding key to validate
- Azure needs refactor - same Azure xms_mirid logic
- K8s needs refactor - same K8S logic
- JWT design phase - straightforward validation
None because this is a new feature.
GCP authenticator performance should conform with our other authenticators with an average call time of 1 second.
-
Conjur
-
Conjur Enterprise
-
Future support in Service account key needs to be done very carefully since there is a concern of impersonate other Conjur hosts, because the Service account key enable you to control the content of the JWT token. The reason it is not a concern in our use case, because we are validating only tokens which supplied by google metadata server, and signed by google private keys which are not expose to the apps in contrast to the Service account key.
-
Default token expiration is 1 hour - we will stay with the default
-
onetime - Ori suggested to consider a mechanism to allow to each token to be used only once, but in this version we will not implement this capability
Confluence: Conjur GCP authenticator - Test plan
Scenario | Log message | Comment | |
---|---|---|---|
1 | Authenticator is not enabled (in DB/ENV) | Authenticator '{0-authenticator-name}' is not enabled | |
2 | Webservice is not defined in a Conjur policy | Webservice '{0-webservice-name}' wasn't found | |
3 | Host is not permitted to authenticate with the webservice | '{0-role-name}' does not have 'authenticate' privilege on {1-service-name} | |
4 | Host is not defined in Conjur | '{0-role-name}' wasn't found | |
5 | Couldn't make connection with Google Identity Provider in time | Google Identity Provider failed with timeout error (Provider URI: '{0}'). Reason: '{1}' | |
6 | Failed to confirm Google token signature | Failed to confirm signature of '{0-token}' issued by (Provider URI: '{1}' | |
7 | Failed to validate Google token claims | Will be decided in Impl phase | |
8 | Authentication request body is missing a field (e.g jwt_token ) |
Field '{0-field-name}' is missing or empty in request body | |
9 | A required annotation is missing in the Conjur host | Will be decided in Impl phase | Annotation is missing for authentication |
10 | Resource Restrictions includes an illegal constraints | Will be decided in Impl phase | Host contains an un permitted annotation |
11 | Resource Restrictions defined in Conjur host doesn't match JWT token | Will be decided in Impl phase | The message should say to validate if the token issued with Format=full |
Scenario | Log message | Comment | |
---|---|---|---|
1 | Before each logic action | Will be decided in Impl phase | For example: fetch claims, validate token, validate resource ... |
2 | After each logic action | Will be decided in Impl phase | |
3 | Value of configuration | Will be decided in Impl phase | For example: for example provider uri, claims ... |
Same logic like all other authenticators.
- Document examples of how to fetch google identity in GCP
- Document the solution supported use cases, logs and supportability section
We would like to add automated tests for the GCP Authenticator. While Unit Tests are already automated, we will need to add some infrastructure for running integration tests.
- Run integration tests in OSS
- Run a vanilla test in appliance
- Run automation on a Conjur Enterprise image that is deployed on GCP (Manual at first stage, automation on POST GA)
GCP Authenticator tests will be added to the the general infrastructure of our integration test.
The integration will be supported both from ci/test
script and dev/start
the later will require creating GCP manually.
The integration test does not require creation of secrets.
The tests will cover issuing an identity token
in Google Compute Engine (GCP) and
presenting the issued token to Conjur's host/authenticate
end point and exchange it with a valid Conjur token.
- Non-sensitive
- GCP project ID
- GCE instance name
- GCP Service Account ID
- GCP Service Account email
- Sensitive
- GCP VM IP / dns name
- user name
- ssh key
Although the non-sensitive data can be hard-coded, it is better to store them as Conjur secrets as we already need summon for the sensitive data. This way if they need rotation it can be easily done in Conjur without the need to update the code-base.
To do so, we will need to add a "secrets.yml" with the variables and run the scripts ("ci/test" and in "dev/start")
with summon. However, we will not just start running these scripts with summon to run the current tests just they the
authn-cp tests can run as part of their run. If we do that, then users from the community will not be able to run
the authenticators tests (because they don't have access to conjurops
).
Thus it is better to run the integration tests separately from the other authentication tests. We will add a new step to the Jenkinsfile for running the tests.
stage('Run Tests') {
parallel {
...
stage('Authenticators') {
steps { sh 'ci/test cucumber_authenticators' }
}
stage('GCP Authenticator') {
steps { sh 'summon ci/test cucumber_gcp_authenticator' }
}
...
}
}
For the "dev/start" script we can verify that if the "--authn-gcp" flag was added then the required variables were retrieved, and raise an "GCP tests must run with summon" message.
Please note that we will need to add a new flag to the "ci/test" script - "cucumber_gcp_authenticator" as it will need summon.
We need to have an GCE instance for our tests to generate a GCP identity access token. For this, we will use a Jenkins plugin for GCP) that allows to deploy instances of GCP VMs which are provisioned quickly, and get destroyed by Jenkins when idle. This way the cost in minimal, we don't have concurrency issues and we don't need to maintain an IP in our secrets for the VM IP addresses.
The new Jenkinsfile will look as follows:
stage('Allocate GCP Authenticator Instance'){
steps {
script {
node('gcp-authn'){
env.GCP_AUTHN_INSTANCE_IP = sh(script: 'curl icanhazip.com', returnStdout: true)
env.KEEP_GCP_AUTHN_INSTANCE = "true"
while(env.KEEP_GCP_AUTHN_INSTANCE == "true"){
sleep(time: 1, unit: "SECONDS")
}
}
}
}
}
stage('Test GCP Authenticator'){
steps{
script {
while (!env.GCP_AUTHN_INSTANCE_IP?.trim()){
sleep(time: 1, unit: "SECONDS")
}
sh(
script: 'summon -f ci/authn-gcp/secrets.yml ci/test cucumber_authenticators_gcp',
returnStdout: true
)
}
}
post {
always {
script {
env.KEEP_GCP_AUTHN_INSTANCE = "false"
}
}
}
}
As you can see, we have a stage that allocates the GCP VM and sets its IP in the env. That IP is then consumed from the env in the GCP Authenticator tests. Once the tests are finished, we update KEEP_GCP_AUTHN_INSTANCE and the GCP VM allocation stage is finished.
To get a Conjur access token with an GCP identity token we need to issue a GCP identity token first.
Send an authn-gcp request to Conjur with the GCP identity token in the body. In production, both actions will be done inside a GCE instance. However, in our tests, it will be much easier to perform only the first action inside a GCE instance and perform the second action from the cucumber container. This way will only need to add a step for SSHing to a remote machine and send the command for retrieving an access token. We already have the infrastructure and the code for sending an authentication request to the server.
The test verifies that Conjur authenticates a valid Google identity token and get Conjur access token as a result. The only thing that we don't test here is the communication between the GCP machine to Conjur server but it's acceptable for 2 reasons:
- The test is done manually.
- Even if we do test communication in our automation, it doesn't guarantee that there will not be any errors in the customer's env.
The two reasons above brings us to the conclusion that running the authentication request from outside the GCP machine has the same quality of running it inside the machine.
- GCP VM for tests
- Secrets in
conjurops
- GCP project id
- GCE instance name
- Service account id
- Service account email
- Machine IP
- user id / ssh key
TODO: Inbal to decide which versions?
-
Conjur
-
Conjur Enterprise
- Authenticator name may change
authn-gcp
? TODO: Inbal to decide
Merging to master strategy
In this feature we are prefer to work with a feature branch instead of small PRs to master, Since we are planning to deliver it before the planned release version.
Pros: We will have more flexibility to any change of release date and its quality (GA or CA)
Cons: More complex merge back to the master
Parallelism: The feature can be paralleled to 3 team members all the way
already covered in research phase
None. Note: LLD designs need will be decided at implementation level
- JWT 3rd party validation - Use our JWT 3rd lib to authenticate a google JWT token, with the following claims: iss, aud, exp, nbf, iat Raise a concerns if any EE2
- Instigate in which scenarios the following google JWT token field is false
email_verified": true
? do we need a special treatment there ? E1
-
Implement classes according to design and write LLD where is needed
1.1. 6 classes - EE2 for implement each plus UT EE12
-
Implement status validation EE1
-
Enhance logs and supportability EE1
-
Release from side branch E1
4.1 Merge stable released Conjur to our side branch 4.2 Update Conjur and Conjur Enterprise versions 4.3 Release Conjur Enterprise as CA from side branch
-
Implement user extraction for aud claim indie the token E1
-
Automation infrastructure
1.1. OSS - Create GCE instance EE1
1.2. OSS - Issue an identity token EE1
1.3. Conjur Enterprise - Create GCE instance EE1
1.4. Conjur Enterprise - Issue an identity token EE1
-
OSS - Implement integration tests EE5
-
Conjur Enterprise - Implement integration tests for EE3
-
Manual tests according to docs (customer env) EE1
-
Performance tests EE3
-
Setting customer env EE2
6.1 1 Master and 2 standbys outside GKE according to our docs
6.2 2 Followers inside GKE with LB
6.3 1 GCP client machine
-
Support GCP in dev/start script EE1
- Review EE2
-
Document examples of how to fetch google identity in GCP EE2
-
Document the solution supported use cases, logs and supportability section EE2
2.1 Require a
"format=full"
parameter when issuing the google identity token, in order to have thecompute_engine
structure which contains the metadata on the GCP machine2.2 Recommend not to use google default service account in production
- Infrastructure to deploy Conjur Enterprise image on GCP
- LLD for K8S, Azure ang GCP to use same component of ValidateResourceRestrictions class EE3
- Implement LLD1 EE3
- Automated Left over testsEE??
- Validate again the HLD EE2
- Implement JWT authenticator according to design EE10
- Implement status validation EE1
- Enhance logs and supportability EE1
- Review EE2
- Only manual unless we will decide to release it as CA or GA EE2