Merge pull request #1363 from alphagov/ianhowell-gds/opensearch-register-repository-tool

Add README.md and script to register S3 buckets as snapshot repositories for OpenSearch Clusters.
rtrinque authored Jul 4, 2024
2 parents 093c8cf + 6366fae commit 0f013a0
Showing 2 changed files with 104 additions and 0 deletions.
31 changes: 31 additions & 0 deletions terraform/deployments/opensearch/README.md
@@ -0,0 +1,31 @@
## Chat OpenSearch Snapshots - `register-snapshot-repository.py`
This document details how the S3 buckets created for the backup process should be registered in each environment. Detailed instructions on how to create index snapshots in Amazon OpenSearch Service can be found [here]. Full instructions on how to access the Amazon OpenSearch Dashboard can be found on this [page].

Registering the S3 buckets as snapshot repositories is a manual, one-off process that must be carried out in each environment (Integration, Staging and Production). First, log in to the OpenSearch Dashboard and map the AWS IAM Role of the user who will register the repositories. Then run the `register-snapshot-repository.py` script. The backup jobs run as cronjobs on the EKS cluster. The Production snapshot is created first; Staging imports it from the Production bucket, and Integration then imports from the Staging bucket.

### Commands to run to map the IAM Role in the OpenSearch Dashboard:

```
eval $(gds aws govuk-[integration|staging|production]-admin -e -art 8h)
OPENSEARCH_URL=$(aws opensearch describe-domain --domain-name chat-engine | jq -r '.DomainStatus.Endpoints.vpc')
kubectl relay host/$OPENSEARCH_URL 4443:443
```

Open https://localhost:4443/_dashboards in a browser and log in. Map your AWS Role using instructions in Step 1 of https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-snapshots.html#managedomains-snapshot-registerdirectory.
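
For reference, the Dashboard mapping step corresponds to a role-mapping request against the OpenSearch Security plugin REST API. The sketch below only builds the URL and body for such a request and sends nothing. It assumes the role being mapped is `manage_snapshots`, as in Step 1 of the AWS guide; the function name `role_mapping_request` and the example ARN are hypothetical and not part of this commit.

```python
import json

# Security role that Step 1 of the AWS guide maps to the caller's IAM role.
SNAPSHOT_ROLE = "manage_snapshots"


def role_mapping_request(host, iam_role_arn):
    """Return (url, body) for a role-mapping PUT; nothing is sent here."""
    url = host.rstrip("/") + "/_plugins/_security/api/rolesmapping/" + SNAPSHOT_ROLE
    # A PUT replaces the whole mapping, so any backend roles that are
    # already mapped would need to be listed here as well.
    body = {"backend_roles": [iam_role_arn], "hosts": [], "users": []}
    return url, body


if __name__ == "__main__":
    url, body = role_mapping_request(
        "https://localhost:4443/",
        "arn:aws:iam::123456789012:role/example-admin",  # hypothetical ARN
    )
    print("PUT", url)
    print(json.dumps(body, indent=2))
```

The Dashboard route described above remains the documented process; this only illustrates what that mapping amounts to.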

### Commands to run to register the S3 buckets (with the relay host from above still running):

```
virtualenv venv
source venv/bin/activate
pip install boto3 requests requests-aws4auth
python register-snapshot-repository.py [integration|staging|production]
```
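
After the script has run, it may be worth confirming that the repositories exist. A minimal sketch follows, assuming the response body of `GET _snapshot/_all` is a JSON object keyed by repository name. The helper `missing_repositories` is hypothetical and not part of this commit, and the sample data is hand-written rather than real output.

```python
def missing_repositories(listing, expected_names):
    """Return the expected names that are not registered as S3 repositories.

    `listing` is the parsed JSON body of a GET _snapshot/_all request.
    """
    return [
        name
        for name in expected_names
        if listing.get(name, {}).get("type") != "s3"
    ]


if __name__ == "__main__":
    # Hand-written sample shaped like a GET _snapshot/_all response.
    sample = {"govuk-staging": {"type": "s3", "settings": {}}}
    # Staging is expected to have its own bucket plus the Production one.
    print(missing_repositories(sample, ["govuk-staging", "govuk-production"]))
    # prints ['govuk-production']
```

With the relay still running, the listing could be fetched using the same `host` and `awsauth` values as the script, for example `requests.get(host + '_snapshot/_all', auth=awsauth).json()`.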

[here]: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-snapshots.html
[page]: https://docs.publishing.service.gov.uk/manual/manage-opensearch-on-aws.html
73 changes: 73 additions & 0 deletions terraform/deployments/opensearch/register-snapshot-repository.py
@@ -0,0 +1,73 @@
"""
https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-snapshots.html
This script has been copied from:
https://raw.githubusercontent.com/alphagov/govuk-aws/main/terraform/projects/app-elasticsearch6/register-snapshot-repository.py
and is used to register the required S3 buckets as snapshot repositories for the OpenSearch backup jobs,
which run as cronjobs on EKS in the Integration, Staging and Production environments.
Instructions for running this script:
$ eval $(gds aws govuk-[integration|staging|production]-admin -e -art 8h)
$ OPENSEARCH_URL=$(aws opensearch describe-domain --domain-name chat-engine | jq -r '.DomainStatus.Endpoints.vpc')
$ kubectl relay host/$OPENSEARCH_URL 4443:443
Open https://localhost:4443/_dashboards in a browser and log in
Map your AWS Role using instructions in Step 1 of https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-snapshots.html#managedomains-snapshot-registerdirectory
$ virtualenv venv
$ source venv/bin/activate
$ pip install boto3 requests requests-aws4auth
$ python register-snapshot-repository.py [integration|staging|production]
"""

import os
import sys
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = 'https://localhost:4443/'
region = 'eu-west-1'
service = 'es'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

def register_repository(name, role_arn, delete_first=False, read_only=False):
    print(name)

    url = host + '_snapshot/' + name
    print(url)

    if delete_first:
        # Sign the DELETE like the PUT below; an unsigned request would be
        # rejected by a domain that requires IAM authentication.
        r = requests.delete(url, auth=awsauth)
        r.raise_for_status()
        print(r.text)

    payload = {
        "type": "s3",
        "settings": {
            "bucket": name + '-chat-opensearch-snapshots',
            "region": region,
            "role_arn": role_arn,
            "readonly": read_only
        }
    }

    headers = {"Content-Type": "application/json"}

    r = requests.put(url, auth=awsauth, json=payload, headers=headers)
    r.raise_for_status()
    print(r.text)

# Set DELETE_FIRST in the environment to remove existing registrations first.
delete_first = 'DELETE_FIRST' in os.environ

# Fall back to an empty string so a missing argument reaches the usage message.
environment = sys.argv[1] if len(sys.argv) > 1 else ''

if environment == 'integration':
    role_arn = 'arn:aws:iam::210287912431:role/govuk-integration-chat-opensearch-snapshot-role'
    register_repository('govuk-integration', role_arn, delete_first=delete_first)
    # Integration restores from Staging, so that repository is read-only.
    register_repository('govuk-staging', role_arn, delete_first=delete_first, read_only=True)
elif environment == 'staging':
    role_arn = 'arn:aws:iam::696911096973:role/govuk-staging-chat-opensearch-snapshot-role'
    register_repository('govuk-staging', role_arn, delete_first=delete_first)
    # Staging restores from Production, so that repository is read-only.
    register_repository('govuk-production', role_arn, delete_first=delete_first, read_only=True)
elif environment == 'production':
    role_arn = 'arn:aws:iam::172025368201:role/govuk-production-chat-opensearch-snapshot-role'
    register_repository('govuk-production', role_arn, delete_first=delete_first)
else:
    # Exit non-zero so callers can detect the bad invocation.
    sys.exit('expected one of [integration|staging|production]')
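
The per-environment branches above could also be expressed as a lookup table, with `argparse` validating the argument. This is a sketch of one possible alternative, not what the commit does. The names `PLANS`, `parse_environment` and `build_payload` are hypothetical; the repository names, bucket suffix and region are copied from the script.

```python
import argparse

# (repository name, read_only) pairs per environment, mirroring the branches above.
PLANS = {
    "integration": [("govuk-integration", False), ("govuk-staging", True)],
    "staging": [("govuk-staging", False), ("govuk-production", True)],
    "production": [("govuk-production", False)],
}


def parse_environment(argv):
    """Return the environment named in argv; argparse exits with status 2 otherwise."""
    parser = argparse.ArgumentParser(description="Register snapshot repositories")
    parser.add_argument("environment", choices=sorted(PLANS))
    return parser.parse_args(argv).environment


def build_payload(name, role_arn, read_only, region="eu-west-1"):
    """Build the same repository settings body that the script sends."""
    return {
        "type": "s3",
        "settings": {
            "bucket": name + "-chat-opensearch-snapshots",
            "region": region,
            "role_arn": role_arn,
            "readonly": read_only,
        },
    }


if __name__ == "__main__":
    # A fixed argument list keeps the demonstration self-contained.
    env = parse_environment(["staging"])
    for name, read_only in PLANS[env]:
        print(name, "read-only" if read_only else "read-write")
```

Keeping the plan in data makes the import chain (Production to Staging to Integration) visible in one place, at the cost of moving the role ARNs into a second table or parameter.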
