myktaylor edited this page Sep 20, 2018 · 5 revisions

Setup Guide

Set up a Google Cloud project

As the README recommends, host ts-bridge in a project that is separate from the rest of your infrastructure.

  1. Log in to GCP and create a new Google Cloud project
  2. Ensure the new project is linked to a billing account. Note that the Stackdriver free tier can accommodate up to about 220 metrics. If you have already consumed your free quota with other usage, the incremental cost per metric is between US$0.55 and US$2.32 per month, depending on which pricing tier you are already in.
  3. Enable Stackdriver Monitoring for the new project. When prompted:
    • Create a new Stackdriver account for the project
    • Monitor only the new project (it should be selected by default)
    • Skip AWS setup and agent installation
    • Choose whether to receive email status reports for the project
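
The billing note in step 2 comes down to simple arithmetic. The sketch below bounds the monthly cost using only the figures quoted above (about 220 free metrics, US$0.55 to US$2.32 per additional metric). The function name and the free_quota_remaining parameter are illustrative, and current pricing may differ from these figures.

```python
# Figures quoted in the billing note above; treat them as approximate.
FREE_TIER_METRICS = 220
COST_PER_METRIC_LOW_USD = 0.55
COST_PER_METRIC_HIGH_USD = 2.32


def monthly_cost_range(num_metrics, free_quota_remaining=FREE_TIER_METRICS):
    """Return a rough (low, high) monthly cost in US$ for num_metrics."""
    # Only metrics beyond the remaining free quota are billed.
    billable = max(0, num_metrics - free_quota_remaining)
    return (billable * COST_PER_METRIC_LOW_USD,
            billable * COST_PER_METRIC_HIGH_USD)


if __name__ == "__main__":
    low, high = monthly_cost_range(250)
    print(f"250 metrics: between US${low:.2f} and US${high:.2f} per month")
```

Pass free_quota_remaining=0 if other usage in the billing account has already consumed the free tier.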

Set up your dev environment

We recommend using Cloud Shell to run these commands. Shared dev environments can be set up using a git repository and open-in-cloud-shell links.

  1. If you are not using Cloud Shell:
    • Install Go
    • Download and install the Cloud SDK for Go
      • Initialize it with the following commands to set the linked project and the Application Default Credentials:
        • gcloud init
        • gcloud auth application-default login
    • Download and install the original App Engine SDK for Go (Expand the "Previous App Engine SDK" header to see it)
  2. Clone the ts-bridge source
    • go get github.com/google/ts-bridge/...
    • The ts-bridge source code should appear in ~/gopath/src/github.com/google/ts-bridge/ (more generally, under $GOPATH/src/github.com/google/ts-bridge/)
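
When working outside Cloud Shell, it may help to confirm that the tools this guide relies on are on your PATH before going further. The following is a minimal sketch. The tool list is taken from the commands used in this guide, and the function name is illustrative.

```python
import shutil

# Command names used in this guide; adjust if your installation differs.
REQUIRED_TOOLS = ["go", "gcloud", "dev_appserver.py", "goapp", "curl"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default is enough.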

End to end test (dev server)

  1. Ensure that you either have Owner permissions for the whole Cloud project, or at minimum the "Monitoring Editor" role

  2. If using a service account to authenticate, set the Application Default Credentials environment variable to your auth key file

    • export GOOGLE_APPLICATION_CREDENTIALS=/path/to/file.json
  3. Create a ts-bridge config with 0 metrics

    • cd ~/gopath/src/github.com/google/ts-bridge/app; cp metrics.yaml.example metrics.yaml
    • Edit the YAML file: remove the datadog_metrics sample content, and copy the name (project ID) of the project you just created into the stackdriver_destinations section.
    • Your metrics.yaml file should look like this:
    datadog_metrics:
    stackdriver_destinations:
      - name: stackdriver
        project_id: "your-project-name"
    
  4. Turn on the status page (uncomment #ENABLE_STATUS_PAGE: "yes" in app.yaml)

  5. Launch a dev server

    • dev_appserver.py app.yaml --port 18080
  6. Test via localhost/sync

    • curl http://localhost:18080/sync
  7. Verify that no error messages are shown. Troubleshooting guide:

    Error message: ERROR: StatsCollector: rpc error: code = PermissionDenied desc = The caller does not have permission
    Remedy: Ensure the authenticating user has at least the “Monitoring Editor” role
  8. Configure metrics by getting your API and application keys from Datadog and copying your SLI queries into metrics.yaml

    • Your metrics.yaml file should now look something like this:
    datadog_metrics:
      - name: your_first_sli_name
        query: "your sli query (copied from your datadog dashboard)"
        api_key: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
        application_key: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
        destination: stackdriver
      - name: your_second_sli_name
        query: "your sli query (copied from your datadog dashboard)"
        api_key: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
        application_key: bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
        destination: stackdriver
    stackdriver_destinations:
      - name: stackdriver
        project_id: "your-project-name"
    
    • See the README for more configuration documentation
  9. Test metric ingestion via localhost/sync

    • curl http://localhost:18080/sync
  10. Verify that metrics are visible on status page

    • In Cloud Shell, click the ‘web preview’ button and change the port to 18080
    • If running on a local workstation, browse to http://localhost:18080/
  11. Verify that metrics are visible in the Stackdriver UI

  12. Kill the local dev server
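
The /sync checks in steps 6, 7 and 9 can be scripted rather than inspected by eye. The sketch below assumes that failures show up in the response body as lines containing "ERROR", as in the troubleshooting entry above. If they only appear in the dev server's log, apply error_lines to the log text instead. The function names are illustrative. A non-2xx response makes urlopen raise urllib.error.HTTPError, which should also be treated as a failure.

```python
import urllib.request

SYNC_URL = "http://localhost:18080/sync"  # port from the dev_appserver.py step


def error_lines(body):
    """Return the lines of a /sync response that mention ERROR."""
    return [line for line in body.splitlines() if "ERROR" in line]


def check_sync(url=SYNC_URL, timeout=60):
    """Request /sync once and return any error lines found in the response."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return error_lines(body)


if __name__ == "__main__":
    problems = check_sync()
    for line in problems:
        print(line)
    # Exit non-zero so the check can gate later steps in a script.
    raise SystemExit(1 if problems else 0)
```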

Deploy in production

  1. Ensure that you either have Owner permissions for the whole Cloud project, or at minimum the "App Engine Admin" and "Cloud Scheduler Admin" roles
  2. Disable the status page (comment out ENABLE_STATUS_PAGE: "yes" in app.yaml)
    • You are free to leave the status page enabled, but be aware that it will be publicly accessible once you deploy unless you take steps to secure it with the Identity-Aware Proxy (IAP)
  3. Create the App Engine application
    • gcloud app create
    • Choose the App Engine region. If you are using ts-bridge to import metrics originating from a system running on GCP, you should run ts-bridge in a different Cloud region from the system itself to ensure independent failure domains.
  4. Deploy the app
    • goapp deploy -application <your_project_name> -version live
  5. Verify in the Stackdriver metrics explorer that metrics are being imported once a minute
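
If you copy the timestamps of imported points out of the metrics explorer, the once-a-minute check in step 5 can be automated. This is a minimal sketch: how the timestamps are obtained is left open, and the function name and the 30-second slack are assumptions rather than ts-bridge behavior.

```python
from datetime import datetime, timedelta


def import_gaps(timestamps, expected=timedelta(minutes=1),
                slack=timedelta(seconds=30)):
    """Return (previous, current) pairs spaced further apart than expected + slack."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:])
            if b - a > expected + slack]


if __name__ == "__main__":
    points = [
        datetime(2018, 9, 20, 12, 0),
        datetime(2018, 9, 20, 12, 1),
        datetime(2018, 9, 20, 12, 2),
        datetime(2018, 9, 20, 12, 5),  # three-minute gap before this point
    ]
    for previous, current in import_gaps(points):
        print(f"gap between {previous} and {current}")
```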

Maintenance automation

SLI definitions and SLO targets change over time. It is possible to keep Datadog and Stackdriver manually aligned, but experience shows that without automation, there will be mistakes.

  1. Autogenerate the ts-bridge config from Datadog SLI queries, or generate both the Datadog config and the ts-bridge config from the same source
  2. Automate ts-bridge app deployment
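
As one possible starting point for item 1, the sketch below renders a metrics.yaml in the layout shown earlier from a list of (name, query) pairs. The function name and the environment variable names are hypothetical. Unlike the earlier sample, every value is written as a double-quoted string so that all-digit keys or queries containing special characters are not misread by a YAML parser.

```python
import json
import os


def render_metrics_yaml(slis, api_key, application_key, project_id,
                        destination="stackdriver"):
    """Render metrics.yaml text from a list of (name, query) pairs."""
    # json.dumps produces a double-quoted string, which YAML also accepts.
    q = json.dumps
    lines = ["datadog_metrics:"]
    for name, query in slis:
        lines += [
            f"  - name: {q(name)}",
            f"    query: {q(query)}",
            f"    api_key: {q(api_key)}",
            f"    application_key: {q(application_key)}",
            f"    destination: {q(destination)}",
        ]
    lines += [
        "stackdriver_destinations:",
        f"  - name: {q(destination)}",
        f"    project_id: {q(project_id)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical variable names; keep real keys out of source control.
    text = render_metrics_yaml(
        [("your_first_sli_name", "your sli query")],
        api_key=os.environ.get("DATADOG_API_KEY", ""),
        application_key=os.environ.get("DATADOG_APP_KEY", ""),
        project_id="your-project-name",
    )
    print(text, end="")
```

Passing an empty list of SLIs yields the zero-metric config used in the dev server test.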