Implement ContainerPilot telemetry #16

Open
misterbisson opened this issue Apr 18, 2016 · 6 comments

Comments

@misterbisson
Contributor

ContainerPilot 2.0 introduced a telemetry feature that would be very useful for monitoring this application.

TritonDataCenter/containerpilot#27 proposed the following gauge:

The count of MySQL Query entries from SHOW PROCESSLIST that are in any Waiting state. 0 is great. 1 or above can be trouble. 10 or more is probably critical.

There are other MySQL-specific stats that would be very useful in scaling decisions. How would we write those sensors?
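As a starting point for the gauge quoted above, here is a minimal sketch of the counting logic only. It assumes the sensor is a small command that prints a single number for ContainerPilot to record, and that the rows have already been fetched from SHOW PROCESSLIST as dictionaries keyed by the usual column names (`Command`, `State`). The function names and the severity labels are hypothetical, not part of ContainerPilot or MySQL.

```python
from typing import Iterable, Mapping, Optional


def count_waiting_queries(rows: Iterable[Mapping[str, Optional[str]]]) -> int:
    """Count processlist rows with Command 'Query' whose State starts with 'Waiting'."""
    count = 0
    for row in rows:
        command = row.get("Command") or ""
        state = row.get("State") or ""  # State can be NULL/None for idle threads
        if command == "Query" and state.lower().startswith("waiting"):
            count += 1
    return count


def severity(waiting: int) -> str:
    """Map the gauge to the rough levels from the proposal (labels are hypothetical)."""
    if waiting >= 10:
        return "critical"
    if waiting >= 1:
        return "warning"
    return "ok"


# Hand-written sample rows standing in for a real SHOW PROCESSLIST result.
sample_rows = [
    {"Command": "Query", "State": "Waiting for table metadata lock"},
    {"Command": "Query", "State": "Sending data"},
    {"Command": "Sleep", "State": ""},
    {"Command": "Query", "State": "Waiting for table level lock"},
    {"Command": "Connect", "State": "Waiting for master to send event"},
    {"Command": "Query", "State": None},
]

waiting = count_waiting_queries(sample_rows)
print(waiting)  # the sensor's output: a single number
```

Keeping the counting separate from the database call means the same function can be tested without a running MySQL instance; a real sensor would still need to connect and run the query.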

@tgross
Contributor

tgross commented Apr 18, 2016

Looks like we can get replication lag for the replicas via pt-heartbeat.
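pt-heartbeat can report lag itself, so the following is only a sketch of the underlying arithmetic: the primary writes a timestamp on a schedule, and the replica compares the replicated timestamp with its own clock. The timestamp format and the function name are assumptions for illustration, and the sketch assumes both clocks are synchronized.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "2016-04-18T12:00:00.000450".
HEARTBEAT_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def replication_lag_seconds(heartbeat_ts: str, now: datetime) -> float:
    """Return seconds between the last replicated heartbeat and 'now'.

    Negative results (clock skew) are clamped to zero.
    """
    beat = datetime.strptime(heartbeat_ts, HEARTBEAT_FORMAT)
    return max(0.0, (now - beat).total_seconds())


lag = replication_lag_seconds(
    "2016-04-18T12:00:00.000000",
    datetime(2016, 4, 18, 12, 0, 2, 500000),
)
print(lag)  # 2.5
```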

@misterbisson
Contributor Author

misterbisson commented Sep 12, 2016

@Smithx10 asked how to autoscale MySQL in #54. With telemetry implemented per this ticket (though the sensors still need to be defined), scaling will require two more pieces:

  1. configured thresholds at which to scale up or down
  2. a scheduler/supervisor that can apply those scaling rules

It's incredibly minimalistic, but I've been experimenting for the past few months with running docker-compose scale <service>=<count> via a recurring task (Jenkins or cron both work fine). I have to name all the services and their counts in that line, but that's pretty much all there is to supervision. If an instance of a service fails, that will bring it back up to healthy. If you log the activity and set alarms on the logging....

What I haven't done yet is to make the <count> dynamic based on telemetry data and scaling thresholds, but that would seem to be the next step. Of course, I plan to set some min and max values, but....
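A rough sketch of what that next step could look like: pick a target count from one telemetry value and two thresholds, clamp it to min and max, and build the `docker-compose scale` line for the recurring task. The function names, the one-step-at-a-time policy, and the example thresholds are all assumptions for illustration; the sketch only builds the command and does not run it.

```python
from typing import Dict, List


def desired_count(current: int, metric: float, scale_up_at: float,
                  scale_down_at: float, min_count: int, max_count: int) -> int:
    """Step the instance count up or down by one, clamped to [min_count, max_count]."""
    if metric >= scale_up_at:
        target = current + 1
    elif metric <= scale_down_at:
        target = current - 1
    else:
        target = current
    return max(min_count, min(max_count, target))


def scale_command(counts: Dict[str, int]) -> List[str]:
    """Build the argument list naming every service and its count."""
    pairs = [f"{service}={count}" for service, count in sorted(counts.items())]
    return ["docker-compose", "scale"] + pairs


# Example: 12 waiting queries with a scale-up threshold of 10.
mysql_count = desired_count(current=3, metric=12, scale_up_at=10,
                            scale_down_at=1, min_count=1, max_count=5)
print(" ".join(scale_command({"mysql": mysql_count, "consul": 1})))
```

Stepping by one instance per run keeps the recurring task from overreacting to a single noisy reading, at the cost of reacting slowly to a real spike.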

@Smithx10

After watching a few PromCon presentations, would it make sense to use Prometheus exporters and a separate HTTP call?

@tgross
Contributor

tgross commented Sep 23, 2016

@neuroserve wrote in #58:

To enhance the setup, it might be a good idea to add Percona Monitoring and Management:
https://www.percona.com/doc/percona-monitoring-and-management/index.html

It basically consists of two Docker containers and the pmm-client package, which needs to be installed and activated on the MySQL servers. The pmm-server IP/name could be passed in via its CNS name (similar to the Consul name).

It delivers query analysis and a Grafana-based metrics monitor. The backend is Prometheus.

@tgross
Contributor

tgross commented Sep 23, 2016

@Smithx10 and @neuroserve we've provided the Prometheus endpoint in ContainerPilot so that we can use the same interface to capture metrics from arbitrary applications. What the end user does with those metrics afterwards (put Grafana in front of Prometheus or pipe them out via an exporter to a different storage engine) is left intentionally agnostic.
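To illustrate how little a consumer needs in order to read such an endpoint, here is a minimal sketch that parses the Prometheus text format into a dictionary. It assumes simple samples with no trailing timestamps, and the metric name `mysql_waiting_queries` is hypothetical. A real consumer would more likely be Prometheus itself or a client library.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse 'name value' sample lines, skipping blanks and '#' comment lines."""
    metrics: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE metadata and blank lines carry no sample
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


# Hand-written sample of what a scrape might return (metric name is hypothetical).
sample_scrape = """\
# HELP mysql_waiting_queries Query entries in a Waiting state
# TYPE mysql_waiting_queries gauge
mysql_waiting_queries 2
"""

scraped = parse_metrics(sample_scrape)
print(scraped["mysql_waiting_queries"])  # 2.0
```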

@misterbisson
Contributor Author

misterbisson commented Jun 5, 2017

With ContainerPilot 3's first-class support for multi-process containers, it probably makes more sense to implement the "official" MySQL exporter for Prometheus.

Related: a fancy dashboard for Grafana for that data.
