Detect `user_data` failure #44

lawliet89 · 2018-04-25T04:42:49Z

We run pretty elaborate scripts in the user_data portions of the EC2 instances.

We need some way to detect if these scripts have failed.

Probabilities:

Idea:

Define the existence of a user_data completion marker file as a service health check in Consul
Forward logs via td-agent from User_data

TODOs:

Consul Server
Consul Agents (Report User Data Failure in Consul Agents only #131)
Forward user_data logs via td-agent

The text was updated successfully, but these errors were encountered:

Currently, Fluentd and Prometheus/whatever time series DB can potentially be run on Nomad. If the Nomad cluster is unhealthy or the jobs are not running properly, new nodes will never be able to bootstrap properly. We rearrange the order of launching so that these nodes can still bootstrap their respective services. On the downside, this means that we might not even know that the nodes are not sending their logs. Perhaps #44 will help?

* Basic Policies * Add some missing policies

lawliet89 added the enhancement New feature or request label Apr 25, 2018

lawliet89 mentioned this issue Aug 2, 2018

Reorder User Data #124

Merged

lawliet89 added P-High Priority High D-Medium Difficulty Medium labels Aug 2, 2018

lawliet89 mentioned this issue Aug 2, 2018

Report User Data Failure in Consul Agents only #131

Merged

qbiqing pushed a commit that referenced this issue Mar 10, 2020

Manage Terraform policies (#44)

458c349

* Basic Policies * Add some missing policies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Detect `user_data` failure #44

Detect `user_data` failure #44

lawliet89 commented Apr 25, 2018 •

edited

Loading

Detect user_data failure #44

Detect user_data failure #44

Comments

lawliet89 commented Apr 25, 2018 • edited Loading

Detect `user_data` failure #44

Detect `user_data` failure #44

lawliet89 commented Apr 25, 2018 •

edited

Loading