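This chapter opens by pointing out that distributed systems are not an ideal world and that unexpected faults happen all the time. The text spends a lot of time on the network side, but I have often run into node failures that were not caused by a single network problem. Quite often A (say, GKE) -> B (Pod/container) is fine, but on a few instances B -> C (an internal service) fails intermittently or dies outright. For now I use the following somewhat magical trick to actively detect these unexpected failures: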
    livenessProbe:
      exec:
        command:
          - /bin/sh
          - -c
          - "cat `find ./health.json -mmin -1440 | awk -v def=default-cannot-cat-file '{print} END { if (NR==0) {print def} }'`"
      initialDelaySeconds: 60
      periodSeconds: 60
      failureThreshold: 5
Does anyone have other ways to detect this? Or how do you usually tell whether a system is really alive, or is a living zombie (?)
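For readers puzzling over the probe above: `find ./health.json -mmin -1440` prints the file name only if the file was modified within the last 1440 minutes (24 hours); otherwise `awk` substitutes a name that does not exist, so `cat` exits non-zero and the probe fails. The thread does not say how `health.json` gets refreshed, so the following is only a sketch of one possible application side, assuming the service rewrites the file after each successful B -> C call. `recordHeartbeat` and `isFresh` are hypothetical names, not part of the original setup.

```javascript
const fs = require('fs');

// Hypothetical helper: call this after every successful B -> C request so the
// file's modification time keeps moving forward.
function recordHeartbeat(filePath, now = new Date()) {
  fs.writeFileSync(filePath, JSON.stringify({ lastSuccess: now.toISOString() }));
}

// Roughly mirrors `find <file> -mmin -N`: true only if the file exists and was
// modified less than maxAgeMinutes ago. A missing file counts as unhealthy.
function isFresh(filePath, maxAgeMinutes, nowMs = Date.now()) {
  try {
    const ageMs = nowMs - fs.statSync(filePath).mtimeMs;
    return ageMs < maxAgeMinutes * 60 * 1000;
  } catch (err) {
    return false;
  }
}
```

With something like this in place, a zombie that still answers A -> B but has stopped completing B -> C calls would stop touching the file, and the probe would eventually fail once the file ages past the window.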
We do much the same thing and also rely on the k8s configuration:
    livenessProbe:
      httpGet:
        path: /.healthcheck
        port: http
      initialDelaySeconds: 10
      periodSeconds: 2
      failureThreshold: 10

    // server...
    server.get(`/.healthcheck`, (_req, res) => {
      res.send('OK');
    });
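As I recall, it looks at the returned status code: 200 <= status < 400 counts as success. If the code falls outside that range (for `failureThreshold` consecutive probes), the container is killed and a new one is started.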
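An endpoint that always answers `OK` only proves that the process can still serve HTTP, so it would not catch the B -> C failures described in the question. One possible variation, sketched here with Node's built-in `http` module rather than the framework used above, is to have the endpoint return 503 when the last successful call to C is too old. `healthStatus`, `createHealthServer`, `MAX_SILENCE_MS`, and the 60-second threshold are all assumptions made for illustration.

```javascript
const http = require('http');

// Hypothetical threshold: how long B may go without a successful call to C
// before it reports itself unhealthy.
const MAX_SILENCE_MS = 60 * 1000;

// Pure decision function: 200 while the last success is recent enough,
// 503 (outside the probe's 200..399 success range) otherwise.
function healthStatus(lastSuccessMs, nowMs, maxSilenceMs = MAX_SILENCE_MS) {
  return nowMs - lastSuccessMs <= maxSilenceMs ? 200 : 503;
}

// getLastSuccessMs is supplied by the application and returns the timestamp
// (in milliseconds) of the most recent successful B -> C call.
function createHealthServer(getLastSuccessMs) {
  return http.createServer((req, res) => {
    if (req.url !== '/.healthcheck') {
      res.writeHead(404);
      res.end();
      return;
    }
    const status = healthStatus(getLastSuccessMs(), Date.now());
    res.writeHead(status, { 'Content-Type': 'text/plain' });
    res.end(status === 200 ? 'OK' : 'dependency check stale');
  });
}

// Usage (the port number is an assumption):
// createHealthServer(() => lastSuccessMs).listen(8080);
```

Whether a restart actually helps when C itself is down is a judgment call; wiring the same endpoint to a `readinessProbe` would take the pod out of rotation instead of killing it.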