submit onboarding task #393
base: main
Conversation
Hi @YQiu-oo, thank you for submitting the onboarding task. Could you double-check the …
One way to ensure the …
Hi, @TZ-zzz! I am working on the issue, but my computer now consistently crashes at the "deploy operator" stage (it previously worked with the same setup) and fails to proceed further. I don't know whether TidbDashboard matters to tidb-operator or to Acto: will it prevent Acto from proceeding further, or will Acto need this resource?
@YQiu-oo, yes, this issue is critical: since the operator was not working at all, all the tests Acto runs are essentially no-ops. You can check the operator-logs in the …
@TZ-zzz, OK, got it, but shouldn't the operator log be generated only after the operator is deployed? My situation is that I don't see any operator log because it is stuck at the deploying stage.
@YQiu-oo, you can find the operator logs of each test case in the …
@TZ-zzz tidb-operator behaves oddly on my machine, so I switched to MongoDB. Could you take a look at my commit? Here is the link for testrun.rar: https://drive.google.com/file/d/1ZKEs3y4an0kbzpbNdFdRGr62F6JDtl_h/view?usp=sharing
@YQiu-oo, I think the first alarm is due to the previous changes not having been reconciled. The operator seems to be stuck waiting for the pods to be reconciled, so the bug likely originates in an earlier phase, even though it manifests in the step that Acto reports. It might be worth investigating why the previous state hasn't converged; see the sketch below for what "converged" means here. Btw, for the second alarm, it's a …
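To make "hasn't converged" concrete, here is a minimal hypothetical sketch (illustrative only; this is neither Acto's nor the operator's actual code) of the standard Kubernetes generation/observedGeneration pattern:

```go
// Hypothetical sketch of why a stale earlier state surfaces in a later
// step: if step N's spec change never converges, every check in step N+1
// runs against a system state that is already wrong.
package main

import "fmt"

// crStatus mimics the common Kubernetes convention where
// status.observedGeneration trails metadata.generation while a spec
// change is still being reconciled.
type crStatus struct {
	Generation         int64 // metadata.generation: bumped on every spec change
	ObservedGeneration int64 // status.observedGeneration: last generation the controller reconciled
}

// hasConverged reports whether the controller has caught up with the
// latest spec change.
func hasConverged(s crStatus) bool {
	return s.ObservedGeneration == s.Generation
}

func main() {
	// Step N mutated the spec (generation 4) but the operator is stuck on
	// generation 3, so step N+1 starts from an unreconciled state.
	s := crStatus{Generation: 4, ObservedGeneration: 3}
	fmt.Println("previous state converged:", hasConverged(s))
}
```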
@YQiu-oo, could you investigate the first alarm again? The root cause is still not entirely clear.
@TZ-zzz I found that the Agent in the Pod repeatedly failed to reach its goal state, preventing the MongoDB ReplicaSet from becoming ready. I then checked the earlier operator log around the configuration: it could not find the required passwordSecretName, which is necessary for SCRAM authentication. I suspect this mismatched configuration is what keeps the Agent from reaching its goal state (i.e., the previous changes haven't been reconciled).
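As a rough illustration of that failure mode (hypothetical types and helper, not the operator's actual code): a user entry whose password secret reference is empty can never be turned into a valid automation config, so the agent never receives a new goal version to converge to.

```go
package main

import (
	"errors"
	"fmt"
)

// scramUser is a hypothetical stand-in for a user entry in the
// MongoDBCommunity spec; the real CRD nests this under spec.users.
type scramUser struct {
	Name               string
	PasswordSecretName string // required for SCRAM authentication
}

// validateScramUser mimics the configuration check that fails in the
// operator log quoted above.
func validateScramUser(u scramUser) error {
	if u.PasswordSecretName == "" {
		return errors.New("passwordSecretName is required for SCRAM authentication")
	}
	return nil
}

func main() {
	// The mutated CR in the failing test case effectively left this field
	// unset, so validation fails and reconciliation stalls.
	if err := validateScramUser(scramUser{Name: "admin"}); err != nil {
		fmt.Printf("user %q rejected: %v\n", "admin", err)
	}
}
```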
@YQiu-oo Nice observation! Are you able to pinpoint the root cause in the mongodb operator that makes it stop reconciling after the system gets into an error state?
@tylergu I double-checked the mongodb operator repo. If the agent's last achieved config version is not equal to targetConfigVersion, the operator reports an agent issue and outputs The Agent in the Pod '%s' hasn't reached the goal state yet (goal: %d, agent: %s) (https://github.com/mongodb/mongodb-kubernetes-operator/blob/c83d4d487e36c835f022092d516ce622321172b0/pkg/agent/agent_readiness.go#L110). A simplified sketch of that check is below.
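For readers following along, a simplified sketch of that goal-state check (identifiers are illustrative; only the quoted log message and the version comparison come from the linked file):

```go
package main

import "fmt"

// agentHealth stands in for the health status the readiness check reads:
// the last automation-config version the agent reports having achieved.
type agentHealth struct {
	LastVersionAchieved int64
}

// isInGoalState reports whether the agent has caught up with the target
// automation-config version; until it has, the pod stays not-ready and
// the operator keeps logging the message quoted above.
func isInGoalState(podName string, health agentHealth, targetConfigVersion int64) bool {
	if health.LastVersionAchieved != targetConfigVersion {
		fmt.Printf("The Agent in the Pod '%s' hasn't reached the goal state yet (goal: %d, agent: %d)\n",
			podName, targetConfigVersion, health.LastVersionAchieved)
		return false
	}
	return true
}

func main() {
	// With the SCRAM secret missing, no new automation config is ever
	// published, so the agent stays on an old version and this check
	// fails on every reconcile loop.
	isInGoalState("mongodb-replica-set-0", agentHealth{LastVersionAchieved: 2}, 3)
}
```

This matches the observation above: the alarm fires at the step where Acto notices the stuck pod, while the actual defect is the earlier, never-reconciled configuration.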
Hi all!
Here is my onboarding task result. You can find testrun.rar on Google Drive: https://drive.google.com/file/d/1GDbRX6s0zrnv_1KY-dZEIJqkZFh_Wl71/view?usp=drive_link
The alarm explanations are given in the summary.md file.
Thanks,
Yukang Qiu