Commit
https://issues.redhat.com/browse/ACM-13050 improving the topic for 2.13 (#7320)

* https://issues.redhat.com/browse/ACM-13050 improving the topic for 2.13

* making a change to fix linter
dockerymick authored Dec 9, 2024
1 parent e652b98 commit a99b59b
Showing 2 changed files with 45 additions and 8 deletions.
2 changes: 1 addition & 1 deletion governance/template_functions.adoc
@@ -579,5 +579,5 @@ The `hasNodesWithExactRoles` function returns the `true` value if the cluster co
* See xref:../governance/template_support_intro.adoc#template-processing[Template processing] for more details.
* See xref:../governance/adv_template_process.adoc#adv-template-processing[Advanced template processing in configuration policies] for use-cases.
* For label selector examples, see the link:https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/[Kubernetes labels and selectors] documentation.
* Refer to the link:https://golang.org/pkg/text/template/[Golang documentation - Package templates].
* Refer to the link:https://golang.org/pkg/text/template[Golang documentation - Package templates].
* See the link:https://masterminds.github.io/sprig/[Sprig Function Documentation] for more details.
51 changes: 44 additions & 7 deletions troubleshooting/acm_thanos_compactor.adoc
@@ -1,22 +1,59 @@
[#troubleshooting-thanos-compactor]
= Troubleshooting a block error for Thanos compactor
= Troubleshooting Thanos compactor halts

You might receive a block error message that indicates that the block for Thanos compactor is corrupted.
You might receive an error message that the compactor is halted. This can occur when there are corrupted blocks or when there is insufficient space on the Thanos compactor persistent volume claim (PVC).

[#symptom-thanos-compactor]
== Symptom: Block error for Thanos compactor
== Symptom: Thanos compactor halts

After you upgrade {acm} and check the logs for the Thanos compactor by using the `oc logs observability-thanos-compact-0` command, the logs display the following error message:
The Thanos compactor halts because there is no space left on your persistent volume claim (PVC). You receive the following message:

[source,terminal]
----
ts=2024-01-24T15:34:51.948653839Z caller=compact.go:491 level=error msg="critical error detected; halting" err="compaction: group 0@15699422364132557315: compact blocks [/var/thanos/compact/compact/0@15699422364132557315/01HKZGQGJCKQWF3XMA8EXAMPLE /var/thanos/compact/compact/0@15699422364132557315/01HKZQK7TD06J2XWGR5EXAMPLE /var/thanos/compact/compact/0@15699422364132557315/01HKZYEZ2DVDQXF1STVEXAMPLE /var/thanos/compact/compact/0@15699422364132557315/01HM05APAHXBQSNC0N5EXAMPLE]: populate block: chunk iter: cannot populate chunk 8 from block 01HKZYEZ2DVDQXF1STVEXAMPLE: segment index 0 out of range"
ts=2024-01-24T15:34:51.948653839Z caller=compact.go:491 level=error msg="critical error detected; halting" err="compaction: group 0@5827190780573537664: compact blocks [ /var/thanos/compact/compact/0@15699422364132557315/01HKZGQGJCKQWF3XMA8EXAMPLE]: 2 errors: populate block: add series: write series data: write /var/thanos/compact/compact/0@15699422364132557315/01HKZGQGJCKQWF3XMA8EXAMPLE.tmp-for-creation/index: no space left on device; write /var/thanos/compact/compact/0@15699422364132557315/01HKZGQGJCKQWF3XMA8EXAMPLE.tmp-for-creation/index: no space left on device"
----

[#resolving-thanos-compactor]
== Resolving the problem: Add the _thanos bucket verify_ command
== Resolving the problem: Thanos compactor halts

To resolve the problem, increase the storage space of the Thanos compactor PVC. Complete the following steps:


. Increase the storage space for the `data-observability-thanos-compact-0` PVC. See link:../observability/customize_observability.adoc#increase-decrease-pv-pvc[Increasing and decreasing persistent volumes and persistent volume claims] for more information.


. Restart the `observability-thanos-compact` pod by deleting the pod. The new pod is automatically created and started.

+
[source,bash]
----
oc delete pod observability-thanos-compact-0 -n open-cluster-management-observability
----

. After you restart the `observability-thanos-compact` pod, check the `acm_thanos_compact_todo_compactions` metric. As the Thanos compactor works through the backlog, the metric value decreases.

. Confirm that the metric changes in a consistent cycle and check the disk usage. Then you can reattempt to decrease the PVC again.

+
*Note:* This might take several weeks.

[#symptom-thanos-compactor-two]
== Symptom: Thanos compactor halts

The Thanos compactor halts because you have corrupted blocks. You might receive the following output where the `01HKZYEZ2DVDQXF1STVEXAMPLE` block is corrupted:

+
[source,terminal]
----
ts=2024-01-24T15:34:51.948653839Z caller=compact.go:491 level=error msg="critical error detected; halting" err="compaction: group 0@15699422364132557315: compact blocks [/var/thanos/compact/compact/0@15699422364132557315/01HKZGQGJCKQWF3XMA8EXAMPLE /var/thanos/compact/compact/0@15699422364132557315/01HKZQK7TD06J2XWGR5EXAMPLE /var/thanos/compact/compact/0@15699422364132557315/01HKZYEZ2DVDQXF1STVEXAMPLE /var/thanos/compact/compact/0@15699422364132557315/01HM05APAHXBQSNC0N5EXAMPLE]: populate block: chunk iter: cannot populate chunk 8 from block 01HKZYEZ2DVDQXF1STVEXAMPLE: segment index 0 out of range"
----

[#resolving-thanos-compactor-two]
== Resolving the problem: Thanos compactor halts

Add the `thanos bucket verify` command to the object storage configuration. Complete the following steps:


. Resolve the block error by adding the `thanos bucket verify` command to the object storage configuration. Set the configuration in the `observability-thanos-compact` pod by using the following commands:

+
@@ -35,7 +72,7 @@ thanos tools bucket verify -r --objstore.config="$OBJSTORE_CONFIG" --objstore-ba
thanos tools bucket mark --id "01HKZYEZ2DVDQXF1STVEXAMPLE" --objstore.config="$OBJSTORE_CONFIG" --marker=deletion-mark.json --details=DELETE
----

. If you blocked for deletion, clean up the marked blocks by running the following command:
. If you are blocked for deletion, clean up the marked blocks by running the following command:

+
[source,bash]
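The updated topic distinguishes two causes for the same "halting" message: a full PVC and a corrupted block. As a rough sketch only, and not part of this commit, the following shell function shows one way to tell them apart from the compactor log lines quoted in the diff. The function name `classify_halt` and the labels `disk-full`, `corrupt-block`, and `unknown` are hypothetical; the match strings are taken from the two sample log messages in the diff, so other Thanos versions might word these errors differently.

```shell
#!/bin/sh
# classify_halt (hypothetical helper): read Thanos compactor log lines on
# stdin and print a likely cause for each "halting" error line.
classify_halt() {
  while IFS= read -r line; do
    # Ignore every line that is not a compactor halt message.
    case "$line" in
      *"critical error detected; halting"*) ;;
      *) continue ;;
    esac
    case "$line" in
      *"no space left on device"*)
        # Matches the first symptom: the PVC is full.
        echo "disk-full"
        ;;
      *"cannot populate chunk"*)
        # Matches the second symptom: take the block ID after "from block ".
        block=${line##*"from block "}
        block=${block%%:*}
        echo "corrupt-block $block"
        ;;
      *)
        echo "unknown"
        ;;
    esac
  done
}
```

Assuming the pod name and namespace used in the diff, the function could be fed from the pod logs with `oc logs observability-thanos-compact-0 -n open-cluster-management-observability | classify_halt`. A `disk-full` result points to the PVC resize steps, and a `corrupt-block` result names the block to pass to `thanos tools bucket mark`.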