OCS : cluster is in HEALTH_ERR #116

Open
jtaleric opened this issue Sep 10, 2019 · 3 comments

Comments

@jtaleric
Contributor

Status:
  Ceph:
    Details:
      MDS_ALL_DOWN:
        Message:   1 filesystem is offline
        Severity:  HEALTH_ERR
      MDS_UP_LESS_THAN_MAX:
        Message:   1 filesystem is online with fewer MDS than max_mds
        Severity:  HEALTH_WARN
      PG_AVAILABILITY:
        Message:   Reduced data availability: 24 pgs inactive
        Severity:  HEALTH_WARN
      TOO_FEW_PGS:
        Message:      too few PGs per OSD (4 < min 30)
        Severity:     HEALTH_WARN
    Health:           HEALTH_ERR
    Last Changed:     2019-09-10T11:21:26Z
    Last Checked:     2019-09-10T16:03:56Z
    Previous Health:  HEALTH_OK
  State:              Created
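
For reference, the block above looks like the CephCluster status as rendered by oc describe. One way to pull the same view (the openshift-storage namespace is an assumption here; adjust to wherever OCS is installed) is:

$ oc -n openshift-storage describe cephcluster
$ oc -n openshift-storage get cephcluster -o yaml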

NAME                                                        READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-c5nqt                                      3/3     Running     0          5h21m
csi-cephfsplugin-kn6rb                                      3/3     Running     0          5h21m
csi-cephfsplugin-lcthx                                      3/3     Running     0          5h21m
csi-cephfsplugin-provisioner-678db5bf66-bnz88               4/4     Running     0          5h21m
csi-cephfsplugin-provisioner-678db5bf66-k7dth               4/4     Running     0          5h21m
csi-rbdplugin-dg5tf                                         3/3     Running     0          5h21m
csi-rbdplugin-mf287                                         3/3     Running     0          5h21m
csi-rbdplugin-provisioner-7bf4556986-hwhsc                  5/5     Running     0          5h21m
csi-rbdplugin-provisioner-7bf4556986-qpnkp                  5/5     Running     0          5h21m
csi-rbdplugin-rjw9s                                         3/3     Running     0          5h21m
local-storage-operator-bcfd5765f-qfpfx                      1/1     Running     0          5h22m
noobaa-core-0                                               0/2     Pending     0          5h21m
noobaa-operator-6f96d69f89-89gst                            1/1     Running     0          5h22m
ocs-operator-7878485678-s29cq                               1/1     Running     0          5h22m
osd-disks-local-diskmaker-qcpgl                             1/1     Running     0          5h21m
osd-disks-local-diskmaker-r555q                             1/1     Running     0          5h21m
osd-disks-local-diskmaker-xg7f7                             1/1     Running     0          5h21m
osd-disks-local-provisioner-7c2hk                           1/1     Running     0          5h21m
osd-disks-local-provisioner-9cj8d                           1/1     Running     0          5h21m
osd-disks-local-provisioner-mc2xb                           1/1     Running     0          5h21m
rook-ceph-mds-mycluster-cephfilesystem-a-799db7669d-8w5gk   1/1     Running     0          5h17m
rook-ceph-mds-mycluster-cephfilesystem-b-6cd95bb858-j2gg2   1/1     Running     0          5h17m
rook-ceph-mgr-a-6c74697d6b-22vxb                            1/1     Running     0          5h18m
rook-ceph-mon-a-54cb68fb47-npbcr                            1/1     Running     0          5h20m
rook-ceph-mon-b-59d978bb88-xzj8n                            1/1     Running     0          5h19m
rook-ceph-mon-c-7cc66d5895-8drkp                            1/1     Running     0          5h19m
rook-ceph-operator-685695b58f-cv8gd                         1/1     Running     0          5h22m
rook-ceph-osd-0-6f784f685d-4cckt                            1/1     Running     0          5h17m
rook-ceph-osd-1-7dd4997744-mts7j                            1/1     Running     0          5h17m
rook-ceph-osd-10-5844f756f-7gxn8                            1/1     Running     0          5h17m
rook-ceph-osd-11-58b47d848b-4mv2k                           1/1     Running     0          5h17m
rook-ceph-osd-2-555c46b877-vfl4z                            1/1     Running     0          5h17m
rook-ceph-osd-3-dcd9f7549-hckcm                             1/1     Running     0          5h17m
rook-ceph-osd-4-55d9b78ff-hhvl2                             1/1     Running     0          5h17m
rook-ceph-osd-5-5c48fdbc5c-2xjml                            1/1     Running     0          5h17m
rook-ceph-osd-6-79646f956c-7sgjz                            1/1     Running     0          5h17m
rook-ceph-osd-7-767bb55984-j729p                            1/1     Running     0          5h17m
rook-ceph-osd-8-69948c56c8-vjxgd                            1/1     Running     0          5h17m
rook-ceph-osd-9-545fbcddb4-n9pd7                            1/1     Running     0          5h17m
rook-ceph-osd-prepare-osd-0-8ppjs-nkcsr                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-1-gwzl6-xlssn                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-10-xfjpd-xcx7b                    0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-11-gpjvx-g55nv                    0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-2-pb9wn-tppv8                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-3-qxx2z-kdrvs                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-4-4vnzs-9lcb2                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-5-krmwx-tz8tq                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-6-ns8gl-vcs4k                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-7-8vvwv-gj42h                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-8-74ttd-65f96                     0/1     Completed   0          5h18m
rook-ceph-osd-prepare-osd-9-cdgxm-79lfz                     0/1     Completed   0          5h18m
rook-ceph-tools-5f5dc75fd5-2rsp9                            1/1     Running     0          5h18m
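
To dig into the HEALTH_ERR details, one option (a sketch; the pod name is a placeholder and the namespace is assumed) is to exec into the rook-ceph-tools pod and query Ceph directly:

$ oc -n openshift-storage exec -it <rook-ceph-tools-pod> -- ceph health detail
$ oc -n openshift-storage exec -it <rook-ceph-tools-pod> -- ceph fs status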
@jtaleric
Contributor Author

sh-4.2# ceph -s
  cluster:
    id:     dcf2163a-2430-4136-91b8-a2c731bf966c
    health: HEALTH_ERR
            1 filesystem is offline
            1 filesystem is online with fewer MDS than max_mds
            Reduced data availability: 24 pgs inactive
            too few PGs per OSD (4 < min 30)
 
  services:
    mon: 3 daemons, quorum a,b,c (age 5h)
    mgr: a(active, since 5h)
    mds: mycluster-cephfilesystem:0
    osd: 12 osds: 12 up (since 5h), 12 in (since 5h)
 
  data:
    pools:   5 pools, 40 pgs
    objects: 0 objects, 0 B
    usage:   12 GiB used, 17 TiB / 17 TiB avail
    pgs:     60.000% pgs unknown
             24 unknown
             16 active+clean
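
With mds showing mycluster-cephfilesystem:0 (no active ranks) and 24 PGs unknown, a few follow-up checks from the same toolbox shell (a sketch, using standard Ceph commands) would show which ranks and pools are affected:

sh-4.2# ceph fs status
sh-4.2# ceph mds stat
sh-4.2# ceph pg dump_stuck inactive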
 

@e-minguez
Member

In my environment (autoscale is on out of the box; I didn't enable it myself):

$ oc get pods -n openshift-storage
NAME                                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-9stz5                                            3/3     Running     0          24h
csi-cephfsplugin-b8zqw                                            3/3     Running     0          24h
csi-cephfsplugin-n9kh8                                            3/3     Running     0          24h
csi-cephfsplugin-provisioner-678db5bf66-h6kq9                     4/4     Running     2          24h
csi-cephfsplugin-provisioner-678db5bf66-lxt6l                     4/4     Running     0          24h
csi-rbdplugin-provisioner-7bf4556986-f4v9r                        5/5     Running     0          24h
csi-rbdplugin-provisioner-7bf4556986-r4sbn                        5/5     Running     0          24h
csi-rbdplugin-tg77k                                               3/3     Running     0          24h
csi-rbdplugin-vklpx                                               3/3     Running     0          24h
csi-rbdplugin-wkmht                                               3/3     Running     0          24h
local-storage-operator-bcfd5765f-gxj5v                            1/1     Running     0          24h
noobaa-core-0                                                     2/2     Running     0          24h
noobaa-operator-6f96d69f89-6rfdv                                  1/1     Running     0          24h
ocs-operator-7878485678-nbhwf                                     1/1     Running     0          23h
osd-disks-local-diskmaker-9wfm6                                   1/1     Running     0          24h
osd-disks-local-diskmaker-hcwpg                                   1/1     Running     0          24h
osd-disks-local-diskmaker-vrplc                                   1/1     Running     0          24h
osd-disks-local-provisioner-4ftqr                                 1/1     Running     0          24h
osd-disks-local-provisioner-b5vwq                                 1/1     Running     0          24h
osd-disks-local-provisioner-w95rb                                 1/1     Running     0          24h
rook-ceph-mds-openshift-storage-cephfilesystem-a-56d6bd4d849j5c   1/1     Running     0          23h
rook-ceph-mds-openshift-storage-cephfilesystem-b-8cfd6f79-nhzkm   1/1     Running     0          23h
rook-ceph-mgr-a-74f685f55c-skhtw                                  1/1     Running     0          23h
rook-ceph-mon-a-c8cd8875c-vc4p8                                   1/1     Running     0          24h
rook-ceph-mon-b-79449bd57d-v6sn7                                  1/1     Running     0          24h
rook-ceph-mon-c-64df8f4c5-dxqg4                                   1/1     Running     0          24h
rook-ceph-operator-685695b58f-9pmhz                               1/1     Running     0          23h
rook-ceph-osd-0-b54548558-szczk                                   1/1     Running     0          23h
rook-ceph-osd-1-779955999d-k55d4                                  1/1     Running     0          23h
rook-ceph-osd-2-7889cd7d8-qk6c4                                   1/1     Running     0          23h
rook-ceph-osd-3-5fcf74b6b7-lgxn4                                  1/1     Running     0          23h
rook-ceph-osd-4-67c56b4555-bjmqf                                  1/1     Running     0          23h
rook-ceph-osd-5-6945d7c764-jrbqq                                  1/1     Running     0          23h
rook-ceph-osd-6-65d468c79f-g6r7d                                  1/1     Running     0          23h
rook-ceph-osd-7-5b496db4d7-bfxxv                                  1/1     Running     0          23h
rook-ceph-osd-8-54995f85f9-wvh2n                                  1/1     Running     0          23h
rook-ceph-osd-prepare-osd-0-9dtlp-v4frm                           0/1     Completed   0          23h
rook-ceph-osd-prepare-osd-1-n74tf-hghtl                           0/1     Completed   0          23h
rook-ceph-osd-prepare-osd-2-nttzf-lw8gr                           0/1     Completed   0          23h
rook-ceph-osd-prepare-osd-3-24tp9-fpwh2                           0/1     Completed   0          23h
rook-ceph-osd-prepare-osd-4-xjvs7-tfbfl                           0/1     Completed   0          23h
rook-ceph-osd-prepare-osd-5-fjtb7-zxs8h                           0/1     Completed   0          23h
rook-ceph-osd-prepare-osd-6-29nc4-njkk4                           0/1     Completed   0          23h
rook-ceph-osd-prepare-osd-7-f2s5g-n2w5n                           0/1     Completed   0          23h
rook-ceph-osd-prepare-osd-8-fbb97-cnmrp                           0/1     Completed   0          23h
rook-ceph-tools-5f5dc75fd5-fwwq7                                  1/1     Running     0          22h
$ oc exec $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o jsonpath="{.items[*].metadata.name}") -n openshift-storage -- ceph -s
  cluster:
    id:     0b23199a-52a3-4e0c-9e14-9769055e6026
    health: HEALTH_WARN
            too few PGs per OSD (10 < min 30)
 
  services:
    mon: 3 daemons, quorum a,b,c (age 24h)
    mgr: a(active, since 23h)
    mds: openshift-storage-cephfilesystem:1 {0=openshift-storage-cephfilesystem-a=up:active} 1 up:standby-replay
    osd: 9 osds: 9 up (since 23h), 9 in (since 23h)
 
  data:
    pools:   4 pools, 32 pgs
    objects: 10.71k objects, 41 GiB
    usage:   126 GiB used, 20 TiB / 20 TiB avail
    pgs:     32 active+clean
 
  io:
    client:   853 B/s rd, 116 KiB/s wr, 1 op/s rd, 2 op/s wr
$ oc exec $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o jsonpath="{.items[*].metadata.name}") -n openshift-storage -- ceph osd pool autoscale-status
 POOL                                         SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE 
 openshift-storage-cephblockpool            116.9G                3.0        20106G  0.0175                 1.0       8              on        
 openshift-storage-cephfilesystem-metadata   1536k                3.0        20106G  0.0000                 1.0       8              on        
 openshift-storage-cephfilesystem-data0         0                 3.0        20106G  0.0000                 1.0       8              on        
 .rgw.root                                      0                 3.0        20106G  0.0000                 1.0       8              on
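
For clusters where the autoscaler is not already on, it can also be enabled per pool from the toolbox (generic Ceph Nautilus commands, a sketch rather than an OCS-specific fix; <pool> is a placeholder):

$ oc exec $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o jsonpath="{.items[*].metadata.name}") -n openshift-storage -- ceph mgr module enable pg_autoscaler
$ oc exec $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o jsonpath="{.items[*].metadata.name}") -n openshift-storage -- ceph osd pool set <pool> pg_autoscale_mode on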

@jtaleric
Contributor Author

@e-minguez yeah, the pg autoscaler looks to be hard-coded on in the OCS Operator.

Additionally, @e-minguez @karmab, the Operator seems to default to HostNetwork: False -- just a heads up.
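
If host networking is wanted, a rough sketch (assuming the CephCluster CR in this Rook release exposes spec.network.hostNetwork; the field path may differ between versions, and the operator may reconcile manual changes away) would be:

$ oc -n openshift-storage get cephcluster -o jsonpath='{.items[*].spec.network}'
$ oc -n openshift-storage patch cephcluster <cluster-name> --type merge -p '{"spec":{"network":{"hostNetwork":true}}}'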
