Ceph: fix active+undersized+degraded pgs after removing an osd?
Question
I can't find clear information anywhere. How do I make the Ceph cluster healthy again after removing an OSD?
I just removed one of the 4 OSDs, deleting it as described in the manual:
kubectl -n rook-ceph scale deployment rook-ceph-osd-2 --replicas=0
kubectl rook-ceph rook purge-osd 2 --force
2023-02-23 14:31:50.335428 W | cephcmd: loaded admin secret from env var ROOK_CEPH_SECRET instead of from file
2023-02-23 14:31:50.335546 I | rookcmd: starting Rook v1.10.11 with arguments 'rook ceph osd remove --osd-ids=2 --force-osd-removal=true'
2023-02-23 14:31:50.335558 I | rookcmd: flag values: --force-osd-removal=true, --help=false, --log-level=INFO, --operator-image=, --osd-ids=2, --preserve-pvc=false, --service-account=
2023-02-23 14:31:50.335563 I | op-mon: parsing mon endpoints: b=10.104.202.63:6789
2023-02-23 14:31:50.351772 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2023-02-23 14:31:50.351969 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2023-02-23 14:31:51.371062 I | cephosd: validating status of osd.2
2023-02-23 14:31:51.371103 I | cephosd: osd.2 is marked 'DOWN'
2023-02-23 14:31:52.449943 I | cephosd: marking osd.2 out
2023-02-23 14:31:55.263635 I | cephosd: osd.2 is NOT ok to destroy but force removal is enabled so proceeding with removal
2023-02-23 14:31:55.280318 I | cephosd: removing the OSD deployment "rook-ceph-osd-2"
2023-02-23 14:31:55.280344 I | op-k8sutil: removing deployment rook-ceph-osd-2 if it exists
2023-02-23 14:31:55.293007 I | op-k8sutil: Removed deployment rook-ceph-osd-2
2023-02-23 14:31:55.303553 I | op-k8sutil: "rook-ceph-osd-2" still found. waiting...
2023-02-23 14:31:57.315200 I | op-k8sutil: confirmed rook-ceph-osd-2 does not exist
2023-02-23 14:31:57.315231 I | cephosd: did not find a pvc name to remove for osd "rook-ceph-osd-2"
2023-02-23 14:31:57.315237 I | cephosd: purging osd.2
2023-02-23 14:31:58.845262 I | cephosd: attempting to remove host '\x02' from crush map if not in use
2023-02-23 14:32:03.047937 I | cephosd: no ceph crash to silence
2023-02-23 14:32:03.047963 I | cephosd: completed removal of OSD 2
Here is the status of the cluster before and after deletion.
[root@rook-ceph-tools-6cd9f76d46-bl4tl /]# ceph status
cluster:
id: 75b45cd3-74ee-4de1-8e46-0f51bfd8a152
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 43h)
mgr: a(active, since 42h), standbys: b
mds: 1/1 daemons up, 1 hot standby
osd: 4 osds: 4 up (since 43h), 4 in (since 43h)
rgw: 1 daemon active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 13 pools, 201 pgs
objects: 1.13k objects, 1.5 GiB
usage: 2.0 GiB used, 38 GiB / 40 GiB avail
pgs: 201 active+clean
io:
client: 1.3 KiB/s rd, 7.5 KiB/s wr, 2 op/s rd, 0 op/s wr
[root@rook-ceph-tools-6cd9f76d46-bl4tl /]# ceph status
cluster:
id: 75b45cd3-74ee-4de1-8e46-0f51bfd8a152
health: HEALTH_WARN
Degraded data redundancy: 355/2667 objects degraded (13.311%), 42 pgs degraded, 144 pgs undersized
services:
mon: 3 daemons, quorum a,b,c (age 43h)
mgr: a(active, since 42h), standbys: b
mds: 1/1 daemons up, 1 hot standby
osd: 3 osds: 3 up (since 28m), 3 in (since 17m); 25 remapped pgs
rgw: 1 daemon active (1 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 13 pools, 201 pgs
objects: 1.13k objects, 1.5 GiB
usage: 1.7 GiB used, 28 GiB / 30 GiB avail
pgs: 355/2667 objects degraded (13.311%)
56/2667 objects misplaced (2.100%)
102 active+undersized
42 active+undersized+degraded
33 active+clean
24 active+clean+remapped
io:
client: 1.2 KiB/s rd, 2 op/s rd, 0 op/s wr
If I did it wrong, how should I do it correctly in the future?
Thanks
Update:
[root@rook-ceph-tools-6cd9f76d46-bl4tl /]# ceph health detail
HEALTH_WARN 1 MDSs report slow metadata IOs; Reduced data availability: 9 pgs inactive, 9 pgs down; Degraded data redundancy: 406/4078 objects degraded (9.956%), 50 pgs degraded, 150 pgs undersized; 1 daemons have recently crashed; 256 slow ops, oldest one blocked for 6555 sec, osd.1 has slow ops
[WRN] MDS_SLOW_METADATA_IO: 1 MDSs report slow metadata IOs
mds.ceph-filesystem-a(mds.0): 1 slow metadata IOs are blocked > 30 secs, oldest blocked for 6490 secs
[WRN] PG_AVAILABILITY: Reduced data availability: 9 pgs inactive, 9 pgs down
pg 13.5 is down, acting [0,1,NONE]
pg 13.7 is down, acting [1,0,NONE]
pg 13.b is down, acting [1,0,NONE]
pg 13.e is down, acting [0,NONE,1]
pg 13.15 is down, acting [0,NONE,1]
pg 13.16 is down, acting [0,1,NONE]
pg 13.18 is down, acting [0,NONE,1]
pg 13.19 is down, acting [1,0,NONE]
pg 13.1e is down, acting [1,0,NONE]
[WRN] PG_DEGRADED: Degraded data redundancy: 406/4078 objects degraded (9.956%), 50 pgs degraded, 150 pgs undersized
pg 2.8 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 2.9 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 2.a is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 2.b is stuck undersized for 108m, current state active+undersized, last acting [1,0]
pg 2.c is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 2.d is stuck undersized for 108m, current state active+undersized, last acting [1,0]
pg 2.e is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 5.9 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 5.a is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 5.b is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 5.c is stuck undersized for 108m, current state active+undersized+degraded, last acting [1,0]
pg 5.d is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 5.e is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 5.f is stuck undersized for 108m, current state active+undersized+degraded, last acting [1,0]
pg 6.8 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 6.9 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 6.a is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 6.c is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 6.d is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 6.e is stuck undersized for 108m, current state active+undersized, last acting [1,0]
pg 6.f is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 8.0 is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 8.1 is stuck undersized for 108m, current state active+undersized+degraded, last acting [1,0]
pg 8.2 is stuck undersized for 108m, current state active+undersized, last acting [1,0]
pg 8.3 is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 8.4 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 8.6 is stuck undersized for 108m, current state active+undersized+degraded, last acting [1,0]
pg 8.7 is stuck undersized for 108m, current state active+undersized+degraded, last acting [1,0]
pg 9.0 is stuck undersized for 108m, current state active+undersized, last acting [1,0]
pg 9.1 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 9.2 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 9.5 is stuck undersized for 108m, current state active+undersized, last acting [1,0]
pg 9.6 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 9.7 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 11.0 is stuck undersized for 108m, current state active+undersized, last acting [1,0]
pg 11.2 is stuck undersized for 108m, current state active+undersized, last acting [1,0]
pg 11.3 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 11.4 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 11.5 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 11.7 is stuck undersized for 108m, current state active+undersized, last acting [0,1]
pg 12.0 is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 12.2 is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 12.3 is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 12.4 is stuck undersized for 108m, current state active+undersized+remapped, last acting [1,0]
pg 12.5 is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 12.6 is stuck undersized for 108m, current state active+undersized+remapped, last acting [1,0]
pg 12.7 is stuck undersized for 108m, current state active+undersized+degraded, last acting [0,1]
pg 13.1 is stuck undersized for 108m, current state active+undersized, last acting [1,NONE,0]
pg 13.2 is stuck undersized for 108m, current state active+undersized, last acting [0,NONE,1]
pg 13.3 is stuck undersized for 108m, current state active+undersized, last acting [1,0,NONE]
pg 13.4 is stuck undersized for 108m, current state active+undersized+remapped, last acting [0,1,NONE]
[WRN] RECENT_CRASH: 1 daemons have recently crashed
osd.3 crashed on host rook-ceph-osd-3-6f65b8c5b6-hvql8 at 2023-02-23T16:54:29.395306Z
[WRN] SLOW_OPS: 256 slow ops, oldest one blocked for 6555 sec, osd.1 has slow ops
[root@rook-ceph-tools-6cd9f76d46-bl4tl /]# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 18 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'ceph-blockpool' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 35 lfor 0/0/31 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 3 'ceph-objectstore.rgw.control' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 181 lfor 0/181/179 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 4 'ceph-objectstore.rgw.meta' replicated size 3 min_size 2 crush_rule 3 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 54 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 5 'ceph-filesystem-metadata' replicated size 3 min_size 2 crush_rule 4 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 137 lfor 0/0/83 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 6 'ceph-filesystem-data0' replicated size 3 min_size 2 crush_rule 5 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 92 lfor 0/0/83 flags hashpspool stripe_width 0 application cephfs
pool 7 'ceph-objectstore.rgw.log' replicated size 3 min_size 2 crush_rule 6 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 273 lfor 0/273/271 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 8 'ceph-objectstore.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 7 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 98 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 9 'ceph-objectstore.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 8 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 113 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 10 'qa' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 310 lfor 0/0/137 flags hashpspool,selfmanaged_snaps max_bytes 42949672960 stripe_width 0 application rbd
pool 11 'ceph-objectstore.rgw.otp' replicated size 3 min_size 2 crush_rule 9 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 123 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 12 '.rgw.root' replicated size 3 min_size 2 crush_rule 10 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 308 lfor 0/308/306 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 13 'ceph-objectstore.rgw.buckets.data' erasure profile ceph-objectstore.rgw.buckets.data_ecprofile size 3 min_size 2 crush_rule 11 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 200 lfor 0/0/194 flags hashpspool,ec_overwrites stripe_width 8192 application rook-ceph-rgw
[root@rook-ceph-tools-6cd9f76d46-f4vsj /]# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 17 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'ceph-blockpool' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 39 lfor 0/0/35 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 3 'ceph-objectstore.rgw.control' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 194 lfor 0/194/192 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 4 'ceph-objectstore.rgw.meta' replicated size 3 min_size 2 crush_rule 3 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 250 lfor 0/250/248 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 5 'ceph-filesystem-metadata' replicated size 3 min_size 2 crush_rule 4 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 70 lfor 0/0/55 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 6 'ceph-filesystem-data0' replicated size 3 min_size 2 crush_rule 5 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 115 lfor 0/0/103 flags hashpspool stripe_width 0 application cephfs
pool 7 'ceph-objectstore.rgw.log' replicated size 3 min_size 2 crush_rule 6 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 84 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 8 'ceph-objectstore.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 7 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 100 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 9 'ceph-objectstore.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 8 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 122 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 10 'ceph-objectstore.rgw.otp' replicated size 3 min_size 2 crush_rule 9 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 135 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 11 '.rgw.root' replicated size 3 min_size 2 crush_rule 10 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 144 flags hashpspool stripe_width 0 pg_num_min 8 application rook-ceph-rgw
pool 12 'ceph-objectstore.rgw.buckets.data' erasure profile ceph-objectstore.rgw.buckets.data_ecprofile size 3 min_size 2 crush_rule 11 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 167 lfor 0/0/157 flags hashpspool,ec_overwrites stripe_width 8192 application rook-ceph-rgw
pool 13 'qa' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 267 lfor 0/0/262 flags hashpspool,selfmanaged_snaps max_bytes 32212254720 stripe_width 0 application qa,rbd
[root@rook-ceph-tools-6cd9f76d46-f4vsj /]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.02939 root default
-5 0.02939 region nbg1
-4 0.02939 zone nbg1-dc3
-11 0.01959 host k8s-qa-pool1-7b6956fb46-cvdqr
1 ssd 0.00980 osd.1 up 1.00000 1.00000
3 ssd 0.00980 osd.3 up 1.00000 1.00000
-3 0.00980 host k8s-qa-pool1-7b6956fb46-mbnld
0 ssd 0.00980 osd.0 up 1.00000 1.00000
[root@rook-ceph-tools-6cd9f76d46-f4vsj /]# ceph osd crush rule dump
[
{
"rule_id": 0,
"rule_name": "replicated_rule",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 1,
"rule_name": "ceph-blockpool",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 2,
"rule_name": "ceph-objectstore.rgw.control",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 3,
"rule_name": "ceph-objectstore.rgw.meta",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 4,
"rule_name": "ceph-filesystem-metadata",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 5,
"rule_name": "ceph-filesystem-data0",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 6,
"rule_name": "ceph-objectstore.rgw.log",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 7,
"rule_name": "ceph-objectstore.rgw.buckets.index",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 8,
"rule_name": "ceph-objectstore.rgw.buckets.non-ec",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 9,
"rule_name": "ceph-objectstore.rgw.otp",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 10,
"rule_name": ".rgw.root",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 11,
"rule_name": "ceph-objectstore.rgw.buckets.data",
"type": 3,
"steps": [
{
"op": "set_chooseleaf_tries",
"num": 5
},
{
"op": "set_choose_tries",
"num": 100
},
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_indep",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
}
]
Answer 1
Score: 1
I'm not familiar with Rook, but apparently the rulesets are created for you? Anyway, they all use "host" as the failure domain and have a size of 3, but with only two hosts your requirements cannot be fulfilled. I assume the 4th OSD you had was on a third host, which is why your cluster is now degraded. You'll need to add at least one more host so your PGs can recover successfully. As for the erasure-coded pool, it also has "host" as the failure domain, and with size = 3 (I assume the EC profile is something like k=2, m=1?) you also require 3 hosts. To get the replicated pools recovered you could change their size to 2, but I don't recommend doing that permanently, only for recovery purposes. Since you can't change an EC profile, that pool will stay degraded until you add a third OSD node.
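A minimal sketch of those two checks/workarounds, run from the rook-ceph-tools pod (the pool and EC profile names are taken from the ceph osd pool ls detail output in the question; verify them in your cluster first):
# inspect the EC profile of the erasure-coded pool to see its k/m values
ceph osd erasure-code-profile get ceph-objectstore.rgw.buckets.data_ecprofile
# temporarily drop a replicated pool to 2 copies so it can become active+clean on 2 hosts
ceph osd pool set ceph-blockpool size 2
# revert once a third OSD host is in place and recovery has finished
ceph osd pool set ceph-blockpool size 3
Note that Rook may reconcile pool settings back to what the CephBlockPool/CephObjectStore/CephFilesystem resources declare, so any change that is meant to stick should be made in those CRs rather than directly via the CLI.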
To answer your other questions:
- Failure domain: It really depends on your setup; it could be rack, chassis, data center and so on. But with such a tiny setup it makes sense to have "host" as the failure domain (see the sketch after this list).
- Ceph is self-healing software, so if an OSD fails Ceph can recover automatically, but only if you have enough spare hosts/OSDs. With your tiny setup you don't have enough capacity to be resilient against even a single OSD failure. If you plan to use Ceph for production data you should familiarize yourself with the concepts and plan a proper setup.
- The more OSDs per host, the more recovery options you'll have. Warnings are fine: if Ceph notices a disk outage it warns you about it, but it can recover automatically if there are enough OSDs and hosts.
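To illustrate the failure-domain point, a hedged sketch (an illustration only, not a recommendation for production): the failure domain of an existing rule can be read from ceph osd crush rule dump, and a replicated rule that uses "osd" instead of "host" could be created and assigned to a pool so that size-3 pools can place all copies on two hosts, at the cost of host-level redundancy. The rule name replicated_osd_rule is made up for this example:
# see which rule a pool uses and what its failure domain is
ceph osd pool get ceph-blockpool crush_rule
ceph osd crush rule dump ceph-blockpool
# create a replicated rule with osd-level failure domain under the default root
ceph osd crush rule create-replicated replicated_osd_rule default osd
# point the pool at the new rule
ceph osd pool set ceph-blockpool crush_rule replicated_osd_rule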
If you look at the output of ceph osd tree, there are only 2 hosts holding the three remaining OSDs, which is why the cluster is not healthy at the moment.
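As for doing the removal "right" in the future, a small sketch (my own addition, not part of the answer above): before purging an OSD you can ask Ceph whether its data is safe to remove, which is exactly the check the forced removal skipped ("osd.2 is NOT ok to destroy" in the log):
# fails while PGs would lose data or redundancy if this OSD were destroyed
ceph osd safe-to-destroy osd.2
# checks whether stopping the OSD would make any PG go inactive
ceph osd ok-to-stop osd.2
Waiting until safe-to-destroy succeeds (after marking the OSD out and letting Ceph rebalance) avoids ending up with degraded PGs, provided there are enough remaining hosts to satisfy the CRUSH rules.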