1 Introduction
An OSD (Object Storage Device) is one of the roles in a Ceph cluster: the daemon that handles client requests and returns the actual data. A Ceph cluster usually runs multiple OSDs.
This article covers the basic commands for operating OSDs.
2 Common Operations
2.1 Check OSD status
$ ceph osd stat
3 osds: 3 up, 3 in
Status values:
● in — inside the cluster
● out — outside the cluster
● up — running
● down — not running
Notes:
An OSD that is up can be either in or out of the cluster. If it was in but was recently marked out, Ceph migrates its placement groups to other OSDs.
Once an OSD is out, CRUSH no longer assigns placement groups to it. An OSD that is down should normally also be out.
An OSD that is down yet still in indicates a problem, and the cluster will not be in a healthy state.
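To see at a glance which OSDs are currently down or out, the state can also be read from the OSD map. A minimal sketch, assuming the `osd.N up/down in/out` line layout shown in section 2.2 below:
# list OSDs that are currently reported down
$ ceph osd dump | awk '$1 ~ /^osd\./ && $2 == "down"'
# show detailed health messages, including which OSDs are down or out
$ ceph health detail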
2.2 View the OSD map
$ ceph osd dump
epoch 64
fsid 0c042a2d-b040-484f-935b-d4f68428b2d6
created 2021-12-28 18:24:46.492648
modified 2021-12-29 16:43:51.895508
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 19
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client jewel
min_compat_client jewel
require_osd_release luminous
max_osd 10
osd.0 up in weight 1 up_from 37 up_thru 0 down_at 36 last_clean_interval [9,35) 192.168.8.101:9601/24028 192.168.8.101:9602/24028 192.168.8.101:9603/24028 192.168.8.101:9604/24028 exists,up 33afe59f-a8f2-44a0-b479-64837183bbd6
osd.1 up in weight 1 up_from 6 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.8.102:9600/20844 192.168.8.102:9601/20844 192.168.8.102:9602/20844 192.168.8.102:9603/20844 exists,up ecf9d3c8-6f2f-42ac-bf8b-6cefcc44cdd0
osd.2 up in weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.8.103:9601/32806 192.168.8.103:9602/32806 192.168.8.103:9603/32806 192.168.8.103:9604/32806 exists,up dde6a9b4-b247-433a-bb29-8355a45a1fb1
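To look up a single OSD (its host, addresses, and CRUSH location) without scanning the whole dump, the following standard commands can be used; the id 0 is just an example:
$ ceph osd find 0
$ ceph osd metadata 0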
2.3 View the OSD tree
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
2.4 Take an OSD down
# Bring osd.2 down; while down it no longer serves read or write requests, but it is still in the cluster
$ systemctl stop ceph-osd@2
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 down 1.00000 1.00000
2.5 Bring an OSD up
# Bring osd.2 back up; it serves read and write requests again
$ systemctl start ceph-osd@2
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
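When scripting a restart, it may help to wait until the OSD is reported up again before moving on. A minimal sketch, assuming osd.2 as in the example above:
$ systemctl start ceph-osd@2
# poll the OSD map until osd.2 is reported up
$ until ceph osd dump | grep -q '^osd.2 up'; do sleep 2; done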
2.6 Mark an OSD out of the cluster
# Mark osd.2 out of the cluster, i.e. take it out of service; the OSD can then be maintained
$ ceph osd out 2
marked out osd.2.
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 0 1.00000
2.7 Mark an OSD back into the cluster
# Mark osd.2 back into the cluster
$ ceph osd in 2
marked in osd.2.
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
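Marking an OSD out (or back in) changes data placement, so placement groups start backfilling onto other OSDs. Progress can be watched while the cluster rebalances:
$ ceph -s
# or stream cluster status and events continuously until interrupted
$ ceph -w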
2.8 Add a new OSD to the cluster
2.8.1 Add a bluestore OSD
# Partition the disk into three partitions, used for wal, db, and data respectively.
$ parted -s /dev/sdb mklabel gpt
$ parted -s /dev/sdb mkpart primary 2048s 4196351s
$ parted -s /dev/sdb mkpart primary 4196352s 69208063s
$ parted -s /dev/sdb mkpart primary 69208064s 100%
# Check the disk partitions
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 50G 0 disk
├─sdb2 8:18 0 31G 0 part
├─sdb3 8:19 0 17G 0 part
└─sdb1 8:17 0 2G 0 part
sr0 11:0 1 1024M 0 rom
sdc 8:32 0 50G 0 disk
sda 8:0 0 50G 0 disk
├─sda2 8:2 0 49G 0 part
│ ├─centos-swap 253:1 0 3.9G 0 lvm [SWAP]
│ └─centos-root 253:0 0 45.1G 0 lvm /
└─sda1 8:1 0 1G 0 part /boot
# Use the ceph-volume command to add a bluestore OSD to the cluster.
$ ceph-volume lvm prepare --block.db /dev/sdb2 --block.wal /dev/sdb1 --data /dev/sdb3
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6a81b934-feb1-4fce-8ad5-ed0a5f203f40
Running command: vgcreate --force --yes ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0 /dev/sdb3
stdout: Physical volume "/dev/sdb3" successfully created.
stdout: Volume group "ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0" successfully created
Running command: lvcreate --yes -l 100%FREE -n osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40 ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0
stdout: Logical volume "osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-3
Running command: restorecon /var/lib/ceph/osd/ceph-3
Running command: chown -h ceph:ceph /dev/ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0/osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40
Running command: chown -R ceph:ceph /dev/dm-3
Running command: ln -s /dev/ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0/osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40 /var/lib/ceph/osd/ceph-3/block
Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
stderr: got monmap epoch 1
stderr:
Running command: ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQCHDMxhByjSKRAA8pNSa1wpOEccKHaiilCj+g==
stdout: creating /var/lib/ceph/osd/ceph-3/keyring
stdout: added entity osd.3 auth auth(auid = 18446744073709551615 key=AQCHDMxhByjSKRAA8pNSa1wpOEccKHaiilCj+g== with 0 caps)
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
Running command: chown -R ceph:ceph /dev/sdb1
Running command: chown -R ceph:ceph /dev/sdb2
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --keyfile - --bluestore-block-wal-path /dev/sdb1 --bluestore-block-db-path /dev/sdb2 --osd-data /var/lib/ceph/osd/ceph-3/ --osd-uuid 6a81b934-feb1-4fce-8ad5-ed0a5f203f40 --setuser ceph --setgroup ceph
--> ceph-volume lvm prepare successful for: /dev/sdb3
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
3 0 osd.3 down 0 1.00000
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 4 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.01GiB used, 48.0GiB / 51.0GiB avail
pgs:
$ ceph osd crush add osd.3 1.00000 host=node103
add item id 3 name 'osd.3' weight 1 at location {host=node103} to crush map
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
3 1.00000 osd.3 down 0 1.00000
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ ceph osd in osd.3
marked in osd.3.
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
3 1.00000 osd.3 down 1.00000 1.00000
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
Active: inactive (dead)
$ systemctl start ceph-osd@3
$ systemctl enable ceph-osd@3
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2021-12-29 15:30:04 CST; 15s ago
Main PID: 3368700 (ceph-osd)
CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@3.service
└─3368700 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph
Dec 29 15:30:04 node103 systemd[1]: Starting Ceph object storage daemon osd.3...
Dec 29 15:30:04 node103 systemd[1]: Started Ceph object storage daemon osd.3.
Dec 29 15:30:04 node103 ceph-osd[3368700]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 15:30:05 node103 ceph-osd[3368700]: 2021-12-29 15:30:05.958147 7f6eabf16d80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 15:30:06 node103 ceph-osd[3368700]: 2021-12-29 15:30:06.848849 7f6e922f1700 -1 osd.3 0 waiting for initial osdmap
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
3 hdd 1.00000 osd.3 up 1.00000 1.00000
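As an alternative to the manual crush add / osd in / systemctl steps above, ceph-volume can activate the prepared OSD itself. A sketch, where the OSD id and fsid are the ones printed by the prepare step in this example; with the default osd_crush_update_on_start setting, the OSD also places itself in the CRUSH map when it starts:
# show prepared OSDs together with their ids and fsids
$ ceph-volume lvm list
# mount the OSD directory and enable/start the ceph-osd@3 systemd unit
$ ceph-volume lvm activate 3 6a81b934-feb1-4fce-8ad5-ed0a5f203f40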
2.8.2 Add a filestore OSD
$ parted -s /dev/sdc mklabel gpt
$ parted -s /dev/sdc mkpart primary 1M 10240M
$ parted -s /dev/sdc mkpart primary 10241M 100%
$ ceph-volume lvm prepare --filestore --data /dev/sdc2 --journal /dev/sdc1
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new ca2b837d-901d-4ce2-940b-fc278364b889
Running command: vgcreate --force --yes ceph-848a4efc-ed30-47f3-b64a-6229cfa35060 /dev/sdc2
stdout: Volume group "ceph-848a4efc-ed30-47f3-b64a-6229cfa35060" successfully created
Running command: lvcreate --yes -l 100%FREE -n osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 ceph-848a4efc-ed30-47f3-b64a-6229cfa35060
stdout: Wiping xfs signature on /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889.
stdout: Logical volume "osd-data-ca2b837d-901d-4ce2-940b-fc278364b889" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: mkfs -t xfs -f -i size=2048 /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889
stdout: meta-data=/dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 isize=2048 agcount=4, agsize=2651392 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=10605568, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=5178, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Running command: mount -t xfs -o rw,noatime,inode64 /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 /var/lib/ceph/osd/ceph-3
Running command: restorecon /var/lib/ceph/osd/ceph-3
Running command: chown -R ceph:ceph /dev/sdc1
Running command: ln -s /dev/sdc1 /var/lib/ceph/osd/ceph-3/journal
Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
stderr: got monmap epoch 1
stderr:
Running command: chown -h ceph:ceph /var/lib/ceph/osd/ceph-3/journal
Running command: chown -R ceph:ceph /dev/sdc1
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore filestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --osd-data /var/lib/ceph/osd/ceph-3/ --osd-journal /var/lib/ceph/osd/ceph-3/journal --osd-uuid ca2b837d-901d-4ce2-940b-fc278364b889 --setuser ceph --setgroup ceph
stderr: 2021-12-29 17:28:26.696005 7fdf5cceed80 -1 journal check: ondisk fsid ff68b4ad-06af-4dd2-8369-aac3da4abcc3 doesn't match expected ca2b837d-901d-4ce2-940b-fc278364b889, invalid (someone else's?) journal
stderr: 2021-12-29 17:28:26.902358 7fdf5cceed80 -1 journal do_read_entry(4096): bad header magic
stderr: 2021-12-29 17:28:26.902401 7fdf5cceed80 -1 journal do_read_entry(4096): bad header magic
stderr: 2021-12-29 17:28:26.904367 7fdf5cceed80 -1 read_settings error reading settings: (2) No such file or directory
stderr: 2021-12-29 17:28:27.171320 7fdf5cceed80 -1 created object store /var/lib/ceph/osd/ceph-3/ for osd.3 fsid 0c042a2d-b040-484f-935b-d4f68428b2d6
Running command: ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQAlKsxhXaKLBRAASmGMnMaltRwMIOZ9s/NN6w==
stdout: creating /var/lib/ceph/osd/ceph-3/keyring
added entity osd.3 auth auth(auid = 18446744073709551615 key=AQAlKsxhXaKLBRAASmGMnMaltRwMIOZ9s/NN6w== with 0 caps)
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
--> ceph-volume lvm prepare successful for: /dev/sdc2
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 4 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.03GiB used, 48.0GiB / 51.0GiB avail
pgs:
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
3 0 osd.3 down 0 1.00000
$ ceph osd crush add osd.3 1.00000 host=node103
add item id 3 name 'osd.3' weight 1 at location {host=node103} to crush map
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
3 1.00000 osd.3 down 0 1.00000
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ ceph osd in osd.3
marked in osd.3.
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
3 1.00000 osd.3 down 1.00000 1.00000
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Wed 2021-12-29 17:24:23 CST; 6min ago
Process: 3665352 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
Process: 3665311 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 3665352 (code=exited, status=0/SUCCESS)
Dec 29 17:19:47 node103 ceph-osd[3665352]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.875231 7f957470fd80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.875549 7f957470fd80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.976355 7f957470fd80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 17:19:50 node103 ceph-osd[3665352]: 2021-12-29 17:19:50.246598 7f95562e1700 -1 osd.3 0 waiting for initial osdmap
Dec 29 17:24:20 node103 systemd[1]: Stopping Ceph object storage daemon osd.3...
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.051448 7f954c2cd700 -1 received signal: Terminated from PID: 1 task name: /usr/lib/systemd/systemd --switched-root --system --deserialize 22 UID: 0
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.051526 7f954c2cd700 -1 osd.3 68 *** Got signal Terminated ***
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.849249 7f954c2cd700 -1 osd.3 68 shutdown
Dec 29 17:24:23 node103 systemd[1]: Stopped Ceph object storage daemon osd.3.
$ systemctl start ceph-osd@3
$ systemctl enable ceph-osd@3
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2021-12-29 17:31:07 CST; 9s ago
Main PID: 3696217 (ceph-osd)
CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@3.service
└─3696217 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph
Dec 29 17:31:07 node103 systemd[1]: Starting Ceph object storage daemon osd.3...
Dec 29 17:31:07 node103 systemd[1]: Started Ceph object storage daemon osd.3.
Dec 29 17:31:07 node103 ceph-osd[3696217]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.147135 7f6f8c407d80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.147178 7f6f8c407d80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.185153 7f6f8c407d80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 17:31:09 node103 ceph-osd[3696217]: 2021-12-29 17:31:09.465673 7f6f6dfd9700 -1 osd.3 0 waiting for initial osdmap
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 1.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 1.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
3 hdd 1.00000 osd.3 up 1.00000 1.00000
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 4 osds: 4 up, 4 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.13GiB used, 88.3GiB / 91.4GiB avail
pgs:
2.9 Remove an OSD from the cluster
2.9.1 Remove a bluestore OSD
$ systemctl stop ceph-osd@3
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Wed 2021-12-29 16:35:12 CST; 3s ago
Process: 3541909 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
Process: 3541893 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 3541909 (code=exited, status=0/SUCCESS)
$ ceph osd crush remove osd.3
removed item id 3 name 'osd.3' from crush map
$ ceph osd out 3
marked out osd.3.
$ ceph osd rm osd.3
removed osd.3
$ ceph auth del osd.3
updated
$ ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.04976 root default
-5 0.01659 host node101
0 hdd 0.01659 osd.0 up 1.00000 1.00000
-3 0.01659 host node102
1 hdd 0.01659 osd.1 up 1.00000 1.00000
-7 0.01659 host node103
2 hdd 0.01659 osd.2 up 1.00000 1.00000
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 3 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.02GiB used, 48.0GiB / 51.0GiB avail
pgs:
2.9.2 Remove a filestore OSD
Same procedure as in 2.9.1 for removing a bluestore OSD.
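On Luminous and newer, the separate crush remove / auth del / osd rm steps can also be replaced by a single purge command, and the backing disk can then be wiped so it can be prepared again. A sketch, using osd.3 and the /dev/sdc device from the filestore example in 2.8.2:
$ ceph osd purge osd.3 --yes-i-really-mean-it
# wipe the LVM volumes and partition data so the disk can be reused
$ ceph-volume lvm zap --destroy /dev/sdc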
2.10 View the maximum number of OSDs
# View max_osd, the maximum number of OSDs the OSD map can currently hold; in this cluster it is 4
$ ceph osd getmaxosd
max_osd = 4 in epoch 63
2.11 Set the maximum number of OSDs
# Set max_osd; it must be increased when scaling the cluster beyond the current limit
$ ceph osd setmaxosd 10
set new max_osd = 10
$ ceph osd getmaxosd
max_osd = 10 in epoch 64
2.12 Set an OSD's CRUSH weight
Syntax: ceph osd crush set {id} {weight} [{loc1} [{loc2} ...]]
$ ceph osd crush set 3 3.0 host=node4
# or
$ ceph osd crush reweight osd.3 1.0
2.13 Set an OSD's reweight
Syntax: ceph osd reweight {id} {weight}
$ ceph osd reweight 3 0.5
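The two weights are easy to confuse: the CRUSH weight from 2.12 appears in the WEIGHT column of ceph osd tree, while the reweight from 2.13 is a 0-1 override shown in the REWEIGHT column. Both can be checked per OSD, together with utilization:
$ ceph osd df
$ ceph osd tree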
2.14 Pause OSDs
# While paused, the whole cluster stops accepting read and write requests
$ ceph osd pause
2.15 Unpause OSDs
# After unpausing, the cluster accepts read and write requests again
$ ceph osd unpause
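The pause state is recorded as the pauserd and pausewr flags in the OSD map, so it can be verified the same way as other flags:
# pauserd,pausewr appear in the flags line while the cluster is paused
$ ceph osd dump | grep flags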
2.16 View an OSD's configuration
# View the runtime configuration of a specific OSD through its admin socket
$ ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config show | less
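The admin socket can also return a single parameter instead of the full dump. A sketch, using osd.2's socket as above; osd_max_backfills is just an example parameter:
$ ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config get osd_max_backfills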
2.17 Flapping OSDs
# We recommend deploying both a public (front-side) and a cluster (back-side) network, which better serves the capacity needs of object replication.
# However, OSDs do not handle the situation well when the cluster (back-side) network fails or becomes very slow while the public (front-side) network works fine:
# OSDs then report their peers as down to the monitors while reporting themselves as up. We call this situation flapping.
# If something causes OSDs to flap (being marked down and then up again, over and over), you can force the monitors to stop the state changes. This is mainly used while OSDs are flapping.
$ ceph osd set noup # prevent OSDs from getting marked up
$ ceph osd set nodown # prevent OSDs from getting marked down
# These flags are recorded in the osdmap:
$ ceph osd dump | grep flags
flags noup,nodown
# Clear the flags with:
$ ceph osd unset noup
$ ceph osd unset nodown
2.18 Change OSD parameters at runtime
# Change a parameter on all OSDs at runtime; the change does not survive a restart, so write it to the configuration file to persist it
$ ceph tell osd.* injectargs "--rbd_default_format 2 "
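To make such a change survive a daemon restart, also write it to /etc/ceph/ceph.conf on the relevant nodes. A minimal sketch, mirroring the (illustrative) parameter used above:
# /etc/ceph/ceph.conf
[osd]
rbd_default_format = 2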
2.19 View OSD latency
Used mainly to spot problems with individual disks; a problematic OSD should be removed promptly. The reported values are averages.
fs_commit_latency is the interval from receiving a request to reaching the commit state.
fs_apply_latency is the interval from receiving a request to reaching the apply state.
$ ceph osd perf
osd commit_latency(ms) apply_latency(ms)
2 0 0
0 0 0
1 0 0
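When hunting for a single slow disk, the table can be filtered. A minimal sketch that assumes the three-column layout shown above and flags OSDs whose apply latency exceeds 100 ms:
$ ceph osd perf | awk 'NR > 1 && $3 > 100 {print "osd." $1 " apply_latency=" $3 "ms"}'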
2.20 Primary affinity
When Ceph clients read and write data, one OSD may be less suited than the others to act as a primary (for example because of a slow disk or a slow controller). To maximize hardware utilization and avoid performance bottlenecks (especially for reads),
you can lower that OSD's primary affinity so that CRUSH is less likely to use it as the primary of an acting set.
Note: in the acting set [0, 1, 2], osd.0 is the primary.
# ceph osd primary-affinity <osd-id> <weight>
$ ceph osd primary-affinity 2 1.0
# Primary affinity defaults to 1 (the OSD may act as a primary). The valid range is 0-1, where 0 means the OSD must not be used as a primary
# and 1 means it may be; when the weight is below 1, CRUSH is less likely to select the OSD as a primary.
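For example, to make a slow osd.2 half as likely to be chosen as primary, set a value below 1 and check the result in the PRI-AFF column of ceph osd tree (the last column in the outputs above):
$ ceph osd primary-affinity 2 0.5
$ ceph osd tree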
2.21 Get the CRUSH map
# Extract the latest CRUSH map
#ceph osd getcrushmap -o {compiled-crushmap-filename}
$ ceph osd getcrushmap -o /root/crush
# Decompile the CRUSH map
# crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}
$ crushtool -d /root/crush -o /root/decompiled_crush
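For reference, an illustrative excerpt of what the decompiled text map looks like (names, ids, and weights follow the tree in 2.3; the real file also contains tunables and the remaining hosts):
# devices
device 0 osd.0 class hdd
device 1 osd.1 class hdd
device 2 osd.2 class hdd
# buckets
host node101 {
    id -5
    alg straw2
    hash 0  # rjenkins1
    item osd.0 weight 0.017
}
# rules
rule replicated_rule {
    id 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}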
2.22 Inject the CRUSH map
# Compile the CRUSH map
#crushtool -c {decompiled-crush-map-filename} -o {compiled-crush-map-filename}
$ crushtool -c /root/decompiled_crush -o /root/crush_new
# Inject the CRUSH map
# ceph osd setcrushmap -i {compiled-crushmap-filename}
$ ceph osd setcrushmap -i /root/crush_new
2.23 Stop automatic rebalancing
# When doing periodic maintenance on part of the cluster, or handling a problem in one failure domain, set noout beforehand so that taking OSDs down does not trigger CRUSH rebalancing
$ ceph osd set noout
noout is set
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_WARN
noout flag(s) set
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 3 osds: 3 up, 3 in
flags noout
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.03GiB used, 48.0GiB / 51.0GiB avail
pgs:
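A typical maintenance run combines this flag with the OSD stop/start commands from 2.4 and 2.5. A sketch for servicing the node that hosts osd.2:
$ ceph osd set noout
$ systemctl stop ceph-osd@2
# ... perform maintenance ...
$ systemctl start ceph-osd@2
$ ceph osd unset noout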
2.24 Resume automatic rebalancing
$ ceph osd unset noout
noout is unset
$ ceph -s
cluster:
id: 0c042a2d-b040-484f-935b-d4f68428b2d6
health: HEALTH_OK
services:
mon: 3 daemons, quorum node101,node102,node103
mgr: node101(active), standbys: node102, node103
osd: 3 osds: 3 up, 3 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 3.03GiB used, 48.0GiB / 51.0GiB avail
pgs:
2.25 View disk partitions
$ ceph-disk list
/dev/dm-0 other, xfs, mounted on /
/dev/dm-1 swap, swap
/dev/dm-2 other, unknown
/dev/dm-3 other, unknown
/dev/sda :
/dev/sda1 other, xfs, mounted on /boot
/dev/sda2 other, LVM2_member
/dev/sdb :
/dev/sdb1 ceph block.wal
/dev/sdb2 ceph block.db
/dev/sdb3 other, LVM2_member