Ceph OSD Basic Operations

1 Introduction

OSD, short for Object Storage Device, is a role in a Ceph cluster: the daemon that answers client requests and returns the actual data. A Ceph cluster usually runs multiple OSDs.

This article covers the basic operational commands for OSDs.

2 Common Operations

2.1 View OSD Status

$ ceph osd stat
3 osds: 3 up, 3 in

Status legend:

● in: inside the cluster

● out: outside the cluster

● up: the OSD is running

● down: the OSD is not running

Notes:

If an OSD is running, it can be either in or out of the cluster. If it was in but has recently gone out, Ceph migrates its placement groups to other OSDs.

If an OSD is out, CRUSH no longer assigns placement groups to it. If it is down, it should normally be out as well.

If an OSD is down but still in, something is wrong and the cluster is not healthy.
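
A quick way to locate such an OSD (a minimal sketch using standard status commands):

#Show detailed health information, including which OSDs are reported down
$ ceph health detail

#Filter the OSD tree for entries marked down
$ ceph osd tree | grep down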

2.2 View the OSD Map

$ ceph osd dump
epoch 64
fsid 0c042a2d-b040-484f-935b-d4f68428b2d6
created 2021-12-28 18:24:46.492648
modified 2021-12-29 16:43:51.895508
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 19
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client jewel
min_compat_client jewel
require_osd_release luminous
max_osd 10
osd.0 up   in  weight 1 up_from 37 up_thru 0 down_at 36 last_clean_interval [9,35) 192.168.8.101:9601/24028 192.168.8.101:9602/24028 192.168.8.101:9603/24028 192.168.8.101:9604/24028 exists,up 33afe59f-a8f2-44a0-b479-64837183bbd6
osd.1 up   in  weight 1 up_from 6 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.8.102:9600/20844 192.168.8.102:9601/20844 192.168.8.102:9602/20844 192.168.8.102:9603/20844 exists,up ecf9d3c8-6f2f-42ac-bf8b-6cefcc44cdd0
osd.2 up   in  weight 1 up_from 13 up_thru 0 down_at 0 last_clean_interval [0,0) 192.168.8.103:9601/32806 192.168.8.103:9602/32806 192.168.8.103:9603/32806 192.168.8.103:9604/32806 exists,up dde6a9b4-b247-433a-bb29-8355a45a1fb1

2.3 View the OSD Tree

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       0.01659     host node103
 2   hdd 0.01659         osd.2        up  1.00000 1.00000

2.4 Take an OSD Down

#Take osd.2 down by stopping its daemon; it stops serving read/write requests, but it is still part of the cluster (in)

$ systemctl stop ceph-osd@2

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       0.01659     host node103
 2   hdd 0.01659         osd.2      down  1.00000 1.00000
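
Besides stopping the daemon, the monitors can also be told to mark an OSD down in the osdmap directly (a sketch; a daemon that keeps running will normally mark itself up again shortly afterwards unless it is stopped or the noup flag is set):

#Mark osd.2 down in the osdmap without touching the daemon
$ ceph osd down 2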

2.5 Bring an OSD Back Up

#Bring osd.2 back up so it serves read/write requests again

$ systemctl start ceph-osd@2

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       0.01659     host node103
 2   hdd 0.01659         osd.2        up  1.00000 1.00000

2.6 Mark an OSD Out of the Cluster

#Mark osd.2 out of the cluster, i.e. take it out of service so it can be maintained

$ ceph osd out 2
marked out osd.2.

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       0.01659     host node103
 2   hdd 0.01659         osd.2        up        0 1.00000
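
Marking an OSD out sets its reweight to 0 and CRUSH starts remapping its placement groups to other OSDs; the resulting recovery and backfill can be watched, for example:

#Follow cluster status changes while data migrates (Ctrl-C to stop)
$ ceph -w

#Or take a one-off look at the summary
$ ceph -s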

2.7 Mark an OSD Back In

#Mark osd.2 back into the cluster

$ ceph osd in 2
marked in osd.2.

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       0.01659     host node103
 2   hdd 0.01659         osd.2        up  1.00000 1.00000

2.8 Add a New OSD to the Cluster

2.8.1 Add a BlueStore OSD

#Partition the disk into three partitions, used as WAL, DB, and data respectively.

$ parted -s /dev/sdb mklabel gpt

$ parted -s /dev/sdb mkpart primary 2048s 4196351s

$ parted -s /dev/sdb mkpart primary 4196352s 69208063s

$ parted -s /dev/sdb mkpart primary 69208064s 100%

#Check the disk partitions

$ lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sdb               8:16   0   50G  0 disk
├─sdb2            8:18   0   31G  0 part
├─sdb3            8:19   0   17G  0 part
└─sdb1            8:17   0    2G  0 part
sr0              11:0    1 1024M  0 rom
sdc               8:32   0   50G  0 disk
sda               8:0    0   50G  0 disk
├─sda2            8:2    0   49G  0 part
│ ├─centos-swap 253:1    0  3.9G  0 lvm  [SWAP]
│ └─centos-root 253:0    0 45.1G  0 lvm  /
└─sda1            8:1    0    1G  0 part /boot

#Use ceph-volume to add a BlueStore OSD to the cluster.

$ ceph-volume lvm prepare --block.db /dev/sdc2 --block.wal /dev/sdc1 --data /dev/sdc3
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 6a81b934-feb1-4fce-8ad5-ed0a5f203f40
Running command: vgcreate --force --yes ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0 /dev/sdc3
 stdout: Physical volume "/dev/sdc3" successfully created.
 stdout: Volume group "ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0" successfully created
Running command: lvcreate --yes -l 100%FREE -n osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40 ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0
 stdout: Logical volume "osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-3
Running command: restorecon /var/lib/ceph/osd/ceph-3
Running command: chown -h ceph:ceph /dev/ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0/osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40
Running command: chown -R ceph:ceph /dev/dm-3
Running command: ln -s /dev/ceph-571c57b6-0ef7-44b1-ad46-16f66d92acf0/osd-block-6a81b934-feb1-4fce-8ad5-ed0a5f203f40 /var/lib/ceph/osd/ceph-3/block
Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
 stderr: got monmap epoch 1
 stderr:
Running command: ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQCHDMxhByjSKRAA8pNSa1wpOEccKHaiilCj+g==
 stdout: creating /var/lib/ceph/osd/ceph-3/keyring
 stdout: added entity osd.3 auth auth(auid = 18446744073709551615 key=AQCHDMxhByjSKRAA8pNSa1wpOEccKHaiilCj+g== with 0 caps)
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
Running command: chown -R ceph:ceph /dev/sdc1
Running command: chown -R ceph:ceph /dev/sdc2
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --keyfile - --bluestore-block-wal-path /dev/sdc1 --bluestore-block-db-path /dev/sdc2 --osd-data /var/lib/ceph/osd/ceph-3/ --osd-uuid 6a81b934-feb1-4fce-8ad5-ed0a5f203f40 --setuser ceph --setgroup ceph
--> ceph-volume lvm prepare successful for: /dev/sdc3

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       0.01659     host node103
 2   hdd 0.01659         osd.2        up  1.00000 1.00000
 3             0 osd.3              down        0 1.00000


$ ceph -s
  cluster:
    id:     0c042a2d-b040-484f-935b-d4f68428b2d6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node101,node102,node103
    mgr: node101(active), standbys: node102, node103
    osd: 4 osds: 3 up, 3 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   3.01GiB used, 48.0GiB / 51.0GiB avail
    pgs:



$ ceph osd crush add osd.3 1.00000 host=node103
add item id 3 name 'osd.3' weight 1 at location {host=node103} to crush map

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       1.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       1.01659     host node103
 3       1.00000         osd.3      down        0 1.00000
 2   hdd 0.01659         osd.2        up  1.00000 1.00000

$ ceph osd in osd.3
marked in osd.3.

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       1.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       1.01659     host node103
 3       1.00000         osd.3      down  1.00000 1.00000
 2   hdd 0.01659         osd.2        up  1.00000 1.00000

$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
   Active: inactive (dead)
   
$ systemctl start ceph-osd@3

$ systemctl enable ceph-osd@3

$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2021-12-29 15:30:04 CST; 15s ago
 Main PID: 3368700 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@3.service
           └─3368700 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph

Dec 29 15:30:04 node103 systemd[1]: Starting Ceph object storage daemon osd.3...
Dec 29 15:30:04 node103 systemd[1]: Started Ceph object storage daemon osd.3.
Dec 29 15:30:04 node103 ceph-osd[3368700]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 15:30:05 node103 ceph-osd[3368700]: 2021-12-29 15:30:05.958147 7f6eabf16d80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 15:30:06 node103 ceph-osd[3368700]: 2021-12-29 15:30:06.848849 7f6e922f1700 -1 osd.3 0 waiting for initial osdmap

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       1.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       1.01659     host node103
 2   hdd 0.01659         osd.2        up  1.00000 1.00000
 3   hdd 1.00000         osd.3        up  1.00000 1.00000
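
The walkthrough above uses ceph-volume lvm prepare and then adds and starts the OSD by hand. As an alternative (a sketch, not part of the original run), ceph-volume can activate a prepared OSD itself, or do prepare and activate in a single step:

#List OSDs prepared on this node together with their fsids
$ ceph-volume lvm list

#Activate the prepared OSD (the id and fsid come from the prepare output above)
$ ceph-volume lvm activate 3 6a81b934-feb1-4fce-8ad5-ed0a5f203f40

#Or prepare and activate in one step
$ ceph-volume lvm create --bluestore --data /dev/sdc3 --block.db /dev/sdc2 --block.wal /dev/sdc1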


2.8.2 Add a FileStore OSD

$ parted -s /dev/sdc mklabel gpt
$ parted -s /dev/sdc mkpart primary 1M 10240M
$ parted -s /dev/sdc mkpart primary 10241M 100%
$ ceph-volume lvm prepare --filestore --data /dev/sdc2 --journal /dev/sdc1

Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new ca2b837d-901d-4ce2-940b-fc278364b889
Running command: vgcreate --force --yes ceph-848a4efc-ed30-47f3-b64a-6229cfa35060 /dev/sdc2
 stdout: Volume group "ceph-848a4efc-ed30-47f3-b64a-6229cfa35060" successfully created
Running command: lvcreate --yes -l 100%FREE -n osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 ceph-848a4efc-ed30-47f3-b64a-6229cfa35060
 stdout: Wiping xfs signature on /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889.
 stdout: Logical volume "osd-data-ca2b837d-901d-4ce2-940b-fc278364b889" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: mkfs -t xfs -f -i size=2048 /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889
 stdout: meta-data=/dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 isize=2048   agcount=4, agsize=2651392 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=10605568, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=5178, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Running command: mount -t xfs -o rw,noatime,inode64 /dev/ceph-848a4efc-ed30-47f3-b64a-6229cfa35060/osd-data-ca2b837d-901d-4ce2-940b-fc278364b889 /var/lib/ceph/osd/ceph-3
Running command: restorecon /var/lib/ceph/osd/ceph-3
Running command: chown -R ceph:ceph /dev/sdc1
Running command: ln -s /dev/sdc1 /var/lib/ceph/osd/ceph-3/journal
Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
 stderr: got monmap epoch 1
 stderr:
Running command: chown -h ceph:ceph /var/lib/ceph/osd/ceph-3/journal
Running command: chown -R ceph:ceph /dev/sdc1
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore filestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --osd-data /var/lib/ceph/osd/ceph-3/ --osd-journal /var/lib/ceph/osd/ceph-3/journal --osd-uuid ca2b837d-901d-4ce2-940b-fc278364b889 --setuser ceph --setgroup ceph
 stderr: 2021-12-29 17:28:26.696005 7fdf5cceed80 -1 journal check: ondisk fsid ff68b4ad-06af-4dd2-8369-aac3da4abcc3 doesn't match expected ca2b837d-901d-4ce2-940b-fc278364b889, invalid (someone else's?) journal
 stderr: 2021-12-29 17:28:26.902358 7fdf5cceed80 -1 journal do_read_entry(4096): bad header magic
 stderr: 2021-12-29 17:28:26.902401 7fdf5cceed80 -1 journal do_read_entry(4096): bad header magic
 stderr: 2021-12-29 17:28:26.904367 7fdf5cceed80 -1 read_settings error reading settings: (2) No such file or directory
 stderr: 2021-12-29 17:28:27.171320 7fdf5cceed80 -1 created object store /var/lib/ceph/osd/ceph-3/ for osd.3 fsid 0c042a2d-b040-484f-935b-d4f68428b2d6
Running command: ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQAlKsxhXaKLBRAASmGMnMaltRwMIOZ9s/NN6w==
 stdout: creating /var/lib/ceph/osd/ceph-3/keyring
added entity osd.3 auth auth(auid = 18446744073709551615 key=AQAlKsxhXaKLBRAASmGMnMaltRwMIOZ9s/NN6w== with 0 caps)
Running command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
--> ceph-volume lvm prepare successful for: /dev/sdc2

$ ceph -s
  cluster:
    id:     0c042a2d-b040-484f-935b-d4f68428b2d6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node101,node102,node103
    mgr: node101(active), standbys: node102, node103
    osd: 4 osds: 3 up, 3 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   3.03GiB used, 48.0GiB / 51.0GiB avail
    pgs:

$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       0.01659     host node103
 2   hdd 0.01659         osd.2        up  1.00000 1.00000
 3             0 osd.3              down        0 1.00000

$ ceph osd crush add osd.3 1.00000 host=node103
add item id 3 name 'osd.3' weight 1 at location {host=node103} to crush map
$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       1.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       1.01659     host node103
 3       1.00000         osd.3      down        0 1.00000
 2   hdd 0.01659         osd.2        up  1.00000 1.00000
$ ceph osd in osd.3
marked in osd.3.
$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       1.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       1.01659     host node103
 3       1.00000         osd.3      down  1.00000 1.00000
 2   hdd 0.01659         osd.2        up  1.00000 1.00000
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2021-12-29 17:24:23 CST; 6min ago
  Process: 3665352 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
  Process: 3665311 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 3665352 (code=exited, status=0/SUCCESS)

Dec 29 17:19:47 node103 ceph-osd[3665352]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.875231 7f957470fd80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.875549 7f957470fd80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:19:47 node103 ceph-osd[3665352]: 2021-12-29 17:19:47.976355 7f957470fd80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 17:19:50 node103 ceph-osd[3665352]: 2021-12-29 17:19:50.246598 7f95562e1700 -1 osd.3 0 waiting for initial osdmap
Dec 29 17:24:20 node103 systemd[1]: Stopping Ceph object storage daemon osd.3...
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.051448 7f954c2cd700 -1 received  signal: Terminated from  PID: 1 task name: /usr/lib/systemd/systemd --switched-root --system --deserialize 22  UID: 0
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.051526 7f954c2cd700 -1 osd.3 68 *** Got signal Terminated ***
Dec 29 17:24:20 node103 ceph-osd[3665352]: 2021-12-29 17:24:20.849249 7f954c2cd700 -1 osd.3 68 shutdown
Dec 29 17:24:23 node103 systemd[1]: Stopped Ceph object storage daemon osd.3.
$ systemctl start ceph-osd@3
$ systemctl enable ceph-osd@3
$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2021-12-29 17:31:07 CST; 9s ago
 Main PID: 3696217 (ceph-osd)
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@3.service
           └─3696217 /usr/bin/ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph

Dec 29 17:31:07 node103 systemd[1]: Starting Ceph object storage daemon osd.3...
Dec 29 17:31:07 node103 systemd[1]: Started Ceph object storage daemon osd.3.
Dec 29 17:31:07 node103 ceph-osd[3696217]: starting osd.3 at - osd_data /var/lib/ceph/osd/ceph-3 /var/lib/ceph/osd/ceph-3/journal
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.147135 7f6f8c407d80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.147178 7f6f8c407d80 -1 journal do_read_entry(8192): bad header magic
Dec 29 17:31:08 node103 ceph-osd[3696217]: 2021-12-29 17:31:08.185153 7f6f8c407d80 -1 osd.3 0 log_to_monitors {default=true}
Dec 29 17:31:09 node103 ceph-osd[3696217]: 2021-12-29 17:31:09.465673 7f6f6dfd9700 -1 osd.3 0 waiting for initial osdmap
$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       1.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       1.01659     host node103
 2   hdd 0.01659         osd.2        up  1.00000 1.00000
 3   hdd 1.00000         osd.3        up  1.00000 1.00000

$ ceph -s
  cluster:
    id:     0c042a2d-b040-484f-935b-d4f68428b2d6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node101,node102,node103
    mgr: node101(active), standbys: node102, node103
    osd: 4 osds: 4 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   3.13GiB used, 88.3GiB / 91.4GiB avail
    pgs:
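
As with BlueStore, prepare and activate can also be combined into one command for FileStore (a sketch under the same assumptions as above):

#Prepare and activate a FileStore OSD in a single step
$ ceph-volume lvm create --filestore --data /dev/sdc2 --journal /dev/sdc1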

2.9 Remove an OSD from the Cluster

2.9.1 Remove a BlueStore OSD

$ systemctl stop ceph-osd@3

$ systemctl status ceph-osd@3
● ceph-osd@3.service - Ceph object storage daemon osd.3
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Wed 2021-12-29 16:35:12 CST; 3s ago
  Process: 3541909 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
  Process: 3541893 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 3541909 (code=exited, status=0/SUCCESS)
 
$ ceph osd crush remove osd.3
removed item id 3 name 'osd.3' from crush map

$ ceph osd out 3
marked out osd.3.

$ ceph osd rm osd.3
removed osd.3

$ ceph auth del osd.3
updated
$ ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.04976 root default
-5       0.01659     host node101
 0   hdd 0.01659         osd.0        up  1.00000 1.00000
-3       0.01659     host node102
 1   hdd 0.01659         osd.1        up  1.00000 1.00000
-7       0.01659     host node103
 2   hdd 0.01659         osd.2        up  1.00000 1.00000

$ ceph -s
  cluster:
    id:     0c042a2d-b040-484f-935b-d4f68428b2d6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node101,node102,node103
    mgr: node101(active), standbys: node102, node103
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   3.02GiB used, 48.0GiB / 51.0GiB avail
    pgs:
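
On Luminous and later, the crush remove / out / rm / auth del sequence can also be collapsed into a single purge, and ceph-volume can wipe the backing device afterwards (a sketch, not part of the original run; /dev/sdc is the device used in 2.8.1):

#Remove osd.3 from the CRUSH map, delete its auth key and remove it from the osdmap in one step
$ ceph osd purge 3 --yes-i-really-mean-it

#Wipe the backing device so it can be reused for a new OSD
$ ceph-volume lvm zap /dev/sdc --destroy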

2.9.2 Remove a FileStore OSD

The procedure is the same as in 2.9.1 (removing a BlueStore OSD).

2.10 View the Maximum OSD Count

#Show the current max_osd value; in this cluster it is 4

$ ceph osd getmaxosd
max_osd = 4 in epoch 63

2.11 Set the Maximum OSD Count

#Set max_osd; this value must be raised when the cluster grows beyond it

$ ceph osd setmaxosd 10
set new max_osd = 10

$ ceph osd getmaxosd
max_osd = 10 in epoch 64

2.12 Set an OSD's CRUSH Weight

Syntax: ceph osd crush set {id} {weight} [{loc1} [{loc2} ...]]

$ ceph osd crush set 3 3.0 host=node4
#or
$ ceph osd crush reweight osd.3 1.0

2.13 Set an OSD's Reweight

Syntax: ceph osd reweight {id} {weight}

$ ceph osd reweight 3 0.5

2.14 Pause the OSDs

#After pausing, the whole cluster stops serving client reads and writes

$ ceph osd pause

2.15 Unpause the OSDs

#After unpausing, the cluster serves client I/O again

$ ceph osd unpause

2.16 View an OSD's Configuration

#Show the configuration parameters of a specific OSD (run on the node that hosts it)

$ ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config show | less
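
The same information is also available through the ceph daemon interface on the node that hosts the OSD, which additionally lets you read a single option (osd_max_backfills below is only an illustrative option name):

#Show all runtime options of osd.2
$ ceph daemon osd.2 config show | less

#Read a single option
$ ceph daemon osd.2 config get osd_max_backfills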

2.17 OSD Flapping

#We recommend deploying both a public (front-side) network and a cluster (back-side) network so that object replication has enough capacity.
#However, if the cluster (back-side) network fails or shows significant latency while the public (front-side) network works fine, OSDs handle this poorly:
#they report their peers as down to the monitors while reporting themselves as up. We call this situation flapping.
#If something causes OSDs to flap (repeatedly being marked down and then up again), you can force the monitors to stop these state changes. This is mainly used while OSDs are flapping.
 
$ ceph osd set noup      # prevent OSDs from getting marked up
$ ceph osd set nodown    # prevent OSDs from getting marked down
 
#These flags are recorded in the osdmap:
ceph osd dump | grep flags
flags no-up,no-down
 
#The flags can be cleared with:
ceph osd unset noup
ceph osd unset nodown

2.18 Change OSD Parameters at Runtime

#Inject a parameter into all OSDs; the change is lost on restart, so also persist it in the configuration file (see the sketch below)

$ ceph tell osd.* injectargs "--rbd_default_format 2 "   
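
To persist a setting across restarts, also write it into /etc/ceph/ceph.conf on the relevant nodes. A minimal sketch, using osd_max_backfills purely as an illustrative option (it is not the option from the example above):

#/etc/ceph/ceph.conf (excerpt)
[osd]
osd_max_backfills = 1

#The same option can be changed at runtime with injectargs
$ ceph tell osd.* injectargs '--osd_max_backfills 1'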

2.19 Check OSD Latency

This is mainly used to spot a problem with a single disk; an OSD that shows abnormal latency should be removed promptly. The reported values are averages.
fs_commit_latency is the interval from receiving a request until it reaches the commit state (shown as commit_latency(ms) in the output below).
fs_apply_latency is the interval from receiving a request until it reaches the apply state (shown as apply_latency(ms) in the output below).

 
$ ceph osd perf
osd commit_latency(ms) apply_latency(ms)
  2                  0                 0
  0                  0                 0
  1                  0                 0

2.20 Primary Affinity

When Ceph clients read or write data, a given OSD may be less suitable than its peers to act as the primary OSD (for example, its disk or controller is slow). To maximize hardware utilization and avoid performance bottlenecks (especially for reads),
you can lower that OSD's primary affinity so that CRUSH is less likely to use it as the primary OSD of an acting set.

Note: in the acting set [0, 1, 2], osd.0 is the primary.

#ceph osd primary-affinity <osd-id> <weight>   
 
$ ceph osd primary-affinity 2 1.0

#Primary affinity defaults to 1 (the OSD may act as a primary). The valid range is 0-1: 0 means the OSD will not be used as a primary,
#1 means it may be; with a value below 1, CRUSH is less likely to select it as the primary.

2.21 Export the CRUSH Map

#Export the latest (compiled) CRUSH map
#ceph osd getcrushmap -o {compiled-crushmap-filename}
 
$ ceph osd getcrushmap -o /root/crush
 
#Decompile the CRUSH map
# crushtool -d {compiled-crushmap-filename} -o {decompiled-crushmap-filename}

$ crushtool -d /root/crush -o /root/decompiled_crush

2.22 Inject a CRUSH Map

#Compile the CRUSH map
#crushtool -c {decompiled-crush-map-filename} -o {compiled-crush-map-filename}
 
$ crushtool -c /root/decompiled_crush -o /root/crush_new

#Inject the CRUSH map
# ceph osd setcrushmap -i {compiled-crushmap-filename}

$ ceph osd setcrushmap -i /root/crush_new
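
Before injecting a modified map, it can be sanity-checked offline with crushtool's test mode (a sketch; rule id 0 and 3 replicas are assumptions, adjust them to your pools):

#Simulate placements against the compiled map before injecting it
$ crushtool -i /root/crush_new --test --show-statistics --rule 0 --num-rep 3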

2.23 Disable Automatic Rebalancing

#When doing scheduled maintenance on part of the cluster, or fixing a problem within one failure domain, set noout in advance so that CRUSH does not automatically rebalance data while the affected OSDs are down

$ ceph osd set noout
noout is set

$ ceph -s
  cluster:
    id:     0c042a2d-b040-484f-935b-d4f68428b2d6
    health: HEALTH_WARN
            noout flag(s) set

  services:
    mon: 3 daemons, quorum node101,node102,node103
    mgr: node101(active), standbys: node102, node103
    osd: 3 osds: 3 up, 3 in
         flags noout

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   3.03GiB used, 48.0GiB / 51.0GiB avail
    pgs:

2.24 Re-enable Automatic Rebalancing

$ ceph osd unset noout
noout is unset

$ ceph -s
  cluster:
    id:     0c042a2d-b040-484f-935b-d4f68428b2d6
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node101,node102,node103
    mgr: node101(active), standbys: node102, node103
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   3.03GiB used, 48.0GiB / 51.0GiB avail
    pgs:

2.25 Check Disk Partitions

$ ceph-disk list

/dev/dm-0 other, xfs, mounted on /
/dev/dm-1 swap, swap
/dev/dm-2 other, unknown
/dev/dm-3 other, unknown
/dev/sda :
 /dev/sda1 other, xfs, mounted on /boot
 /dev/sda2 other, LVM2_member
/dev/sdb :
 /dev/sdb1 ceph block.wal
 /dev/sdb2 ceph block.db
 /dev/sdb3 other, LVM2_member
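
ceph-disk is deprecated in newer Ceph releases; on clusters deployed with ceph-volume, roughly the same information can be obtained as follows (a sketch):

#Show OSDs created with ceph-volume and the devices backing them
$ ceph-volume lvm list

#Generic view of block devices and partitions
$ lsblk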