CRUSH class experiment

Tags (space-separated): ceph, ceph experiments, crushmap


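The Luminous release of Ceph adds a feature called CRUSH class, which can also be described as intelligent disk grouping: it automatically associates each disk with an attribute based on its type and groups the disks accordingly. There is no need to edit the crushmap by hand, which greatly reduces manual work. To see how cumbersome the old procedure was, see: ceph crushmap

Every OSD device in Ceph can be associated with one class. By default, the device type is detected automatically when the OSD is created, and the device is assigned the matching class. There are usually three classes: hdd, ssd, and nvme.

The test environment used here has no ssd or nvme devices, so the experiment relabels some OSDs by hand and pretends they are ssd devices.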

I. Test environment

[root@node3 ~]# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
[root@node3 ~]# ceph -v
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)

II. Modifying the CRUSH class:

1. View the current cluster layout:

[root@node3 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.05878 root default
-3       0.01959     host node1
 0   hdd 0.00980         osd.0      up  1.00000 1.00000
 3   hdd 0.00980         osd.3      up  1.00000 1.00000
-5       0.01959     host node2
 1   hdd 0.00980         osd.1      up  1.00000 1.00000
 4   hdd 0.00980         osd.4      up  1.00000 1.00000
-7       0.01959     host node3
 2   hdd 0.00980         osd.2      up  1.00000 1.00000
 5   hdd 0.00980         osd.5      up  1.00000 1.00000

The second column is CLASS, and the only class present is hdd.
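On a larger cluster it can be easier to summarize the CLASS column than to read it row by row. The following is a minimal sketch, not part of the original experiment: `count_osds_by_class` is a hypothetical helper name, and it assumes the column layout shown above, where an OSD row has its osd.N name in the fourth field when a class is set and in the third field when the class is empty.

```shell
# Count OSDs per device class from `ceph osd tree` output read on stdin.
# OSD rows are recognised by a field of the form osd.N; when that field is
# the third one, the CLASS column is empty and the OSD is counted as "(none)".
count_osds_by_class() {
    awk '
        {
            for (i = 3; i <= 4 && i <= NF; i++) {
                if ($i ~ /^osd\.[0-9]+$/) {
                    cls = (i == 4) ? $2 : "(none)"
                    count[cls]++
                }
            }
        }
        END { for (c in count) print c, count[c] }
    ' | LC_ALL=C sort
}

# Demo on rows copied from the listing above.
count_osds_by_class <<'EOF'
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-3 0.01959 host node1
0 hdd 0.00980 osd.0 up 1.00000 1.00000
3 hdd 0.00980 osd.3 up 1.00000 1.00000
EOF
```

Against a live cluster the same helper would be fed with `ceph osd tree | count_osds_by_class`.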

Listing the CRUSH classes confirms that hdd is the only one:

[root@node3 ~]# ceph osd crush class ls
[
"hdd"
]

2. Remove the class from osd.0, osd.1, and osd.2:

[root@node3 ~]# for i in 0 1 2;do ceph osd crush rm-device-class osd.$i;done
done removing class of osd(s): 0
done removing class of osd(s): 1
done removing class of osd(s): 2

Run ceph osd tree again to check the class of osd.0, osd.1, and osd.2:

[root@node3 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.05878 root default
-3       0.01959     host node1
 0       0.00980         osd.0      up  1.00000 1.00000
 3   hdd 0.00980         osd.3      up  1.00000 1.00000
-5       0.01959     host node2
 1       0.00980         osd.1      up  1.00000 1.00000
 4   hdd 0.00980         osd.4      up  1.00000 1.00000
-7       0.01959     host node3
 2       0.00980         osd.2      up  1.00000 1.00000
 5   hdd 0.00980         osd.5      up  1.00000 1.00000

The class of osd.0, osd.1, and osd.2 is now empty.

3. Set the class of osd.0, osd.1, and osd.2 to ssd:

[root@node3 ~]# for i in 0 1 2;do ceph osd crush set-device-class ssd osd.$i;done
set osd(s) 0 to class 'ssd'
set osd(s) 1 to class 'ssd'
set osd(s) 2 to class 'ssd'

Run ceph osd tree again to check the class of osd.0, osd.1, and osd.2:

[root@node3 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       0.05878 root default
-3       0.01959     host node1
 3   hdd 0.00980         osd.3      up  1.00000 1.00000
 0   ssd 0.00980         osd.0      up  1.00000 1.00000
-5       0.01959     host node2
 4   hdd 0.00980         osd.4      up  1.00000 1.00000
 1   ssd 0.00980         osd.1      up  1.00000 1.00000
-7       0.01959     host node3
 5   hdd 0.00980         osd.5      up  1.00000 1.00000
 2   ssd 0.00980         osd.2      up  1.00000 1.00000

The class of osd.0, osd.1, and osd.2 is now ssd.
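Steps 2 and 3 can be folded into one small helper. This is only a sketch built from the two commands used above: the function name `relabel_osds` and the `CEPH` variable are illustrative conventions, not Ceph features. Setting `CEPH=echo` prints the commands instead of running them, which makes a dry run possible before touching the cluster.

```shell
# relabel_osds CLASS ID... : for each OSD id, remove the current device class
# and then assign CLASS, in the same order as steps 2 and 3 above.
# The command to run is taken from $CEPH and defaults to the real ceph CLI.
relabel_osds() {
    class=$1
    shift
    for id in "$@"; do
        ${CEPH:-ceph} osd crush rm-device-class "osd.$id" || return 1
        ${CEPH:-ceph} osd crush set-device-class "$class" "osd.$id" || return 1
    done
}

# Dry run: print the six commands that would relabel osd.0, osd.1 and osd.2.
CEPH=echo relabel_osds ssd 0 1 2
```

Dropping the `CEPH=echo` prefix would run the commands for real, so it is worth reading the dry-run output first.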

Check the CRUSH classes again:

[root@node3 ~]# ceph osd crush class ls
[
"hdd",
"ssd"
]

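A new class named ssd now appears in the list.

4. Create a CRUSH rule that places data only on ssd devices:

Create a rule named rule-ssd under the root named default, with host as the failure domain and ssd as the device class: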

[root@node3 ~]# ceph osd crush rule create-replicated rule-ssd default  host ssd

List the cluster's rules:

[root@node3 ~]# ceph osd crush rule ls
replicated_rule
rule-ssd

A new rule named rule-ssd now appears in the list.
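The positional arguments of create-replicated are, in order, the rule name, the CRUSH root, the failure-domain bucket type, and the device class. As a sketch of how the same pattern extends to every class in the cluster, the hypothetical helper below reads the JSON array printed by `ceph osd crush class ls` and prints one command per class. The rule-<class> naming simply follows the rule-ssd example and is an assumption; nothing is executed.

```shell
# print_class_rule_commands [ROOT] [DOMAIN]
# Read the output of `ceph osd crush class ls` on stdin and print one
# create-replicated command per class. The commands are only printed.
print_class_rule_commands() {
    root=${1:-default}
    domain=${2:-host}
    # Put each array element on its own line, then strip JSON punctuation.
    tr ',' '\n' | tr -d '[]" \t' | while read -r class; do
        [ -n "$class" ] || continue
        echo "ceph osd crush rule create-replicated rule-$class $root $domain $class"
    done
}

# Demo on the class list shown above.
print_class_rule_commands default host <<'EOF'
[
"hdd",
"ssd"
]
EOF
```

The printed lines can be reviewed and then pasted into a shell; the rule-ssd line is the command already run in this step.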

Download and decompile the cluster's crushmap with the commands below to see what changed:

[root@node3 ~]# ceph osd getcrushmap -o crushmap
20
[root@node3 ~]# crushtool -d crushmap -o crushmap
[root@node3 ~]# cat crushmap
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54

# devices
device 0 osd.0 class ssd
device 1 osd.1 class ssd
device 2 osd.2 class ssd
device 3 osd.3 class hdd
device 4 osd.4 class hdd
device 5 osd.5 class hdd

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host node1 {
id -3 # do not change unnecessarily
id -4 class hdd # do not change unnecessarily
id -9 class ssd # do not change unnecessarily
# weight 0.020
alg straw2
hash 0 # rjenkins1
item osd.0 weight 0.010
item osd.3 weight 0.010
}
host node2 {
id -5 # do not change unnecessarily
id -6 class hdd # do not change unnecessarily
id -10 class ssd # do not change unnecessarily
# weight 0.020
alg straw2
hash 0 # rjenkins1
item osd.1 weight 0.010
item osd.4 weight 0.010
}
host node3 {
id -7 # do not change unnecessarily
id -8 class hdd # do not change unnecessarily
id -11 class ssd # do not change unnecessarily
# weight 0.020
alg straw2
hash 0 # rjenkins1
item osd.2 weight 0.010
item osd.5 weight 0.010
}
root default {
id -1 # do not change unnecessarily
id -2 class hdd # do not change unnecessarily
id -12 class ssd # do not change unnecessarily
# weight 0.059
alg straw2
hash 0 # rjenkins1
item node1 weight 0.020
item node2 weight 0.020
item node3 weight 0.020
}

# rules
rule replicated_rule {
id 0
type replicated
min_size 1
max_size 10
step take default
step chooseleaf firstn 0 type host
step emit
}
rule rule-ssd {
id 1
type replicated
min_size 1
max_size 10
step take default class ssd
step chooseleaf firstn 0 type host
step emit
}

# end crush map

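Every bucket has gained an extra per-class id line for ssd: id -9, id -10, and id -11 under the three hosts, and id -12 class ssd under root default. The three relabeled devices are now listed with class ssd, and the rules section contains a new rule, rule-ssd, with id 1, whose take step is restricted to that class (step take default class ssd).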
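The per-class id lines are easy to miss in a long crushmap. The sketch below pulls them out of the decompiled text; `list_shadow_ids` is a hypothetical helper name, and the parsing assumes the bucket layout shown above (a `<type> <name> {` opening line and `id <n> class <class>` lines inside the bucket).

```shell
# Print "<bucket> <class> <id>" for every per-class id line found in a
# decompiled crushmap read on stdin.
list_shadow_ids() {
    awk '
        # A bucket opens with "<type> <name> {"; rule blocks are skipped.
        $3 == "{" && $1 != "rule" { bucket = $2 }
        $1 == "}"                 { bucket = "" }
        bucket != "" && $1 == "id" && $3 == "class" { print bucket, $4, $2 }
    '
}

# Demo on the node1 bucket from the crushmap above.
list_shadow_ids <<'EOF'
host node1 {
id -3 # do not change unnecessarily
id -4 class hdd # do not change unnecessarily
id -9 class ssd # do not change unnecessarily
alg straw2
item osd.0 weight 0.010
}
EOF
```

With the decompiled file from this step, `list_shadow_ids < crushmap` would list the ids for all four buckets.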

5. Create a pool that uses the rule-ssd rule:

[root@node3 ~]# ceph osd pool create ssdpool 64 64 rule-ssd
pool 'ssdpool' created

The details of ssdpool show crush_rule 1, which is rule-ssd:

[root@node3 ~]# ceph osd pool ls detail
pool 1 'ssdpool' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 39 flags hashpspool stripe_width 0
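When a cluster has many pools, the rule id of each one can be extracted from this output rather than read by eye. A minimal sketch, assuming the line format shown above (the pool name in single quotes as the third field, and the rule id right after the crush_rule keyword); `pool_rules` is a hypothetical helper name.

```shell
# Print "<pool name> <crush_rule id>" for each pool line of
# `ceph osd pool ls detail` read on stdin.
pool_rules() {
    awk -v q="'" '
        $1 == "pool" {
            name = $3
            gsub(q, "", name)   # strip the single quotes around the pool name
            for (i = 4; i < NF; i++)
                if ($i == "crush_rule") print name, $(i + 1)
        }
    '
}

# Demo on the line shown above.
echo "pool 1 'ssdpool' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 39 flags hashpspool stripe_width 0" | pool_rules
```

On a live cluster this would be `ceph osd pool ls detail | pool_rules`. An existing pool can, as far as I know, be moved to the new rule with `ceph osd pool set <pool> crush_rule rule-ssd`, which was not tried in this experiment.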

6. Create an object to test ssdpool:

Create an object named test and put it into ssdpool:

[root@node3 ~]# rados -p ssdpool ls
[root@node3 ~]# echo "hahah" >test.txt
[root@node3 ~]# rados -p ssdpool put test test.txt
[root@node3 ~]# rados -p ssdpool ls
test

Look up the OSD set for that object:

[root@node3 ~]# ceph osd map ssdpool test
osdmap e46 pool 'ssdpool' (1) object 'test' -> pg 1.40e8aab5 (1.35) -> up ([1,2,0], p1) acting ([1,2,0], p1)

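The object's OSD set (osd.1, osd.2, and osd.0) consists entirely of the devices labeled ssd, so the rule works as intended. A CRUSH class is in effect a label that identifies the disk type.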
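The same check can be scripted instead of compared by eye. The sketch below assumes the `ceph osd map` line format shown above; `acting_set` and `all_in_class` are hypothetical helper names, and the list of ssd OSD ids is passed in by hand.

```shell
# Extract the acting set from one line of `ceph osd map` output,
# printing one OSD id per line.
acting_set() {
    sed -n 's/.*acting (\[\([0-9,]*\)].*/\1/p' | tr ',' '\n'
}

# all_in_class "ID ID ..." : read OSD ids on stdin and succeed only if every
# one of them appears in the whitespace-separated list given as $1.
all_in_class() {
    list=" $(echo $1) "      # normalise spaces and newlines to single spaces
    while read -r id; do
        case "$list" in
            *" $id "*) ;;
            *) echo "osd.$id is outside the class" >&2; return 1 ;;
        esac
    done
}

# Demo on the mapping shown above, with osd.0, osd.1 and osd.2 as the ssd set.
echo "osdmap e46 pool 'ssdpool' (1) object 'test' -> pg 1.40e8aab5 (1.35) -> up ([1,2,0], p1) acting ([1,2,0], p1)" \
    | acting_set | all_in_class "0 1 2" && echo "all replicas are on ssd OSDs"
```

On a live cluster the id list could come from `ceph osd crush class ls-osd ssd`, assuming that subcommand is available in the installed release.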

III. References:

  1. Ceph Luminous new feature: intelligent disk grouping
  2. CRUSH MAPS
  3. ceph crushmap