I. Ceph Components
1. OSD (Object Storage Daemon)
Function: Ceph OSDs (object storage daemons, ceph-osd) provide the data storage; each data disk on a host is served by one OSD daemon. OSDs handle data replication, recovery and rebalancing for the cluster, and provide monitoring information to the Ceph Monitors and Managers by checking the heartbeats of other OSD daemons. At least 3 OSDs are needed for redundancy and high availability.
2. Mon (Monitor): the Ceph monitor
Function: a daemon that maintains maps of the cluster state, such as how many pools the cluster has, how many PGs each pool has, and the pool-to-PG mappings. A Ceph cluster needs at least one Mon (usually an odd number: 1, 3, 5, 7, ...). The critical cluster state that Ceph daemons need to coordinate with each other includes the monitor map, the manager map, the OSD map, the MDS map and the CRUSH map.
3. Mgr (Manager)
Function: a daemon responsible for keeping track of runtime metrics and the current state of the Ceph cluster, including storage utilization, current performance metrics and system load. It also hosts Python-based modules that manage and expose Ceph cluster information, including the web-based Ceph Dashboard and a REST API. At least two managers are required for high availability.
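These daemons can be checked quickly from any host that can reach the cluster; a minimal sketch, assuming a cluster like the one deployed in part III below is already running and the host has /etc/ceph/ceph.conf plus an admin keyring:
ceph -s               # overall status: mon quorum, mgr, OSD counts, PGs
ceph health detail    # details for any HEALTH_WARN / HEALTH_ERR condition
ceph mon stat         # monitor quorum membership
ceph osd tree         # OSD-to-host mapping and up/down state
ceph mgr dump | head  # active and standby manager daemons (JSON)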
II. Ceph Data Read and Write Flow
- Compute the file-to-object mapping and obtain the oid (object id) = ino + ono:
  - ino: inode number (INO), the serial number of the file's metadata and the file's unique id
  - ono: object number (ONO), the sequence number of one of the objects the file is split into; the default chunk size is 4M
- Compute the PG inside the pool that the object maps to with a hash algorithm:
  Object --> PG mapping: hash(oid) & mask --> pgid
- Map the PG to OSDs with the CRUSH algorithm:
  PG --> OSD mapping: CRUSH(pgid) -> (osd1, osd2, osd3)
- The primary OSD of the PG writes the object to disk
- The primary OSD replicates the data to the backup OSDs and waits for their acknowledgements
- The primary OSD reports write completion back to the client.
Notes:
Pool: a storage pool (partition); its size depends on the capacity of the underlying storage.
PG (placement group): a pool contains multiple PGs. Both pools and PGs are abstract, logical concepts; the number of PGs in a pool can be estimated with a formula (see the sketch after these notes).
OSD (Object Storage Daemon): each data disk is one OSD; a host runs one or more OSDs.
After the Ceph cluster is deployed, a storage pool has to be created before data can be written to Ceph. Before a file is stored, its placement is computed with consistent hashing, which assigns the file to a PG; the file therefore always belongs to exactly one PG of one pool, and the PG is in turn stored on OSDs. A data object is written to the primary OSD first and then replicated to the secondary OSDs for high availability.
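The notes above say the PG count of a pool can be estimated with a formula; a small sketch of the commonly cited rule of thumb (total PGs across all pools ≈ OSDs × 100 / replica count, rounded to a power of two), using the 20 OSDs and 3 replicas of the lab cluster below as example values:
# Rule-of-thumb PG sizing; the result is a cluster-wide total to divide among the pools.
osds=20        # the lab cluster below ends up with 20 OSDs
replicas=3     # default replica count (osd_pool_default_size)
target=$(( osds * 100 / replicas ))              # = 666
pg_num=1
while [ $(( pg_num * 2 )) -le "$target" ]; do    # round down to a power of two
    pg_num=$(( pg_num * 2 ))
done
echo "suggested total pg_num: $pg_num"           # prints 512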
III. Deploying the Ceph Cluster
| Server role | OS version | IP addresses (public / cluster) | Base config and disks |
| --- | --- | --- | --- |
| ceph-deploy | Ubuntu 18.04 | 10.0.0.100 / 192.168.0.100 | 2C2G / 120G |
| ceph-mon1 | Ubuntu 18.04 | 10.0.0.101 / 192.168.0.101 | 2C2G / 120G |
| ceph-mon2 | Ubuntu 18.04 | 10.0.0.102 / 192.168.0.102 | 2C2G / 120G |
| ceph-mon3 | Ubuntu 18.04 | 10.0.0.103 / 192.168.0.103 | 2C2G / 120G |
| ceph-mgr1 | Ubuntu 18.04 | 10.0.0.104 / 192.168.0.104 | 2C2G / 120G |
| ceph-mgr2 | Ubuntu 18.04 | 10.0.0.105 / 192.168.0.105 | 2C2G / 120G |
| ceph-node1 | Ubuntu 18.04 | 10.0.0.106 / 192.168.0.106 | 2C2G / 120G + 5×100G |
| ceph-node2 | Ubuntu 18.04 | 10.0.0.107 / 192.168.0.107 | 2C2G / 120G + 5×100G |
| ceph-node3 | Ubuntu 18.04 | 10.0.0.108 / 192.168.0.108 | 2C2G / 120G + 5×100G |
| ceph-node4 | Ubuntu 18.04 | 10.0.0.109 / 192.168.0.109 | 2C2G / 120G + 5×100G |
Environment overview:
1. One server is used to deploy the Ceph cluster, i.e. it runs ceph-deploy; it can also be co-located with ceph-mgr and other roles.
10.0.0.100/192.168.0.100
2. Three servers act as the Ceph Mon (monitor) nodes; each of them can reach the cluster network of the Ceph cluster.
10.0.0.101/192.168.0.101
10.0.0.102/192.168.0.102
10.0.0.103/192.168.0.103
3. Two ceph-mgr managers, which can reach the cluster network of the Ceph cluster.
10.0.0.104/192.168.0.104
10.0.0.105/192.168.0.105
4. Four servers act as the Ceph OSD storage nodes. Each storage server has two networks: the public network serves client traffic, and the cluster network is used for cluster management and data replication. Each node has at least 3 data disks.
10.0.0.106/192.168.0.106
10.0.0.107/192.168.0.107
10.0.0.108/192.168.0.108
10.0.0.109/192.168.0.109
# Disk layout on each storage server:
/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf    # 100G each
5. Create a regular user that can run privileged commands via sudo, and configure host name resolution; each host gets a distinct host name during deployment. On CentOS the firewall and SELinux must also be disabled on every server (see the sketch below).
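For the CentOS-only step just mentioned, a minimal sketch, assuming firewalld is the active firewall and the standard SELinux configuration path:
# Run on each CentOS node (not needed on Ubuntu)
systemctl disable --now firewalld                                      # stop and disable the firewall
setenforce 0                                                           # SELinux permissive for the running system
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config    # persist across reboots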
-
Basic configuration of the Ubuntu server systems
1. Change the host name
# cat /etc/hostname
Ubuntu1804
# hostnamectl set-hostname ceph-deploy.example.local    #hostnamectl set-hostname <new host name>
# cat /etc/hostname
ceph-deploy.example.local
2. Rename the network interfaces to eth*
Method 1: pass the kernel parameters net.ifnames=0 biosdevname=0 when installing Ubuntu.
Method 2: if the kernel parameters were not passed at install time, the interface names can be changed afterwards as shown below (a reboot of the Ubuntu system is required):
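A minimal sketch of method 2, assuming GRUB is the boot loader; the interface names referenced in /etc/netplan/*.yaml also have to be changed to eth* before rebooting:
# Append the parameters to the GRUB kernel command line and regenerate the config
sudo sed -i 's/^GRUB_CMDLINE_LINUX="/&net.ifnames=0 biosdevname=0 /' /etc/default/grub
sudo update-grub
sudo reboot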
3. Allow root to log in remotely
By default Ubuntu does not allow root to log in over SSH; set a root password and edit /etc/ssh/sshd_config:
$ sudo vim /etc/ssh/sshd_config
#PermitRootLogin prohibit-password
PermitRootLogin yes    # allow root login
#UseDNS no
UseDNS no              # disable DNS lookups of client addresses
$ sudo su - root       # switch to the root user
# passwd               # set the root password
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
# systemctl restart sshd    # restart the ssh service
4. Configure the network interfaces on each node, for example on ceph-deploy:
root@ceph-deploy:~# cat /etc/netplan/01-netcfg.yaml # This file describes the network interfaces available on your system # For more information, see netplan(5). network: version: 2 renderer: networkd ethernets: eth0: dhcp4: no dhcp6: no addresses: [10.0.0.100/24] gateway4: 10.0.0.2 nameservers: addresses: [10.0.0.2, 114.114.114.114, 8.8.8.8] eth1: dhcp4: no dhcp6: no addresses: [192.168.0.100/24] root@ceph-deploy:~# ifconfig eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 10.0.0.100 netmask 255.255.255.0 broadcast 10.0.0.255 inet6 fe80::20c:29ff:fe65:a300 prefixlen 64 scopeid 0x20<link> ether 00:0c:29:65:a3:00 txqueuelen 1000 (Ethernet) RX packets 2057 bytes 172838 (172.8 KB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 1575 bytes 221983 (221.9 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 inet 192.168.0.100 netmask 255.255.255.0 broadcast 192.168.0.255 inet6 fe80::20c:29ff:fe65:a30a prefixlen 64 scopeid 0x20<link> ether 00:0c:29:65:a3:0a txqueuelen 1000 (Ethernet) RX packets 2 bytes 486 (486.0 B) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 14 bytes 1076 (1.0 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536 inet 127.0.0.1 netmask 255.0.0.0 inet6 ::1 prefixlen 128 scopeid 0x10<host> loop txqueuelen 1000 (Local Loopback) RX packets 182 bytes 14992 (14.9 KB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 182 bytes 14992 (14.9 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 root@ceph-deploy:~# ping -c 1 -i 1 www.baidu.com
64 bytes from 220.181.38.149 (220.181.38.149): icmp_seq=1 ttl=128 time=6.67 ms
5. Configure the apt repositories
https://mirrors.aliyun.com/ceph/ # Aliyun mirror
http://mirrors.163.com/ceph/ # NetEase mirror
https://mirrors.tuna.tsinghua.edu.cn/ceph/ # Tsinghua University mirror
$ wget -q -O- 'https://mirrors.tuna.tsinghua.edu.cn/ceph/keys/release.asc' | sudo apt-key add -    # import the release key
OK
$ echo "deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main" | sudo tee -a /etc/apt/sources.list    # apt uses the debian-pacific path; sudo does not apply to a shell redirection, so append with tee
$ cat /etc/apt/sources.list
# Source (deb-src) entries are commented out by default to speed up apt update; uncomment them if needed
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ bionic-security main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic main    # the entry appended by the echo above
$ sudo apt update
-
Deploying the RADOS Cluster
1. Create the cephadmin user:
It is recommended to deploy and run the Ceph cluster as a dedicated regular user; the user only needs to be able to run privileged commands interactively via sudo. Newer versions of ceph-deploy accept any sudo-capable user, including root, but a regular user such as cephuser or cephadmin is still recommended for managing the cluster.
Create the cephadmin user on the ceph-deploy node, the storage nodes, the Mon nodes and the Mgr nodes.
groupadd -r -g 2022 cephadmin && useradd -r -m -s /bin/bash -u 2022 -g 2022 cephadmin && echo cephadmin:123.com | chpasswd
Allow the cephadmin user to run privileged commands via sudo on every server:
echo "cephadmin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
2. Configure passwordless SSH login:
Allow the ceph-deploy node to log in to every ceph node/mon/mgr host non-interactively, i.e. generate an SSH key pair on the ceph-deploy node and distribute the public key to all managed nodes:
cephadmin@ceph-deploy:~$ ssh-keygen #生成ssh秘钥对 Generating public/private rsa key pair. Enter file in which to save the key (/home/cephadmin/.ssh/id_rsa): Created directory ‘/home/cephadmin/.ssh‘. Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /home/cephadmin/.ssh/id_rsa. Your public key has been saved in /home/cephadmin/.ssh/id_rsa.pub. The key fingerprint is: SHA256:0+vL5tnFkcEzFiGCmKTzR7G58KHrbUB9qBiaqtYsSi4 cephadmin@ceph-deploy The key‘s randomart image is: +---[RSA 2048]----+ | ..o... . o. | | .o .+ . o . | | o ..=. * | | .o.=o+. . = | | o +o.S.. o | | o . oo . . . . | | oo .. . o | |Eo o . ..o.o . | |B.. ...o*.. | +----[SHA256]-----+ cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.100 #分发公钥至各管理节点(包括自身) cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.101 cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.102 cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.103 cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.104 cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.105 cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.106 cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.107 cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.108 cephadmin@ceph-deploy:~$ ssh-copy-id cephadmin@10.0.0.109
3. Configure name resolution on every node:
# cat >> /etc/hosts << EOF 10.0.0.100 ceph-deploy 10.0.0.101 ceph-mon1 10.0.0.102 ceph-mon2 10.0.0.103 ceph-mon3 10.0.0.104 ceph-mgr1 10.0.0.105 ceph-mgr2 10.0.0.106 ceph-node1 10.0.0.107 ceph-node2 10.0.0.108 ceph-node3 10.0.0.109 ceph-node4 EOF
4. Install Python 2 on every node:
# apt -y install python2.7                      # install Python 2.7
# ln -sv /usr/bin/python2.7 /usr/bin/python2    # create the python2 symlink
5. Install the Ceph deployment tool
Install ceph-deploy on the deployment server:
cephadmin@ceph-deploy:~$ apt-cache madison ceph-deploy ceph-deploy | 2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main amd64 Packages ceph-deploy | 2.0.1 | https://mirrors.tuna.tsinghua.edu.cn/ceph/debian-pacific bionic/main i386 Packages ceph-deploy | 1.5.38-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe amd64 Packages ceph-deploy | 1.5.38-0ubuntu1 | https://mirrors.tuna.tsinghua.edu.cn/ubuntu bionic/universe i386 Packages cephadmin@ceph-deploy:~$ sudo apt -y install ceph-deploy
6. Initialize the Mon node
Initialize the Mon node from the ceph-deploy admin node; the Mon node also needs to be attached to the cluster network, otherwise the initialization reports an error:
cephadmin@ceph-deploy:~$ mkdir ceph-cluster cephadmin@ceph-deploy:~$ cd ceph-cluster/ cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy new --cluster-network 192.168.0.0/24 --public-network 10.0.0.0/24 ceph-mon1
Verify the initialization:
cephadmin@ceph-deploy:~/ceph-cluster$ ll
total 20
drwxrwxr-x 2 cephadmin cephadmin 4096 Aug 18 15:26 ./
drwxr-xr-x 6 cephadmin cephadmin 4096 Aug 18 15:20 ../
-rw-rw-r-- 1 cephadmin cephadmin 259 Aug 18 15:26 ceph.conf    # auto-generated configuration file
-rw-rw-r-- 1 cephadmin cephadmin 3892 Aug 18 15:26 ceph-deploy-ceph.log    # initialization log
-rw------- 1 cephadmin cephadmin 73 Aug 18 15:26 ceph.mon.keyring    # keyring used for authentication between the Mon nodes
cephadmin@ceph-deploy:~/ceph-cluster$ cat ceph.conf
[global]
fsid = 0d11d338-a480-40da-8520-830423b22c3e    # cluster ID
public_network = 10.0.0.0/24
cluster_network = 192.168.0.0/24
mon_initial_members = ceph-mon1    # multiple Mon nodes can be listed, comma-separated
mon_host = 10.0.0.101
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
Configure the Mon node and generate the sync keys
Install the ceph-mon package on each Mon node, then initialize the Mon node from ceph-deploy; additional Mon nodes can be added later:
root@ceph-mon1:~# apt -y install ceph-mon cephadmin@ceph-deploy:~/ceph-cluster$ pwd /home/cephadmin/ceph-cluster cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon create-initial [ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mon create-initial [ceph_deploy.cli][INFO ] ceph-deploy options: [ceph_deploy.cli][INFO ] username : None [ceph_deploy.cli][INFO ] verbose : False [ceph_deploy.cli][INFO ] overwrite_conf : False [ceph_deploy.cli][INFO ] subcommand : create-initial [ceph_deploy.cli][INFO ] quiet : False [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f0903be4fa0> [ceph_deploy.cli][INFO ] cluster : ceph [ceph_deploy.cli][INFO ] func : <function mon at 0x7f0903bc8ad0> [ceph_deploy.cli][INFO ] ceph_conf : None [ceph_deploy.cli][INFO ] keyrings : None [ceph_deploy.cli][INFO ] default_release : False [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-mon1 [ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon1 ... [ceph-mon1][DEBUG ] connection detected need for sudo [ceph-mon1][DEBUG ] connected to host: ceph-mon1 [ceph-mon1][DEBUG ] detect platform information from remote host [ceph-mon1][DEBUG ] detect machine type [ceph-mon1][DEBUG ] find the location of an executable [ceph_deploy.mon][INFO ] distro info: Ubuntu 18.04 bionic [ceph-mon1][DEBUG ] determining if provided host has same hostname in remote [ceph-mon1][DEBUG ] get remote short hostname [ceph-mon1][DEBUG ] deploying mon to ceph-mon1 [ceph-mon1][DEBUG ] get remote short hostname [ceph-mon1][DEBUG ] remote hostname: ceph-mon1 [ceph-mon1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph-mon1][DEBUG ] create the mon path if it does not exist [ceph-mon1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon1/done [ceph-mon1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-mon1/done [ceph-mon1][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring [ceph-mon1][DEBUG ] create the monitor keyring file [ceph-mon1][INFO ] Running command: sudo ceph-mon --cluster ceph --mkfs -i ceph-mon1 --keyring /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring --setuser 64045 --setgroup 64045 [ceph-mon1][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-mon1.mon.keyring [ceph-mon1][DEBUG ] create a done file to avoid re-doing the mon deployment [ceph-mon1][DEBUG ] create the init path if it does not exist [ceph-mon1][INFO ] Running command: sudo systemctl enable ceph.target [ceph-mon1][INFO ] Running command: sudo systemctl enable ceph-mon@ceph-mon1 [ceph-mon1][WARNIN] Created symlink /etc/systemd/system/ceph-mon.target.wants/ceph-mon@ceph-mon1.service → /lib/systemd/system/ceph-mon@.service. 
[ceph-mon1][INFO ] Running command: sudo systemctl start ceph-mon@ceph-mon1 [ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status [ceph-mon1][DEBUG ] ******************************************************************************** [ceph-mon1][DEBUG ] status for monitor: mon.ceph-mon1 [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "election_epoch": 3, [ceph-mon1][DEBUG ] "extra_probe_peers": [], [ceph-mon1][DEBUG ] "feature_map": { [ceph-mon1][DEBUG ] "mon": [ [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "features": "0x3f01cfb9fffdffff", [ceph-mon1][DEBUG ] "num": 1, [ceph-mon1][DEBUG ] "release": "luminous" [ceph-mon1][DEBUG ] } [ceph-mon1][DEBUG ] ] [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "features": { [ceph-mon1][DEBUG ] "quorum_con": "4540138297136906239", [ceph-mon1][DEBUG ] "quorum_mon": [ [ceph-mon1][DEBUG ] "kraken", [ceph-mon1][DEBUG ] "luminous", [ceph-mon1][DEBUG ] "mimic", [ceph-mon1][DEBUG ] "osdmap-prune", [ceph-mon1][DEBUG ] "nautilus", [ceph-mon1][DEBUG ] "octopus", [ceph-mon1][DEBUG ] "pacific", [ceph-mon1][DEBUG ] "elector-pinging" [ceph-mon1][DEBUG ] ], [ceph-mon1][DEBUG ] "required_con": "2449958747317026820", [ceph-mon1][DEBUG ] "required_mon": [ [ceph-mon1][DEBUG ] "kraken", [ceph-mon1][DEBUG ] "luminous", [ceph-mon1][DEBUG ] "mimic", [ceph-mon1][DEBUG ] "osdmap-prune", [ceph-mon1][DEBUG ] "nautilus", [ceph-mon1][DEBUG ] "octopus", [ceph-mon1][DEBUG ] "pacific", [ceph-mon1][DEBUG ] "elector-pinging" [ceph-mon1][DEBUG ] ] [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "monmap": { [ceph-mon1][DEBUG ] "created": "2021-08-18T07:55:40.349602Z", [ceph-mon1][DEBUG ] "disallowed_leaders: ": "", [ceph-mon1][DEBUG ] "election_strategy": 1, [ceph-mon1][DEBUG ] "epoch": 1, [ceph-mon1][DEBUG ] "features": { [ceph-mon1][DEBUG ] "optional": [], [ceph-mon1][DEBUG ] "persistent": [ [ceph-mon1][DEBUG ] "kraken", [ceph-mon1][DEBUG ] "luminous", [ceph-mon1][DEBUG ] "mimic", [ceph-mon1][DEBUG ] "osdmap-prune", [ceph-mon1][DEBUG ] "nautilus", [ceph-mon1][DEBUG ] "octopus", [ceph-mon1][DEBUG ] "pacific", [ceph-mon1][DEBUG ] "elector-pinging" [ceph-mon1][DEBUG ] ] [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "fsid": "0d11d338-a480-40da-8520-830423b22c3e", [ceph-mon1][DEBUG ] "min_mon_release": 16, [ceph-mon1][DEBUG ] "min_mon_release_name": "pacific", [ceph-mon1][DEBUG ] "modified": "2021-08-18T07:55:40.349602Z", [ceph-mon1][DEBUG ] "mons": [ [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "addr": "10.0.0.101:6789/0", [ceph-mon1][DEBUG ] "crush_location": "{}", [ceph-mon1][DEBUG ] "name": "ceph-mon1", [ceph-mon1][DEBUG ] "priority": 0, [ceph-mon1][DEBUG ] "public_addr": "10.0.0.101:6789/0", [ceph-mon1][DEBUG ] "public_addrs": { [ceph-mon1][DEBUG ] "addrvec": [ [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "addr": "10.0.0.101:3300", [ceph-mon1][DEBUG ] "nonce": 0, [ceph-mon1][DEBUG ] "type": "v2" [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] { [ceph-mon1][DEBUG ] "addr": "10.0.0.101:6789", [ceph-mon1][DEBUG ] "nonce": 0, [ceph-mon1][DEBUG ] "type": "v1" [ceph-mon1][DEBUG ] } [ceph-mon1][DEBUG ] ] [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "rank": 0, [ceph-mon1][DEBUG ] "weight": 0 [ceph-mon1][DEBUG ] } [ceph-mon1][DEBUG ] ], [ceph-mon1][DEBUG ] "stretch_mode": false [ceph-mon1][DEBUG ] }, [ceph-mon1][DEBUG ] "name": "ceph-mon1", [ceph-mon1][DEBUG ] "outside_quorum": [], [ceph-mon1][DEBUG ] "quorum": [ [ceph-mon1][DEBUG ] 0 [ceph-mon1][DEBUG ] ], [ceph-mon1][DEBUG ] "quorum_age": 1, [ceph-mon1][DEBUG ] "rank": 0, [ceph-mon1][DEBUG ] "state": 
"leader", [ceph-mon1][DEBUG ] "stretch_mode": false, [ceph-mon1][DEBUG ] "sync_provider": [] [ceph-mon1][DEBUG ] } [ceph-mon1][DEBUG ] ******************************************************************************** [ceph-mon1][INFO ] monitor: mon.ceph-mon1 is running [ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status [ceph_deploy.mon][INFO ] processing monitor mon.ceph-mon1 [ceph-mon1][DEBUG ] connection detected need for sudo [ceph-mon1][DEBUG ] connected to host: ceph-mon1 [ceph-mon1][DEBUG ] detect platform information from remote host [ceph-mon1][DEBUG ] detect machine type [ceph-mon1][DEBUG ] find the location of an executable [ceph-mon1][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon1.asok mon_status [ceph_deploy.mon][INFO ] mon.ceph-mon1 monitor has reached quorum! [ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum [ceph_deploy.mon][INFO ] Running gatherkeys... [ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpqCeuN6 [ceph-mon1][DEBUG ] connection detected need for sudo [ceph-mon1][DEBUG ] connected to host: ceph-mon1 [ceph-mon1][DEBUG ] detect platform information from remote host [ceph-mon1][DEBUG ] detect machine type [ceph-mon1][DEBUG ] get remote short hostname [ceph-mon1][DEBUG ] fetch remote file [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-mon1.asok mon_status [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.admin [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-mds [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-mgr [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-osd [ceph-mon1][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon1/keyring auth get client.bootstrap-rgw [ceph_deploy.gatherkeys][INFO ] Storing ceph.client.admin.keyring [ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mds.keyring [ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-mgr.keyring [ceph_deploy.gatherkeys][INFO ] keyring ‘ceph.mon.keyring‘ already exists [ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-osd.keyring [ceph_deploy.gatherkeys][INFO ] Storing ceph.bootstrap-rgw.keyring [ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpqCeuN6
Verify the Mon node
Verify that the ceph-mon service is installed and running on the mon node. The initialization also leaves bootstrap keyring files for the mds/mgr/osd/rgw services in the working directory on the ceph-deploy node; these files grant the highest privileges on the cluster, so they must be kept safe.
root@ceph-mon1:~# ps -ef | grep ceph-mon ceph 6688 1 0 15:55 ? 00:00:00 /usr/bin/ceph-mon -f --cluster ceph --id ceph-mon1 --setuser ceph --setgroup ceph root 7252 2514 0 16:00 pts/0 00:00:00 grep --color=auto ceph-mon
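A simple precaution for the keyring files mentioned above is to tighten their permissions in the ceph-deploy working directory; a minimal sketch, assuming the ~/ceph-cluster directory created earlier:
chmod 600 /home/cephadmin/ceph-cluster/*.keyring    # readable by cephadmin only
ls -l /home/cephadmin/ceph-cluster/*.keyring        # verify the new mode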
7. Configure the Manager node
Deploy the ceph-mgr node:
The mgr node needs to read the Ceph configuration, i.e. the files in the /etc/ceph directory:
#初始化ceph-mgr节点: root@ceph-mgr1:~# apt -y install ceph-mgr cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mgr create ceph-mgr1 [ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy mgr create ceph-mgr1 [ceph_deploy.cli][INFO ] ceph-deploy options: [ceph_deploy.cli][INFO ] username : None [ceph_deploy.cli][INFO ] verbose : False [ceph_deploy.cli][INFO ] mgr : [(‘ceph-mgr1‘, ‘ceph-mgr1‘)] [ceph_deploy.cli][INFO ] overwrite_conf : False [ceph_deploy.cli][INFO ] subcommand : create [ceph_deploy.cli][INFO ] quiet : False [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f8d17024c30> [ceph_deploy.cli][INFO ] cluster : ceph [ceph_deploy.cli][INFO ] func : <function mgr at 0x7f8d17484150> [ceph_deploy.cli][INFO ] ceph_conf : None [ceph_deploy.cli][INFO ] default_release : False [ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph-mgr1:ceph-mgr1 The authenticity of host ‘ceph-mgr1 (10.0.0.104)‘ can‘t be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added ‘ceph-mgr1‘ (ECDSA) to the list of known hosts. [ceph-mgr1][DEBUG ] connection detected need for sudo [ceph-mgr1][DEBUG ] connected to host: ceph-mgr1 [ceph-mgr1][DEBUG ] detect platform information from remote host [ceph-mgr1][DEBUG ] detect machine type [ceph_deploy.mgr][INFO ] Distro info: Ubuntu 18.04 bionic [ceph_deploy.mgr][DEBUG ] remote host will use systemd [ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph-mgr1 [ceph-mgr1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph-mgr1][WARNIN] mgr keyring does not exist yet, creating one [ceph-mgr1][DEBUG ] create a keyring file [ceph-mgr1][DEBUG ] create path recursively if it doesn‘t exist [ceph-mgr1][INFO ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph-mgr1 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph-mgr1/keyring [ceph-mgr1][INFO ] Running command: sudo systemctl enable ceph-mgr@ceph-mgr1 [ceph-mgr1][WARNIN] Created symlink /etc/systemd/system/ceph-mgr.target.wants/ceph-mgr@ceph-mgr1.service → /lib/systemd/system/ceph-mgr@.service. [ceph-mgr1][INFO ] Running command: sudo systemctl start ceph-mgr@ceph-mgr1 [ceph-mgr1][INFO ] Running command: sudo systemctl enable ceph.target
Verify the ceph-mgr node:
root@ceph-mgr1:~# ps -ef | grep ceph-mgr ceph 8128 1 8 17:09 ? 00:00:03 /usr/bin/ceph-mgr -f --cluster ceph --id ceph-mgr1 --setuser ceph --setgroup ceph root 8326 2396 0 17:10 pts/0 00:00:00 grep --color=auto ceph-mgr
8. Distribute the admin keyring:
From the ceph-deploy node, copy the configuration file and admin keyring to every node that will run ceph management commands, so that later ceph commands do not have to specify the ceph-mon address and the ceph.client.admin.keyring file every time. The ceph-mon nodes also need the cluster configuration and auth files synced to them.
Manage the cluster from the ceph-deploy node:
root@ceph-deploy:~# apt -y install ceph-common #安装ceph公共组件,安装ceph-common需要使用root root@cepn-node1:~# apt -y install ceph-common root@cepn-node2:~# apt -y install ceph-common root@cepn-node3:~# apt -y install ceph-common root@cepn-node4:~# apt -y install ceph-common cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy admin ceph-deploy ceph-node1 ceph-node2 ceph-node3 ceph-node4 #分发admin秘钥 [ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy admin ceph-deploy [ceph_deploy.cli][INFO ] ceph-deploy options: [ceph_deploy.cli][INFO ] username : None [ceph_deploy.cli][INFO ] verbose : False [ceph_deploy.cli][INFO ] overwrite_conf : False [ceph_deploy.cli][INFO ] quiet : False [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f4ba41a4190> [ceph_deploy.cli][INFO ] cluster : ceph [ceph_deploy.cli][INFO ] client : [‘ceph-deploy‘] [ceph_deploy.cli][INFO ] func : <function admin at 0x7f4ba4aa5a50> [ceph_deploy.cli][INFO ] ceph_conf : None [ceph_deploy.cli][INFO ] default_release : False [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-deploy [ceph-deploy][DEBUG ] connection detected need for sudo [ceph-deploy][DEBUG ] connected to host: ceph-deploy [ceph-deploy][DEBUG ] detect platform information from remote host [ceph-deploy][DEBUG ] detect machine type [ceph-deploy][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf [ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy admin ceph-node1 ceph-node2 ceph-node3 ceph-node4 [ceph_deploy.cli][INFO ] ceph-deploy options: [ceph_deploy.cli][INFO ] username : None [ceph_deploy.cli][INFO ] verbose : False [ceph_deploy.cli][INFO ] overwrite_conf : False [ceph_deploy.cli][INFO ] quiet : False [ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fc78eac3190> [ceph_deploy.cli][INFO ] cluster : ceph [ceph_deploy.cli][INFO ] client : [‘ceph-node1‘, ‘ceph-node2‘, ‘ceph-node3‘, ‘ceph-node4‘] [ceph_deploy.cli][INFO ] func : <function admin at 0x7fc78f3c4a50> [ceph_deploy.cli][INFO ] ceph_conf : None [ceph_deploy.cli][INFO ] default_release : False [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node1 The authenticity of host ‘ceph-node1 (10.0.0.106)‘ can‘t be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added ‘ceph-node1‘ (ECDSA) to the list of known hosts. [ceph-node1][DEBUG ] connection detected need for sudo [ceph-node1][DEBUG ] connected to host: ceph-node1 [ceph-node1][DEBUG ] detect platform information from remote host [ceph-node1][DEBUG ] detect machine type [ceph-node1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node2 The authenticity of host ‘ceph-node2 (10.0.0.107)‘ can‘t be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added ‘ceph-node2‘ (ECDSA) to the list of known hosts. 
[ceph-node2][DEBUG ] connection detected need for sudo [ceph-node2][DEBUG ] connected to host: ceph-node2 [ceph-node2][DEBUG ] detect platform information from remote host [ceph-node2][DEBUG ] detect machine type [ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node3 The authenticity of host ‘ceph-node3 (10.0.0.108)‘ can‘t be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added ‘ceph-node3‘ (ECDSA) to the list of known hosts. [ceph-node3][DEBUG ] connection detected need for sudo [ceph-node3][DEBUG ] connected to host: ceph-node3 [ceph-node3][DEBUG ] detect platform information from remote host [ceph-node3][DEBUG ] detect machine type [ceph-node3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf [ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node4 The authenticity of host ‘ceph-node4 (10.0.0.109)‘ can‘t be established. ECDSA key fingerprint is SHA256:Y7Y9tQOTjbM8RnmDHvT8eJBzIu8ZPdaBkG9jBg8bifA. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added ‘ceph-node4‘ (ECDSA) to the list of known hosts. [ceph-node4][DEBUG ] connection detected need for sudo [ceph-node4][DEBUG ] connected to host: ceph-node4 [ceph-node4][DEBUG ] detect platform information from remote host [ceph-node4][DEBUG ] detect machine type [ceph-node4][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
Verify the keyring on the ceph nodes:
Check the key files on the ceph-node hosts:
root@cepn-node1:~# ll /etc/ceph/ total 20 drwxr-xr-x 2 root root 4096 Aug 18 16:38 ./ drwxr-xr-x 91 root root 4096 Aug 18 16:27 ../ -rw------- 1 root root 151 Aug 18 16:38 ceph.client.admin.keyring -rw-r--r-- 1 root root 259 Aug 18 16:38 ceph.conf -rw-r--r-- 1 root root 92 Jul 8 22:17 rbdmap -rw------- 1 root root 0 Aug 18 16:38 tmp4MDGPp
For security, the owner and group of the keyring file default to root; if the cephadmin user should also be able to run ceph commands, it has to be granted access:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring root@cepn-node1:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring root@cepn-node2:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring root@cepn-node3:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring root@cepn-node4:~# setfacl -m u:cephadmin:rw /etc/ceph/ceph.client.admin.keyring
Test the ceph command:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s cluster: id: 0d11d338-a480-40da-8520-830423b22c3e health: HEALTH_WARN mon is allowing insecure global_id reclaim #需要禁用非安全模式通信 OSD count 0 < osd_pool_default_size 3 #集群的OSD数量小于3 services: mon: 1 daemons, quorum ceph-mon1 (age 98m) mgr: ceph-mgr1(active, since 24m) osd: 0 osds: 0 up, 0 in data: pools: 0 pools, 0 pgs objects: 0 objects, 0 B usage: 0 B used, 0 B / 0 B avail pgs: cephadmin@ceph-deploy:~/ceph-cluster$ ceph config set mon auth_allow_insecure_global_id_reclaim false #禁用非安全模式通信 cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s cluster: id: 0d11d338-a480-40da-8520-830423b22c3e health: HEALTH_WARN OSD count 0 < osd_pool_default_size 3 services: mon: 1 daemons, quorum ceph-mon1 (age 100m) mgr: ceph-mgr1(active, since 26m) osd: 0 osds: 0 up, 0 in data: pools: 0 pools, 0 pgs objects: 0 objects, 0 B usage: 0 B used, 0 B / 0 B avail pgs: cephadmin@ceph-deploy:~/ceph-cluster$
9. Initialize the node (OSD) hosts
Before adding OSDs, install the basic Ceph environment on the node hosts:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy install --no-adjust-repos --nogpgcheck ceph-node1 ceph-node2 ceph-node3 ceph-node4
--no-adjust-repos install packages without modifying source repos
--nogpgcheck install packages without gpgcheck
Wipe the disks
Use ceph-deploy disk zap to wipe the data disks on each ceph node:
cephadmin@ceph-deploy:~/ceph-cluster$ cat EraseDisk.sh
#!/bin/bash
#
for i in {1..4}; do
    for d in {b..f}; do
        ceph-deploy disk zap ceph-node$i /dev/sd$d
    done
done
cephadmin@ceph-deploy:~/ceph-cluster$ bash -n EraseDisk.sh
cephadmin@ceph-deploy:~/ceph-cluster$ bash EraseDisk.sh
Add the hosts' disks as OSDs:
An OSD stores its data in three classes:
- Data: the object data stored by Ceph
- block-db: the RocksDB data, i.e. the metadata
- block-wal: the RocksDB write-ahead log
(In this lab all three live on the --data device; a sketch of splitting them onto separate devices follows the creation script below.)
Add the OSDs (OSD IDs are assigned sequentially starting from 0):
cephadmin@ceph-deploy:~/ceph-cluster$ cat CreateDisk.sh
#!/bin/bash
#
for i in {1..4}; do
    for d in {b..f}; do
        ceph-deploy osd create ceph-node$i --data /dev/sd$d
    done
done
cephadmin@ceph-deploy:~/ceph-cluster$ bash -n CreateDisk.sh
cephadmin@ceph-deploy:~/ceph-cluster$ bash CreateDisk.sh
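If faster devices were available, the block-db and block-wal could be split out from the data device; a hedged sketch, where /dev/nvme0n1p1 and /dev/nvme0n1p2 are purely illustrative device names that do not exist in this environment:
# Hypothetical variant: place the RocksDB metadata and WAL on separate, faster partitions
ceph-deploy osd create ceph-node1 --data /dev/sdb --block-db /dev/nvme0n1p1 --block-wal /dev/nvme0n1p2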
10. Verify the ceph cluster:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s cluster: id: 0d11d338-a480-40da-8520-830423b22c3e health: HEALTH_OK services: mon: 1 daemons, quorum ceph-mon1 (age 2h) mgr: ceph-mgr1(active, since 88m) osd: 20 osds: 20 up (since 55s), 20 in (since 63s) data: pools: 1 pools, 1 pgs objects: 0 objects, 0 B usage: 150 MiB used, 2.0 TiB / 2.0 TiB avail pgs: 1 active+clean
11. Test uploading and downloading data:
To access data, a client first connects to a pool on the RADOS cluster; the data object is then located by name through the relevant CRUSH rules. To test the cluster's data access, first create a test pool named mypool with 32 PGs.
$ ceph -h     # help for the ceph admin command
$ rados -h    # help for the lower-level rados command that operates on objects directly
Create a pool
[ceph@ceph-deploy ceph-cluster]$ ceph osd pool create mypool 32 32 #32个PG和32种PGD组合 pool ‘mypool‘ created cephadmin@ceph-deploy:~/ceph-cluster$ ceph pg ls-by-pool mypool | awk ‘{print $1,$2,$15}‘ #验证PG与PGP组合 PG OBJECTS ACTING 2.0 0 [8,10,3]p8 2.1 0 [15,0,13]p15 2.2 0 [5,1,15]p5 2.3 0 [17,5,14]p17 2.4 0 [1,12,18]p1 2.5 0 [12,4,8]p12 2.6 0 [1,13,19]p1 2.7 0 [6,17,2]p6 2.8 0 [16,13,0]p16 2.9 0 [4,9,19]p4 2.a 0 [11,4,18]p11 2.b 0 [13,7,17]p13 2.c 0 [12,0,5]p12 2.d 0 [12,19,3]p12 2.e 0 [2,13,19]p2 2.f 0 [11,17,8]p11 2.10 0 [15,13,0]p15 2.11 0 [16,6,1]p16 2.12 0 [10,3,9]p10 2.13 0 [17,6,3]p17 2.14 0 [8,13,17]p8 2.15 0 [19,1,11]p19 2.16 0 [8,12,17]p8 2.17 0 [6,14,2]p6 2.18 0 [18,9,12]p18 2.19 0 [3,6,13]p3 2.1a 0 [6,14,2]p6 2.1b 0 [11,7,17]p11 2.1c 0 [10,7,1]p10 2.1d 0 [15,10,7]p15 2.1e 0 [3,13,15]p3 2.1f 0 [4,7,14]p4 * NOTE: afterwards cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd tree #查看osd与存储服务器的对应关系 ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 1.95374 root default -5 0.48843 host ceph-node2 5 hdd 0.09769 osd.5 up 1.00000 1.00000 6 hdd 0.09769 osd.6 up 1.00000 1.00000 7 hdd 0.09769 osd.7 up 1.00000 1.00000 8 hdd 0.09769 osd.8 up 1.00000 1.00000 9 hdd 0.09769 osd.9 up 1.00000 1.00000 -7 0.48843 host ceph-node3 10 hdd 0.09769 osd.10 up 1.00000 1.00000 11 hdd 0.09769 osd.11 up 1.00000 1.00000 12 hdd 0.09769 osd.12 up 1.00000 1.00000 13 hdd 0.09769 osd.13 up 1.00000 1.00000 14 hdd 0.09769 osd.14 up 1.00000 1.00000 -9 0.48843 host ceph-node4 15 hdd 0.09769 osd.15 up 1.00000 1.00000 16 hdd 0.09769 osd.16 up 1.00000 1.00000 17 hdd 0.09769 osd.17 up 1.00000 1.00000 18 hdd 0.09769 osd.18 up 1.00000 1.00000 19 hdd 0.09769 osd.19 up 1.00000 1.00000 -3 0.48843 host cepn-node1 0 hdd 0.09769 osd.0 up 1.00000 1.00000 1 hdd 0.09769 osd.1 up 1.00000 1.00000 2 hdd 0.09769 osd.2 up 1.00000 1.00000 3 hdd 0.09769 osd.3 up 1.00000 1.00000 4 hdd 0.09769 osd.4 up 1.00000 1.00000 cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool ls #查看当前存储池 device_health_metrics mypool cephadmin@ceph-deploy:~/ceph-cluster$ rados lspools #查看当前存储池 device_health_metrics mypool
The current environment does not yet use Ceph for block storage or a file system, and there is no object storage client either, but the rados command can access the Ceph object store directly:
Upload a file:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados put msg1 /var/log/syslog --pool=mypool    # upload the file to mypool with object id msg1
List the objects:
cephadmin@ceph-deploy:~/ceph-cluster$ rados ls --pool=mypool
msg1
Object placement information:
The ceph osd map command returns the placement of a data object within a pool:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd map mypool msg1
osdmap e131 pool ‘mypool‘ (2) object ‘msg1‘ -> pg 2.c833d430 (2.10) -> up ([15,13,0], p15) acting ([15,13,0], p15)
2.c833d430: the object hashes to 0xc833d430 in the pool with ID 2
2.10: the data is stored in PG 2.10, i.e. the PG with (hex) ID 10 of pool 2
[15,13,0], p15: the OSD set; the primary OSD is 15 and the acting OSDs are 15, 13 and 0. Three OSDs means the data is kept as three replicas; CRUSH computes which OSDs hold the copies of the PG.
Download the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados get msg1 --pool=mypool /opt/my.txt cephadmin@ceph-deploy:~/ceph-cluster$ ll /opt/ total 1840 drwxr-xr-x 2 root root 4096 Aug 20 15:23 ./ drwxr-xr-x 23 root root 4096 Aug 18 18:32 ../ -rw-r--r-- 1 root root 1873597 Aug 20 15:23 my.txt
#验证下载文件: cephadmin@ceph-deploy:~/ceph-cluster$ head /opt/my.txt Aug 18 18:33:40 ceph-deploy systemd-modules-load[484]: Inserted module ‘iscsi_tcp‘ Aug 18 18:33:40 ceph-deploy systemd-modules-load[484]: Inserted module ‘ib_iser‘ Aug 18 18:33:40 ceph-deploy systemd[1]: Starting Flush Journal to Persistent Storage... Aug 18 18:33:40 ceph-deploy systemd[1]: Started Load/Save Random Seed. Aug 18 18:33:40 ceph-deploy systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling. Aug 18 18:33:40 ceph-deploy systemd[1]: Started udev Kernel Device Manager. Aug 18 18:33:40 ceph-deploy systemd[1]: Started Set the console keyboard layout. Aug 18 18:33:40 ceph-deploy systemd[1]: Reached target Local File Systems (Pre). Aug 18 18:33:40 ceph-deploy systemd[1]: Reached target Local File Systems. Aug 18 18:33:40 ceph-deploy systemd[1]: Starting Tell Plymouth To Write Out Runtime Data...
Overwrite the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados put msg1 /etc/passwd --pool=mypool cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados get msg1 --pool=mypool /opt/my1.txt
#验证修改后的文件 cephadmin@ceph-deploy:~/ceph-cluster$ tail /opt/my1.txt _apt:x:104:65534::/nonexistent:/usr/sbin/nologin lxd:x:105:65534::/var/lib/lxd/:/bin/false uuidd:x:106:110::/run/uuidd:/usr/sbin/nologin dnsmasq:x:107:65534:dnsmasq,,,:/var/lib/misc:/usr/sbin/nologin landscape:x:108:112::/var/lib/landscape:/usr/sbin/nologin sshd:x:109:65534::/run/sshd:/usr/sbin/nologin pollinate:x:110:1::/var/cache/pollinate:/bin/false wang:x:1000:1000:wang,,,:/home/wang:/bin/bash cephadmin:x:2022:2022::/home/cephadmin:/bin/bash ceph:x:64045:64045:Ceph storage service:/var/lib/ceph:/usr/sbin/nologin
Delete the object:
cephadmin@ceph-deploy:~/ceph-cluster$ sudo rados rm msg1 --pool=mypool cephadmin@ceph-deploy:~/ceph-cluster$ rados ls --pool=mypool cephadmin@ceph-deploy:~/ceph-cluster$
12. Scale out the cluster for high availability:
This mainly means adding more Mon and Mgr nodes.
Add ceph-mon nodes:
ceph-mon has built-in leader election for high availability; the number of Mon nodes is usually odd.
root@ceph-mon2:~# apt -y install ceph-mon root@ceph-mon3:~# apt -y install ceph-mon cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon add ceph-mon2 cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mon add ceph-mon3
Verify the ceph-mon status:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph quorum_status --format json-pretty { "election_epoch": 14, "quorum": [ 0, 1, 2 ], "quorum_names": [ "ceph-mon1", "ceph-mon2", "ceph-mon3" ], "quorum_leader_name": "ceph-mon1", #当前的leader "quorum_age": 304, "features": { "quorum_con": "4540138297136906239", "quorum_mon": [ "kraken", "luminous", "mimic", "osdmap-prune", "nautilus", "octopus", "pacific", "elector-pinging" ] }, "monmap": { "epoch": 3, "fsid": "0d11d338-a480-40da-8520-830423b22c3e", "modified": "2021-08-20T07:39:56.803507Z", "created": "2021-08-18T07:55:40.349602Z", "min_mon_release": 16, "min_mon_release_name": "pacific", "election_strategy": 1, "disallowed_leaders: ": "", "stretch_mode": false, "features": { "persistent": [ "kraken", "luminous", "mimic", "osdmap-prune", "nautilus", "octopus", "pacific", "elector-pinging" ], "optional": [] }, "mons": [ { "rank": 0, #当前节点等级 "name": "ceph-mon1", #当前节点名称 "public_addrs": { "addrvec": [ { "type": "v2", "addr": "10.0.0.101:3300", "nonce": 0 }, { "type": "v1", "addr": "10.0.0.101:6789", "nonce": 0 } ] }, "addr": "10.0.0.101:6789/0", #监听地址 "public_addr": "10.0.0.101:6789/0", #监听地址 "priority": 0, "weight": 0, "crush_location": "{}" }, { "rank": 1, "name": "ceph-mon2", "public_addrs": { "addrvec": [ { "type": "v2", "addr": "10.0.0.102:3300", "nonce": 0 }, { "type": "v1", "addr": "10.0.0.102:6789", "nonce": 0 } ] }, "addr": "10.0.0.102:6789/0", "public_addr": "10.0.0.102:6789/0", "priority": 0, "weight": 0, "crush_location": "{}" }, { "rank": 2, "name": "ceph-mon3", "public_addrs": { "addrvec": [ { "type": "v2", "addr": "10.0.0.103:3300", "nonce": 0 }, { "type": "v1", "addr": "10.0.0.103:6789", "nonce": 0 } ] }, "addr": "10.0.0.103:6789/0", "public_addr": "10.0.0.103:6789/0", "priority": 0, "weight": 0, "crush_location": "{}" } ] } } cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s cluster: id: 0d11d338-a480-40da-8520-830423b22c3e health: HEALTH_OK services: mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 28m) mgr: ceph-mgr1(active, since 6h) osd: 20 osds: 20 up (since 6h), 20 in (since 45h) data: pools: 2 pools, 33 pgs objects: 0 objects, 0 B usage: 165 MiB used, 2.0 TiB / 2.0 TiB avail pgs: 33 active+clean
Add an mgr node
root@ceph-mgr2:~# apt -y install ceph-mgr cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mgr create ceph-mgr2
Verify the mgr node status
cephadmin@ceph-deploy:~/ceph-cluster$ ceph -s cluster: id: 0d11d338-a480-40da-8520-830423b22c3e health: HEALTH_OK services: mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 33m) mgr: ceph-mgr1(active, since 6h), standbys: ceph-mgr2 osd: 20 osds: 20 up (since 6h), 20 in (since 45h) data: pools: 2 pools, 33 pgs objects: 0 objects, 0 B usage: 165 MiB used, 2.0 TiB / 2.0 TiB avail pgs: 33 active+clean
IV. RBD Block Devices
RBD (RADOS Block Device) is Ceph's block storage. RBD talks to the OSDs through the librbd library and provides a high-performance, virtually unlimited scalable storage backend for virtualization platforms such as KVM and for cloud platforms such as OpenStack and CloudStack; these systems integrate with RBD via libvirt and QEMU. Any client can use the RADOS cluster as a block device through librbd. A pool used for RBD must first have the rbd application enabled and then be initialized.
-
Create the RBD pool
$ ceph osd pool create <pool> [<pg_num:int>] [<pgp_num:int>] [replicated|erasure]    # command format for creating a storage pool
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool create myrbd1 64 64    # create the pool with pg_num and pgp_num; PGP controls how PG data is combined for placement and is usually equal to pg_num
pool 'myrbd1' created
cephadmin@ceph-deploy:~/ceph-cluster$ ceph osd pool application enable myrbd1 rbd    # enable the rbd application on the pool
enabled application 'rbd' on pool 'myrbd1'
cephadmin@ceph-deploy:~/ceph-cluster$ rbd pool init -p myrbd1    # initialize the pool with the rbd command
-
Create and verify images:
An RBD pool cannot be used as a block device directly; images must first be created in it on demand, and the images are what get used as block devices. The rbd command creates, lists and deletes the images in a pool, and also clones images, creates snapshots, rolls an image back to a snapshot, lists snapshots, and so on (a sketch of the snapshot and clone commands follows the transcript below).
cephadmin@ceph-deploy:~/ceph-cluster$ rbd create myimg1 --size 5G --pool myrbd1 cephadmin@ceph-deploy:~/ceph-cluster$ rbd create myimg2 --size 3G --pool myrbd1 --image-format 2 --image-feature layering #centos系统内核较低无法挂载使用,因此只开启部分特性。 #除了layering 其他特性需要高版本内核支持 cephadmin@ceph-deploy:~/ceph-cluster$ rbd ls --pool myrbd1 #列出指定的pool中所有的img myimg1 myimg2 cephadmin@ceph-deploy:~/ceph-cluster$ rbd --image myimg1 --pool myrbd1 info #查看指定rbd的信息 rbd image ‘myimg1‘: size 5 GiB in 1280 objects order 22 (4 MiB objects) snapshot_count: 0 id: 38ee810c4674 block_name_prefix: rbd_data.38ee810c4674 format: 2 features: layering, exclusive-lock, object-map, fast-diff, deep-flatten op_features: flags: create_timestamp: Fri Aug 20 18:08:52 2021 access_timestamp: Fri Aug 20 18:08:52 2021 modify_timestamp: Fri Aug 20 18:08:52 2021 cephadmin@ceph-deploy:~/ceph-cluster$ rbd --image myimg2 --pool myrbd1 info rbd image ‘myimg2‘: size 3 GiB in 768 objects order 22 (4 MiB objects) snapshot_count: 0 id: 38f7fea54b29 block_name_prefix: rbd_data.38f7fea54b29 format: 2 features: layering op_features: flags: create_timestamp: Fri Aug 20 18:09:51 2021 access_timestamp: Fri Aug 20 18:09:51 2021 modify_timestamp: Fri Aug 20 18:09:51 2021
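The snapshot and clone operations mentioned above are not exercised in this post; a hedged sketch of what they would look like, where the snapshot name snap1 and the clone name myimg1-clone are illustrative:
rbd snap create myrbd1/myimg1@snap1                   # take a snapshot of the image
rbd snap ls myrbd1/myimg1                             # list the image's snapshots
rbd snap protect myrbd1/myimg1@snap1                  # protect the snapshot so it can be cloned
rbd clone myrbd1/myimg1@snap1 myrbd1/myimg1-clone     # copy-on-write clone of the snapshot
rbd snap rollback myrbd1/myimg1@snap1                 # roll the image back to the snapshot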
-
Using block storage from a client:
1. Check the current ceph status:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph df --- RAW STORAGE --- CLASS SIZE AVAIL USED RAW USED %RAW USED hdd 2.0 TiB 2.0 TiB 169 MiB 169 MiB 0 TOTAL 2.0 TiB 2.0 TiB 169 MiB 169 MiB 0 --- POOLS --- POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL device_health_metrics 1 1 0 B 0 0 B 0 633 GiB mypool 2 32 0 B 0 0 B 0 633 GiB myrbd1 3 64 405 B 7 48 KiB 0 633 GiB
2. Install ceph-common on the client:
[root@ceph-client1 ~]# yum install epel-release    # configure the yum repos
[root@ceph-client1 ~]# yum -y install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm
[root@ceph-client1 ~]# yum -y install ceph-common
# Sync the configuration and auth files from the deployment server:
cephadmin@ceph-deploy:~/ceph-cluster$ scp ceph.conf ceph.client.admin.keyring 10.0.0.71:/etc/ceph
3. Map the images on the client:
[root@ceph-client1 ~]# rbd -p myrbd1 map myimg2 /dev/rbd0 [root@ceph-client1 ~]# rbd -p myrbd1 map myimg1 rbd: sysfs write failed RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable myrbd1/myimg1 object-map fast-diff deep-flatten". In some cases useful info is found in syslog - try "dmesg | tail". rbd: map failed: (6) No such device or address [root@ceph-client1 ~]# rbd feature disable myrbd1/myimg1 object-map fast-diff deep-flatten [root@ceph-client1 ~]# rbd -p myrbd1 map myimg1 /dev/rbd1
4. Verify the RBD devices on the client:
[root@ceph-client1 ~]# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 200G 0 disk ├─sda1 8:1 0 1G 0 part /boot ├─sda2 8:2 0 100G 0 part / ├─sda3 8:3 0 50G 0 part /data ├─sda4 8:4 0 1K 0 part └─sda5 8:5 0 4G 0 part [SWAP] sr0 11:0 1 1024M 0 rom rbd0 253:0 0 3G 0 disk rbd1 253:16 0 5G 0 disk
5. Format the devices on the client and mount them:
[root@ceph-client1 ~]# mkfs.ext4 /dev/rbd0 mke2fs 1.42.9 (28-Dec-2013) Discarding device blocks: done Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) Stride=1024 blocks, Stripe width=1024 blocks 196608 inodes, 786432 blocks 39321 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=805306368 24 block groups 32768 blocks per group, 32768 fragments per group 8192 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912 Allocating group tables: done Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: done [root@ceph-client1 ~]# mkfs.xfs /dev/rbd1 Discarding blocks...Done. meta-data=/dev/rbd1 isize=512 agcount=8, agsize=163840 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=0, sparse=0 data = bsize=4096 blocks=1310720, imaxpct=25 = sunit=1024 swidth=1024 blks naming =version 2 bsize=4096 ascii-ci=0 ftype=1 log =internal log bsize=4096 blocks=2560, version=2 = sectsz=512 sunit=8 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 [root@ceph-client1 ~]# mount /dev/rbd0 /mnt/ [root@ceph-client1 ~]# df -TH Filesystem Type Size Used Avail Use% Mounted on devtmpfs devtmpfs 943M 0 943M 0% /dev tmpfs tmpfs 954M 0 954M 0% /dev/shm tmpfs tmpfs 954M 11M 944M 2% /run tmpfs tmpfs 954M 0 954M 0% /sys/fs/cgroup /dev/sda2 xfs 108G 5.2G 103G 5% / /dev/sda3 xfs 54G 175M 54G 1% /data /dev/sda1 xfs 1.1G 150M 915M 15% /boot tmpfs tmpfs 191M 0 191M 0% /run/user/0 /dev/rbd0 ext4 3.2G 9.5M 3.0G 1% /mnt [root@ceph-client1 ~]# mkdir /data [root@ceph-client1 ~]# mount /dev/rbd1 /data/ [root@ceph-client1 ~]# df -TH Filesystem Type Size Used Avail Use% Mounted on devtmpfs devtmpfs 943M 0 943M 0% /dev tmpfs tmpfs 954M 0 954M 0% /dev/shm tmpfs tmpfs 954M 10M 944M 2% /run tmpfs tmpfs 954M 0 954M 0% /sys/fs/cgroup /dev/sda2 xfs 108G 5.2G 103G 5% / /dev/rbd1 xfs 5.4G 35M 5.4G 1% /data /dev/sda1 xfs 1.1G 150M 915M 15% /boot tmpfs tmpfs 191M 0 191M 0% /run/user/0 /dev/rbd0 ext4 3.2G 9.5M 3.0G 1% /mnt
6. Write test data from the client:
[root@ceph-client1 data]# dd if=/dev/zero of=/data/ceph-test-file bs=1MB count=300 300+0 records in 300+0 records out 300000000 bytes (300 MB) copied, 1.61684 s, 186 MB/s [root@ceph-client1 data]# file /data/ceph-test-file /data/ceph-test-file: data [root@ceph-client1 data]# ll -h /data/ceph-test-file -rw-r--r--. 1 root root 287M Aug 23 12:27 /data/ceph-test-file
7. Verify the used capacity on the ceph side:
cephadmin@cepn-node1:~$ ceph df --- RAW STORAGE --- CLASS SIZE AVAIL USED RAW USED %RAW USED hdd 2.0 TiB 2.0 TiB 2.3 GiB 2.3 GiB 0.11 TOTAL 2.0 TiB 2.0 TiB 2.3 GiB 2.3 GiB 0.11 --- POOLS --- POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL device_health_metrics 1 1 0 B 0 0 B 0 632 GiB myrbd1 2 64 363 MiB 115 1.1 GiB 0.06 632 GiB