由于官网没有明确的说明,所以在做ceph会出现如下报错集。
执行ceph-deploy mon create-initial
报错部分内容如下:
[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph2][WARNIN] monitor: mon.ceph2, might not be running yet
[ceph2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph2][WARNIN] monitor ceph2 does not exist in monmap
[ceph2][WARNIN] neither public_addr
nor public_network
keys are defined for monitors
[ceph2][WARNIN] monitors may not be able to form quorum
注意报错中public_network,这是由于没有在ceph.conf中配置
解决办法:
修改ceph.conf配置文件(此IP段根据个人情况设定),添加public_network = 192.168.1.0/24
修改后继续执行ceph-deploy mon create-initial后,发现依旧报错,报错部分内容如下
[ceph3][WARNIN] provided hostname must match remote hostname
[ceph3][WARNIN] provided hostname: ceph3
[ceph3][WARNIN] remote hostname: localhost
[ceph3][WARNIN] monitors may not reach quorum and create-keys will not complete
[ceph3][WARNIN] ********************************************************************************
[ceph3][DEBUG ] deploying mon to ceph3
[ceph3][DEBUG ] get remote short hostname
[ceph3][DEBUG ] remote hostname: localhost
[ceph3][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
[ceph_deploy][ERROR ] GenericError: Failed to create 3 monitors
这里看到错误提示/etc/ceph/ceph.conf内容不同,使用--overwrite-conf来覆盖
命令如下:
ceph-deploy --overwrite-conf config push ceph1 ceph2 ceph3
修改后继续执行ceph-deploy mon create-initial,发现报错还是存在,报错部分内容如下
[ceph3][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph3.asok mon_status
[ceph3][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph_deploy.mon][WARNIN] mon.ceph3 monitor is not yet in quorum, tries left: 1
[ceph_deploy.mon][WARNIN] waiting 20 seconds before retrying
[ceph_deploy.mon][ERROR ] Some monitors have still not reached quorum:
[ceph_deploy.mon][ERROR ] ceph1
[ceph_deploy.mon][ERROR ] ceph3
[ceph_deploy.mon][ERROR ] ceph2
经过排查发现节点的hostname与/etc/hosts不符
解决办法:修改节点hostname名称,使其与/etc/hosts相符
节点一执行:hostnamectl set-hostname ceph1
节点二执行:hostnamectl set-hostname ceph2
节点三执行:hostnamectl set-hostname ceph3
修改后继续执行ceph-deploy mon create-initial,mmp发现还是报错,报错内容又不一样了,中间部分报错内容如下
[ceph2][ERROR ] no valid command found; 10 closest matches:
[ceph2][ERROR ] perf dump {<logger>} {<counter>}
[ceph2][ERROR ] log reopen
[ceph2][ERROR ] help
[ceph2][ERROR ] git_version
[ceph2][ERROR ] log flush
[ceph2][ERROR ] log dump
[ceph2][ERROR ] config unset <var>
[ceph2][ERROR ] config show
[ceph2][ERROR ] get_command_descriptions
[ceph2][ERROR ] dump_mempools
[ceph2][ERROR ] admin_socket: invalid command
[ceph_deploy.mon][WARNIN] mon.ceph2 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[ceph2][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph2.asok mon_status
[ceph2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
解决办法:在各个节点上执行sudo pkill ceph,然后再在deploy节点执行ceph-deploy mon create-initial
然后发现ERROR报错消失了,配置初始monitor(s)、并收集到了所有密钥,当前目录下可以看到下面这些密钥环
ceph.bootstrap-mds.keyring
ceph.bootstrap-mgr.keyring
ceph.bootstrap-osd.keyring
ceph.bootstrap-rgw.keyring
ceph.client.admin.keyring