Common Ceph failures: OSD fails to start after a system reboot because LVM metadata was lost

Other people have run into this problem as well (the original post links to their report here).

Starting the OSD manually for debugging (fails)

root@node16072:~# /usr/bin/ceph-osd -f --cluster ceph --id 0 --setuser ceph --setgroup ceph
2023-05-04T21:39:39.250+0800 7f8d9aae2240 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-0/keyring: (2) No such file or directory
2023-05-04T21:39:39.250+0800 7f8d9aae2240 -1 AuthRegistry(0x56400f1ac140) no keyring found at /var/lib/ceph/osd/ceph-0/keyring, disabling cephx
2023-05-04T21:39:39.250+0800 7f8d9aae2240 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-0/keyring: (2) No such file or directory
2023-05-04T21:39:39.250+0800 7f8d9aae2240 -1 AuthRegistry(0x7ffe9af34500) no keyring found at /var/lib/ceph/osd/ceph-0/keyring, disabling cephx
failed to fetch mon config (--no-mon-config to skip)
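
The keyring error above suggests that /var/lib/ceph/osd/ceph-0 was never populated. Assuming a BlueStore OSD deployed with ceph-volume, that directory is normally filled in when the OSD is activated, so an empty directory points at the activation step rather than at the keyring itself. A minimal sketch for checking this follows; the function name `osd_dir_state` is hypothetical and not part of Ceph.

```shell
#!/bin/sh
# Hypothetical helper (not a Ceph tool): classify an OSD data directory.
# Prints one of: missing, empty, no-keyring, ok
osd_dir_state() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo missing        # directory does not exist at all
    elif [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo empty          # directory exists but was never populated
    elif [ ! -e "$dir/keyring" ]; then
        echo no-keyring     # populated, but the keyring file is absent
    else
        echo ok
    fi
}
```

On the affected node, `osd_dir_state /var/lib/ceph/osd/ceph-0` would be expected to report `missing` or `empty` if the OSD was never activated after the reboot.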

Checking the system log

root@node16072:~# dmesg -T | grep ceph
[Thu May  4 21:32:58 2023] systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
[Thu May  4 21:32:58 2023] systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
[Thu May  4 21:32:58 2023] systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
[Thu May  4 21:32:58 2023] systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
[Thu May  4 21:32:58 2023] systemd[1]: /lib/systemd/system/ceph-volume@.service:8: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
[Thu May  4 21:32:58 2023] systemd[1]: remote-fs-pre.target: Found dependency on ceph.target/start
[Thu May  4 21:32:58 2023] systemd[1]: remote-fs-pre.target: Found dependency on ceph-mds.target/start
[Thu May  4 21:32:58 2023] systemd[1]: remote-fs-pre.target: Found dependency on ceph-mon.target/start
[Thu May  4 21:32:58 2023] systemd[1]: remote-fs-pre.target: Found dependency on ceph-mon@node16072.service/start
[Thu May  4 21:32:58 2023] systemd[1]: remote-fs-pre.target: Found ordering cycle on ceph-mon@node16072.service/start
[Thu May  4 21:32:58 2023] systemd[1]: remote-fs-pre.target: Job ceph-mon@node16072.service/start deleted to break ordering cycle starting with remote-fs-pre.target/start
[Thu May  4 21:32:58 2023] systemd[1]: ceph-mgr@node16072.service: Found ordering cycle on pve-cluster.service/start
[Thu May  4 21:32:58 2023] systemd[1]: ceph-mgr@node16072.service: Found dependency on rrdcached.service/start
[Thu May  4 21:32:58 2023] systemd[1]: ceph-mgr@node16072.service: Found dependency on remote-fs.target/start
[Thu May  4 21:32:58 2023] systemd[1]: ceph-mgr@node16072.service: Found dependency on remote-fs-pre.target/start
[Thu May  4 21:32:58 2023] systemd[1]: ceph-mgr@node16072.service: Found dependency on ceph-mgr@node16072.service/start
[Thu May  4 21:32:58 2023] systemd[1]: ceph-mgr@node16072.service: Job pve-cluster.service/start deleted to break ordering cycle starting with ceph-mgr@node16072.service/start
[Thu May  4 21:32:58 2023] systemd[1]: Created slice system-ceph\x2dmgr.slice.
[Thu May  4 21:32:58 2023] systemd[1]: Created slice system-ceph\x2dmon.slice.
[Thu May  4 21:32:58 2023] systemd[1]: Created slice system-ceph\x2dvolume.slice.
[Thu May  4 21:32:58 2023] systemd[1]: Reached target ceph target allowing to start/stop all ceph-fuse@.service instances at once.
[Thu May  4 21:32:58 2023] systemd[1]: Reached target ceph target allowing to start/stop all ceph-mon@.service instances at once.
[Thu May  4 21:32:58 2023] systemd[1]: Reached target ceph target allowing to start/stop all ceph-mds@.service instances at once.
[Thu May  4 21:32:58 2023] systemd[1]: Reached target ceph target allowing to start/stop all ceph-osd@.service instances at once.

Possible causes

# 1. When an OSD is created, lvm2-lvmetad.service and lvm2-lvmetad.socket must both be running; otherwise the LVM metadata on the disk may be lost after the machine reboots.

# 2. The most reliable fix is to start and enable lvm2-lvmetad.service, then reboot the machine:
    systemctl start lvm2-lvmetad.service
    systemctl enable lvm2-lvmetad.service

# 3. If lvm2-lvmetad was not running when the OSD was created, the OSD log will contain a WARNING.
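
Before creating an OSD, it may help to confirm the state of the two units named above. The sketch below wraps `systemctl is-active`; the function name `unit_states` is hypothetical. Note that newer LVM releases no longer ship lvmetad, in which case the units will not exist and will be reported as inactive or unknown.

```shell
#!/bin/sh
# Hypothetical helper: print "<unit> <state>" for each unit given.
# "systemctl is-active" exits non-zero for inactive units, hence "|| true".
unit_states() {
    for unit in "$@"; do
        state=$(systemctl is-active "$unit" 2>/dev/null) || true
        echo "$unit ${state:-unknown}"
    done
}
```

For the check described here, the call would be `unit_states lvm2-lvmetad.service lvm2-lvmetad.socket`.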

Solution

# Start the ceph-volume unit for the affected OSD (the unit name has the form lvm-<osd id>-<osd fsid>)
    systemctl start ceph-volume@lvm-1-fb045fd1-ce5b-4503-a37e-1c63061058ab.service

    # If that does not work, try:
    /usr/sbin/ceph-volume lvm trigger 1-fb045fd1-ce5b-4503-a37e-1c63061058ab
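
Both commands above take the same `<osd id>-<osd fsid>` string. The ID and fsid for each OSD on the node can be read from the output of `ceph-volume lvm list`, provided LVM can still see the volumes. As a small illustration of how the two names are put together, here is a sketch with two hypothetical helper functions (they are not part of ceph-volume):

```shell
#!/bin/sh
# Hypothetical helpers: build the argument and unit name from an OSD id and fsid.

# Argument for "ceph-volume lvm trigger": <osd id>-<osd fsid>
cv_trigger_arg() {
    echo "$1-$2"
}

# systemd unit name: ceph-volume@lvm-<osd id>-<osd fsid>.service
cv_unit_name() {
    echo "ceph-volume@lvm-$(cv_trigger_arg "$1" "$2").service"
}
```

With the values from this article, `cv_unit_name 1 fb045fd1-ce5b-4503-a37e-1c63061058ab` yields the unit name passed to `systemctl start` above.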
Notice: This is an original article by 辣条①号. When reposting, please keep this notice and the link to the article: https://boke.wsfnk.com/archives/1157.html

Last edited: 2023/7/11. Author: 辣条①号
