How to handle common Ceph WARN alerts

Appendix: IBM article on Ceph troubleshooting

https://www.ibm.com/docs/en/storage-ceph/6?topic=storage-setting-crush-location-daemons

1. Category: Too many repaired reads

Fix: restart the affected OSD (a sketch follows)
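A minimal sketch, assuming a package-based deployment where each OSD runs as a systemd unit named ceph-osd@<id>; the OSD id comes from the OSD_TOO_MANY_REPAIRS line in ceph health detail, and clear_shards_repaired only exists on recent releases:
    # identify the OSD reporting too many repaired reads
    ceph health detail | grep -A 2 OSD_TOO_MANY_REPAIRS
    # restart it on the host that owns it
    systemctl restart ceph-osd@<id>
    # on recent releases the repaired-reads counter can also be cleared explicitly
    ceph tell osd.<id> clear_shards_repaired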

2. Category: recently crashed


# View the details of the new crash reports
[root@node14 ~]# ceph crash ls-new 
ID                                                                ENTITY  NEW  
2023-01-12T09:23:41.905887Z_45bae4d7-2197-4050-bd28-b3b371650af8  osd.62   *

# Fix: archive the crash reports
    #ceph crash archive <crash-id>        # archive crash reports one at a time
    ceph crash archive-all            # recommended: archive all of them so the warning stops showing
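Before archiving, the full report for a given crash can be inspected; the ID below is the one from the ceph crash ls-new output above:
    ceph crash info 2023-01-12T09:23:41.905887Z_45bae4d7-2197-4050-bd28-b3b371650af8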

3. Category: allowing insecure global_id reclaim


# Fix: mute the health warning
root@ceph01:~# ceph health mute AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED

4. Category: client is using insecure global_id reclaim

Problem: if msgr v2 is not enabled, the monitors allow clients to authenticate with insecure global_id reclaim (mons are allowing insecure global_id reclaim).

# Fix: run on any cluster node
ceph mon enable-msgr2
ceph config set mon auth_allow_insecure_global_id_reclaim false
ceph config set mon auth_expose_insecure_global_id_reclaim false        # reportedly not in the official docs; if the option does not exist, skip this line - it simply will not show up in Ceph's config database
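To confirm the changes took effect, a quick check (the second command just shows whether the mons advertise msgr v2 addresses):
    # should return false
    ceph config get mon auth_allow_insecure_global_id_reclaim
    # the mon addresses should include v2: endpoints
    ceph mon dump | grep v2: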

5. Category: pgs not deep-scrubbed in time

Symptoms and details of the alert
[root@node14 ~]# ceph -s
  cluster:
    id:     3a9af753-3f48-43d8-b0e3-b6a3189f41e7
    health: HEALTH_WARN
            3 pgs not deep-scrubbed in time

[root@node14 ~]# ceph health detail 
HEALTH_WARN 3 pgs not deep-scrubbed in time
[WRN] PG_NOT_DEEP_SCRUBBED: 3 pgs not deep-scrubbed in time
    pg 5.316 not deep-scrubbed since 2023-03-28T10:31:44.693734-0700
    pg 5.d not deep-scrubbed since 2023-03-28T12:26:42.875821-0700
    pg 5.125 not deep-scrubbed since 2023-03-28T12:35:33.828058-0700

Cause of the alert
    1. Some PGs have not been deep-scrubbed within the expected window; manually deep-scrubbing the affected PGs is enough to clear the warning

# Tuning idea: adjust the [osd] configuration (values in seconds) - raise the minimum scrub interval (1 day -> 2 days), the maximum scrub interval (7 days -> 14 days) and the deep-scrub interval (7 days -> 21 days); a config-database variant is sketched after this block
    osd_scrub_min_interval = 172800
    osd_scrub_max_interval = 1209600
    osd_deep_scrub_interval = 1814400
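On Nautilus and later the same values can also be set in the centralized config database instead of editing ceph.conf on every node (a sketch using the values above):
    ceph config set osd osd_scrub_min_interval 172800       # 2 days
    ceph config set osd osd_scrub_max_interval 1209600      # 14 days
    ceph config set osd osd_deep_scrub_interval 1814400     # 21 days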

Approach 1: deep-scrub the problem PGs one by one
    ceph pg deep-scrub 5.316
    ceph pg deep-scrub 5.d 
    ceph pg deep-scrub 5.125

Approach 2: if there are a lot of PGs to handle, do it like this
    # Manually trigger deep scrubs
    ceph health  detail |grep deep-scrubbed | awk '$1=="pg" {print $2}' | while read -r pg; do ceph pg deep-scrub "$pg"; sleep 10; done

    # Manually trigger (shallow) scrubs
    ceph health  detail |grep "not scrubbed" | awk '$1=="pg" {print $2}' | while read -r pg; do ceph pg scrub "$pg"; sleep 10; done

Approach 3: if even a manual scrub will not run, consider moving the problem PG off its primary OSD to another OSD on the same host first, then run the scrub
    # This script was used with Ceph 15 and Ceph 16 as deployed by PVE 7
    ceph health  detail |grep deep-scrubbed | awk '$1=="pg" {print $2}' | while read -r pg
    do
    # Note: this script does not check whether OldOsd and NewOsd have the same device class
    # pg=5.701
        OldOsd=`ceph pg map "$pg" | awk -F' ' '{print $8}' | sed 's/[][]//g' | awk -F',' '{print $1}'`
        Host=`ceph osd find $OldOsd | grep host | sed 's/[ ",]//g' | uniq | awk -F':' '{print $2}'`
        NewOsd=`ceph osd df tree "$Host" | grep  osd. | awk -F' ' '$17 != "0"' | grep -v "osd.$OldOsd" | sort -nk17 | head -n 1 | awk '{print $1}'`
        echo "ceph osd pg-upmap-items $pg osd.$OldOsd osd.$NewOsd"        # prints the upmap command for review; run it manually after checking
    done

Approach 4: increase the pool's pg_num and pgp_num so each individual PG takes less time to scrub and deep-scrub (verified to work in practice); a sketch follows
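A minimal sketch for Approach 4; <pool> is a placeholder for the affected pool's name and 512 is only an example target, not a recommendation - pick a value that fits your OSD count and autoscaler policy:
    # check the current values first
    ceph osd pool get <pool> pg_num
    ceph osd pool set <pool> pg_num 512
    # on Nautilus and later pgp_num follows pg_num automatically;
    # on older releases set it explicitly to the same value
    ceph osd pool set <pool> pgp_num 512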

6. Category: mons down


Fix: (more than half of the mons must remain alive for the cluster to stay usable)
    Option 1: start the mon that is down

    Option 2: if the old mon has failed for good and cannot be brought back online, remove it and add a new one (see the sketch after the example)
root@cunchu4:~# ceph mon stat
e4: 4 mons at {cunchu1=[v2:192.168.19.8:3300/0,v1:192.168.19.8:6789/0],cunchu2=[v2:192.168.19.9:3300/0,v1:192.168.19.9:6789/0],cunchu3=[v2:192.168.19.10:3300/0,v1:192.168.19.10:6789/0],cunchu4=[v2:192.168.19.11:3300/0,v1:192.168.19.11:6789/0]}, election epoch 84, leader 1 cunchu2, quorum 1,2,3 cunchu2,cunchu3,cunchu4

root@cunchu4:~# ceph mon remove cunchu1
removing mon.cunchu1 at [v2:192.168.19.8:3300/0,v1:192.168.19.8:6789/0], there will be 3 monitors
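After the dead mon is removed, add a replacement. The exact command depends on how the cluster was deployed; two common cases as a sketch (host name and IP are placeholders):
    # Proxmox VE managed Ceph - run on the node that should host the new mon
    pveceph mon create
    # cephadm managed cluster
    ceph orch daemon add mon <newhost>:<ip>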

7. Category: nearfull osd / backfillfull osd / pool nearfull

# Alert symptoms:
HEALTH_WARN 1 nearfull osd(s); Degraded data redundancy: 1 pg undersized; 1 pool(s) nearfull
[WRN] OSD_NEARFULL: 1 nearfull osd(s)
    osd.2 is near full
[WRN] PG_DEGRADED: Degraded data redundancy: 1 pg undersized
    pg 12.51 is stuck undersized for 3m, current state active+recovering+undersized+remapped, last acting [11,0]
[WRN] POOL_NEARFULL: 1 pool(s) nearfull
    pool 'volumes' is nearfull

# About the alert:
    When pool nearfull appears, osd nearfull is usually present too: by default a pool only becomes nearfull after one of its OSDs does

# Temporary workarounds (a cluster-wide alternative is sketched after this list)
    Option 1: (Ceph 16 and earlier)
        # Raise the ratios on a single OSD (the defaults are 0.95, 0.9 and 0.85)
        # If there are many OSDs, set the maintenance flags first, adjust the ratios, then unset the flags afterwards
            #ceph osd set noout
            #ceph osd set noscrub
            #ceph osd set nodeep-scrub
        ceph tell osd.11 injectargs '--mon_osd_full_ratio 0.97 --mon_osd_backfillfull_ratio 0.95 --mon_osd_nearfull_ratio 0.9'        # takes effect immediately

        # Or change every OSD on the local node in one go
        for osd in $(ceph osd ls-tree $HOSTNAME); do ceph tell osd.$osd injectargs '--mon_osd_full_ratio 0.97 --mon_osd_backfillfull_ratio 0.95 --mon_osd_nearfull_ratio 0.9'; done 

    Option 2: (Ceph 16 and earlier)
        # Same as Option 1, except the OSD must be restarted for it to take effect
        ceph daemon osd.11 config set mon_osd_full_ratio 0.97
        ceph daemon osd.11 config set mon_osd_backfillfull_ratio 0.95
        ceph daemon osd.11 config set mon_osd_nearfull_ratio 0.9
        systemctl restart ceph-osd@11

    Option 3: (all Ceph versions) - recommended, since the imbalance is often caused by OSDs of unequal capacity
        # Lower the reweight of the alerting OSD (or raise the reweight of the other OSDs) so its data migrates elsewhere
        ceph osd reweight osd.11 0.9        # every OSD's reweight defaults to 1
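As an alternative to injecting the ratios into individual OSDs (Options 1 and 2), on Luminous and later the thresholds can be raised cluster-wide in the OSD map. This only buys time - lower them back to the defaults after cleaning up or expanding:
    ceph osd set-nearfull-ratio 0.9
    ceph osd set-backfillfull-ratio 0.92
    ceph osd set-full-ratio 0.97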

# Permanent fix: add capacity

8. Category: OSD(s) experiencing BlueFS spillover

(DB partition spillover - typically seen when an SSD holds the DB and an HDD holds the data)

Alert symptoms
[root@node14 ~]# ceph -s
  cluster:
    id:     3a9af753-3f48-43d8-b0e3-b6a3189f41e7
    health: HEALTH_WARN
            6 OSD(s) experiencing BlueFS spillover

[root@node14 ~]# ceph health detail 
HEALTH_WARN 6 OSD(s) experiencing BlueFS spillover
[WRN] BLUEFS_SPILLOVER: 6 OSD(s) experiencing BlueFS spillover
     osd.7 spilled over 1.0 GiB metadata from 'db' device (8.0 GiB used of 30 GiB) to slow device
     osd.17 spilled over 1.6 GiB metadata from 'db' device (7.0 GiB used of 30 GiB) to slow device
     osd.59 spilled over 1.3 GiB metadata from 'db' device (7.9 GiB used of 30 GiB) to slow device
     osd.65 spilled over 1.4 GiB metadata from 'db' device (7.8 GiB used of 30 GiB) to slow device
     osd.71 spilled over 972 MiB metadata from 'db' device (7.9 GiB used of 30 GiB) to slow device
     osd.77 spilled over 1.2 GiB metadata from 'db' device (8.9 GiB used of 30 GiB) to slow device

Check on the host of the affected OSD (actual WAL and DB usage)
root@lahost001:/var/lib/ceph/osd# ceph daemon osd.7 perf dump |grep bluefs -A 10
    "bluefs": {
        "gift_bytes": 0,
        "reclaim_bytes": 0,
        "db_total_bytes": 32212246528,
        "db_used_bytes": 8571052032,
        "wal_total_bytes": 0,
        "wal_used_bytes": 0,
        "slow_total_bytes": 160031375360,
        "slow_used_bytes": 1121583104,        #注意这里
        "num_files": 195,
        "log_bytes": 6475776,
--
        "bluefs_bytes": 78768,
        "bluefs_items": 2266,
        "bluefs_file_reader_bytes": 25757952,
        "bluefs_file_reader_items": 374,
        "bluefs_file_writer_bytes": 896,
        "bluefs_file_writer_items": 4,
        "buffer_anon_bytes": 3670925,
        "buffer_anon_items": 5855,
        "buffer_meta_bytes": 978824,
        "buffer_meta_items": 11123,
        "osd_bytes": 1629936,
        "osd_items": 126,
        "osd_mapbl_bytes": 0,
        "osd_mapbl_items": 0,
        "osd_pglog_bytes": 1990084680,
        "osd_pglog_items": 3989621,

How to handle the alert (compaction)

Compaction command:
ceph daemon osd.{id} compact        # compaction as a temporary fix (this command takes quite a while to run)

Check the result after compaction
root@lahost001:/var/lib/ceph/osd# ceph daemon osd.7 compact
{
    "elapsed_time": 456.12800944000003
}
root@lahost001:/var/lib/ceph/osd# ceph daemon osd.7 perf dump |grep bluefs -A 10
    "bluefs": {
        "gift_bytes": 0,
        "reclaim_bytes": 0,
        "db_total_bytes": 32212246528,
        "db_used_bytes": 4235190272,
        "wal_total_bytes": 0,
        "wal_used_bytes": 0,
        "slow_total_bytes": 160031375360,
        "slow_used_bytes": 0,
        "num_files": 80,
        "log_bytes": 10592256,
--
        "bluefs_bytes": 36680,
        "bluefs_items": 1327,
        "bluefs_file_reader_bytes": 7851648,
        "bluefs_file_reader_items": 154,
        "bluefs_file_writer_bytes": 896,
        "bluefs_file_writer_items": 4,
        "buffer_anon_bytes": 692095,
        "buffer_anon_items": 6970,
        "buffer_meta_bytes": 1295712,
        "buffer_meta_items": 14724,
        "osd_bytes": 1629936,
        "osd_items": 126,
        "osd_mapbl_bytes": 0,
        "osd_mapbl_items": 0,
        "osd_pglog_bytes": 1989875000,
        "osd_pglog_items": 3989208,

Note: compaction shrinks the data inside the DB partition, so the error may disappear, but it will not necessarily help if the spillover is large or the available DB capacity is simply too small. To stop hitting this problem for good, there are two fundamental solutions:

Reference: https://www.cnweed.com/archives/4328/

1: Size the DB partition appropriately (see the link below)
    https://yourcmc.ru/wiki/Ceph_performance#About_block.db_sizing

2: Migrate the DB to a larger partition (a sketch follows)
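A minimal sketch for option 2, assuming the DB sits on an LVM logical volume with free space left in its volume group, a non-containerized OSD (osd.7 as in the example above), and placeholder VG/LV names - stop the OSD first and verify the paths for your own deployment:
    systemctl stop ceph-osd@7
    # grow the block.db logical volume (VG/LV names are placeholders)
    lvextend -L +30G /dev/<vg>/<db-lv>
    # let BlueFS pick up the new device size
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-7
    systemctl start ceph-osd@7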

9. Category: mon store is getting too big

Cause: the monitor's key/value store (LevelDB/RocksDB) has grown too large (similar to the DB spillover above)

Fix: compact the mon store
    ceph tell mon.pve-ceph01 compact
    #ceph daemon mon.pve-ceph01 compact
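To see how large the store actually is before and after compaction (path assumes a package-based mon whose id is the short hostname - adjust for your node):
    du -sh /var/lib/ceph/mon/ceph-$(hostname -s)/store.db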

10. Category: OSD SLOW_OPS

1. Cause: likely a performance problem with the HDDs

2. Alert details:
root@hkhdd001:~# ceph health detail 
HEALTH_WARN 1 slow ops, oldest one blocked for 43 sec, daemons [osd.135,osd.145,osd.150,osd.159] have slow ops.
[WRN] SLOW_OPS: 1 slow ops, oldest one blocked for 43 sec, daemons [osd.135,osd.145,osd.150,osd.159] have slow ops.

3. Things to try: replace the disks with better ones, or raise the complaint threshold (note that osd_op_complaint_time is specified in seconds, default 30, so the value must be larger than that to actually raise the threshold)
root@fuse01:~# cat /etc/ceph/ceph.conf
[global]
     osd_mon_heartbeat_slop = 60ms
     osd_op_complaint_time = 60ms

4. Restart the mon daemons
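Before choosing one of the options above, it helps to see which operations are actually stuck; a sketch using the admin socket on the host of one of the OSDs named in the warning (osd.135 is taken from the health output above):
    # operations currently in flight on this OSD
    ceph daemon osd.135 dump_ops_in_flight
    # recently completed operations that exceeded the complaint time
    ceph daemon osd.135 dump_historic_slow_ops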

11. Category: MON_DISK_LOW

MON_DISK_LOW: mon pve-ceph01 is low on available space
Most likely the disk on the node hosting this mon is running out of space; freeing some space is usually enough to clear the warning.
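A quick check and one example cleanup, assuming a package-based mon whose data lives under /var/lib/ceph/mon; old journald logs are just one common space hog:
    # how full is the filesystem holding the mon data?
    df -h /var/lib/ceph/mon
    # example cleanup: trim old journal logs (adjust the retention to taste)
    journalctl --vacuum-time=7d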

12. Category: no active mgr

When the clocks in the cluster are out of sync, a "no active mgr" alert may appear.
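A minimal sketch, assuming a non-containerized mgr running as a systemd unit whose id is the short hostname - check the clocks first, then restart the mgr on its host:
    timedatectl status
    systemctl restart ceph-mgr@$(hostname -s)
    ceph -s        # confirm an active mgr is back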

13. Category: osd full (the best fix is to add capacity)


If capacity cannot be added right away, adjust the ratios for the affected OSD as a stopgap
ceph tell osd.4 injectargs '--mon_osd_full_ratio 0.95 --mon_osd_nearfull_ratio 0.95 --mon_osd_backfillfull_ratio 0.95'
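Before touching the ratios, confirm which OSDs are actually full and how much headroom the cluster has:
    ceph df               # per-pool and raw usage
    ceph osd df tree      # per-OSD utilisation (%USE column)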

14. Category: get_monmap_and_config failed to get config

# Symptoms
[root@node59 ~]# ceph -s
2023-07-25T21:48:52.756+0800 7fe35a764700 -1 monclient: get_monmap_and_config failed to get config
2023-07-25T21:48:55.762+0800 7fe35a764700 -1 monclient: get_monmap_and_config failed to get config

# mon log
root@hkhost002:~# cat /var/log/ceph/ceph-mon.hkhost002.log |grep error
2023-07-26T03:23:10.137+0800 7f2a74c9b700  4 rocksdb: [db_impl/db_impl_files.cc:275] [JOB 315799] Tried to delete a non-existing file /var/lib/ceph/mon/ceph-hkhost002/store.db/5420015.log type=0 #5420015 -- IO error: No such file or directorywhile unlink() file: /var/lib/ceph/mon/ceph-hkhost002/store.db/5420015.log: No such file or directory
2023-07-26T05:17:42.782+0800 7f1d6c072700  4 rocksdb:                         Options.error_if_exists: 0

2023-07-26T03:22:38.963+0800 7f2a7549c700  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1690312958966273, "job": 315798, "event": "table_file_deletion", "file_number": 5420008}
mon.hkhost002@1(peon).paxos(paxos updating c 80487920..80488435) lease_expire from mon.0 v2:10.1.3.11:3300/0 is 27049.798828125s seconds in the past; mons are probably laggy (or possibly clocks are too skewed)

# Analysis: very likely a time-sync problem between the mons; the clocks on the mon nodes being out of step triggered the failure

# Fix: bring the mon nodes' clocks back into sync, then restart the mon service (a sketch follows)
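A minimal sketch, assuming chrony as the time daemon and a non-containerized mon whose id is the short hostname; with ntpd or a containerized deployment the commands differ:
    # check offsets against the configured time sources
    chronyc sources -v
    # force an immediate correction if the clock has drifted far
    chronyc makestep
    # restart the local mon
    systemctl restart ceph-mon@$(hostname -s)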

15. Category: pg stuck undersized, current state active+undersized+degraded

root@QS0203:~# ceph health  detail 
HEALTH_WARN Degraded data redundancy: 6372/86315340 objects degraded (0.007%), 4 pgs degraded, 5 pgs undersized
[WRN] PG_DEGRADED: Degraded data redundancy: 6372/86315340 objects degraded (0.007%), 4 pgs degraded, 5 pgs undersized
    pg 2.3bf is stuck undersized for 16h, current state active+undersized+degraded, last acting [122,200]       
    pg 2.3d7 is stuck undersized for 20h, current state active+undersized+degraded, last acting [121,283]
    pg 2.506 is stuck undersized for 16h, current state active+undersized+degraded, last acting [115,267]
    pg 2.68d is stuck undersized for 16h, current state active+undersized+remapped, last acting [236,229]
    pg 2.783 is stuck undersized for 17h, current state active+undersized+degraded, last acting [234,206]

Things to try (can be worked through in order; a sketch for locating the primary OSD follows the list):
    1. Find the PG's primary OSD, restart it and watch whether the PG recovers:    systemctl restart ceph-osd@122
    2. Lower the primary OSD's reweight to trigger data migration, then check again later (with many OSDs the effect is hard to notice):    ceph osd reweight osd.122 0.8
    3. Mark the primary OSD out to trigger migration, check again later, and mark it in again once the PG recovers (verified to work):    ceph osd out 122

    4. Adding a new OSD may let the PG recover on its own (not verified)
    5. If a new OSD cannot be added for now, consider rebuilding an existing OSD to force the PG mapping to be recalculated (not recommended - it takes too long)
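To find the primary OSD of a stuck PG for steps 1-3, ceph pg map can be used; the primary is the first OSD in the acting set (pg 2.3bf and osd.122 are taken from the output above):
    ceph pg map 2.3bf                      # the acting set is printed last; its first entry is the primary
    systemctl restart ceph-osd@122         # run on the host that owns osd.122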

16. Category: PG_BACKFILL_FULL, Low space hindering backfill

[WRN] PG_BACKFILL_FULL: Low space hindering backfill (add storage if this doesn't resolve itself): 10 pgs backfill_toofull
    pg 2.5 is active+remapped+backfill_wait+backfill_toofull, acting [123,243,233]
    pg 2.c0 is active+remapped+backfill_wait+backfill_toofull, acting [99,226,204]
    pg 2.1ef is active+remapped+backfill_wait+backfill_toofull, acting [122,228,200]
    pg 2.424 is active+remapped+backfill_wait+backfill_toofull, acting [234,228,203]
    pg 2.528 is active+remapped+backfill_wait+backfill_toofull, acting [103,1,196]
    pg 2.540 is active+remapped+backfill_wait+backfill_toofull, acting [105,130,197]
    pg 2.649 is active+remapped+backfill_wait+backfill_toofull, acting [123,201,114]

Thing to try: lower the reweight of the primary OSD (a sketch for finding the primary of each affected PG follows)
    ceph osd reweight osd.123 0.8
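The primary of each backfill_toofull PG is the first OSD of its acting set shown above (e.g. osd.123 for pg 2.5). To handle all of them, the same health-detail parsing used in section 5 can be reused; this sketch only prints the reweight commands so they can be reviewed before running:
    ceph health detail | grep backfill_toofull | awk '$1=="pg" {print $2}' | while read -r pg; do
        # field 10 of "ceph pg map" output is the bracketed acting set; take its first OSD
        primary=$(ceph pg map "$pg" | awk '{print $10}' | sed 's/[][]//g' | awk -F',' '{print $1}')
        echo "ceph osd reweight osd.$primary 0.8"
    done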