Preface
- In my lab environment a Ceph cluster is deployed alongside a multi-node OpenStack. After a reboot, the cluster's OSD services came back broken; the fix is documented below.
- If you are interested in the ceph + multi-node OpenStack deployment itself, see my other post: https://blog.csdn.net/CN_TangZheng/article/details/104745364
1: The error
```
[root@ct ~]# ceph -s    '//check the cluster status'
  cluster:
    id:     8c9d2d27-492b-48a4-beb6-7de453cf45d6
    health: HEALTH_WARN    '//health check reports WARN'
            1 osds down
            1 host (1 osds) down
            Reduced data availability: 192 pgs inactive
            Degraded data redundancy: 812/1218 objects degraded (66.667%), 116 pgs degraded, 192 pgs undersized
            clock skew detected on mon.c1, mon.c2
  services:
    mon: 3 daemons, quorum ct,c1,c2
    mgr: ct(active), standbys: c1, c2
    osd: 3 osds: 1 up, 2 in    '//two of the OSD services are down'
  data:
    pools:   3 pools, 192 pgs
    objects: 406 objects, 1.8 GiB
    usage:   2.8 GiB used, 1021 GiB / 1024 GiB avail
    pgs:     100.000% pgs not active
             812/1218 objects degraded (66.667%)
             116 undersized+degraded+peered
             76  undersized+peered

[root@ct ~]# ceph osd status    '//check OSD status: the OSDs on the two compute nodes are abnormal'
+----+------+-------+-------+--------+---------+--------+---------+----------------+
| id | host |  used | avail | wr ops | wr data | rd ops | rd data |     state      |
+----+------+-------+-------+--------+---------+--------+---------+----------------+
| 0  | ct   | 2837M | 1021G |    0   |    0    |    0   |    0    |   exists,up    |
| 1  |      |    0  |    0  |    0   |    0    |    0   |    0    |     exists     |
| 2  |      |    0  |    0  |    0   |    0    |    0   |    0    | autoout,exists |
+----+------+-------+-------+--------+---------+--------+---------+----------------+
```
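Before restarting anything, it helps to pin down exactly which OSDs are not up. A minimal sketch that filters a saved `ceph osd status` table for rows whose state column lacks `up` (the sample file `/tmp/osd_status.txt` is my own stand-in mirroring the table above; on a live cluster you would pipe `ceph osd status` straight into the awk command):

```shell
# Save the data rows of the osd status table (stand-in for live output).
cat > /tmp/osd_status.txt <<'EOF'
| 0  | ct   | 2837M | 1021G | 0 | 0 | 0 | 0 | exists,up      |
| 1  |      | 0     | 0     | 0 | 0 | 0 | 0 | exists         |
| 2  |      | 0     | 0     | 0 | 0 | 0 | 0 | autoout,exists |
EOF

# Print every OSD whose state column does not contain "up".
# $2 is the id column, $(NF-1) is the state column of the piped table.
awk -F'|' '$2 ~ /[0-9]/ && $(NF-1) !~ /up/ { gsub(/ /, "", $2); print "osd." $2 " is not up" }' /tmp/osd_status.txt
```

This prints `osd.1 is not up` and `osd.2 is not up` for the table above, matching what `ceph -s` reports.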
1.1: The fix
It turned out that neutron's Open vSwitch service was down
```
[root@ct ~]# source keystonerc_admin
[root@ct ~(keystone_admin)]# openstack network agent list    '//on inspection, the Open vSwitch and L3 agents are dead'
+--------------------------------------+----------------------+------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type           | Host | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+----------------------+------+-------------------+-------+-------+---------------------------+
| 12dd5b51-1344-4c29-8974-e5d8e0e65d2e | Open vSwitch agent   | c1   | None              | XXX   | UP    | neutron-openvswitch-agent |
| 20829a10-4a26-4317-8175-4534ac0b01e1 | Open vSwitch agent   | c2   | None              | XXX   | UP    | neutron-openvswitch-agent |
| 25c121ec-b761-4e7b-bfbf-9601993ebb54 | Metadata agent       | ct   | None              | :-)   | UP    | neutron-metadata-agent    |
| 47c878ee-93f0-4960-baa1-1cc92476ed2a | DHCP agent           | ct   | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 57647383-7106-46b6-971f-2398457e5179 | Loadbalancerv2 agent | ct   | None              | :-)   | UP    | neutron-lbaasv2-agent     |
| 92d49052-0b4f-467c-a92c-1743d891043f | Metering agent       | ct   | None              | :-)   | UP    | neutron-metering-agent    |
| c2f7791c-96ed-472b-abda-509a3ff125b5 | L3 agent             | ct   | nova              | XXX   | UP    | neutron-l3-agent          |
| e48269d8-e4f1-424b-bc3e-4c0d13757e8a | Open vSwitch agent   | ct   | None              | :-)   | UP    | neutron-openvswitch-agent |
+--------------------------------------+----------------------+------+-------------------+-------+-------+---------------------------+
```
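In a larger cloud, eyeballing that table gets tedious. A quick sketch that extracts only the dead agents (rows whose Alive column shows `XXX`) from a saved listing; `/tmp/agents.txt` is my own abbreviated stand-in for the table above, and on a live cloud you would pipe the `openstack network agent list` output in instead:

```shell
# Abbreviated stand-in for the agent listing (id | type | host | zone | alive | state | binary).
cat > /tmp/agents.txt <<'EOF'
| 12dd5b51 | Open vSwitch agent | c1 | None | XXX | UP | neutron-openvswitch-agent |
| 25c121ec | Metadata agent     | ct | None | :-) | UP | neutron-metadata-agent    |
| c2f7791c | L3 agent           | ct | nova | XXX | UP | neutron-l3-agent          |
EOF

# $6 is the Alive column, $8 the Binary, $4 the Host.
awk -F'|' '$6 ~ /XXX/ { gsub(/ /, "", $4); gsub(/ /, "", $8); print $8 " on " $4 }' /tmp/agents.txt
```

For the sample rows this prints `neutron-openvswitch-agent on c1` and `neutron-l3-agent on ct`, i.e. exactly the services restarted in the next steps.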
On the controller node, bring the L3 agent back up:
```
[root@ct ~(keystone_admin)]# systemctl start neutron-l3-agent
```
On the compute nodes, restart the Open vSwitch agent:
```
[root@c1 ceph]# systemctl restart neutron-openvswitch-agent
[root@c2 ceph]# systemctl restart neutron-openvswitch-agent
```
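With more than two nodes, the per-node restarts can be generated from the dead-agent list instead of typed one by one. A dry-run sketch (the file `/tmp/dead_agents.txt` and the ssh-based command layout are my own assumptions; the commands are only printed here, not executed):

```shell
# One "service host" pair per line, as found in the agent listing.
cat > /tmp/dead_agents.txt <<'EOF'
neutron-openvswitch-agent c1
neutron-openvswitch-agent c2
neutron-l3-agent ct
EOF

# Dry run: print the restart command for each dead agent on its host.
while read -r svc host; do
  echo "ssh $host systemctl restart $svc"
done < /tmp/dead_agents.txt
```

Dropping the `echo` (and assuming key-based ssh between the nodes) would actually run the restarts.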
After the restarts, run `openstack network agent list` again to confirm that every agent is alive.
Then log in to the compute nodes and restart the OSDs:
```
[root@c1 ceph]# systemctl restart ceph-osd.target
[root@c2 ceph]# systemctl restart ceph-osd.target
[root@c1 ceph]# systemctl restart ceph-mgr.target
[root@c2 ceph]# systemctl restart ceph-mgr.target
```
After restarting the OSD services, check the cluster again with `ceph -s`; if the mgr service on a compute node is not running either, it needs a restart as well (hence the `ceph-mgr.target` lines).
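Recovery is not instant: after the restarts the cluster may backfill for a while before `ceph -s` reports `HEALTH_OK`. A small polling sketch; `status_cmd` is a stub of my own standing in for `ceph -s` so the snippet is self-contained, and the retry count and sleep interval are arbitrary:

```shell
# Stub standing in for `ceph -s`; on the real node, replace its body with: ceph -s
status_cmd() { echo "health: HEALTH_OK"; }

# Poll until the status output contains HEALTH_OK, giving up after 10 tries.
tries=0
until status_cmd | grep -q HEALTH_OK; do
  tries=$((tries + 1))
  if [ "$tries" -ge 10 ]; then
    echo "cluster still not healthy"
    break
  fi
  sleep 5
done
status_cmd | grep -q HEALTH_OK && echo "cluster healthy"
```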
1.2: Re-checking: the problem is resolved
```
[root@ct ~(keystone_admin)]# ceph -s
  cluster:
    id:     8c9d2d27-492b-48a4-beb6-7de453cf45d6
    health: HEALTH_OK    '//health check is OK'
  services:    '//the services below are all back to normal'
    mon: 3 daemons, quorum ct,c1,c2
    mgr: ct(active), standbys: c2, c1
    osd: 3 osds: 3 up, 3 in
  data:
    pools:   3 pools, 192 pgs
    objects: 406 objects, 1.8 GiB
    usage:   8.3 GiB used, 3.0 TiB / 3.0 TiB avail
    pgs:     192 active+clean
  io:
    client: 1.5 KiB/s rd, 1 op/s rd, 0 op/s wr

[root@ct ~(keystone_admin)]# ceph osd status    '//the OSD states are fine too'
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| id | host |  used | avail | wr ops | wr data | rd ops | rd data |   state   |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | ct   | 2837M | 1021G |    0   |    0    |    0   |    0    | exists,up |
| 1  | c1   | 2837M | 1021G |    0   |    0    |    0   |    0    | exists,up |
| 2  | c2   | 2837M | 1021G |    0   |    0    |    0   |    0    | exists,up |
+----+------+-------+-------+--------+---------+--------+---------+-----------+
```