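Reference:
"openstack hacluster apache2 service not running, wrong ssl cert name"
After finishing the high-availability deployment in Section 10, I found I could not log in to openstack-dashboard. It showed the message "An error occurred authenticating. Please try again later."
Check the keystone status: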
juju status keystone
Model      Controller       Cloud/Region    Version  SLA          Timestamp
openstack  maas-controller  mymaas/default  2.8.7    unsupported  15:54:28+08:00

App                    Version  Status  Scale  Charm         Store       Rev  OS      Notes
keystone               18.0.0   active      3  keystone      local         0  ubuntu
keystone-hacluster              active      3  hacluster     jujucharms   74  ubuntu
keystone-mysql-router  8.0.23   active      3  mysql-router  local         0  ubuntu

Unit                        Workload  Agent  Machine  Public address  Ports     Message
keystone/0*                 active    idle   0/lxd/2  10.0.2.101      5000/tcp  Unit is ready
  keystone-hacluster/0*     active    idle            10.0.2.101                Unit is ready and clustered
  keystone-mysql-router/0*  active    idle            10.0.2.101                Unit is ready
keystone/1                  active    idle   1/lxd/7  10.0.2.117      5000/tcp  Unit is ready
  keystone-hacluster/1      active    idle            10.0.2.117                Unit is ready and clustered
  keystone-mysql-router/1   active    idle            10.0.2.117                Unit is ready
keystone/2                  active    idle   2/lxd/7  10.0.2.118      5000/tcp  Unit is ready
  keystone-hacluster/2      active    idle            10.0.2.118                Unit is ready and clustered
  keystone-mysql-router/2   active    idle            10.0.2.118                Unit is ready

Machine  State    DNS         Inst id              Series  AZ       Message
0        started  10.0.0.159  node4                focal   default  Deployed
0/lxd/2  started  10.0.2.101  juju-1584e6-0-lxd-2  focal   default  Container started
1        started  10.0.0.156  node2                focal   default  Deployed
1/lxd/7  started  10.0.2.117  juju-1584e6-1-lxd-7  focal   default  Container started
2        started  10.0.0.157  node1                focal   default  Deployed
2/lxd/7  started  10.0.2.118  juju-1584e6-2-lxd-7  focal   default  Container started
List the images:
openstack image list
Certificate did not match expected hostname: 10.0.7.12. Certificate: {'subject': ((('commonName', 'juju-1584e6-0-lxd-2.maas'),),), 'subjectAltName': [('DNS', 'juju-1584e6-0-lxd-2.maas'), ('IP Address', '10.0.2.101')]}
Failed to discover available identity versions when contacting https://10.0.7.12:5000/v3. Attempting to parse version from URL.
+--------------------------------------+-------+--------+
| ID | Name | Status |
+--------------------------------------+-------+--------+
| dfaeaebc-64d2-4996-96be-6475b6d06e17 | focal | active |
+--------------------------------------+-------+--------+
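The error above boils down to this: the client contacted 10.0.7.12, but the certificate only lists juju-1584e6-0-lxd-2.maas and 10.0.2.101. Here is a minimal sketch of that comparison, assuming the certificate is available as a dict shaped like the one printed in the error (the shape Python's ssl.SSLSocket.getpeercert() returns). The function name cert_covers is made up for illustration, and the check is a plain exact match that ignores wildcard DNS names.

```python
import ipaddress


def cert_covers(cert, host):
    """Return True if host appears in the certificate's subjectAltName.

    cert is a dict shaped like the one in the error message above.
    Wildcard DNS names are not handled; this is an exact-match check only.
    """
    try:
        # Treat host as an IP literal when it parses as one.
        wanted_ip = ipaddress.ip_address(host)
    except ValueError:
        wanted_ip = None

    for kind, value in cert.get("subjectAltName", ()):
        if wanted_ip is not None and kind == "IP Address":
            try:
                if ipaddress.ip_address(value) == wanted_ip:
                    return True
            except ValueError:
                continue  # skip entries that are not valid addresses
        elif wanted_ip is None and kind == "DNS":
            if value.lower() == host.lower():
                return True
    return False


# Certificate dict copied from the error output above.
cert = {
    "subject": ((("commonName", "juju-1584e6-0-lxd-2.maas"),),),
    "subjectAltName": [
        ("DNS", "juju-1584e6-0-lxd-2.maas"),
        ("IP Address", "10.0.2.101"),
    ],
}

print(cert_covers(cert, "10.0.7.12"))   # address the client contacted: False
print(cert_covers(cert, "10.0.2.101"))  # the unit's own address: True
```

So the certificate served on port 5000 belongs to the unit itself and says nothing about 10.0.7.12, which is why the client complains before falling back to parsing the version from the URL.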
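At first I thought it was a VIP problem, because I figured the traffic probably has to go through the VIP address.
But while reproducing the failure, the "apache unavailable" message I had seen before showed up again,
so I suspect this is the same bug as the earlier apache2-unavailable problem.
According to the community's debugging notes, the cause is a problem with the SSL chain: after openstack dashboard is scaled out to HA with a VIP, the container needs to use the ca-cert, but the certificate under the VIP directory is empty.
The fix is to run juju run-action --wait vault/0 reissue-certificates or juju run-action --wait vault/leader reissue-certificates to hand the certificates over again.
Reissuing the certificates fixes roughly 90% of these failures. This time I happened to land in the unlucky 10%.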
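To tell which case you are in after a reissue, it helps to look at whether the certificate files on the unit are still empty or missing. The following is only a sketch under assumptions: the helper name find_bad_cert_files is made up, and the default directory /etc/apache2/ssl/horizon is the one from the transcript in this post (other charms are assumed to use their own subdirectory).

```python
import os


def find_bad_cert_files(ssl_dir):
    """List cert_*/key_* entries in ssl_dir that are empty or dangling symlinks."""
    bad = []
    for name in sorted(os.listdir(ssl_dir)):
        if not name.startswith(("cert_", "key_")):
            continue
        path = os.path.join(ssl_dir, name)
        # os.path.exists follows symlinks, so a dangling link reports False.
        if not os.path.exists(path) or os.path.getsize(path) == 0:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # Assumed path, taken from the transcript in this post; adjust per unit.
    ssl_dir = "/etc/apache2/ssl/horizon"
    if os.path.isdir(ssl_dir):
        for name in find_bad_cert_files(ssl_dir):
            print("empty or dangling:", name)
```

This only inspects file sizes and link targets, so it does not need read access to the key files themselves.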
The second option is to put the certificates in place by hand. The commands are roughly as follows:
sudo ln -s /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas cert_10.10.20.201
sudo: setrlimit(RLIMIT_CORE): Operation not permitted
ubuntu@juju-70b05d-3-lxd-10:/etc/apache2/ssl/horizon$ ls -la
total 28
dr-xr-xr-x 2 root root 4096 Jul 21 19:23 .
drwxr-xr-x 3 root root 4096 Jul 21 18:18 ..
lrwxrwxrwx 1 root root 60 Jul 21 19:23 cert_10.10.20.201 -> /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 60 Jul 21 18:20 cert_10.10.40.126 -> /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 60 Jul 21 18:20 cert_172.16.1.247 -> /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas
-rw-r----- 1 root root 3175 Jul 21 18:20 cert_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 59 Jul 21 18:20 key_10.10.40.126 -> /etc/apache2/ssl/horizon/key_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 59 Jul 21 18:20 key_172.16.1.247 -> /etc/apache2/ssl/horizon/key_eth2.juju-70b05d-3-lxd-10.maas
-rw-r----- 1 root root 1678 Jul 21 18:20 key_eth2.juju-70b05d-3-lxd-10.maas
ubuntu@juju-70b05d-3-lxd-10:/etc/apache2/ssl/horizon$ sudo ln -s /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas cert_10.10.40.201
sudo: setrlimit(RLIMIT_CORE): Operation not permitted
ubuntu@juju-70b05d-3-lxd-10:/etc/apache2/ssl/horizon$ sudo ln -s /etc/apache2/ssl/horizon/key_eth2.juju-70b05d-3-lxd-10.maas key_10.10.40.201
sudo: setrlimit(RLIMIT_CORE): Operation not permitted
ubuntu@juju-70b05d-3-lxd-10:/etc/apache2/ssl/horizon$ sudo ln -s /etc/apache2/ssl/horizon/key_eth2.juju-70b05d-3-lxd-10.maas key_10.10.20.201
sudo: setrlimit(RLIMIT_CORE): Operation not permitted
ubuntu@juju-70b05d-3-lxd-10:/etc/apache2/ssl/horizon$ ls -la
total 32
dr-xr-xr-x 2 root root 4096 Jul 21 19:24 .
drwxr-xr-x 3 root root 4096 Jul 21 18:18 ..
lrwxrwxrwx 1 root root 60 Jul 21 19:23 cert_10.10.20.201 -> /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 60 Jul 21 18:20 cert_10.10.40.126 -> /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 60 Jul 21 19:23 cert_10.10.40.201 -> /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 60 Jul 21 18:20 cert_172.16.1.247 -> /etc/apache2/ssl/horizon/cert_eth2.juju-70b05d-3-lxd-10.maas
-rw-r----- 1 root root 3175 Jul 21 18:20 cert_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 59 Jul 21 19:24 key_10.10.20.201 -> /etc/apache2/ssl/horizon/key_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 59 Jul 21 18:20 key_10.10.40.126 -> /etc/apache2/ssl/horizon/key_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 59 Jul 21 19:24 key_10.10.40.201 -> /etc/apache2/ssl/horizon/key_eth2.juju-70b05d-3-lxd-10.maas
lrwxrwxrwx 1 root root 59 Jul 21 18:20 key_172.16.1.247 -> /etc/apache2/ssl/horizon/key_eth2.juju-70b05d-3-lxd-10.maas
-rw-r----- 1 root root 1678 Jul 21 18:20 key_eth2.juju-70b05d-3-lxd-10.maas
ubuntu@juju-70b05d-3-lxd-10:/etc/apache2/ssl/horizon$ sudo systemctl start apache2
sudo: setrlimit(RLIMIT_CORE): Operation not permitted
ubuntu@juju-70b05d-3-lxd-10:/etc/apache2/ssl/horizon$ sudo systemctl status apache2
sudo: setrlimit(RLIMIT_CORE): Operation not permitted
● apache2.service - The Apache HTTP Server
     Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2020-07-21 19:24:27 UTC; 4s ago
       Docs: https://httpd.apache.org/docs/2.4/
    Process: 139830 ExecStart=/usr/sbin/apachectl start (code=exited, status=0/SUCCESS)
   Main PID: 139834 (apache2)
      Tasks: 107 (limit: 38355)
     Memory: 32.8M
     CGroup: /system.slice/apache2.service
             ├─139834 /usr/sbin/apache2 -k start
             ├─139835 (wsgi:horizon) -k start
             ├─139836 (wsgi:horizon) -k start
             ├─139837 (wsgi:horizon) -k start
             ├─139838 (wsgi:horizon) -k start
             ├─139839 /usr/sbin/apache2 -k start
             └─139840 /usr/sbin/apache2 -k start
Jul 21 19:24:27 juju-70b05d-3-lxd-10 systemd[1]: Starting The Apache HTTP Server...
Jul 21 19:24:27 juju-70b05d-3-lxd-10 apachectl[139833]: AH00548: NameVirtualHost has no effect and will be removed in the next release /etc/apache2/sites-enabled/default-ssl.conf:3
Jul 21 19:24:27 juju-70b05d-3-lxd-10 systemd[1]: Started The Apache HTTP Server.
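The manual steps in the transcript above amount to creating cert_<VIP> and key_<VIP> symlinks that point at the unit's own certificate and key. A rough sketch of the same thing, with a made-up helper name link_vip_cert; the demo at the bottom runs against a temporary directory with placeholder files, not a real /etc/apache2/ssl tree, where root would be needed.

```python
import os
import tempfile


def link_vip_cert(ssl_dir, cn, vip):
    """Create cert_<vip> and key_<vip> symlinks pointing at cert_<cn> and key_<cn>.

    Mirrors the manual `ln -s` commands above. Existing entries are left
    untouched. Returns the names of the links that were created.
    """
    created = []
    for prefix in ("cert", "key"):
        target = os.path.join(ssl_dir, f"{prefix}_{cn}")
        link = os.path.join(ssl_dir, f"{prefix}_{vip}")
        if not os.path.isfile(target):
            raise FileNotFoundError(target)
        if os.path.lexists(link):
            continue  # do not overwrite whatever is already there
        os.symlink(target, link)
        created.append(os.path.basename(link))
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory with placeholder files.
    with tempfile.TemporaryDirectory() as demo_dir:
        cn = "eth2.juju-70b05d-3-lxd-10.maas"
        for prefix in ("cert", "key"):
            with open(os.path.join(demo_dir, f"{prefix}_{cn}"), "w") as fh:
                fh.write("placeholder\n")
        print(link_vip_cert(demo_dir, cn, "10.10.20.201"))
```

Keep in mind that this only gives Apache a file to load for the VIP. Whether the linked certificate actually lists the VIP in its subjectAltName is a separate question.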
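The third option, the lazy one I actually took, is to remove the affected unit and deploy it again.
I removed the primary keystone unit and added a new one. Before the new keystone unit had even finished being added, openstack dashboard was already back to normal.
A little later, openstack dashboard logins failed again. Looking at the backend, keystone was updating its endpoints; after waiting a bit longer, I could log in to openstack dashboard again.
Update, 2021-03-26:
I now strongly suspect that the SSL certificate hand-off has a bug that has not been fixed.
The infrastructure HA should probably be set up first and vault unsealed only afterwards. I will test this next time before drawing a conclusion.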