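We typically deploy Cloud Foundry with BOSH. The bosh vms command reports the state of each node in the deployment.
That view shows each node's state (running, failing, and so on) at a glance, and all of that information is collected by Monit.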
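What is Monit?
Monit is a cross-platform tool for monitoring Unix-like systems (such as Linux, BSD, OS X, and Solaris). It is easy to install, very lightweight (about 500 KB), and depends on no third-party programs, plugins, or libraries. Despite its size, Monit handles system-wide monitoring, process status monitoring, file system change monitoring, email notification, and custom actions for core services. Easy installation, a small footprint, and a capable feature set make Monit a good fallback monitoring tool. Monit also includes an embedded HTTP(S) web interface, so the services it watches can be viewed in a browser.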
Monit in BOSH
Log in to any Cloud Foundry node and you will find the executable /var/vcap/bosh/bin/monit. Run monit -h to see what monit can do:
root@ubuntu:/var/vcap/bosh/bin# ./monit -h
Usage: monit [options] {arguments}
Options are as follows:
-c file Use this control file
-d n Run as a daemon once per n seconds
-g name Set group name for start, stop, restart, monitor and unmonitor
-l logfile Print log information to this file
-p pidfile Use this lock file in daemon mode
-s statefile Set the file monit should write state information to
-I Do not run in background (needed for run from init)
-t Run syntax check for the control file
-v Verbose mode, work noisy (diagnostic output)
-H [filename] Print SHA1 and MD5 hashes of the file or of stdin if the
filename is omited; monit will exit afterwards
-V Print version number and patchlevel
-h Print this text
Optional action arguments for non-daemon mode are as follows:
start all - Start all services
start name - Only start the named service
stop all - Stop all services
stop name - Only stop the named service
restart all - Stop and start all services
restart name - Only restart the named service
monitor all - Enable monitoring of all services
monitor name - Only enable monitoring of the named service
unmonitor all - Disable monitoring of all services
unmonitor name - Only disable monitoring of the named service
reload - Reinitialize monit
status - Print full status information for each service
summary - Print short status information for each service
quit - Kill monit daemon process
validate - Check all services and start if not running
procmatch <pattern> - Test process matching pattern
(Action arguments operate on services defined in the control file)
monit does more than monitor services: it can also start, stop, and restart them (start, stop, restart, ...), which makes it a capable tool.
Start with monitoring. Run monit summary:
root@ubuntu:/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 7m
Process 'nats' running
Process 'nats_stream_forwarder' running
Process 'etcd' running
Process 'hm9000_listener' running
Process 'hm9000_fetcher' running
Process 'hm9000_analyzer' running
Process 'hm9000_sender' running
Process 'hm9000_metrics_server' running
Process 'hm9000_api_server' running
Process 'hm9000_evacuator' running
Process 'hm9000_shredder' running
Process 'cloud_controller_ng' running
Process 'cloud_controller_worker_local_1' running
Process 'cloud_controller_worker_local_2' running
Process 'nginx_cc' running
Process 'cloud_controller_migration' running
Process 'cloud_controller_worker_1' running
Process 'cloud_controller_clock' running
Process 'uaa' running
Process 'consul_template' running
File 'haproxy_config' accessible
Process 'haproxy' running
Process 'gorouter' running
Process 'warden' running
Process 'dea_next' running
Process 'dir_server' running
Process 'loggregator_trafficcontroller' running
Process 'doppler' running
Process 'metron_agent' running
Process 'dea_logging_agent' running
Process 'etcd_metrics_server' running
Process 'consul_agent' running
Process 'route_registrar' running
Process 'postgres' running
System 'system_ubuntu' running
The output above lists the state of every monitored Cloud Foundry component.
To restart a service, run monit restart followed by the service name:
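On a node with this many components it can help to print only the entries that are not healthy. The following is a minimal sketch, assuming the summary format shown above (a quoted service name followed by its status); the function names unhealthy_services and sample_summary are made up for this illustration and the sample data is a shortened copy of the output in this article.

```shell
#!/bin/sh
# Print "name: status" for every service whose status is neither
# "running" nor "accessible". Reads `monit summary` output on stdin.
unhealthy_services() {
    awk -F"'" '
        /^(Process|File|System) / {
            status = $3
            gsub(/^[ \t]+|[ \t]+$/, "", status)   # trim padding around the status
            if (status != "running" && status != "accessible")
                print $2 ": " status
        }'
}

# Sample input in the format shown in this article (hypothetical helper).
sample_summary() {
    cat <<'EOF'
The Monit daemon 5.2.4 uptime: 1h 11m
Process 'nats' not monitored - restart pending
Process 'nats_stream_forwarder' not monitored
Process 'etcd' running
File 'haproxy_config' accessible
System 'system_ubuntu' running
EOF
}

sample_summary | unhealthy_services
```

On a real node the same filter would be fed from the agent's binary, for example /var/vcap/bosh/bin/monit summary | unhealthy_services.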
root@ubuntu:/var/vcap/bosh/bin# ./monit restart nats
root@ubuntu:/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 11m
Process 'nats' not monitored - restart pending
Process 'nats_stream_forwarder' not monitored
Process 'etcd' running
Process 'hm9000_listener' running
Process 'hm9000_fetcher' running
Process 'hm9000_analyzer' running
Process 'hm9000_sender' running
Process 'hm9000_metrics_server' running
Process 'hm9000_api_server' running
Process 'hm9000_evacuator' running
Process 'hm9000_shredder' running
Process 'cloud_controller_ng' running
Process 'cloud_controller_worker_local_1' running
Process 'cloud_controller_worker_local_2' running
Process 'nginx_cc' running
Process 'cloud_controller_migration' running
Process 'cloud_controller_worker_1' running
Process 'cloud_controller_clock' running
Process 'uaa' running
Process 'consul_template' running
File 'haproxy_config' accessible
Process 'haproxy' running
Process 'gorouter' running
Process 'warden' running
Process 'dea_next' running
Process 'dir_server' running
Process 'loggregator_trafficcontroller' running
Process 'doppler' running
Process 'metron_agent' running
Process 'dea_logging_agent' running
Process 'etcd_metrics_server' running
Process 'consul_agent' running
Process 'route_registrar' running
Process 'postgres' running
System 'system_ubuntu' running
root@ubuntu:/var/vcap/bosh/bin#
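After a short while, nats and nats_stream_forwarder (which also dropped to not monitored in the output above) are running again: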
root@ubuntu:/var/vcap/bosh/bin# ./monit summary
The Monit daemon 5.2.4 uptime: 1h 11m
Process 'nats' running
Process 'nats_stream_forwarder' running
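Rather than rerunning monit summary by hand until the service comes back, the check can be put in a loop. This is only a sketch under the same assumption about the summary format; wait_for_running and the MONIT variable are names invented here, and the retry count and delay are arbitrary defaults.

```shell
#!/bin/sh
# Path used in this article; override MONIT to point elsewhere.
MONIT=${MONIT:-/var/vcap/bosh/bin/monit}

# wait_for_running NAME [TRIES] [DELAY_SECONDS]
# Returns 0 as soon as `monit summary` reports NAME as exactly "running",
# or 1 once TRIES attempts have been used up.
wait_for_running() {
    name=$1
    tries=${2:-30}
    delay=${3:-2}
    while [ "$tries" -gt 0 ]; do
        if "$MONIT" summary 2>/dev/null | awk -F"'" -v n="$name" '
            $2 == n {
                s = $3
                gsub(/^[ \t]+|[ \t]+$/, "", s)
                if (s == "running") ok = 1
            }
            END { exit !ok }'
        then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Example (on a node): "$MONIT" restart nats && wait_for_running nats
```

Matching the status exactly, rather than searching for the word "running" anywhere on the line, avoids treating a state such as "not monitored - restart pending" as healthy.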
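In short, monit offers many more commands than the ones shown here.
Customizing Monit
In BOSH, the monit configuration file monitrc lives in the /var/vcap/bosh/etc directory. Its contents look like this: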
root@ubuntu:/var/vcap/bosh/etc# cat monitrc
set daemon 10
set logfile /var/vcap/monit/monit.log
set httpd port 2822 and use address 10.0.0.112
allow cleartext /var/vcap/monit/monit.user
include /var/vcap/monit/*.monitrc
include /var/vcap/monit/job/*.monitrc
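set daemon 10 sets the check interval to 10 seconds.
allow cleartext /var/vcap/monit/monit.user names the file that stores the username and password used to log in.
set httpd port 2822 and use address 10.0.0.112 sets the address and port of the web server (opening the web page, which shows the monitoring information more directly, is covered later).
include /var/vcap/monit/*.monitrc
include /var/vcap/monit/job/*.monitrc These two directives pull in other configuration files; wildcards are allowed.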
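Because the include lines use wildcards, it is not obvious from monitrc alone which files are actually loaded. The sketch below lets the shell expand the same patterns and prints the matches. list_includes is a hypothetical helper, and it assumes the simple form shown above (one unquoted pattern per include line, no spaces in paths).

```shell
#!/bin/sh
# list_includes MONITRC
# Print every existing file matched by the "include" lines of MONITRC.
list_includes() {
    awk '$1 == "include" { print $2 }' "$1" | while read -r pattern; do
        # Left unquoted on purpose so the shell expands the wildcard.
        for f in $pattern; do
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
}

# Example (on a node): list_includes /var/vcap/bosh/etc/monitrc
```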
Opening the web page
1. In the monitrc file shown above, set:
set httpd port 2822 and use address 10.0.0.112
Here 2822 is a port of your choosing and 10.0.0.112 is this machine's IP address.
2. Update the firewall
iptables -A INPUT -p tcp --dport 2822 -j ACCEPT
iptables -A OUTPUT -p tcp --sport 2822 -j ACCEPT
To make the firewall settings persistent, edit /etc/network/interfaces and add the following:
pre-up iptables-restore < /etc/iptables.rules
post-down iptables-restore < /etc/iptables.downrules
The modified file looks something like this:
auto eth0
iface eth0 inet dhcp
pre-up iptables-restore < /etc/iptables.rules
post-down iptables-restore < /etc/iptables.downrules
Then save the current rules to the file that the pre-up line restores from:
sudo sh -c "iptables-save -c > /etc/iptables.rules"
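One caveat with the two iptables -A commands in step 2: -A appends a rule every time it runs, so repeating the step leaves duplicate rules. A possible way around that, sketched here, is to ask iptables whether the rule already exists with -C before appending. This assumes an iptables version that supports -C; add_rule_once, open_monit_port and the IPTABLES variable are names made up for this example.

```shell
#!/bin/sh
# Command to run; override IPTABLES for a dry run or a different path.
IPTABLES=${IPTABLES:-iptables}

# add_rule_once CHAIN RULE...
# Append the rule only if `iptables -C` does not already find it.
add_rule_once() {
    "$IPTABLES" -C "$@" 2>/dev/null || "$IPTABLES" -A "$@"
}

# open_monit_port PORT
# The two rules from step 2, applied without creating duplicates.
open_monit_port() {
    add_rule_once INPUT  -p tcp --dport "$1" -j ACCEPT &&
    add_rule_once OUTPUT -p tcp --sport "$1" -j ACCEPT
}

# Example (as root): open_monit_port 2822
```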
3. Reload monit so the new configuration takes effect
monit reload
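A typo in monitrc is easier to catch before the reload than after it. The help output earlier lists -t as the syntax check for the control file, so one option is to run it first and reload only if it passes. A small sketch, with check_and_reload and MONIT as invented names:

```shell
#!/bin/sh
# Path used in this article; override MONIT to point elsewhere.
MONIT=${MONIT:-/var/vcap/bosh/bin/monit}

# check_and_reload
# Run the control-file syntax check and reload only when it succeeds.
check_and_reload() {
    if "$MONIT" -t; then
        "$MONIT" reload
    else
        echo "monitrc failed the syntax check; not reloading" >&2
        return 1
    fi
}

# Example (on a node, as root): check_and_reload
```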
Open http://10.0.0.112:2822 in a browser and log in with the username and password from /var/vcap/monit/monit.user (the path defined in monitrc) to see the monitoring page.
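The same page can also be fetched from a terminal, which is handy for confirming that the port is reachable before opening a browser. The sketch below assumes monit.user holds a user:password line (the format monit expects for allow cleartext) and that curl is installed; read_monit_user, monit_url and fetch_monit_page are hypothetical helper names.

```shell
#!/bin/sh
# read_monit_user FILE
# Print the first "user:password" line of FILE, skipping comments and blanks.
read_monit_user() {
    sed -n '/^[^#[:space:]][^:]*:/{p;q;}' "$1"
}

# monit_url HOST PORT [PATH]
# Build the URL of the embedded web interface.
monit_url() {
    printf 'http://%s:%s%s\n' "$1" "$2" "${3:-/}"
}

# fetch_monit_page FILE HOST PORT
# Needs curl and network access to the node; fails on HTTP errors.
fetch_monit_page() {
    curl --silent --fail --user "$(read_monit_user "$1")" "$(monit_url "$2" "$3")"
}

# Example: fetch_monit_page /var/vcap/monit/monit.user 10.0.0.112 2822
```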