1. Overview
GitHub project: https://github.com/superwujc
Please respect the original work. Reposts are welcome as long as the source is credited: https://my.oschina.net/superwjc/blog/1811314
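A daemon process typically has two characteristics:
1. It starts at system boot and runs until the system shuts down, so it is long-lived
2. It runs in the background with no controlling terminal, so the kernel never automatically sends it job-control or terminal-related signals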
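An ordinary process becomes a daemon by performing the following steps in order:
1. Call fork(2); the parent exits and the child keeps running
2. The child calls setsid(2), which creates a new session and a new process group with no controlling terminal, although file descriptors 0/1/2 still refer to stdin/stdout/stderr; the child has its own pid, which differs from the pgid inherited from the parent, so it is not a process group leader and the call does not violate the rule for setsid(2) (which fails with EPERM when called by a process group leader)
3. To prevent a terminal device opened later by the daemon from becoming its controlling terminal, use either of the following two approaches:
- Specify the O_NOCTTY flag whenever open(2) is called on a terminal device, and let the child run in the background as the daemon; or
- Call fork(2) again; the child exits normally and the grandchild runs in the background as the daemon. Only a session leader can acquire a controlling terminal, and the grandchild, which inherits the child's sid, is not a session leader, so a terminal device it opens later cannot become the controlling terminal
4. Clear the process's file mode creation mask (umask), so that files and directories created by the daemon get the permissions it requests
5. Change the daemon's current working directory, usually to the root directory, so that the long-running daemon does not keep another file system busy and prevent it from being unmounted; the directory can also be the one in which the daemon does its work, or a path defined in a configuration file
6. Close all unneeded file descriptors inherited from the parent to release system resources that are not required; this step is optional if the daemon needs some inherited descriptors to stay open; neither nginx (1.14.0) nor its variant tengine (2.2.2) performs this step
7. Open /dev/null and use dup2(2) or a similar call to make file descriptors 0, 1 and 2 refer to it, so that I/O library functions operating on these descriptors work correctly and do not treat them as stdin/stdout/stderr connected to a terminal; nginx (1.14.0) and its variant tengine (2.2.2) do not call dup2(fd, STDERR_FILENO), so some errors and diagnostics produced during startup and operation can still appear on stderr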
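The reasoning behind the second fork(2) in step 3 can be checked with a small experiment. The following is a minimal sketch, not part of the original program: the helper name grandchild_is_not_session_leader() is hypothetical, and the pipe exists only so that the grandchild can report back to the caller.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run the fork/setsid/fork sequence and report
 * whether the grandchild is a session leader.
 * Returns 1 if it is NOT a session leader, 0 if it is, -1 on error.
 */
int grandchild_is_not_session_leader(void)
{
    int pfd[2];
    char answer = 0;
    ssize_t n;
    pid_t child;

    if (pipe(pfd) == -1)
        return -1;

    child = fork();
    if (child == -1) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }

    if (child == 0) {
        close(pfd[0]);
        /* The child is not a process group leader, so setsid() is allowed */
        if (setsid() == -1)
            _exit(1);
        /* The child now leads the new session */
        assert(getsid(0) == getpid());

        switch (fork()) {
        case -1:
            _exit(1);
        case 0:
            /* Grandchild: same session as the child, but a different pid */
            answer = (getsid(0) != getpid()) ? 'N' : 'L';
            if (write(pfd[1], &answer, 1) != 1)
                _exit(1);
            _exit(0);
        default:
            _exit(0);   /* the session leader exits */
        }
    }

    /* Original process: wait for the grandchild's report */
    close(pfd[1]);
    n = read(pfd[0], &answer, 1);
    close(pfd[0]);
    waitpid(child, NULL, 0);

    if (n != 1)
        return -1;
    return answer == 'N';
}
```

Calling the helper from any program should return 1 on Linux: the grandchild's sid equals the pid of the exited child, not its own pid, so it cannot acquire a controlling terminal.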
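Because its parent has exited, the started daemon runs in the background as a child of the init process.
Because a daemon runs in the background and usually has no terminal associated with it, it needs some mechanism for recording errors and diagnostics; when the daemon has used dup2() to make file descriptors 0/1/2 refer to /dev/null, this can be done with a logging facility implemented by the application itself, or with the syslog mechanism that ships with Linux; see "The Linux standard system logging interface: syslog(3)"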
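When the controlling process loses its connection to the controlling terminal, the kernel generates SIGHUP; a daemon has no controlling terminal, so this signal is free to be used for reinitializing the daemon: a handler installed for SIGHUP with sigaction(2) can trigger re-reading the configuration file, reopening log files, or gracefully restarting the daemon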
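A daemon usually allows only a single instance, and the usual implementation is a file lock: while running, the daemon holds an exclusive lock on a predefined file, and releases it explicitly or automatically before it terminates normally; the daemon can write its own PID into the lock file so that other processes can check with kill(pid, 0) whether the daemon identified by pid is running; see "Linux file locking with flock(2) and fcntl(2)"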
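The kill(pid, 0) check mentioned above can be sketched as follows. The helper names pid_is_running() and read_pid_file() are hypothetical. Note that a PID can be reused by an unrelated process after the daemon exits, so this check is only a hint and the file lock remains the authoritative test.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/* Returns 1 if a process with this pid exists, 0 if not, -1 on other errors */
int pid_is_running(pid_t pid)
{
    assert(pid > 0);        /* 0 and negative values address process groups */
    if (kill(pid, 0) == 0)  /* signal 0: permission and existence check only */
        return 1;
    if (errno == EPERM)     /* the process exists but we may not signal it */
        return 1;
    if (errno == ESRCH)     /* no such process */
        return 0;
    return -1;
}

/* Reads a decimal PID from a pid file; returns -1 if unreadable or malformed */
pid_t read_pid_file(const char *path)
{
    FILE *fp;
    long pid = -1;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &pid) != 1 || pid <= 0)
        pid = -1;
    fclose(fp);
    return (pid_t)pid;
}
```

A monitoring tool could combine the two: read the PID with read_pid_file() and, if the result is positive, pass it to pid_is_running().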
glibc provides the nonstandard daemon(3) library function
#include <unistd.h>
int daemon(int nochdir, int noclose);
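If nochdir is 0, the function changes the daemon's working directory to the root directory; if noclose is 0, it redirects stdin/stdout/stderr to /dev/null; a nonzero value skips the corresponding action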
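The effect of nochdir = 0 can be observed with the sketch below. The helper name daemon_cwd() is hypothetical; because daemon(3) makes the calling process exit, the helper calls it in a forked child and uses a pipe to carry the daemon's working directory back to the caller. The feature-test macros at the top are an assumption about what the compiler mode needs in order to declare daemon(3).

```c
#define _DEFAULT_SOURCE   /* exposes daemon(3) on glibc 2.19 and later */
#define _BSD_SOURCE       /* same purpose on older glibc such as 2.17 */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: start a short-lived daemon with daemon(0, 0) and
 * copy its current working directory into buf.
 * Returns 0 on success, -1 on error.
 */
int daemon_cwd(char *buf, size_t len)
{
    int pfd[2];
    pid_t child;
    ssize_t n;

    assert(buf != NULL && len > 1);
    if (pipe(pfd) == -1)
        return -1;

    child = fork();
    if (child == -1) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }

    if (child == 0) {
        char cwd[4096];

        close(pfd[0]);
        /* nochdir = 0: chdir("/"); noclose = 0: fds 0/1/2 -> /dev/null */
        if (daemon(0, 0) == -1)
            _exit(1);
        /* Only the daemon process reaches this point */
        if (getcwd(cwd, sizeof(cwd)) == NULL ||
            write(pfd[1], cwd, strlen(cwd)) == -1)
            _exit(1);
        _exit(0);
    }

    /* Caller: the pipe (fd >= 3) is untouched by daemon(3) */
    close(pfd[1]);
    n = read(pfd[0], buf, len - 1);
    close(pfd[0]);
    waitpid(child, NULL, 0);
    if (n <= 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

With nochdir set to 0 the reported directory should be "/"; passing a nonzero nochdir instead would report whatever directory the caller was started in.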
2. Example
Operating system and kernel version
# lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.4.1708 (Core)
Release: 7.4.1708
Codename: Core
#
# uname -r
3.10.0-693.21.1.el7.x86_64
glibc and gcc versions
# gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# ldd /bin/ls
linux-vdso.so.1 => (0x00007fffe93b2000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007faebd618000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007faebd413000)
libacl.so.1 => /lib64/libacl.so.1 (0x00007faebd209000)
libc.so.6 => /lib64/libc.so.6 (0x00007faebce46000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007faebcbe4000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007faebc9df000)
/lib64/ld-linux-x86-64.so.2 (0x000056058dd43000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007faebc7da000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007faebc5be000)
#
# /lib64/libc.so.6
GNU C Library (GNU libc) stable release version 2.17, by Roland McGrath et al.
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.5 20150623 (Red Hat 4.8.5-16).
Compiled on a Linux 3.10.0 system on 2017-11-30.
Available extensions:
The C stubs add-on version 2.1.2.
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
RT using linux kernel aio
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.
Example program - daemonize.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <syslog.h>
#include <errno.h>
#include <string.h>

#define BUF_SIZE 16
#define MAX_CLOSE 4096  /* fallback when sysconf(_SC_OPEN_MAX) is indeterminate */
#define LOCKFILE "/var/run/daemonize.pid"
/* Log an error through syslog(3) and return -1 from the calling function */
#define ERR_LOG_RET(msg, ...) do { syslog(LOG_ERR, msg, ##__VA_ARGS__); return -1; } while (0)

static int addlock(void);
static int daemonize(void);

int main(int argc, char *argv[])
{
    setbuf(stdout, NULL);
    /* LOG_PERROR also copies every message to stderr */
    openlog(NULL, LOG_CONS | LOG_PERROR | LOG_PID, LOG_USER);

    if (daemonize() == -1)
        ERR_LOG_RET("daemon demo started failed");
    syslog(LOG_INFO, "daemon demo started successfully");

    for ( ; ; )
        sleep(1);
}

static int daemonize(void)
{
    int fd, maxfd, lockfd;

    /* 1st fork: the parent exits, the child continues */
    switch (fork()) {
    case -1: ERR_LOG_RET("the 1st fork() failed: %m");
    case 0: break;
    default: _exit(EXIT_SUCCESS);
    }

    /* Become the leader of a new session without a controlling terminal */
    if (setsid() == -1)
        ERR_LOG_RET("setsid() failed: %m");

    /* 2nd fork: the grandchild is not a session leader */
    switch (fork()) {
    case -1: ERR_LOG_RET("the 2nd fork() failed: %m");
    case 0: break;
    default: _exit(EXIT_SUCCESS);
    }

    /* Lock the pid file in the final daemon process, so getpid() is its PID */
    lockfd = addlock();
    if (lockfd == -1)
        ERR_LOG_RET("set lock to %s failed", LOCKFILE);

    umask(0);

    if (chdir("/var/log") == -1)
        ERR_LOG_RET("chdir() failed: %m");

    maxfd = sysconf(_SC_OPEN_MAX);
    if (maxfd == -1)
        maxfd = MAX_CLOSE;

    /* Close every descriptor except stderr and the lock file */
    for (fd = 0; fd < maxfd; fd++) {
        if (fd == STDERR_FILENO || fd == lockfd)
            continue;
        if (close(fd) == -1 && errno != EBADF)
            ERR_LOG_RET("close(%d) failed: %m", fd);
    }

    /* Point stdin and stdout at /dev/null */
    fd = open("/dev/null", O_RDWR);
    if (fd == -1)
        ERR_LOG_RET("open(\"/dev/null\") failed: %m");
    if (dup2(fd, STDIN_FILENO) == -1)
        ERR_LOG_RET("dup2(STDIN_FILENO) failed: %m");
    if (dup2(fd, STDOUT_FILENO) == -1)
        ERR_LOG_RET("dup2(STDOUT_FILENO) failed: %m");
    /* stderr is deliberately left open so that LOG_PERROR output stays visible */
#if 0
    if (dup2(fd, STDERR_FILENO) == -1)
        ERR_LOG_RET("dup2(STDERR_FILENO) failed: %m");
#endif

    return 0;
}

static int addlock(void)
{
    int lockfd, flags;
    struct flock fl_w;
    char buf[BUF_SIZE];

    lockfd = open(LOCKFILE, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (lockfd == -1)
        ERR_LOG_RET("open() lockfd failed: %m");

    /* Do not leak the lock file descriptor across exec() */
    flags = fcntl(lockfd, F_GETFD);
    if (flags == -1)
        ERR_LOG_RET("fcntl() to get flags failed: %m");
    flags |= FD_CLOEXEC;
    if (fcntl(lockfd, F_SETFD, flags) == -1)
        ERR_LOG_RET("fcntl() to set flags failed: %m");

    /* Non-blocking exclusive lock on the whole file */
    fl_w.l_type = F_WRLCK;
    fl_w.l_whence = SEEK_SET;
    fl_w.l_start = 0;
    fl_w.l_len = 0;
    if (fcntl(lockfd, F_SETLK, &fl_w) == -1)
        ERR_LOG_RET("fcntl() to set lock failed: %m");

    /* Replace any stale content with the PID of this process */
    if (ftruncate(lockfd, 0) == -1)
        ERR_LOG_RET("ftruncate() LOCKFILE failed: %m");
    snprintf(buf, BUF_SIZE, "%ld\n", (long)getpid());
    if (write(lockfd, buf, strlen(buf)) != (ssize_t)strlen(buf))
        ERR_LOG_RET("write() fd to LOCKFILE failed: %m");

    return lockfd;
}
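The program performs the following steps in order:
1. Call openlog(3) to open a logging connection so that the program's log messages are recorded through the syslog mechanism
2. Call fork(2) twice to create the grandchild that serves as the daemon
3. Use /var/run/daemonize.pid as the lock file and place a non-blocking exclusive lock on it to ensure that only a single daemon instance runs, then write the process's PID into the locked file
Compile the program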
# gcc daemonize.c -o daemonize
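By default, syslog(3) tags the log messages of a program with the program name, so a rule can be added to send the logs of the daemonize program to a separate custom file, /var/log/daemonize.log in this example.
Create a configuration file containing the following rules, then restart the rsyslogd daemon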
# vi /etc/rsyslog.d/daemonize.conf
if $programname == "daemonize" then /var/log/daemonize.log
&stop
#
# systemctl restart rsyslog
Run the program
# ./daemonize
# daemonize[1939]: daemon demo started successfully
# ./daemonize
# daemonize[1942]: fcntl() to set lock failed: Resource temporarily unavailable
daemonize[1942]: set lock to /var/run/daemonize.pid failed
daemonize[1942]: daemon demo started failed
#
# cat /var/log/daemonize.log
2018-05-12 16:08:20 localhost daemonize[1939]: daemon demo started successfully
2018-05-12 16:08:27 localhost daemonize[1942]: fcntl() to set lock failed: Resource temporarily unavailable
2018-05-12 16:08:27 localhost daemonize[1942]: set lock to /var/run/daemonize.pid failed
2018-05-12 16:08:27 localhost daemonize[1942]: daemon demo started failed
#
# echo $$
958
#
# cat /var/run/daemonize.pid
1939
#
# ps -p 1939 -o pid,ppid,pgid,tpgid,sid,args
PID PPID PGID TPGID SID COMMAND
1939 1 1938 -1 1938 ./daemonize
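The lock file /var/run/daemonize.pid contains the daemon's PID and is still locked by it, so the second attempt to run daemonize fails; the program's INFO and ERR messages are recorded in the custom log file /var/log/daemonize.log
Because the daemon neither closes nor redirects file descriptor 2 (STDERR_FILENO), and the option argument of openlog(3) includes the LOG_PERROR flag, the log messages are also written to stderr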
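The "Resource temporarily unavailable" (EAGAIN) failure of the second instance can be reproduced inside a single program, because fcntl(2) record locks belong to a process and are not inherited across fork(2). The following is a minimal sketch; the helper names try_write_lock() and second_locker_result() are hypothetical, and the forked child merely stands in for a second daemonize instance.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the lock was acquired, 1 if another process holds it, -1 on error */
int try_write_lock(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "up to end of file" */

    if (fcntl(fd, F_SETLK, &fl) == 0)
        return 0;
    return (errno == EAGAIN || errno == EACCES) ? 1 : -1;
}

/*
 * Hypothetical demo: lock path in this process, then let a forked child
 * try to take the same lock. Returns the child's try_write_lock() result
 * (1 is expected), or -1 on error.
 */
int second_locker_result(const char *path)
{
    int fd, status;
    pid_t pid;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    if (try_write_lock(fd) != 0) {
        close(fd);
        return -1;
    }

    pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* The child inherits the descriptor but not the record lock */
        int r = try_write_lock(fd);
        _exit(r < 0 ? 2 : r);
    }

    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) {
        close(fd);
        return -1;
    }
    close(fd);                  /* closing the file releases the lock */
    return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}
```

F_SETLK never blocks: when the lock is already held it fails immediately with EAGAIN or EACCES, which is the error the second daemonize instance logs.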
Inspect the locks
# lslocks
COMMAND PID TYPE SIZE MODE M START END PATH
atd 550 POSIX 4B WRITE 0 0 0 /run/atd.pid
master 951 FLOCK 33B WRITE 0 0 0 /var/spool/postfix/pid/master.pid
master 951 FLOCK 33B WRITE 0 0 0 /var/lib/postfix/master.lock
daemonize 1939 POSIX 5B WRITE 0 0 0 /run/daemonize.pid
crond 548 FLOCK 4B WRITE 0 0 0 /run/crond.pid
anacron 1188 POSIX 9B WRITE 0 0 0 /var/spool/anacron/cron.weekly
#
# cat /proc/locks
1: POSIX ADVISORY WRITE 1188 08:03:67286703 0 EOF
2: FLOCK ADVISORY WRITE 548 00:13:16035 0 EOF
3: POSIX ADVISORY WRITE 1939 00:13:19432 0 EOF
4: FLOCK ADVISORY WRITE 951 08:03:67598583 0 EOF
5: FLOCK ADVISORY WRITE 951 08:03:375978 0 EOF
6: POSIX ADVISORY WRITE 550 00:13:15339 0 EOF
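The holder of a POSIX record lock can also be queried from a program with the F_GETLK command of fcntl(2). The following is a minimal sketch; the helper name lock_holder() is hypothetical. It only sees fcntl(2) record locks (the POSIX rows above), since on Linux flock(2) locks are tracked separately.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the PID of a process holding a record lock
 * that would conflict with a write lock on the whole file, 0 if there is
 * no such lock, or -1 on error.
 *
 * Caution: do not call this from the process that holds the lock. That
 * process would see no conflict, and closing the descriptor would drop
 * every record lock it holds on the file.
 */
pid_t lock_holder(const char *path)
{
    struct flock fl;
    int fd, rc;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;        /* ask: could a write lock be placed? */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* whole file */

    rc = fcntl(fd, F_GETLK, &fl);
    close(fd);
    if (rc == -1)
        return -1;

    /* F_UNLCK means no conflicting lock; otherwise l_pid names the holder */
    return fl.l_type == F_UNLCK ? 0 : fl.l_pid;
}
```

Run as root against /var/run/daemonize.pid while the daemon is active, the helper would be expected to return the PID listed by lslocks (1939 in the session above).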
/var/run is a symbolic link to /run, so both paths refer to the same directory and report the same inode
# ll /var/
total 8
drwxr-xr-x. 2 root root 6 Nov 5 2016 adm
drwxr-xr-x. 6 root root 62 May 9 23:27 cache
drwxr-xr-x. 2 root root 6 Oct 20 2017 crash
drwxr-xr-x. 3 root root 34 Mar 20 22:51 db
drwxr-xr-x. 3 root root 18 Mar 20 22:20 empty
drwxr-xr-x. 2 root root 6 Nov 5 2016 games
drwxr-xr-x. 2 root root 6 Nov 5 2016 gopher
drwxr-xr-x. 3 root root 18 Mar 20 22:19 kerberos
drwxr-xr-x. 24 root root 4096 Mar 20 22:20 lib
drwxr-xr-x. 2 root root 6 Nov 5 2016 local
lrwxrwxrwx. 1 root root 11 Mar 20 22:19 lock -> ../run/lock
drwxr-xr-x. 7 root root 4096 May 12 16:08 log
lrwxrwxrwx. 1 root root 10 Mar 20 22:19 mail -> spool/mail
drwxr-xr-x. 2 root root 6 Nov 5 2016 nis
drwxr-xr-x. 2 root root 6 Nov 5 2016 opt
drwxr-xr-x. 2 root root 6 Nov 5 2016 preserve
lrwxrwxrwx. 1 root root 6 Mar 20 22:19 run -> ../run
drwxr-xr-x. 9 root root 97 May 9 23:27 spool
drwxrwxrwt. 3 root root 85 May 12 15:41 tmp
drwxr-xr-x. 2 root root 6 Nov 5 2016 yp
#
# ls -lid /var/run/
8844 drwxr-xr-x 20 root root 620 May 12 16:08 /var/run/
#
# ls -lid /run/
8844 drwxr-xr-x 20 root root 620 May 12 16:08 /run/
3. References
The Linux Programming Interface, Chapter 37
Advanced Programming in the UNIX Environment, Chapter 13
nginx-1.14.0/src/core/ngx_config.h
nginx-1.14.0/src/os/unix/ngx_daemon.c
nginx-1.14.0/src/os/unix/ngx_process.c