Chapter 20. Program Execution
20.1. Executable Files
An executable file is a regular file that describes how to initialize a new execution context.
20.1.1. Process Credentials and Capabilities
20.1.1.0. Process Credentials
Table 20-1. Traditional process credentials
Name            Description
uid, gid        User and group real identifiers
euid, egid      User and group effective identifiers
fsuid, fsgid    User and group effective identifiers for file access
groups          Supplementary group identifiers
suid, sgid      User and group saved identifiers
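As a rough user-space illustration of Table 20-1, the sketch below reads the credentials that plain POSIX calls expose. The struct and function names (cred_snapshot, cred_snapshot_read, cred_snapshot_print) are hypothetical; the saved and filesystem identifiers are left out because there is no portable POSIX call that returns them.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the credentials readable through POSIX calls. */
struct cred_snapshot {
    uid_t uid, euid;   /* real and effective user identifiers */
    gid_t gid, egid;   /* real and effective group identifiers */
    int   ngroups;     /* number of supplementary groups, or -1 on error */
};

/* Fill *out with the credentials of the calling process. */
void cred_snapshot_read(struct cred_snapshot *out)
{
    assert(out != NULL);
    out->uid  = getuid();
    out->euid = geteuid();
    out->gid  = getgid();
    out->egid = getegid();
    /* With a size of 0, getgroups() only reports how many groups exist. */
    out->ngroups = getgroups(0, NULL);
}

/* Print the snapshot, one line per row of the table. */
void cred_snapshot_print(const struct cred_snapshot *c)
{
    assert(c != NULL);
    printf("uid=%ld gid=%ld\n", (long)c->uid, (long)c->gid);
    printf("euid=%ld egid=%ld\n", (long)c->euid, (long)c->egid);
    printf("supplementary groups: %d\n", c->ngroups);
}
```

In an ordinary program the real and effective identifiers coincide; they differ when the executable has the setuid or setgid flag set.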
20.1.1.1. Process capabilities
20.1.1.2. The Linux Security Modules framework
20.1.2. Command-Line Arguments and Shell Environment
int main(int argc, char *argv[])
int main(int argc, char *argv[], char *envp[])
Figure 20-1. The bottom locations of the User Mode stack
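Both argv and envp are NULL-terminated arrays of string pointers, so they can be walked the same way. The helpers below are a minimal sketch with hypothetical names (vector_length, env_lookup); they assume only the NAME=value convention for environment strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the entries of a NULL-terminated vector such as argv or envp. */
size_t vector_length(char *const vec[])
{
    assert(vec != NULL);
    size_t n = 0;
    while (vec[n] != NULL)
        n++;
    return n;
}

/* Return the value of `name` in a NAME=value vector, or NULL if absent. */
const char *env_lookup(char *const envp[], const char *name)
{
    assert(envp != NULL && name != NULL);
    size_t len = strlen(name);
    for (size_t i = 0; envp[i] != NULL; i++) {
        /* Match the whole name, followed immediately by '='. */
        if (strncmp(envp[i], name, len) == 0 && envp[i][len] == '=')
            return envp[i] + len + 1;
    }
    return NULL;
}
```

Inside main(int argc, char *argv[], char *envp[]), vector_length(argv) equals argc, and env_lookup(envp, "HOME") returns the same string that getenv("HOME") would.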
20.1.3. Libraries
static libraries:
The executable file produced by the linker includes not only the code of the original program but also the
code of the library functions that the program refers to.
shared libraries:
The executable file does not contain the library object
code, but only a reference to the library name. When the program is loaded in memory for
execution, a program called the dynamic linker (also named ld.so) analyzes the
library names in the executable file, locates each library in the system's directory tree, and makes
the requested code available to the executing process.
A process can also load additional shared
libraries at runtime by using the dlopen( ) library function.
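A minimal sketch of runtime loading with dlopen( ) follows. The function name call_from_library is hypothetical, and the example assumes that the looked-up symbol has the signature double f(double). On older glibc versions the program must be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/*
 * Load a shared library at run time, look up a double(double) function
 * in it, and call it. Returns 0 on success, -1 if loading or lookup fails.
 */
int call_from_library(const char *lib, const char *symbol,
                      double arg, double *result)
{
    assert(lib != NULL && symbol != NULL && result != NULL);

    void *handle = dlopen(lib, RTLD_LAZY);
    if (handle == NULL)
        return -1;

    double (*fn)(double);
    /* POSIX-recommended idiom for storing dlsym()'s void * result. */
    *(void **)(&fn) = dlsym(handle, symbol);
    if (fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *result = fn(arg);
    dlclose(handle);
    return 0;
}
```

For instance, call_from_library("libm.so.6", "cos", 0.0, &r) asks the dynamic linker for the math library by name; the library name is the glibc one and may differ on other C libraries.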
20.1.4. Program Segments and Process Memory Regions
Text segment
Initialized data segment
Uninitialized data segment (bss)
Stack segment
Each mm_struct memory descriptor (see the section "The Memory Descriptor" in Chapter 9) includes
some fields that identify the role of a few crucial memory regions of the corresponding process:
start_code, end_code
start_data, end_data
start_brk, brk
arg_start, arg_end
env_start, env_end
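From User Mode, a process can get a rough idea of where these regions lie by taking the address of one object in each of them. The sketch below is only an approximation of what the mm_struct fields record; the names region_samples and sample_regions are hypothetical, and the actual values change from run to run when address space randomization is active.

```c
#define _DEFAULT_SOURCE   /* for sbrk() on glibc */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

int initialized_var = 42;   /* placed in the initialized data segment */
int uninitialized_var;      /* placed in the uninitialized data segment (bss) */

/* One sample address per region. */
struct region_samples {
    uintptr_t text;    /* somewhere between start_code and end_code */
    uintptr_t data;    /* somewhere between start_data and end_data */
    uintptr_t bss;     /* after the initialized data */
    uintptr_t brk;     /* current end of the heap */
    uintptr_t stack;   /* inside the User Mode stack */
};

void sample_regions(struct region_samples *out)
{
    assert(out != NULL);
    int local = 0;

    out->text  = (uintptr_t)sample_regions;
    out->data  = (uintptr_t)&initialized_var;
    out->bss   = (uintptr_t)&uninitialized_var;
    out->brk   = (uintptr_t)sbrk(0);   /* sbrk(0) reports the current break */
    out->stack = (uintptr_t)&local;
}
```

Printing the five values with "%#lx" and comparing them with the lines of /proc/self/maps shows which memory region each address belongs to.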
20.1.4.1. Flexible memory region layout
20.1.5. Execution Tracing
We focus on how the kernel supports execution tracing rather than discussing how debuggers work.
In Linux, execution tracing is performed through the ptrace( ) system call, which can handle the
commands listed in Table 20-5.
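The smallest useful tracing session has a child that asks to be traced and a parent that resumes it. The sketch below uses only the PTRACE_TRACEME and PTRACE_CONT commands; the function name trace_one_stop and the exit codes 7 and 2 are arbitrary choices made for this example.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that asks to be traced and then stops itself; the parent
 * observes the stop and resumes the child. Returns the child's exit
 * status (7 normally, 2 if PTRACE_TRACEME was refused), or -1 on error.
 */
int trace_one_stop(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(2);
        raise(SIGSTOP);   /* the tracer is notified through waitpid() */
        _exit(7);
    }

    assert(pid > 0);      /* only the parent reaches this point */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    if (WIFSTOPPED(status)) {
        /* A data argument of 0 resumes the child without delivering the signal. */
        if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) {
            kill(pid, SIGKILL);
            waitpid(pid, &status, 0);
            return -1;
        }
        if (waitpid(pid, &status, 0) < 0)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A debugger follows the same pattern but inspects or modifies the child between the stop and the PTRACE_CONT request.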
20.2. Executable Formats
An executable format is described by an object of type linux_binfmt, which essentially provides three methods:
load_binary
load_shlib
core_dump
register_binfmt( ) and unregister_binfmt( ) functions
To register a new format, the user writes into the register file of
the binfmt_misc special filesystem (usually mounted on /proc/sys/fs/binfmt_misc) a string with the
following format:
:name:type:offset:string:mask:interpreter:flags
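The registration string can be assembled with snprintf( ). The helper below is a sketch with a hypothetical name (binfmt_misc_entry); it only builds the string and does not write it to the register file, which normally requires root privileges. The type character is assumed to be 'M' (match on magic bytes) or 'E' (match on filename extension).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build a ":name:type:offset:string:mask:interpreter:flags" entry in buf.
 * Returns the string length, or -1 if a field is unusable or buf is too small.
 */
int binfmt_misc_entry(char *buf, size_t size,
                      const char *name, char type, const char *offset,
                      const char *magic, const char *mask,
                      const char *interpreter, const char *flags)
{
    assert(buf != NULL && size > 0);
    assert(name != NULL && offset != NULL && magic != NULL);
    assert(mask != NULL && interpreter != NULL && flags != NULL);

    /* ':' is the field delimiter here, so it cannot appear inside the name. */
    if (name[0] == '\0' || strchr(name, ':') != NULL)
        return -1;
    if (type != 'M' && type != 'E')
        return -1;

    int n = snprintf(buf, size, ":%s:%c:%s:%s:%s:%s:%s",
                     name, type, offset, magic, mask, interpreter, flags);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Empty fields are passed as "", which yields adjacent colons in the result, for example ":example_ext:E::xyz::/usr/local/bin/xyz-run:" for a made-up .xyz extension and interpreter path.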
20.3. Execution Domains
A process specifies its execution domain by setting the personality field of its descriptor and storing
the address of the corresponding exec_domain data structure in the exec_domain field of the
thread_info structure. A process can change its personality by issuing a suitable system call named
personality( ) ; typical values assumed by the system call's parameter are listed in Table 20-6.
Programmers are not expected to directly change the personality of their programs; instead, the
personality( ) system call should be issued by the glue code that sets up the execution context of
the process (see the next section).
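Although programs rarely change their personality, reading it is harmless. The sketch below relies on the documented behavior that passing 0xffffffff to personality( ) returns the current value without modifying it; the wrapper names query_personality and personality_base are hypothetical.

```c
#include <assert.h>
#include <sys/personality.h>

/* Return the current personality without changing it, or -1 on error. */
int query_personality(void)
{
    /* The special value 0xffffffff asks the kernel to report only. */
    return personality(0xffffffff);
}

/* Keep the execution domain identifier and drop the flag bits. */
int personality_base(int persona)
{
    assert(persona >= 0);
    return persona & PER_MASK;
}
```

On a typical native Linux process, personality_base(query_personality()) is expected to be PER_LINUX, although emulation layers and 32-bit compatibility modes may report other values.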
20.4. The exec Functions
Table 20-7. The exec functions
Function name   PATH search   Command-line arguments   Environment array
execl( )        No            List                     No
execlp( )       Yes           List                     No
execle( )       No            List                     Yes
execv( )        No            Array                    No
execvp( )       Yes           Array                    No
execve( )       No            Array                    Yes
All exec functions, with the exception of execve( ), are wrapper routines defined in the C library and
use execve( ), which is the only system call offered by Linux to deal with program execution.
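The usual way to call execve( ) is from a freshly forked child, because a successful call never returns. The helper below is a sketch with a hypothetical name (spawn_and_wait); the exit code 127 for a failed execve( ) follows shell convention and is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program at `path` in a child process and wait for it.
 * Returns the child's exit status, 127 if execve() failed, -1 on error.
 */
int spawn_and_wait(const char *path, char *const argv[], char *const envp[])
{
    assert(path != NULL && argv != NULL && envp != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* No PATH search: `path` must name the executable file itself. */
        execve(path, argv, envp);
        _exit(127);   /* reached only if execve() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both arrays must end with a NULL pointer, matching the second and third parameters that the system call receives.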
The sys_execve( ) service routine receives the following parameters:
(1) The address of the executable file pathname (in the User Mode address space).
(2) The address of a NULL-terminated array (in the User Mode address space) of pointers to strings
(again in the User Mode address space); each string represents a command-line argument.
(3) The address of a NULL-terminated array (in the User Mode address space) of pointers to strings
(again in the User Mode address space); each string represents an environment variable in the
NAME=value format.
The function copies the executable file pathname into a newly allocated page frame. It then invokes
the do_execve( ) function, passing to it the pointers to the page frame, to the arrays of pointers, and
to the location of the Kernel Mode stack where the User Mode register contents are saved. In turn,
do_execve( ) performs the following operations:
1:
2:
......................................
Appendix A. System Startup
A.1. Prehistoric Age: the BIOS
The BIOS bootstrap procedure essentially performs the following four operations:
1. Executes a series of tests on the computer hardware to establish which devices are present and
whether they are working properly. This phase is often called Power-On Self-Test (POST).
During this phase, several messages, such as the BIOS version banner, are displayed.
2. Initializes the hardware devices. This phase is crucial in modern PCI-based architectures,
because it guarantees that all hardware devices operate without conflicts on the IRQ lines and
I/O ports. At the end of this phase, a table of installed PCI devices is displayed.
3. Searches for an operating system to boot. Actually, depending on the BIOS setting, the
procedure may try to access (in a predefined, customizable order) the first sector (boot sector)
of every floppy disk, hard disk, and CD-ROM in the system.
4. As soon as a valid device is found, it copies the contents of its first sector into RAM, starting
from physical address 0x00007c00, and then jumps into that address and executes the code just
loaded.
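The text does not say what makes a device "valid". On conventional PC BIOSes the test is the two-byte signature 0x55, 0xAA at the end of the 512-byte first sector; that detail is an assumption added here, not something stated above. A sketch of the check, with hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    BOOT_SECTOR_SIZE = 512,
    BOOT_LOAD_ADDR   = 0x00007c00   /* where the BIOS copies the sector */
};

/*
 * Return 1 if `sector` ends with the conventional boot signature
 * (assumed: 0x55 at offset 510, 0xAA at offset 511), 0 otherwise.
 */
int boot_sector_looks_valid(const uint8_t *sector, size_t len)
{
    assert(sector != NULL);
    if (len < BOOT_SECTOR_SIZE)
        return 0;
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

The same function can be applied to the first 512 bytes read from a disk image file.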
A.2. Ancient Age: the Boot Loader
The boot loader is the program invoked by the BIOS to load the image of an operating system kernel
into RAM.
A.2.1. Booting Linux from a Disk
1: Invokes a BIOS procedure to display a "Loading" message.
2: Invokes a BIOS procedure to load an initial portion of the kernel image from disk: the first 512
bytes of the kernel image are put in RAM at address 0x00090000, while the code of the setup( )
function (see below) is put in RAM starting from address 0x00090200.
3: Invokes a BIOS procedure to load the rest of the kernel image from disk and puts the image in
RAM starting from either low address 0x00010000 (for small kernel images compiled with make
zImage) or high address 0x00100000 (for big kernel images compiled with make bzImage). In the
following discussion, we say that the kernel image is "loaded low" or "loaded high" in RAM,
respectively. Support for big kernel images uses essentially the same booting scheme as the
other one, but it places data in different physical memory addresses to avoid problems with the
ISA hole mentioned in the section "Physical Memory Layout" in Chapter 2.
4: Jumps to the setup( ) code.
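The addresses in the steps above can be collected in a few constants, which makes the arithmetic between them easy to check. The names below are hypothetical; only the numeric values come from the text.

```c
#include <assert.h>
#include <stdint.h>

enum kernel_image_kind { IMAGE_ZIMAGE, IMAGE_BZIMAGE };

#define BOOT_SECTOR_ADDR 0x00090000u   /* first 512 bytes of the image */
#define SETUP_ADDR       0x00090200u   /* code of the setup( ) function */
#define LOAD_LOW_ADDR    0x00010000u   /* rest of a zImage */
#define LOAD_HIGH_ADDR   0x00100000u   /* rest of a bzImage */

/* Where the boot loader puts the rest of the kernel image. */
uint32_t kernel_load_address(enum kernel_image_kind kind)
{
    assert(kind == IMAGE_ZIMAGE || kind == IMAGE_BZIMAGE);
    return kind == IMAGE_BZIMAGE ? LOAD_HIGH_ADDR : LOAD_LOW_ADDR;
}

/* Bytes between the low load address and the copy of the first 512 bytes. */
uint32_t low_load_room(void)
{
    return BOOT_SECTOR_ADDR - LOAD_LOW_ADDR;
}
```

SETUP_ADDR is exactly 512 bytes past BOOT_SECTOR_ADDR, and low_load_room() evaluates to 0x80000, that is, 512 KB.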
A.3. Middle Ages: the setup( ) Function
A.4. Renaissance: the startup_32( ) Functions
A.5. Modern Age: the start_kernel( ) Function
Appendix B. Modules
B.1. To Be (a Module) or Not to Be?
The kernel has two key tasks to perform in managing modules.
The first task is making sure the rest of the kernel can reach the module's global symbols, such as the entry point to its main function. A
module must also know the addresses of symbols in the kernel and in other modules. Thus, references are resolved once and for all when a module is linked.
The second task consists of keeping track of the use of modules, so that no module is unloaded while another module or another
part of the kernel is using it. A simple reference count keeps track of each module's usage.
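The reference-count idea can be shown with a user-space toy. This is not kernel code: the names toy_module, toy_module_get, toy_module_put, and toy_module_can_unload are hypothetical, and a single counter stands in for whatever the kernel actually uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a module with a single usage counter. */
struct toy_module {
    const char  *name;
    unsigned int refcount;
};

/* Record that someone started using the module. */
void toy_module_get(struct toy_module *m)
{
    assert(m != NULL);
    m->refcount++;
}

/* Record that a user is done with the module. */
void toy_module_put(struct toy_module *m)
{
    assert(m != NULL && m->refcount > 0);
    m->refcount--;
}

/* Unloading is allowed only when nobody is using the module. */
bool toy_module_can_unload(const struct toy_module *m)
{
    assert(m != NULL);
    return m->refcount == 0;
}
```

Every get must be paired with a put; a missing put leaves the counter above zero and the module can never be unloaded.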
B.1.1. Module Licenses
B.2. Module Implementation
The module object describes a module.
Table B-1. The module object
Type                                  Name                  Description
enum module_state                     state                 The internal state of the module
struct list_head                      list                  Pointers for the list of modules
char [60]                             name                  The module name
struct module_kobject                 kobj                  Includes a kobject data structure and a pointer to this module object
struct module_param_attrs *           param_attrs           Pointer to an array of module parameter descriptors
const struct kernel_symbol *          syms                  Pointer to an array of exported symbols
unsigned int                          num_syms              Number of exported symbols
const unsigned long *                 crcs                  Pointer to an array of CRC values for the exported symbols
const struct kernel_symbol *          gpl_syms              Pointer to an array of GPL-exported symbols
unsigned int                          num_gpl_syms          Number of GPL-exported symbols
const unsigned long *                 gpl_crcs              Pointer to an array of CRC values for the GPL-exported symbols
unsigned int                          num_exentries         Number of entries in the module's exception table
const struct exception_table_entry *  extable               Pointer to the module's exception table
int (*)(void)                         init                  The initialization method of the module
void *                                module_init           Pointer to the dynamic memory area allocated for module's initialization
void *                                module_core           Pointer to the dynamic memory area allocated for module's core functions and data structures
unsigned long                         init_size             Size of the dynamic memory area required for module's initialization
unsigned long                         core_size             Size of the dynamic memory area required for module's core functions and data structures
unsigned long                         init_text_size        Size of the executable code used for module's initialization; used only when linking the module
unsigned long                         core_text_size        Size of the core executable code of the module; used only when linking the module
struct mod_arch_specific              arch                  Architecture-dependent fields (none in the 80x86 architecture)
int                                   unsafe                Flag set if the module cannot be safely unloaded
int                                   license_gplok         Flag set if the module license is GPL-compatible
struct module_ref [NR_CPUS]           ref                   Per-CPU usage counters
struct list_head                      modules_which_use_me  List of modules that rely on this module
struct task_struct *                  waiter                The process that is trying to unload the module
void (*)(void)                        exit                  Exit method of the module
Elf_Sym *                             symtab                Pointer to an array of module's ELF symbols for the /proc/kallsyms file
unsigned long                         num_symtab            Number of module's ELF symbols shown in /proc/kallsyms
char *                                strtab                The string table for the module's ELF symbols shown in /proc/kallsyms
struct module_sect_attrs *            sect_attrs            Pointer to an array of module's section attribute descriptors (displayed in the sysfs filesystem)
void *                                percpu                Pointer to CPU-specific memory areas
char *                                args                  Command line arguments used when linking the module
B.2.1. Module Usage Counters
B.2.2. Exporting Symbols
This operation is delegated to the insmod external program.
the __kstrtab section includes the names of the symbols
the __ksymtab section includes the addresses of the symbols that can be used by all kinds of modules
the __ksymtab_gpl section includes the addresses of the symbols that can be used by the modules released under a GPL-compatible license.
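The split between the two address tables can be mimicked in user space with two arrays and a license flag. Everything in the sketch below is a toy: the table contents, the symbol values, and the names toy_symbol and toy_resolve are hypothetical and do not reproduce the kernel's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_symbol {
    const char   *name;
    unsigned long value;   /* stands in for the symbol's address */
};

/* Symbols usable by every module. */
static const struct toy_symbol toy_ksymtab[] = {
    { "toy_printk",  0x1000 },
    { "toy_kmalloc", 0x2000 },
};

/* Symbols reserved to modules with a GPL-compatible license. */
static const struct toy_symbol toy_ksymtab_gpl[] = {
    { "toy_gpl_helper", 0x3000 },
};

static const struct toy_symbol *toy_search(const struct toy_symbol *tab,
                                           size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Resolve `name`; the GPL-only table is consulted only if license_gplok. */
bool toy_resolve(const char *name, bool license_gplok, unsigned long *value)
{
    assert(name != NULL && value != NULL);
    const struct toy_symbol *s =
        toy_search(toy_ksymtab, sizeof toy_ksymtab / sizeof toy_ksymtab[0], name);
    if (s == NULL && license_gplok)
        s = toy_search(toy_ksymtab_gpl,
                       sizeof toy_ksymtab_gpl / sizeof toy_ksymtab_gpl[0], name);
    if (s == NULL)
        return false;
    *value = s->value;
    return true;
}
```

A module without a GPL-compatible license simply never sees the second table, so a reference to one of its symbols stays unresolved.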
B.2.3. Module Dependency
B.3. Linking and Unlinking Modules
A user can link a module into the running kernel by executing the insmod external program.
The sys_init_module( ) service routine does all the real work; it performs the following main operations:
To unlink a module, a user invokes the rmmod external program.
In turn, the sys_delete_module( ) service routine performs the following main operations:
B.4. Linking Modules on Demand
B.4.1. The modprobe Program
To automatically link a module, the kernel creates a kernel thread to execute the modprobe external
program, which takes care of possible complications due to module dependencies.
How does modprobe know about module dependencies? Another external program named depmod is
executed at system startup. It looks at all the modules compiled for the running kernel, which are
usually stored inside the /lib/modules directory. Then it writes all module dependencies to a file
named modules.dep. The modprobe program can thus simply compare the information stored in the
file with the list of linked modules yielded by the /proc/modules file.
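A program can perform the same kind of check by reading /proc/modules itself. The sketch below assumes that each line starts with the module name, its size, its usage count, and a comma-separated list of the modules using it ("-" if none); the struct and function names are hypothetical. Lines whose usage count is not a number are rejected.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct module_line {
    char          name[64];
    unsigned long size;
    unsigned int  refcount;
    char          used_by[256];   /* comma-separated users, or "-" */
};

/* Parse the leading fields of one /proc/modules line. 0 on success, -1 otherwise. */
int parse_modules_line(const char *line, struct module_line *out)
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line, "%63s %lu %u %255s",
                   out->name, &out->size, &out->refcount, out->used_by);
    return n == 4 ? 0 : -1;
}

/* Return 1 if a module called `name` appears in the open stream, else 0. */
int module_is_linked(FILE *modules, const char *name)
{
    assert(modules != NULL && name != NULL);
    char line[512];
    struct module_line m;
    while (fgets(line, sizeof line, modules) != NULL) {
        if (parse_modules_line(line, &m) == 0 && strcmp(m.name, name) == 0)
            return 1;
    }
    return 0;
}
```

A caller would open the file with fopen("/proc/modules", "r"), pass the stream to module_is_linked( ), and close it afterward.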
B.4.2. The request_module( ) Function