Processes

Prev: file-io-further-details Next: memory-allocation

Processes and Programs

A process is an instance of an executing program. A program is a file containing information that describes how to construct a process at run time, such as the machine-language instructions, the program entry-point address, initialized data, and symbol and relocation tables.

A program can be used to construct many processes.

Process ID and Parent Process ID

Each process has a process ID, a PID, which uniquely identifies the process on the system. getpid returns the process ID of the calling process.

#include <unistd.h>

pid_t getpid(void); // Always successfully returns process ID of caller

By default, the Linux kernel limits process IDs to a maximum of 32,767. The limit is adjustable through the Linux-specific file /proc/sys/kernel/pid_max.

To get the parent process ID, use getppid.

#include <unistd.h>
pid_t getppid(void); // Always successfully returns process ID of parent of caller

Memory Layout of a Process

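The memory of a process is divided into several parts, called segments: the text segment (machine instructions), the initialized data segment, the uninitialized data segment (bss), the heap, and the stack.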

Virtual Memory Management

Linux uses virtual memory to lay out each process's address space, as illustrated in the following figure:

Figure: Process Memory Layout

A virtual memory scheme splits the memory used by each program into small, fixed-size units called pages. Correspondingly, RAM is divided into a series of page frames of the same size.

The kernel maintains a page table for each process, which records the location of each page in the process's virtual address space.

Figure: Virtual Memory Page Table

If a process tries to access memory without a corresponding page-table entry, it receives a SIGSEGV signal.

The Stack and Stack Frames

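The stack grows and shrinks as functions are called and return. Each function call allocates a stack frame, which holds the function's arguments, its local variables, and call linkage information such as the return address.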

Command-Line Arguments (argc, argv)

Every C program must have a function called main, where execution starts. Its first parameter, int argc, is the number of command-line arguments. Its second, char *argv[], is an array of pointers to the argument strings, terminated by a NULL pointer. The first string, argv[0], is normally the name used to invoke the program. This allows a nice trick, used by utilities such as busybox: a single executable, reachable under several names, behaves differently depending on the name it was invoked with.

Environment List

Each process has an array of strings called the environment list, which holds strings of the form name=value. Each name is referred to as an environment variable. When a new process is created, it inherits a copy of its parent's environment.

In a Bourne-style shell such as bash, the export command adds a value to the environment:

export SHELL=/bin/bash

Or, to affect a single program:

NAME=value ./program

In the C shell (csh), the setenv command is used instead of export.

To print the current environment list, printenv can be used.

To access the current environment in a program, you can iterate through environ.

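#include <stdio.h>
#include <stdlib.h>

extern char **environ; /* NULL-terminated array of "name=value" strings */

int main(int argc, char *argv[]) {
    char **ep;
    /* Print each environment entry on its own line */
    for (ep = environ; *ep != NULL; ep++)
        puts(*ep);
    exit(EXIT_SUCCESS);
}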

To get a specific environment variable, use getenv.

#include <stdlib.h>
char *getenv(const char *name); // Returns pointer to (value) string, or NULL if no such variable

To set a specific environment variable, use setenv.

#include <stdlib.h>
int setenv(const char *name, const char *value, int overwrite); // Returns 0 on success, or -1 on error

To unset a variable, use unsetenv.

#include <stdlib.h>
int unsetenv(const char *name); // Returns 0 on success, or -1 on error

To clear the entire environment, use clearenv.

#define _DEFAULT_SOURCE /* glibc 2.19 and later; older versions: _BSD_SOURCE or _SVID_SOURCE */
#include <stdlib.h>
int clearenv(void); // Returns 0 on success, or nonzero on error

Performing a Nonlocal Goto: setjmp and longjmp

setjmp and longjmp are libc functions that perform a nonlocal goto. C's goto cannot jump from the current function into another function. Imagine a deeply nested function that detects an error, while the code that handles the error lives in a function several levels up the call stack. Without a nonlocal goto, every intermediate function would have to return an error status to its caller, one level at a time, until control reached the handler.

#include <setjmp.h>
int setjmp(jmp_buf env); // Returns 0 on initial call, nonzero on return via longjmp()
void longjmp(jmp_buf env, int val);

setjmp establishes a target for a later jump performed by longjmp.

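setjmp and longjmp make code harder to read, and they interact badly with compiler optimization: a local variable that is modified between the setjmp call and the longjmp may hold an indeterminate value afterward unless it is declared volatile. This can lead to subtle bugs, so avoid these functions where possible.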
