Linux Operating System


Linux Operating System 許 富 皓

NameSpace

Namespace [Michael Kerrisk] Currently, Linux implements six different types of namespaces. The CLONE_NEW* identifiers listed in parentheses are the names of the constants used to identify namespace types when employing the namespace-related APIs (clone(), unshare(), and setns()).

Six Linux Namespaces
Mount namespaces (CLONE_NEWNS, Linux 2.4.19)
UTS namespaces (CLONE_NEWUTS, Linux 2.6.19)
IPC namespaces (CLONE_NEWIPC, Linux 2.6.19)
PID namespaces (CLONE_NEWPID, Linux 2.6.24)
Network namespaces (CLONE_NEWNET, Linux 2.6.29)
User namespaces (CLONE_NEWUSER, Linux 3.8)

Goals of Namespace (1) [Michael Kerrisk] The purpose of each namespace is to wrap a particular global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource.

Goals of Namespace (2) [Michael Kerrisk] One of the overall goals of namespaces is to support the implementation of containers, a tool for lightweight virtualization (as well as other purposes) that provides a group of processes with the illusion that they are the only processes on the system.

PID Namespace [Michael Kerrisk] The global resource isolated by PID namespaces is the process ID number space. This means that processes in different PID namespaces can have the same process ID. PID namespaces are used to implement containers that can be migrated between host systems while keeping the same process IDs for the processes inside the container.

Process PID [Michael Kerrisk] As with processes on a traditional Linux (or UNIX) system, the process IDs within a PID namespace are unique, and are assigned sequentially starting with PID 1. Likewise, as on a traditional Linux system, PID 1—the init process—is special: it is the first process created within the namespace, and it performs certain management tasks within the namespace.

Creation of a New PID Namespace [Michael Kerrisk] A new PID namespace is created by calling clone() with the CLONE_NEWPID flag:
    child_pid = clone(childFunc, child_stack, CLONE_NEWPID | SIGCHLD, argv[1]);

PID Namespace Hierarchy [Michael Kerrisk] PID namespaces form a hierarchy: A process can "see" only those processes contained in its own PID namespace and in the child namespaces nested below that PID namespace. If the parent of the child created by clone() is in a different namespace, the child cannot "see" the parent; therefore, getppid() reports the parent PID as being zero.

PID Namespace Hierarchy [text book]

/proc/PID Directory [Michael Kerrisk] Within a PID namespace, the /proc/PID directories show information only about processes within that PID namespace or processes within one of its descendant namespaces.

Mount a proc filesystem [Michael Kerrisk] However, in order to make the /proc/PID directories that correspond to a PID namespace visible, the proc filesystem ("procfs" for short) needs to be mounted from within that PID namespace. From a shell running inside the PID namespace (perhaps invoked via the system() library function), we can do this using a mount command of the following form:
    # mount -t proc proc /mount_point

Nested PID Namespaces [Michael Kerrisk] PID namespaces are hierarchically nested in parent-child relationships. Within a PID namespace, it is possible to see all other processes in the same namespace, as well as all processes that are members of descendant namespaces.

“See” a Process [Michael Kerrisk] Here, "see" means being able to make system calls that operate on specific PIDs, for example using kill() to send a signal to a process. Processes in a child PID namespace cannot see processes that exist (only) in the parent PID namespace (or in more distant ancestor namespaces).
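A minimal sketch of what "seeing" a process means from user space, assuming a POSIX system. The helper name can_see is hypothetical. It relies on the documented behaviour of kill() with signal 0, which performs the lookup and permission checks without delivering anything.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/* Returns true if `pid` names a process visible from the caller's
 * PID namespace. Signal 0 only performs the existence and permission
 * checks; no signal is delivered. */
static bool can_see(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return true;
    /* EPERM: the process exists, but we are not allowed to signal it. */
    return errno == EPERM;
}
```

A process in a child PID namespace calling this helper with a PID that exists only in an ancestor namespace would get false (ESRCH), because the lookup is done in the caller's own namespace.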

PID returned by getpid() [Michael Kerrisk] A process will have one PID in each of the layers of the PID namespace hierarchy starting from the PID namespace in which it resides through to the root PID namespace. Calls to getpid() always report the PID associated with the namespace in which the process resides.

Traditional init Process and Signals The traditional Linux init process is treated specially with respect to signals. The only signals that can be delivered to init are those for which the process has established a signal handler. All other signals are ignored. This prevents the init process, whose presence is essential for the stable operation of the system, from being accidentally killed, even by the superuser.

init Processes of Namespaces and Signals PID namespaces implement some analogous behavior for the namespace-specific init process. Other processes in the namespace (even privileged processes) can send only those signals for which the init process has established a signal handler. Note that (as for the traditional init process) the kernel can still generate signals for the PID namespace init process in all of the usual circumstances, e.g., hardware exceptions, terminal-generated signals such as SIGTTOU, and expiration of a timer.

Signals from Ancestor Namespaces Signals can be sent to the PID namespace init process by processes in ancestor PID namespaces. Again, only the signals for which the init process has established a handler can be sent, with two exceptions: SIGKILL and SIGSTOP.

init Process and SIGKILL and SIGSTOP When a process in an ancestor PID namespace sends SIGKILL or SIGSTOP to the init process, the signal is forcibly delivered (and cannot be caught). The SIGSTOP signal stops the init process; SIGKILL terminates it.

Termination of init Processes Since the init process is essential to the functioning of the PID namespace, if the init process is terminated by SIGKILL (or it terminates for any other reason), the kernel terminates all other processes in the namespace by sending them a SIGKILL signal.

Connection between Processes and Namespaces Each process descriptor (task_struct) refers to the namespaces of the process through the field: struct nsproxy *nsproxy;

Definition of struct nsproxy
    struct nsproxy {
        atomic_t count;
        struct uts_namespace *uts_ns;
        struct ipc_namespace *ipc_ns;
        struct mnt_namespace *mnt_ns;
        struct pid_namespace *pid_ns;
        struct net *net_ns;
    };
An nsproxy is shared by processes which share all namespaces. As soon as a single namespace is cloned or unshared, the nsproxy is copied.

struct nsproxy A structure to contain pointers to all per-process namespaces - fs (mount), uts, network, ipc, etc. 'count' is the number of processes holding a reference.

Initial Global Namespace
    struct nsproxy init_nsproxy = {
        .count = ATOMIC_INIT(1),
        .uts_ns = &init_uts_ns,
    #if defined(CONFIG_POSIX_MQUEUE) || defined(CONFIG_SYSVIPC)
        .ipc_ns = &init_ipc_ns,
    #endif
        .mnt_ns = NULL,
        .pid_ns = &init_pid_ns,
    #ifdef CONFIG_NET
        .net_ns = &init_net,
    #endif
    };

Process Identification Number Unix processes are always assigned a number to uniquely identify them in their namespace. This number is called the process identification number or PID for short. Each process generated with fork or clone is automatically assigned a new unique PID value by the kernel.

Process ID PIDs are numbered sequentially in each PID namespace: the PID of a newly created process is normally the PID of the previously created process increased by one. Of course, there is an upper limit on the PID values; when the kernel reaches this limit, it must start recycling the lower, unused PIDs. By default, the maximum PID number is PID_MAX_DEFAULT - 1 (32,767 in the usual configuration); the limit can be raised, but never beyond PID_MAX_LIMIT - 1 (4,194,303 on 64-bit systems).

Maximum PID Number
    #define PAGE_SHIFT 12
    #define PAGE_SIZE (1UL << PAGE_SHIFT)
    #define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000)
    #define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \
        (sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT))
    #define PIDMAP_ENTRIES ((PID_MAX_LIMIT + 8*PAGE_SIZE - 1)/PAGE_SIZE/8)
P.S.: With 4 KB pages, PID_MAX_LIMIT is equal to 2^15 (32,768) or, on 64-bit systems without CONFIG_BASE_SMALL, 2^22 (4,194,304).
P.S.: PIDMAP_ENTRIES is correspondingly equal to 1 or 128, because one page holds 8 * 4096 = 2^15 bits.
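The arithmetic behind these macros can be checked with a small user-space sketch. The function names pid_max_limit and pidmap_entries are hypothetical; the configuration (CONFIG_BASE_SMALL, sizeof(long), page size) is passed as parameters instead of being fixed at compile time.

```c
#include <stddef.h>

/* Mirrors the PID_MAX_LIMIT macro, with the configuration supplied
 * as arguments rather than as compile-time constants. */
static unsigned long pid_max_limit(int base_small, size_t long_size,
                                   unsigned long page_size)
{
    if (base_small)
        return page_size * 8;
    return long_size > 4 ? 4UL * 1024 * 1024 : 0x8000UL;
}

/* Mirrors PIDMAP_ENTRIES: one bit per PID, rounded up to whole pages. */
static unsigned long pidmap_entries(unsigned long limit,
                                    unsigned long page_size)
{
    return (limit + 8 * page_size - 1) / page_size / 8;
}
```

With 4096-byte pages, a 64-bit configuration gives a limit of 4,194,304 PIDs and 128 bitmap pages, while a 32-bit configuration gives 32,768 PIDs in a single page.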

PIDs in PID Namespaces Namespaces add some additional complexity to how PIDs are managed. PID namespaces are organized in a hierarchy.

A Process May Have Multiple PIDs When a new namespace is created, all PIDs that are used in this namespace are visible to the parent namespace, but the child namespace does not see PIDs of the parent namespace. However this implies that some processes are equipped with more than one PID, namely, one per namespace they are visible in. This must be reflected in the data structures.

Global IDs Global IDs are identification numbers that are valid within the kernel itself and in the initial global namespace. For each ID type, a given global identifier is guaranteed to be unique in the whole system.

Local IDs Local IDs belong to a specific namespace and are not globally valid. For each ID type, they are valid within the namespace to which they belong, but identifiers of identical type may appear with the same ID number in a different namespace.

Global PID and TGID The global PID and TGID are directly stored in the task_struct, namely, in the elements pid and tgid:
    typedef int __kernel_pid_t;
    typedef __kernel_pid_t pid_t;
    struct task_struct {
        ...
        pid_t pid;
        pid_t tgid;
    };

PIDs and Processes Linux associates a different PID with each process or lightweight process in the system. As we shall see later in this chapter, there is a tiny exception on multiprocessor systems. This approach allows the maximum flexibility, because every execution context in the system can be uniquely identified.

Threads in the Same Group Must Have a Common PID On the other hand, Unix programmers expect threads in the same group to have a common PID. For instance, it should be possible to send a signal specifying a PID that affects all threads in the group. In fact, the POSIX 1003.1c standard states that all threads of a multithreaded application must have the same PID.

Thread Group To comply with the POSIX 1003.1c standard, Linux makes use of thread groups. The identifier shared by the threads is the PID of the thread group leader, that is, the PID of the first lightweight process in the group. The thread group ID of a thread group is called TGID.

Process Groups Modern Unix operating systems introduce the notion of process groups to represent a job abstraction. For example, in order to execute the command line
    $ ls | sort | more
a shell that supports process groups, such as bash, creates a new group for the three processes corresponding to ls, sort, and more. In this way, the shell acts on the three processes as if they were a single entity (the job, to be precise).

Process Groups [waikato] One important feature is that it is possible to send a signal to every process in the group. Process groups are used for distribution of signals, and by terminals to arbitrate requests for their input and output.

Process Groups [waikato]
Foreground Process Groups: A foreground process has read and write access to the terminal. Every process in the foreground receives the SIGINT (^C), SIGQUIT (^\), and SIGTSTP (^Z) signals.
Background Process Groups: A background process does not have read access to the terminal. If a background process attempts to read from its controlling terminal, its process group will be sent a SIGTTIN.

Group Leaders and Process Group IDs Each process descriptor includes a field containing the process group ID. Each group of processes may have a group leader, which is the process whose PID coincides with the process group ID.

Creation of a New Process Group [Bhaskar] A newly created process is initially inserted into the process group of its parent. The shell after doing a fork would explicitly call setpgid to set the process group of the child. The process group is explicitly set for purposes of job control. When a command is given at the shell prompt, that process or processes (if there is piping) is assigned a new process group.
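A minimal sketch of the fork-then-setpgid pattern described above, assuming a POSIX system. The function name spawn_in_new_group and the pipe used as a gate are hypothetical. Both parent and child call setpgid(), which is the usual way to avoid a race over which of them runs first.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child, places it in a new process group whose ID equals the
 * child's PID, and reports whether the kernel agrees. */
static bool spawn_in_new_group(void)
{
    int gate[2];
    if (pipe(gate) != 0)
        return false;

    pid_t child = fork();
    if (child < 0) {
        close(gate[0]);
        close(gate[1]);
        return false;
    }
    if (child == 0) {
        char byte;
        close(gate[1]);
        setpgid(0, 0);                    /* child side of the race */
        if (read(gate[0], &byte, 1) < 0)  /* block until the parent is done */
            _exit(1);
        _exit(0);
    }

    close(gate[0]);
    setpgid(child, child);                /* parent side of the race */
    bool ok = (getpgid(child) == child);  /* child now leads its own group */
    close(gate[1]);                       /* EOF releases the child */
    waitpid(child, NULL, 0);
    return ok;
}
```

A real shell would exec the command in the child instead of waiting on a pipe, and would put every process of a pipeline into the group of the first one.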

Login Sessions Modern Unix kernels also introduce login sessions. Informally, a login session contains all processes that are descendants of the process that has started a working session on a specific terminal -- usually, the first command shell process created for the user.

Login Sessions vs. Process Groups All processes in a process group must be in the same login session. A login session may have several process groups active simultaneously; one of these process groups is always in the foreground, which means that it has access to the terminal. The other active process groups are in the background. When a background process tries to access the terminal, it receives a SIGTTIN or SIGTTOU signal. In many command shells, the internal commands bg and fg can be used to put a process group in either the background or the foreground.

Representation of a PID Namespace struct pid_namespace { struct kref kref; struct pidmap pidmap[PIDMAP_ENTRIES]; int last_pid; unsigned int nr_hashed; struct task_struct *child_reaper; struct kmem_cache *pid_cachep; unsigned int level; struct pid_namespace *parent; : struct user_namespace *user_ns; struct work_struct proc_work; kgid_t pid_gid; int hide_pid; int reboot; /* group exit code if this pidns was rebooted */ unsigned int proc_inum; };

child_reaper Field Every PID namespace is equipped with a process that assumes the role taken by init in the global picture. One of the purposes of init is to call wait4 for orphaned processes, and this must likewise be done by the init process of the namespace. A pointer to the task structure of this process is stored in child_reaper.

parent Field parent is a pointer to the parent namespace, and level denotes the depth in the namespace hierarchy. The initial namespace has level 0, any children of this namespace are in level 1, children of children are in level 2, and so on. Counting the levels is important because IDs in higher levels must be visible in lower levels.

pidmap Field
    struct pidmap {
        atomic_t nr_free;
        void *page;
    };
    #define PIDMAP_ENTRIES ((PID_MAX_LIMIT + 8*PAGE_SIZE - 1)/PAGE_SIZE/8)
    struct pid_namespace {
        :
        struct pidmap pidmap[PIDMAP_ENTRIES];
        :
    };

PID bitmap [1][2][3] To keep track of which PIDs have been allocated and which are still free, the kernel uses a large bitmap in which each PID is identified by a bit. The value of the PID is obtained from the position of the bit in the bitmap.

Allocate a Free PID Allocating a free PID is then restricted essentially to looking for the first bit in the bitmap whose value is 0; this bit is then set to 1. static int alloc_pidmap(struct pid_namespace *pid_ns)

Free a PID Freeing a PID can be implemented by "toggling" the corresponding bit from 1 to 0. static void free_pidmap(struct upid *upid)
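The bitmap idea can be sketched in user space as follows. This is not the kernel's alloc_pidmap(): all demo_ names and the tiny limit of 64 are hypothetical, and there is no locking. As an assumption modelled on the last_pid field, the search starts just after the most recently allocated PID and wraps around once.

```c
#include <limits.h>

#define DEMO_PID_MAX 64   /* hypothetical, tiny limit for illustration */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS ((DEMO_PID_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct demo_pidmap {
    unsigned long bits[DEMO_WORDS];  /* bit n set means PID n is in use */
    int last_pid;                    /* most recently allocated PID */
};

/* Scans forward from last_pid + 1, wrapping once, for a clear bit.
 * Returns the allocated PID, or -1 if the map is full. PID 0 is
 * never handed out. */
static int demo_alloc_pid(struct demo_pidmap *map)
{
    for (int step = 1; step <= DEMO_PID_MAX; step++) {
        int pid = (map->last_pid + step) % DEMO_PID_MAX;
        if (pid == 0)
            continue;
        unsigned long mask = 1UL << (pid % BITS_PER_WORD);
        unsigned long *word = &map->bits[pid / BITS_PER_WORD];
        if (!(*word & mask)) {
            *word |= mask;           /* flip the bit from 0 to 1 */
            map->last_pid = pid;
            return pid;
        }
    }
    return -1;
}

/* Clears the bit so that the PID can be recycled later. */
static void demo_free_pid(struct demo_pidmap *map, int pid)
{
    map->bits[pid / BITS_PER_WORD] &= ~(1UL << (pid % BITS_PER_WORD));
}
```

Because the scan continues from last_pid, a freed PID is not reused immediately; it is handed out again only after the search wraps around.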

struct upid struct upid represents the information that is visible in a specific namespace. struct upid { /* Try to keep pid_chain in the same cacheline as nr for find_vpid */ int nr; struct pid_namespace *ns; struct hlist_node pid_chain; };

Fields of struct upid nr represents the numerical value of an ID, and ns is a pointer to the namespace to which the value belongs. All upid instances are kept on a hash table to which we will come in a moment, and pid_chain allows for implementing hash overflow lists with standard methods of the kernel.

The Kernel-internal Representation of A PID struct pid is the kernel-internal representation of a PID. struct pid { atomic_t count; unsigned int level; /* lists of tasks that use this pid */ struct hlist_head tasks[PIDTYPE_MAX]; struct rcu_head rcu; struct upid numbers[1]; };

Type enum pid_type enum pid_type { PIDTYPE_PID, PIDTYPE_PGID, PIDTYPE_SID, PIDTYPE_MAX }; Notice that thread group IDs are not contained in this collection. This is because the thread group ID is simply given by the PID of the thread group leader, so a separate entry is not necessary.

level Field of struct pid A process can be visible in multiple namespaces, and the local ID in each namespace will be different. level denotes in how many namespaces the process is visible (in other words, this is the depth of the containing namespace in the namespace hierarchy).

numbers Field of struct pid numbers contains a upid instance for each level. Note that the array consists formally of one element, and this is true if a process is contained only in the global namespace. Since the element is at the end of the structure, additional entries can be added to the array by simply allocating more space.
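A user-space sketch of the "allocate more space for the trailing array" technique. The demo_ types are hypothetical stand-ins, and the sketch uses a C99 flexible array member (numbers[]) instead of the one-element array on the slide, which is the portable way to express the same layout.

```c
#include <stdlib.h>

struct demo_upid {
    int nr;          /* PID value inside one namespace */
    const void *ns;  /* stand-in for struct pid_namespace * */
};

struct demo_pid {
    unsigned int level;
    struct demo_upid numbers[];  /* level + 1 entries follow the header */
};

/* Allocates a zeroed demo_pid with room for level + 1 upid entries. */
static struct demo_pid *demo_pid_new(unsigned int level)
{
    size_t size = sizeof(struct demo_pid)
                + ((size_t)level + 1) * sizeof(struct demo_upid);
    struct demo_pid *pid = calloc(1, size);
    if (pid)
        pid->level = level;
    return pid;
}
```

A process created two namespaces below the root would use demo_pid_new(2) and then fill numbers[0] through numbers[2].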

Graphic Explanation of Field struct upid numbers[] [figure: a struct pid (count, tasks[PIDTYPE_PID], tasks[PIDTYPE_PGID], tasks[PIDTYPE_SID], level) followed by the entries numbers[0], numbers[1], and so on, each a struct upid with the fields nr, ns, and pid_chain]

tasks Field of struct pid The definition of struct pid is headed by a reference counter. tasks is an array with a list head for every ID type. This is necessary because an ID can be used for several processes. All task_struct instances that share a given ID are linked on this list. PIDTYPE_MAX denotes the number of ID types.

pids Field of struct task_struct Since all task_struct structures that share an identifier are kept on a list headed by tasks, a list element is required in struct task_struct: struct task_struct { ... /* PID/PID hash table linkage. */ struct pid_link pids[PIDTYPE_MAX]; };

struct pid_link struct pid_link { struct hlist_node node; struct pid *pid; };

Graphic Explanation of Field struct hlist_head tasks[] [figure: two task_struct instances whose pids[0], pids[1], and pids[2] entries (each with node and pid) are attached to the tasks[0], tasks[1], and tasks[2] list heads of two struct pid instances; each task_struct also has a group_leader field]

Create a New pid Instance When a new process is created, a new pid instance is also created.
    long do_fork(unsigned long clone_flags, unsigned long stack_start,
                 unsigned long stack_size, int __user *parent_tidptr,
                 int __user *child_tidptr)
    {
        :
        p = copy_process(clone_flags, stack_start, stack_size,
                         child_tidptr, NULL, trace);
        :
    }
    static struct task_struct *copy_process(unsigned long clone_flags,
                 unsigned long stack_start, unsigned long stack_size,
                 int __user *child_tidptr, struct pid *pid, int trace)
    {
        :
        if (pid != &init_struct_pid) {
            retval = -ENOMEM;
            pid = alloc_pid(p->nsproxy->pid_ns);
            if (!pid)
                goto bad_fork_cleanup_io;
        }
        :
    }

pids[] Fields of a New Process [source code] When a new process is created, the pids[0] field of the task_struct of the new process will be added to the linked list headed at the tasks[0] field of the new pid instance. If it is NOT a Thread Group Leader (TGL), the pids[1] (i.e. pids[PIDTYPE_PGID]) and pids[2] (i.e. pids[PIDTYPE_SID]) fields of its task_struct will not be set.

Graphic Explanation of pids[] Fields of a New non-TGL Process [figure: the task_struct of the new process and that of its thread group leader, with their pids[0], pids[1], pids[2], and group_leader fields, next to a struct pid with tasks[0], tasks[1], tasks[2], level, and its upid entries]

Graphic Explanation of pids[] Fields of a Session Leader Process [figure: a single task_struct with pids[0], pids[1], pids[2], and group_leader, next to a struct pid with tasks[0], tasks[1], tasks[2], level, and its upid entries]

Relationship between a PGL and TGL [source code] A Process Group Leader (PGL) must also be a Thread Group Leader (TGL).

Graphic Explanation of pids[] Fields of a New TGL Process [source code] When a new process P1 is created by process P2 (P2 is not a TGL) and P2's Process Group Leader (PGL) is P3, the pids[0].node field of the task_struct of P1 will be added to the linked list headed at the tasks[0] field of the new pid instance. If P1 is a TGL, the pids[1].node (i.e. pids[PIDTYPE_PGID].node) of its task_struct will be added to the linked list headed at the tasks[1] field of a pid instance. That pid instance is the one pointed to by pids[PIDTYPE_PGID].pid of P3's task_struct. pids[2].node of the task_struct of P1 is handled similarly.

Graphic Explanation of Field tasks[], if P2 is not a TGL [figure: the task_struct instances of P3, P2, and P1, each with pids[0], pids[1], pids[2], and group_leader, and three struct pid instances with count, tasks[0], tasks[1], and tasks[2]]

Graphic Explanation of Field tasks[], if P2 is a TGL [figure: the task_struct instances of P3, P2, and P1, each with pids[0], pids[1], pids[2], and group_leader, and three struct pid instances with count, tasks[0], tasks[1], and tasks[2]]

Linked List That Links All Processes in a Process Group If a process is a process group leader (P.S.: it must also be a thread group leader), the tasks[1] field of the pid instance pointed to by the pids[1].pid of its task_struct is the head of the linked list that links the pids[1].node fields of the task_struct of all thread group leaders in the same process group. [figure: three task_struct instances, one of them the process group leader, chained through their pids[1].node fields to tasks[1] of the leader's struct pid]

Thread Group Processes in the same thread group are chained together through the thread_group field of their task_struct structures [1][2]. struct task_struct { : struct list_head thread_group; };

Function attach_pid() Suppose that a new instance of struct pid has been allocated and set up for a given ID type. It is attached to a task_struct structure as follows: int fastcall attach_pid(struct task_struct *task, enum pid_type type, struct pid *pid) { struct pid_link *link; link = &task->pids[type]; link->pid = pid; hlist_add_head_rcu(&link->node, &pid->tasks[type]); return 0; }

attach_pid(p, PIDTYPE_PGID, pid) [figure: three task_struct instances and three struct pid instances; the pids[PIDTYPE_PGID] entry of task p is attached to the tasks[1] list of the given struct pid]

struct pid related Helper Functions Obtain the pid instance associated with the task_struct structure. The auxiliary functions task_pid, task_tgid, task_pgrp, and task_session are provided for the different types of IDs. static inline struct pid *task_pid(struct task_struct *task) { return task->pids[PIDTYPE_PID].pid; } static inline struct pid *task_pgrp(struct task_struct *task) { return task->group_leader->pids[PIDTYPE_PGID].pid; }

Numerical PID related Helper Functions (1) Once the pid instance is available, the numerical ID can be read off from the upid information available in the numbers array in struct pid.
    pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns)
    {
        struct upid *upid;
        pid_t nr = 0;
        if (pid && ns->level <= pid->level) {
            upid = &pid->numbers[ns->level];
            if (upid->ns == ns)
                nr = upid->nr;
        }
        return nr;
    }

Numerical PID related Helper Functions (2) pid_vnr returns the local PID seen from the namespace to which the ID belongs. pid_nr obtains the global PID as seen from the init process. Both rely on pid_nr_ns and automatically select the proper level: 0 for the global PID, and pid->level for the local one.
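The lookup can be tried out in user space with stand-in types. Everything prefixed demo_ is hypothetical, and the fixed cap of four levels is an assumption made only to avoid dynamic allocation in the sketch.

```c
#include <stddef.h>

struct demo_ns {
    unsigned int level;       /* depth in the namespace hierarchy */
    struct demo_ns *parent;
};

struct demo_upid {
    int nr;
    struct demo_ns *ns;
};

#define DEMO_MAX_LEVEL 4      /* hypothetical cap for this sketch */

struct demo_pid {
    unsigned int level;
    struct demo_upid numbers[DEMO_MAX_LEVEL];
};

/* PID of `pid` as seen from `ns`, or 0 if it is not visible there. */
static int demo_pid_nr_ns(const struct demo_pid *pid,
                          const struct demo_ns *ns)
{
    if (pid && ns->level <= pid->level) {
        const struct demo_upid *upid = &pid->numbers[ns->level];
        if (upid->ns == ns)
            return upid->nr;
    }
    return 0;
}

/* Global view: the entry for level 0. */
static int demo_pid_nr(const struct demo_pid *pid)
{
    return pid->numbers[0].nr;
}

/* Local view: the entry for the namespace the PID was allocated in. */
static int demo_pid_local(const struct demo_pid *pid)
{
    return pid->numbers[pid->level].nr;
}
```

The ns comparison matters: a sibling namespace at the same level indexes the same slot of numbers[], but the stored ns pointer differs, so the lookup yields 0.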

pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns) [figure: the level field of the struct pid_namespace selects the entry numbers[ns->level] of the struct pid; the ns field of that entry is matched against the given namespace before its nr is used]

Return Value of the System Call getpid( ) The getpid( ) system call returns the value of TGID relative to the current process instead of the value of PID, so all the threads of a multithreaded application share the same identifier. Most processes belong to a thread group consisting of a single member; as thread group leaders, they have the TGID equal to the PID, thus the getpid( ) system call works as usual for this kind of process.

Return Value of the System Call getpid( )
    pid_t pid_vnr(struct pid *pid)
    {
        return pid_nr_ns(pid, task_active_pid_ns(current));
    }
    static inline pid_t task_tgid_vnr(struct task_struct *tsk)
    {
        return pid_vnr(task_tgid(tsk));
    }
    SYSCALL_DEFINE0(getpid)
    {
        return task_tgid_vnr(current);
    }
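From user space, the difference between the per-thread PID and the TGID can be observed with a short sketch. It assumes Linux, where the raw gettid system call is reachable through syscall(); the helper names thread_id and is_thread_group_leader are hypothetical.

```c
#include <stdbool.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel-level PID of the calling thread (the per-thread "PID"). */
static pid_t thread_id(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* True if the calling thread is the thread group leader, i.e. its
 * per-thread ID equals the TGID reported by getpid(). */
static bool is_thread_group_leader(void)
{
    return thread_id() == getpid();
}
```

In a single-threaded program the check is always true. In a thread created with pthread_create(), thread_id() differs from getpid(), while getpid() still returns the value seen by the leader.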

Graphic Explanation of getpid( ) [figure: starting from current, the group_leader field leads to the task_struct of the thread group leader; its struct pid and the struct pid_namespace are passed to pid_nr_ns, which returns the nr of the entry numbers[level]]

pid_hash Hash Table Hash table pid_hash is used to find the pid instance that belongs to a numeric PID value in a given namespace. static struct hlist_head *pid_hash;

Size of pid_hash pid_hash is used as an array of hlist_head. The number of elements is determined by the RAM configuration of the machine and lies between 2^4 = 16 and 2^12 = 4,096 (it seems that the size used in kernel 3.9 is 16 [1][2]).

pidhash_init pidhash_init computes the apt size and allocates the required storage. void __init pidhash_init(void) { unsigned int i, pidhash_size; pid_hash = alloc_large_system_hash("PID", sizeof(*pid_hash), 0, 18,HASH_EARLY | HASH_SMALL, &pidhash_shift, NULL, 0, 4096); pidhash_size = 1U << pidhash_shift; for (i = 0; i < pidhash_size; i++) INIT_HLIST_HEAD(&pid_hash[i]); }

Hash Function pid_hashfn
    static unsigned int pidhash_shift = 4;
    #define pid_hashfn(nr, ns) \
        hash_long((unsigned long)nr + (unsigned long)ns, pidhash_shift)
hash_long returns a value between 0 and 15 if the size of the hash table is 16.
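A sketch of how a multiplicative hash can reduce the sum of a PID and a namespace address to a small bucket index. This is an illustration of the general technique, not the kernel's hash_long(): the multiplier below and the demo_ names are assumptions.

```c
#include <stdint.h>

/* Odd multiplier close to 2^64 divided by the golden ratio; the exact
 * constant is an assumption made for this sketch. */
#define DEMO_GOLDEN_RATIO_64 0x9E3779B97F4A7C15ULL

/* Multiplicative hash: multiply, then keep the top `bits` bits.
 * `bits` must be between 1 and 63. */
static unsigned int demo_hash_64(uint64_t value, unsigned int bits)
{
    return (unsigned int)((value * DEMO_GOLDEN_RATIO_64) >> (64 - bits));
}

/* Combines the numeric PID with the namespace address, in the same
 * spirit as pid_hashfn. */
static unsigned int demo_pid_hashfn(int nr, const void *ns,
                                    unsigned int shift)
{
    return demo_hash_64((uint64_t)nr + (uint64_t)(uintptr_t)ns, shift);
}
```

With shift set to 4 the result always lies in the range 0 to 15, so it can index a table of 16 buckets directly. Adding the namespace address means that the same numeric PID in two namespaces usually lands in different buckets.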

Add a upid Instance (or numbers[] field) into the Hash Table pid_hash struct pid *alloc_pid(struct pid_namespace *ns) { struct pid *pid; enum pid_type type; int i, nr; struct pid_namespace *tmp; struct upid *upid; : for ( ; upid >= pid->numbers; --upid) { hlist_add_head_rcu(&upid->pid_chain,&pid_hash[pid_hashfn(upid->nr, upid->ns)]); upid->ns->nr_hashed++; }

Graphic Explanation of pid_hash [Figure: the buckets pid_hash[0] through pid_hash[15], each heading a chain of upid entries (int nr, ns, pid_chain) linked through pid_chain; the entries are the numbers[] elements embedded in struct pid instances (count, level, ...).]

Multiple PIDs of a Process When a new process is created, it may be visible in multiple namespaces. For each of them a local PID must be generated. This is handled in alloc_pid:

Excerpt of alloc_pid()
struct pid *alloc_pid(struct pid_namespace *ns)
{
    struct pid *pid;
    enum pid_type type;
    int i, nr;
    struct pid_namespace *tmp;
    struct upid *upid;
    ...
    tmp = ns;
    for (i = ns->level; i >= 0; i--) {
        nr = alloc_pidmap(tmp);
        pid->numbers[i].nr = nr;
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;
    }
    pid->level = ns->level;
    ...

Set the values of field numbers[] of struct pid Starting at the level of the namespace in which the process is created, the kernel follows the parent pointers down to level 0, the initial, global namespace, and creates a local PID in each namespace along the way. Every upid contained in struct pid is filled with the newly generated PID for its level.
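The loop can be tried out in user space with simplified stand-ins. In the sketch below, every sketch_ name and the cap SKETCH_MAX_LEVEL are hypothetical, and sketch_alloc_nr merely hands out the next integer instead of consulting a PID bitmap as alloc_pidmap does.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_LEVEL 8 /* hypothetical cap for this sketch */

struct sketch_ns {
    int level;                /* 0 for the initial namespace */
    struct sketch_ns *parent; /* NULL for the initial namespace */
    int last_nr;              /* last number handed out in this namespace */
};

struct sketch_upid {
    int nr;
    struct sketch_ns *ns;
};

struct sketch_pid {
    int level;
    struct sketch_upid numbers[SKETCH_MAX_LEVEL + 1];
};

/* Stand-in for alloc_pidmap: simply hands out the next integer. */
static int sketch_alloc_nr(struct sketch_ns *ns)
{
    return ++ns->last_nr;
}

/* Fill numbers[] from the creating namespace through to the initial one. */
void sketch_alloc_pid(struct sketch_pid *pid, struct sketch_ns *ns)
{
    struct sketch_ns *tmp = ns;
    int i;

    assert(ns->level >= 0 && ns->level <= SKETCH_MAX_LEVEL);
    for (i = ns->level; i >= 0; i--) {
        pid->numbers[i].nr = sketch_alloc_nr(tmp);
        pid->numbers[i].ns = tmp;
        tmp = tmp->parent;
    }
    pid->level = ns->level;
}
```

For a process created in a level-1 namespace whose counter starts at 0, under an initial namespace whose counter stands at 4000, numbers[1].nr becomes 1 and numbers[0].nr becomes 4001: the same process carries one PID per namespace in which it is visible.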

Obtain the pid Instance from a numbers[] Field
void foo(struct pid_namespace *ns)
{
    struct upid *pnr;
    :
    container_of(pnr, struct pid, numbers[ns->level]);
}
[Figure: a struct pid laid out as count, tasks[0], tasks[1], tasks[2], level, followed by the numbers[] entries, each with int nr, ns, and pid_chain; pnr points at the entry numbers[ns->level].]
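The pointer arithmetic behind container_of can be reproduced in portable user-space C. The sketch below uses simplified, hypothetical types (sketch_upid, sketch_pid with a fixed three-entry numbers[]) and computes the offset of numbers[level] by hand, which is the offset that container_of(pnr, struct pid, numbers[ns->level]) subtracts.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_upid {
    int nr;
    void *ns;
};

struct sketch_pid {
    int count;
    unsigned int level;
    struct sketch_upid numbers[3]; /* fixed size, for the sketch only */
};

/* Given a pointer to numbers[level], recover the enclosing sketch_pid by
 * subtracting the byte offset of that array element from the pointer. */
struct sketch_pid *sketch_pid_of(struct sketch_upid *pnr, unsigned int level)
{
    size_t off;

    assert(pnr != NULL && level < 3);
    off = offsetof(struct sketch_pid, numbers)
          + (size_t)level * sizeof(struct sketch_upid);
    return (struct sketch_pid *)((char *)pnr - off);
}
```

The level must be the index of the entry that pnr actually points at; passing a different level would yield a pointer to the wrong address, which is why find_pid_ns uses the level of the namespace it searched in.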

Function find_pid_ns()
struct pid *find_pid_ns(int nr, struct pid_namespace *ns)
{
    struct upid *pnr;
    hlist_for_each_entry_rcu(pnr, &pid_hash[pid_hashfn(nr, ns)], pid_chain)
        if (pnr->nr == nr && pnr->ns == ns)
            return container_of(pnr, struct pid, numbers[ns->level]);
    return NULL;
}
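The insert-then-lookup pattern can be sketched in user space with a plain singly linked chain per bucket. This is an illustration only: it leaves out RCU and hlist, the bucket function sketch_bucket is a hypothetical stand-in for pid_hashfn, and all sketch_ names are invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUCKETS 16u

struct sketch_upid {
    int nr;
    const void *ns;
    struct sketch_upid *next; /* plain chain instead of hlist + RCU */
};

static struct sketch_upid *sketch_table[SKETCH_BUCKETS];

/* Hypothetical bucket function: any function of (nr, ns) would do here. */
static unsigned int sketch_bucket(int nr, const void *ns)
{
    return (unsigned int)(((uint64_t)nr + (uint64_t)(uintptr_t)ns)
                          % SKETCH_BUCKETS);
}

/* Like the hlist_add_head_rcu call in alloc_pid: push onto the chain head. */
void sketch_hash_add(struct sketch_upid *u)
{
    unsigned int b;

    assert(u != NULL);
    b = sketch_bucket(u->nr, u->ns);
    u->next = sketch_table[b];
    sketch_table[b] = u;
}

/* Like find_pid_ns: walk one chain and compare both nr and ns. */
struct sketch_upid *sketch_find(int nr, const void *ns)
{
    struct sketch_upid *u;

    for (u = sketch_table[sketch_bucket(nr, ns)]; u != NULL; u = u->next)
        if (u->nr == nr && u->ns == ns)
            return u;
    return NULL;
}
```

With 16 buckets, the numbers 7 and 23 in the same namespace land in the same chain under this bucket function, so the lookup has to compare nr; the same number in two namespaces is told apart by comparing ns.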

Relationships among Processes Processes created by a program have a parent/child relationship. When a process creates multiple children, these children have sibling relationships. Several fields must be introduced in a process descriptor to represent these relationships with respect to a given process P. Processes 0 and 1 are created by the kernel. Process 1 (init) is the ancestor of all other processes.

Fields of a Process Descriptor Used to Express Parenthood Relationships (1) real_parent: points to the process descriptor of the process that created P or points to the descriptor of process 1 (init) if the parent process no longer exists. Therefore, when a user starts a background process and exits the shell, the background process becomes the child of init.

Fields of a Process Descriptor Used to Express Parenthood Relationships (2) parent: points to the current parent of P; this is the process that must be signaled when the child process terminates. Its value usually coincides with that of real_parent. It may occasionally differ, such as when another process issues a ptrace( ) system call requesting that it be allowed to monitor P (see the section "Execution Tracing" in Chapter 20).

Fields of a Process Descriptor Used to Express Parenthood Relationships (3)
struct list_head children: the head of the list containing all children created by P. This list is formed through the sibling field of the child processes.
struct list_head sibling: the pointers to the next and previous elements in the list of sibling processes, those that have the same parent as P.
P.S.:
/*
 * children/sibling forms the list of my natural children
 */
struct list_head children; /* list of my children */
struct list_head sibling; /* linkage in my parent's children list */

Family Relationships between Processes

Example Process P0 successively created P1, P2, and P3. Process P3, in turn, created process P4. The children/sibling fields form the list of children of P0 (the links marked in the figure).
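The P0 to P4 example can be modelled in user space with a minimal circular doubly linked list. The sketch below is an assumption-laden illustration: struct sketch_list imitates list_head, struct sketch_task keeps only the three relationship fields, and new children are appended at the tail of the parent's list.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct sketch_list {
    struct sketch_list *next, *prev;
};

struct sketch_task {
    const char *name;
    struct sketch_task *real_parent;
    struct sketch_list children; /* head of the list of my children */
    struct sketch_list sibling;  /* my link in real_parent's children list */
};

void sketch_task_init(struct sketch_task *t, const char *name)
{
    t->name = name;
    t->real_parent = NULL;
    t->children.next = t->children.prev = &t->children;
    t->sibling.next = t->sibling.prev = &t->sibling;
}

/* Record that parent created child: append child->sibling to the tail of
 * parent->children. */
void sketch_add_child(struct sketch_task *parent, struct sketch_task *child)
{
    struct sketch_list *head = &parent->children, *node = &child->sibling;

    assert(child->real_parent == NULL);
    child->real_parent = parent;
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Count children by walking the sibling links until we return to the head. */
int sketch_count_children(const struct sketch_task *t)
{
    const struct sketch_list *pos;
    int n = 0;

    for (pos = t->children.next; pos != &t->children; pos = pos->next)
        n++;
    return n;
}

/* Map a sibling link back to the task that embeds it. */
struct sketch_task *sketch_task_of(struct sketch_list *node)
{
    return (struct sketch_task *)((char *)node
                                  - offsetof(struct sketch_task, sibling));
}
```

After adding P1, P2, and P3 to P0 and P4 to P3, P0 has three children and P3 has one; p0.children.next maps back to P1 and p0.children.prev to P3, while P1, P2, and P4 have empty children lists.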

Other Relationships between Processes There exist other relationships among processes: a process can be the leader of a process group or of a login session, it can be the leader of a thread group, and it can also trace the execution of other processes (see the section "Execution Tracing" in Chapter 20).

Other Process Relationship Fields of a Process Descriptor P (1) struct task_struct * group_leader Process descriptor pointer of the thread group leader of P.

Other Process Relationship Fields of a Process Descriptor P (2)
/*
 * ptraced is the list of tasks this task is using ptrace on.
 * This includes both natural children and PTRACE_ATTACH targets.
 * p->ptrace_entry is p's link on the p->parent->ptraced list.
 */
struct list_head ptraced;
struct list_head ptrace_entry;
[example]: function __ptrace_link()