[PATCH] Process Aggregates (PAGG) support for the 2.6 kernel

From: Erik Jacobson
Date: Mon Apr 26 2004 - 17:07:16 EST


Here, I am proposing Process Aggregates support for the 2.6 kernel.

What is Process Aggregates (PAGG)?
----------------------------------
PAGG provides for the implementation of arbitrary process groups in Linux.
It is a building block for kernel modules that can group processes
together into a single set for specific purposes beyond the traditional
process groups.


What types of things could make use of PAGG?
--------------------------------------------
Some example uses for PAGG include:

- System accounting - the CSA module makes use of it (see below)

- Clustering - Keeping track of processes for clustering

- NUMA placement - Making sure groups of processes run on specific CPUs or
chunks of memory

- Batch Queuing systems - Ensure specific jobs run in their designated
places.


More information about PAGG
---------------------------
There is a web page for the PAGG project at SGI:

http://oss.sgi.com/projects/pagg/

Also, the patch includes the file Documentation/pagg.txt. You may look there
for detailed information.

One project that makes extensive use of PAGG is Comprehensive System
Accounting (CSA). Details on that can be found here:
http://oss.sgi.com/projects/csa/


Info on the attached patch
--------------------------
This patch is made to apply to the 2.6.5 kernel. Patches for some other
versions of the kernel (including 2.4 and 2.6 versions) may be found on
the PAGG web site above.

--
Erik Jacobson - Linux System Software - Silicon Graphics - Eagan, Minnesotadiff -Naru 2.6-patch/Documentation/pagg.txt 2.6pagg-patch/Documentation/pagg.txt
--- 2.6-patch/Documentation/pagg.txt 1969-12-31 18:00:00.000000000 -0600
+++ 2.6pagg-patch/Documentation/pagg.txt 2004-04-26 14:36:05.000000000 -0500
@@ -0,0 +1,162 @@
+Linux Process Aggregates (PAGG)
+-------------------------------
+
+1. Description
+
+The process aggregates infrastructure, or PAGG, provides a generalized
+mechanism for providing arbitrary process groups in Linux. PAGG consists
+of a series of functions for registering and unregistering support
+for new types of process aggregation containers with the kernel.
+This is similar to the support currently provided within Linux that
+allows for dynamic support of filesystems, block and character devices,
+symbol tables, network devices, serial devices, and execution domains.
+This implementation of PAGG provides developers the basic hooks necessary
+to implement kernel modules for specific process containers, such as
+the job container.
+
+The do_fork function in the kernel was altered to support PAGG. If a
+process is attached to any PAGG containers and subsequently forks a
+child process, the child process will also be attached to the same PAGG
+containers. The PAGG containers involved during the fork are notified
+that a new process has been attached. The notification is accomplished
+via a callback function provided by the PAGG module.
+
+The do_exit function in the kernel has also been altered. If a process
+is attached to any PAGG containers and that process is exiting, the PAGG
+containers are notified that a process has detached from the container.
+The notification is accomplished via a callback function provided by
+the PAGG module.
+
+The sys_execve function has been modified to support an optional callout
+that can be run when a process in a pagg list does an exec. It can be
+used, for example, by other kernel modules that wish to do advanced CPU
+placement on multi-processor systems (just one example).
+
+Additional details concerning this implementation of the process aggregates
+infrastructure are described in the sections that follow.
+
+
+2. Kernel Changes
+
+This section describe the files and data strcutrues that are involved in this
+implementation of PAGG. Both modified as well as new files and data
+structures are discussed.
+
+3.1. Modified Files
+
+The following files were modified to implement PAGG:
+
+- include/linux/init_task.h
+- include/linux/sched.h
+- init/Config.help
+- init/Config.in
+- kernel/Makefile
+- kernel/exit.c
+- kernel/fork.c
+- fs/exec.c
+- Documentation/Configure.help
+- init/Kconfig
+
+This implementation of PAGG supports the i386 and ia64 architecture.
+
+2.2. New Files
+
+The following files were added to implement PAGG:
+
+- Documentation/pagg.txt
+- include/linux/pagg.h
+- kernel/pagg.c
+
+
+2.3. Modified Data Structures
+
+The following existing data structures were altered to implement PAGG.
+
+- struct task_struct: (include/linux/sched.h)
+ struct pagg_list pagg_list; /* List of pagg containers */
+
+This new member in task_struct, pagg_list, points to the list of pagg
+containers to which the process is currently attached.
+
+2.4. New Data Structures
+
+The following new data structures were introduced to implement PAGG.
+
+- struct pagg: (include/linux/pagg.h)
+ struct pagg_hook *hook /* Ptr to pagg module entry */
+ void *data; /* Task specific data */
+ struct list_head entry; /* List connection */
+
+- struct pagg_hook: (include/linux/pagg.h)
+ struct module *module; /* Ptr to PAGG module */
+ char *name; /* PAGG hook name - restricted
+ * to 32 characters. */
+ int (*attach)(struct task_struct *, /* Function to attach */
+ struct pagg *,
+ void *);
+ int (*detach)(struct task_struct *, /* Function to detach */
+ struct pagg *);
+ int (*init)(struct task_struct *, /* Load task init func. */
+ struct pagg *);
+ void *data; /* Module specific data */
+ struct list_head entry; /* List connection */
+ void (*exec)(struct task_struct *, struct pagg *); /* exec func ptr */
+
+The pagg structure provides the process' reference to the PAGG
+containers provided by the PAGG modules. The attach function pointer
+is the function used to notify the referenced PAGG container that the
+process is being attached. The detach function pointer is used to notify
+the referenced PAGG container that the process is exiting or otherwise
+detaching from the container. The exec function pointer is used when a
+process in the pagg container exec's a new process. This is optional and
+may be set to NULL if it is not needed by the pagg module.
+
+The pagg_hook structure provides the reference to the module that
+implements a type of PAGG container. In addition to the function pointers
+described concerning pagg, this structure provides an addition
+function pointer. The init function pointer is currently not used
+but will be available in the future. Future use of the init function
+will be optional and will used to attach currently running processes to
+a default PAGG container when a PAGG module is loaded on a running system.
+
+
+2.5. Modified Functions
+
+The following functions were changed to implement PAGG:
+
+- do_fork: (kernel/fork.c)
+ /* execute the following pseudocode before add to run-queue */
+ If parent process pagg list is not empty
+ Call attach_pagg_list function with child task_struct as argument
+- do_exit: (kernel/exit.c)
+ /* execute the following pseudocode prior to schedule call */
+ If current process pagg list is not empty
+ Call detach_pagg_list function with current task_struct
+- sys_execve: (fs/exec.c)
+ /* When a process in a pagg exec's, an optional callout can be run. This
+ is implemented with an optional function pointer in the pagg_hook. */
+
+2.6 New Functions
+
+The following new functions were added to implement PAGG:
+
+- int register_pagg_hook(struct pagg_hook *); (kernel/pagg.c)
+ Add module entry into table of pagg modules
+- int unregister_pagg_hook(struct pagg_hook *); (kernel/pagg.c)
+ Find module entry in list of pagg modules
+ Foreach task
+ If task is attached to this pagg module
+ return error
+ If no tasks are referencing this module
+ remove module entry from list of pagg modules
+- int attach_pagg_list(struct task_struct *); (kernel/pagg.c)
+ /* Assumed task pagg list pts to paggs that it attaches to */
+ While another pagg container reference
+ Make copy of pagg container reference & insert into new list
+ Attach task to pagg container using new container reference
+ Get next pagg container reference
+ Make task pagg list use the new pagg list
+- int detach_pagg_list(struct task_struct *); (kernel/pagg.c)
+ While another pagg container reference
+ Detach task from pagg container using reference
+
diff -Naru 2.6-patch/fs/exec.c 2.6pagg-patch/fs/exec.c
--- 2.6-patch/fs/exec.c 2004-03-16 14:13:30.000000000 -0600
+++ 2.6pagg-patch/fs/exec.c 2004-04-26 12:23:02.000000000 -0500
@@ -46,6 +46,7 @@
#include <linux/security.h>
#include <linux/syscalls.h>
#include <linux/rmap-locking.h>
+#include <linux/pagg.h>

#include <asm/uaccess.h>
#include <asm/pgalloc.h>
@@ -1151,6 +1152,7 @@
retval = search_binary_handler(&bprm,regs);
if (retval >= 0) {
free_arg_pages(&bprm);
+ exec_pagg_list_chk(current);

/* execve success */
security_bprm_free(&bprm);
diff -Naru 2.6-patch/include/linux/init_task.h 2.6pagg-patch/include/linux/init_task.h
--- 2.6-patch/include/linux/init_task.h 2004-03-16 14:13:30.000000000 -0600
+++ 2.6pagg-patch/include/linux/init_task.h 2004-04-13 21:42:35.000000000 -0500
@@ -2,6 +2,7 @@
#define _LINUX__INIT_TASK_H

#include <linux/file.h>
+#include <linux/pagg.h>

#define INIT_FILES \
{ \
@@ -112,6 +113,7 @@
.proc_lock = SPIN_LOCK_UNLOCKED, \
.switch_lock = SPIN_LOCK_UNLOCKED, \
.journal_info = NULL, \
+ INIT_TASK_PAGG(tsk) \
}


diff -Naru 2.6-patch/include/linux/pagg.h 2.6pagg-patch/include/linux/pagg.h
--- 2.6-patch/include/linux/pagg.h 1969-12-31 18:00:00.000000000 -0600
+++ 2.6pagg-patch/include/linux/pagg.h 2004-04-26 12:23:02.000000000 -0500
@@ -0,0 +1,249 @@
+/*
+ * PAGG (Process Aggregates) interface
+ *
+ *
+ * Copyright (c) 2000-2002, 2004 Silicon Graphics, Inc. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ *
+ * Contact information: Silicon Graphics, Inc., 1500 Crittenden Lane,
+ * Mountain View, CA 94043, or:
+ *
+ * http://www.sgi.com
+ *
+ * For further information regarding this notice, see:
+ *
+ * http://oss.sgi.com/projects/GenInfo/NoticeExplan
+ */
+
+/*
+ * Description: This file, include/linux/pagg.h, contains the data
+ * structure definitions and function prototypes used to
+ * implement process aggrefates (paggs). Paggs provides a
+ * generalized was to implement process groupings or
+ * containers. Modules use these functions to register
+ * with the kernel as providers of process aggregation
+ * containers. The pagg data structures define the
+ * callback functions and data access pointers back into
+ * the pagg modules.
+ */
+
+#ifndef _PAGG_H
+#define _PAGG_H
+
+#include <linux/config.h>
+
+/*
+ * Used by task_struct to manage a list of pagg attachments for the task.
+ * The list will be used to hold references to pagg structures.
+ * These structures define the pagg attachments for the task.
+ *
+ * STRUCT MEMBERS:
+ * list: The list head pointer for the list of pagg
+ * structures.
+ * sem: The semaphore used to lock the list.
+ */
+struct pagg_list {
+ struct list_head head;
+ struct rw_semaphore sem;
+};
+
+#ifdef CONFIG_PAGG
+
+#define PAGG_NAMELN 32 /* Max chars in PAGG module name */
+
+
+/* Macro used to initialize a pagg_list structure after declaration
+ *
+ * Macro arguments:
+ * l: the pagg list (struct pagg_list)
+ */
+#define INIT_PAGG_LIST(l) \
+do { \
+ INIT_LIST_HEAD(l.head); \
+ init_rwsem(l.sem); \
+} while(0)
+
+
+/*
+ * Used by task_struct to manage list of pagg attachments for the process.
+ * Each pagg provides the link between the process and the
+ * correct pagg container.
+ *
+ * STRUCT MEMBERS:
+ * hook: Reference to pagg module structure. That struct
+ * holds the name key and function pointers.
+ * data: Opaque data pointer - defined by pagg modules.
+ * entry: List pointers
+ */
+struct pagg {
+ struct pagg_hook *hook;
+ void *data;
+ struct list_head entry;
+};
+
+/*
+ * Used by pagg modules to define the callback functions into the
+ * module.
+ *
+ * STRUCT MEMBERS:
+ * name: The name of the pagg container type provided by
+ * the module. This will be set by the pagg module.
+ * attach: Function pointer to function used when attaching
+ * a process to the pagg container referenced by
+ * this struct.
+ * detach: Function pointer to function used when detaching
+ * a process to the pagg container referenced by
+ * this struct.
+ * init: Function pointer to initialization function. This
+ * function is used when the module is loaded to attach
+ * existing processes to a default container as defined by
+ * the pagg module. This is optional and may be set to
+ * NULL if it is not needed by the pagg module.
+ * data: Opaque data pointer - defined by pagg modules.
+ * module: Pointer to kernel module struct. Used to increment &
+ * decrement the use count for the module.
+ * entry: List pointers
+ * exec: Function pointer to function used when a process
+ * in the pagg container exec's a new process. This
+ * is optional and may be set to NULL if it is not
+ * needed by the pagg module.
+ */
+struct pagg_hook {
+ struct module *module;
+ char *name; /* Name Key - restricted to 32 characters */
+ int (*attach)(struct task_struct *, struct pagg *, void*);
+ int (*detach)(struct task_struct *, struct pagg *);
+ int (*init)(struct task_struct *, struct pagg *);
+ void *data; /* Opaque module specific data */
+ struct list_head entry; /* List pointers */
+ void (*exec)(struct task_struct *, struct pagg *);
+};
+
+
+/* Kernel service functions for providing PAGG support */
+extern struct pagg *get_pagg(struct task_struct *task, char *key);
+extern struct pagg *alloc_pagg(struct task_struct *task,
+ struct pagg_hook *pt);
+extern void free_pagg(struct pagg *pagg);
+extern int register_pagg_hook(struct pagg_hook *pt_new);
+extern int unregister_pagg_hook(struct pagg_hook *pt_old);
+extern int attach_pagg_list(struct task_struct *to_task,
+ struct task_struct *from_task);
+extern int detach_pagg_list(struct task_struct *task);
+extern int exec_pagg_list(struct task_struct *task);
+
+/*
+ * Macro used when a child process must inherit attachment to pagg
+ * containers from the parent.
+ *
+ * Arguments:
+ * ct: child (struct task_struct *)
+ * pt: parent (struct task_struct *)
+ * cf: clone_flags
+ */
+#define attach_pagg_list_chk(ct, pt) \
+do { \
+ INIT_PAGG_LIST(&ct->pagg_list); \
+ if (!list_empty(&pt->pagg_list.head)) { \
+ if (attach_pagg_list(ct, pt) != 0) \
+ goto bad_fork_cleanup; \
+ } \
+} while(0)
+
+/*
+ * Macro used when a process must detach from pagg containers to which it
+ * is currenlty a member.
+ *
+ * Aguments:
+ * t: task (struct task_struct *)
+ */
+#define detach_pagg_list_chk(t) \
+do { \
+ if (!list_empty(&t->pagg_list.head)) { \
+ detach_pagg_list(t); \
+ } \
+} while(0)
+
+
+/*
+ * Macro used when a process exec's.
+ *
+ * Aguments:
+ * t: task (struct task_struct *)
+ */
+#define exec_pagg_list_chk(t) \
+do { \
+ if (!list_empty(&t->pagg_list.head)) { \
+ exec_pagg_list(t); \
+ } \
+} while(0)
+
+
+/*
+ * Utility macros for pagg handling and locking pagg lists.
+ *
+ * Arguments:
+ * t: task (struct task_list *)
+ * p: pagg (struct pagg *)
+ * d: data (ptr to data maintained by the
+ * pagg module - converts to void ptr)
+ */
+ /* Invoke module detach callback for the pagg & task */
+#define detach_pagg(t, p) p->hook->detach(t, p)
+ /* Invoke module attach callback for the pagg & task */
+#define attach_pagg(t, p, d) p->hook->attach(t, p, (void *)d)
+ /* Allows optional callout at exec */
+#define exec_pagg(t, p) do { \
+ if (p->hook->exec) \
+ p->hook->exec(t, p);\
+ } while(0)
+ /* Allows module to set data item for pagg */
+#define set_pagg(p, d) p->data = (void *)d
+ /* Down the read semaphore for the task's pagg_list */
+#define read_lock_pagg_list(t) down_read(&t->pagg_list.sem)
+ /* Up the read semaphore for the task's pagg_list */
+#define read_unlock_pagg_list(t) up_read(&t->pagg_list.sem)
+ /* Down the write semaphore for the task's pagg_list */
+#define write_lock_pagg_list(t) down_write(&t->pagg_list.sem)
+ /* Up the write semaphore for the task's pagg_list */
+#define write_unlock_pagg_list(t) up_write(&t->pagg_list.sem)
+
+/*
+ * Marco Used in INIT_TASK to set the head and sem of pagg_list.
+ * If CONFIG_PAGG is off, it is defined as an empty macro below.
+ */
+#define INIT_TASK_PAGG(tsk) \
+ .pagg_list = { \
+ .head = LIST_HEAD_INIT(tsk.pagg_list.head), \
+ .sem = __RWSEM_INITIALIZER(tsk.pagg_list.sem) \
+ }, \
+
+#else /* CONFIG_PAGG */
+
+/*
+ * Replacement macros used when PAGG (Process Aggregates) support is not
+ * compiled into the kernel.
+ */
+#define INIT_TASK_PAGG(tsk)
+#define INIT_PAGG_LIST(l) do { } while(0)
+#define attach_pagg_list_chk(ct, pt) do { } while(0)
+#define detach_pagg_list_chk(t) do { } while(0)
+#define exec_pagg_list_chk(t) do { } while(0)
+
+#endif /* CONFIG_PAGG */
+
+#endif /* _PAGG_H */
diff -Naru 2.6-patch/include/linux/sched.h 2.6pagg-patch/include/linux/sched.h
--- 2.6-patch/include/linux/sched.h 2004-04-05 14:18:05.000000000 -0500
+++ 2.6pagg-patch/include/linux/sched.h 2004-04-14 07:52:16.000000000 -0500
@@ -29,6 +29,7 @@
#include <linux/completion.h>
#include <linux/pid.h>
#include <linux/percpu.h>
+#include <linux/pagg.h>

struct exec_domain;

@@ -488,11 +489,15 @@

struct dentry *proc_dentry;
struct backing_dev_info *backing_dev_info;
-
struct io_context *io_context;

unsigned long ptrace_message;
siginfo_t *last_siginfo; /* For ptrace use. */
+
+#if defined(CONFIG_PAGG)
+/* List of pagg (process aggregate) attachments */
+ struct pagg_list pagg_list;
+#endif
};

static inline pid_t process_group(struct task_struct *tsk)
diff -Naru 2.6-patch/init/Kconfig 2.6pagg-patch/init/Kconfig
--- 2.6-patch/init/Kconfig 2004-03-16 14:13:30.000000000 -0600
+++ 2.6pagg-patch/init/Kconfig 2004-04-26 14:25:25.000000000 -0500
@@ -104,6 +104,14 @@
up to the user level program to do useful things with this
information. This is generally a good idea, so say Y.

+config PAGG
+ bool "Support for process aggregates (PAGGs)"
+ help
+ Say Y here if you will be loading modules which provide support
+ for process aggregate containers. Examples of such modules include the
+ Linux Jobs module and the Linux Array Sessions module. If you will not
+ be using such modules, say N.
+
config SYSCTL
bool "Sysctl support"
---help---
diff -Naru 2.6-patch/kernel/exit.c 2.6pagg-patch/kernel/exit.c
--- 2.6-patch/kernel/exit.c 2004-04-05 14:18:05.000000000 -0500
+++ 2.6pagg-patch/kernel/exit.c 2004-04-14 07:52:16.000000000 -0500
@@ -22,7 +22,7 @@
#include <linux/profile.h>
#include <linux/mount.h>
#include <linux/proc_fs.h>
-
+#include <linux/pagg.h>
#include <asm/uaccess.h>
#include <asm/pgtable.h>
#include <asm/mmu_context.h>
@@ -788,6 +788,9 @@
module_put(tsk->binfmt->module);

tsk->exit_code = code;
+
+ detach_pagg_list_chk(tsk);
+
exit_notify(tsk);
schedule();
BUG();
diff -Naru 2.6-patch/kernel/fork.c 2.6pagg-patch/kernel/fork.c
--- 2.6-patch/kernel/fork.c 2004-03-16 14:13:30.000000000 -0600
+++ 2.6pagg-patch/kernel/fork.c 2004-04-13 21:42:35.000000000 -0500
@@ -31,7 +31,7 @@
#include <linux/futex.h>
#include <linux/ptrace.h>
#include <linux/mount.h>
-
+#include <linux/pagg.h>
#include <asm/pgtable.h>
#include <asm/pgalloc.h>
#include <asm/uaccess.h>
@@ -232,6 +232,9 @@

init_task.rlim[RLIMIT_NPROC].rlim_cur = max_threads/2;
init_task.rlim[RLIMIT_NPROC].rlim_max = max_threads/2;
+
+ /* Initialize the pagg list in pid 0 before it can clone itself. */
+ INIT_PAGG_LIST(&current->pagg_list);
}

static struct task_struct *dup_task_struct(struct task_struct *orig)
@@ -985,6 +988,12 @@

p->parent_exec_id = p->self_exec_id;

+ /*
+ * call pagg modules to properly attach new process to the same
+ * process aggregate containers as the parent process.
+ */
+ attach_pagg_list_chk(p, current);
+
/* ok, now we should be set up.. */
p->exit_signal = (clone_flags & CLONE_THREAD) ? -1 : (clone_flags & CSIGNAL);
p->pdeath_signal = 0;
diff -Naru 2.6-patch/kernel/Makefile 2.6pagg-patch/kernel/Makefile
--- 2.6-patch/kernel/Makefile 2004-03-16 14:13:30.000000000 -0600
+++ 2.6pagg-patch/kernel/Makefile 2004-04-13 21:42:35.000000000 -0500
@@ -7,7 +7,7 @@
sysctl.o capability.o ptrace.o timer.o user.o \
signal.o sys.o kmod.o workqueue.o pid.o \
rcupdate.o intermodule.o extable.o params.o posix-timers.o \
- kthread.o
+ kthread.o pagg.o

obj-$(CONFIG_FUTEX) += futex.o
obj-$(CONFIG_GENERIC_ISA_DMA) += dma.o
@@ -18,6 +18,7 @@
obj-$(CONFIG_PM) += power/
obj-$(CONFIG_BSD_PROCESS_ACCT) += acct.o
obj-$(CONFIG_COMPAT) += compat.o
+obj-$(CONFIG_PAGG) += pagg.o
obj-$(CONFIG_IKCONFIG) += configs.o
obj-$(CONFIG_IKCONFIG_PROC) += configs.o
obj-$(CONFIG_STOP_MACHINE) += stop_machine.o
diff -Naru 2.6-patch/kernel/pagg.c 2.6pagg-patch/kernel/pagg.c
--- 2.6-patch/kernel/pagg.c 1969-12-31 18:00:00.000000000 -0600
+++ 2.6pagg-patch/kernel/pagg.c 2004-04-26 12:23:02.000000000 -0500
@@ -0,0 +1,426 @@
+/*
+ * PAGG (Process Aggregates) interface
+ *
+ *
+ * Copyright (c) 2000-2004 Silicon Graphics, Inc. All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ *
+ *
+ * Contact information: Silicon Graphics, Inc., 1500 Crittenden Lane,
+ * Mountain View, CA 94043, or:
+ *
+ * http://www.sgi.com
+ *
+ * For further information regarding this notice, see:
+ *
+ * http://oss.sgi.com/projects/GenInfo/NoticeExplan
+ */
+
+/*
+ * Description: This file, kernel/pagg.c, contains the routines used
+ * to implement process aggregates (paggs). The pagg
+ * extends the task_struct to allow for various process
+ * aggregation continers. Examples of such containers
+ * include "jobs" and cluster applications IDs. Process
+ * sessions and groups could have been implemented using
+ * paggs (although there would be little purpose in
+ * making that change at this juncture). The pagg
+ * structure maintains pointers to callback functions and
+ * data strucures maintained in modules that have
+ * registered with the kernel as pagg container
+ * providers.
+ */
+
+#include <linux/config.h>
+
+#ifdef CONFIG_PAGG
+
+#include <asm/uaccess.h>
+#include <linux/slab.h>
+#include <linux/sched.h>
+#include <asm/semaphore.h>
+#include <linux/smp_lock.h>
+#include <linux/proc_fs.h>
+#include <linux/module.h>
+#include <linux/pagg.h>
+
+/* list of pagg hook entries that reference the "module" implementations */
+static LIST_HEAD(pagg_hook_list);
+static DECLARE_RWSEM(pagg_hook_list_sem);
+
+
+/*
+ * get_pagg
+ *
+ * Given a pagg_list list structure, this function will return
+ * a pointer to the pagg struct that matches the search
+ * key. If the key is not found, the function will return NULL.
+ *
+ * The caller should hold at least a read lock on the pagg_list
+ * for task using read_lock_pagg_list(task).
+ */
+struct pagg *
+get_pagg(struct task_struct *task, char *key)
+{
+ struct list_head *entry;
+
+ list_for_each(entry, &task->pagg_list.head) {
+ struct pagg *pagg = list_entry(entry, struct pagg, entry);
+ if (!strcmp(pagg->hook->name,key)) {
+ return pagg;
+ }
+ }
+ return NULL;
+}
+
+
+/*
+ * alloc_pagg
+ *
+ * Given a task and a pagg hook, this function will allocate
+ * a new pagg structure, initialize the settings, and insert the pagg into
+ * the pagg_list for the task.
+ *
+ * The caller for this function should hold at least a read lock on the
+ * pagg_hook_list_sem - or ensure that the pagg hook entry cannot be
+ * removed. If this function was called from the pagg module (usually the
+ * case), then the caller need not hold this lock. The caller should hold
+ * a write lock on for the tasks pagg_list.sem. This can be locked using the
+ * write_lock_pagg_list(task) macro.
+ */
+struct pagg *
+alloc_pagg(struct task_struct *task, struct pagg_hook *pagg_hook)
+{
+ struct pagg *pagg;
+
+ pagg = (struct pagg *)kmalloc(sizeof(struct pagg), GFP_KERNEL);
+ if (!pagg)
+ return NULL;
+
+ pagg->hook = pagg_hook;
+ pagg->data = NULL;
+ list_add_tail(&pagg->entry, &task->pagg_list.head);
+ return pagg;
+}
+
+
+/*
+ * free_pagg
+ *
+ * This function will ensure the pagg is deleted form
+ * the list of pagg entries for the task. Finally, the memory for the
+ * pagg is discarded.
+ *
+ * The caller of this function should hold a write lock on the pagg_list.sem
+ * for the task. This can be locke dusing the write_lock_pagg_list(task)
+ * macro.
+ *
+ * Prior to calling free_pagg, the pagg should have been detached from the
+ * pagg container represented by this pagg. That is usually done using the
+ * macro detach_pagg(pagg).
+ */
+void
+free_pagg(struct pagg *pagg)
+{
+
+ list_del(&pagg->entry);
+ kfree(pagg);
+}
+
+
+/*
+ * get_pagg_hook
+ *
+ * Given a pagg hook name key, this function will return a pointer
+ * to the pagg_hook struct that contains that matches the name.
+ *
+ * You should hold either the write or read lock for pagg_hook_list_sem
+ * before using this function. This will ensure that the pagg_hook_list
+ * does not change while iterating through the list entries.
+ */
+static struct pagg_hook *
+get_pagg_hook(char *key)
+{
+ struct list_head *entry;
+ struct pagg_hook *pagg_hook;
+
+ list_for_each(entry, &pagg_hook_list) {
+ pagg_hook = list_entry(entry, struct pagg_hook, entry);
+ if (!strcmp(pagg_hook->name, key)) {
+ return pagg_hook;
+ }
+ }
+ return NULL;
+}
+
+
+/*
+ * register_pagg_hook
+ *
+ * Used to register a new pagg hook and enter it into the pagg_hook_list.
+ * The service name for a pagg hook is restricted to 32 characters.
+ *
+ * In the future an initialization function may also be defined so that all
+ * existing tasks can be assigned to a default pagg entry for the hook.
+ * However, this would require iterating through the tasklist. To do that
+ * requires that the tasklist_lock be read locked. Since the initialization
+ * function might be in a module, and therefore it might sleep (implementors
+ * decision), holding the tasklist_lock seems like a bad idea. It may be a
+ * requirement that the initialization function will be strictly forbidden
+ * from locking - by gentlemans agreement...
+ *
+ * If a memory error is encountered, the pagg hook is unregistered and any
+ * tasks that have been attached to the initial pagg container are detached
+ * from that container.
+ */
+int
+register_pagg_hook(struct pagg_hook *pagg_hook_new)
+{
+ struct pagg_hook *pagg_hook = NULL;
+
+ /* ADD NEW PAGG MODULE TO ACCESS LIST */
+ if (!pagg_hook_new)
+ return -EINVAL; /* error */
+ if (!list_empty(&pagg_hook_new->entry))
+ return -EINVAL; /* error */
+ if (pagg_hook_new->name == NULL || strlen(pagg_hook_new->name) > PAGG_NAMELN)
+ return -EINVAL; /* error */
+
+ /* Try to insert new hook entry into the pagg hook list */
+ down_write(&pagg_hook_list_sem);
+
+ pagg_hook = get_pagg_hook(pagg_hook_new->name);
+
+ if (pagg_hook) {
+ up_write(&pagg_hook_list_sem);
+ printk(KERN_WARNING "Attempt to register duplicate"
+ " PAGG support (name=%s)\n", pagg_hook_new->name);
+ return -EBUSY;
+ }
+
+ /* Okay, we can insert into the pagg hook list */
+ list_add_tail(&pagg_hook_new->entry, &pagg_hook_list);
+ up_write(&pagg_hook_list_sem);
+
+ printk(KERN_INFO "Registering PAGG support for (name=%s)\n",
+ pagg_hook_new->name);
+
+ return 0; /* success */
+
+}
+
+
+/*
+ * unregister_pagg_hook
+ *
+ * Used to unregister pagg hooks and remove them from the pagg_hook_list.
+ * Once the pagg hook entry in the pagg_hook_list is found, all of the
+ * tasks are scanned and detached from any pagg containers defined by the
+ * pagg implementation module.
+ */
+int
+unregister_pagg_hook(struct pagg_hook *pagg_hook_old)
+{
+ struct pagg_hook *pagg_hook;
+ struct task_struct *task;
+
+
+ /* Check the validity of the arguments */
+ if (!pagg_hook_old)
+ return -EINVAL; /* error */
+ if (list_empty(&pagg_hook_old->entry))
+ return -EINVAL; /* error */
+ if (pagg_hook_old->name == NULL)
+ return -EINVAL; /* error */
+
+ down_write(&pagg_hook_list_sem);
+
+ pagg_hook = get_pagg_hook(pagg_hook_old->name);
+ if (pagg_hook && pagg_hook == pagg_hook_old) {
+ /*
+ * Scan through processes on system and check for
+ * references to pagg containers for this pagg hook.
+ *
+ * The module cannot be unloaded if there are references.
+ */
+ read_lock(&tasklist_lock);
+ for_each_process(task) {
+ struct pagg *pagg = NULL;
+
+ read_lock_pagg_list(task);
+ pagg = get_pagg(task, pagg_hook_old->name);
+ /*
+ * We won't be accessing pagg's memory, just need
+ * to see if one existed - so we can release the task
+ * lock now.
+ */
+ read_unlock_pagg_list(task);
+ if (pagg) {
+ read_unlock(&tasklist_lock);
+ up_write(&pagg_hook_list_sem);
+ return -EBUSY;
+ }
+ }
+ list_del_init(&pagg_hook->entry);
+ read_unlock(&tasklist_lock);
+
+ up_write(&pagg_hook_list_sem);
+
+ printk(KERN_INFO "Unregistering PAGG support for"
+ " (name=%s)\n", pagg_hook_old->name);
+
+ return 0; /* success */
+ }
+
+ up_write(&pagg_hook_list_sem);
+
+ printk(KERN_WARNING "Attempt to unregister PAGG support (name=%s)"
+ " failed - not found\n", pagg_hook_old->name);
+
+ return -EINVAL; /* error */
+}
+
+
+/*
+ * attach_pagg_list
+ *
+ * Used to attach a new task to the same pagg containers to which it's parent
+ * is attached.
+ *
+ * The "from" argument is the parent task. The "to" argument is the child
+ * task.
+ *
+ */
+int
+attach_pagg_list(struct task_struct *to_task, struct task_struct *from_task)
+{
+ struct list_head *entry;
+ int retcode = 0;
+
+
+
+ /* lock the parents pagg_list we are copying from */
+ read_lock_pagg_list(from_task);
+
+ list_for_each(entry, &from_task->pagg_list.head) {
+ struct pagg *to_pagg = NULL;
+ struct pagg *from_pagg = list_entry(entry, struct pagg,
+ entry);
+ to_pagg = alloc_pagg(to_task, from_pagg->hook);
+ if (!to_pagg) {
+ retcode = -ENOMEM;
+ goto error_return;
+ }
+ retcode = attach_pagg(to_task, to_pagg, from_pagg->data);
+ if (retcode != 0) {
+ /* attach should issue error message */
+ goto error_return;
+ }
+ }
+
+ read_unlock_pagg_list(from_task);
+
+ return 0; /* success */
+
+ error_return:
+ /*
+ * Clean up all the pagg attachments made on behalf of the new
+ * task. Set new task pagg ptr to NULL for return.
+ */
+ read_unlock_pagg_list(from_task);
+ detach_pagg_list(to_task);
+ return retcode; /* failure */
+}
+
+
+/*
+ * detach_pagg_list
+ *
+ * Used to detach a task from all pagg containers to which it is attached.
+ */
+int
+detach_pagg_list(struct task_struct *task)
+{
+ struct list_head *entry;
+ int retcode = 0;
+ int rettmp = 0;
+
+ /* Remove ref. to paggs from task immediately */
+ write_lock_pagg_list(task);
+
+ if (list_empty(&task->pagg_list.head)) {
+ write_unlock_pagg_list(task);
+ return retcode;
+ }
+
+ list_for_each(entry, &task->pagg_list.head) {
+ int rettemp = 0;
+ struct pagg *pagg = list_entry(entry, struct pagg, entry);
+
+ entry = &task->pagg_list.head;
+
+ rettemp = detach_pagg(task, pagg);
+ if (rettmp) {
+ /* an error message should be logged in free_pagg */
+ retcode = rettmp;
+ }
+ free_pagg(pagg);
+ }
+
+ write_unlock_pagg_list(task);
+
+ return retcode; /* 0 = success, else return last code for failure */
+}
+
+
+/*
+ * exec_pagg_list
+ *
+ * Used to when a process that is in a pagg container does an exec.
+ *
+ * The "from" argument is the task. The "name" argument is the name
+ * of the process being exec'ed.
+ *
+ */
+int exec_pagg_list(struct task_struct *task) {
+ struct list_head *entry;
+
+
+
+ /* lock the parents pagg_list we are copying from */
+ read_lock_pagg_list(task);
+
+ list_for_each(entry, &task->pagg_list.head) {
+ struct pagg *pagg = list_entry(entry, struct pagg,
+ entry);
+ exec_pagg(task, pagg);
+ }
+
+ read_unlock_pagg_list(task);
+ return 0;
+}
+
+
+EXPORT_SYMBOL(get_pagg);
+EXPORT_SYMBOL(alloc_pagg);
+EXPORT_SYMBOL(free_pagg);
+EXPORT_SYMBOL(attach_pagg_list);
+EXPORT_SYMBOL(detach_pagg_list);
+EXPORT_SYMBOL(exec_pagg_list);
+EXPORT_SYMBOL(register_pagg_hook);
+EXPORT_SYMBOL(unregister_pagg_hook);
+
+#endif /* CONFIG_PAGG */