Re: Syscall numbers for BProc

From: hendriks@lanl.gov
Date: Fri Apr 04 2003 - 19:44:28 EST


On Fri, Apr 04, 2003 at 08:35:31PM +0100, Christoph Hellwig wrote:
> On Fri, Apr 04, 2003 at 12:32:18PM -0700, Erik Hendriks wrote:
> > Is it possible to get a Linux system call number allocated for BProc?
> > (http://sourceforge.net/projects/bproc) I've been using arbitrary
> > system call numbers for a while but there have been collisions with
> > new kernel features. I'd like to avoid that in the future. BProc
> > currently works on 2.4.x kernels on x86, alpha and ppc (32bit).
>
> Please explain why you need syscalls and the exact APIs.

Here's the answer to the second half... The kernel API is reasonably
large so let me know if anybody wants more detail anywhere. The C
library is basically a 1:1 mapping for these calls.

Also, if anybody wants to know more about BProc in general, they can
check out: http://public.lanl.gov/cluster/papers/papers/hendriks-ics02.pdf

API for the syscall:

arg0 is a function number and meaning of the rest of the arguments
depends on that.

For arg0:

0x0001 - BPROC_SYS_VERSION - get BProc version
  arg1 is a pointer to a bproc_version_t which gets filled in.

  return value: 0 on success, -errno on error

0x0002 - BPROC_SYS_DEBUG - undocumented debugging call a magic
  debugging hook who's argument meanings are fluid, currently it does:
  arg1 = 0 - return number of children by checking pptr and opptr.
  arg1 = 1 - return true/false indicating whether wait() will be local.
  arg1 = 2 - return value of nlchild (internal BProc process book keeping val)
  arg1 = 3 - perform process ID sanity check and return information about
             pptr and opptr in linux and BProc.

0x0003 - BPROC_SYS_MASTER - get master daemon file descriptor. The master
                       daemon reads/writes messages to/from kernel
                       space with this file descriptor.
  no arguments

  return value: a new file descriptor or -errno on failure.

0x0004 - BPROC_SYS_SLAVE - get slave daemon file descriptor. The slave
                      daemon reads/writes messages to/from kernel
                      space with this file descriptor.
  no arguments

  return value: a new file descriptor or -errno on failure.

0x0005 - BPROC_SYS_IOD - get I/O daemon file descriptor. The I/O daemon
                    reads/writes messages to/from kernel space with
                    this file descriptor.
  no arguments

  return value is a new file descriptor or -errno on failure.

0x0006 - BPROC_SYS_NOTIFY - get notifier file descriptor. An app can get a
                       file descriptor which becomes ready for reading
                       whenever a BProc machine state change happens.
  no arguments

  return value: a new file descriptor or -errno on failure.

0x0201 - BPROC_SYS_INFO - get information on node status
  arg1 pointer to an array of bproc_node_info_t's (first element contains
       index of last node seen, nodes returned will start with next node)
  arg2 number of elements in array

  return value: number of elements returned on success, -errno on error

0x0202 - BPROC_SYS_STATUS - set node status
  arg1 node number
  arg2 new node state

  return value: 0 on success, -errno on error

0x0203 - BPROC_SYS_CHOWN - change node permission bits (perms are file-like)
  arg1 node number
  arg2 new node perm

  return value: 0 on success, -errno on error

0x0207 - BPROC_SYS_CHROOT - ask slave node to chroot()
  arg1 node number
  arg2 pointer to patch to chroot() to.

  return value: 0 on success, -errno on error

0x0208 - BPROC_SYS_REBOOT - ask slave node to reboot
  arg1 node number

  return value: 0 on success, -errno on error

0x0209 - BPROC_SYS_HALT - ask slave node to halt
  arg1 node number

  return value: 0 on success, -errno on error

0x020A - BPROC_SYS_PWROFF - ask slave node to power off
  arg1 node number

  return value: 0 on success, -errno on error

0x020B - BPROC_SYS_PINFO - get information about location of remote processes
  arg1 pointer to an array of bproc_proc_info_t's (first element contains
       index of last proc seen, procs returned will start with next node)
  arg2 number of elements in array

  return value: number of elements returned on success, -errno on error

0x020E - BPROC_SYS_RECONNECT - ask slave daemon to reconnect
  arg1 node number
  arg2 pointer to bproc_connect_t which contains 2 sockaddrs - a local
       and remote address for the slave to use when re-connecting to
       the master.

  return value: 0 on success, -errno on error

0x0301 - BPROC_SYS_REXEC - remote exec (replace current process with exec
                           performed on remote node)
  arg1 node number
  arg2 pointer to bproc_move_t (contains exec args, io setup info, etc)

  return value: no return (it's an exec) on success, -errno on error

0x0302 - BPROC_SYS_MOVE - move caller to remote node
  arg1 node number
  arg2 pointer to bproc_move_t (contains flags, io setup info, etc)
  
  arg2 move flags (how much of the memory space gets sent)

  return value: 0 on success, -errno on error

0x0303 - BPROC_SYS_RFORK - fork a child onto another node. This is a
                           combination of the fork and move calls with
                           semantics such that no child process is
                           ever created (from the parent's point of
                           view) if the move step fails.
  arg1 node number
  arg2 pointer to bproc_move_t (contains flags, io setup info, etc)

  return value: parent: child pid on success, -errno on error
                child: 0

0x0304 - BPROC_SYS_EXECMOVE - exec and then move. This is a
                              combination of the xec and move
                              syscalls. This call performs an exec
                              and then moves the resulting process
                              image to a remote node before it is
                              allowed to run. This is used to place
                              images of programs which are not BProc
                              aware on remote nodes.
  arg1 node number
  arg2 pointer to bproc_move_t (contains exec args, io setup info, etc.)

  return value: no return on success, -errno on error if error happens
  in exec(). If error happens during the move step the process will
  exit with errno as its status.

0x0306 - BPROC_SYS_VRFORK - vector rfork - create many child processes
                            on many remote nodes efficiently.

  arg1 pointer to bproc_move_t. This contains the number of children
       to create, a list of nodes to move to, an array to store the
       resulting child process IDs, and possibly IO setup information.

  return value: parent: number of nodes or -errno on error.
                        pid array contains pids or -errno for each child.
                child: rank in list of nodes (0 .. n-1)

0x0307 - BPROC_SYS_EXEC - use master node to perform exec. A process
                          running on a slave node can ask it's "ghost"
                          on the front end to perform an exec for it.
                          The results of that exec will replace the
                          process on the slave node.
  arg1 pointer to bproc_move_t (contains execve args)

  return value: no return on success, -errno on failure.

0x0309 - BPROC_SYS_VEXECMOVE - vector execmove - create many child
                               processes on many remote nodes
                               efficiently. The child process image
                               is the result of the supplied execve.

  arg1 pointer to bproc_move_t. This contains the number of children
       to create, a list of nodes to move to, an array to store the
       resulting child process IDs, execve args and possibly IO setup
       information.

  return value: parent: number of nodes or -errno on failure. The array
                children: no return. If BPROC_RANK=XXXXXXX exists in the
                          environment, vexecmove will replace the Xs with
                          the child's rank in vexecmove.

0x1000 - at this offset, bproc provides an interface to virtual memory
         area dumper (vmadump). VMADump is the process save/restore
         mechanism that BProc uses internally.

0x1000 - VMAD_DO_DUMP - dump the calling process's image to a file descriptor
  arg1 file descriptor
  arg2 dump flags (controls which regions are dumped and which regions
       are stored as references to files.)

  return value: during dump: number of bytes written to the file descriptor.
                during undump: 0 (when the process image is restored, it will
                start by returning from this system call)

0x1001 - VMAD_DO_UNDUMP - restore a process image from a file
                          descriptor. The new image will replace the
                          calling process just like exec.
  arg1 file descriptor

  return value: no return on success, -errno on failure.

side note: where possible, vmadump adds a binary format for dump files
which allows a dump stored in a file to be executed directly.

0x1002 VMAD_DO_EXECDUMP - this is a combination of the exec and dump
                          system calls. An exec is performed and the
                          resulting process image is dumped to a file
                          descriptor. The process will exit after the
                          dump is complete.

  arg1 pointer to vmadump_execdump_args (contains execve arguments, a
       file descriptor to dump to and dump flags)

  return value: no return on success, -errno on failure.

VMADump has a "library list" in kernel space. This is the list of
files it presumes are available at undump time if you ask it not to
dump "libaries".

0x1030 - VMAD_LIB_CLEAR - clear the library list
  no arguments

  return value: 0 on success, -errno no failure.

0x1031 - VMAD_LIB_ADD - add a new library to the list
  arg1 pointer to filename
  arg2 length of filename

  return value: 0 on success, -errno no failure.

0x1032 - VMAD_LIB_DEL - delete a library from the list
  arg1 pointer to filename
  arg2 length of filename

  return value: 0 on success, -errno no failure.

0x1033 - VMAD_LIB_LIST - get the contents of the library list
  arg1 pointer to region to store library list
  arg2 size of region

  The list of libraries is store as a null delimited list of strings.

  return value: number of bytes stored on success, -errno no failure.

0x1034 - VMAD_LIB_SIZE - get the size in bytes of the library list
  no arguments

  return value: size in bytes of the library list
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Apr 07 2003 - 22:00:24 EST