Re: [patch 00/11] GRU Driver

From: Jack Steiner
Date: Thu Jun 12 2008 - 10:05:38 EST


On Thu, Jun 12, 2008 at 03:27:00PM +0200, Ingo Molnar wrote:
>
> * steiner@xxxxxxx <steiner@xxxxxxx> wrote:
>
> > This series of patches adds a driver for the SGI UV GRU. The driver is
> > still in development but it currently compiles for both x86_64 & IA64.
> > All simple regression tests pass on IA64. Although features remain to
> > be added, I'd like to start the process of getting the driver into the
> > kernel. Additional kernel drivers will depend on services provided by
> > the GRU driver.
> >
> > The GRU is a hardware resource located in the system chipset. The GRU
> > contains memory that is mmaped into the user address space. This
> > memory is used to communicate with the GRU to perform functions such
> > as load/store, scatter/gather, bcopy, AMOs, etc. The GRU is directly
> > accessed by user instructions using user virtual addresses. GRU
> > instructions (ex., bcopy) use user virtual addresses for operands.
>
> did i get it right that it's basically a fast, hardware based message
> passing interface that allows two tasks to communicate via DMA and
> interrupts, without holding up the CPU?

Yes


> If that is the case, wouldn't the
> proper support model be a network driver, instead of these special
> ioctls. (a network driver with no checksumming, with scatter-gather,
> zero-copy and TSO support, etc.)
>
> or a filesystem. Anything but special-purpose ioctls ...

The ioctls are not used directly by users.

Users drive the GRU by writing directly to the memory that is mmaped into
GRU space, i.e., by doing loads/stores directly to GRU space. The ioctls are
used infrequently by libgru.so to configure the driver during user
initialization and to handle errors that may occur.

For example, here is the code that is required to issue a GRU
instruction & wait for completion:


Function:

/*
 * Trivial example to load a cacheline of data from address <addr>.
 * Data is loaded into byte 0 (hardcoded in the example) of the GRU data segment.
 * Target address would likely be a function parameter but this is a stupid example.
 *
 * Function returns the status of the load. In this example, the load is synchronous.
 * Real-life usage would probably split the vload() from the wait().
 */
int do_vload(void *cb, void *addr)
{
	gru_vload(cb, addr, 0, XTYPE_CL, 1, 1, 0);
	return gru_wait(cb);
}


00000000004005b0 <do_vload>:
  4005b0:  48 83 ec 18            sub    $0x18,%rsp
  4005b4:  48 89 77 10            mov    %rsi,0x10(%rdi)
  4005b8:  48 c7 47 18 01 00 00   movq   $0x1,0x18(%rdi)
  4005bf:  00
  4005c0:  c7 47 04 00 00 00 00   movl   $0x0,0x4(%rdi)
  4005c7:  48 c7 47 20 01 00 00   movq   $0x1,0x20(%rdi)
  4005ce:  00
  4005cf:  c7 07 01 06 02 00      movl   $0x20601,(%rdi)
  4005d5:  48 89 7c 24 10         mov    %rdi,0x10(%rsp)
  4005da:  0f ae 7c 24 10         clflush 0x10(%rsp)
  4005df:  31 c0                  xor    %eax,%eax
  4005e1:  f6 47 07 03            testb  $0x3,0x7(%rdi)
  4005e5:  74 05                  je     4005ec <do_vload+0x3c>
  4005e7:  e8 cc fe ff ff         callq  4004b8 <gru_wait_proc@plt>   # unlikely to be called - mainly to handle errors
  4005ec:  48 83 c4 18            add    $0x18,%rsp
  4005f0:  c3                     retq

Unless an error occurs, there are no function calls involved. In many cases, the
entire code sequence would be inline.
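
To make the fast-path/slow-path split concrete, here is a rough sketch in
plain C that uses ordinary memory in place of GRU space. Everything in it is
an assumption for illustration: the struct layout is only read off the store
offsets in the disassembly above, and the demo_* names, the status mask and
the slow-path stand-in are hypothetical, not the libgru definitions. It also
leaves out the clflush step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical control block layout, guessed from the store offsets in the
 * disassembly. The real layout is defined by the GRU headers, not here.
 */
struct demo_cb {
	uint32_t opword;	/* 0x00: written last */
	uint8_t  arg[3];	/* 0x04..0x06: cleared before issue */
	uint8_t  status;	/* 0x07: low 2 bits nonzero => needs attention */
	uint64_t unused;	/* 0x08 */
	uint64_t addr;		/* 0x10: operand (user virtual address) */
	uint64_t nelem;		/* 0x18: assumed meaning of the first "1" */
	uint64_t stride;	/* 0x20: assumed meaning of the second "1" */
};

_Static_assert(offsetof(struct demo_cb, status) == 0x07, "status offset");
_Static_assert(offsetof(struct demo_cb, addr) == 0x10, "addr offset");
_Static_assert(offsetof(struct demo_cb, stride) == 0x20, "stride offset");

#define DEMO_OP_VLOAD	 0x20601u  /* value copied from the disassembly */
#define DEMO_STATUS_MASK 0x3u	   /* mask used by the testb instruction */

static int slow_path_calls;	/* counts trips through the out-of-line path */

/* Stand-in for the out-of-line handler (gru_wait_proc in the disassembly). */
static int demo_wait_slow(struct demo_cb *cb)
{
	slow_path_calls++;
	return cb->status & DEMO_STATUS_MASK;
}

/* Fill in the control block with plain stores; no calls, no syscalls. */
static inline void demo_issue_load(struct demo_cb *cb, const void *addr)
{
	assert(cb != NULL);
	cb->addr = (uint64_t)(uintptr_t)addr;
	cb->nelem = 1;
	cb->arg[0] = 0;
	cb->arg[1] = 0;
	cb->arg[2] = 0;
	cb->status = 0;
	cb->stride = 1;
	cb->opword = DEMO_OP_VLOAD;	/* last store, as in the disassembly */
}

/* Inline check of the status bits; only leave the fast path if they are set. */
static inline int demo_wait(struct demo_cb *cb)
{
	if ((cb->status & DEMO_STATUS_MASK) == 0)
		return 0;
	return demo_wait_slow(cb);
}

static inline int demo_do_vload(struct demo_cb *cb, const void *addr)
{
	demo_issue_load(cb, addr);
	return demo_wait(cb);
}
```

With a zeroed struct demo_cb, demo_do_vload() returns 0 without ever reaching
demo_wait_slow(); the out-of-line call only happens once the status bits are
nonzero, which mirrors the je/callq pair in the listing.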


--- jack