Re: [PATCH 08 of 11] anon-vma-rwsem

From: Robin Holt
Date: Wed May 07 2008 - 20:38:53 EST


On Wed, May 07, 2008 at 02:36:57PM -0700, Linus Torvalds wrote:
> On Wed, 7 May 2008, Andrea Arcangeli wrote:
> >
> > I think the spinlock->rwsem conversion is ok under config option, as
> > you can see I complained myself to various of those patches and I'll
> > take care they're in a mergeable state the moment I submit them. What
> > XPMEM requires are different semantics for the methods, and we never
> > had to do any blocking I/O during vmtruncate before, now we have to.
>
> I really suspect we don't really have to, and that it would be better to
> just fix the code that does that.

That fix is going to be fairly difficult. I will argue impossible.

First, a little background. SGI allows one large numa-link connected
machine to be broken into seperate single-system images which we call
partitions.

XPMEM allows, at its most extreme, one process on one partition to
grant access to a portion of its virtual address range to processes on
another partition. Those processes can then fault pages and directly
share the memory.

In order to invalidate the remote page table entries, we need to message
(uses XPC) to the remote side. The remote side needs to acquire the
importing process's mmap_sem and call zap_page_range(). Between the
messaging and the acquiring a sleeping lock, I would argue this will
require sleeping locks in the path prior to the mmu_notifier invalidate_*
callouts().

On a side note, we currently have XPMEM working on x86_64 SSI, and
ia64 cross-partition. We are in the process of getting XPMEM working
on x86_64 cross-partition in support of UV.

Thanks,
Robin Holt
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/