Re: [PATCH 2/9] mm: arm ready for split ptlock

From: Russell King
Date: Tue Oct 25 2005 - 19:20:28 EST


On Tue, Oct 25, 2005 at 11:00:09AM -0400, Nicolas Pitre wrote:
> On Tue, 25 Oct 2005, Russell King wrote:
>
> > On Mon, Oct 24, 2005 at 10:45:04PM -0400, Nicolas Pitre wrote:
> > > On Sat, 22 Oct 2005, Russell King wrote:
> > > > Please contact Nicolas Pitre about that - that was my suggestion,
> > > > but ISTR apparantly the overhead is too high.
> > >
> > > Going through a kernel buffer will simply double the overhead. Let's
> > > suppose it should not be a big enough issue to stop the patch from being
> > > merged though (and it looks cleaner that way). However I'd like for the
> > > WARN_ON((unsigned long)frame & 7) to remain as both the kernel and user
> > > buffers should be 64-bit aligned.
> >
> > The WARN_ON is pointless because we guarantee that the stack is always
> > 64-bit aligned on signal handler setup and return.
>
> Sure, but the iWMMXt context is stored after the standard sigcontext
> which also must be 64 bits in size (which might not be always the case
> if things change in the structure or in its padding).
>
> > > I don't see how standard COW could not happen. The only difference with
> > > a true write fault as if we used put_user() is that we bypassed the data
> > > abort vector and the code to get the FAR value. Or am I missing
> > > something?
> >
> > pte_write() just says that the page _may_ be writable. It doesn't say
> > that the MMU is programmed to allow writes. If pte_dirty() doesn't
> > return true, that means that the page is _not_ writable from userspace.
>
> Argh... So only suffice to s/pte_write/pte_dirty/ I'd guess?

No. If we're emulating a cmpxchg() on a clean BSS page, this code
as it stands today will write to the zero page making it a page which
is mostly zero. Bad news when it's mapped into other processes BSS
pages.

Changing this for pte_dirty() means that we'll refuse to do a cmpxchg()
on a clean BSS page. The data may compare correctly, but because it
isn't already dirty, you'll fail.

If we still had it, I'd say you need to use verify_area() to tell the
kernel to pre-COW the pages. However, that got removed a while back.

For the benefit of others, this is a bit of a bugger since you can't
use put_user() - the read + compare + write needs to be 100% atomic
and put_user() isn't - it may fault and hence we may switch contexts.

I don't have any ideas at present, but maybe that's because it's
getting on for 1:30am. 8/

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/