Re: BUG: mmapfile/writev spurious zero bytes (x86_64/not i386,bisected, reproducable)

From: Linus Torvalds
Date: Tue Jun 17 2008 - 16:18:02 EST




On Tue, 17 Jun 2008, Linus Torvalds wrote:
>
> Bron, does this untested patch hide the bug?

Ok, it does hide the bug.

> I don't think this patch is correct, because it doesn't really fix the
> basic issue (the code should do the right thing even if a page isn't
> there), but it might hide it by faulting in the whole "bytes" range rather
> than just the first iov.
>
> So Nick, it's still over to you, but if this does hide it, then that's an
> interesting detail in itself.

I actually am starting to think that the bug is in
__copy_to_user_inatomic_nocache().

The return value of that thing in the case of a failure seems to be
totally wrong. In particular, since that function does an unrolled loop of
accesses like so:

.Ls1: movq (%rsi),%r11
.Ls2: movq 1*8(%rsi),%r8
.Ls3: movq 2*8(%rsi),%r9
.Ls4: movq 3*8(%rsi),%r10
.Ld1: movnti %r11,(%rdi)
.Ld2: movnti %r8,1*8(%rdi)
.Ld3: movnti %r9,2*8(%rdi)
.Ld4: movnti %r10,3*8(%rdi)

it may have done three of the 64-bit loads, and then trap on the fouth:
but since it hasn't done a _single_ of the stores, it shouldn't count as
any different whether it traps on any of .Ls[1-4]. But that code
definitely seems to think it makes a difference whether the exception
happened at Ls1 or at Ls4, even though both points have copied _exactly_
as many bytes when the exception happens.

So I'm starting to think the bug is all in there, not in the VM itself.
See arch/x86/lib/copy_user_nocache.S.

In fact, even the comment there implies that the author didn't know or
care about returning the correct value.

Andi and Ingo added to Cc.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/