Re: [mm 4.15-rc8] Random oopses under memory pressure.

From: Linus Torvalds
Date: Thu Jan 18 2018 - 12:26:32 EST


On Thu, Jan 18, 2018 at 8:56 AM, Kirill A. Shutemov
<kirill@xxxxxxxxxxxxx> wrote:
>
> I can't say I fully grasp how 'diff' got this value and how it leads to both
> checks being false.

I think the problem is that page difference when they are in different sections.

When you do

pte_page(*pvmw->pte) - pvmw->page

then the compiler takes the pointer difference, and then divides by
the size of "struct page" to get an index.

But - and this is important - it does so knowing that the division it
does will have no modulus: the two 'struct page *' pointers are really
in the same array, and they really are 'n*sizeof(struct page)' apart
for some 'n'.

That means that the compiler can optimize the division. In fact, for
this case, gcc will generate

subl %ebx, %eax
sarl $3, %eax
imull $-858993459, %eax, %eax

because 'struct page' is 40 bytes in size, and that magic sequence
happens to divide by 40 (first divide by 8, then that magical "imull"
will divide by 5 *IFF* the thing is evenly divisible by 5 (and not too
big - but the shift guarantees that).

Basically, it's a magic trick, because real divides are very
expensive, but you can fake them more quickly if you can limit the
input domain.

But what does it mean if the two "struct page *" are not in the same
array, and the two arrays were allocated not aligned exactly 40 bytes
away, but some random number of pages away?

You get *COMPLETE*GARBAGE* when you do the above optimized divide.
Suddenly the divide had a modulus (because the base of the two arrays
weren't 40-byte aligned), and the "trick" doesn't work.

So that's why you can't do pointer diffs between two arrays. Not
because you can't subtract the two pointers, but because the
*division* part of the C pointer diff rules leads to issues.

Linus