Re: Linux 5.1-rc5

From: Martin Schwidefsky
Date: Wed Apr 17 2019 - 03:47:31 EST


On Tue, 16 Apr 2019 09:49:46 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, Apr 16, 2019 at 9:16 AM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > We actually already *have* this function.
> >
> > It's called "gup_fast_permitted()" and it's used by x86-64 to verify
> > the proper address range. Exactly like s390 needs..
> >
> > Could you please use that instead?
>
> IOW, something like the attached.
>
> Obviously untested. And maybe 'current' isn't declared in
> <asm/pgtable.h>, in which case you'd need to modify it to instead make
> the inline function be "s390_gup_fast_permitted()" that takes a
> pointer to the mm, and do something like
>
> #define gup_fast_permitted(start, pages) \
> s390_gup_fast_permitted(current->mm, start, pages)
>
> instead.
>
> But I think you get the idea..

Nice, I did not realize that gup_fast_permitted is a function the
platform can override. So that part is doable in arch/s390. But I
spoke too soon: I got my first crash and realized that the common gup
code is not usable as it is. The reason is, for example, this sequence:

pgdp = pgd_offset(current->mm, addr);
pgd_t pgd = READ_ONCE(*pgdp);
/* some checking on pgd */
gup_p4d_range(pgd, addr, next, write, pages, nr);

p4dp = p4d_offset(&pgd, addr);
p4d_t p4d = READ_ONCE(*p4dp);
/* some checking on p4d */
gup_pud_range(p4d, addr, next, write, pages, nr);

pudp = pud_offset(&p4d, addr);
pud_t pud = READ_ONCE(*pudp);
/* some checking on pud */
gup_pmd_range(pud, addr, next, write, pages, nr);

Each step along the way will read the page table entry and pass the
table entry to the next function. This clashes with the page table
folding on s390. The s390 gup code looks more like this:

pgdp = pgd_offset(current->mm, addr);
/* some checking on pgd */
pgd_t pgd = READ_ONCE(*pgdp);
gup_p4d_range(pgdp, pgd, addr, next, write, pages, &nr);

p4dp = p4d_offset(pgdp, addr);
p4d_t p4d = READ_ONCE(*p4dp);
/* some checking on p4d */
gup_pud_range(p4dp, p4d, addr, next, write, pages, nr);

pudp = pud_offset(p4dp, addr);
pud_t pud = READ_ONCE(*pudp);
/* some checking on pud */
gup_pmd_range(pudp, pud, addr, next, write, pages, nr);

There are magic dereferences in the s390 versions of the p4d_offset,
pud_offset and pmd_offset functions. To make this work, the pointer
passed to these functions must not point to the local copy of the
already dereferenced table entry; it has to point to the entry in the
page table itself. I'll cook up a patch for the common code.
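
To illustrate why the pointer matters, here is a small userspace toy
model. It is a sketch only, not the s390 or common gup code: entry_t,
the TYPE_* bits, lower_offset() and the two walk_*() helpers are all
made-up names. It assumes an offset helper that reuses the incoming
pointer when the upper level is folded, so a pointer to a stack copy
ends up being indexed as if it were the page table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, not kernel code: every name below is hypothetical. */
#define ENTRIES     4
#define TYPE_MASK   ((uintptr_t)0x3)
#define TYPE_UPPER  ((uintptr_t)0x1)	/* entry points to a lower table */
#define TYPE_LOWER  ((uintptr_t)0x2)	/* entry lives in the lowest table */

typedef struct { uintptr_t val; } entry_t;

/* The only real table: the upper level is folded away. */
static entry_t table[ENTRIES] = {
	{ TYPE_LOWER }, { TYPE_LOWER }, { TYPE_LOWER }, { TYPE_LOWER },
};

/*
 * Folding-aware offset helper: a real upper-level entry is followed to
 * the lower table; otherwise "upper" already points into the lower
 * table and the pointer itself is reused.
 */
static entry_t *lower_offset(entry_t *upper, unsigned int idx)
{
	entry_t *lower = upper;

	if ((upper->val & TYPE_MASK) == TYPE_UPPER)
		lower = (entry_t *)(upper->val & ~TYPE_MASK);
	return lower + idx;
}

/* True if p points at one of the entries of the real table. */
static bool in_table(const entry_t *p)
{
	uintptr_t a = (uintptr_t)p;

	return a >= (uintptr_t)&table[0] && a < (uintptr_t)&table[ENTRIES];
}

/* Pass the pointer into the real table down to the next level. */
bool walk_with_pointer(void)
{
	entry_t *upper = &table[0];

	return lower_offset(upper, 1) == &table[1];
}

/* Pass a pointer to a local copy of the entry down instead. */
bool walk_with_copy(void)
{
	entry_t copy = table[0];

	/* &copy + 1 is only computed here, never dereferenced. */
	return in_table(lower_offset(&copy, 1));
}

void self_check(void)
{
	assert(walk_with_pointer());	/* lands on table[1] */
	assert(!walk_with_copy());	/* lands next to the stack copy */
}
```

In this model the copy has the right value but the wrong address, which
is exactly what a folding-aware offset helper cannot tolerate.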

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.