Re: [PATCH v4 10/10] selftests/vm: test faulting in kernel, and verify pinnable pages

From: John Hubbard
Date: Sat Dec 19 2020 - 01:14:20 EST


On 12/17/20 10:52 AM, Pavel Tatashin wrote:
>

Hi Pavel,

This all looks good pretty good to me, with just a couple of minor
doubts interleaved with the documentation tweaks:

a) I'm not yet sure if the is_pinnable_page() concept is a keeper. If it's
not for some reason, then we should revisit this patch.

b) I don't yet understand why FOLL_TOUCH from gup/pup is a critical part
of the test.


When pages are pinned they can be faulted in userland and migrated, and
they can be faulted right in kernel without migration.

In either case, the pinned pages must end-up being pinnable (not movable).

Let's delete the above two sentences, which are confusing as currently
worded, and just keep approximately the last sentence below.


Add a new test without touching pages in userland, and use FOLL_TOUCH
instead. Also, verify that pinned pages are pinnable.

Maybe this instead:

Add a new test to gup_test, to verify that only "pinnable" pages are
pinned. Also, use gup/pup + FOLL_TOUCH to fault in the pages, rather
than faulting them in from user space.


? But I don't know why that second point is important. Is it actually
important in order to have a valid test? If so, why?



Signed-off-by: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
---
mm/gup_test.c | 6 ++++++
tools/testing/selftests/vm/gup_test.c | 17 +++++++++++++----
2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/mm/gup_test.c b/mm/gup_test.c
index 24c70c5814ba..24fd542091ee 100644
--- a/mm/gup_test.c
+++ b/mm/gup_test.c
@@ -52,6 +52,12 @@ static void verify_dma_pinned(unsigned int cmd, struct page **pages,
dump_page(page, "gup_test failure");
break;
+ } else if (cmd == PIN_LONGTERM_BENCHMARK &&
+ WARN(!is_pinnable_page(page),
+ "pages[%lu] is NOT pinnable but pinned\n",
+ i)) {
+ dump_page(page, "gup_test failure");
+ break;
}
}
break;
diff --git a/tools/testing/selftests/vm/gup_test.c b/tools/testing/selftests/vm/gup_test.c
index 42c71483729f..f08cc97d424d 100644
--- a/tools/testing/selftests/vm/gup_test.c
+++ b/tools/testing/selftests/vm/gup_test.c
@@ -13,6 +13,7 @@
/* Just the flags we need, copied from mm.h: */
#define FOLL_WRITE 0x01 /* check pte is writable */
+#define FOLL_TOUCH 0x02 /* mark page accessed */


Aha, now I see why you wanted to pass other GUP flags, in the previous
patch. I think it's OK to pass this set of possible flags (as
.gup_flags) through ioctl, yes.

However (this is about the previous patch), I *think* we're better off
leaving the gup_test behavior as: "default is read-only pages, but you
can pass in -w to specify FOLL_WRITE". As opposed to passing in raw
flags from the command line. And yes, I realize that my -F option seemed
to recommand the latter...I'm regretting that -F approach now.

The other direction to go might be to stop doing that, and shift over to
just let the user specify FOLL_* flags directly on the command line, but
IMHO there's no need for that (yet), and it's a little less error-prone
to constrain it.

This leads to: change the "-F 1", to some other better-named option,
perhaps. Open to suggestion there.


static char *cmd_to_str(unsigned long cmd)
{
@@ -39,11 +40,11 @@ int main(int argc, char **argv)
unsigned long size = 128 * MB;
int i, fd, filed, opt, nr_pages = 1, thp = -1, repeats = 1, write = 1;
unsigned long cmd = GUP_FAST_BENCHMARK;
- int flags = MAP_PRIVATE;
+ int flags = MAP_PRIVATE, touch = 0;


Silly nit, can we put it on its own line? This pre-existing mess of
declarations makes it hard to read everything. One item per line is
easier on the reader, who is often just looking for a single item at a
time. Actually why not rename it slightly while we're here (see below),
maybe to this:

int use_foll_touch = 0;


char *file = "/dev/zero";
char *p;
- while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHp")) != -1) {
+ while ((opt = getopt(argc, argv, "m:r:n:F:f:abctTLUuwWSHpz")) != -1) {

Yes, this seems worth its own command line option.

switch (opt) {
case 'a':
cmd = PIN_FAST_BENCHMARK;
@@ -110,6 +111,10 @@ int main(int argc, char **argv)
case 'H':
flags |= (MAP_HUGETLB | MAP_ANONYMOUS);
break;
+ case 'z':
+ /* fault pages in gup, do not fault in userland */

How about:
/*
* Use gup/pup(FOLL_TOUCH), *instead* of faulting
* pages in from user space.
*/
use_foll_touch = 1;

+ touch = 1;
+ break;
default:
return -1;
}
@@ -167,8 +172,12 @@ int main(int argc, char **argv)
else if (thp == 0)
madvise(p, size, MADV_NOHUGEPAGE);
- for (; (unsigned long)p < gup.addr + size; p += PAGE_SIZE)
- p[0] = 0;
+ if (touch) {
+ gup.flags |= FOLL_TOUCH;
+ } else {
+ for (; (unsigned long)p < gup.addr + size; p += PAGE_SIZE)
+ p[0] = 0;
+ }

OK.

/* Only report timing information on the *_BENCHMARK commands: */
if ((cmd == PIN_FAST_BENCHMARK) || (cmd == GUP_FAST_BENCHMARK) ||


thanks,
--
John Hubbard
NVIDIA