Re: [PATCH] psi: annotate refault stalls from IO submission

From: Andrew Morton
Date: Mon Jul 22 2019 - 18:26:10 EST


On Mon, 22 Jul 2019 16:13:37 -0400 Johannes Weiner <hannes@xxxxxxxxxxx> wrote:

> psi tracks the time tasks wait for refaulting pages to become
> uptodate, but it does not track the time spent submitting the IO. The
> submission part can be significant if backing storage is contended or
> when cgroup throttling (io.latency) is in effect - a lot of time is
> spent in submit_bio(). In that case, we underreport memory pressure.

It's a somewhat broad patch. How significant is this problem in the
real world? Can we be confident that the end-user benefit is worth the
code changes?

> Annotate the submit_bio() paths (or the indirection through readpage)
> for refaults and swapin to get proper psi coverage of delays there.
>
> Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> ---
> fs/btrfs/extent_io.c | 14 ++++++++++++--
> fs/ext4/readpage.c | 9 +++++++++
> fs/f2fs/data.c | 8 ++++++++
> fs/mpage.c | 9 +++++++++
> mm/filemap.c | 20 ++++++++++++++++++++
> mm/page_io.c | 11 ++++++++---
> mm/readahead.c | 24 +++++++++++++++++++++++-

We touch three filesystems. Why these three? Are all other
filesystems OK or will they need work as well?

> ...
>
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
>
> ...
>
> @@ -2753,11 +2763,14 @@ static struct page *do_read_cache_page(struct address_space *mapping,
> void *data,
> gfp_t gfp)
> {
> + bool refault = false;
> struct page *page;
> int err;
> repeat:
> page = find_get_page(mapping, index);
> if (!page) {
> + unsigned long pflags;
> +

That was a bit odd. This?

--- a/mm/filemap.c~psi-annotate-refault-stalls-from-io-submission-fix
+++ a/mm/filemap.c
@@ -2815,12 +2815,12 @@ static struct page *do_read_cache_page(s
void *data,
gfp_t gfp)
{
- bool refault = false;
struct page *page;
int err;
repeat:
page = find_get_page(mapping, index);
if (!page) {
+ bool refault = false;
unsigned long pflags;

page = __page_cache_alloc(gfp);
_