Re: dm bufio: Reduce dm_bufio_lock contention

From: Mikulas Patocka
Date: Fri Jun 22 2018 - 08:45:03 EST




On Fri, 22 Jun 2018, Michal Hocko wrote:

> On Thu 21-06-18 21:17:24, Mikulas Patocka wrote:
> [...]
> > > But seriously, isn't the best way around the throttling issue to use
> > > PF_LESS_THROTTLE?
> >
> > Yes - it could be done by setting PF_LESS_THROTTLE. But I think it would
> > be better to change it just in one place than to add PF_LESS_THROTTLE to
> > every block device driver (because adding it to every block driver results
> > in more code).
>
> Why would every block device need this? I thought we were talking about
> mempool allocator and the md variant of it. They are explicitly doing
> their own back off so PF_LESS_THROTTLE sounds appropriate to me.

Because every block driver is suspicible to this problem. Two years ago,
there was a bug that when the user was swapping to dm-crypt device, memory
management would stall the allocations done by dm-crypt by 100ms - that
slowed down swapping rate and made the machine unuseable.

Then, people are complaining about the same problem in dm-bufio.

Next time, it will be something else.

(you will answer : "we will not fix bugs unless people are reporting them" :-)

> > What about this patch? If __GFP_NORETRY and __GFP_FS is not set (i.e. the
> > request comes from a block device driver or a filesystem), we should not
> > sleep.
>
> Why? How are you going to audit all the callers that the behavior makes
> sense and moreover how are you going to ensure that future usage will
> still make sense. The more subtle side effects gfp flags have the harder
> they are to maintain.

I did audit them - see the previous email in this thread:
https://www.redhat.com/archives/dm-devel/2018-June/thread.html

Mikulas

> > Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> >
> > Index: linux-2.6/mm/vmscan.c
> > ===================================================================
> > --- linux-2.6.orig/mm/vmscan.c
> > +++ linux-2.6/mm/vmscan.c
> > @@ -2674,6 +2674,7 @@ static bool shrink_node(pg_data_t *pgdat
> > * the LRU too quickly.
> > */
> > if (!sc->hibernation_mode && !current_is_kswapd() &&
> > + (sc->gfp_mask & (__GFP_NORETRY | __GFP_FS)) != __GFP_NORETRY &&
> > current_may_throttle() && pgdat_memcg_congested(pgdat, root))
> > wait_iff_congested(BLK_RW_ASYNC, HZ/10);
> >
>
> --
> Michal Hocko
> SUSE Labs
>