Re: [PATCH RFC] rcu/tree: Use GFP_MEMALLOC for alloc memory to free memory pattern

From: Paul E. McKenney
Date: Tue Mar 31 2020 - 13:49:07 EST


On Tue, Mar 31, 2020 at 07:02:32PM +0200, Uladzislau Rezki wrote:
> > >
> > > Paul was concerned about the following scenario with hitting synchronize_rcu():
> > > 1. Consider a system under memory pressure.
> > > 2. Consider some other subsystem X depending on another system Y which uses
> > > kfree_rcu(). If Y doesn't complete the operation in time, X accumulates
> > > more memory.
> > > 3. Since kfree_rcu() on Y hits synchronize_rcu() a lot, it slows it down.
> > > This causes X to further allocate memory, further causing a chain
> > > reaction.
> > > Paul, please correct me if I'm wrong.
> > >
> > I see your point and agree that in theory it can happen. So we should
> > tighten up the rcu_head attachment logic.
> >
> Just adding more thoughts about this concern. In theory we can run
> into something like that. But please also note that under high memory
> pressure (X) will not always succeed with further allocations either,
> so memory pressure is something common. As soon as the situation
> becomes slightly better, we do our work much more efficiently.
>
> Practically, I tried to simulate memory pressure to hit synchronize_rcu()
> on my test system, by simulating head-less freeing (for any object) and
> by always taking the dynamic attaching path. I could trigger it, but that
> was really hard to achieve and it happened only a few times, so it was not
> a constant hit. What I got consistently was:
>
> - The system recovered and proceeded with the "normal" path;
> - The OOM hit as a final step, when the system had run out of memory fully.
>
> So, practically, I have not seen a massive synchronize_rcu() hit.

Understood, but given the attractive properties of headless kfree_rcu(),
it is not unreasonable to expect its usage to become quite high. In addition,
memory-pressure scenarios can be quite involved. Finally, as Joel
pointed out offlist, the per-CPU cached structure acts as a small
portion of kfree_rcu()-specific reserved memory, so you guys have at
least partially addressed parts of my concerns already.
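
For readers following the thread, the fallback order under discussion
can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not the kernel code: every name
below (cpu_model, ptr_block, BLOCK_SLOTS, model_headless_free(), and
so on) is hypothetical, malloc()/free() stand in for the kernel
allocator, and model_synchronize_rcu() is an empty stub. It shows a
cached spare block acting as a small reserve, with the blocking slow
path taken only once that reserve is also in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SLOTS 4	/* hypothetical; tiny so every path is easy to hit */

/* One block of pointers waiting for a grace period (model only). */
struct ptr_block {
	struct ptr_block *next;
	size_t nr;
	void *ptrs[BLOCK_SLOTS];
};

/* Hypothetical per-CPU state: pending blocks plus one cached spare. */
struct cpu_model {
	struct ptr_block *head;		/* newest block, currently being filled */
	struct ptr_block *spare;	/* cached block, i.e. the small reserve */
	bool alloc_fails;		/* simulates memory pressure */
	unsigned long nr_batched;	/* pointers queued without blocking */
	unsigned long nr_sync;		/* pointers freed via the slow path */
};

/* Stub: the real primitive blocks until a grace period has elapsed. */
static void model_synchronize_rcu(void)
{
}

/* Try a fresh allocation first, then fall back to the cached spare. */
static struct ptr_block *get_block(struct cpu_model *m)
{
	struct ptr_block *b = m->alloc_fails ? NULL : malloc(sizeof(*b));

	if (!b) {
		b = m->spare;
		m->spare = NULL;
	}
	if (b) {
		b->next = NULL;
		b->nr = 0;
	}
	return b;
}

/* Queue ptr for deferred freeing; block only if no storage is available. */
static void model_headless_free(struct cpu_model *m, void *ptr)
{
	if (!m->head || m->head->nr == BLOCK_SLOTS) {
		struct ptr_block *b = get_block(m);

		if (!b) {
			/* Slow path: wait for a grace period, free inline. */
			model_synchronize_rcu();
			free(ptr);
			m->nr_sync++;
			return;
		}
		b->next = m->head;
		m->head = b;
	}
	m->head->ptrs[m->head->nr++] = ptr;
	m->nr_batched++;
}

/* A grace period ended: free queued pointers, keep one block as spare. */
static void model_grace_period_end(struct cpu_model *m)
{
	struct ptr_block *b = m->head;

	m->head = NULL;
	while (b) {
		struct ptr_block *next = b->next;

		for (size_t i = 0; i < b->nr; i++)
			free(b->ptrs[i]);
		if (!m->spare)
			m->spare = b;
		else
			free(b);
		b = next;
	}
}
```

In this model, once allocation starts failing, the spare absorbs the
next BLOCK_SLOTS requests and only later requests take the blocking
path, until a grace period ends and returns a block to the cache.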

I am not at all a fan of using GFP_MEMALLOC because kfree_rcu()
is sufficiently low-level to be in the business of ensuring its own
forward progress.

Thanx, Paul