Re: [PATCH tip/core/rcu 03/17] rcu/tree: Skip entry into the page allocator for PREEMPT_RT
From: Sebastian Andrzej Siewior
Date: Tue Jun 30 2020 - 12:45:51 EST
On 2020-06-24 13:12:12 [-0700], paulmck@xxxxxxxxxx wrote:
> From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>
>
> To keep the kfree_rcu() code working in purely atomic sections on RT,
> such as non-threaded IRQ handlers and raw spinlock sections, avoid
> calling into the page allocator which uses sleeping locks on RT.
>
> In fact, even if the caller is preemptible, the kfree_rcu() code is
> not, as the krcp->lock is a raw spinlock.
>
> Calling into the page allocator is optional and avoiding it should be
> Ok, especially with the page pre-allocation support in future patches.
> Such pre-allocation would further avoid the need for a dynamically
> allocated page in the first place.
>
> Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> Reviewed-by: Uladzislau Rezki <urezki@xxxxxxxxx>
> Co-developed-by: Uladzislau Rezki <urezki@xxxxxxxxx>
> Signed-off-by: Uladzislau Rezki <urezki@xxxxxxxxx>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@xxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> ---
> kernel/rcu/tree.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 64592b4..dbdd509 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3184,6 +3184,18 @@ kfree_call_rcu_add_ptr_to_bulk(struct kfree_rcu_cpu *krcp,
>  		if (!bnode) {
>  			WARN_ON_ONCE(sizeof(struct kfree_rcu_bulk_data) > PAGE_SIZE);
>
> +			/*
> +			 * To keep this path working on raw non-preemptible
> +			 * sections, prevent the optional entry into the
> +			 * allocator as it uses sleeping locks. In fact, even
> +			 * if the caller of kfree_rcu() is preemptible, this
> +			 * path still is not, as krcp->lock is a raw spinlock.
> +			 * With additional page pre-allocation in the works,
> +			 * hitting this return is going to be much less likely.
> +			 */
> +			if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +				return false;
This is not going to work together with the "wait context validator"
(CONFIG_PROVE_RAW_LOCK_NESTING). As of -rc3 it should complain about
printk(), which is why it is still disabled by default.

Assuming that gets fixed and the validator is enabled, then on
!PREEMPT_RT it will complain that you hold a raw_spinlock_t (the one
from patch 02/17) while attempting to acquire a spinlock_t in the
memory allocator.
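
To illustrate the nesting rule, here is a minimal userspace sketch of
what such a check boils down to. It is a toy model, not lockdep code:
the toy_* names, the wait_type enum and its values are made up for
illustration, and the only assumption carried over is the ordering
described above (a lock that may sleep on RT must not be taken while a
raw lock is held).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wait types, ordered from most to least restrictive. */
enum wait_type {
	WAIT_SPIN = 1,		/* models raw_spinlock_t: never sleeps */
	WAIT_CONFIG = 2,	/* models spinlock_t: sleeps on PREEMPT_RT */
	WAIT_SLEEP = 3,		/* models mutex-like locks: may always sleep */
};

struct toy_lock {
	const char *name;
	enum wait_type type;
};

#define TOY_MAX_HELD 8

/* Stack of locks held by the (single) modelled context. */
struct toy_ctx {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t depth;
};

/*
 * Returns true and records the lock if the acquisition is valid, false
 * if 'next' is less restrictive than any lock that is already held.
 */
bool toy_acquire(struct toy_ctx *ctx, const struct toy_lock *next)
{
	size_t i;

	for (i = 0; i < ctx->depth; i++) {
		if (next->type > ctx->held[i]->type)
			return false;	/* invalid wait context */
	}
	if (ctx->depth == TOY_MAX_HELD)
		return false;		/* toy limit, not a nesting error */

	ctx->held[ctx->depth++] = next;
	return true;
}

/* Drops the most recently acquired lock. */
void toy_release(struct toy_ctx *ctx)
{
	assert(ctx->depth > 0);
	ctx->depth--;
}

/*
 * Models the path under discussion: a raw lock standing in for
 * krcp->lock is taken first, then a spinlock_t-like lock standing in
 * for the one inside the memory allocator.
 */
bool toy_kfree_rcu_path_is_valid(void)
{
	const struct toy_lock krcp_lock = { "krcp->lock", WAIT_SPIN };
	const struct toy_lock alloc_lock = { "allocator lock", WAIT_CONFIG };
	struct toy_ctx ctx = { .depth = 0 };

	if (!toy_acquire(&ctx, &krcp_lock))
		return false;
	return toy_acquire(&ctx, &alloc_lock);
}
```

In this model toy_kfree_rcu_path_is_valid() returns false regardless of
any PREEMPT_RT setting, because the check looks only at the lock types.
That is the point: returning early under IS_ENABLED(CONFIG_PREEMPT_RT)
does not remove the raw_spinlock_t -> spinlock_t nesting on !PREEMPT_RT.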
>  			bnode = (struct kfree_rcu_bulk_data *)
>  				__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
>  		}
Sebastian