Re: [PATCH 04/10] perf/uprobe: RCU-ify find_uprobe()

From: Oleg Nesterov
Date: Mon Jul 08 2024 - 12:37:38 EST


I hate to say this again, but I'll try to read this series later ;)

But let me ask...

On 07/08, Peter Zijlstra wrote:
>
> +static void uprobe_free_rcu(struct rcu_head *rcu)
> +{
> + struct uprobe *uprobe = container_of(rcu, struct uprobe, rcu);
> + kfree(uprobe);
> +}
> +
> static void put_uprobe(struct uprobe *uprobe)
> {
> if (refcount_dec_and_test(&uprobe->ref)) {
> @@ -604,7 +612,7 @@ static void put_uprobe(struct uprobe *up
> mutex_lock(&delayed_uprobe_lock);
> delayed_uprobe_remove(uprobe, NULL);
> mutex_unlock(&delayed_uprobe_lock);
> - kfree(uprobe);
> + call_rcu(&uprobe->rcu, uprobe_free_rcu);

kfree_rcu() ?


> static struct uprobe *find_uprobe(struct inode *inode, loff_t offset)
> {
> - struct uprobe *uprobe;
> + unsigned int seq;
>
> - read_lock(&uprobes_treelock);
> - uprobe = __find_uprobe(inode, offset);
> - read_unlock(&uprobes_treelock);
> + guard(rcu)();
>
> - return uprobe;
> + do {
> + seq = read_seqcount_begin(&uprobes_seqcount);
> + struct uprobe *uprobe = __find_uprobe(inode, offset);
> + if (uprobe) {
> + /*
> + * Lockless RB-tree lookups are prone to false-negatives.
> + * If they find something, it's good.

Is it true in this case?

Suppose we have a uprobe U which has no extra refs, so uprobe_unregister()
called by the task X should remove it from uprobes_tree and kfree it.

Suppose that the task T hits the breakpoint and enters handle_swbp().

Now,

- X calls find_uprobe(), this increments U->ref from 1 to 2

register_for_each_vma() succeeds

X enters delete_uprobe()

- T calls find_active_uprobe() -> find_uprobe()

__read_seqcount_begin() returns an even number

__find_uprobe() -> rb_find_rcu() succeeds

- X continues and returns from delete_uprobe(), U->ref == 1

then it does the final uprobe_unregister()->put_uprobe(U),
refcount_dec_and_test() succeeds, X calls call_rcu(uprobe_free_rcu).

- T does get_uprobe() which changes U->ref from 0 to 1, __find_uprobe()
returns, find_uprobe() doesn't check read_seqcount_retry().

- T returns from find_active_uprobe() and plays with the "soon to be
freed" U.
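
To make the interleaving above concrete, here is a minimal single-threaded
user-space replay of it. This is only a sketch: struct fake_uprobe,
fake_get(), fake_put() and replay_race() are hypothetical stand-ins, not
kernel code. It models nothing but the refcount and the moment the free
gets queued, and it assumes get_uprobe() is an unconditional increment, as
in the scenario.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct uprobe: only the refcount matters here. */
struct fake_uprobe {
	atomic_int ref;
	bool free_queued;	/* set where the last put would call call_rcu() */
};

/* Unconditional increment, modelling get_uprobe() in the scenario. */
static void fake_get(struct fake_uprobe *u)
{
	atomic_fetch_add(&u->ref, 1);
}

/* Drop one reference; the final put queues the (simulated) RCU free. */
static void fake_put(struct fake_uprobe *u)
{
	int old = atomic_fetch_sub(&u->ref, 1);

	assert(old > 0);
	if (old == 1)
		u->free_queued = true;
}

/*
 * Replays the steps in order. Returns true if T ends up holding a
 * reference to an object whose free has already been queued.
 */
static bool replay_race(void)
{
	struct fake_uprobe u;
	struct fake_uprobe *found;

	atomic_init(&u.ref, 1);		/* U has no extra refs */
	u.free_queued = false;

	fake_get(&u);			/* X: find_uprobe(), 1 -> 2 */
	found = &u;			/* T: lockless lookup finds U, no ref yet */
	fake_put(&u);			/* X: leaves delete_uprobe(), 2 -> 1 */
	fake_put(&u);			/* X: final put_uprobe(), 1 -> 0, free queued */
	fake_get(found);		/* T: get_uprobe(), 0 -> 1 */

	return u.free_queued && atomic_load(&u.ref) == 1;
}
```

Calling replay_race() walks through the steps once; the point is that the
increment in the last step has no way to notice that the count already
dropped to zero.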

Looks like __find_uprobe() needs to use refcount_inc_not_zero() and return
NULL if that fails, but I am not sure this is the only problem...

No?
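
For illustration, the "increment unless zero" idea can be sketched in
user-space C11 atomics as below. This is not the kernel's refcount_t
implementation; struct probe_obj, obj_inc_not_zero() and obj_try_get() are
hypothetical names, and the sketch only shows why a lookup built on it can
return NULL for an object whose last reference is already gone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object carrying only a reference count. */
struct probe_obj {
	atomic_int ref;
};

/*
 * Take a reference only if the count is still non-zero.
 * Returns false, leaving the count untouched, once it has reached zero.
 */
static bool obj_inc_not_zero(atomic_int *ref)
{
	int old;

	assert(ref != NULL);
	old = atomic_load(ref);
	while (old != 0) {
		/* On failure the CAS reloads 'old' with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Lookup-side helper: hand the object out only if a ref could be taken. */
static struct probe_obj *obj_try_get(struct probe_obj *obj)
{
	if (obj == NULL)
		return NULL;
	return obj_inc_not_zero(&obj->ref) ? obj : NULL;
}
```

With this shape, an object found in the tree after its count hit zero is
reported to the caller as "not found" rather than being resurrected.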

Oleg.