Re: [PATCH] bpf: don't fail kmalloc while releasing raw_tp

From: Steven Rostedt
Date: Mon Nov 16 2020 - 17:10:47 EST


On Mon, 16 Nov 2020 16:34:41 -0500 (EST)
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:

> ----- On Nov 16, 2020, at 4:02 PM, rostedt rostedt@xxxxxxxxxxx wrote:
>
> > On Mon, 16 Nov 2020 15:44:37 -0500
> > Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> >> If you use a stub function, it shouldn't affect anything. And the worse
> >> that would happen is that you have a slight overhead of calling the stub
> >> until you can properly remove the callback.
> >
> > Something like this:
> >
> > (haven't compiled it yet, I'm about to though).

Still need more accounting to work on. Almost finished though. ;-)

> >
> > -- Steve
> >
> > diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> > index 3f659f855074..8eab40f9d388 100644
> > --- a/kernel/tracepoint.c
> > +++ b/kernel/tracepoint.c
> > @@ -53,10 +53,16 @@ struct tp_probes {
> > struct tracepoint_func probes[];
> > };
> >
> > -static inline void *allocate_probes(int count)
> > +/* Called in removal of a func but failed to allocate a new tp_funcs */
> > +static void tp_stub_func(void)
>
> I'm still not sure whether it's OK to call a (void) function with arguments.

Actually, I've done it. The thing is, what can actually happen? A void
function that simply returns should not do anything. If anything, the only
waste is that the caller would save more registers than necessary.

I can't think of anything that can actually happen, but perhaps there is. I
wouldn't want to make a stub function for every trace point (it wouldn't be
hard to do).

But perhaps we should ask the compiler people to make sure.

>
> > +{
> > + return;
> > +}
> > +
> > +static inline void *allocate_probes(int count, gfp_t extra_flags)
> > {
> > struct tp_probes *p = kmalloc(struct_size(p, probes, count),
> > - GFP_KERNEL);
> > + GFP_KERNEL | extra_flags);
> > return p == NULL ? NULL : p->probes;
> > }
> >
> > @@ -150,7 +156,7 @@ func_add(struct tracepoint_func **funcs, struct
> > tracepoint_func *tp_func,
> > }
> > }
> > /* + 2 : one for new probe, one for NULL func */
> > - new = allocate_probes(nr_probes + 2);
> > + new = allocate_probes(nr_probes + 2, 0);
> > if (new == NULL)
> > return ERR_PTR(-ENOMEM);
> > if (old) {
> > @@ -188,8 +194,9 @@ static void *func_remove(struct tracepoint_func **funcs,
> > /* (N -> M), (N > 1, M >= 0) probes */
> > if (tp_func->func) {
> > for (nr_probes = 0; old[nr_probes].func; nr_probes++) {
> > - if (old[nr_probes].func == tp_func->func &&
> > - old[nr_probes].data == tp_func->data)
> > + if ((old[nr_probes].func == tp_func->func &&
> > + old[nr_probes].data == tp_func->data) ||
> > + old[nr_probes].func == tp_stub_func)
> > nr_del++;
> > }
> > }
> > @@ -207,15 +214,20 @@ static void *func_remove(struct tracepoint_func **funcs,
> > int j = 0;
> > /* N -> M, (N > 1, M > 0) */
> > /* + 1 for NULL */
> > - new = allocate_probes(nr_probes - nr_del + 1);
> > - if (new == NULL)
> > - return ERR_PTR(-ENOMEM);
> > - for (i = 0; old[i].func; i++)
> > - if (old[i].func != tp_func->func
> > - || old[i].data != tp_func->data)
> > - new[j++] = old[i];
> > - new[nr_probes - nr_del].func = NULL;
> > - *funcs = new;
> > + new = allocate_probes(nr_probes - nr_del + 1, __GFP_NOFAIL);
> > + if (new) {
> > + for (i = 0; old[i].func; i++)
> > + if (old[i].func != tp_func->func
> > + || old[i].data != tp_func->data)
>
> as you point out in your reply, skip tp_stub_func here too.
>
> > + new[j++] = old[i];
> > + new[nr_probes - nr_del].func = NULL;
> > + } else {
> > + for (i = 0; old[i].func; i++)
> > + if (old[i].func == tp_func->func &&
> > + old[i].data == tp_func->data)
> > + old[i].func = tp_stub_func;
>
> I think you'll want a WRITE_ONCE(old[i].func, tp_stub_func) here, matched
> with a READ_ONCE() in __DO_TRACE. This introduces a new situation where the
> func pointer can be updated and loaded concurrently.

I thought about this a little, and then only thing we really should worry
about is synchronizing with those that unregister. Because when we make
this update, there are now two states. the __DO_TRACE either reads the
original func or the stub. And either should be OK to call.

Only the func gets updated and not the data. So what exactly are we worried
about here?

>
> > + }
> > + *funcs = old;
>
> The line above seems wrong for the successful allocate_probe case. You will likely
> want *funcs = new on successful allocation, and *funcs = old for the failure case.

Yeah, it crashed because of this ;-)

Like I said, untested!

-- Steve