Re: [RFC][PATCH] tracing: Define "fake" struct trace_pid_list

From: Steven Rostedt
Date: Sat Oct 02 2021 - 19:58:05 EST


On Sat, 2 Oct 2021 15:39:45 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Sat, Oct 2, 2021 at 1:04 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> >
> > [ Note, this is on top of my tree in ftrace/core, but wanted to ask if
> > this is the proper "fix".
>
> Ugh, please no. This is going to be very confusing, and it's going to
> mess with anything that does things based on type (eg traditionally
> module signatures etc).

This is why I asked ;-)

>
> I'd rather you just expose the proper type, if that is what it takes.

I can. As I'm currently the only one using this, it should be fine. I
just prefer not to expose structures that shouldn't be touched, but
that's my preference, and it's not really a common habit in the kernel
anyway.

>
> > Some compilers give this error:
>
> Only some? Which ones? And what did you do to make it appear? Sounds
> like whatever change wasn't worth it.

Yes, this is what surprised me. It worked on all my machines and for
most of my tests, which are pretty much all gcc 10.X. But for one of my
tests, I compile with gcc 8.1.0 (one I pulled down from kernel.org a
while ago), and that's the one that blew up.

It has nothing to do with the config (the same config compiles fine
with gcc 10.x). And if I didn't have a test that compiled with 8.1.0, I
would never have known this was an issue.

>
> The advantage of some "opaque type" does _not_ override the
> disadvantage of then having to make up these kinds of horrific
> workarounds that actively lie to the compiler.

I felt uncomfortable with the change, and that's why I wanted to get
your opinion before having you first see it in a pull request.

>
> We have tons of structures (and occasionally single structure members)
> that we don't want people to access directly, and instead use a
> wrapper function. That doesn't mean that they can't be exposed as a
> type.
>
> > The reason is that rcu_dereference_sched() has a check that uses
> > typeof(*p) of the pointer passed to it.
>
> Sadly, we do that for a reason - we do a
>
> typeof(*p) *__local_p;
>
> to drop the address space specifiers from (or add them to) the pointer.
>
> That said, I wonder how many of them are actually needed. At least
> some of them are purely for sparse
>
> So at least some could probably just use
>
> typeof(p) __local_p;
>
> instead, which would avoid the problem with a pointer to an incomplete
> type (and keep it as a pointer to an incomplete type).
>
> So one option might be to work on the RCU accessor macros instead.

I looked at changing them, but there is one place that gets in the way:

((typeof(*p) __force __kernel *)(_________p1));

Here the type is spelled separately from the pointer, with
"__force __kernel" added before the "*". Not sure if it matters or not.

I'll do a little investigation, and see if tweaks to these RCU macros
will fix it, otherwise, I'll just move the structure back out to being
public.
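
To illustrate the typeof(p) idea quoted above, here is a minimal
userspace sketch. It is not the kernel's RCU code: struct
pid_list_demo, load_ptr() and the accessor names are all made up for
the example, and it assumes GNU C (gcc or clang) for __typeof__ and
statement expressions. It only shows that a macro which copies the
pointer with typeof(p) never needs the full structure definition, so
the caller can keep working with an incomplete type.

```c
/*
 * Sketch only: every name here is hypothetical, none of it is kernel API.
 * Needs GNU C extensions (__typeof__ and ({ ... }) statement expressions).
 */
#include <assert.h>
#include <stddef.h>

/* "Header" view: forward declaration only, the type is incomplete here. */
struct pid_list_demo;

/*
 * Copy a pointer through a local variable whose type is taken from the
 * pointer itself. The macro never dereferences p and never names the
 * pointed-to type, so it works on pointers to incomplete types.
 */
#define load_ptr(p) ({ __typeof__(p) local_p_ = (p); local_p_; })

/* Accessors: the only way users are meant to touch the structure. */
struct pid_list_demo *pid_list_demo_get(void);
int pid_list_demo_max(const struct pid_list_demo *list);

/* "User" code: compiled while the structure is still incomplete. */
static struct pid_list_demo *current_list;

int demo_user(void)
{
	current_list = pid_list_demo_get();

	/* Works even though sizeof(*current_list) is unknown here. */
	struct pid_list_demo *l = load_ptr(current_list);

	return pid_list_demo_max(l);
}

/* "Implementation" view: the full definition lives in one place. */
struct pid_list_demo {
	int pid_max;
};

static struct pid_list_demo the_list = { .pid_max = 42 };

struct pid_list_demo *pid_list_demo_get(void)
{
	return &the_list;
}

int pid_list_demo_max(const struct pid_list_demo *list)
{
	assert(list != NULL);
	return list->pid_max;
}
```

Calling demo_user() goes through load_ptr() and the accessors without
the caller ever seeing the members. Whether the real RCU macros can be
reduced to this form is a separate question, since they also have to
deal with the sparse address space annotations.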

Thanks for the feedback,

-- Steve