Re: [PATCH 00/14] Fix wrong %pF and %pS printk format specifier usages

From: Petr Mladek
Date: Tue Sep 12 2017 - 07:18:44 EST


On Fri 2017-09-08 22:49:51, Helge Deller wrote:
> On 08.09.2017 08:18, Sergey Senozhatsky wrote:
> > On (09/07/17 16:05), Luck, Tony wrote:
> > [..]
> >>>> if (not_a_function_descriptor(ptr))
> >>>> return ptr;
> >>>
> >>> I'm not sure if it's possible on ia64/ppc64/parisc64
> >>> to reliably detect if it's a function descriptor or not.
> >>
> >> Agreed. I don't know how to write this test (without changing the compiler to
> >> put the pointers in a separate section ... and then changing the module loader
> >> to keep a list of all these sections).
> >
> > let me try one more time :)
> >
> > so below is a number of assumptions, let me know if anything is wrong
> > there.... and let's try to fix the "wrong bits" ;)
> >
> >
> > RFC
> >
> >
> > 1) function descriptor table is in .data, not in .text
> > correct?
> >
> > 2) symbol resolution consists of 3 steps:
> >
> > a) we check if this is a kernel symbol and resolve it if so
> > b) we check if the addr belongs to any module and resolve the addr
> > if so
> > c) we check if the addr is bpf and resolve it if so. let's skip this part.
> >
> >
> > so, for (a) we probably can do something like below. can't we?
> > // not tested, as usual.
> >
> >
> > so there are probably some broken parts there. like...
> > I don't know. something.
> >
> > so - what is broken, and how can we fix/tweak it? help me out.
>
> Sergey, I'm sure there is a way how you can get it somehow to work the way
> you describe above, but even then nobody can guarantee you that it
> will work in 100% of the cases.

It seems that dereferencing an invalid function descriptor is rather
safe because probe_kernel_address() prevents crashes.

The question is whether the autodetection could give us wrong results.
The following possibilities come to my mind:

First, the variable used to store the function descriptor might be
on the stack and uninitialized. Then there is a non-trivial chance
that the garbage on the stack is a real return address inside an
existing function. The autodetection would then print a plausible
symbol name and hide the bug.

Second, I wonder if the address of the function descriptor
might be in kallsyms as well. Note that global variables
are in kallsyms too. Then we would always print
the name of this variable instead of the function name.
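
To make the two cases concrete, here is a small userspace sketch. It is
not kernel code and it is not tested against any real architecture. The
symbol table, the addresses, safe_read() (a stand-in for what
probe_kernel_address() gives us), and both resolve_*() helpers are
hypothetical. It also assumes that "autodetection" means "try a symbol
lookup on the raw pointer first, dereference only if that fails":

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical symbol table entry, loosely modelled on kallsyms. */
struct sym {
	uintptr_t start;
	uintptr_t size;
	const char *name;
};

/* Simulated layout: two functions in "text", one descriptor in "data". */
#define DATA_BASE 0x8000u

static const struct sym symtab[] = {
	{ 0x1000, 0x100, "foo" },
	{ 0x1100, 0x100, "bar" },
	{ DATA_BASE, 2 * sizeof(uintptr_t), "foo_desc" },	/* data symbol */
};

/* Simulated data memory: a descriptor { entry, gp } that points to foo. */
static const uintptr_t data_mem[2] = { 0x1000, 0 };

/* Stand-in for a fault-safe read: fails instead of crashing. */
int safe_read(uintptr_t addr, uintptr_t *out)
{
	if (addr < DATA_BASE || addr >= DATA_BASE + sizeof(data_mem))
		return -1;
	if ((addr - DATA_BASE) % sizeof(uintptr_t))
		return -1;
	*out = data_mem[(addr - DATA_BASE) / sizeof(uintptr_t)];
	return 0;
}

/* Return the name of the symbol that contains addr, or NULL. */
const char *lookup(uintptr_t addr)
{
	size_t i;

	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
		if (addr >= symtab[i].start &&
		    addr - symtab[i].start < symtab[i].size)
			return symtab[i].name;
	}
	return NULL;
}

/* Explicit %pF-like behaviour: always treat ptr as a descriptor. */
const char *resolve_desc(uintptr_t ptr)
{
	uintptr_t entry;

	if (safe_read(ptr, &entry))
		return NULL;	/* caller would print the raw pointer */
	return lookup(entry);
}

/* Assumed autodetection: use the raw pointer if it resolves at all. */
const char *resolve_auto(uintptr_t ptr)
{
	const char *name = lookup(ptr);

	return name ? name : resolve_desc(ptr);
}
```

With this layout, resolve_auto(0x1150) returns "bar" for a garbage value
that happens to be a return address, while resolve_desc(0x1150) fails
and the raw pointer stays visible. And resolve_auto(0x8000) returns
"foo_desc" where resolve_desc(0x8000) returns "foo". Restricting the
first lookup to text symbols would avoid the second case in this model,
but not the first one.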

I do not have a strong opinion here. On one hand, it is clear
that %pS and %pF are often misused. But I am not sure if the above
possible problems are acceptable.


> It's somehow like "we have %lu and %c specifiers, and it's basically
> the same, so let's try to figure out at runtime which one should be
> used based on analysis of what was given as argument".
> It may work somehow, but not always.

I am not sure if I am missing something. But the different output of
%lu and %c should be easy to distinguish. Also the difference
is the same on all architectures and should be well known.
This is not true for the %pS vs. %pF specifiers.


> What about the idea of a %luS specifier (or something other) ?

I am not a big fan of this. IMHO, the relation between a pointer
and a symbol name makes more sense than a relation between an
unsigned long and a symbol name. IMHO, this would just
add even more confusion.

Best Regards,
Petr


PS: I wonder if improved documentation and fixing all occurrences
might be enough to reduce this mistake. I guess that most of them
were caused by copying the same pattern from already broken code.