Re: [RFC] fix kallsyms to allow discrimination of local symbols
From: James Bottomley
Date: Mon Jul 21 2008 - 23:53:23 EST
On Mon, 2008-07-21 at 21:44 -0400, Frank Ch. Eigler wrote:
> James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> writes:
>
> > [...] Fix all of this by prefixing local symbols with the actual C
> > file name they occur in separated by '|' (I had to use '|' since ':'
> > is already in use for module prefixes in kallsyms lookups. [...]
> > Comments?
>
> Can we take some time to review how we got here?
>
>
> - You disprefer systemtap's use of an established, non-deprecated API
> for placing kernel probes. (We calculate addresses by a mixture of
> elf-analysis and runtime user-space lookup means. That's partly
> since kallsyms_lookup was unexported over our objections.) There is
> nothing outright broken (e.g. incorrect numbers) with what systemtap
> has been doing for years.
You mean embedding half a megabyte of symbols simply so you can avoid
the inconvenience of using a kernel API? Yes, I think it's ...
suboptimal.
> - You argue that symbols+offset kprobing is better. We can see that,
> in some sense, but ...
>
> - I explain that we are used to final address calculating, as we'll
> have to do that regardless for user-space probes. Plus we need to
> work with kernels that predate the symbol+offset kprobe api
> extension. So this change would not simplify systemtap in any way.
> You do not respond.
There is no current userspace infrastructure, since utrace still isn't
in the kernel, so you're predicating this argument on an event which
hasn't happened.
Even assuming utrace is accepted, embedding the symbol table of every
user space process in the probes is still daft. It's this constant
assertion that "it must be done my way" that's causing such a drag on
the open source process. For instance, the obvious way to me of doing
this would be to map the user space stack into the systemtap runtime and
unwind it from there instead of vectoring it into the kernel.
> - I offer _stext+offset (for the kernel) and (.text*)+offset (for
> modules) kprobes: basically to use the "better" symbol+offset
> kprobes api, but use the same single reference addresses we already
> do, and leaving just the final addition to the kernel. You do not
> respond materially.
I thought this and subsequent emails addressed the points pretty well:
http://marc.info/?l=linux-kernel&m=121632572409118
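For anyone following along, the difference between the two
symbol+offset forms under discussion can be sketched with a toy symbol
table. This is only an illustration in Python: the addresses, the
table, and the helper names to_symbol_offset() and to_address() are
made up for the example and are not kernel or systemtap code.

```python
import bisect

# Hypothetical symbol table: (address, name), sorted by address.
SYMBOLS = [
    (0xC0100000, "_stext"),
    (0xC0100400, "do_fork"),
    (0xC0100900, "sys_open"),
]

def to_symbol_offset(addr):
    """Return (nearest preceding symbol, offset) for an absolute address."""
    addrs = [a for a, _ in SYMBOLS]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        raise ValueError("address lies below the first symbol")
    base, name = SYMBOLS[i]
    return name, addr - base

def to_address(name, offset):
    """Resolve symbol+offset back to an absolute address."""
    for base, sym in SYMBOLS:
        if sym == name:
            return base + offset
    raise KeyError(name)

probe = 0xC0100910
# Nearest-symbol form: small offset from the enclosing function.
print(to_symbol_offset(probe))
# Single-reference form: large offset from _stext.
print(("_stext", probe - SYMBOLS[0][0]))
```

Both forms resolve to the same address in this toy table; they differ
in which symbol the offset is tied to.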
> - You argue that it cannot be just any symbol+offset ... but the actual
> nearest symbol+offset. But that doesn't work for local symbols. So
> you fix that to the nearest globally visible symbol+offset. But this
> requires:
> - yet more extra work and code from systemtap
I'm afraid that's how open source development works ... you iterate to
find the best solution.
> - extension to the kernel build system, and kallsyms runtime data to
> fix the current local-symbol-ambiguity problem
Finding weaknesses in APIs and fixing them is what it's all about.
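To make the local-symbol ambiguity concrete, here is a rough Python
sketch of a name-to-address table built with and without the file
prefix. The entries, file names and the build_table() helper are all
hypothetical, and the exact form of the file name in the prefix is an
assumption; only the '|' separator comes from the proposal.

```python
# Hypothetical entries: (address, source file, symbol, is_local)
ENTRIES = [
    (0xC0200000, "foo.c", "init_once", True),
    (0xC0300000, "bar.c", "init_once", True),
    (0xC0400000, "fork.c", "do_fork", False),
]

def build_table(entries, prefix_locals):
    """Map each lookup key to the list of addresses that share it."""
    table = {}
    for addr, src, sym, is_local in entries:
        # Only local (static) symbols get the "file|" prefix.
        key = f"{src}|{sym}" if (prefix_locals and is_local) else sym
        table.setdefault(key, []).append(addr)
    return table

plain = build_table(ENTRIES, prefix_locals=False)
prefixed = build_table(ENTRIES, prefix_locals=True)

# Without the prefix, one name maps to two addresses.
print(len(plain["init_once"]))
# With the prefix, each local symbol has its own key.
print(sorted(prefixed))
```

Global symbols keep their bare names in this sketch, so existing
lookups for them would be unaffected.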
> - storage of all that new file name data in permanent unswappable
> kernel data (>>100kB, if done by simply prefixing local symbol names
> with file names).
I'd check my facts before making assertions. The kernel symbol table is
stored in a compressed form that actually eliminates most of these
repetitions.
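As a rough illustration of why a repeated prefix costs little once
frequent substrings are replaced by short tokens, here is a toy size
comparison in Python. It is not the kernel's actual compression
algorithm; the symbol names and the raw_size() and tokenized_size()
helpers are invented for the example.

```python
# Hypothetical local symbols that all share one file-name prefix.
prefix = "ext3_inode.c|"
names = [f"{prefix}helper_{i}" for i in range(100)]

def raw_size(syms):
    """Bytes needed if every name is stored in full, plus a terminator."""
    return sum(len(s) + 1 for s in syms)

def tokenized_size(syms, token):
    """Bytes needed if `token` is stored once and replaced by one byte."""
    body = sum(len(s.replace(token, "\x01")) + 1 for s in syms)
    table = len(token) + 1  # the token itself, stored once
    return body + table

print(raw_size(names), tokenized_size(names, prefix))
```

In this toy model each repetition of the prefix shrinks to one byte,
so the added cost is roughly one byte per local symbol plus a single
copy of the file name.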
> - possible further complications related to filename string matching
Any substantiation of that?
> - You have yet to invent a scheme to allow offloading *data* address
> calculations to the kernel. Without that (and perhaps more),
> systemtap will *still* have to fetch the same base _stext etc.
> addresses at run time that it currently does -- even if it did not
> use them to compute kprobes addresses.
That would be because I haven't actually started looking at this one
yet. Of course, that would make it a great starting point for others
who wished to help.
> In total, this path would end up with both systemtap and the kernel
> more complex, larger and a bit slower too.
Really? I count the reduction of the probe modules from 500kB to 50kB a
worthwhile saving. I don't even see where anything became larger.
> Does that still seem an
> acceptable cost, just to get systemtap to change its preferred kprobes
> api?
James