Re: [PATCH v5 1/3] Provide in-kernel headers to make extending kernel easier

From: Joel Fernandes
Date: Mon Apr 15 2019 - 10:05:36 EST


On Sun, Apr 14, 2019 at 12:38:34PM -0700, Olof Johansson wrote:
> On Wed, Apr 10, 2019 at 8:15 PM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Wed, Apr 10, 2019 at 09:34:49AM -0700, Olof Johansson wrote:
> > > On Wed, Apr 10, 2019 at 8:51 AM Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Apr 10, 2019 at 11:07 AM Olof Johansson <olof@xxxxxxxxx> wrote:
> > > > [snip]
> > > > > > > Wouldn't it be more convenient to provide it in a standardized format
> > > > > > > such that you won't have to take an additional step, and always have
> > > > > > > This is that form IMO.
> > ...
> > > Compared to:
> > > - Extract tarball
> > > - Build and load
> > > - Remove file tree from filesystem
> >
> > I think there are too many assumptions in this thread in regard to what
> > is more convenient for user space.
> > I think bcc should try to avoid extracting tarball into file system.
> > For example libbcc can uncompress kheader.tar.xz into virtual file system
> > of clang front-end. It's more or less std::map<string, string>
> > with key=path, value=content of the file. Access to such in-memory
> > 'files' is obviously faster than doing open/read syscalls.
>
> I think performance is a red herring, especially since you have to
> uncompress it on every compiler invocation. With this you'd need to
> read/touch/write _all_ header files, not just the one your current
> compiler invocation will use.
>
> In the grand scheme of things, open/mmap syscalls wouldn't necessarily
> be slower.

Agreed.
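
For illustration only, a minimal Python sketch of the in-memory variant
discussed above (a path -> content map). It is not bcc or libbcc code; the
function name load_headers_in_memory is hypothetical, and the archive is
assumed to be an xz-compressed tarball. It shows that every regular file in
the archive gets decompressed and copied on each call, whether or not the
compiler later includes it:

```python
import tarfile


def load_headers_in_memory(archive_path):
    """Return a {path: content} dict for every regular file in the archive.

    Hypothetical helper for illustration; assumes a .tar.xz archive.
    """
    files = {}
    with tarfile.open(archive_path, mode="r:xz") as tar:
        for member in tar:
            # Skip directories, symlinks and other non-regular entries.
            if not member.isfile():
                continue
            # Every file is decompressed here, even if it is never used.
            files[member.name] = tar.extractfile(member).read()
    return files
```

A caller would run this once per compiler invocation and hand the resulting
map to the front-end, so the decompression cost is paid every time.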

> > bcc already uses this approach for some bcc internal 'files' that
> > it passes to clang during compilation.
> > All of /proc/kheaders.tar.xz files can be passed the same way
> > without extracting them into real file system.
>
> This is now a circular argument. Joel was stating that the plain text
> headers took up too much memory, but now it's preferred to create such
> filesystem in userspace memory on *every compiler invocation*?
> That's... not better. And definitely worse if you want to compile in
> parallel.

The BCC patch does not extract purely into memory; it uses a temporary
directory: https://github.com/iovisor/bcc/pull/2312
I believe this is a good approach.
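
As a rough sketch of the temporary-directory idea (this is not the code
from the PR above; the helper name extracted_headers is hypothetical and
the archive is assumed to be an xz-compressed tarball), the extract, build,
and remove steps could look like this in Python:

```python
import contextlib
import tarfile
import tempfile


@contextlib.contextmanager
def extracted_headers(archive_path):
    """Extract the archive into a temporary directory and yield its path.

    Hypothetical helper for illustration. The directory and everything in
    it are removed when the with-block exits, even on error.
    """
    with tempfile.TemporaryDirectory(prefix="kheaders-") as tmpdir:
        with tarfile.open(archive_path, mode="r:xz") as tar:
            tar.extractall(tmpdir)
        yield tmpdir
```

A tool would wrap its compile step in "with extracted_headers(path) as
root:" and pass root as an include directory; cleanup then needs no
separate step.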

> From my perspective, this is where we're at:
>
> This patch seems to have been met with a lot of responses in the tone
> of "this is not an appealing solution". Meanwhile, some of the
> suggested alternative solutions have not worked out, and we are now at
> a point where there's less interest in exploring alternatives and
> arguments to merge as-is with only minor adjustments.
>
> I understand the desire to solve this. It'd be really convenient to
> have a way to runtime build against the same structure layouts that
> the kernel was built with. But I haven't heard anyone say that they

About structure layouts, I'm assuming you mean compiler-generated debug info.
That does not work for eBPF tools, as was mentioned previously in these threads:
https://lkml.org/lkml/2019/3/11/1358
https://lkml.org/lkml/2019/3/11/1363

> *like* the solution proposed, and I haven't seen many of those
> expressing concerns being converted to support it.

IMHO there have been a good number of people on both sides of the argument.
If the opposition were as strong as you think, I personally would not have
wanted this merged, tbh. We do want a solution that is clean and works, and
I think this is a candidate.

[snip]
> I'd be a *lot* less hesitant if this went into debugfs or another
> location than /proc, which is one of the most regression-sensitive
> interfaces we have in the kernel.

The solution should be regression-sensitive imho: we don't want the tracing
tools to break, because people use them. Their usability and robustness are
important, which is what prompted these patches. For some time we have been
hosting downloadable headers for popular kernels (such as LTS versions) to
work around this issue, but this is a maintenance burden, does not scale,
and is not robust (someone boots a custom kernel, or we forget to update
headers for a future LTS release, and the tools break).

thanks,

- Joel