Re: disabling group leader perf_event

From: Ingo Molnar
Date: Tue Sep 07 2010 - 18:28:00 EST



* Pekka Enberg <penberg@xxxxxxxxxxxxxx> wrote:

> Hi Ingo,
>
> On 9/7/10 7:03 AM, Ingo Molnar wrote:
> >But i'd prefer C code really, as it's really 'abstract data' in the most
> >generic sense. That's why the trace filter engine started with a subset
> >of C.
>
> I think it sounds better in principle than what it will be in
> practice. The OpenGL shading language uses the same kind of model where
> you use an API call to upload C-like code that gets parsed. That of
> course has the unfortunate side-effect that compilation error
> reporting isn't all that user-friendly because you have to query for
> errors separately.

Not really. It's not a binary choice. The very same checking code can be
used by tools and by the kernel too.

The kernel does the checking not because we want to do development of
this code by using the kernel as an editor/compiler, but because we want
to allow unprivileged tasks to pass in stuff, hence we must verify.

Error reporting by the kernel is a rare slowpath and should be pretty
straightforward and minimalistic: it only needs to return the position
of the parsing error.
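
To illustrate the "return the position of the parsing error" idea, here
is a minimal sketch. It is not the actual trace filter engine code: the
function name filter_check() and the tiny grammar (identifier, comparison
operator, decimal constant, joined by && or ||) are assumptions made up
for this example. The same routine could be compiled into a user-space
tool and into the kernel-side check.

```c
#include <ctype.h>
#include <stddef.h>

/* Skip spaces and tabs; returns the new offset. */
static size_t skip_ws(const char *s, size_t i)
{
	while (s[i] == ' ' || s[i] == '\t')
		i++;
	return i;
}

/*
 * Hypothetical checker for a toy filter grammar:
 *   expr := cmp (("&&" | "||") cmp)*
 *   cmp  := IDENT ("==" | "!=" | "<" | "<=" | ">" | ">=") DECIMAL
 * Returns -1 if the string is valid, otherwise the zero-based offset
 * of the first character that could not be parsed.
 */
long filter_check(const char *s)
{
	size_t i = 0;

	for (;;) {
		size_t start;

		/* identifier */
		i = skip_ws(s, i);
		if (!isalpha((unsigned char)s[i]) && s[i] != '_')
			return (long)i;
		while (isalnum((unsigned char)s[i]) || s[i] == '_')
			i++;

		/* comparison operator */
		i = skip_ws(s, i);
		if ((s[i] == '=' || s[i] == '!') && s[i + 1] == '=')
			i += 2;
		else if (s[i] == '<' || s[i] == '>')
			i += (s[i + 1] == '=') ? 2 : 1;
		else
			return (long)i;

		/* decimal constant */
		i = skip_ws(s, i);
		start = i;
		while (isdigit((unsigned char)s[i]))
			i++;
		if (i == start)
			return (long)i;

		/* either end of input or a logical connective */
		i = skip_ws(s, i);
		if (s[i] == '\0')
			return -1;
		if ((s[i] == '&' && s[i + 1] == '&') ||
		    (s[i] == '|' && s[i + 1] == '|'))
			i += 2;
		else
			return (long)i;
	}
}
```

With this sketch, a caller that gets back a non-negative offset can
print the filter string with a caret under that column, which keeps the
friendly part of the error reporting entirely in user-space.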

> I think we've seen with ftrace vs. perf that it's easier to write
> rich, user-friendly interfaces in userspace than in kernel-space.
>
> >>[...] You also probably don't want to put heavy-weight compiler
> >>optimization passes in the kernel so with an intermediate form, you
> >>can do much of that in user-space.
> >
> >The question of what can and cannot be done in the kernel is overrated.
> >We sure can put a C compiler into the kernel - 10 years down the line we
> >won't understand what the fuss was all about.
>
> Yeah, I'm not saying we can't do that but it's a big chunk of code
> that can be potentially exploited.

The kernel is 10+ million lines of code that can potentially be
exploited ...

> >>As for the intermediate form, you might want to take a look at Dalvik:
> >>
> >>http://www.netmite.com/android/mydroid/dalvik/docs/dalvik-bytecode.html
> >>
> >>and probably ParrotVM bytecode too. The thing to avoid is stack-based
> >>instructions like in Java bytecode because although it's easy to write
> >>interpreters for them, it makes JIT'ing harder (which needs to convert
> >>stack-based representation to register-based) and probably doesn't
> >>lend itself well to stack-constrained kernel code.
> >
> >_If_ we pass in any sort of machine code to the kernel (which bytecode
> >really is), then we should do the right thing and pass in raw x86
> >bytecode, and verify it in the kernel.
> >
> >That way the compiler can be kept out of the kernel, and performance of
> >the thing will be phenomenal from day 1 on.
> >
> >For non-x86 in most cases we can use a simple translator that runs
> >during the verification run - or of course they could have their own
> >native 'assembly bytecode' verifier and their user-space could compile
> >to those.
>
> If you'd go for x86 as 'assembly bytecode' which ISA would you pick?
> 32-bit or 64-bit? I can see problems with both of them:

I'd use the native mode and would start with 64-bit.

> - The register set that can be encoded with 32-bit ISA is very
> limited which will force us to spill in memory.
>
> - The 64-bit ISA with REX prefixes is unnecessarily fat.
>
> - Instructions work directly on memory addresses which makes
> verification harder
>
> - The 32-bit ABI uses stack for argument passing which forces us
> to verify that operations on stack make sense.
>
> OTOH, if the ABI is that you upload _native code_ on every
> architecture, then the trade-off makes more sense to me. The
> downside is that we'd need a separate verifier for each
> architecture, though.

Correct. I still prefer the C style variant tho.
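
To give a feel for what any such verifier has to check, here is a rough
sketch for a made-up register-based instruction format. Everything in it
is hypothetical (struct insn, the opcode names, NR_REGS, prog_verify());
it is neither x86 nor any existing kernel interface. A verifier for real
native code would need the same kinds of checks on top of full
instruction decoding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy register machine. */
enum { OP_MOV_IMM, OP_ADD, OP_JMP_FWD, OP_RET, OP_MAX };

#define NR_REGS 8

struct insn {
	uint8_t op;	/* one of the OP_* values above */
	uint8_t dst;	/* destination register index */
	uint8_t src;	/* source register index */
	int32_t imm;	/* immediate, or jump distance for OP_JMP_FWD */
};

/*
 * Returns -1 if the program is accepted, otherwise the index of the
 * first rejected instruction.
 *
 * Checks performed:
 *  - the opcode is known
 *  - register indexes are in range
 *  - jumps go forward only and land inside the program, so every
 *    accepted program terminates
 *  - the last instruction is OP_RET, so execution cannot run off
 *    the end of the buffer
 */
long prog_verify(const struct insn *prog, size_t len)
{
	size_t i;

	if (len == 0)
		return 0;

	for (i = 0; i < len; i++) {
		const struct insn *in = &prog[i];

		if (in->op >= OP_MAX)
			return (long)i;
		if (in->dst >= NR_REGS || in->src >= NR_REGS)
			return (long)i;
		if (in->op == OP_JMP_FWD) {
			/* target is i + 1 + imm and must be < len */
			if (in->imm < 0 || (size_t)in->imm >= len - i - 1)
				return (long)i;
		}
	}

	if (prog[len - 1].op != OP_RET)
		return (long)(len - 1);

	return -1;
}
```

Allowing only forward jumps is a deliberate simplification in this
sketch: it makes termination trivially checkable in a single pass, at
the price of ruling out loops.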

Thanks,

Ingo