Re: [PATCH 1/2] kmemcheck v3

From: Andi Kleen
Date: Fri Feb 08 2008 - 08:13:49 EST


On Fri, Feb 08, 2008 at 01:59:57PM +0100, Vegard Nossum wrote:
> On 2/8/08, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> > On Fri, Feb 08, 2008 at 01:18:37PM +0100, Vegard Nossum wrote:
> > > On 2/8/08, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> > > > Your assumption that only the string instructions can take
> > > > multiple page faults seems a little dangerous too.
> > >
> > > Yes, this is true. I cannot guarantee that there are no other
> > > instructions that could access more than one memory location but only
> > > take one page fault. However, since the kernel does boot, we at least
> > > know that these instructions are not very frequently used. (If you
> > > know of any other instructions we might be missing, I'll be happy to
> > > know about it!)
> >
> > Pretty much all in the right circumstances.
> >
> > e.g. consider a segment reload in tracked memory.
> >
> > Also there are various instructions which do all kinds of complicated
> > things internally; like IRET or INT: often with many memory accesses.
> > Just page through a instruction manual and look at the pseudo code
> > describing what the various instructions do.
>
> Yes, this is true. Then our task is to make sure that this memory is
> never allocated from tracked caches. We do have some changes in this
> area, for instance we never track task structs. Keep in mind that only
> slab objects are tracked currently, so things like stacks never catch
> page faults. I am not sure if this is exactly what you had in mind,
> but I don't know other kernel code very well enough to come up with
> perhaps more relevant examples :-)

Given that you don't seem to handle networking yet I wonder how
many cases you really tested so far.

> For now, I am simply assuming that we never load task segments, GDTs,
> LDTs, or paging structures from tracked memory (e.g. regular
> kmalloc()).

There's the stack for once too. And some others I'm probably
forgetting.

> currently with surprisingly little amounts of code.

You only need this for the size and to detect string instructions, right?
The address should be delivered with the page fault and the r/w status too.

I think for string instructions you could probably detect it with
a little state machine that detects multiple page faults on the same
instruction.

Or just prevent the compiler/the code from generating string instructions.
There should not be that many once you stop gcc from generating inline
string ops (-Os is probably enough for that)

For size you could in theory use VT which has special support in the CPU
to help with parsing this, although that would limit it to modern CPUs
and would require quite some infrastructure.

>
> But AFAIK the format for MMX and SSE is different from the "regular"
> instructions, and so I don't know how to parse them. But this is
> something we can look at later.

I'm pretty sure there are other special instructions that you will
eventually run into. Intel (and sometimes AMD) add new ones each CPU
generation :) Reimplementing instruction decoding on x86 is not
an easy job. Anyways if you really want to do it I would rather
recommend to use one of the existing codes like the x86-emulate
that is in KVM, but even that one is far from complete. Trying
to avoid it would probably better.

-Andi

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/