Re: [PATCH 5/6] nohz: support PR_DATAPLANE_STRICT mode

From: Chris Metcalf
Date: Fri May 15 2015 - 17:25:33 EST


On 05/12/2015 06:23 PM, Andy Lutomirski wrote:
> On May 13, 2015 6:06 AM, "Chris Metcalf" <cmetcalf@xxxxxxxxxx> wrote:
>> On 05/11/2015 06:28 PM, Andy Lutomirski wrote:
>>> On Mon, May 11, 2015 at 12:13 PM, Chris Metcalf <cmetcalf@xxxxxxxxxx> wrote:
>>>> In this case, killing the task is appropriate, since that's exactly
>>>> the semantics that have been asked for - it's like on architectures
>>>> that don't natively support unaligned accesses, but fake it relatively
>>>> slowly in the kernel, and in development you just say "give me a
>>>> SIGBUS when that happens" and in production you might say
>>>> "fix it up and let's try to keep going".
>>> I think more control is needed. I also think that, if we go this
>>> route, we should distinguish syscalls, synchronous non-syscall
>>> entries, and asynchronous non-syscall entries. They're quite
>>> different.
>>
>> I don't think it's necessary to distinguish the types. As long as we
>> have a PC pointing to the instruction that triggered the problem,
>> we can see if it's a system call instruction, a memory write that
>> caused a page fault, a trap instruction, etc.
> Not true. PC right after a syscall insn could be any type of kernel
> entry, and you can't even reliably tell whether the syscall insn was
> executed or, on x86, whether it was a syscall at all. (x86 insns
> can't be reliably decoded backwards.)
>
> PC pointing at a load could be a page fault or an IPI.

All that we are trying to do with this API, though, is distinguish
synchronous faults. So IPIs, etc., should not be happening
(they would be bugs), and hopefully we are mostly just
distinguishing different types of synchronous program entries.
That said, I did add a si_info flag to differentiate syscalls from other
synchronous entries, and I'm open to looking at more such if
it seems useful.
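
For illustration only, here is a rough userspace sketch of how a task might consume a flag of this kind. Everything in it is an assumption made for the example: the dp_entry values, the helper names, and the use of si_value to carry the flag are hypothetical and are not taken from the patch series. A self-directed sigqueue() stands in for the kernel delivering the signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical entry kinds; the real patch may encode this differently. */
enum dp_entry {
    DP_ENTRY_NONE = 0,
    DP_ENTRY_SYSCALL = 1,      /* task made a system call */
    DP_ENTRY_OTHER_SYNC = 2    /* page fault, trap, etc. */
};

/* Last entry kind seen by the handler. */
static volatile sig_atomic_t last_entry = DP_ENTRY_NONE;

/* SA_SIGINFO handler: record the flag carried in the siginfo. */
static void strict_handler(int sig, siginfo_t *info, void *uctx)
{
    (void)sig;
    (void)uctx;
    last_entry = info->si_value.sival_int;
}

/* Install the handler for signo; returns 0 on success, -1 on error. */
static int install_strict_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = strict_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Stand-in for the kernel: queue signo to ourselves with the flag. */
static int simulate_strict_violation(int signo, enum dp_entry kind)
{
    union sigval v;

    v.sival_int = kind;
    return sigqueue(getpid(), signo, v);
}
```

In a single-threaded process with the signal unblocked, the handler runs before sigqueue() returns, so last_entry can be inspected immediately after calling simulate_strict_violation(). Note that this only covers the synchronous cases; it says nothing about how asynchronous entries would be reported.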

> Again, though, I think we really do need to distinguish at least MCE
> and NMI (on x86) from the others.

Yes, those are both interesting cases, and I'm not entirely
sure what the right way to handle them is - for example,
STRICT would likely need to be disabled if you are running with perf enabled.

I look forward to hearing more when you're back next week!

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
