Re: [PATCH v7 05/11] task_isolation: add debug boot flag

From: Chris Metcalf
Date: Mon Sep 28 2015 - 17:56:08 EST


On 09/28/2015 04:59 PM, Andy Lutomirski wrote:
> On Mon, Sep 28, 2015 at 11:17 AM, Chris Metcalf <cmetcalf@xxxxxxxxxx> wrote:
>> The new "task_isolation_debug" flag simplifies debugging
>> of TASK_ISOLATION kernels when processes are running in
>> PR_TASK_ISOLATION_ENABLE mode. Such processes should get no
>> interrupts from the kernel, and if they do, when this boot flag is
>> specified a kernel stack dump on the console is generated.
>>
>> It's possible to use ftrace to simply detect whether a task_isolation
>> core has unexpectedly entered the kernel. But what this boot flag
>> does is allow the kernel to provide better diagnostics, e.g. by
>> reporting in the IPI-generating code what remote core and context
>> is preparing to deliver an interrupt to a task_isolation core.
>>
>> It may be worth considering other ways to generate useful debugging
>> output rather than console spew, but for now that is simple and direct.
> This may be addressed elsewhere, but is there anything that alerts the
> task or the admin if it's PR_TASK_ISOLATION_ENABLE and *not* on a
> nohz_full core?

No, and I've thought about it without coming up with a great
solution. We could certainly fail the initial prctl() if the caller
was not on a nohz_full core. But this seems a little asymmetric,
since the task could be on such a core at prctl() time and then
do a sched_setaffinity() later to a non-nohz_full core. Would
we want to fail that call? Seems heavy-handed. Or, at that point,
we could clear the task-isolation state and emit a console message.
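
Separately from anything the kernel does, a task could guard itself
from userspace before calling prctl(). A minimal sketch, assuming the
running kernel exports the nohz_full CPU list at
/sys/devices/system/cpu/nohz_full (an assumption; nothing in this
series provides it). cpu_in_cpulist() and nohz_full_contains() are
made-up helper names, and stride ranges such as "0-6:2" are not handled:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Return 1 if cpu appears in a kernel-style cpulist such as "1-3,5",
 * else 0. Parsing stops at the first thing that is not a number or range. */
static int cpu_in_cpulist(const char *list, int cpu)
{
    const char *p = list;

    assert(list != NULL);
    while (*p) {
        char *end;
        long lo = strtol(p, &end, 10);
        long hi;

        if (end == p)
            break;              /* no digits: empty list or junk */
        hi = lo;
        if (*end == '-') {      /* range "lo-hi" */
            const char *q = end + 1;

            hi = strtol(q, &end, 10);
            if (end == q)
                break;          /* dangling '-' */
        }
        if (cpu >= lo && cpu <= hi)
            return 1;
        if (*end != ',')
            break;              /* end of list (or unsupported syntax) */
        p = end + 1;
    }
    return 0;
}

/* Return 1 if cpu is listed as nohz_full, 0 if not, -1 if unknown.
 * ASSUMPTION: the sysfs path below exists on the running kernel. */
static int nohz_full_contains(int cpu)
{
    char buf[4096];
    char *ok;
    FILE *f = fopen("/sys/devices/system/cpu/nohz_full", "r");

    if (!f)
        return -1;
    ok = fgets(buf, sizeof(buf), f);
    fclose(f);
    if (!ok)
        return -1;
    return cpu_in_cpulist(buf, cpu);
}
```

A caller would pass the result of sched_getcpu() as cpu and treat
anything other than 1 as a reason to warn before enabling task
isolation. This does nothing about the later sched_setaffinity() case.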

I suppose that on return to userspace we could notice that we were
on a nohz_full-enabled system with the task-isolation flags set, but
not on a nohz_full core, and at that point emit a console message
and clear the task-isolation state. But that also seems a little
questionable; maybe the user for some reason was doing some odd
migratory thing with their tasks or threads and was going to end up
migrating them to a final destination where the prctl() would apply.
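
To make that return-to-userspace check concrete, here is a small
userspace model of the logic, not kernel code. Everything in it is
hypothetical: struct task_model, MODEL_ISOLATION_ENABLE (a stand-in
for the real flag, whose value is not assumed here), and the bitmask
standing in for the nohz_full cpumask:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* HYPOTHETICAL stand-in for the task-isolation enable flag. */
#define MODEL_ISOLATION_ENABLE 0x1u

/* HYPOTHETICAL model of the per-task state the check would look at. */
struct task_model {
    unsigned int isolation_flags;
    int cpu;                    /* CPU the task is returning to userspace on */
};

/* Bit n of nohz_full_mask set means CPU n is a nohz_full core. */
static bool model_cpu_is_nohz_full(unsigned long nohz_full_mask, int cpu)
{
    if (cpu < 0 || cpu >= (int)(8 * sizeof(nohz_full_mask)))
        return false;
    return (nohz_full_mask >> cpu) & 1UL;
}

/* Model of the check on return to userspace. Returns true if it
 * warned and cleared the task's isolation state. */
static bool model_check_on_user_return(struct task_model *t,
                                       unsigned long nohz_full_mask)
{
    assert(t != NULL);
    if (nohz_full_mask == 0)
        return false;           /* not a nohz_full-enabled system */
    if (!(t->isolation_flags & MODEL_ISOLATION_ENABLE))
        return false;           /* task never asked for isolation */
    if (model_cpu_is_nohz_full(nohz_full_mask, t->cpu))
        return false;           /* on a nohz_full core: all is well */

    fprintf(stderr,
            "task_isolation: cpu %d is not nohz_full; clearing isolation state\n",
            t->cpu);
    t->isolation_flags = 0;
    return true;
}
```

The model also shows the objection: a task that is only passing
through a non-nohz_full core on the way to its final destination
loses its isolation state at the first return to userspace.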

Any suggestions for a better approach? Is it worth doing the
minimal printk-warning approach in the previous paragraph?
My instinct is to just leave it as-is.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
