Re: [PATCH 0/6] support "dataplane" mode for nohz_full

From: Chris Metcalf
Date: Fri May 15 2015 - 11:06:00 EST


On 05/11/2015 09:47 PM, Mike Galbraith wrote:
> On Mon, 2015-05-11 at 15:25 -0400, Chris Metcalf wrote:
>> On 05/11/2015 03:19 PM, Mike Galbraith wrote:
>>> I really shouldn't have acked nohz_full -> isolcpus. Beside the fact
>>> that old static isolcpus was _supposed_ to crawl off and die, I know
>>> beyond doubt that having isolated a cpu as well as you can definitely
>>> does NOT imply that said cpu should become tickless.
>> True, at a high level, I agree that it would be better to have a
>> top-level concept like Frederic's proposed ISOLATION that includes
>> isolcpus and nohz_cpu (and other stuff as needed).
>>
>> That said, what you wrote above is wrong; even with the patch you
>> acked, setting isolcpus does not automatically turn on nohz_full for
>> a given cpu. The patch made it true the other way around: when
>> you say nohz_full, you automatically get isolcpus on that cpu too.
>> That does, at least, make sense for the semantics of nohz_full.
> I didn't write that, I wrote nohz_full implies (spelled '->') isolcpus.
> Yes, with nohz_full currently being static, the old allegedly dying but
> also static isolcpus scheduler off switch is a convenient thing to wire
> the nohz_full CPU SET (<- hint;) property to.

Yes, I was responding to the bit where you said "having isolated a
cpu as well as you can does NOT imply it should become tickless",
but indeed, the "nohz_full -> isolcpus" patch didn't make that true.
In any case sounds like we were just talking past each other.

> BTW, another facet of this: Rik wants to make isolcpus immune to
> cpusets, which makes some sense, user did say isolcpus=, but that also
> makes isolcpus truly static. If the user now says nohz_full=, they lose
> the ability to deactivate CPU isolation, making the set fairly useless
> for anything other than HPC. Currently, the user can flip the isolation
> switch as he sees fit. He takes a size extra large performance hit for
> having said nohz_full=, but he doesn't lose generic utility.

I don't follow this completely. If the user says nohz_full=, he
probably doesn't care about deactivating isolcpus later, since that
defeats the entire purpose of the nohz_full= in the first place,
as far as I can tell. And when you say "anything other than HPC",
I'm not sure what you mean; as far as I know high-performance
computing only cares because it wants that extra 0.5% of the
cpu or whatever interrupts eat up, but just as a nice-to-have.
The real use case is high-performance userspace drivers where
the nohz_full cores are responding to real-time things like packet
arrivals with almost no latency to spare.

What is the generic utility you're envisioning for nohz_full cores
that have turned off scheduler isolation? I assume it's some
workload where you'd prefer not to have too many interrupts
but still are running multiple tasks, but in that case does it really
make much difference in practice?

> Thomas has nuked the hrtimer softirq.

Yes, this I didn't know. So I will drop my "no ksoftirqd" patch and
we will see if ksoftirqs emerge as an issue for my "cpu isolation"
stuff in the future; it may be that that was the only issue.

> Inlining softirqs may save a context switch, but adds cycles that we may
> consume at higher frequency than the thing we're avoiding.

Yes, but consuming cycles is not nearly as much of a concern
as avoiding interrupts or scheduling, certainly for the case of
userspace drivers that I described above.

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
