Re: [PATCH RFC tip/core/rcu 0/5] Expedited grace periods encouraging normal ones

From: Paul E. McKenney
Date: Wed Jul 01 2015 - 16:46:55 EST


On Wed, Jul 01, 2015 at 07:02:42PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 01, 2015 at 09:17:05AM -0700, Paul E. McKenney wrote:
> > On Wed, Jul 01, 2015 at 04:17:10PM +0200, Peter Zijlstra wrote:
>
> > > 74b51ee152b6 ("ACPI / osl: speedup grace period in acpi_os_map_cleanup")
> >
> > Really???
> >
> > I am not concerned about this one. After all, one of the first things
> > that people do for OS-jitter-sensitive workloads is to get rid of
> > binary blobs. And any runtime use of ACPI as well. And let's face it,
> > if your latency-sensitive workload is using either binary blobs or ACPI,
> > you have already completely lost. Therefore, an additional expedited
> > grace period cannot possibly cause you to lose any more.
>
> This isn't solely about rt etc.; this call is a generic facility used
> by any number of consumers. A normal workstation/server could run into
> it at relatively high frequency, depending on its workload.
>
> Even on non-latency-sensitive workloads, I think hammering all active
> CPUs is bad behaviour. Remember that a typical server-class machine
> easily has more than 32 CPUs these days.

Well, that certainly is one reason for the funnel locking, sequence
counters, and so on, which keep the overhead bounded despite large
numbers of CPUs. So I don't believe that a non-RT/non-HPC workload
is going to notice.
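
To give a rough idea of the sequence-counter trick (this is an
illustrative sketch only, not the actual kernel implementation;
expedited_seq, expedited_mutex, and force_quiescent_states() are all
made-up names for the example, though READ_ONCE(), WRITE_ONCE(), and
ULONG_CMP_GE() are the usual kernel primitives):

	/* Even value: no expedited GP in flight; odd: one in progress. */
	static unsigned long expedited_seq;
	static DEFINE_MUTEX(expedited_mutex);	/* stand-in for funnel lock */

	void sketch_synchronize_expedited(void)
	{
		/* Counter value guaranteeing a full GP after this point. */
		unsigned long s = (READ_ONCE(expedited_seq) + 3) & ~1UL;

		mutex_lock(&expedited_mutex);

		/* Did someone else's grace period already cover us? */
		if (ULONG_CMP_GE(READ_ONCE(expedited_seq), s)) {
			mutex_unlock(&expedited_mutex);
			return;	/* Free ride on a completed GP. */
		}

		WRITE_ONCE(expedited_seq, expedited_seq + 1);	/* GP start */
		force_quiescent_states();  /* the expensive per-CPU part */
		WRITE_ONCE(expedited_seq, expedited_seq + 1);	/* GP end */
		mutex_unlock(&expedited_mutex);
	}

The point being that no matter how many CPUs there are and how many
tasks pile up, each task either runs at most one grace period itself
or piggybacks on one that someone else already completed.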

> > > Also, I'm not entirely convinced things like:
> > >
> > > fd2ed4d25270 ("dm: add statistics support")
> > > 83d5e5b0af90 ("dm: optimize use SRCU and RCU")
> > > ef3230880abd ("backing-dev: use synchronize_rcu_expedited instead of synchronize_rcu")
> > >
> > > Are in the 'never' happens category. Esp. the backing-dev one, it
> > > triggers every time you unplug a USB stick or similar.
> >
> > Which people should be assiduously avoiding for any sort of
> > industrial-control system, especially given things like STUXNET.
>
> USB, sure, but a backing dev is involved in NFS clients, loopback
> devices, and all sorts of block/filesystem-like setups.
>
> Unmount an NFS mount and voila, expedited RCU; unmount a loopback
> device, tada.
>
> All you need is a regular server workload triggering any of that on a
> semi-regular basis, and even !rt people might start to notice that
> something is up.

I don't believe that latency-sensitive systems are going to be messing
with remapping their storage at runtime, let alone on a regular basis.
If they are not latency sensitive, and assuming that the rate of
storage remapping is at all sane, I bet that they won't notice the
synchronize_rcu_expedited() overhead. The overhead of the actual
remapping will very likely leave the synchronize_rcu_expedited() overhead
way down in the noise.

And if they are doing completely insane rates of storage remapping,
I suspect that the batching in the synchronize_rcu_expedited()
implementation will reduce the expedited-grace-period overhead still
further as a fraction of the total.
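
Again purely as an illustration, using the same made-up counter as in
the sketch above: a task wanting an expedited grace period snapshots
the counter up front, and every task whose snapshot is already
satisfied gets its grace period for free:

	/* Snapshot to take *before* making the request. */
	static inline unsigned long sketch_expedited_snap(void)
	{
		return (READ_ONCE(expedited_seq) + 3) & ~1UL;
	}

	/* True once a grace period covering the snapshot has completed. */
	static inline bool sketch_expedited_done(unsigned long snap)
	{
		return ULONG_CMP_GE(READ_ONCE(expedited_seq), snap);
	}

So if a thousand storage-remapping operations pile up, they are
satisfied by a handful of grace periods rather than a thousand, and
the machine sees nowhere near a thousand rounds of CPU-hammering.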

> > > Rejigging a DM might indeed be rare enough; but then again, people use
> > > DM explicitly so they can rejig while in operation.
> >
> > They rejig DM when running OS-jitter-sensitive workloads?
>
> Unlikely but who knows, I don't really know DM, so I can't even tell
> what would trigger these.

In my experience, the hard-core low-latency guys avoid doing pretty
much anything that isn't completely essential to the job at hand.

Thanx, Paul
