Re: [LKP] Re: [clocksource] 6c52b5f3cf: stress-ng.opcode.ops_per_sec -14.4% regression

From: Paul E. McKenney
Date: Sat Apr 24 2021 - 13:53:26 EST


On Sat, Apr 24, 2021 at 08:29:20PM +0800, Feng Tang wrote:
> Hi Paul,
>
> On Fri, Apr 23, 2021 at 07:02:54AM -0700, Paul E. McKenney wrote:
> [...]
> > > > > > Following is the tsc freq info from kernel log
> > > > > >
> > > > > > [ 0.000000] DMI: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
> > > > > > [ 0.000000] tsc: Detected 2100.000 MHz processor
> > > > > > ...
> > > > > > [ 13.859982] tsc: Refined TSC clocksource calibration: 2095.077 MHz
> > > > >
> > > > > So what are our options?
> > > > >
> > > > > 1. Clear CLOCK_SOURCE_MUST_VERIFY from tsc-early.
> > > > >
> > >
> > > I think option 1 is fine, as tsc will still get checked once the
> > > 'tsc' clocksource is registered, but Thomas and Peter know more
> > > of the background and corner cases of tsc.
> >
> > I will look at adding such a patch to my series, preceding the change
> > to 1/1000 deviation.
> >
> > > Also, we have been working on another patchset to skip the
> > > watchdog check for x86 platforms with a stable tsc:
> > >
> > > https://lore.kernel.org/lkml/1618291897-71581-1-git-send-email-feng.tang@xxxxxxxxx/
> > > https://lore.kernel.org/lkml/1618291897-71581-2-git-send-email-feng.tang@xxxxxxxxx/
> >
> > It will be interesting to see what fraction of those with large numbers
> > of systems choose to revert your 2/2, and for what period of time.
> > You really needed my clocksource patch series to have been in place some
> > years back so that people wouldn't have been seeing the false-positive
> > clock-skew complaints. Those complaints did not help people build up
> > their trust in the TSC. :-/
>
> I read your patchset, and the recheck to avoid false alarms makes
> sense to me, as does the debug method you add, and they do not
> conflict with my patches, which target newer x86 platforms only.

Would you be willing to give your Tested-by, Acked-by, or Reviewed-by
to those patches?

Obviously, you would not be willing to do so for the patch that reduces
the skew threshold, at least not until my most recent patch is beaten
into shape.
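
For anyone following along, the arithmetic behind the false positive is
straightforward: tsc-early runs on the coarse 2100.000 MHz calibration,
while the refined calibration later settles on 2095.077 MHz. That is a
difference of about 4.9 MHz, or roughly 2.3 parts per thousand, so a
1/1000 watchdog threshold would flag tsc-early as skewed even though
nothing is actually wrong. Option 1 above would then amount to
something like the following minimal sketch against
arch/x86/kernel/tsc.c (fields abbreviated and reproduced from memory,
so the exact layout may differ from the current tree):

static struct clocksource clocksource_tsc_early = {
        .name   = "tsc-early",
        .rating = 299,
        .read   = read_tsc,
        .mask   = CLOCKSOURCE_MASK(64),
        /*
         * Drop CLOCK_SOURCE_MUST_VERIFY so that the watchdog does not
         * judge the coarse early calibration against HPET or acpi_pm.
         * The later "tsc" clocksource keeps the flag, so the TSC still
         * gets checked once it is registered.
         */
        .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
        /* ... remaining fields unchanged ... */
};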

> And yes, I have only met and debugged cases of tsc being wrongly
> marked unstable on several client platforms, and in one case I
> disabled the HPET for Baytrail 10 years ago. Our test farm has
> different kinds of servers, though only up to 4 sockets and 192 CPUs,
> and no tsc instability has been seen there.

That is encouraging, but how many systems are in your test farm?

> And I'm eager to know whether there is any real case of an
> unreliable tsc among the 'large numbers' of x86 systems that comply
> with our cpu feature check. And if there is, my 2/2 should definitely
> be dropped.

If you have enough systems, you see all sorts of strange things just
due to the normal underlying failure rate of hardware.

So my question is instead whether we will see any TSC failures
unaccompanied by any other signs of trouble.

And if your 2/2 goes in, those who still distrust TSC will simply
revert it. In their defense, their distrust was built up over a very
long period of time for very good reasons.

> > This last sentence is not a theoretical statement. In the past, I have
> > suggested using the existing "tsc=reliable" kernel boot parameter,
> > which disables watchdogs on TSC, similar to your patch 2/2 above.
> > The discussion was short and that boot parameter was not set. And the
> > discussion motivated my current clocksource series. ;-)
> >
> > I therefore suspect that someone will want a "tsc=unreliable" boot
> > parameter (or similar) to go with your patch 2/2.
>
> Possibly :)
>
> But I wonder, if tsc is disabled on that 'large system', what will
> be used instead? HPET is known to be a much slower clocksource, as
> shown in this regression report :), not to mention the 'acpi_pm'
> timer.

Indeed, the default switch to HPET often causes the system to be taken
out of service due to the resulting performance shortfall. There is
of course some automated recovery, and no, I am not familiar with the
details, but I suspect that a simple reboot is an early recovery step.
However, if the problem were to persist, the system would of course be
considered to be permanently broken.
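
Returning to the "tsc=unreliable" idea above, the plumbing would be
small, given that the kernel already parses a tsc= option in
arch/x86/kernel/tsc.c. A rough sketch follows; the existing branches
are reproduced from memory, and the "unreliable" branch along with its
tsc_force_watchdog flag is purely hypothetical:

static int __init tsc_setup(char *str)
{
        if (!strcmp(str, "reliable"))
                tsc_clocksource_reliable = 1;   /* existing: skip watchdog checks */
        if (!strcmp(str, "unstable"))
                mark_tsc_unstable("boot parameter");
        /*
         * Hypothetical opt-out for administrators who still distrust
         * the TSC: force watchdog checks even on platforms that your
         * 2/2 would otherwise exempt.
         */
        if (!strcmp(str, "unreliable"))
                tsc_force_watchdog = 1;
        return 1;
}
__setup("tsc=", tsc_setup);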

> Again, I want to know about real cases of tsc instability. I spent
> a lot of time searching for this information in git logs and mail
> archives before writing the patches.

So do I, which is why I put together this patch series. My employer
has a fairly strict upstream-first policy for things like this, which
are annoyances that are likely hiding other bugs but are not causing
significant outages. That, of course, was the motivation for the
fault-injection patches.

As I said earlier, it would have been very helpful to you for a patch
series like this to have been applied many years ago. If it had been,
we would already have the failure-rate data that you requested. And of
course if that failure-rate data indicated that TSC was reliable, there
would be far fewer people still distrusting TSC.

Thanx, Paul