Re: [RFC PATCH] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode

From: Bhaskar Upadhaya
Date: Mon Jan 06 2020 - 06:03:33 EST


On Thu, Jan 2, 2020 at 11:31 PM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Tue, Dec 17, 2019 at 11:03:38PM -0800, Bhaskar Upadhaya wrote:
> > Currently Linux register ghes_poll_func with TIMER_DEFERRABLE flag,
> > because of which it is serviced when the CPU eventually wakes up with a
> > subsequent non-deferrable timer and not at the configured polling interval.
> >
> > For polling mode, the polling interval configured by firmware should not
> > be exceeded as per ACPI_6_3 spec[refer Table 18-394],
>
> I see
>
> "Table 18-394 Hardware Error Notification Structure"
>
> where does it say that the interval should not be exceeded and what is
> going to happen if it gets exceeded?

The spec (ACPI 6.3) defines the poll interval as follows:
"Indicates the poll interval in milliseconds OSPM should use to
periodically check the error source for the presence of an error
condition."

This indicates that OSPM should check the error source periodically, at
the poll interval. But with the timer configured as TIMER_DEFERRABLE,
the timer callback is not invoked within the poll interval.
>
> IOW, are you fixing something you're observing on some platform or
> you're reading the spec only?

We are observing an issue on our ThunderX2 platforms wherein
ghes_poll_func is not called within the poll interval when the timer is
configured with the TIMER_DEFERRABLE flag (on a NO_HZ kernel), and hence
we are losing error records.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette