Re: [PATCH v2] tpm: Allow system suspend to continue when TPM suspend fails
From: Jason A. Donenfeld
Date: Fri Jan 06 2023 - 11:01:59 EST
Hi Todd & ChromeOS folks,

On Fri, Jan 06, 2023 at 04:01:56AM +0100, Jason A. Donenfeld wrote:
> TPM 1 is sometimes broken across system suspends, due to races or
> locking issues or something else that haven't been diagnosed or fixed
> yet, most likely having to do with concurrent reads from the TPM's
> hardware random number generator driver. These issues prevent the system
> from actually suspending, with errors like:
>
> tpm tpm0: A TPM error (28) occurred continue selftest
> ...
> tpm tpm0: A TPM error (28) occurred attempting get random
> ...
> tpm tpm0: Error (28) sending savestate before suspend
> tpm_tis 00:08: PM: __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x80 returns 28
> tpm_tis 00:08: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 28
> tpm_tis 00:08: PM: failed to suspend: error 28
> PM: Some devices failed to suspend, or early wake event detected
>
> This issue was partially fixed by 23393c646142 ("char: tpm: Protect
> tpm_pm_suspend with locks"), in a last minute 6.1 commit that Linus took
> directly because the TPM maintainers weren't available. However, it
> seems like this just addresses the most common cases of the bug, rather
> than addressing it entirely. So there are more things to fix still,
> apparently.
>
> In lieu of actually fixing the underlying bug, just allow system suspend
> to continue, so that laptops still go to sleep fine. Later, this can be
> reverted when the real bug is fixed.
>
> Link: https://lore.kernel.org/lkml/7cbe96cf-e0b5-ba63-d1b4-f63d2e826efa@xxxxxxx/
> Cc: stable@xxxxxxxxxxxxxxx # 6.1+
> Reported-by: Vlastimil Babka <vbabka@xxxxxxx>
> Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Jason A. Donenfeld <Jason@xxxxxxxxx>
> ---
> This is basically untested and I haven't worked out if there are any
> awful implications of letting the system sleep when TPM suspend fails.
> Maybe some PCRs get cleared and that will make everything explode on
> resume? Maybe it doesn't matter? Somebody well versed in TPMology should
> probably [n]ack this approach.

When idly scrolling on my telephone to try to see what the
implications of skipping TPM_ORD_SAVESTATE could be, I stumbled across
some ChromeOS commits related to it, and realized that, ah-hah, finally
there's an obvious group of stakeholders who make heavy use of the TPM
and have likely amassed some expertise on it.

So I was wondering if you'd take a look at this patch briefly to make
sure it won't break ChromeOS laptops.

Jason