Re: [PATCH v5] tpm: Map the ACPI provided event log

From: Jarkko Sakkinen
Date: Wed Dec 25 2024 - 10:31:54 EST


On Tue Dec 24, 2024 at 6:05 PM EET, Ard Biesheuvel wrote:
> On Tue, 24 Dec 2024 at 05:03, Jarkko Sakkinen <jarkko@xxxxxxxxxx> wrote:
> >
> > The following failure was reported:
> >
> > [ 10.693310][ T1] tpm_tis STM0925:00: 2.0 TPM (device-id 0x3, rev-id 0)
> > [ 10.848132][ T1] ------------[ cut here ]------------
> > [ 10.853559][ T1] WARNING: CPU: 59 PID: 1 at mm/page_alloc.c:4727 __alloc_pages_noprof+0x2ca/0x330
> > [ 10.862827][ T1] Modules linked in:
> > [ 10.866671][ T1] CPU: 59 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.0-lp155.2.g52785e2-default #1 openSUSE Tumbleweed (unreleased) 588cd98293a7c9eba9013378d807364c088c9375
> > [ 10.882741][ T1] Hardware name: HPE ProLiant DL320 Gen12/ProLiant DL320 Gen12, BIOS 1.20 10/28/2024
> > [ 10.892170][ T1] RIP: 0010:__alloc_pages_noprof+0x2ca/0x330
> > [ 10.898103][ T1] Code: 24 08 e9 4a fe ff ff e8 34 36 fa ff e9 88 fe ff ff 83 fe 0a 0f 86 b3 fd ff ff 80 3d 01 e7 ce 01 00 75 09 c6 05 f8 e6 ce 01 01 <0f> 0b 45 31 ff e9 e5 fe ff ff f7 c2 00 00 08 00 75 42 89 d9 80 e1
> > [ 10.917750][ T1] RSP: 0000:ffffb7cf40077980 EFLAGS: 00010246
> > [ 10.923777][ T1] RAX: 0000000000000000 RBX: 0000000000040cc0 RCX: 0000000000000000
> > [ 10.931727][ T1] RDX: 0000000000000000 RSI: 000000000000000c RDI: 0000000000040cc0
> >
> > Above shows that ACPI pointed a 16 MiB buffer for the log events because
> > RSI maps to the 'order' parameter of __alloc_pages_noprof(). Address the
> > bug by mapping the region when needed instead of copying.
> >
>
> How can you be sure the memory contents will be preserved? Does it say
> anywhere in the TCG spec that this needs to use a memory type that is
> preserved by default?

The TCG spec calls the size the "minimum size" of the log area but is
not very precise on the details [1]. I don't actually know what
"minimum" even means in this context, as it is just a fixed-size cut of
the physical address space.

I don't think that can ever change. It would be odd if some dynamic
change made the ACPI tables show incorrect information about memory
ranges. Do you know of any pre-existing example of such behavior (not
sarcasm, just interested)?

Anyway, when it comes to this type of dynamics, the TCG spec is imprecise.

>
> Also, the fact that we're now at v5 kind of proves my point that this
> approach may be too complex for a simple bug fix. Why not switch to
> kvmalloc() for a backportable fix, and improve upon that for future
> kernels?

OK, I could possibly live with this. 16 MiB is not that much with
current memory sizes, so if everyone agrees on this then it is fine and
I'll turn this patch into a feature for my next PR. I just don't want to
pick an arbitrarily chosen truncation range. To me it just feels like
wasting memory for no reason, that's all.
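
For reference, the 16 MiB figure follows from RSI = 0xc in the trace
above, read as the allocation order. A minimal userspace sketch of that
arithmetic, assuming 4 KiB pages (page shift 12); order_to_bytes() is a
made-up helper name, not a kernel function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes covered by a page allocation of the given order.
 * page_shift is log2 of the page size (12 for 4 KiB pages, assumed here).
 */
static uint64_t order_to_bytes(unsigned int order, unsigned int page_shift)
{
	/* Guard against shifting past the width of uint64_t. */
	assert(order + page_shift < 64);
	return (uint64_t)1 << (order + page_shift);
}
```

With order 0xc and page shift 12 this gives 2^24 bytes, i.e. 16 MiB. If
the maximum page order is 10, as I assume it is for this configuration,
the largest contiguous request would be 4 MiB, which is why order 12
trips the warning.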

Alternatively, the code could do a pre-fetch iteration of what happens
when you do "cat /sys/kernel/security/tpm0/binary_measurements", and
then we would end up with about 100 kB or a similar figure on this
hardware, but that would require the code I already wrote plus a few
more bits for a full implementation.
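
A rough userspace sketch of what such a pre-fetch walk could look like.
To keep it short it assumes the older TCG 1.2 (SHA-1) event layout, not
the TPM 2.0 crypto-agile format this hardware would actually use, and a
little-endian host. The struct and function names are hypothetical and
this is not the code from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* TCG 1.2 style event header (SHA-1 digest); assumed for illustration. */
struct tcg12_event_hdr {
	uint32_t pcr_index;
	uint32_t event_type;
	uint8_t  digest[20];
	uint32_t event_size;	/* bytes of event data following the header */
};

/*
 * Walk the log area and return how many bytes are occupied by events.
 * Stops at the first empty header (type and size both zero), at an event
 * that would overrun the area, or when no full header fits any more.
 */
static size_t log_used_bytes(const uint8_t *log, size_t area_size)
{
	size_t off = 0;

	while (area_size - off >= sizeof(struct tcg12_event_hdr)) {
		struct tcg12_event_hdr hdr;

		/* Copy out the header to avoid unaligned access. */
		memcpy(&hdr, log + off, sizeof(hdr));

		if (hdr.event_type == 0 && hdr.event_size == 0)
			break;	/* unused tail of the area */

		if (hdr.event_size > area_size - off - sizeof(hdr))
			break;	/* malformed: event data overruns the area */

		off += sizeof(hdr) + hdr.event_size;
	}

	return off;
}
```

Only the returned number of bytes would then need to be allocated and
copied, instead of the full area advertised by ACPI.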

[1] https://trustedcomputinggroup.org/wp-content/uploads/TCG_ACPIGeneralSpec_v1p3_r8_pub.pdf

BR, Jarkko