[PATCH 2/2] tpm: tpm_crb: enhance resource mapping mechanism for supporting AMD's fTPM

From: Safford, David (GE Global Research, US)
Date: Fri Aug 30 2019 - 13:58:46 EST


> From: Seunghun Han <kkamagui@xxxxxxxxx>
> Sent: Friday, August 30, 2019 12:55 PM
> To: Safford, David (GE Global Research, US) <david.safford@xxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxx>; Jarkko Sakkinen
> <jarkko.sakkinen@xxxxxxxxxxxxxxx>; Peter Huewe <peterhuewe@xxxxxx>;
> open list:TPM DEVICE DRIVER <linux-integrity@xxxxxxxxxxxxxxx>; Linux
> Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>
> Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> mechanism for supporting AMD's fTPM
>
> >
> > > From: linux-integrity-owner@xxxxxxxxxxxxxxx <linux-integrity-
> > > owner@xxxxxxxxxxxxxxx> On Behalf Of Seunghun Han
> > > Sent: Friday, August 30, 2019 9:55 AM
> > > To: Jason Gunthorpe <jgg@xxxxxxxx>
> > > Cc: Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx>; Peter Huewe
> > > <peterhuewe@xxxxxx>; open list:TPM DEVICE DRIVER <linux-
> > > integrity@xxxxxxxxxxxxxxx>; Linux Kernel Mailing List <linux-
> > > kernel@xxxxxxxxxxxxxxx>
> > > Subject: EXT: Re: [PATCH 2/2] tpm: tpm_crb: enhance resource mapping
> > > mechanism for supporting AMD's fTPM
> > >
> > > >
> > > > On Fri, Aug 30, 2019 at 06:56:39PM +0900, Seunghun Han wrote:
> > > > > I got an AMD system which had a Ryzen Threadripper 1950X and MSI
> > > > > mainboard, and I had a problem with AMD's fTPM. My machine
> > > > > showed
> > > an
> > > > > error message below, and the fTPM didn't work because of it.
> > > > >
> > > > > [ 5.732084] tpm_crb MSFT0101:00: can't request region for resource
> > > > > [mem 0x79b4f000-0x79b4ffff] [ 5.732089] tpm_crb:
> > > > > probe of MSFT0101:00 failed with error -16
> > > > >
> > > > > When I saw the iomem, I found two fTPM regions were in the ACPI
> > > > > NVS
> > > area.
> > > > > The regions are below.
> > > > >
> > > > > 79a39000-79b6afff : ACPI Non-volatile Storage
> > > > > 79b4b000-79b4bfff : MSFT0101:00
> > > > > 79b4f000-79b4ffff : MSFT0101:00
> > > > >
> > > > > After analyzing this issue, I found that crb_map_io() function
> > > > > called
> > > > > devm_ioremap_resource() and it failed. The ACPI NVS didn't allow
> > > > > the TPM CRB driver to assign a resource in it because a busy bit
> > > > > was set to the ACPI NVS area.
> > > > >
> > > > > To support AMD's fTPM, I added a function to check intersects
> > > > > between the TPM region and ACPI NVS before it mapped the region.
> > > > > If some intersects are detected, the function just calls
> > > > > devm_ioremap() for a workaround. If there is no intersect, it
> > > > > calls
> > > devm_ioremap_resource().
> > > > >
> > > > > Signed-off-by: Seunghun Han <kkamagui@xxxxxxxxx>
> > > > > ---
> > > > > drivers/char/tpm/tpm_crb.c | 25 +++++++++++++++++++++++--
> > > > > 1 file changed, 23 insertions(+), 2 deletions(-)
> > > >
> > > > This still seems to result in two drivers controlling the same memory.
> > > > Does this create bugs and races during resume?
> > > >
> > > > Jason
> > >
> > > When I tested this patch in my machine, it seemed that ACPI NVS was
> > > saved after TPM CRB driver sent "TPM2_Shutdown(STATE)" to the fTPM
> > > while suspending. Then, ACPI NVS was restored while resuming.
> > > After resuming, PCRs didn't change and TPM2 tools such as
> > > tpm2_pcrlist, tpm2_extend, tpm2_getrandoms worked well.
> > > So, according to my test result, it seems that the patch doesn't
> > > create bugs and race during resume.
> > >
> > > Seunghun
> >
> > This was discussed earlier on the list.
> > The consensus was that, while safe now, this would be fragile, and
> > subject to unexpected changes in ACPI/NVS, and we really need to tell
> > NVS to exclude the regions for long term safety.
>
> Thank you for your advice. We also discussed earlier and concluded that
> checking and raw remapping are enough to work around this. The link is
> here, https://lkml.org/lkml/2019/8/29/962 .

I don't see Matthew Garrett's agreement on that thread.

> > As separate issues, the patches do not work at all on some of my AMD
> systems.
> > First, you only force the remap if the overlap is with NVS, but I have
> > systems where the overlap is with other reserved regions. You should
> > force the remap regardless, but if it is NVS, grab the space back from NVS.
>
> I didn't know about that. I just found the case from your thread that AMD
> system assigned TPM region into the reserved area. However, as I know, the
> reserved area has no busy bit so that TPM CRB driver could assign buffer
> resources in it. Right? In my view, if you patched your TPM driver with my
> patch series, then it could work. Would you explain what happened in TPM
> CRB driver and reserved area?

Good question. I'll try it out this weekend.

> > Second, the patch extends the wrong behavior of the current driver to
> > both buffer regions. If there is a conflict between what the device's
> > control register says, and what ACPI says, the existing driver explicitly
> "trusts the ACPI".
> > This is just wrong. The actual device will use the areas as defined by
> > its control registers regardless of what ACPI says. I talked to
> > Microsoft, and their driver trusts the control register values, and
> > doesn't even look at the ACPI values.
>
> As you know, the original code trusts the ACPI table because of the
> workaround for broken BIOS, and this code has worked well for a long time.
> In my view, if we change this code to trust control register value, we could
> make new problems and need a lot of time to check the workaround. So, I
> want to trust the ACPI value now.

I don't think the workaround has every really worked, other than the
Helpful firmware error it emits. I don't think anyone has tested the
workaround with large requests. The tpm_crb device itself is telling
us the buffers it is using. Why are we ignoring that and trusting the
known bad ACPI tables? Makes no sense to me.

> >
> > In practice, I have tested several systems in which the device
> > registers show The correct 4K buffers, but the driver instead trusts
> > the ACPI values, which list just 1K buffers. 1K buffers will not work
> > for large requests, and the device is going to read and write the 4K buffers
> regardless.
> >
> > dave
>
> I know about that. However, the device driver is not going to read and write
> 4K buffers if you patch your TPM driver with my patch series.
> One of my patches has an enhancement feature that could calculate the
> buffer size well. The TPM driver uses exactly 1K buffer for this case, not 4K
> buffer, and it works.

Have you tested large requests (well over 1K) to make sure?

dave
>
> Seunghun