Re: [regression] significant delays when secureboot is enabled since 6.10

From: Roberto Sassu
Date: Wed Sep 11 2024 - 04:56:04 EST


On Tue, 2024-09-10 at 16:28 +0300, Jarkko Sakkinen wrote:
> On Tue Sep 10, 2024 at 3:57 PM EEST, James Bottomley wrote:
> > On Tue, 2024-09-10 at 15:48 +0300, Jarkko Sakkinen wrote:
> > > On Tue Sep 10, 2024 at 3:39 PM EEST, Jarkko Sakkinen wrote:
> > > > On Tue Sep 10, 2024 at 12:05 PM EEST, Roberto Sassu wrote:
> > > > > On Tue, 2024-09-10 at 11:01 +0200, Linux regression tracking
> > > > > (Thorsten
> > > > > Leemhuis) wrote:
> > > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > > >
> > > > > > > James, Jarkko, I noticed a report about a regression in
> > > > > > bugzilla.kernel.org that appears to be caused by this change of
> > > > > > yours:
> > > > > >
> > > > > > 6519fea6fd372b ("tpm: add hmac checks to tpm2_pcr_extend()")
> > > > > > [v6.10-rc1]
> > > > > >
> > > > > > As many (most?) kernel developers don't keep an eye on the bug
> > > > > > tracker,
> > > > > > I decided to forward it by mail. To quote from
> > > > > > https://bugzilla.kernel.org/show_bug.cgi?id=219229 :
> > > > > >
> > > > > > > When secureboot is enabled,
> > > > > > > the kernel boot time is ~20 seconds after 6.10 kernel.
> > > > > > > it's ~7 seconds on 6.8 kernel version.
> > > > > > >
> > > > > > > When secureboot is disabled,
> > > > > > > the boot time is ~7 seconds too.
> > > > > > >
> > > > > > > Reproduced on both AMD and Intel platform on ThinkPad X1 and
> > > > > > > T14.
> > > > > > >
> > > > > > > It probably caused autologin failure and micmute led not
> > > > > > > loaded on AMD platform.
> > > > > >
> > > > > > It was later bisected to the change mentioned above. See the
> > > > > > ticket for
> > > > > > more details.
> > > > >
> > > > > Hi
> > > > >
> > > > > I suspect I encountered the same problem:
> > > > >
> > > > > https://lore.kernel.org/linux-integrity/b8a7b3566e6014ba102ab98e10ede0d574d8930e.camel@xxxxxxxxxxxxxxx/
> > > > >
> > > > > Going to provide more info there.
> > > >
> > > > I suppose you are going to try to acquire the tracing data I asked
> > > > for? That would be awesome, thanks for taking the trouble. Let's
> > > > look at the data and draw conclusions based on that.
> > > >
> > > > The workaround is pretty simple: adding CONFIG_TCG_TPM2_HMAC=n to
> > > > the kernel configuration disables the feature.
> > > >
> > > > For deciding what to do with this, we are talking about an
> > > > estimated ~2 week window, given that the Vienna conference slows
> > > > things down, so I hope my workaround is good enough until then.
> > >
> > > I can enumerate three most likely ways to address the issue:
> > >
> > > 1. Strongest: drop from defconfig.
> > > 2. Medium: leave it in defconfig but add an opt-in kernel command-line
> > >    parameter.
> > > 3. Lightest: if the tracing data lets us nail down the regression on
> > >    a sustainable schedule, fix it.
> >
> > Actually, there's a fourth: not use sessions for the PCR extend (if
> > we'd got the timings when I asked, this was going to be my suggestion
> > if they came back problematic). This seems only to be a problem for
> > IMA measured boot (because it does lots of extends). If necessary this
> > could even be wrapped in a separate config or boot option that only
> > disables HMAC on extend if IMA (so we still get security for things
> > like sd-boot)
>
> I can buy that, but with a twist that makes it an opt-in kernel command
> line option. We don't want to take already existing functionality away
> from those who might want to use it (given e.g. hardening requirements),
> and on that basis opt-in (disabled by default) would be a more balanced
> way to address the issue.
>
> Please do send a patch!

I made a few measurements. I have a Fedora 38 VM with TPM passthrough.

Kernels: 6.11-rc2+ (guest), 6.5.0-45-generic (host)

QEMU:

rc qemu-kvm 1:4.2-3ubuntu6.27
ii qemu-system-x86 1:6.2+dfsg-2ubuntu6.22


TPM2_PT_MANUFACTURER:
  raw: 0x49465800
  value: "IFX"
TPM2_PT_VENDOR_STRING_1:
  raw: 0x534C4239
  value: "SLB9"
TPM2_PT_VENDOR_STRING_2:
  raw: 0x36373000
  value: "670"
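
As a cross-check, the raw values above read as big-endian ASCII with
trailing NUL padding. A minimal Python sketch, where the helper name
decode_tpm_property is made up for illustration and is not part of any
TPM tool:

```python
def decode_tpm_property(raw: int) -> str:
    """Decode a 32-bit TPM property value as big-endian ASCII, dropping NUL padding."""
    return raw.to_bytes(4, "big").rstrip(b"\x00").decode("ascii")


# Raw values copied from the capability output above.
properties = {
    "TPM2_PT_MANUFACTURER": 0x49465800,
    "TPM2_PT_VENDOR_STRING_1": 0x534C4239,
    "TPM2_PT_VENDOR_STRING_2": 0x36373000,
}

for name, raw in properties.items():
    print(f"{name}: {decode_tpm_property(raw)}")
```

Concatenating the two vendor strings gives the chip model name.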


No HMAC:

# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 0)               |  tpm2_pcr_extend() {
 0)   1.112 us    |    tpm_buf_append_hmac_session();
 0) # 6360.029 us |    tpm_transmit_cmd();
 0) # 6415.012 us |  }


HMAC:

# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
 1)               |  tpm2_pcr_extend() {
 1)               |    tpm2_start_auth_session() {
 1) * 36976.99 us |      tpm_transmit_cmd();
 1) * 84746.51 us |      tpm_transmit_cmd();
 1) # 3195.083 us |      tpm_transmit_cmd();
 1) @ 126795.1 us |    }
 1)   2.254 us    |    tpm_buf_append_hmac_session();
 1)   3.546 us    |    tpm_buf_fill_hmac_session();
 1) * 24356.46 us |    tpm_transmit_cmd();
 1)   3.496 us    |    tpm_buf_check_hmac_response();
 1) @ 151171.0 us |  }
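
To put the two traces side by side, the per-extend cost can be summarized
with a small script. This is only a sketch: the durations are copied from
the traces above, while the helper names (parse, total_us, transmit_count)
are made up for illustration, and a single sample per configuration is not
a benchmark.

```python
import re

# Lines carrying a duration, copied from the two function_graph traces above.
NO_HMAC_TRACE = """
0) 1.112 us | tpm_buf_append_hmac_session();
0) # 6360.029 us | tpm_transmit_cmd();
0) # 6415.012 us | }
"""

HMAC_TRACE = """
1) * 36976.99 us | tpm_transmit_cmd();
1) * 84746.51 us | tpm_transmit_cmd();
1) # 3195.083 us | tpm_transmit_cmd();
1) @ 126795.1 us | }
1) 2.254 us | tpm_buf_append_hmac_session();
1) 3.546 us | tpm_buf_fill_hmac_session();
1) * 24356.46 us | tpm_transmit_cmd();
1) 3.496 us | tpm_buf_check_hmac_response();
1) @ 151171.0 us | }
"""

# Matches "<duration> us | <function call or closing brace>".
LINE_RE = re.compile(r"([\d.]+) us \|\s*(.*)$")


def parse(trace):
    """Return (duration_us, text) pairs for every line that carries a duration."""
    entries = []
    for line in trace.splitlines():
        match = LINE_RE.search(line)
        if match:
            entries.append((float(match.group(1)), match.group(2).strip()))
    return entries


def total_us(trace):
    """The last closing brace carries the total time of tpm2_pcr_extend()."""
    return [dur for dur, text in parse(trace) if text == "}"][-1]


def transmit_count(trace):
    """Number of tpm_transmit_cmd() calls, i.e. TPM round trips."""
    return sum(1 for _, text in parse(trace) if text.startswith("tpm_transmit_cmd"))


no_hmac = total_us(NO_HMAC_TRACE)
hmac = total_us(HMAC_TRACE)
# In the HMAC trace the first closing brace ends tpm2_start_auth_session().
session_setup = [dur for dur, text in parse(HMAC_TRACE) if text == "}"][0]

print(f"no HMAC: {no_hmac / 1000:.1f} ms, {transmit_count(NO_HMAC_TRACE)} TPM command(s)")
print(f"HMAC:    {hmac / 1000:.1f} ms, {transmit_count(HMAC_TRACE)} TPM command(s)")
print(f"slowdown per extend: {hmac / no_hmac:.1f}x")
print(f"share spent in session setup: {session_setup / hmac:.0%}")
```

In these samples most of the extra time is spent in
tpm2_start_auth_session(), which issues three TPM commands before the
extend itself is sent.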

Roberto