Re: [PATCH v10 07/21] acpi/ghes: rework the logic to handle HEST source ID
From: Igor Mammedov
Date: Thu Oct 03 2024 - 10:52:25 EST
On Tue, 1 Oct 2024 13:57:59 +0200
Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> wrote:
> On Tue, 17 Sep 2024 13:59:34 +0200
> Igor Mammedov <imammedo@xxxxxxxxxx> wrote:
>
> > On Sat, 14 Sep 2024 08:13:28 +0200
> > Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> wrote:
> >
> > > The current logic is based on a lot of duct tape, with
> > > offsets calculated based on one define with the number of
> > > source IDs and an enum.
> > >
> > > Rewrite the logic in a way that is more resilient
> > > to code changes, by moving the source ID count to an enum
> > > and making the offset calculation more explicit.
> > >
> > > This change was inspired by a patch from Jonathan Cameron
> > > that splits the logic to get the CPER address into a separate
> > > function, as this will be needed to support generic error
> > > injection.
> >
> > So this patch switches to using HEST to look up the error status block
> > by source ID, though nothing in the commit message mentions that.
> > Perhaps it's time to rewrite the commit message to be more
> > specific/clear about what the patch is doing.
> >
> > Now, I'd split this into several patches that should also take care of
> > the wiring needed to preserve the old lookup, so that migration with 9.1
> > machines keeps working:
[...]
> > 6. clean up fwcfg based on x-has-hardware_errors_addr,
> > i.e. for 'true':
> > ask for a write pointer to hardware_errors, as is done in the current code,
> > and don't register the hest_addr write pointer,
> > while for 'false':
> > do the opposite of the above.
>
> This doesn't work. Without the fw_cfg logic for both, QEMU/BIOS won't boot
> and/or hardware_errors won't work, causing ghes to do nothing.
We should look into it more:
only one of them, hest_addr (9.2+) or hwerror_addr (9.1), is necessary,
so if it breaks, it looks like a bug somewhere to me.
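
To make the "only one of them" point concrete, here is a rough sketch of
the selection from step 6. It is standalone and does not use the real
fw_cfg or bios-linker calls; the struct and function names are hypothetical
and exist only for illustration. The only input taken from this thread is
the x-has-hardware_errors_addr property.

```c
#include <stdbool.h>

/*
 * Hypothetical plan describing which write pointer would be registered.
 * This is not QEMU's data structure; it only models the either/or choice.
 */
struct ghes_fwcfg_plan {
    bool register_hw_errors_addr; /* 9.1-style: pointer to hardware_errors */
    bool register_hest_addr;      /* 9.2+-style: pointer to HEST */
};

/*
 * Select exactly one write pointer based on the compat property.
 * 'true'  -> keep the old hardware_errors address, skip hest_addr.
 * 'false' -> the opposite.
 */
static struct ghes_fwcfg_plan
ghes_plan_fwcfg(bool x_has_hardware_errors_addr)
{
    struct ghes_fwcfg_plan plan = {
        .register_hw_errors_addr = x_has_hardware_errors_addr,
        .register_hest_addr = !x_has_hardware_errors_addr,
    };
    return plan;
}
```

With this shape the two registrations are mutually exclusive by
construction, so a boot failure with only one of them present would point
at a problem elsewhere rather than at a missing second pointer.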
>
[...]
>
>
> Thanks,
> Mauro
>