Re: [PATCH RESEND] PCI/AER: Use a common function to print AER error bits

From: Bjorn Helgaas
Date: Mon May 07 2018 - 18:07:07 EST


On Mon, Apr 30, 2018 at 12:41:26PM -0500, Alex G. wrote:
> On 04/30/2018 12:15 PM, Bjorn Helgaas wrote:
> > On Sat, Apr 28, 2018 at 12:07:48PM -0500, Alex G. wrote:
>
> (snip)
> >> I could update the offending line to say:
> >> + info.first_error = PCI_ERR_CAP_FEP(aer->cap_control);
> >
> > That's what I would have expected. So I'd say either do this, or add
> > a comment about why it's not the right thing to do.
>
> Okay.
>
> >> Though I still have the concerns with validating CPER data:
> >>
> >>> I can see a way to use even more common printk code, but that requires
> >>> validating the PCI regs we get from firmware. That means we need to make
> >>> a guarantee about CPER that is beyond the scope of this patch.
> >
> > Sounds like this is material for another patch, but if/when you do
> > that, I'd like to understand your concern about validating the
> > registers we get from firmware. Are you worried about getting
> > incorrect register contents, then printing the wrong info, making
> > the wrong decision about how to recover, something else?
>
> I don't trust firmware, and I have daymares about firmware leaving these
> fields uninitialized. In jargon, I'd like to treat it as external
> untrusted serialized data.

That makes good sense to me.

In this particular case, we only test first_error for equality:

  __aer_print_error(...)
  {
    ...

    pci_err(dev, " [%2d] %-22s%s\n", i, errmsg,
            info->first_error == i ? " (First)" : "");

so I don't think there's any danger. If we were using it to index an
array or something, we should certainly validate it first.
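
To make the distinction concrete, here is a rough userspace sketch, not
kernel code. The demo_* names and the short string table are made up for
illustration; only the "& 31" mask is meant to match PCI_ERR_CAP_FEP().
The equality test is harmless for any value firmware hands us, while the
lookup has to bounds-check before it indexes:

```c
#include <stddef.h>
#include <stdint.h>

/* Same mask as PCI_ERR_CAP_FEP(): First Error Pointer is the low 5 bits. */
#define DEMO_ERR_CAP_FEP(x)	((x) & 31)
#define DEMO_ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))

/* Hypothetical table, deliberately shorter than the 32 values FEP can take. */
static const char *const demo_error_strings[] = {
	"bit 0",
	"bit 1",
	"bit 2",
	"bit 3",
	"bit 4",
};

/* Equality test only: a bogus value just never matches. */
static int demo_is_first(uint32_t cap_control, unsigned int i)
{
	return DEMO_ERR_CAP_FEP(cap_control) == i;
}

/* Array index: reject values outside the table before using them. */
static const char *demo_first_error_name(uint32_t cap_control)
{
	unsigned int fep = DEMO_ERR_CAP_FEP(cap_control);

	if (fep >= DEMO_ARRAY_SIZE(demo_error_strings))
		return NULL;
	return demo_error_strings[fep];
}
```

With a table of 32 entries the mask alone would bound the index, but I
would still keep the explicit check so the code stays correct if the
table size ever changes.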