Re: [PATCH] PCI: pciehp: Use appropriate conditions to check the hotplug controller status

From: Lukas Wunner
Date: Mon May 27 2024 - 04:51:06 EST


On Sun, May 26, 2024 at 10:45:36PM +0800, yaoma wrote:
> > On May 24, 2024, at 15:53, Lukas Wunner <lukas@xxxxxxxxx> wrote:
> > On Fri, May 24, 2024 at 02:30:23PM +0800, Bitao Hu wrote:
> > > The values of 'present' and 'link_active' have similar meanings:
> > > the value is %1 if the status is ready, and %0 if it is not. If the
> > > hotplug controller itself is not available, the value should be
> > > %-ENODEV. However, both %1 and %-ENODEV are considered true, which
> > > obviously does not meet expectations. 'Slot(xx): Card present' and
> > > 'Slot(xx): Link Up' should only be output when the value is %1.
> > [...]
> > > --- a/drivers/pci/hotplug/pciehp_ctrl.c
> > > +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> > > @@ -276,10 +276,10 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> > >  	case OFF_STATE:
> > >  		ctrl->state = POWERON_STATE;
> > >  		mutex_unlock(&ctrl->state_lock);
> > > -		if (present)
> > > +		if (present > 0)
> > >  			ctrl_info(ctrl, "Slot(%s): Card present\n",
> > >  				  slot_name(ctrl));
> > > -		if (link_active)
> > > +		if (link_active > 0)
> > >  			ctrl_info(ctrl, "Slot(%s): Link Up\n",
> > >  				  slot_name(ctrl));
> > >  		ctrl->request_result = pciehp_enable_slot(ctrl);
> >
> > We already handle the "<= 0" case immediately above this code excerpt:
> >
> > if (present <= 0 && link_active <= 0) {
> > ...
> > }
>
> I'm not sure if the following scenarios would occur in actual production
> environment, but from the code level, there is the possibility of
> "present <= 0 && link_active > 0" or "present > 0 && link_active <= 0".
> In these cases, the "<= 0" conditions will not be properly handled,
> and "ctrl_info" will output incorrect prompt messages.

I see, that makes sense.

"present" and "link_active" can be -ENODEV if reading the config space
of the hotplug port failed. That's typically the case if the hotplug
port itself was hot-removed, which happens all the time with
Thunderbolt/USB4.

E.g. pciehp_card_present() may return 1 and pciehp_check_link_active()
may return -ENODEV because the hotplug port was hot-removed in-between
the two function calls. In that case we'll emit both "Card present"
*and* "Link Up". The latter is uncalled for and is suppressed by your
patch.
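
To illustrate the mixed case, here is a minimal userspace sketch, not the
driver code itself. The helpers messages_old() and messages_new() and the
MSG_* flags are hypothetical names made up for this example; they only
model the early "<= 0" check plus the two conditions touched by the patch,
assuming the 1 / 0 / -ENODEV return convention discussed above.

```c
#include <errno.h>

/* Hypothetical flags standing in for the two ctrl_info() messages. */
enum {
	MSG_CARD_PRESENT = 1 << 0,
	MSG_LINK_UP      = 1 << 1,
};

/*
 * Models the logic before the patch: the early check only bails out
 * when *both* values are <= 0, and the later tests treat any non-zero
 * value (including -ENODEV) as true.
 */
static int messages_old(int present, int link_active)
{
	int msgs = 0;

	if (present <= 0 && link_active <= 0)
		return 0;

	if (present)
		msgs |= MSG_CARD_PRESENT;
	if (link_active)
		msgs |= MSG_LINK_UP;

	return msgs;
}

/* Models the logic after the patch: only a positive value counts. */
static int messages_new(int present, int link_active)
{
	int msgs = 0;

	if (present <= 0 && link_active <= 0)
		return 0;

	if (present > 0)
		msgs |= MSG_CARD_PRESENT;
	if (link_active > 0)
		msgs |= MSG_LINK_UP;

	return msgs;
}
```

With present == 1 and link_active == -ENODEV, messages_old() reports both
messages, whereas messages_new() reports only "Card present". When both
values are <= 0, neither variant reports anything, since the early check
already covers that case.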

So your code change is
Reviewed-by: Lukas Wunner <lukas@xxxxxxxxx>

...but it would be good if you could respin the patch and explain the
rationale of the code change in the commit message more clearly.
Basically summarize what you and I have explained above.

Also, the percent sign % in front of 0, 1, -ENODEV is unnecessary in
commit messages. It only has special meaning in kernel-doc.

Thanks,

Lukas