Re: [PATCH v9 3/9] PCI: Create PCI library functions in support of DOE mailboxes.

From: Ira Weiny
Date: Wed Jun 01 2022 - 00:53:56 EST


On Tue, May 31, 2022 at 10:56:52AM -0700, Davidlohr Bueso wrote:
> On Tue, 31 May 2022, Davidlohr Bueso wrote:
>
> > On Tue, 31 May 2022, ira.weiny@xxxxxxxxx wrote:
> >
> > > +static void doe_statemachine_work(struct work_struct *work)
> > > +{
> > > +	struct delayed_work *w = to_delayed_work(work);
> > > +	struct pci_doe_mb *doe_mb = container_of(w, struct pci_doe_mb,
> > > +						 statemachine);
> > > +	struct pci_dev *pdev = doe_mb->pdev;
> > > +	int offset = doe_mb->cap_offset;
> > > +	struct pci_doe_task *task;
> > > +	u32 val;
> > > +	int rc;
> > > +
> > > +	mutex_lock(&doe_mb->task_lock);
> > > +	task = doe_mb->cur_task;
> > > +	mutex_unlock(&doe_mb->task_lock);
> >
> > Instead of a mutex, would it be better to use a rwsem here to protect
> > the state machine and allow for concurrent reads for the work callback?
> > It is a general interface and a trivial change, but not sure how much
> > performance is cared about.
>
> Actually why is this a sleeping lock at all? Afaict all critical regions
> are short and just deal with loads and stores of doe_mb->task_lock (and
> pci_doe_submit_task also checks the doe_mb->flags with the lock held).
> This could be a spinlock or similarly a rwlock.

This is a good point... My only excuse is that task_lock used to protect more
than just cur_task, so I suspect I simply kept it as a mutex after a rework at
some point without thinking more deeply about it.

Thinking about it, I don't see a benefit to a rwlock. We don't have multiple
readers.
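
To make the spinlock idea concrete, here is a rough userspace sketch, not the
patch itself. It assumes task_lock only needs to guard the cur_task pointer.
A toy test-and-set lock built on C11 atomic_flag stands in for the kernel's
spinlock_t, the structures are trimmed down to the one field that matters, and
the helpers doe_mb_init(), doe_get_cur_task() and doe_set_cur_task() are names
made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy test-and-set spinlock; a stand-in for the kernel's spinlock_t. */
struct toy_spinlock {
	atomic_flag locked;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		;	/* busy-wait; the critical regions are a few loads/stores */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Hypothetical, trimmed-down stand-ins for the structures in the patch. */
struct pci_doe_task {
	int id;
};

struct pci_doe_mb {
	struct toy_spinlock task_lock;	/* protects cur_task only */
	struct pci_doe_task *cur_task;
};

static void doe_mb_init(struct pci_doe_mb *doe_mb)
{
	atomic_flag_clear(&doe_mb->task_lock.locked);
	doe_mb->cur_task = NULL;
}

/* Snapshot cur_task; the critical region is a single load. */
static struct pci_doe_task *doe_get_cur_task(struct pci_doe_mb *doe_mb)
{
	struct pci_doe_task *task;

	toy_spin_lock(&doe_mb->task_lock);
	task = doe_mb->cur_task;
	toy_spin_unlock(&doe_mb->task_lock);
	return task;
}

/* Install a task if the mailbox is free; returns 0 on success, -1 if busy. */
static int doe_set_cur_task(struct pci_doe_mb *doe_mb,
			    struct pci_doe_task *task)
{
	int rc = -1;

	assert(task);	/* installing a NULL task would be meaningless */

	toy_spin_lock(&doe_mb->task_lock);
	if (!doe_mb->cur_task) {
		doe_mb->cur_task = task;
		rc = 0;
	}
	toy_spin_unlock(&doe_mb->task_lock);
	return rc;
}
```

Nothing inside the locked regions can sleep, which is the property that would
make a spinlock acceptable here.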

But I've just looked at this code again, and I'm not sure the exclusion is
correct with regard to the state machine. I think the state needs to be IDLE
before retire_cur_task() is called, or the state machine could be in an invalid
state when the next task runs. I think there is a bug in the DOE_WAIT_ABORT*
cases when there is no error and the mailbox is not busy. In that case there is
a race: the next task can get run while the state is still DOE_WAIT_ABORT*. In
the timeout case we will declare the mailbox dead.
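
The ordering problem can be shown with a small single-threaded model. This is
only a sketch of the invariant, not the driver code: locking is left out, the
names doe_mb_model, abort_done_buggy(), abort_done_fixed() and
start_next_task() are hypothetical, and DOE_WAIT_RESP is an assumed name for
"a task is in flight". In the model, a task that would start against a stale
state is rejected so the problem is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Model states; DOE_WAIT_ABORT stands for the whole DOE_WAIT_ABORT* family. */
enum doe_state { DOE_IDLE, DOE_WAIT_RESP, DOE_WAIT_ABORT };

struct doe_task {
	int rv;
};

struct doe_mb_model {
	enum doe_state state;
	struct doe_task *cur_task;
};

/* Suspected buggy ordering: the task is retired, the state is left alone. */
static void abort_done_buggy(struct doe_mb_model *mb)
{
	mb->cur_task = NULL;	/* a new task may be submitted from here on */
	/* mb->state is still DOE_WAIT_ABORT */
}

/* Intended ordering: return to IDLE first, then retire the task. */
static void abort_done_fixed(struct doe_mb_model *mb)
{
	assert(mb->state == DOE_WAIT_ABORT);
	mb->state = DOE_IDLE;
	mb->cur_task = NULL;
}

/* A new task may only start from IDLE; returns 0 if started, -1 otherwise. */
static int start_next_task(struct doe_mb_model *mb, struct doe_task *task)
{
	if (mb->cur_task)
		return -1;	/* mailbox still owned by another task */
	if (mb->state != DOE_IDLE)
		return -1;	/* would run against a stale state machine */
	mb->cur_task = task;
	mb->state = DOE_WAIT_RESP;
	return 0;
}
```

With the buggy ordering the mailbox looks free (cur_task is NULL) while the
state is still DOE_WAIT_ABORT; with the fixed ordering the next task always
starts from IDLE.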

I can't remember if Jonathan originally locked the state machine or the
task or both.

I think I have fixed it, but I'll look at it again in the morning.

Thanks,
Ira

>
> Thanks,
> Davidlohr