Re: [PATCH v3 10/12] scsi/ncr5380: Expedite register polling
From: Finn Thain
Date: Mon Oct 10 2016 - 18:58:19 EST
On Mon, 10 Oct 2016, Russell King - ARM Linux wrote:
> On Mon, Oct 10, 2016 at 12:46:53AM -0400, Finn Thain wrote:
> > Avoid the call to NCR5380_poll_politely2() when possible. The call is
> > easily short-circuited on the PIO fast path, using the inline wrapper.
> > This requires that the NCR5380_read macro be made available before any
> > #include "NCR5380.h" so a few declarations have to be moved too.
> > Signed-off-by: Finn Thain <fthain@xxxxxxxxxxxxxxxxxxx>
> > Reviewed-by: Hannes Reinecke <hare@xxxxxxxx>
> > Tested-by: Ondrej Zary <linux@xxxxxxxxxxxxxxxxxxxx>
> > Tested-by: Michael Schmitz <schmitzmic@xxxxxxxxx>
> > ---
> > diff --git a/drivers/scsi/arm/cumana_1.c b/drivers/scsi/arm/cumana_1.c
> > index ae1d4c6..fb7600d 100644
> > --- a/drivers/scsi/arm/cumana_1.c
> > +++ b/drivers/scsi/arm/cumana_1.c
> > @@ -29,6 +29,10 @@
> > #define NCR5380_implementation_fields \
> > unsigned ctrl
> > +struct NCR5380_hostdata;
> > +static u8 cumanascsi_read(struct NCR5380_hostdata *, unsigned int);
> > +static void cumanascsi_write(struct NCR5380_hostdata *, unsigned int, u8);
> > +
> > #include "../NCR5380.h"
> > #define CTRL 0x16fc
> This seems to be non-obviously unrelated to this commit - should it be
> in some other commit?
The source of the problem here is the #include "../NCR5380.h", because
that header file both declares struct NCR5380_hostdata and (with this
patch) it also calls cumanascsi_read(). That means cumanascsi_read() has
to be declared earlier, and that in turn requires an incomplete definition
of struct NCR5380_hostdata. The commit log alludes to the problem but
could be made more explicit if you would like (?).
BTW, this problem doesn't arise in the other six 5380 drivers because each
one implements NCR5380_read() as a macro. It would be nice to get rid of
the macros in favour of an ops struct (in the course of converting the
driver to a library module) but I doubt that this is feasible now that
I've seen the performance benefit of this patch series.
In the absence of link time optimization, the library module design would
prevent the inlining of functions like cumanascsi_read(). Since every byte
transfered by the 5380 in a PIO transfer requires at least 5 chip register
accesses, function calls would be prohibitively expensive. So I think we
are stuck with inline accessors or macros.