RE: [bisected] 5.12-rc1 hpsa regression: "scsi: hpsa: Correct dev cmds outstanding for retried cmds" breaks hpsa P600

From: Don.Brace
Date: Fri Mar 05 2021 - 15:46:42 EST


-----Original Message-----
From: Arnd Bergmann [mailto:arnd@xxxxxxxxxx]
Sent: Friday, March 5, 2021 7:32 AM
Subject: Re: [bisected] 5.12-rc1 hpsa regression: "scsi: hpsa: Correct dev cmds outstanding for retried cmds" breaks hpsa P600

On Fri, Mar 5, 2021 at 10:24 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> On Fri, Mar 5, 2021 at 12:26 AM <Don.Brace@xxxxxxxxxxxxx> wrote:
> > > > On 3/2/21 11:26 PM, Sergei Trofimovich wrote:
> > struct CommandList {
> > struct CommandListHeader Header; /* 0 20 */
> > struct RequestBlock Request; /* 20 20 */
> > struct ErrDescriptor ErrDesc; /* 40 12 */
> > struct SGDescriptor SG[32]; /* 52 512 */
> > /* --- cacheline 8 boundary (512 bytes) was 52 bytes ago --- */
> > u32 busaddr; /* 564 4 */
> > struct ErrorInfo * err_info; /* 568 8 */
> > /* --- cacheline 9 boundary (576 bytes) --- */
> > struct ctlr_info * h; /* 576 8 */
> > int cmd_type; /* 584 4 */
> > long int cmdindex; /* 588 8 */
> > struct completion * waiting; /* 596 8 */
> > struct scsi_cmnd * scsi_cmd; /* 604 8 */
> > struct work_struct work; /* 612 32 */
> > /* --- cacheline 10 boundary (640 bytes) was 4 bytes ago --- */
> > struct hpsa_scsi_dev_t * phys_disk; /* 644 8 */
> > struct hpsa_scsi_dev_t * device; /* 652 8 */
> > bool retry_pending; /* 660 1 */
> > atomic_t refcount; /* 661 4 */
>
> How come this atomic_t is no longer aligned to its natural alignment?

There is a

#pragma pack(1)

in linux 203 of this file!

It looks like some of the members in struct raid_map_data and struct CommandListHeader need to be annotated as packed, but the file accidentally packs everything until the '#pragma pack()'
in line 875, including the kernel-side CommandList data structure that clearly must not be packed.

Arnd
---
Don:
Thanks for your input. It helps a lot.

The pragma setting predates my taking over the driver.

It's true that there is a section of each command entry that is DMAed from the controller (from start of the CommandList up to busaddr) and the rest is driver housekeeping information. The unsupported controllers seem to be unable to handle the changed alignment.

I have a patch I'll send up soon to change the alignment back...
int retry_pending; /* 652 4 */
struct hpsa_scsi_dev_t * device; /* 656 8 */
atomic_t refcount; /* 664 4 */

/* size: 768, cachelines: 12, members: 16 */
/* padding: 100 */
} __attribute__((__aligned__(128)));

Since this is a maintenance driver, I would rather not do too much surgery and invoke regression tests (and we no longer support these controllers). I'd rather just send up a patch to correct the issue on these legacy controllers. I have one ready to send up.

Thanks for your observation and your attention.
I'll send up the patch soon.

Don