Re: [PATCH] IB/rxe: Don't clamp residual length to mtu

From: Doug Ledford
Date: Mon May 01 2017 - 14:44:07 EST


On Tue, 2017-04-25 at 09:29 +0200, Johannes Thumshirn wrote:
> On Thu, Apr 06, 2017 at 02:49:44PM +0200, Johannes Thumshirn wrote:
> >
> > When reading a RDMA WRITE FIRST packet we copy the DMA length from the
> > RDMA header into the qp->resp.resid variable for later use. Later in
> > check_rkey() we clamp it to the MTU if the packet is an RDMA WRITE
> > packet and has a residual length bigger than the MTU. Later in
> > write_data_in() we subtract the payload of the packet from the
> > residual length. If the packet happens to have a payload of exactly
> > the MTU size we end up with a residual length of 0 despite the packet
> > not being the last in the conversation. When the next packet in the
> > conversation arrives, we don't have any residual length left and thus
> > set the QP into an error state.
> >
> > This broke NVMe over Fabrics functionality over rdma_rxe.ko
> >
> > The patch was verified using the following test.
> >
> >  # echo eth0 > /sys/module/rdma_rxe/parameters/add
> >  # nvme connect -t rdma -a 192.168.155.101 -s 1023 -n nvmf-test
> >  # mkfs.xfs -fK /dev/nvme0n1
> >  meta-data=/dev/nvme0n1           isize=256    agcount=4, agsize=65536 blks
> >           =                       sectsz=4096  attr=2, projid32bit=1
> >           =                       crc=0        finobt=0, sparse=0
> >  data     =                       bsize=4096   blocks=262144, imaxpct=25
> >           =                       sunit=0      swidth=0 blks
> >  naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> >  log      =internal log           bsize=4096   blocks=2560, version=2
> >           =                       sectsz=4096  sunit=1 blks, lazy-count=1
> >  realtime =none                   extsz=4096   blocks=0, rtextents=0
> >  # mount /dev/nvme0n1 /tmp/
> >  [  148.923263] XFS (nvme0n1): Mounting V4 Filesystem
> >  [  148.961196] XFS (nvme0n1): Ending clean mount
> >  # dd if=/dev/urandom of=test.bin bs=1M count=128
> >  128+0 records in
> >  128+0 records out
> >  134217728 bytes (134 MB, 128 MiB) copied, 0.437991 s, 306 MB/s
> >  # sha256sum test.bin
> >  cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a  test.bin
> >  # cp test.bin /tmp/
> >  sha256sum /tmp/test.bin
> >  cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a  /tmp/test.bin
> >
> > Signed-off-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
> > Cc: Hannes Reinecke <hare@xxxxxxx>
> > Cc: Sagi Grimberg <sagi@xxxxxxxxxxx>
> > Cc: Max Gurtovoy <maxg@xxxxxxxxxxxx>
> > ---
>
> Doug anything left here? I already have an Ack from Moni. This patch is
> needed to get NVMe over Fabrics working on rxe so I'd like to see it in
> v4.12.

Nope, it's all good. I applied it today.
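
To make the failure mode from the quoted commit message concrete, here is
a minimal standalone sketch of the residual-length arithmetic. It is a toy
model, not the rxe driver code: struct model_qp, model_receive_write() and
the clamp_to_mtu flag are hypothetical names made up for illustration, and
it assumes every packet except possibly the last carries exactly one MTU
of payload.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the responder state; not the real rxe structs. */
struct model_qp {
	size_t resid;	/* bytes still expected for the current message */
	bool error;	/* set when a packet arrives with nothing left */
};

/*
 * Feed one RDMA WRITE of dma_len bytes, split into MTU-sized packets,
 * through the model. clamp_to_mtu selects the old (clamping) behaviour.
 * Returns the number of packets accepted before completion or error.
 */
static int model_receive_write(struct model_qp *qp, size_t dma_len,
			       size_t mtu, bool clamp_to_mtu)
{
	size_t received = 0;
	int accepted = 0;

	qp->error = false;
	qp->resid = dma_len;	/* FIRST packet: copy the DMA length */

	while (received < dma_len) {
		size_t left = dma_len - received;
		size_t payload = left < mtu ? left : mtu;

		/* A follow-up packet with no residual length is an error. */
		if (accepted > 0 && qp->resid == 0) {
			qp->error = true;
			break;
		}

		/* Old behaviour: clamp the residual length to the MTU. */
		if (clamp_to_mtu && qp->resid > mtu)
			qp->resid = mtu;

		/* Subtract the payload of this packet. */
		qp->resid -= payload;
		received += payload;
		accepted++;
	}

	return accepted;
}
```

In this model, a 3072-byte write with a 1024-byte MTU stops in the error
state after the first packet when the clamp is enabled, and accepts all
three packets when it is not.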

--
Doug Ledford <dledford@xxxxxxxxxx>
  GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD