Re: [PATCH IB/core 2/2] IB/cm: Send authentic pkey in REQ msg and check eligibility of the pkeys

From: Jason Gunthorpe
Date: Wed May 16 2018 - 10:39:27 EST


On Wed, May 16, 2018 at 08:47:21AM +0200, Håkon Bugge wrote:

> > This is not a difficult issue.
> >
> > If the GMP is properly tagged with the right PKey then it will never
> > be delivered to the VM if the VM does not have the PKey in the
> > table.
>
> Not quite right. For the shared port model, a GMP will (most
> probably) be accepted by the physical port, due to:

Sure, but I am talking about the VM's 'virtual port'.

> So, if the GMP is destined to VM1, which is a limited member, but we
> have another VM2 which is a full member of the same partition, the
> GMP will pass the HCA's PKey check.

Sure.

> > It is up to the hypervisor to block GMPs that have Pkeys not in
> > the virtual PKey table of the VF.
>
> The packet is received by the HCA and stripped from IB headers, in
> particular the BTH. How can the "hypervisor" block it when it
> doesn't have access to the GMP's BTH.PKey?

This is wrong too: in Linux the WCs for GMP packets include the pkey
index that was used to RX the packet. This is a Linux extension
matching the one on the sending side.

The hypervisor *must* block these GMPs, it is a hard requirement of
the pkey security model for VMs.
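
To make that concrete, here is a rough sketch of the kind of check I
mean. It is not taken from any driver. The struct and function names
are made up for illustration, and it assumes the hypervisor keeps, per
VF, a map from virtual pkey index to physical pkey index. The only
real input is the pkey index reported in the WC (the `pkey_index`
field of `struct ib_wc`).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical per-VF state: virtual pkey index i refers to entry
 * virt2phys[i] of the physical port's pkey table.
 */
struct vf_pkey_map {
	const uint16_t *virt2phys;
	size_t len;
};

/*
 * Return true if a GMP that was received on physical pkey index
 * wc_pkey_index may be forwarded to this VF, i.e. the VF's virtual
 * pkey table contains that physical entry.
 */
static bool vf_may_receive_gmp(const struct vf_pkey_map *vf,
			       uint16_t wc_pkey_index)
{
	for (size_t i = 0; i < vf->len; i++)
		if (vf->virt2phys[i] == wc_pkey_index)
			return true;
	return false;
}
```

This deliberately ignores full vs limited membership. It only shows
that the pkey index in the WC is enough to decide whether the GMP
belongs to a given VF at all.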

AFAIK the Mellanox drivers do this right - do you know differently?

> > The only time you could need a new REJ code is if the GMP is using a
> > PKey different from the REQ - which is a pretty goofy thing to do
> > considering this VM case.
>
> It's goofy. In the CX-3 shared port model, the BTH.PKey is the
> default one and the REQ.PKey is the full one even if the sending
> VM's port only is a limited member. This patch series fixes the last
> issue.

Again, this is wrong, the BTH.Pkey and REQ.Pkey should be the same -
please fix that first before trying to change anything else.

> > Remember the SM doesn't know what Pkeys are in the VM, so it is
> > basically impossible for the REQ side to reliably select two different
> > pkeys and know that they will both make it to the VM.
>
> The active side should use the "authentic" PKey in the REQ
> message. That is the one that would be used in BTH.PKey when
> communication has been established. This is implemented by this
> patch series.
>
> Not sure what you mean by "reliably select two different pkeys". The
> CM REQ message contains one PKey.

If BTH.Pkey != REQ.PKey then the requestor side has to obviously
select two PKeys, which is basically impossible.

The VM should not be part of the default partition, for instance.

> See above, not sure how that could be implemented. And if it is
> solved by the HCA trapping the GMP due to the PKey check, it doesn't
> help me, as the purpose of the series is to avoid (excessive) PKey
> traps sent to the SM.

It might not help you, but it shows this fix for the pkey trap issue
is wrong. You must fix the pkey traps on the sending side, not on the
responder side.

> > If I recall there were bugs here in mlx drivers, where the driver
> > sent with the wrong Pkey. I think this has actually been fixed
> > now, so please check the upstream kernel to be sure the Pkey is
> > not what it is supposed to be.
>
> Let me get back to you with some ibdumps here.

Upstream kernel (e.g. 4.16) and newish Mellanox firmware please.

And it would be fantastic if you could ID why this is happening, AFAIK
it should be OK in the kernel.

Jason