RE: [Intel-wired-lan] [PATCH iwl-net] ice: fix VSI lists confusion when adding VLANs
From: Romanowski, Rafal
Date: Fri Sep 06 2024 - 06:52:14 EST
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@xxxxxxxxxx> On Behalf Of Michal
> Swiatkowski
> Sent: Monday, September 2, 2024 12:52 PM
> To: mschmidt <mschmidt@xxxxxxxxxx>
> Cc: Drewek, Wojciech <wojciech.drewek@xxxxxxxxx>; Szycik, Marcin
> <marcin.szycik@xxxxxxxxx>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; intel-wired-
> lan@xxxxxxxxxxxxxxxx; Eric Dumazet <edumazet@xxxxxxxxxx>;
> netdev@xxxxxxxxxxxxxxx; Nguyen, Anthony L <anthony.l.nguyen@xxxxxxxxx>;
> Miskell, Timothy <timothy.miskell@xxxxxxxxx>; Ertman, David M
> <david.m.ertman@xxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Paolo Abeni
> <pabeni@xxxxxxxxxx>; David S. Miller <davem@xxxxxxxxxxxxx>; Daniel Machon
> <daniel.machon@xxxxxxxxxxxxx>
> Subject: Re: [Intel-wired-lan] [PATCH iwl-net] ice: fix VSI lists confusion when
> adding VLANs
>
> On Mon, Sep 02, 2024 at 12:06:52PM +0200, Michal Schmidt wrote:
> > The description of function ice_find_vsi_list_entry says:
> > Search VSI list map with VSI count 1
> >
> > However, since the blamed commit (see Fixes below), the function no
> > longer checks vsi_count. This causes a problem in
> > ice_add_vlan_internal, where the decision to share VSI lists between
> > filter rules relies on the vsi_count of the found existing VSI list being 1.
> >
> > The reproducing steps:
> > 1. Have a PF and two VFs.
> > There will be a filter rule for VLAN 0, refering to a VSI list
> > containing VSIs: 0 (PF), 2 (VF#0), 3 (VF#1).
> > 2. Add VLAN 1234 to VF#0.
> > ice will make the wrong decision to share the VSI list with the new
> > rule. The wrong behavior may not be immediately apparent, but it can
> > be observed with debug prints.
> > 3. Add VLAN 1234 to VF#1.
> > ice will unshare the VSI list for the VLAN 1234 rule. Due to the
> > earlier bad decision, the newly created VSI list will contain
> > VSIs 0 (PF) and 3 (VF#1), instead of expected 2 (VF#0) and 3 (VF#1).
> > 4. Try pinging a network peer over the VLAN interface on VF#0.
> > This fails.
> >
> > Reproducer script at:
> > https://gitlab.com/mschmidt2/repro/-/blob/master/RHEL-46814/test-vlan-
> > vsi-list-confusion.sh
> > Commented debug trace:
> > https://gitlab.com/mschmidt2/repro/-/blob/master/RHEL-46814/ice-vlan-v
> > si-lists-debug.txt
> > Patch adding the debug prints:
> > https://gitlab.com/mschmidt2/linux/-/commit/f8a8814623944a45091a77c609
> > 4c40bfe726bfdb
> >
> > One thing I'm not certain about is the implications for the LAG
> > feature, which is another caller of ice_find_vsi_list_entry. I don't
> > have a LAG-capable card at hand to test.
> >
> > Fixes: 25746e4f06a5 ("ice: changes to the interface with the HW and FW
> > for SRIOV_VF+LAG")
> > Signed-off-by: Michal Schmidt <mschmidt@xxxxxxxxxx>
> > ---
> > drivers/net/ethernet/intel/ice/ice_switch.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c
> > b/drivers/net/ethernet/intel/ice/ice_switch.c
> > index fe8847184cb1..4e6e7af962bd 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_switch.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_switch.c
> > @@ -3264,7 +3264,7 @@ ice_find_vsi_list_entry(struct ice_hw *hw, u8
> > recp_id, u16 vsi_handle,
> >
> > list_head = &sw->recp_list[recp_id].filt_rules;
> > list_for_each_entry(list_itr, list_head, list_entry) {
> > - if (list_itr->vsi_list_info) {
> > + if (list_itr->vsi_count == 1 && list_itr->vsi_list_info) {
> > map_info = list_itr->vsi_list_info;
> > if (test_bit(vsi_handle, map_info->vsi_map)) {
> > *vsi_list_id = map_info->vsi_list_id;
> > --
> > 2.45.2
> >
>
> Thanks, it for sure looks correct. Reusing VSI list when the rule is new
> seems like an error. I don't know why it was needed for LAG, probably
> Dave will now.
>
> You can add in the description that bug is caused because of reusing VSI
> list created for VLAN 0. All created VFs VSIs are added to VLAN 0
> filter. When none zero VLAN is created on VF which is already in VLAN 0
> (normal case) the VSI list from VLAN 0 is reused. It leads to a problem
> because all VFs (VSIs to be sepcific) that are subscribed to VLAN 0 will
> now receive a new VLAN tag traffic. This is one bug, another is the bug
> that you described. Removing filters from one VF will remove VLAN filter
> from the previous VF. It happens in case of reset of VF.
>
> For example:
> - creation of 3 VFs
> - we have VSI list (used for VLAN 0) [0 (pf), 2 (vf1), 3 (vf2), 4 (vf3)]
> - we are adding VLAN 100 on VF1, we are reusing the previous list
> because 2 is there
> - VLAN traffic works fine, but VLAN 100 tagged traffic can be received
> on all VSIs from the list (for example broadcast or unicast)
> - trust is turing on on VF2, VF2 is resetting, all filters from VF2 are
> removed; the VLAN 100 filter is also remove because 3 is on the list
> - VLAN traffic to VF1 isn't working anymore, there is a need to recreate
> VLAN interface to readd VLAN filter
>
> In summary, I don't see the use case when reusing VSI list which more
> than one VSI on it for new rule is valid scenario.
>
> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@xxxxxxxxxxxxxxx>
>
> Thanks,
> Michal
Tested-by: Rafal Romanowski <rafal.romanowski@xxxxxxxxx>