Re: [PATCH net-next v1 2/2] net: dsa: microchip: add ETS Qdisc support for KSZ9477 series

From: Vladimir Oltean
Date: Tue Mar 07 2023 - 16:11:45 EST


On Tue, Mar 07, 2023 at 08:52:50PM +0100, Oleksij Rempel wrote:
> > > One more question is, what is the actual expected behavior of mqprio if the
> > > max_rate option is used? In my case, if max_rate is set on a queue (even to
> > > the max value), then strict priority TSA will not work:
> > > queue0---max rate 100Mbit/s---\
> > >                                |---100Mbit/s---
> > > queue1---max rate 100Mbit/s---/
> > >
> > > In this example both streams will get 49Mbit/s. My expectation for strict
> > > prio is that queue1 should get 100Mbit/s and queue0 0Mbit/s.
> >
> > I don't understand this. Have you already implemented mqprio offloading
> > and this is what you observe?
>
> Ack.
>
> > max_rate is an option per traffic class. Are queue0 and queue1 mapped to
> > the same traffic class in your example, or are they not?
>
> They are separate TCs. It is not possible to assign multiple TXQs to one TC on
> KSZ.
>
> > Could you show the full command you ran?
>
> tc qdisc add dev lan2 parent root handle 100 mqprio num_tc 4 map 0 1 2 3
> queues 1@0 1@1 1@2 1@3 hw 1 mode channel shaper bw_rlimit max_rate
> 70Mbit 70Mbit 70Mbit 70Mbit
>
> lan2 is bridged with lan1 and lan3. Egress traffic on lan2 is from lan1 and lan3.
> For testing I use 2 iperf3 instances with different PCP values in the VLAN tag.
> Classification is done by HW (currently not configurable from user space)

Hmm, I still don't understand the question. First of all, you changed the
data between messages - first you talk about a max_rate of 100 Mbps, and
then you specify max_rate 70Mbit per traffic class. Possibly the link
speeds also changed between the 2 examples. What is the link speed of the
egress port in each of them?

The question is phrased as "what is the actual expected behavior" - that
one is easy: the traffic classes corresponding to the 2 TXQs are rate
limited to no more than 100 Mbps each. You will only start seeing
prioritization effects once the sum of the bandwidth consumptions
exceeds the capacity of the link.
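
To put numbers on it (assuming a 100 Mbps egress link - the thread
doesn't actually say what it is):

    TC0 shaped to 100 Mbps  \
                             +--> 200 Mbps demand > 100 Mbps link => congestion
    TC1 shaped to 100 Mbps  /

    strict priority expects:  TC1 ~100 Mbps, TC0 ~0 Mbps
    reported above:           TC0 ~49 Mbps,  TC1 ~49 Mbps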

If the question is why this doesn't happen in your case and the traffic
classes get equal bandwidths instead (assuming you do create congestion),
I don't know; I haven't seen your implementation, nor am I familiar with
the hardware. However, there are a few things I've noticed which might
help:

- the fact that you get a 50-50 bandwidth allocation sounds an awful lot
to me as if the TXQs are still operating in WRR mode and not in strict
priority mode (see the first sketch after this list).

- the KSZ9477 datasheet says that rate limiting is per port, and not per
queue, unless Switch MAC Control 5 Register bit 3 (Queue Based Egress
Rate Limit Enable) is set (also covered by the first sketch below).

- maybe you simply failed to convert the rates properly from the unit of
measurement passed by iproute2 to the unit of measurement expected by
the hardware (see the second sketch below). Here's a random comment from
the ice driver:

/* TC command takes input in K/N/Gbps or K/M/Gbit etc but
* converts the bandwidth rate limit into Bytes/s when
* passing it down to the driver. So convert input bandwidth
* from Bytes/s to Kbps
*/

("TC command" means iproute2, the conversion is in the "get_rate64()" function)

> > sorry for the quick response, need to go right now
>
> No problem. Have fun.

There wasn't anything fun to do, I had to rush to do some shopping
before the grocery stores closed.