RE: [EXT] Re: [v2,net-next, 1/2] enetc: Configure the Time-Aware Scheduler via tc-taprio offload

From: Po Liu
Date: Wed Nov 13 2019 - 23:39:22 EST


Hi Ivan,


> -----Original Message-----
> From: Ivan Khoronzhuk <ivan.khoronzhuk@xxxxxxxxxx>
> Sent: 2019å11æ13æ 21:42
> To: Po Liu <po.liu@xxxxxxx>
> Cc: Claudiu Manoil <claudiu.manoil@xxxxxxx>; davem@xxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; vinicius.gomes@xxxxxxxxx;
> Vladimir Oltean <vladimir.oltean@xxxxxxx>; Alexandru Marginean
> <alexandru.marginean@xxxxxxx>; Xiaoliang Yang
> <xiaoliang.yang_1@xxxxxxx>; Roy Zang <roy.zang@xxxxxxx>; Mingkai Hu
> <mingkai.hu@xxxxxxx>; Jerry Huang <jerry.huang@xxxxxxx>; Leo Li
> <leoyang.li@xxxxxxx>
> Subject: Re: [EXT] Re: [v2,net-next, 1/2] enetc: Configure the Time-Aware
> Scheduler via tc-taprio offload
>
> Caution: EXT Email
>
> On Wed, Nov 13, 2019 at 03:45:08AM +0000, Po Liu wrote:
> >Hi Ivan,
> >
> >> -----Original Message-----
> >> From: Ivan Khoronzhuk <ivan.khoronzhuk@xxxxxxxxxx>
> >> Sent: 2019å11æ13æ 5:10
> >> To: Po Liu <po.liu@xxxxxxx>
> >> Cc: Claudiu Manoil <claudiu.manoil@xxxxxxx>; davem@xxxxxxxxxxxxx;
> >> linux- kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx;
> >> vinicius.gomes@xxxxxxxxx; Vladimir Oltean <vladimir.oltean@xxxxxxx>;
> >> Alexandru Marginean <alexandru.marginean@xxxxxxx>; Xiaoliang Yang
> >> <xiaoliang.yang_1@xxxxxxx>; Roy Zang <roy.zang@xxxxxxx>; Mingkai Hu
> >> <mingkai.hu@xxxxxxx>; Jerry Huang <jerry.huang@xxxxxxx>; Leo Li
> >> <leoyang.li@xxxxxxx>
> >> Subject: [EXT] Re: [v2,net-next, 1/2] enetc: Configure the Time-Aware
> >> Scheduler via tc-taprio offload
> >>
> >> Caution: EXT Email
> >>
> >> Hello,
> >>
> >> On Tue, Nov 12, 2019 at 08:42:49AM +0000, Po Liu wrote:
> >> >ENETC supports in hardware for time-based egress shaping according
> >> >to IEEE 802.1Qbv. This patch implement the Qbv enablement by the
> >> >hardware offload method qdisc tc-taprio method.
> >> >Also update cbdr writeback to up level since control bd ring may
> >> >writeback data to control bd ring.
> >> >
> >> >Signed-off-by: Po Liu <Po.Liu@xxxxxxx>
> >> >Signed-off-by: Vladimir Oltean <vladimir.oltean@xxxxxxx>
> >> >Signed-off-by: Claudiu Manoil <claudiu.manoil@xxxxxxx>
> >> >---
> >> >changes:
> >> >- introduce a local define CONFIG_FSL_ENETC_QOS to fix the various
> >> > configurations will result in link errors.
> >> > Since the CONFIG_NET_SCH_TAPRIO depends on many Qos configs. Not
> >> > to use it directly in driver. Add it to CONFIG_FSL_ENETC_QOS
> >> >depends
> >> > on list, so only CONFIG_NET_SCH_TAPRIO enabled, user can enable
> >> >this
> >> > tsn feature, or else, return not support.
> >> >
> >> > drivers/net/ethernet/freescale/enetc/Kconfig | 10 ++
> >> > drivers/net/ethernet/freescale/enetc/Makefile | 1 +
> >> > drivers/net/ethernet/freescale/enetc/enetc.c | 19 ++-
> >> > drivers/net/ethernet/freescale/enetc/enetc.h | 7 +
> >> > .../net/ethernet/freescale/enetc/enetc_cbdr.c | 5 +-
> >> > .../net/ethernet/freescale/enetc/enetc_hw.h | 150 ++++++++++++++++--
> >> > .../net/ethernet/freescale/enetc/enetc_qos.c | 130 +++++++++++++++
> >> > 7 files changed, 300 insertions(+), 22 deletions(-) create mode
> >> > 100644 drivers/net/ethernet/freescale/enetc/enetc_qos.c
> >> >
> >>
> >> [...]
> >>
> >> >
> >> >@@ -1483,6 +1479,19 @@ int enetc_setup_tc(struct net_device *ndev,
> >> >enum
> >> tc_setup_type type,
> >> > return 0;
> >> > }
> >> >
> >> >+int enetc_setup_tc(struct net_device *ndev, enum tc_setup_type type,
> >> >+ void *type_data)
> >> >+{
> >> >+ switch (type) {
> >> >+ case TC_SETUP_QDISC_MQPRIO:
> >> >+ return enetc_setup_tc_mqprio(ndev, type_data);
> >> Sorry didn't see v2, so i duplicate my question here:
> >>
> >> This patch is for taprio offload, I see that mqprio is related and is
> >> part of taprio offload configuration. But taprio offload has own mqprio
> settings.
> >> The taprio mqprio part is not offloaded with TC_SETUP_QDISC_MQPRIO.
> >>
> >> So, a combination of mqprio and tario qdiscs used.
> >> Could you please share the commands were used for your setup?
> >>
> >
> >Example command:
> >tc qdisc replace dev eth0 parent root handle 100 taprio num_tc 8 map 0
> >1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 01
> >sched-entry S 02 300000 flags 0x2
>
> So, the TC_SETUP_QDISC_MQPRIO is really not required here, and mqprio qdisc
> is not used. Then why is it here, should be placed in separate patch at least.
>

Taprio is not fully copy the mqprio but refer the mqprio feature. I don't think there is
Problem for them all list in driver. Some person may not use taprio at all.

> But even the comb mqprio qdisc and taprio qdisc are used together then taprio
> requires hw offload also. I'm Ok to add it later to taprio, and I'm asking about it
> because I need it also, what ever way it could be added.
>
> Not clear how you combined mqprio qdisc and taprio now to get it working
> From this command:
>
> tc qdisc replace dev eth0 parent root handle 100 taprio num_tc 8 map 0 1 2 3
> 4
> 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 01 sched-
> entry S 02
> 300000 flags 0x2
>
> I can conclude that some default h/w mapping (one to one) were used and no
> need to use mqprio qdisc, the command above is enough.
>
> Is it true?
> Then move it to spearate patch please as it's not taprio related yet.

As mentioned, to set the queue mapping it is require the data_type include the mqprio parameter.
So here hardware set it as default. Without the settting, taprio would not say it is implement the offload.
So I donât think it is proper to seperate another patch.

> >
> >> And couple interesting questions about all of this:
> >> - The taprio qdisc has to have mqprio settings, but if it's done with
> >> mqprio then it just skipped (by reading tc class num).
> >> - If no separate mqprio qdisc configuration then mqprio conf from
> >> taprio is set, who should restore tc mappings when taprio qdisc is unloaded?
> >> Maybe there is reason to implement TC_SETUP_QDISC_MQPRIO offload in
> >> taprio since it's required feature?
> >
>
> [...]
>
> >>
> >> >+
> >> >+ /* Configure the (administrative) gate control list using the
> >> >+ * control BD descriptor.
> >> >+ */
> >> >+ gcl_config = &cbd.gcl_conf;
> >> >+
> >> >+ data_size = sizeof(struct tgs_gcl_data) + gcl_len *
> >> >+ sizeof(struct gce);
> >> >+
> >> >+ gcl_data = kzalloc(data_size, __GFP_DMA | GFP_KERNEL);
> >> >+ if (!gcl_data)
> >> >+ return -ENOMEM;
> >> >+
> >> >+ gce = (struct gce *)(gcl_data + 1);
> >> >+
> >> >+ /* Since no initial state config in taprio, set gates open as default.
> >> >+ */
> >> tc-taprio and IEEE Qbv allows to change configuration in flight, so
> >> that oper state is active till new admin start time. So, here comment
> >> says it does initial state config, if in-flight feature is not
> >> supported then error has to be returned instead of silently rewriting
> >> configuration. But if it can be implemented then state should be
> remembered/verified in order to not brake oper configuration?
> >
> >I think this is ok as per standard. Also see this comment in
> >net/sched/sch_taprio.c:
>
> From the code above (duplicate for convenience):
> if (admin_conf->enable) {
> enetc_wr(&priv->si->hw,
> ENETC_QBV_PTGCR_OFFSET,
> temp & (~ENETC_QBV_TGE));
> usleep_range(10, 20);
> enetc_wr(&priv->si->hw,
> ENETC_QBV_PTGCR_OFFSET,
> temp | ENETC_QBV_TGE);
> } else {
> enetc_wr(&priv->si->hw,
> ENETC_QBV_PTGCR_OFFSET,
> temp & (~ENETC_QBV_TGE));
> return 0;
> }
>
> I see that's not true, as Simon noted, you have same command:
>
> enetc_wr(&priv->si->hw, ENETC_QBV_PTGCR_OFFSET, temp &
> (~ENETC_QBV_TGE));
>
> before enabling and for disabling. I guess it's stop command, that is disable qbv.
> So, before enabling new configuration with enetc_setup_tc(), the tario is
> inited|cleared|reseted|rebooted|defaulted|offed but not updated. It
> inited|cleared|reseted|rebooted|defaulted|means no
> in-flight capabilities or they are ignored.

The Qbv spec do not show the disable before enable.
Here is hardware workaround to avoid the hardware run into unknow state.

>
> JFI, it's possible to do first time:
>
> tc qdisc replace dev eth0 parent root handle 100 taprio num_tc 8 map 0 1 2 3
> 4
> 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 01 sched-
> entry S 02
> 300000 flags 0x2
>
> and then:
>
> tc qdisc replace dev eth0 parent root handle 100 taprio base-time 01 sched-
> entry S 02 200000 flags 0x2
>
> Do it many times, but w/o mqprio configuration.
>
> So, this function must return error if it cannot be done, as above commands
> suppose that configuration can be updated in runtime, that is, set ADMIN cycle
> while OPER cycle is still active for some time and not broken like it's done now.
> If it can be achieved then no need to do enetc_wr(&priv->si->hw,
> ENETC_QBV_PTGCR_OFFSET, temp & (~ENETC_QBV_TGE)); when admin_conf-
> >enable.
>
> So or return error, or do it appropriately.
>
> >
> > /* Until the schedule starts, all the queues are open */ I would
> >change the comment.
> >
> >> >+ gcl_config->atc = 0xff;
> >> >+ gcl_config->acl_len = cpu_to_le16(gcl_len);
> >>
> >> Ok, this is maximum number of schedules.
> >> According to tc-taprio it's possible to set cycle period more then
> >> schedules actually can consume. If cycle time is more, then last
> >> gate's state can be kept till the end of cycle. But if last schedule
> >> has it's own interval set then gates should be closed till the end of
> >> cycle or no? if it has to be closed, then one more endl schedule
> >> should be present closing gates at the end of list for the rest cycle time. Can
> be implemented in h/w but just to be sure, how it's done in h/w?
> >>
> >There is already check the list len in up code.
> >if (admin_conf->num_entries > enetc_get_max_gcl_len(&priv->si->hw))
> > return -EINVAL;
> >gcl_len = admin_conf->num_entries;
>
> I mean +1 schedule to finalize the cycle with closed gates if last schedule has
> provided time interval. If I set couple schedules, with intervals, and cycle time
> more then those schedules consume, then I suppose the gates closed for the
> rest time of cycle, probably.
>
> Example:
>
> sched1 ----> shced2 ----> time w/o scheds -> cycle end
> gate 1 gate 2 no gates
>
> But if shced2 is last one then gate 2 is opened till the end of cycle.
> So, to close gate one more shched is needed with closed gates to finalize it.
> Or it's supposed that gate2 is opened till the end of cycle, How is it in you case.
> Or this is can be provided by configuration from tc?
>
> It's question not statement. Just though. Anyway i'll verify later it can be done
> with tc and how it's done in sw version, just interesting how your h/w works.

What user command set the gate list would set into the hardware.
We have detail setting in the user manual which you can have it try.

> >
> >> >+
> >> >+ if (!admin_conf->base_time) {
> >> >+ gcl_data->btl =
> >> >+ cpu_to_le32(enetc_rd(&priv->si->hw, ENETC_SICTR0));
> >> >+ gcl_data->bth =
> >> >+ cpu_to_le32(enetc_rd(&priv->si->hw,
> >> >+ ENETC_SICTR1));
>
> [...]
>
> --
> Regards,
> Ivan Khoronzhuk