Re: [PATCH net-next v6 5/9] net: dsa: microchip: add support for different DCB app configurations

From: Vladimir Oltean
Date: Wed Apr 10 2024 - 19:13:10 EST


Hi Oleksij,

On Wed, Apr 10, 2024 at 10:05:52AM +0200, Oleksij Rempel wrote:
> diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
> index b2d1c61400c51..78a9622adecde 100644
> --- a/drivers/net/dsa/microchip/ksz_common.c
> +++ b/drivers/net/dsa/microchip/ksz_common.c
> @@ -2364,6 +2365,10 @@ static int ksz_setup(struct dsa_switch *ds)
> goto out_ptp_clock_unregister;
> }
>
> + ret = ksz_dcb_init(dev);
> + if (ret)
> + goto out_ptp_clock_unregister;
> +
> /* start switch */
> regmap_update_bits(ksz_regmap_8(dev), regs[S_START_CTRL],
> SW_START, SW_START);
> @@ -2691,7 +2696,7 @@ static int ksz_port_setup(struct dsa_switch *ds, int port)
> * there is no need to do anything.
> */
>
> - return 0;
> + return ksz_dcb_init_port(dev, port);
> }
>
> void ksz_port_stp_state_set(struct dsa_switch *ds, int port, u8 state)
> @@ -3943,6 +3948,11 @@ static const struct dsa_switch_ops ksz_switch_ops = {
> .port_setup_tc = ksz_setup_tc,
> .get_mac_eee = ksz_get_mac_eee,
> .set_mac_eee = ksz_set_mac_eee,
> + .port_get_default_prio = ksz_port_get_default_prio,
> + .port_set_default_prio = ksz_port_set_default_prio,
> + .port_get_dscp_prio = ksz_port_get_dscp_prio,
> + .port_get_apptrust = ksz_port_get_apptrust,
> + .port_set_apptrust = ksz_port_set_apptrust,
> };
>
> struct ksz_device *ksz_switch_alloc(struct device *base, void *priv)
> diff --git a/drivers/net/dsa/microchip/ksz_dcb.c b/drivers/net/dsa/microchip/ksz_dcb.c
> new file mode 100644
> index 0000000000000..d2122f844c80b
> --- /dev/null
> +++ b/drivers/net/dsa/microchip/ksz_dcb.c
> +/**
> + * ksz_init_global_dscp_map - Initializes the global DSCP-to-priority mapping
> + * @dev: Pointer to the KSZ switch device structure
> + *
> + * This function initializes the global DSCP-to-priority mapping table for the
> + * switch.
> + *
> + * Return: 0 on success, or a negative error code on failure
> + */
> +static int ksz_init_global_dscp_map(struct ksz_device *dev)
> +{
> + int reg, per_reg, ret, dscp;
> + u8 data = 0;
> + u8 mask;
> +
> + /* On KSZ9xxx variants, DSCP remapping is disabled by default.
> + * Enable to have, predictable and reproducible behavior across
> + * different devices.
> + */
> + if (!is_ksz8(dev)) {
> + ret = ksz_rmw8(dev, KSZ9477_REG_SW_MAC_TOS_CTRL,
> + KSZ9477_SW_TOS_DSCP_REMAP,
> + KSZ9477_SW_TOS_DSCP_REMAP);
> + if (ret)
> + return ret;
> + }
> +
> + ksz_get_dscp_prio_reg(dev, &reg, &per_reg, &mask);
> +
> + for (dscp = 0; dscp < DSCP_MAX; dscp++) {
> + int ipv, shift, tt;
> +
> + /* Map DSCP to Traffic Type, which is corresponding to the
> + * Internal Priority Value (IPV) in the switch.
> + */
> + if (!is_ksz8(dev)) {
> + ipv = ietf_dscp_to_ieee8021q_tt(dscp);
> + } else {
> + /* On KSZ8xxx variants we do not have IPV to queue
> + * remapping table. We need to convert DSCP to Traffic
> + * Type and then to queue.
> + */
> + tt = ietf_dscp_to_ieee8021q_tt(dscp);
> + if (tt < 0)
> + return tt;
> +
> + ipv = ieee8021q_tt_to_tc(tt, dev->info->num_tx_queues);
> + }
> +
> + if (ipv < 0)
> + return ipv;
> +
> + shift = (dscp % per_reg) * (8 / per_reg);
> + data |= (ipv & mask) << shift;
> +
> + if (dscp % per_reg == per_reg - 1) {
> + ret = ksz_write8(dev, reg + (dscp / per_reg), data);
> + if (ret)
> + return ret;
> +
> + data = 0;
> + }
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * ksz_dcb_init_port - Initializes the DCB configuration for a port on a KSZ
> + * @dev: Pointer to the KSZ switch device structure
> + * @port: Port number for which to initialize the DCB configuration
> + *
> + * This function initializes the DCB configuration for the specified port on a
> + * KSZ switch. Particular DCB configuration is set for the port, including the
> + * default priority and apptrust selectors.
> + * The default priority is set to Best Effort, and the apptrust selectors are
> + * set to all supported selectors.
> + *
> + * Return: 0 on success, or a negative error code on failure
> + */
> +int ksz_dcb_init_port(struct ksz_device *dev, int port)
> +{
> + int ret, ipv;
> +
> + if (is_ksz8(dev)) {
> + ipv = ieee8021q_tt_to_tc(IEEE8021Q_TT_BE,
> + dev->info->num_tx_queues);
> + if (ipv < 0)
> + return ipv;
> + } else {
> + ipv = IEEE8021Q_TT_BE;
> + }
> +
> + /* Set the default priority for the port to Best Effort */
> + ret = ksz_port_set_default_prio(dev->ds, port, ipv);
> + if (ret)
> + return ret;
> +
> + return ksz_port_set_apptrust(dev->ds, port, ksz_supported_apptrust,
> + ARRAY_SIZE(ksz_supported_apptrust));
> +}
> +
> +/**
> + * ksz_dcb_init - Initializes the DCB configuration for a KSZ switch
> + * @dev: Pointer to the KSZ switch device structure
> + *
> + * This function initializes the DCB configuration for a KSZ switch. The global
> + * DSCP-to-priority mapping table is initialized.
> + *
> + * Return: 0 on success, or a negative error code on failure
> + */
> +int ksz_dcb_init(struct ksz_device *dev)
> +{
> + int ret;
> +
> + ret = ksz_init_global_dscp_map(dev);
> + if (ret)
> + return ret;
> +
> + return 0;
> +}

Sorry for not responding to your previous question about this:
https://lore.kernel.org/netdev/ZfmJ-O8XMT8oO-TS@xxxxxxxxxxxxxx/
Simply put, I had a period with not a lot of free time, even for reading
emails.

I'm on the fence on whether your solution to the "global DSCP-to-prio
mapping rather than per-port" problem is the right one.

We try to avoid baking policies into the kernel, no matter how well
intended the 802.1Q and IETF RFC8325 recommendations are. They are still
just recommendations and examples, and a particular use case may want to
configure things completely differently (or as hinted in the Wi-Fi specific
RFC8325: maybe the administrator doesn't want to assign the higher
traffic classes, for network control protocols, by using IP DSCP, and
doesn't want user flows to request DSCP values that would get access to
these traffic classes. It can indeed be seen as a security concern).

I empathize with the incovenience of having to map the per-netdev dcbnl
application priority table API with a piece of hardware where that table
is shared across all ports. But yet, I don't think it is a strong enough
justification for us to make an exception and say: "yeah, ok, let's not
even implement .port_set_dscp_prio() to make the thing configurable, but
let's bake into the kernel a fixed policy that's good for everyone".

No, I think we _need_ the thing to be configurable, and not try so hard
with the ieee8021q helpers to hardcode things just right in the kernel.

Have you tried the obvious: "every time there is a change to the global
DSCP mapping table, push the change into the dcbnl app table of all user
netdevs, so that the user becomes aware of what happens"? Kernel drivers
can do that, through direct calls to dcb_ieee_setapp(). DSA does it too,
to probe the initial QoS configuration of the ports and push it to the
application priority tables.