Re: [PATCH net-next RFC v4 03/15] devlink: Add reload action stats

From: Jiri Pirko
Date: Wed Sep 16 2020 - 02:08:12 EST


Tue, Sep 15, 2020 at 10:20:39PM CEST, moshe@xxxxxxxxxx wrote:
>
>On 9/15/2020 4:33 PM, Jiri Pirko wrote:
>> Tue, Sep 15, 2020 at 02:30:19PM CEST, moshe@xxxxxxxxxx wrote:
>> > On 9/14/2020 4:39 PM, Jiri Pirko wrote:
>> > > Mon, Sep 14, 2020 at 08:07:50AM CEST, moshe@xxxxxxxxxxxx wrote:
>> [..]
>>
>>
>> > > > +/**
>> > > > + * devlink_reload_implicit_actions_performed - Update devlink on reload actions
>> > > > + * performed which are not a direct result of devlink reload call.
>> > > > + *
>> > > > + * This should be called by a driver after performing reload actions in case it was not
>> > > > + * a result of devlink reload call. For example fw_activate was performed as a result
>> > > > + * of devlink reload triggered fw_activate on another host.
>> > > > + * The motivation for this function is to keep data on reload actions performed on this
>> > > > + * function whether it was done due to direct devlink reload call or not.
>> > > > + *
>> > > > + * @devlink: devlink
>> > > > + * @limit_level: reload action limit level
>> > > > + * @actions_performed: bitmask of actions performed
>> > > > + */
>> > > > +void devlink_reload_implicit_actions_performed(struct devlink *devlink,
>> > > > + enum devlink_reload_action_limit_level limit_level,
>> > > > + unsigned long actions_performed)
>> > > What I'm a bit scarred of that the driver would call this from withing
>> > > reload_down()/up() ops. Perheps this could be WARN_ON'ed here (or in
>> > > devlink_reload())?
>> > >
>> > Not sure how I know if it was called from devlink_reload_down()/up() ? Maybe
>> > mutex ? So the warn will be actually mutex deadlock ?
>> No. Don't abuse mutex for this.
>> Just make sure that the counters do not move when you call
>> reload_down/up().
>>
>
>Can make that, but actually I better take devlink->lock anyway in this
>function to avoid races, WDYT ?

Either you need to protect some data or not. So if you do, do it.


>
>> > > > +{
>> > > > + if (!devlink_reload_supported(devlink))
>> > > Hmm. I think that the driver does not have to support the reload and can
>> > > still be reloaded by another instance and update the stats here. Why
>> > > not?
>> > >
>> > But I show counters only for supported reload actions and levels, otherwise
>> > we will have these counters on devlink dev show output for other drivers that
>> > don't have support for devlink reload and didn't implement any of these
>> > including this function and these drivers may do some actions like
>> > fw_activate in another way and don't update the stats and so that will make
>> > these stats misleading. They will show history "stats" but they don't update
>> > them as they didn't apply anything related to devlink reload.
>> The case I tried to point at is the driver instance, that does not
>> implement reload ops itself, but still it can be reloaded by someone else -
>> the other driver instance outside.
>>
>> The counters should work no matter if the driver implements reload ops
>> or not. Why wouldn't they? The user still likes to know that the devices
>> was reloaded.
>>
>
>OK, so you say that every driver should show all counters no matter what
>actions it supports and if it supports devlink reload at all, right ?

Well, as I wrote in the other email, I think that there should be 2 sets
of stats for this.


>
>>
>> > > > + return;
>> > > > + devlink_reload_action_stats_update(devlink, limit_level, actions_performed);
>> > > > +}
>> > > > +EXPORT_SYMBOL_GPL(devlink_reload_implicit_actions_performed);
>> > > > +
>> > > > static int devlink_reload(struct devlink *devlink, struct net *dest_net,
>> > > > enum devlink_reload_action action,
>> > > > enum devlink_reload_action_limit_level limit_level,
>> > > > - struct netlink_ext_ack *extack, unsigned long *actions_performed)
>> > > > + struct netlink_ext_ack *extack, unsigned long *actions_performed_out)
>> > > > {
>> > > > + unsigned long actions_performed;
>> > > > int err;
>> > > >
>> > > > if (!devlink->reload_enabled)
>> > > > @@ -2998,9 +3045,14 @@ static int devlink_reload(struct devlink *devlink, struct net *dest_net,
>> > > > if (dest_net && !net_eq(dest_net, devlink_net(devlink)))
>> > > > devlink_reload_netns_change(devlink, dest_net);
>> > > >
>> > > > - err = devlink->ops->reload_up(devlink, action, limit_level, extack, actions_performed);
>> > > > + err = devlink->ops->reload_up(devlink, action, limit_level, extack, &actions_performed);
>> > > > devlink_reload_failed_set(devlink, !!err);
>> > > > - return err;
>> > > > + if (err)
>> > > > + return err;
>> > > > + devlink_reload_action_stats_update(devlink, limit_level, actions_performed);
>> > > > + if (actions_performed_out)
>> > > Just make the caller to provide valid pointer, as I suggested in the
>> > > other patch review.
>> >
>> > Ack.
>> >
>> > > > + *actions_performed_out = actions_performed;
>> > > > + return 0;
>> > > > }
>> > > >
>> > > > static int
>> > > > --
>> > > > 2.17.1
>> > > >