Re: [PATCH v3 0/8] kernel: taint when the driver firmware crashes

From: Ben Greear
Date: Thu May 28 2020 - 11:05:10 EST

On 05/28/2020 07:27 AM, Luis Chamberlain wrote:
> On Wed, May 27, 2020 at 02:36:42PM -0700, Jakub Kicinski wrote:
>> On Wed, 27 May 2020 03:19:18 +0000 Luis Chamberlain wrote:
>>> I read your patch, and granted, I will accept I was under the incorrect
>>> assumption that this can only be used by networking devices; however,
>>> the devlink approach gives userspace the ability, with the
>>> iproute2 devlink util, to query a device's health, onto which we can peg
>>> firmware health. But *this* patch series is not about health status and
>>> letting users query it, it's about a *critical* situation which has come up
>>> with firmware requiring me to reboot my system, and the lack of *any*
>>> infrastructure in the kernel today to inform userspace about it.
>>>
>>> So say we use netlink to report a critical health situation, how are we
>>> informing userspace with your patch series about requiring a reboot?
>>
>> One of the main features of netlink is its pub/sub model of notifications.
>>
>> Whatever you imagine listening to your uevent can listen to
>> devlink-health notifications via devlink.
>>
>> In fact I've shown this off in the RFC patches I sent to you, see
>> the devlink mon health command being used.
>
> Yes, but I looked at iproute2 devlink and it seems I made an incorrect
> assumption that this can only be used for a network device rather than
> a struct device.
>
> I'll take a second look.

Hello Jakub,

I'm thinking about something similar to what Luis is proposing, but in
my case I'd like to report just when the driver knows the hardware is gone
and cannot be recovered, like when this is reported:

[ 2548.851832] WARNING: CPU: 3 PID: 98 at backports-4.19.98-1/net/mac80211/util.c:2040 ieee80211_reconfig+0x98/0xb64 [mac80211]
[ 2548.856020] Hardware became unavailable during restart.

I'd like to tie this into a watchdog program so that, for instance, the system
can be rebooted automatically soon after this event is seen.
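
To make the watchdog idea concrete, here is a rough sketch of a stopgap that scrapes the kernel log for the message above. It is not the devlink or uevent mechanism discussed in this thread. The function names are made up for illustration, and it assumes util-linux `dmesg --follow` and a systemd-based system for the reboot.

```python
import re
import subprocess
from typing import Callable, Iterable

# Message text taken from the mac80211 warning quoted above.
FATAL_PATTERN = re.compile(r"Hardware became unavailable during restart")


def is_hardware_gone(line: str) -> bool:
    """Return True if a kernel log line reports unrecoverable hardware."""
    return FATAL_PATTERN.search(line) is not None


def watch(lines: Iterable[str], on_fatal: Callable[[str], None]) -> bool:
    """Scan log lines; call on_fatal on the first fatal line and stop.

    Returns True if a fatal line was seen, False if the input ended first.
    """
    for line in lines:
        if is_hardware_gone(line):
            on_fatal(line.rstrip("\n"))
            return True
    return False


def request_reboot(line: str) -> None:
    """Hypothetical reaction: reboot (assumes systemd is in use)."""
    print("fatal hardware event seen:", line)
    subprocess.run(["systemctl", "reboot"], check=False)


def main() -> None:
    # Assumption: util-linux dmesg, whose --follow option keeps printing
    # new kernel messages as they arrive.
    proc = subprocess.Popen(
        ["dmesg", "--follow"], stdout=subprocess.PIPE, text=True
    )
    try:
        watch(proc.stdout, request_reboot)
    finally:
        proc.terminate()
```

Calling `main()` as root starts the watcher. Matching on log text is fragile, since the wording can change between kernel versions, which is part of why a proper notification interface would be preferable.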

Could you post your devlink RFC patches somewhere public?

Thanks,
Ben


--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com