RFC: Use of devlink/health report for non-Ethernet devices

From: Ray Jui
Date: Mon Feb 03 2020 - 18:01:50 EST


Hi Jiri/Eran/David,

I've been investigating the health report feature of devlink and have a couple of related questions:

1. Based on my investigation, the devlink health report mechanism provides hooks for a device driver to report errors, dump debug information, trigger an object dump, initiate self-recovery, etc. The current users of health report are all Ethernet-based drivers. However, the health report framework does not seem to prohibit use by non-Ethernet device drivers. Is my understanding correct?

2. Following from my first question, do you think it makes sense to use devlink health report as a generic error reporting and recovery mechanism for other devices, e.g., NVMe and virtio?

3. In the Ethernet device driver use case, if one has a "smart NIC" type of platform, i.e., one running Linux on the NIC's embedded processor, it seems to make a lot of sense to also use devlink health report to handle non-Ethernet-specific errors originating from the embedded Linux (or any other OS). The front-end driver that registers the various health reporters would still be an Ethernet-based device driver running on the host server system. Does this make sense to you?

Thanks in advance for your feedback!

Thanks,

Ray