Re: [PATCH net-next v2 2/2] cxgb4: collect hardware dump in second kernel

From: Thadeu Lima de Souza Cascardo
Date: Sat Mar 24 2018 - 18:28:14 EST


On Sat, Mar 24, 2018 at 04:26:34PM +0530, Rahul Lakkireddy wrote:
> Register callback to collect hardware/firmware dumps in second kernel
> before hardware/firmware is initialized. The dumps for each device
> will be available under /sys/kernel/crashdd/cxgb4/ directory in second
> kernel.
>
> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@xxxxxxxxxxx>
> Signed-off-by: Ganesh Goudar <ganeshgr@xxxxxxxxxxx>
> ---
> v2:
> - No Changes.
>
> Changes since rfc v2:
> - Update comments and commit message for sysfs change.
>
> rfc v2:
> - Updated dump registration to the new API in patch 1.
[...]
> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> index e880be8e3c45..265cb026f868 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> @@ -5527,6 +5527,18 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>  	if (err)
>  		goto out_free_adapter;
>
> +	if (is_kdump_kernel()) {
> +		/* Collect hardware state and append to
> +		 * /sys/kernel/crashdd/cxgb4/ directory
> +		 */
> +		err = cxgb4_cudbg_crashdd_add_dump(adapter);
> +		if (err) {
> +			dev_warn(adapter->pdev_dev,
> +				 "Fail collecting crash device dump, err: %d. Continuing\n",
> +				 err);
> +			err = 0;
> +		}
> +	}
>

The problem I see with this approach is that it requires the driver to
be built into the kdump kernel (or present as a module in the kdump
initramfs), and the device to be probed while the dumps are being
collected.

IMHO, if you are going to require the device to be probed by the same
driver during kdump, you might just as well use the device object itself
to present the crash data. I think that's what Stephen Hemminger meant
when he said to use sysfs. There is no need at all for a special
crashdd: just add an attribute or attribute group to the device object.
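
To illustrate what that would look like from the collection side in the
kdump environment, here is a rough userspace sketch. It assumes the driver
exposes the dump as a per-device sysfs attribute (presumably a binary one,
given the size of these dumps); the attribute name "crash_dump", the PCI
address and the output path are all made up for illustration and are not
part of the patch. Collecting the data is then an ordinary file copy:

```c
/* Sketch only: copy a hypothetical per-device sysfs attribute to a file.
 * Nothing here is cxgb4-specific; the attribute name and the paths in
 * the usage comment below are assumptions, not an existing interface.
 */
#include <stdio.h>

/* Copy the file at src to dst. Returns bytes copied, or -1 on error. */
static long copy_attr(const char *src, const char *dst)
{
	FILE *in, *out;
	char buf[4096];
	size_t n;
	long total = 0;

	in = fopen(src, "rb");
	if (!in)
		return -1;

	out = fopen(dst, "wb");
	if (!out) {
		fclose(in);
		return -1;
	}

	/* Read in chunks until EOF or a read error */
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		if (fwrite(buf, 1, n, out) != n) {
			total = -1;
			break;
		}
		total += (long)n;
	}

	/* fread() returning 0 can mean EOF or error; tell them apart */
	if (total >= 0 && ferror(in))
		total = -1;

	fclose(in);
	if (fclose(out) != 0)
		total = -1;

	return total;
}

/* Hypothetical usage from a kdump collection step:
 *
 *   copy_attr("/sys/bus/pci/devices/0000:03:00.4/crash_dump",
 *             "/var/crash/cxgb4-0000:03:00.4.dump");
 */
```

The point is only that no new directory under /sys/kernel is needed for
this: the data sits next to the device it belongs to.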

Otherwise, as Eric Biederman pointed out, you should just add that data
into the vmcore before you kexec, so you don't even need to look at a
different file, and the driver does not even need to be present in the
kdump kernel.

Cascardo.

>  	if (!is_t4(adapter->params.chip)) {
>  		s_qpp = (QUEUESPERPAGEPF0_S +
> --
> 2.14.1