Re: [PATCH v8 1/4] Bluetooth: Add support for hci devcoredump

From: Luiz Augusto von Dentz
Date: Mon Mar 27 2023 - 15:48:59 EST


Hi Manish, Simon,

On Sun, Mar 26, 2023 at 8:47 AM Simon Horman <simon.horman@xxxxxxxxxxxx> wrote:
>
> On Thu, Mar 23, 2023 at 02:10:15PM -0700, Manish Mandlik wrote:
> > From: Abhishek Pandit-Subedi <abhishekpandit@xxxxxxxxxxxx>
> >
> > Add devcoredump APIs to hci core so that drivers only have to provide
> > the dump skbs instead of managing the synchronization and timeouts.
> >
> > The devcoredump APIs should be used in the following manner:
> > - hci_devcoredump_init is called to allocate the dump.
> > - hci_devcoredump_append is called to append any skbs with dump data
> > OR hci_devcoredump_append_pattern is called to insert a pattern.
> > - hci_devcoredump_complete is called when all dump packets have been
> > sent OR hci_devcoredump_abort is called to indicate an error and
> > cancel an ongoing dump collection.
> >
> > The high level APIs just prepare some skbs with the appropriate data and
> > queue it for the dump to process. Packets part of the crashdump can be
> > intercepted in the driver in interrupt context and forwarded directly to
> > the devcoredump APIs.
> >
> > Internally, there are 5 states for the dump: idle, active, complete,
> > abort and timeout. A devcoredump will only be in active state after it
> > has been initialized. Once active, it accepts data to be appended,
> > patterns to be inserted (i.e. memset) and a completion event or an abort
> > event to generate a devcoredump. The timeout is initialized at the same
> > time the dump is initialized (defaulting to 10s) and will be cleared
> > either when the timeout occurs or the dump is complete or aborted.
> >
> > Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@xxxxxxxxxxxx>
> > Signed-off-by: Manish Mandlik <mmandlik@xxxxxxxxxx>
> > Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@xxxxxxxxxxxx>
>
> ...
>
> > +static int hci_devcoredump_update_hdr_state(char *buf, size_t size, int state)
> > +{
> > + int len = 0;
> > +
> > + if (!buf)
> > + return 0;
> > +
> > + len = snprintf(buf, size, "Bluetooth devcoredump\nState: %d\n", state);
>
> The snprintf documentation says:
>
> * The return value is the number of characters which would be
> * generated for the given input, excluding the trailing null,
> * as per ISO C99. If the return is greater than or equal to
> * @size, the resulting string is truncated.
>
> While the scnprintf documentation says:
>
> * The return value is the number of characters written into @buf not including
> * the trailing '\0'. If @size is == 0 the function returns 0.
>
> As the return value us used to determine how many bytes to put to
> an skb, you might want scnprintf(), or a check on the value of len here.

+1

> > +
> > + return len + 1; /* snprintf adds \0 at the end upon state rewrite */
> > +}
> > +
> > +/* Call with hci_dev_lock only. */
> > +static int hci_devcoredump_update_state(struct hci_dev *hdev, int state)
> > +{
> > + hdev->dump.state = state;
> > +
> > + return hci_devcoredump_update_hdr_state(hdev->dump.head,
> > + hdev->dump.alloc_size, state);
> > +}
>
> ...
>
> > +/* Call with hci_dev_lock only. */
> > +static int hci_devcoredump_prepare(struct hci_dev *hdev, u32 dump_size)
> > +{
> > + struct sk_buff *skb = NULL;
> > + int dump_hdr_size;
> > + int err = 0;
> > +
> > + skb = alloc_skb(MAX_DEVCOREDUMP_HDR_SIZE, GFP_ATOMIC);
> > + if (!skb) {
> > + bt_dev_err(hdev, "Failed to allocate devcoredump prepare");
>
> I don't think memory allocation errors need to be logged like this,
> as they are already logged by the core.
>
> Please run checkpatch, which flags this.

+1, looks like the CI was already causing warnings about these.

> > + return -ENOMEM;
> > + }
> > +
> > + dump_hdr_size = hci_devcoredump_mkheader(hdev, skb);
> > +
> > + if (hci_devcoredump_alloc(hdev, dump_hdr_size + dump_size)) {
> > + err = -ENOMEM;
> > + goto hdr_free;
> > + }
> > +
> > + /* Insert the device header */
> > + if (!hci_devcoredump_copy(hdev, skb->data, skb->len)) {
> > + bt_dev_err(hdev, "Failed to insert header");
> > + hci_devcoredump_free(hdev);
> > +
> > + err = -ENOMEM;
> > + goto hdr_free;
> > + }
> > +
> > +hdr_free:
> > + if (skb)
>
> It seems that this condition is always true.
> And in any case, kfree_skb can handle a NULL argument.

+1

> > + kfree_skb(skb);
> > +
> > + return err;
> > +}
>
> ...
>
> > +void hci_devcoredump_rx(struct work_struct *work)
> > +{
> > + struct hci_dev *hdev = container_of(work, struct hci_dev, dump.dump_rx);
> > + struct sk_buff *skb;
> > + struct hci_devcoredump_skb_pattern *pattern;
> > + u32 dump_size;
> > + int start_state;
> > +
> > +#define DBG_UNEXPECTED_STATE() \
> > + bt_dev_dbg(hdev, \
> > + "Unexpected packet (%d) for state (%d). ", \
> > + hci_dmp_cb(skb)->pkt_type, hdev->dump.state)
>
> nit: indentation seems excessive in above 3 lines.
>
> > +
> > + while ((skb = skb_dequeue(&hdev->dump.dump_q))) {
> > + hci_dev_lock(hdev);
> > + start_state = hdev->dump.state;
> > +
> > + switch (hci_dmp_cb(skb)->pkt_type) {
> > + case HCI_DEVCOREDUMP_PKT_INIT:
> > + if (hdev->dump.state != HCI_DEVCOREDUMP_IDLE) {
> > + DBG_UNEXPECTED_STATE();
> > + goto loop_continue;
>
> I'm probably missing something terribly obvious.
> But can the need for the loop_continue label be avoided by using 'break;' ?

Yeah, in fact I think Id use dedicated functions for each state.

> > + }
> > +
> > + if (skb->len != sizeof(dump_size)) {
> > + bt_dev_dbg(hdev, "Invalid dump init pkt");
> > + goto loop_continue;
> > + }
> > +
> > + dump_size = *((u32 *)skb->data);
> > + if (!dump_size) {
> > + bt_dev_err(hdev, "Zero size dump init pkt");
> > + goto loop_continue;
> > + }

I'd replace the code above with skb_pull_data, we could perhaps start
adding something like skb_pull_u32 to make it simpler though.

> > + if (hci_devcoredump_prepare(hdev, dump_size)) {
> > + bt_dev_err(hdev, "Failed to prepare for dump");
> > + goto loop_continue;
> > + }
> > +
> > + hci_devcoredump_update_state(hdev,
> > + HCI_DEVCOREDUMP_ACTIVE);
> > + queue_delayed_work(hdev->workqueue,
> > + &hdev->dump.dump_timeout,
> > + DEVCOREDUMP_TIMEOUT);
> > + break;
> > +
> > + case HCI_DEVCOREDUMP_PKT_SKB:
> > + if (hdev->dump.state != HCI_DEVCOREDUMP_ACTIVE) {
> > + DBG_UNEXPECTED_STATE();
> > + goto loop_continue;
> > + }
> > +
> > + if (!hci_devcoredump_copy(hdev, skb->data, skb->len))
> > + bt_dev_dbg(hdev, "Failed to insert skb");
> > + break;
> > +
> > + case HCI_DEVCOREDUMP_PKT_PATTERN:
> > + if (hdev->dump.state != HCI_DEVCOREDUMP_ACTIVE) {
> > + DBG_UNEXPECTED_STATE();
> > + goto loop_continue;
> > + }
> > +
> > + if (skb->len != sizeof(*pattern)) {
> > + bt_dev_dbg(hdev, "Invalid pattern skb");
> > + goto loop_continue;
> > + }
> > +
> > + pattern = (void *)skb->data;
> > +
> > + if (!hci_devcoredump_memset(hdev, pattern->pattern,
> > + pattern->len))
> > + bt_dev_dbg(hdev, "Failed to set pattern");
> > + break;
> > +
> > + case HCI_DEVCOREDUMP_PKT_COMPLETE:
> > + if (hdev->dump.state != HCI_DEVCOREDUMP_ACTIVE) {
> > + DBG_UNEXPECTED_STATE();
> > + goto loop_continue;
> > + }
> > +
> > + hci_devcoredump_update_state(hdev,
> > + HCI_DEVCOREDUMP_DONE);
> > + dump_size = hdev->dump.tail - hdev->dump.head;
> > +
> > + bt_dev_info(hdev,
> > + "Devcoredump complete with size %u "
> > + "(expect %zu)",
>
> I think it is best practice not to split quoted strings across multiple lines.
> Although it leads to long lines (which is undesirable)
> keeping the string on one line aids searching the code (with grep).
>
> checkpatch warns about this.

Well this should be an info to begin with and I'd probably add a
bt_dev_dbg at the beginning printing like "%s -> %s", old_state,
new_state, which makes things a lot simpler.

> > + dump_size, hdev->dump.alloc_size);
> > +
> > + dev_coredumpv(&hdev->dev, hdev->dump.head, dump_size,
> > + GFP_KERNEL);
> > + break;
> > +
> > + case HCI_DEVCOREDUMP_PKT_ABORT:
> > + if (hdev->dump.state != HCI_DEVCOREDUMP_ACTIVE) {
> > + DBG_UNEXPECTED_STATE();
> > + goto loop_continue;
> > + }
> > +
> > + hci_devcoredump_update_state(hdev,
> > + HCI_DEVCOREDUMP_ABORT);
> > + dump_size = hdev->dump.tail - hdev->dump.head;
> > +
> > + bt_dev_info(hdev,
> > + "Devcoredump aborted with size %u "
> > + "(expect %zu)",
> > + dump_size, hdev->dump.alloc_size);

Ditto, lets log the old state and new state using bt_dev_dbg.

> > + /* Emit a devcoredump with the available data */
> > + dev_coredumpv(&hdev->dev, hdev->dump.head, dump_size,
> > + GFP_KERNEL);
> > + break;
> > +
> > + default:
> > + bt_dev_dbg(hdev,
> > + "Unknown packet (%d) for state (%d). ",
> > + hci_dmp_cb(skb)->pkt_type, hdev->dump.state);
> > + break;
> > + }
> > +
> > +loop_continue:
> > + kfree_skb(skb);
> > + hci_dev_unlock(hdev);
> > +
> > + if (start_state != hdev->dump.state)
> > + hci_devcoredump_notify(hdev, hdev->dump.state);
> > +
> > + hci_dev_lock(hdev);
> > + if (hdev->dump.state == HCI_DEVCOREDUMP_DONE ||
> > + hdev->dump.state == HCI_DEVCOREDUMP_ABORT)
> > + hci_devcoredump_reset(hdev);
> > + hci_dev_unlock(hdev);

Don't think this is much better than calling hci_devcoredump_reset at
the respective state handler instead since you had to lock again, or
is this because hci_devcoredump_notifty? I'd probably document if that
is the case, otherwise I'd move it to be called by
hci_devcoredump_update_state.

> > + }
> > +}
> > +EXPORT_SYMBOL(hci_devcoredump_rx);
>
> ...
>
> > +static inline bool hci_devcoredump_enabled(struct hci_dev *hdev)
> > +{
> > + return hdev->dump.supported;
> > +}
> > +
> > +int hci_devcoredump_init(struct hci_dev *hdev, u32 dmp_size)
> > +{
> > + struct sk_buff *skb = NULL;
>
> nit: I don't think it is necessary to initialise skb here.
> Likewise elsewhere in this patch.
>
> > +
> > + if (!hci_devcoredump_enabled(hdev))
> > + return -EOPNOTSUPP;
> > +
> > + skb = alloc_skb(sizeof(dmp_size), GFP_ATOMIC);
> > + if (!skb) {
> > + bt_dev_err(hdev, "Failed to allocate devcoredump init");
> > + return -ENOMEM;
> > + }
> > +
> > + hci_dmp_cb(skb)->pkt_type = HCI_DEVCOREDUMP_PKT_INIT;
> > + skb_put_data(skb, &dmp_size, sizeof(dmp_size));
> > +
> > + skb_queue_tail(&hdev->dump.dump_q, skb);
> > + queue_work(hdev->workqueue, &hdev->dump.dump_rx);
> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL(hci_devcoredump_init);

Since it looks like we are going to need another round, could you
please use hci_devcd_ as prefix instead?


--
Luiz Augusto von Dentz