Re: mb2q experience and couple issues

From: Thomas Gleixner
Date: Thu Oct 01 2020 - 05:13:05 EST


Alexei,

On Wed, Sep 30 2020 at 11:12, Alexei Starovoitov wrote:
> For the last couple years we've been using mb2q tool to normalize patches
> and it worked wonderfully.

Fun. I thought I was the only user of it :)

> Recently we've hit few bugs:
> curl -s https://patchwork.kernel.org/patch/11807443/mbox/ >
> /tmp/mbox.i; ~/bin/mb2q --mboxout mbox.o /tmp/mbox.i
> Drop Message w/o Message-ID: No subject
> No patches found in mbox
>
> I've tried to debug it, but couldn't figure out what's going on.
> The subject and message-id fields are parsed correctly,
> but later something happens.
> Could you please take a look?

The problem is the mbox storage format. The mbox created by curl contains
a mail whose body has a line starting with 'From':

From the VAR btf_id, the verifier can also read the address of the
ksym's corresponding kernel var from kallsyms and use that to fill
dst_reg.

The mailbox parser trips over that From and takes it as the start of the
next message.

http://qmail.org/qmail-manual-html/man5/mbox.html

Usually mailbox storage escapes a From at the start of a line with '>':

>From the VAR btf_id, the verifier can also read the address of the
ksym's corresponding kernel var from kallsyms and use that to fill
dst_reg.
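
To illustrate the failure mode and the '>' escaping, here is a minimal
sketch. It is not mb2q's parser: split_mbox_naive() and
escape_from_lines() are hypothetical helpers, and the separator line and
sender address are made up for the example. The splitter simply treats
every line starting with 'From ' as a message boundary, which is roughly
how the body line above gets mistaken for a new mail.

```python
import re


def split_mbox_naive(text):
    """Split mbox text at every line that starts with 'From '.

    Hypothetical helper: deliberately naive, to show the problem.
    """
    messages, current = [], []
    for line in text.splitlines(keepends=True):
        # A 'From ' line after some content is taken as the next message.
        if line.startswith("From ") and current:
            messages.append("".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("".join(current))
    return messages


def escape_from_lines(body):
    """Prefix body lines starting with 'From ' with '>' (hypothetical helper)."""
    return re.sub(r"^From ", ">From ", body, flags=re.MULTILINE)


# Made-up separator line and headers, followed by a body with a 'From' line.
header = "From sender@example.org Thu Oct  1 05:13:05 2020\nSubject: test\n\n"
body = (
    "Some text.\n"
    "From the VAR btf_id, the verifier can also read the address of the\n"
    "ksym's corresponding kernel var from kallsyms.\n"
)

raw = header + body
escaped = header + escape_from_lines(body)

raw_count = len(split_mbox_naive(raw))          # body line splits the mail
escaped_count = len(split_mbox_naive(escaped))  # mail stays in one piece
```

With the unescaped body the single mail is cut in two, and the second
half has neither Subject nor Message-ID. With the escaped body it stays
one message.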

Yes, it's ugly and I haven't figured out a proper way to deal with
that. There are quite a few mbox formats out there; they are all
incompatible with each other and each of them has its own horrors.

Let me think about it.

> Another issue we've hit was that some mailers split message-id
> into few lines like this:
> curl -s https://patchwork.kernel.org/patch/11809399/mbox/|grep -2 Message-Id:
> Subject: [PATCH bpf-next v4 1/6] bpf: add classid helper only based on skb->sk
> Date: Wed, 30 Sep 2020 17:18:15 +0200
> Message-Id:
> <ed633cf27a1c620e901c5aa99ebdefb028dce600.1601477936.git.daniel@xxxxxxxxxxxxx>
> X-Mailer: git-send-email 2.21.0
>
> That was an easy fix:
> - mid = pmsg.msgid.lstrip('<').rstrip('>')
> + mid = pmsg.msgid.lstrip('\n').lstrip(' ').lstrip('<').rstrip('>')
>
> The tglx/quilttools.git doesn't have this fix, so I'm guessing you
> haven't seen it yet.

Indeed, but it should just be:

+ mid = pmsg.msgid.strip().lstrip('<').rstrip('>')
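
A short sketch of why plain strip() is the more robust choice, assuming
the parsed value of a folded header arrives with its leading line break
and indentation intact. The function names and the sample Message-Id
values are hypothetical, not taken from mb2q:

```python
def msgid_chained(raw):
    """The chained variant: removes only a leading '\\n' and then spaces."""
    return raw.lstrip('\n').lstrip(' ').lstrip('<').rstrip('>')


def msgid_strip(raw):
    """The strip() variant: removes any surrounding whitespace first."""
    return raw.strip().lstrip('<').rstrip('>')


# Folded header value: line break plus a space before the actual id.
folded_space = "\n <abc123@example.org>"
# Folding whitespace may also be a tab, and values can carry a trailing newline.
folded_tab = "\n\t<abc123@example.org>"
trailing_nl = "<abc123@example.org>\n"

ok_space = msgid_chained(folded_space)   # both variants agree here
bad_tab = msgid_chained(folded_tab)      # tab survives, '<' is not stripped
bad_trail = msgid_chained(trailing_nl)   # newline blocks rstrip('>')

good = [msgid_strip(v) for v in (folded_space, folded_tab, trailing_nl)]
```

The chained lstrip() calls handle only the exact "newline, then spaces"
layout, while strip() covers tabs, CRLF and trailing whitespace in one
step.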

Thanks,

tglx