From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Tue, 10 Sep 2024 11:49:20 -0700
On 9/10/2024 11:33 AM, Kuniyuki Iwashima wrote:
From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Tue, 10 Sep 2024 11:16:59 -0700
On 9/10/2024 10:57 AM, Kuniyuki Iwashima wrote:
From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Tue, 10 Sep 2024 09:55:03 -0700
On 9/9/2024 5:48 PM, Kuniyuki Iwashima wrote:
From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Mon, 9 Sep 2024 17:29:04 -0700
I have spent some more time investigating the issue. The sequence of packet
arrival and consumption definitely points to an issue with OOB handling
and I will be submitting a patch for that.
It seems a bit late.
My patches were applied a few minutes before this mail was sent.
https://lore.kernel.org/netdev/172592764315.3964840.16480083161244716649.git-patchwork-notify@xxxxxxxxxx/
That is a subpar fix. I am not sure why the maintainers accepted the fix
when it was clear that I was still looking into the issue.
Just because it's not a subpar fix and you were slow and wrong,
clinging to triggering the KASAN splat without thinking much.
Plus the
claim that it fixes the panic is absolutely wrong.
The _root_ cause of the splat is mishandling of OOB in manage_oob()
which causes UAF later in another recvmsg().
Honestly your patch is rather a subpar fix to me, few points:
1. The change conflicts with net-next as we have already removed
the additional unnecessary refcnt for OOB skb that has caused
so many issues reported by syzkaller
2. Removing OOB skb in queue_oob() relies on the unneeded refcnt
but it's not mentioned; if merge was done wrongly, another UAF
will be introduced in recvmsg()
3. Even the removing logic is completely unnecessary if manage_oob()
is changed
4. The scan_again: label is misplaced; two consecutive empty OOB skbs
never exist at the head of recvq
5. ioctl() is not fixed
6. No test added
7. Fixes: tag is bogus
8. Subject lacks target tree and af_unix prefix
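For readers unfamiliar with the feature under discussion, here is a minimal userspace sketch of the MSG_OOB send/receive path on an AF_UNIX stream socket, the path that manage_oob() and queue_oob() serve in the kernel. It is not taken from the thread, the repro, or either patch. It assumes a Linux kernel built with AF_UNIX out-of-band support; the helper name oob_roundtrip is hypothetical, and the function returns None wherever the kernel rejects MSG_OOB.

```python
import socket


def oob_roundtrip(payload: bytes = b"x"):
    """Send one out-of-band byte over an AF_UNIX stream socketpair and read it back.

    Returns the byte received with MSG_OOB, or None if the running kernel
    does not support MSG_OOB on AF_UNIX (assumption: failure surfaces as OSError).
    """
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    receiver.settimeout(1.0)  # never hang if the kernel behaves unexpectedly
    try:
        # The byte sent with MSG_OOB is queued as the OOB skb on the peer.
        sender.send(payload, socket.MSG_OOB)
        # Reading with MSG_OOB consumes that byte out of band.
        return receiver.recv(1, socket.MSG_OOB)
    except OSError:
        return None
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(oob_roundtrip())
```

This only exercises the basic round trip; it does not attempt to reproduce the splat discussed in this thread.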
If you want to nit pick, nit pick away. Just because the patch email
lacks proper formatting does not make the patch technically inferior.
Ironically you just nit picked 8.
I have no idea what you mean. I am more worried about technical
correctness than formatting -- That does not mean formatting is not
necessary.
I started pointing out technical stuff and ended with nit-pick because
"I am more worried about technical correctness", but you started nit
picking from the last point. That's unfortunate.
My
fix is a proper fix not a hack. The change in queue_oob is sufficient to
fix all issues including SIOCATMARK. The fix in manage_oob is just for
correctness.
Then, it should be WARN_ON_ONCE() not to confuse future readers.
In your fix I specifically did not like the change made to
fix SIOCATMARK.
I don't like that part either, but it's needed to avoid the additional refcnt
that is much worse as syzbot has been demonstrating.
syzbot has nothing to do with doing a proper fix.
You don't understand my point. syzbot has been finding many real issues
that were caused by poor handling of the additional refcount.
Also, removing it uncovered another bug in manage_oob(). That's reason
enough to remove the unnecessary refcnt.
One has to understand
the code though to do the fix at the proper location.
I'm not saying that the patch is correct if it silences syzbot.
Actually, I said KASAN is handy but you need not rely on it.
Rather it's you who argued the splat was needed even without trying
to understand the code.
I really don't understand why you are saying this to me now.
What is most worrying is the claim of fixing a panic when it cannot even
happen with the bug.
It's only on your setup. syzbot and I were able to trigger that with
the bug.
Really, what is so special about my setup that kasan does not like? Can
you point me to the exact location where the access is made?
I don't know, it's your job.
I am at least glad that you have backed off your assertion that my
change does not fix the ioctl.
Okay, I was wrong about that, and what about other points, fragile
refcnt, non-WARN_ON_ONCE(), misplaced label, no test, bogus tag?
I am sure if I keep pressing you, you
will back off the panic claim as well.
I also don't understand what you are saying and why you still can't
correlate the splat and the sequences of syscalls in the repro.
You yourself admitted you did not
know why KASAN was not panicking. Has anyone else hit the same panic?
If you can pinpoint the exact location where the illegal access is
made, please do so and I will accept that I am wrong. Other than that, I
am not interested in this constant back and forth with no technical
details, just fluff.
Please read my changelog (and mails) carefully; they pinpoint the
exact location and reason where/why the illegal access happens.
Yes, please don't if you do not have anything technical to say. All your
comments are "smart comments". This email thread would end if you could
just say, here is line XXXX where the skb is dereferenced, but you have
not because you have no idea.
This will be the last mail from me in this thread. I don't want to
waste time on someone who doesn't read mails.