Re: [syzbot] [net?] KASAN: slab-use-after-free Read in unix_stream_read_actor (2)

From: Kuniyuki Iwashima
Date: Tue Sep 10 2024 - 15:49:55 EST


From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
Date: Tue, 10 Sep 2024 11:49:20 -0700
> On 9/10/2024 11:33 AM, Kuniyuki Iwashima wrote:
> > From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
> > Date: Tue, 10 Sep 2024 11:16:59 -0700
> >> On 9/10/2024 10:57 AM, Kuniyuki Iwashima wrote:
> >>> From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
> >>> Date: Tue, 10 Sep 2024 09:55:03 -0700
> >>>> On 9/9/2024 5:48 PM, Kuniyuki Iwashima wrote:
> >>>>> From: Shoaib Rao <rao.shoaib@xxxxxxxxxx>
> >>>>> Date: Mon, 9 Sep 2024 17:29:04 -0700
> >>>>>> I have some more time investigating the issue. The sequence of packet
> >>>>>> arrival and consumption definitely points to an issue with OOB handling
> >>>>>> and I will be submitting a patch for that.
> >>>>>
> >>>>> It seems a bit late.
> >>>>> My patches were applied few minutes before this mail was sent.
> >>>>> https://urldefense.com/v3/__https://lore.kernel.org/netdev/172592764315.3964840.16480083161244716649.git-patchwork-notify@xxxxxxxxxx/__;!!ACWV5N9M2RV99hQ!M806VrqNEGFgGXEoWG85msKAdFPXup7RzHy9Kt4q_HOfpPWsjNHn75KyFK3a3jWvOb9EEQuFGOjpqgk$
> >>>>>
> >>>>
> >>>> That is a subpar fix. I am not sure why the maintainers accepted the fix
> >>>> when it was clear that I was still looking into the issue.
> >>>
> >>> Just because it's not a subpar fix and you were slow and wrong,
> >>> clining to triggering the KASAN splat without thinking much.
> >>>
> >>>
> >>>> Plus the
> >>>> claim that it fixes the panic is absolutely wrong.
> >>>
> >>> The _root_ cause of the splat is mishandling of OOB in manage_oob()
> >>> which causes UAF later in another recvmsg().
> >>>
> >>> Honestly your patch is rather a subpar fix to me, few points:
> >>>
> >>> 1. The change conflicts with net-next as we have already removed
> >>> the additional unnecessary refcnt for OOB skb that has caused
> >>> so many issue reported by syzkaller
> >>>
> >>> 2. Removing OOB skb in queue_oob() relies on the unneeded refcnt
> >>> but it's not mentioned; if merge was done wrongly, another UAF
> >>> will be introduced in recvmsg()
> >>>
> >>> 3. Even the removing logic is completely unnecessary if manage_oob()
> >>> is changed
> >>>
> >>> 4. The scan_again: label is misplaced; two consecutive empty OOB skbs
> >>> never exist at the head of recvq
> >>>
> >>> 5. ioctl() is not fixed
> >>>
> >>> 6. No test added
> >>>
> >>> 7. Fixes: tag is bogus
> >>>
> >>> 8. Subject lacks target tree and af_unix prefix
> >>
> >> If you want to nit pick, nit pick away, Just because the patch email
> >> lacks proper formatting does not make the patch technically inferior.
> >
> > Ironically you just nit picked 8.
> >
> >
>
> I have no idea what you mean. I am more worried about technical
> correctness than formatting -- That does not mean formatting is not
> necessary.

I started pointing out technical stuff and ended with nit-pick because
"I am more worried about technical correctness", but you started nit
picking from the last point. That's unfortunate.


>
> >> My
> >> fix is a proper fix not a hack. The change in queue_oob is sufficient to
> >> fix all issues including SIOCATMARK. The fix in manage_oob is just for
> >> correctness.
> >
> > Then, it should be WARN_ON_ONCE() not to confuse future readers.
> >
> >
> >> In your fix I specifically did not like the change made to
> >> fix SIOCATMARK.
> >
> > I don't like that part too, but it's needed to avoid the additional refcnt
> > that is much worse as syzbot has been demonstrating.
> >
>
> syzbot has nothing to do with doing a proper fix.

You don't understand my point. syzbot has been finding many real issues
that were caused by poor handling of the additional refcount.

Also, removing it discovered another bug in manage_oob(). That's a enough
reason to explain why we should remove the unnecessary refcnt.


> One has to understand
> the code though to do the fix at the proper location.

I'm not saying that the patch is correct if it silences syzbot.

Actually, I said KASAN is handy but you need not rely on it.

Rather it's you who argued the splat was needed even without trying
to understand the code.

I really don't understand why you are saying this to me now.


>
> >
> >>
> >> What is most worrying is claim to fixing a panic when it can not even
> >> happen with the bug.
> >
> > It's only on your setup. syzbot and I were able to trigger that with
> > the bug.
> >
>
> Really, what is so special about my setup that kasan does not like? Can
> you point me to the exact location where the access is made?

I don't know, it's your job.

>
> I am at least glad that you have backed off your assertion that my
> change does not fix the ioctl.

Okay, I was wrong about that, and what about other points, fragile
refcnt, non-WARN_ON_ONCE(), misplaced label, no test, bogus tag ?


> I am sure if I keep pressing you, you
> will back off the panic claim as well.

I also don't understand what you are saying and why you still can't
correlate the splat and the sequences of syscalls in the repro.


> You yourself admitted you did not
> know why kasan was not panicing, Has anyone else hit the same panic?
>
> If you can pin point the exact location where the illegal access is
> made, please do so and I will accept that I am wrong, other than that I
> am not interested in this constant back and forth with no technical
> details just fluff.

Please read my changelog (and mails) carefully that pin-point the
exact location and reason where/why the illegal access happens.

This will be the last mail from me in this thread. I don't want to
waste time on someone who doesn't read mails.