Re: [PATCH] kconfig: Add kernel config option for fuzz testing.

From: Dmitry Vyukov
Date: Tue Dec 17 2019 - 03:53:57 EST


On Tue, Dec 17, 2019 at 9:36 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Mon, Dec 16, 2019 at 10:07 PM Tetsuo Handa
> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On 2019/12/17 5:18, Theodore Y. Ts'o wrote:
> > >> this case was too hard to blacklist, as explained at
> > >> https://lore.kernel.org/lkml/4d1a4b51-999b-63c6-5ce3-a704013cecb6@xxxxxxxxxxxxxxxxxxx/ .
> > >> syz_execute_func() can find deeper bug by executing arbitrary binary code, but
> > >> we cannot blacklist specific syscalls/arguments for syz_execute_func() testcases.
> > >> Unless we guard on the kernel side, we won't be able to re-enable syz_execute_func()
> > >> testcases.
> > >
> > > I looked at the reference, but I didn't see the explanation in the
> > > above link about why it was "too hard to blacklist". In fact, it
> > > looks like a bit earlier in the thread, Dmitry stated that adding this
> > > find of blacklist "is not hard"?
> > >
> > > https://lore.kernel.org/lkml/CACT4Y+Z_+H09iOPzSzJfs=_D=dczk22gL02FjuZ6HXO+p0kRyA@xxxxxxxxxxxxxx/
> > >
> >
> > That thread handled two bugs which disabled console output.
> >
> > The former bug (March 2019) was that fuzzer was calling syslog(SYSLOG_ACTION_CONSOLE_LEVEL) and
> > was fixed by blacklisting syslog(SYSLOG_ACTION_CONSOLE_LEVEL). This case was easy to blacklist.
> >
> > The latter bug (May 2019) was that fuzzer was calling binary code (via syz_execute_func()) which
> > emdebbed a system call that changes console loglevel. Since it was too difficult to blacklist,
> > syzkaller gave up use of syz_execute_func().
>
>
> Yes, what Tetsuo says. Only syscall numbers and top-level arguments to
> syscalls are easy to filter out. When indirect memory is passed to
> kernel or (fd,ioctl) pairs are involved it boils down to solving the
> halting problem.
>
> There are some nice benefits of doing this in some form in kernel:
> 1. It makes reported crashes more trustworthy for kernel developers.
> Currently you receive a crash and you don't know if it's a bug, or
> working as intended (as turned out with serial console). It also
> happened with syzkaller learning interesting ways to reach /dev/mem,
> which was directly prohibited, but it managed to reach it via some
> mounts and corrupted file names. Disabling the DEVMEM config in the
> end reliably solved it once and for all.
>
> 2. It avoids duplication across test systems. Doing this in each test
> system is again makes kernel testing hard and imposes duplication.
> Tomorrow somebody runs new version of trinity, triforce, afl, perf
> fuzzer and they hit the same issues, you get the same reports, and
> have the same discussions and they slowly build the same list.
>
> I like the idea of runtime knob that Andi proposed, and in fact this
> looks quite similar to lockdown needs. And in fact it already have
> LOCKDOWN_TIOCSSERIAL (does it does exactly the same as what we need?
> if not, it may be something to improve in lockdown). It seems that any
> fuzzing needs at least LOCKDOWN_INTEGRITY_MAX.
>
> Could we incorporate some of the checks in this patch into lockdown?
> FWIW we've just disabled sysrq entirely:
> https://github.com/google/syzkaller/blob/master/dashboard/config/bits-syzbot.config#L182
> because random packets over usb can trigger a panic sysrq (again
> almost impossible to reliably filter these out on fuzzer side).
>
> Not sure why lockdown prohibits parts of TIOCSSERIAL. Does it want to
> prevent memory corruption, or also prevent turning off console output
> (hiding traces)? If the latter then it may be reasonable to lockdown
> lowering of console_loglevel.
>
> But looks at some of the chunks here, it seems that we want something
> slightly orthogonal to lockdown integrity/confidentiality, which
> liveness/dos-prevention (can't accidentally bring down the whole
> machine). E.g. FIFREEZE or bad SYSRQs.
>
> There was some reason I did not enable lockdown when it was merged.
> Now looking at the list again: LOCKDOWN_DEBUGFS is a no-go for us...

+Matthew for a lockdown question
We are considering [ab]using lockdown (you knew this will happen!) for
fuzzing kernel. LOCKDOWN_DEBUGFS is a no-go for us and we may want a
few other things that may be fuzzing-specific.
The current inflexibility comes from the global ordering of levels:

if (kernel_locked_down >= level)
if (kernel_locked_down >= what) {

Is it done for performance? Or for simplicity?
If it would be more like an arbitrary bitset, then we could pick and
choose what we need and add few options on the side (not part of
integrity/confidentiality).
I see several potential ways to do it:
1. Rework the current code to allow arbitrary sets of options enabled.
2. Leave the current global ordering and checks intact and add a
separate mode that will do bitmask-like checks (e.g. under
CONFIG_LOCK_DOWN_FUZZER).
3. Or hardcode the fuzzer list statically based on config (but then we
can leave the rest of the code intact), e.g.:

#if !CONFIG_LOCK_DOWN_FUZZER
[LOCKDOWN_DEBUGFS] = "debugfs access",
#endif
#if CONFIG_LOCK_DOWN_FUZZER
[LOCKDOWN_FIFREEZE] = "FIFREEZE ioctl",
#endif

Thoughts?