Re: Semantics of blktrace with lockdown (integrity) enabled kernel.

From: Junxiao Bi
Date: Thu Apr 06 2023 - 15:30:45 EST


On 4/6/23 11:39 AM, Paul Moore wrote:

On Thu, Apr 6, 2023 at 1:38 PM Konrad Rzeszutek Wilk
<konrad.wilk@xxxxxxxxxx> wrote:
Hey Jens, Paul, James, Nathan,

We are trying to use blktrace with a kernel that has lockdown enabled and find that it cannot run.

Specifically, what we are trying to do is pretty simple:

strace -f blktrace -d /dev/sda -w 60

[pid 148882] <... mprotect resumed>) = 0
[pid 148881] openat(AT_FDCWD, "/sys/kernel/debug/block/sda/trace0", O_RDONLY|O_NONBLOCK <unfinished ...>
[pid 148882] sched_setaffinity(0, 8, [1]) = 0
[pid 148881] <... openat resumed>) = -1 EPERM (Operation not permitted)

which fails. The analysis from Eric (CCed) is that

None of the debugfs entries exist until blktrace is run. It opens
/sys/kernel/debug/block/sda/trace0, which isn’t there normally. While the utility is running,
the entry must have write permission set for anything to be placed in it. After
blktrace exits, the entry is gone, both on a machine running with secure boot enabled
and on one with it disabled. That also indicates the write permission was set;
otherwise the entry would still be there.

The fix is simple enough (see attachment) but we are not sure about the semantics of what
lockdown has in mind.

Looking at include/linux/security.h, LOCKDOWN_TRACEFS exists, which would
imply that operations on tracefs *should* work with lockdown (integrity mode).

But at the same time, writable debugfs attributes are a no-no with lockdown.

So what is the right way forward?

What did you use as a basis for your changes? I'm looking at the
patch you sent and it appears to be making a change to a
debugfs_lockdown_whitelisted() function defined in
fs/debugfs/internal.h which does not exist in Linus' tree. If I
search through all of the archives on lore.kernel.org the only hit I
get is your email, so it seems doubtful it is in a subsystem tree
which hasn't made its way to Linus yet.

Before we go any further, can you please verify that your issue is
reproducible on a supported, upstream tree (preferably Linus')?

The attached patch is against the Oracle kernel and is only meant to demonstrate the idea of a possible fix.

Upstream will have the same issue, because blktrace uses relay files in debugfs to transfer trace information from the kernel to userspace. Those relay files have permission 0400, which is fine, but they support mmap (see struct file_operations relay_file_operations), which is against the lockdown rules. Is there any security concern with whitelisting these files in lockdown mode? Any ideas on how to fix this upstream?

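/* Upstream check in fs/debugfs/file.c, quoted for reference. */
static int debugfs_locked_down(struct inode *inode,
                   struct file *filp,
                   const struct file_operations *real_fops)
{
    /*
     * Always allowed: mode bits within 0444, not opened for write,
     * and no ioctl or mmap handlers.
     */
    if ((inode->i_mode & 07777 & ~0444) == 0 &&
        !(filp->f_mode & FMODE_WRITE) &&
        !real_fops->unlocked_ioctl &&
        !real_fops->compat_ioctl &&
        !real_fops->mmap)
        return 0;

    /* Everything else is refused while lockdown covers debugfs. */
    if (security_locked_down(LOCKDOWN_DEBUGFS))
        return -EPERM;

    return 0;
}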
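
To make the failure concrete, here is a minimal userspace model of the check above applied to a relay file. It is only a sketch under stated assumptions: struct model_file, model_debugfs_locked_down() and model_relay_file are hypothetical stand-ins and not kernel APIs, the lockdown state is passed in as a plain flag instead of calling security_locked_down(), and I am assuming the relay fops have no ioctl handlers.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few fields the kernel check looks at. */
struct model_file {
    unsigned int i_mode;     /* inode permission bits, e.g. 0400 */
    bool opened_for_write;   /* models filp->f_mode & FMODE_WRITE */
    bool has_ioctl;          /* models unlocked_ioctl / compat_ioctl being set */
    bool has_mmap;           /* models real_fops->mmap being set */
};

/*
 * Mirrors the logic of debugfs_locked_down(). The locked_down flag models
 * security_locked_down(LOCKDOWN_DEBUGFS) returning non-zero.
 */
static int model_debugfs_locked_down(const struct model_file *f,
                                     bool locked_down)
{
    /* Exempt path: read-only mode, read-only open, no ioctl, no mmap. */
    if ((f->i_mode & 07777 & ~0444u) == 0 &&
        !f->opened_for_write &&
        !f->has_ioctl &&
        !f->has_mmap)
        return 0;

    /* Anything else depends on whether lockdown is active. */
    return locked_down ? -EPERM : 0;
}

/*
 * A relay file as described above: mode 0400, opened O_RDONLY,
 * but with an mmap handler (has_ioctl = false is an assumption).
 */
static const struct model_file model_relay_file = {
    .i_mode = 0400,
    .opened_for_write = false,
    .has_ioctl = false,
    .has_mmap = true,
};
```

In this model, model_debugfs_locked_down(&model_relay_file, true) returns -EPERM, which matches the openat() failure in the strace output, while the same file with has_mmap cleared would take the exempt path and return 0. So the mmap handler alone is what pushes the relay files into the lockdown check.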

Thanks,

Junxiao.