Re: [PATCH 1/9] kernel: add a PF_FORCE_COMPAT flag

From: Al Viro
Date: Sun Sep 20 2020 - 11:03:07 EST


On Sun, Sep 20, 2020 at 03:55:47PM +0200, Arnd Bergmann wrote:
> On Sun, Sep 20, 2020 at 12:09 AM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> > On Fri, Sep 18, 2020 at 05:16:15PM +0200, Christoph Hellwig wrote:
> > > On Fri, Sep 18, 2020 at 02:58:22PM +0100, Al Viro wrote:
> > > > Said that, why not provide a variant that would take an explicit
> > > > "is it compat" argument and use it there? And have the normal
> > > > one pass in_compat_syscall() to that...
> > >
> > > That would help to not introduce a regression with this series yes.
> > > But it wouldn't fix existing bugs when io_uring is used to access
> > > read or write methods that use in_compat_syscall(). One example that
> > > I recently ran into is drivers/scsi/sg.c.
> >
> > So screw such read/write methods - don't use them with io_uring.
> > That, BTW, is one of the reasons I'm sceptical about burying the
> > decisions deep into the callchain - we don't _want_ different
> > data layouts on read/write depending upon the 32bit vs. 64bit
> > caller, let alone the pointer-chasing garbage that is /dev/sg.
>
> Would it be too late to limit what kind of file descriptors we allow
> io_uring to read/write on?
>
> If io_uring can get changed to return -EINVAL on trying to
> read/write something other than S_IFREG file descriptors,
> that particular problem space gets a lot simpler, but this
> is of course only possible if nobody actually relies on it yet.

S_IFREG is almost certainly too heavy as a restriction. Looking through
the stuff sensitive to 32bit/64bit, we seem to have
* /dev/sg - pointer-chasing horror
* sysfs files for efivar - different layouts for compat and native,
shitty userland ABI design (
struct efi_variable {
efi_char16_t VariableName[EFI_VAR_NAME_LEN/sizeof(efi_char16_t)];
efi_guid_t VendorGuid;
unsigned long DataSize;
__u8 Data[1024];
efi_status_t Status;
__u32 Attributes;
} __attribute__((packed));
) is the piece of crap in question; 'DataSize' is where the headache comes
from. Regular files, BTW...
* uhid - character device, milder pointer-chasing horror. Trouble
comes from this:
/* Obsolete! Use UHID_CREATE2. */
struct uhid_create_req {
__u8 name[128];
__u8 phys[64];
__u8 uniq[64];
__u8 __user *rd_data;
__u16 rd_size;

__u16 bus;
__u32 vendor;
__u32 product;
__u32 version;
__u32 country;
} __attribute__((__packed__));
and suggested replacement doesn't do any pointer-chasing (rd_data is an
embedded array in the end of struct uhid_create2_req).
* evdev, uinput - bitness-sensitive layout, due to timestamps
* /proc/bus/input/devices - weird crap with printing bitmap, different
_text_ layouts seen by 32bit and 64bit readers. Binary structures are PITA,
but with sufficient effort you can screw the text just as hard... Oh, and it's
a regular file.
* similar in sysfs analogue

And AFAICS, that's it for read/write-related method instances.