Re: [PATCH] io_uring: avoid page allocation warnings
From: Mark Rutland
Date: Wed May 01 2019 - 11:09:42 EST
On Wed, May 01, 2019 at 06:41:43AM -0600, Jens Axboe wrote:
> On 5/1/19 4:30 AM, Mark Rutland wrote:
> > On Tue, Apr 30, 2019 at 12:11:59PM -0600, Jens Axboe wrote:
> >> On 4/30/19 11:03 AM, Mark Rutland wrote:
> >>> I've just had a go at that, but when using kvmalloc() with or without
> >>> GFP_KERNEL_ACCOUNT I hit OOM and my system hangs within a few seconds with the
> >>> syzkaller prog below:
> >>>
> >>> ----
> >>> Syzkaller reproducer:
> >>> # {Threaded:false Collide:false Repeat:false RepeatTimes:0 Procs:1 Sandbox: Fault:false FaultCall:-1 FaultNth:0 EnableTun:false EnableNetDev:false EnableNetReset:false EnableCgroups:false EnableBinfmtMisc:false EnableCloseFds:false UseTmpDir:false HandleSegv:false Repro:false Trace:false}
> >>> r0 = io_uring_setup(0x378, &(0x7f00000000c0))
> >>> sendmsg$SEG6_CMD_SET_TUNSRC(0xffffffffffffffff, &(0x7f0000000240)={&(0x7f0000000000)={0x10, 0x0, 0x0, 0x40000000}, 0xc, 0x0, 0x1, 0x0, 0x0, 0x10}, 0x800)
> >>> io_uring_register$IORING_REGISTER_BUFFERS(r0, 0x0, &(0x7f0000000000), 0x1)
> >>> ----
> >>>
> >>> ... I'm a bit worried that opens up a trivial DoS.
> >>>
> >>> Thoughts?
> >>
> >> Can you post the patch you used?
> >
> > Diff below.
>
> And the reproducer, that was never posted.
It was; the "Syzkaller reproducer" above is the reproducer I used with
syz-repro.
I've manually minimized that to C below. AFAICT, that hits a leak, which
is what's triggering the OOM after the program is run a number of times
with the previously posted kvmalloc patch.
Per /proc/meminfo, that memory isn't accounted anywhere.
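For anyone wanting to repeat the /proc/meminfo comparison, a minimal
sketch of a helper that pulls one field out of meminfo-formatted text is
below. The helper name meminfo_field_kb() is hypothetical, and which
fields to compare before and after running the reproducer (e.g. MemFree,
MemAvailable, Slab) is an assumption rather than something stated in
this thread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan meminfo-formatted text ("Key:   value kB")
 * from the current position of f and return the value for key, or -1
 * if the key is not found. The caller must rewind(f) between lookups.
 */
static long meminfo_field_kb(FILE *f, const char *key)
{
	char line[256];
	size_t klen = strlen(key);

	while (fgets(line, sizeof(line), f)) {
		long val;

		/* Require an exact key match followed by ':' */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':' &&
		    sscanf(line + klen + 1, "%ld", &val) == 1)
			return val;
	}
	return -1;
}
```

Opening /proc/meminfo with fopen(), calling this for the fields of
interest, and diffing the values across runs of the reproducer should
show whether the lost memory turns up under any of them.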
> Patch looks fine to me. Note
> that buffer registration is under the protection of RLIMIT_MEMLOCK.
> That's usually very limited for non-root, as root you can of course
> consume as much as you want and OOM the system.
Sure.
As above, it looks like there's a leak, regardless.
Thanks,
Mark.
---->8----
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <linux/uio.h>
// NOTE: arm64 syscall numbers
#ifndef __NR_io_uring_register
#define __NR_io_uring_register 427
#endif
#ifndef __NR_io_uring_setup
#define __NR_io_uring_setup 425
#endif
#define __IORING_REGISTER_BUFFERS 0
struct __io_sqring_offsets {
	uint32_t head;
	uint32_t tail;
	uint32_t ring_mask;
	uint32_t ring_entries;
	uint32_t flags;
	uint32_t dropped;
	uint32_t array;
	uint32_t resv1;
	uint64_t resv2;
};
struct __io_uring_params {
	uint32_t sq_entries;
	uint32_t cq_entries;
	uint32_t flags;
	uint32_t sq_thread_cpu;
	uint32_t sq_thread_idle;
	uint32_t resv[5];
	struct __io_sqring_offsets sq_off;
	struct __io_sqring_offsets cq_off;
};
static struct __io_uring_params params;
static struct iovec iov = {
	.iov_base = (void *)0x10,
	.iov_len = 1024 * 1024 * 1024,
};
int main(void)
{
	int fd;

	fd = syscall(__NR_io_uring_setup, 0x1, &params);
	syscall(__NR_io_uring_register, fd, __IORING_REGISTER_BUFFERS, &iov, 1);
	return 0;
}
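
On the RLIMIT_MEMLOCK point quoted above, a minimal userspace sketch for
checking whether a buffer of a given length would fit under the caller's
soft limit follows. The helper name fits_memlock() is hypothetical, and
this is only an approximation: it ignores anything already locked and
does not model how the kernel rounds or accounts the registration.

```c
#include <stdint.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: returns 1 if len bytes fit within the soft
 * RLIMIT_MEMLOCK of the calling process, 0 if they do not, and -1 if
 * the limit cannot be read. Existing locked memory is not considered.
 */
static int fits_memlock(uint64_t len)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;
	return len <= (uint64_t)rl.rlim_cur;
}
```

Calling fits_memlock(iov.iov_len) before the register call gives a quick
indication of whether the 1GiB iovec above would be expected to be
refused for an unprivileged user.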