Gaah, my mailer autocompleted Jens' email with an old one..
Sorry for the repeat email with the correct address.
On Thu, Sep 17, 2015 at 11:04 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
On Thu, Sep 17, 2015 at 10:40 PM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
PS: just hit another "did this just get broken in 4.3-rc1" issue - I
can't run blktrace while there's an IO load because:
$ sudo blktrace -d /dev/vdc
BLKTRACESETUP(2) /dev/vdc failed: 5/Input/output error
Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
....
[ 641.424618] blktrace: page allocation failure: order:5, mode:0x2040d0
[ 641.438933] [<ffffffff811c1569>] kmem_cache_alloc_trace+0x129/0x400
[ 641.440240] [<ffffffff811424f8>] relay_open+0x68/0x2c0
[ 641.441299] [<ffffffff8115deb1>] do_blk_trace_setup+0x191/0x2d0
(gdb) l *(relay_open+0x68)
0xffffffff811424f8 is in relay_open (kernel/relay.c:582).
577 return NULL;
578 if (subbuf_size > UINT_MAX / n_subbufs)
579 return NULL;
580
581 chan = kzalloc(sizeof(struct rchan), GFP_KERNEL);
582 if (!chan)
583 return NULL;
584
585 chan->version = RELAYFS_CHANNEL_VERSION;
586 chan->n_subbufs = n_subbufs;
and struct rchan has a member struct rchan_buf *buf[NR_CPUS];
and CONFIG_NR_CPUS=8192, hence the attempt at an order 5 allocation
that fails here....
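To make the arithmetic behind that order-5 figure concrete, here is a rough userspace sketch. It assumes 4 KiB pages and 8-byte pointers (x86-64), and the 64 bytes used for the remaining struct rchan fields is only a placeholder, not the real size. The sketch_* names are hypothetical; sketch_get_order() just mimics the rounding done by the kernel's get_order().

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for illustration only: 4 KiB pages, 8-byte pointers. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PTR_SIZE  8UL

/*
 * Smallest order such that (page size << order) >= size.
 * This mirrors the rounding of the kernel's get_order(), written out
 * as a plain loop so it can run in userspace.
 */
static unsigned int sketch_get_order(unsigned long size)
{
	unsigned int order = 0;

	assert(size > 0);
	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;
	return order;
}

/*
 * Size of a struct that embeds one pointer per possible CPU, plus
 * "other_fields" bytes standing in for everything else in the struct.
 */
static unsigned long sketch_rchan_size(unsigned long nr_cpus,
				       unsigned long other_fields)
{
	return other_fields + nr_cpus * SKETCH_PTR_SIZE;
}
```

With nr_cpus = 8192 the pointer array alone is 8192 * 8 = 65536 bytes, which is exactly 16 pages (order 4). Any additional fields push the struct past 64 KiB, so the allocation rounds up to 128 KiB, i.e. 32 contiguous pages, which is order 5. With nr_cpus = 8 the same struct fits in a single page.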
Hm. Have you always had MAX_SMP (and the NR_CPUS==8192 that it causes)?
From a quick check, none of this code seems to be new.
That said, having that
struct rchan_buf *buf[NR_CPUS];
in "struct rchan" really is something we should fix. We really should
strive to not allocate things by CONFIG_NR_CPUS, but by the actual
real CPU count.
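As a rough illustration of sizing by the real CPU count, here is a userspace analogy, not kernel code and not a proposed patch. It assumes the count is known at allocation time and uses sysconf(_SC_NPROCESSORS_CONF) as a stand-in for whatever the kernel would use; all sketch_* names are hypothetical. The point is only that the pointer array is allocated separately, so the channel struct itself stays small.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

struct sketch_buf;			/* opaque per-CPU buffer, never defined here */

struct sketch_chan {
	long nr_bufs;			/* number of entries in buf[] */
	struct sketch_buf **buf;	/* one slot per configured CPU */
};

/* Allocate a channel whose pointer array matches the configured CPU count. */
static struct sketch_chan *sketch_chan_alloc(void)
{
	long n = sysconf(_SC_NPROCESSORS_CONF);
	struct sketch_chan *chan;

	if (n < 1)
		n = 1;			/* fall back to one slot if the query fails */

	chan = calloc(1, sizeof(*chan));
	if (!chan)
		return NULL;

	chan->buf = calloc((size_t)n, sizeof(*chan->buf));
	if (!chan->buf) {
		free(chan);
		return NULL;
	}
	chan->nr_bufs = n;
	return chan;
}

/* Bounds-checked access to the slot for a given CPU index. */
static struct sketch_buf **sketch_chan_slot(struct sketch_chan *chan, long cpu)
{
	assert(cpu >= 0 && cpu < chan->nr_bufs);
	return &chan->buf[cpu];
}

static void sketch_chan_free(struct sketch_chan *chan)
{
	if (!chan)
		return;
	free(chan->buf);
	free(chan);
}
```

On an 8-CPU machine the array here is 64 bytes rather than 64 KiB, and the struct holding it no longer depends on the compile-time maximum at all.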
This looks to be mostly Jens' code, and much of it harkens back to 2006. Jens?