Re: [RFC 2/2] procfs: /proc/sched_debug fails on very very large machines.

From: Dave Jones
Date: Tue Nov 06 2012 - 18:49:44 EST


On Tue, Nov 06, 2012 at 05:24:15PM -0600, Nathan Zimmer wrote:
> On Tue, Nov 06, 2012 at 04:31:28PM -0500, Dave Jones wrote:
> > On Tue, Nov 06, 2012 at 03:02:21PM -0600, Nathan Zimmer wrote:
> > > On systems with 4096 cores, attempting to read /proc/sched_debug fails.
> > > We are trying to push all the data into a single kmalloc buffer.
> > > The issue is that on these very large machines the data will not fit in 4 MB.
> > >
> > > A better solution is to not use the single_open mechanism but to provide
> > > our own seq_operations and treat each cpu as an individual record.
> >
> > Good timing.
> >
> > This looks like it would solve the problem I just reported here:
> > https://lkml.org/lkml/2012/11/6/390
> >
> > That happens even on an 8-way, so it's not just niche machines that have
> > this problem.
>
> Glad to help. I hadn't thought of memory-tight situations, but it does make sense
> that it helps, as it can get by with a 4k allocation vs. grabbing successively
> larger chunks.
>
> If you have seen similar issues with your fuzz testing let me know where and
> I'll take a look.

I think /proc/timer_list could probably use the same treatment.
I had traces showing it using 64k allocations too, but I think I may have
just bricked my testbox.

Dave
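
For readers following the thread, the per-CPU record idea quoted above can be
sketched in userspace C. This is only a model under stated assumptions, not the
patch itself and not kernel code: the struct model_seq_ops interface, the
model_* and cpu_* names, the 4096-byte buffer, and the record text are all
hypothetical stand-ins chosen for illustration. It shows one fixed buffer being
reused for each record, so the total output can far exceed the buffer size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed size of the single reusable buffer (illustrative only). */
#define MODEL_PAGE_SIZE ((size_t)4096)

/* Hypothetical iterator interface, loosely shaped like seq_operations. */
struct model_seq_ops {
    int (*start)(int ncpus);                     /* first record, or -1 */
    int (*next)(int cpu, int ncpus);             /* next record, or -1 */
    int (*show)(char *buf, size_t cap, int cpu); /* format one record */
};

static int cpu_start(int ncpus)
{
    return ncpus > 0 ? 0 : -1;
}

static int cpu_next(int cpu, int ncpus)
{
    return cpu + 1 < ncpus ? cpu + 1 : -1;
}

/* Placeholder record text; the real file prints far more per CPU. */
static int cpu_show(char *buf, size_t cap, int cpu)
{
    return snprintf(buf, cap, "cpu#%d\n  .nr_running : %d\n", cpu, 0);
}

const struct model_seq_ops cpu_ops = { cpu_start, cpu_next, cpu_show };

struct model_result {
    size_t total_bytes; /* bytes emitted across all records */
    size_t peak_record; /* largest single record seen */
    int records;        /* number of records emitted */
};

/* Walk every record, reusing one page-sized buffer for each of them. */
struct model_result model_read_all(const struct model_seq_ops *ops, int ncpus)
{
    char page[MODEL_PAGE_SIZE];
    struct model_result r = { 0, 0, 0 };
    int cpu;

    for (cpu = ops->start(ncpus); cpu >= 0; cpu = ops->next(cpu, ncpus)) {
        int n = ops->show(page, sizeof(page), cpu);

        /* In this model each record must fit in the buffer on its own. */
        assert(n >= 0 && (size_t)n < sizeof(page));

        r.total_bytes += (size_t)n;
        if ((size_t)n > r.peak_record)
            r.peak_record = (size_t)n;
        r.records++;
    }
    return r;
}
```

Calling model_read_all(&cpu_ops, 4096) walks 4096 records through the same
buffer. In the single-record approach the whole output would instead have to
fit in one allocation at once.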
