Re: [PATCH] vmalloc: add warning in __vmalloc

From: Steven Whitehouse
Date: Thu May 03 2012 - 09:50:28 EST


Hi,

On Thu, 2012-05-03 at 16:30 +1000, Nick Piggin wrote:
> On 3 May 2012 15:46, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > On Thu, 3 May 2012, Minchan Kim wrote:
> >> On 05/03/2012 04:46 AM, Andrew Morton wrote:
> >> > Well. What are we actually doing here? Causing the kernel to spew a
> >> > warning due to known-buggy callsites, so that users will report the
> >> > warnings, eventually goading maintainers into fixing their stuff.
> >> >
> >> > This isn't very efficient :(
> >>
> >>
> >> Yes. I hope maintainers fix it before merging this.
> >>
> >> >
> >> > It would be better to fix that stuff first, then add the warning to
> >> > prevent reoccurrences. Yes, maintainers are very naughty and probably
> >> > do need cattle prods^W^W warnings to motivate them to fix stuff, but we
> >> > should first make an effort to get these things fixed without
> >> > irritating and alarming our users.
> >> >
> >> > Where are these offending callsites?
> >
> > Okay, maybe this is a stupid question, but: if an fs can't call vmalloc
> > with GFP_NOFS without risking deadlock, calling with GFP_KERNEL instead
> > doesn't fix anything (besides being more honest). This really means that
> > vmalloc is effectively off-limits for file systems in any
> > writeback-related path, right?
>
> Anywhere it cannot reenter the filesystem, yes. GFP_NOFS is effectively
> GFP_KERNEL when calling vmalloc.
>
> Note that in writeback paths, a "good citizen" filesystem should not require
> any allocations, or at least it should be able to tolerate allocation failures.
> So fixing that would be a good idea anyway.

For cluster filesystems, there is an additional issue. When we allocate
memory with GFP_KERNEL we might end up pushing inodes out of cache,
which can also mean deallocating them. That process involves taking
cluster locks, and so it is not valid to do this while holding another
cluster lock (since the locks may then be taken in an arbitrary order).
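
To make the ordering problem concrete, here is a minimal userspace
sketch in plain C. It is not GFS2 or kernel code, and every name in it
(NLOCKS, lock_model_acquire() and so on) is hypothetical. It simply
records which lock was held when another one was taken, and reports an
inversion if the same pair is later taken the other way round. That is
roughly what could happen if reclaim, triggered by a GFP_KERNEL
allocation, takes the cluster lock of an evicted inode while the caller
already holds a different cluster lock:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NLOCKS 4 /* hypothetical: a handful of cluster locks */

/* seen_order[a][b] is true once lock b has been taken while a was held. */
static bool seen_order[NLOCKS][NLOCKS];

/* Locks currently held by the single simulated task. */
static bool held[NLOCKS];

/* Forget all recorded orderings and drop all locks. */
void lock_model_reset(void)
{
    memset(seen_order, 0, sizeof(seen_order));
    memset(held, 0, sizeof(held));
}

/*
 * Take lock `id`. Returns false if this acquisition inverts an ordering
 * recorded earlier, i.e. some currently held lock h was previously taken
 * while `id` was held. The lock is marked as held either way, since this
 * model only reports the problem and never blocks.
 */
bool lock_model_acquire(int id)
{
    bool ok = true;

    assert(id >= 0 && id < NLOCKS);
    for (int h = 0; h < NLOCKS; h++) {
        if (!held[h] || h == id)
            continue;
        if (seen_order[id][h])
            ok = false;           /* saw id -> h before, now h -> id */
        seen_order[h][id] = true; /* remember h -> id */
    }
    held[id] = true;
    return ok;
}

/* Drop lock `id`. Recorded orderings are kept. */
void lock_model_release(int id)
{
    assert(id >= 0 && id < NLOCKS);
    held[id] = false;
}
```

With this model, taking lock 0 and then lock 1 records the order
0 -> 1. If a later path takes lock 1 first and then lock 0,
lock_model_acquire(0) returns false. On a real cluster nothing reports
the inversion; the two nodes just wait on each other.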

In the GFS2 use case, vmalloc is called as a fallback if kmalloc fails,
and also if the memory required is too large for kmalloc (very unlikely,
but possible with very large directories). The call is also made while
holding a cluster lock (in shared mode).
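
For anyone not familiar with the pattern, the following is a rough
userspace model of the fallback, not the actual GFS2 code. The
allocators are stand-ins built on malloc(), and CONTIG_LIMIT, struct
dir_buf and the function names are all assumptions made up for this
illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed cap on what the contiguous allocator will hand out. */
#define CONTIG_LIMIT (128 * 1024)

enum buf_kind { BUF_NONE, BUF_CONTIG, BUF_VIRT };

/* A buffer plus a note of which allocator produced it. */
struct dir_buf {
    void *ptr;
    enum buf_kind kind;
};

/* Stand-in for kmalloc(): refuses anything above CONTIG_LIMIT. */
static void *model_contig_alloc(size_t size)
{
    if (size == 0 || size > CONTIG_LIMIT)
        return NULL;
    return malloc(size);
}

/* Stand-in for vmalloc(): no size cap in this model. */
static void *model_virt_alloc(size_t size)
{
    return size ? malloc(size) : NULL;
}

/* Try the contiguous allocator first, then fall back. */
struct dir_buf dir_buf_alloc(size_t size)
{
    struct dir_buf b = { NULL, BUF_NONE };

    b.ptr = model_contig_alloc(size);
    if (b.ptr) {
        b.kind = BUF_CONTIG;
        return b;
    }
    b.ptr = model_virt_alloc(size);
    if (b.ptr)
        b.kind = BUF_VIRT;
    return b;
}

/*
 * Both stand-ins use malloc(), so free() is right for either kind here.
 * In the kernel the two cases need different release calls.
 */
void dir_buf_free(struct dir_buf *b)
{
    assert(b != NULL);
    free(b->ptr);
    b->ptr = NULL;
    b->kind = BUF_NONE;
}
```

Note that the model says nothing about the locking side of things. The
fallback itself is simple; the difficulty is the context in which the
second allocation is made.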

I recently looked back at the thread which resulted in that particular
vmalloc call being left there:
http://www.redhat.com/archives/cluster-devel/2010-July/msg00021.html
http://www.redhat.com/archives/cluster-devel/2010-July/msg00022.html
http://www.redhat.com/archives/cluster-devel/2010-July/msg00023.html

which reminded me of the problem. So this might not be so easy to
resolve...

Steve.

