Re: [PATCH] mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set

From: Anton Altaparmakov
Date: Thu Sep 04 2014 - 04:05:47 EST


On 4 Sep 2014, at 03:30, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> __GFP_FS and __GFP_IO are (or were) for communicating to vmscan: don't
> enter the fs for writepage, don't write back swapcache.
>
> I guess those concepts have grown over time without a ton of thought
> going into it. Yes, I suppose that if a filesystem's writepage is
> called (for example) it expects that it will be able to perform
> writeback and it won't check (or even be passed) the __GFP_IO setting.
>
> So I guess we could say that !__GFP_FS && __GFP_IO is not implemented and
> shouldn't occur.
>
> That being said, it still seems quite bad to disable VFS cache
> shrinking for PF_MEMALLOC_NOIO allocation attempts.

I think what it really boils down to is that a file system cannot allow recursion into _that_ file system. So if VFS/VM shrinking could skip over all inodes/dentries/pages that are associated with the superblock of the volume for which the allocation is being done, that would be just fine.
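
To make that concrete, here is a rough userspace sketch of the skipping logic. All names (struct sb, struct cache_obj, shrink_skip_sb) are made up for illustration; this is not the kernel's shrinker interface, and it assumes the VM could somehow know which superblock the allocation is being done for:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct super_block or shrinker API. */
struct sb {
	int id;
};

struct cache_obj {
	const struct sb *owner;	/* superblock the object belongs to */
	int reclaimed;		/* set to 1 once the object has been freed */
};

/*
 * Reclaim up to nr_to_scan objects, skipping anything owned by alloc_sb,
 * the superblock on whose behalf the allocation is being done.
 * alloc_sb == NULL means no file system is in the allocation path.
 * Returns the number of objects reclaimed.
 */
static size_t shrink_skip_sb(struct cache_obj *objs, size_t n,
			     const struct sb *alloc_sb, size_t nr_to_scan)
{
	size_t freed = 0;

	assert(objs != NULL || n == 0);

	for (size_t i = 0; i < n && freed < nr_to_scan; i++) {
		if (objs[i].reclaimed)
			continue;
		/* Touching this object could recurse into the allocating fs. */
		if (alloc_sb && objs[i].owner == alloc_sb)
			continue;
		objs[i].reclaimed = 1;
		freed++;
	}
	return freed;
}
```

The point is only that objects of every other superblock remain reclaimable, so the caches are not left untouched just because one file system is in the allocation path.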

An alternative would be to pass the file systems a flag telling them that it is not safe to take locks. File systems that need to take a lock could then return -EDEADLOCK, and the VM could skip over those entries and reclaim others. Though I think it would be more efficient for the VFS/VM to simply not call into the file system that is doing the allocation, as above...
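
For comparison, a similarly rough userspace sketch of the flag-based alternative. The flag RECLAIM_NOLOCK, the callback fs_reclaim_entry and the caller vm_reclaim are all hypothetical names, and the needs_lock field just stands in for whatever the file system would check internally:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifndef EDEADLOCK
#define EDEADLOCK EDEADLK	/* fallback where only the POSIX name exists */
#endif

#define RECLAIM_NOLOCK 0x1u	/* hypothetical: callee must not take fs locks */

struct entry {
	int needs_lock;		/* freeing this entry requires a file system lock */
	int reclaimed;		/* set to 1 once the entry has been freed */
};

/* Hypothetical file system callback: refuse rather than risk a deadlock. */
static int fs_reclaim_entry(struct entry *e, unsigned int flags)
{
	if ((flags & RECLAIM_NOLOCK) && e->needs_lock)
		return -EDEADLOCK;
	e->reclaimed = 1;
	return 0;
}

/* VM side: skip entries that report -EDEADLOCK and carry on with the rest. */
static size_t vm_reclaim(struct entry *es, size_t n, unsigned int flags)
{
	size_t freed = 0;

	assert(es != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (es[i].reclaimed)
			continue;
		if (fs_reclaim_entry(&es[i], flags) == -EDEADLOCK)
			continue;
		freed++;
	}
	return freed;
}
```

Note that here the VM still has to call into the file system for every entry just to be told no, which is the inefficiency compared with not calling in at all.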

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
University of Cambridge Information Services, Roger Needham Building
7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK
