Re: Control page reclaim granularity
From: Greg Thelen
Date: Thu Mar 08 2012 - 03:46:59 EST
Zheng Liu <gnehzuil.liu@xxxxxxxxx> writes:
> Hi list,
>
> Recently we encountered a problem with page reclaim, which I summarize
> here. There are two different file types: a small index file and a large
> data file. The index file is mmapped into memory, and the application hopes
> its pages can be kept in memory and not be reclaimed too frequently. The
> data file is manipulated by read/write, and its pages should be reclaimed
> more frequently than those of the index file.
>
> As discussed previously [1], Konstantin suggested that I mmap the index
> file with the PROT_EXEC flag. He also provided a patch that sets a flag in
> mm_flags to increase the priority of mmapped file pages. However, these
> solutions are not perfect. I reviewed the related patches (8cab4754 and
> c909e993) and I think mmapping the index file with PROT_EXEC is too tricky.
> From the application programmer's point of view, the index file is a
> regular file that stores some data, so it should be mmapped with
> PROT_READ | PROT_WRITE rather than with PROT_EXEC. As the commit log of
> 8cab4754 says, the purpose of that patch is to keep executable code in
> memory to improve application responsiveness. In addition, Konstantin's
> patch requires changes to the application program. In some cases we cannot
> touch the application's code, and then this patch is useless.
>
> I have discussed this problem with Konstantin and we think the kernel
> should perhaps provide some mechanism. For example, the user could set
> memory pressure priorities per vma or inode, or mmapped pages and file
> pages could be reclaimed separately. If someone has thought about this,
> please let me know. Any feedback is welcome. Thank you.
>
> Previous discussion:
> 1. http://marc.info/?l=linux-mm&m=132947026019538&w=2
>
> Regards,
> Zheng
It's not exactly the same approach, but we have toyed with the idea of
charging different inodes to different cgroups. Each cgroup would have
different soft/hard limits to allow for different cache behavior.
http://www.spinics.net/lists/linux-mm/msg06006.html
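
For readers who have not seen the PROT_EXEC workaround mentioned in the
quoted message, here is a minimal userspace sketch of what it might look
like. It is only an illustration under stated assumptions: the function name
map_index_hot() is hypothetical, the index file is assumed to be non-empty
and mapped read-only, and whether reclaim actually favors the mapping
depends on the kernel's handling of executable file mappings (the behavior
described for commit 8cab4754). The sketch falls back to a plain PROT_READ
mapping if the PROT_EXEC request is refused, for example on a noexec mount.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a non-empty index file read-only and request
 * PROT_EXEC so that its page cache pages are treated like executable text.
 * Returns the mapping and stores its length in *len_out, or returns NULL.
 */
const void *map_index_hot(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    /* The workaround: ask for PROT_EXEC even though this is plain data. */
    void *p = mmap(NULL, len, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        /* PROT_EXEC refused (e.g. noexec mount): use a normal mapping. */
        p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    }

    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *len_out = len;
    return p;
}
```

The caller releases the mapping with munmap(). As the quoted message points
out, this only helps when the application source can be changed, and it
gives up PROT_WRITE on the index mapping.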