Re: [PATCH] mm : update ra->ra_pages if it's NOT equal to bdi->ra_pages
From: Minchan Kim
Date: Fri Aug 14 2020 - 18:17:21 EST
On Fri, Aug 14, 2020 at 04:19:29AM +0100, Matthew Wilcox wrote:
> On Fri, Aug 14, 2020 at 10:45:37AM +0800, Zhaoyang Huang wrote:
> > On Fri, Aug 14, 2020 at 10:33 AM Andrew Morton
> > <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, 14 Aug 2020 10:20:11 +0800 Zhaoyang Huang <huangzhaoyang@xxxxxxxxx> wrote:
> > >
> > > > On Fri, Aug 14, 2020 at 10:07 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Fri, Aug 14, 2020 at 02:43:55AM +0100, Matthew Wilcox wrote:
> > > > > > On Fri, Aug 14, 2020 at 09:30:11AM +0800, Zhaoyang Huang wrote:
> > > > > > > file->f_ra->ra_pages will remain the initialized value since it opend, which may
> > > > > > > be NOT equal to bdi->ra_pages as the latter one is updated somehow(etc,
> > > > > > > echo xxx > /sys/block/dm/queue/read_ahead_kb).So sync ra->ra_pages to the
> > > > > > > updated value when sync read.
> > > > > >
> > > > > > It still ignores the work done by shrink_readahead_size_eio()
> > > > > > and fadvise(POSIX_FADV_SEQUENTIAL).
> > > > >
> > > > > ... by the way, if you're trying to update one particular file's readahead
> > > > > state, you can just call fadvise(POSIX_FADV_NORMAL) on it.
> > > > >
> > > > > If you want to update every open file's ra_pages by writing to sysfs,
> > > > > then just no. We don't do that.
> > > > No, What I want to fix is the file within one process's context keeps
> > > > using the initialized value when it is opened and not sync with new
> > > > value when bdi->ra_pages changes.
> > >
> > > So you're saying that
> > >
> > > echo xxx > /sys/block/dm/queue/read_ahead_kb
> > >
> > > does not affect presently-open files, and you believe that it should do
> > > so?
> > >
> > > I guess that could be a reasonable thing to want - it's reasonable for
> > > a user to expect that writing to a global tunable will take immediate
> > > global effect. I guess.
> > >
> > > But as Matthew says, it would help if you were to explain why this is
> > > needed. In full detail. What operational problems is the present
> > > implementation causing?
> > The real scenario is some system(like android) will turbo read during
> > startup via expanding the readahead window and then set it back to
> > normal(128kb as usual). However, some files in the system process
> > context will keep to be opened since it is opened up and has no chance
> > to sync with the updated value as it is almost impossible to change
> > the files attached to the inode(processes are unaware of these
> > things). we have to fix it from a kernel perspective.
>
> OK, this is a much more useful description of the problem, thank you!
It's not the first time we brought up the issue.
https://patchwork.kernel.org/patch/10866161/
Hopefully, we have some solution at this time.
>
> I can think of two possibilities here. One is that maybe our readahead
> heuristics just don't work on modern phone hardware. Perhaps we need
> to ramp up more aggressively by default.
>
> The other is that maybe it really is just a "boost at startup" kind
> of situation and so we should support _that_. Some interface where
> we can set a ra_boost, and then do:
>
> if (ra_boost)
> newsize *= 2;
>
> in get_init_ra_size().
With kernel boot paramter, it sounds good idea to me.