Re: [PATCH] mm/fadvise: discard partial pages iff endbyte is also eof

From: Andrew Morton
Date: Thu Jan 04 2018 - 17:54:46 EST


On Thu, 04 Jan 2018 16:17:50 +0800 "Caspar" <jinli.zjl@xxxxxxxxxxxxxxx> wrote:

> > So, thinking caps on: why not just discard them? After all, that's
> > what userspace asked us to do.
>
> Hi Andrew, I doubt that "just discard them" is the right action to
> match userspace's expectation. Maybe we can never fully meet that
> expectation, since the kernel works in pages while userspace passes
> a byte offset/length to the kernel. Note that Mel Gorman has
> already documented the page-unaligned behavior in the posix_fadvise()
> man page[1], but obviously not everyone (including /me) reads
> the _latest_ version, so someone might still use the syscall with a
> page-unaligned offset/length. Userspace might only be asking to discard
> certain *bytes*, not *pages*.
>
> And I think we first need to look back at why we thought "preserve is
> better than discard". If we throw away the whole page, the rest of the
> page might still be needed (consider an offset and length in the
> middle of a file) because it's untagged:
>
> ...|------------ PAGE --------------|...
> ...| DONTNEED |------ UNTAGGED -----|...
>
> but the page is gone, so a page fault occurs and we need to reload it
> from the disk -- performance degradation happens.
>
> Maybe that's why we preferred to preserve the whole page before.
>
> But if we don't throw away the partial page at all, and the tail partial
> page is _exactly the end of the file_, a page that was advised as
> DONTNEED would be left in memory. And we all know that it is safe to
> throw it away.
>
> So we came up with this patch: keep partial pages from being thrown
> away, but add a special case so that when the partial page is at the end
> of the file, we throw it away, which is safe. I guess it might be a
> better solution.

OK, that makes sense.

As Mel (sort of) said, "delete part of page" can mean "I want to retain
the other part of the page". So we should retain the page. But for
end-of-file, there is no "other part of the page".
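
As a rough illustration of that rule (a userspace sketch, not the actual
mm/fadvise.c code), the page-range computation could look like the
following. The function name dontneed_page_range(), the SKETCH_* macros
and the fixed 4 KiB page size are assumptions made for this example;
endbyte is taken to be the inclusive last byte of the advised range, and
ranges that extend past EOF are not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for the sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Compute the inclusive range of page indexes to discard for a
 * DONTNEED request covering bytes [offset, endbyte] of a file that is
 * file_size bytes long.  Returns false if no page should be discarded.
 */
static bool dontneed_page_range(uint64_t offset, uint64_t endbyte,
                                uint64_t file_size,
                                uint64_t *first, uint64_t *last)
{
    /* Round the start up: a partial head page is always preserved. */
    uint64_t start_index = (offset + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
    uint64_t end_index = endbyte >> SKETCH_PAGE_SHIFT;

    /* endbyte is the last byte of a page, so the tail page is full. */
    bool ends_on_page_boundary =
        (endbyte & (SKETCH_PAGE_SIZE - 1)) == SKETCH_PAGE_SIZE - 1;
    /* endbyte is the last byte of the file: no "other part" to retain. */
    bool ends_at_eof = file_size > 0 && endbyte == file_size - 1;

    if (!ends_on_page_boundary && !ends_at_eof) {
        /* Partial tail page in mid-file: preserve it. */
        if (end_index == 0)
            return false;   /* avoid unsigned wrap-around of 0 - 1 */
        end_index--;
    }

    if (end_index < start_index)
        return false;

    *first = start_index;
    *last = end_index;
    return true;
}
```

With a 10000-byte file (pages 0..2, page 2 only partly filled), advising
bytes 0..5000 would discard page 0 only, while advising bytes 0..9999
would discard pages 0..2, because the partial tail page ends at EOF.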

> One thing I'm worried about is that this patch might introduce a new
> undocumented behavior, so maybe we need to document this special case in
> the posix_fadvise() man page too? hmmm...

That wouldn't hurt.

Could you please resend the patch with the changelog updated to reflect
this discussion?