Re: [PATCH] kernel: sysctl: make drop_caches write-only

From: Andrew Morton
Date: Fri Nov 01 2019 - 14:59:55 EST


On Fri, 1 Nov 2019 10:45:40 -0400 Johannes Weiner <hannes@xxxxxxxxxxx> wrote:

> On Fri, Nov 01, 2019 at 11:09:01AM +0000, Chris Down wrote:
> > Hm, not sure why my client didn't show this reply.
> >
> > Andrew Morton writes:
> > > Risk: some (odd) userspace code will break. Fixable by manually chmodding
> > > it back again.
> >
> > The only scenario I can construct in my head is that someone has built
> > something to watch drop_caches for modification, but we already have the
> > kmsg output for that.

The scenario is that something opens /proc/sys/vm/drop_caches for
reading, gets unexpected EPERM and blows up?
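
For illustration only, here is a minimal userspace sketch of a reader that
would survive the file becoming write-only. The helper name read_sysctl is
hypothetical, and treating both EACCES and EPERM as "not readable" is an
assumption, not something taken from the patch:

```python
import errno


def read_sysctl(path="/proc/sys/vm/drop_caches"):
    """Return the stripped file contents, or None if reading is denied.

    Hypothetical helper: a tool written this way keeps working if the
    file is made write-only, whereas a bare open() would raise.
    """
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError as e:
        # Assumption: either errno may be seen when read access is refused.
        if e.errno in (errno.EACCES, errno.EPERM):
            return None
        raise
```

Code that does not guard the open() like this is the kind that could break.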

> > > Reward: very little.
> > >
> > > Is the reward worth the risk?
> >
> > There is evidence that this has already caused confusion[0] for many,
> > judging by the number of views and votes. I think the reward is higher than
> > stated here, since it makes the intent and the lack of persistent state in
> > the API clearer, and makes future confusion less likely.
> >
> > 0: https://unix.stackexchange.com/q/17936/10762
>
> Yes, I should have mentioned this in the changelog, but:
>
> While mitigating a VM problem at scale in our fleet, there was
> confusion about whether writing to this file will permanently switch
> the kernel into a non-caching mode. This influences the decision
> making in a tense situation, where tens of people are trying to fix
> tens of thousands of affected machines: Do we need a rollback
> strategy? What are the performance implications of operating in a
> non-caching state for several days? It also caused confusion when the
> kernel team said we may need to write the file several times to make
> sure it's effective ("But it already reads back 3?").

OK. What if we make reads always return "0"? That will fix the
misleading output and is more backwards-compatible?
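
To make the two read behaviours concrete, here is a toy userspace model. It
is not kernel code: the class DropCachesModel and its fields are made up for
illustration, and it only assumes what the thread describes, namely that each
write triggers a one-off drop and that reads currently echo the last written
value:

```python
class DropCachesModel:
    """Toy model of the drop_caches file; names here are hypothetical."""

    def __init__(self, reads_return_zero=False):
        # False models the current behaviour; True models the
        # "reads always return 0" idea.
        self.reads_return_zero = reads_return_zero
        self.last_written = 0
        self.drops = 0  # number of one-off drops triggered so far

    def write(self, value):
        # Every write triggers a drop; nothing switches into a
        # persistent "non-caching mode".
        self.last_written = value
        self.drops += 1

    def read(self):
        if self.reads_return_zero:
            return "0"
        return str(self.last_written)


current = DropCachesModel()
current.write(3)
current.write(3)  # writing again drops again, although read() was already "3"

proposed = DropCachesModel(reads_return_zero=True)
proposed.write(3)
```

In the model, current.read() yields "3" after the writes, which is the value
that people took for a persistent setting, while proposed.read() yields "0".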