Re: [PATCH v14 2/7] mm: add VM_DROPPABLE for designating always lazily freeable mappings
From: Jason A. Donenfeld
Date: Tue Jan 03 2023 - 14:36:08 EST
Hi Linus,
On Tue, Jan 03, 2023 at 11:19:36AM -0800, Linus Torvalds wrote:
> performed as well as they could, but on the whole this is still a
> really tiny thing, and Jason is trying to micro-optimize something
> that THE KERNEL SHOULD NOT CARE ABOUT.
I don't think this is about micro-optimization. Rather, userspace RNGs
aren't really possible in a safe way at the moment. This patchset aims
to make that possible, by providing things that libc will use. The cover
letter of this series makes that case.
> This should all be in libc. Not in the kernel with special magic vdso
> support and special buffer allocations. The kernel should give good
> enough support that libc can do a good job, but the kernel should
> simply *not* take the approach of "libc will get this wrong, so let's
> just do all the work for it".
That's not what this patchset does. libc still needs to handle
per-thread semantics itself and slice up buffers and so forth. The vDSO
doesn't allocate any memory. I suspect this was Ingo's presumption too,
and you extrapolated from that. But that's not what's happening.
> Now, if the magic buffers were something cool that were a generic
> concept that a lot of *other* cases would also kill for, that's one
Actually, I was thinking VM_DROPPABLE might be a somewhat interesting
thing to introduce for database caches and so forth, where dropping
things under memory pressure is actually useful. Obviously that's the
result of a thought process involving a solution looking for a problem,
but I considered this a month or so ago when I first sent this in, and
decided that if I was to expose this via a MAP_* flag in mmap(), that
should come later, so I didn't here. Anyway, that is all to say it's not
like this is the only use for it. But either way, I don't actually have
my sights set on it as a general solution -- after all, I am not in the
process of authoring a database cache or something -- and if I can make
Andy's vm_ops suggestion work, that sounds perfectly fine to me.
> thing. But this is such a small special case that absolutely *nobody*
> has asked for, and that nothing else wants.
Okay so that's where I think you're really quite mistaken. If you recall
the original discussion on this, I was initially a bit hesitant to do it
and didn't really want to do it that much. And then I looked into it,
and talked to a bunch of library and program authors, and saw that
there's actually quite a bit of demand for this, and generally an
unhealthy ecosystem of bad solutions that have cropped up in lieu of a
good one.
(I talked about this a bit with tglx at Plumbers, and I had hoped to
discuss with you as well, but you weren't available.)
Jason