RE: [PATCH] fs/select: add vmalloc fallback for select(2)

From: David Laight
Date: Tue Sep 27 2016 - 07:40:42 EST


From: Nicholas Piggin
> Sent: 27 September 2016 12:25
> On Tue, 27 Sep 2016 10:44:04 +0200
> Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> > On 09/23/2016 06:47 PM, Jason Baron wrote:
> > > Hi,
> > >
> > > On 09/23/2016 03:24 AM, Nicholas Piggin wrote:
> > >> On Fri, 23 Sep 2016 14:42:53 +0800
> > >> "Hillf Danton" <hillf.zj@xxxxxxxxxxxxxxx> wrote:
> > >>
> > >>>>
> > >>>> The select(2) syscall performs a kmalloc(size, GFP_KERNEL) where size grows
> > >>>> with the number of fds passed. We had a customer report page allocation
> > >>>> failures of order-4 for this allocation. This is a costly order, so it might
> > >>>> easily fail, as the VM expects such allocation to have a lower-order fallback.
> > >>>>
> > >>>> Such trivial fallback is vmalloc(), as the memory doesn't have to be
> > >>>> physically contiguous. Also the allocation is temporary for the duration of the
> > >>>> syscall, so it's unlikely to stress vmalloc too much.
> > >>>>
> > >>>> Note that the poll(2) syscall seems to use a linked list of order-0 pages, so
> > >>>> it doesn't need this kind of fallback.
> > >>
> > >> How about something like this? (untested)
> >
> > This pushes the limit further, but might just delay the problem. Could be an
> > optimization on top if there's enough interest, though.
>
> What's your customer doing with those selects? If they care at all about
> performance, I doubt they want select to attempt order-4 allocations, fail,
> then use vmalloc :)

If they care about performance they shouldn't be passing select() fd sets
that are anywhere near that large.
If the number of fds actually being watched is small, use poll().
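
A minimal userspace sketch of the poll() case, for illustration only. The
helper name wait_readable() and the bound of 16 are my own assumptions and
not anything from the patch. The point is that the array passed to poll()
is sized by the number of fds you care about, not by the highest fd number
as with select()'s bitmaps.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * wait_readable() is a hypothetical helper, not part of the patch.
 *
 * Wait up to timeout_ms for one of the nfds descriptors in fds[] to
 * become readable (or to report hangup/error). Returns the index of
 * the first such descriptor, or -1 on timeout or poll() failure.
 */
int wait_readable(const int *fds, int nfds, int timeout_ms)
{
	struct pollfd pfd[16];	/* arbitrary small bound: the "few fds" case */
	int i, n;

	assert(nfds > 0 && nfds <= 16);

	for (i = 0; i < nfds; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
		pfd[i].revents = 0;
	}

	n = poll(pfd, nfds, timeout_ms);
	if (n <= 0)
		return -1;

	for (i = 0; i < nfds; i++)
		if (pfd[i].revents & (POLLIN | POLLHUP | POLLERR))
			return i;
	return -1;
}
```

The caller still rebuilds and rescans the whole array on every call, which
is fine for a handful of fds and is exactly what stops scaling for large
sets.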

Otherwise you want one of the 'event' mechanisms, so that you don't have to
set up the markers again on every fd after every event (can't remember how
you do that in Linux).
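
For reference, the usual Linux mechanism of this kind is epoll. A rough
sketch, assuming level-triggered mode and read interest only; watch_fds()
and next_ready() are hypothetical helper names of mine. Interest is
registered once with epoll_ctl(), and each later epoll_wait() returns only
the ready fds without the set being passed in again.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an epoll instance and register read
 * interest in each of fds[0..nfds-1] once. Returns the epoll fd,
 * or -1 on failure.
 */
int watch_fds(const int *fds, int nfds)
{
	int epfd, i;

	assert(nfds > 0);

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	for (i = 0; i < nfds; i++) {
		struct epoll_event ev = { .events = EPOLLIN };

		ev.data.fd = fds[i];
		if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
			close(epfd);
			return -1;
		}
	}
	return epfd;
}

/*
 * Hypothetical helper: wait up to timeout_ms and return one ready fd,
 * or -1 on timeout or error. Nothing is re-registered between calls.
 */
int next_ready(int epfd, int timeout_ms)
{
	struct epoll_event ev;

	if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
		return -1;
	return ev.data.fd;
}
```

Typical use is one watch_fds() at startup followed by a loop around
next_ready(); the per-call cost then depends on how many fds are ready
rather than on how many are registered.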

At least this isn't SYSV, where poll() was O(n^2) in the number of fds
(because the fds were kept on a linked list).

David