Re: [PATCH v2] dma-buf/heaps: system_heap: Avoid DoS by limiting single allocations to half of all memory

From: John Stultz
Date: Thu Apr 06 2023 - 00:24:36 EST


On Wed, Apr 5, 2023 at 8:09 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, 06 Apr 2023 11:17:12 +0900 Jaewon Kim <jaewon31.kim@xxxxxxxxxxx> wrote:
>
> > >> +	if (len / PAGE_SIZE > totalram_pages())
> > >> +		return ERR_PTR(-ENOMEM);
> > >
> > >We're catering for a buggy caller here, aren't we? Are such large
> > >requests ever reasonable?
> > >
> > >How about we decide what's the largest reasonable size and do a
> > >WARN_ON(larger-than-that), so the buggy caller gets fixed?
> >
> > Yes, we're considering a buggy caller. I thought even totalram_pages() / 2 in
> > the old ION system was also unreasonable. To avoid the /2, I changed it to
> > totalram_pages().
> >
> > Because userspace can request that size repeatedly, I think WARN_ON() may be
> > called too often, and that would fill the kernel log buffer.
>
> Oh geeze. I trust that userspace needs elevated privileges of some form?
>
> If so, then spamming dmesg isn't an issue - root can do much worse than
> that.
>
> > Even if we use WARN_ON_ONCE rather than WARN_ON, the buggy code is not in the
> > kernel layer. Unlike the page fault mechanism, this dma-buf system heap gets the
> > size from userspace, and it currently allows an unlimited size. I don't think we
> > can fix buggy userspace with a kernel warning log, so a warning alone is not
> > enough; we need a safeguard in the kernel layer.
>
> I really dislike that ram/2 thing - it's so arbitrary, hence is surely
> wrong for all cases. Is there something more thoughtful we can do?

Just for context, here's the old commit that added this to ION:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c9e8440eca61298ecccbb27f53036124a7a3c6c8

I think the consideration was that allocations larger than half of
memory are likely due to erroneously "negative" size values.
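
If I remember the old ION code correctly, the check looked roughly like
the fragment below (paraphrasing from memory rather than quoting the
commit, so treat it as a sketch):

    /*
     * A single allocation asking for more than half of all RAM almost
     * certainly means the caller computed a bogus ("negative") size,
     * so fail fast instead of grinding through reclaim first.
     */
    if (len / PAGE_SIZE > totalram_pages / 2)
        return ERR_PTR(-ENOMEM);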

My memory is foggy on any discussions from that time, but I imagine the
thinking was to treat the value as if it were signed and error out
immediately on negative values; rather than just capping at 2GB on
32-bit systems, one could scale the limit to half of the system memory
size, as that seemed an appropriate border of "obviously wrong".
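
To make the "treating the value as if it were signed" point concrete,
here is a tiny userspace sketch (not kernel code, just an illustration)
of how a negative length that ends up in an unsigned size turns into a
request far bigger than any machine's RAM:

    #include <stdio.h>
    #include <sys/types.h>

    int main(void)
    {
            ssize_t signed_len = -4096;      /* buggy caller computed a negative size */
            size_t len = (size_t)signed_len; /* the allocation path sees an unsigned length */

            /*
             * On a 64-bit machine this prints 18446744073709547520 --
             * far more than half of RAM anywhere, so a ram/2 (or
             * total-RAM) cap rejects the sign error immediately.
             */
            printf("requested length: %zu bytes\n", len);
            return 0;
    }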

And the reason I think folks wanted to avoid just warning and
continuing with the allocation is that these huge allocations would
bog the system down badly before they failed, so failing quickly was a
benefit: the system stayed responsive and could still be used to
collect logs and debug the issue.

When you say "decide what's the largest reasonable size", I think that
is difficult: with the variety of RAM and buffer sizes out there, I
don't think there is a single fixed limit. Systems with more RAM will
use larger buffers for image/video capture. And yes, you're right that
ram/2 - 1 in a single allocation is just as broken, but I'm not sure
how to establish a better guard rail.
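
For what it's worth, one possible middle ground (purely a sketch from
me, not something proposed in this thread; the function signature is
taken from the current system heap driver, and the ratelimited warning
is my own addition) would be to keep the fail-fast cap but leave a
ratelimited breadcrumb so the buggy caller can still be found without
flooding the log:

    static struct dma_buf *system_heap_allocate(struct dma_heap *heap,
                                                unsigned long len,
                                                unsigned long fd_flags,
                                                unsigned long heap_flags)
    {
            /*
             * Reject sizes that cannot possibly be satisfied, but leave
             * a (ratelimited) trace pointing at the offending request.
             */
            if (len / PAGE_SIZE > totalram_pages()) {
                    pr_warn_ratelimited("system_heap: rejecting %lu byte allocation\n",
                                        len);
                    return ERR_PTR(-ENOMEM);
            }

            /* ... rest of the allocation path unchanged ... */
    }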

thanks
-john