RE: [PATCH v2] dma-buf/heaps: system_heap: Avoid DoS by limiting single allocations to half of all memory

From: Jaewon Kim
Date: Fri Apr 07 2023 - 09:04:29 EST


>On Thu, Apr 6, 2023 at 7:24?PM Jaewon Kim <jaewon31.kim@xxxxxxxxxxx> wrote:
>>
>> >On Thu, Apr 6, 2023 at 4:38?PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>> >>
>> >> On Thu, 6 Apr 2023 16:27:28 -0700 "T.J. Mercier" <tjmercier@xxxxxxxxxx> wrote:
>> >>
>> >> > > When you say "decide what's the largest reasonable size", I think it
>> >> > > is difficult as with the variety of RAM sizes and buffer sizes I don't
>> >> > > think there's a fixed limit. Systems with more ram will use larger
>> >> > > buffers for image/video capture buffers. And yes, you're right that
>> >> > > ram/2-1 in a single allocation is just as broken, but I'm not sure how
>> >> > > to establish a better guard rail.
>> >> > >
>> >> > > thanks
>> >> > > -john
>> >> >
>> >> > I like ENOMEM with the len / PAGE_SIZE > totalram_pages() check and
>> >> > WARN_ON. We know for sure that's an invalid request, and it's pretty
>> >> > cheap to check as opposed to trying a bunch of reclaim before failing.
>> >>
>> >> Well, if some buggy caller has gone and requested eleventy bigabytes of
>> >:)
>> >> memory, doing a lot of reclaiming before failing isn't really a problem
>> >> - we don't want to optimize for this case!
>> >>
>> >The issue I see is that it could delay other non-buggy callers, or
>> >cause reclaim that wouldn't have happened if we just outright rejected
>> >a known-bad allocation request from the beginning.
>> >
>> >> > For buffers smaller than that I agree with John in that I'm not sure
>> >> > there's a definitive threshold.
>> >>
>> >> Well... why do we want to do _anything_ here? Why cater for buggy
>> >> callers? I think it's because "dma-buf behaves really badly with very
>> >> large allocation requests". Again, can we fix that instead?
>> >>
>> >There are a variety of different allocation strategies used by
>> >different exporters so I don't think there's one dma-buf thing we
>> >could fix for slow, large allocations in general. For the system_heap
>> >in this patch it's really just alloc_pages. I'm saying I don't think
>> >the kernel should ever ask alloc_pages for more memory than exists on
>> >the system, which seems like a pretty reasonable sanity check to me.
>> >Given that, I don't think we should do anything for buffers smaller
>> >than totalram_pages() (except maybe to prevent OOM panics via
>> >__GFP_RETRY_MAYFAIL when we attempt to exhaust system memory on any
>> >request - valid or otherwise).
>>
>> I think T. J. also agree with me on what I shared.
>> if (len / PAGE_SIZE > totalram_pages()) return ERR_PTR(-ENOMEM);
>> #define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP | __GFP_RETRY_MAYFAIL)
>>
>Oh yeah, sorry if that wasn't clear. I was referring to your updated
>check for just totalram_pages() above, not totalram_pages() / 2.
>

Yes I thought you meant that. Thank you.
If there is no objection, I will resend with that, totalram_pages (not / 2) and
__GFP_RETRY_MAYFAIL.

>> Regarding the dma-buf behavior, I also would like to say that the dma-buf
>> system heap seems to be designed to allocate that large memory. In mobile
>> devices, we need that large memory for camera buffers or grahpics
>> rendendering buffers. So that large memory should be allowed but the invalid
>> huge size over ram should be avoided.
>>
>> I agree on that mm should reclaim even for the large size. But that reclaim
>> process may affect system performance or user convenience. In that perspective
>> I thought ram / 2 was reasonable, but yes not a golden value. I hope to use
>> just ram size as sanity check.
>>
>> Additionally if all agree, we may be able to apply __GFP_RETRY_MAYFAIL too.
>>
>> BR