Re: [PATCH v2] mm: Warn about costly page allocation
From: Minchan Kim
Date: Wed Jul 11 2012 - 17:18:39 EST
On Thu, Jul 12, 2012 at 5:40 AM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
> On Wed, 11 Jul 2012, Minchan Kim wrote:
>
>> I agree that's the ideal, but the problem is that it's too late.
>> Once a product is released, we have to recall all products in the worst case.
>> The fact is that lumpy reclaim has helped high order allocations implicitly, but we
>> removed it without any notification or information. It's a sort of regression and we
>> can't just tell them "Please report to us if it happens". That's irresponsible, too.
>> IMHO, what we can do, at least, is warn about it before it's too late.
>>
>
> High order allocations that fail should still display a warning message
> when __GFP_NOWARN is not set, so I don't see what this additional warning
> adds. I don't think it's responsible to ask admins to know what lumpy
> reclaim is, what memory compaction is, or when a system tends to have more
> high order allocations when memory compaction would be helpful.
>
> What we can do, though, is address bug reports as they are reported when
> high order allocations fail and previous kernels are successful. I
> haven't seen any lately.
Did you read my description?
"
Let's think about this scenario.
There is a QA team in an embedded company and they have tested their product.
In the test scenario, the product needs 100 high order allocations.
(They don't care how many high order allocations the kernel needs
during the test. Their only concern is whether their
middleware/application works or fails.) The high order allocations are
serviced well by the natural buddy allocator without lumpy's help. So they
released the product and sold it all over the world.
Unfortunately, in real practice, 105 high order allocations were occasionally
needed, and fortunately, lumpy reclaim could help, so the product
hasn't had a problem until now.
If they use the latest kernel, they will see the new config option
CONFIG_COMPACTION, which is very poorly documented, and they can't know it's
the replacement for lumpy reclaim (they don't even know what lumpy reclaim is),
so they simply disable that option for size optimization. Of course, the QA
team still tests it, but they can't find the problem unless they test harder
than before.
It ends up with the product released and sold all over the world again.
But this time, we have neither lumpy reclaim nor compaction, so the problem
would happen in real practice. A poor engineer from Korea has to fly
to the USA to fix a ton of products. Otherwise, they have to recall the
products from all over the world. Maybe he loses his job. :(
"
It's not much of an exaggeration. Who should we blame?
--
Kind regards,
Minchan Kim