On Tue, 15 Feb 2011, Peter Kruse wrote:
> > we have set vm.min_free_kbytes = 2097152 but the problem
> > obviously did not go away.
>
> 2GB of reserves? How much memory does your system have?
48GB
OK, then you may be clogging up the DMA zones. Maybe set the
reserves to a reasonable level like 10M or so?
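
To put the two settings in proportion, here is a rough sketch of the
arithmetic. It assumes the 48GB above means 48 GiB and reads "10M" as
10240 kB; the helper name reserve_fraction is made up for illustration.

```python
# Sketch only: compare a vm.min_free_kbytes setting against total RAM.
# Assumption: "48GB" is treated as 48 GiB; "10M" is treated as 10240 kB.

def reserve_fraction(min_free_kbytes, total_ram_kbytes):
    """Return the share of RAM held back as the free-page reserve."""
    return min_free_kbytes / total_ram_kbytes

TOTAL_KB = 48 * 1024 * 1024   # 48 GiB expressed in kB (assumed)
CURRENT_KB = 2097152          # value from the report (2 GiB)
PROPOSED_KB = 10 * 1024       # roughly 10M

current_pct = 100 * reserve_fraction(CURRENT_KB, TOTAL_KB)
proposed_pct = 100 * reserve_fraction(PROPOSED_KB, TOTAL_KB)

print(f"current : {CURRENT_KB} kB = {current_pct:.2f}% of RAM")
print(f"proposed: {PROPOSED_KB} kB = {proposed_pct:.3f}% of RAM")
```

The smaller value could then be applied as root with something like
"sysctl -w vm.min_free_kbytes=10240".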
How many buffers are configured at the various levels for the device that
is receiving messages? I guess that may be a bit on the high side?
> Could you post the entire messages from the kernel log? We need the OOM
> info to figure out more about the problem.
>
I attach one of the call traces, or would it be better if I send the
kern.log (about 6MB)?
The call traces are sufficient but the traces vanished when I hit reply.
Include them inline next time. It would be good to have the log starting
at the last system boot. There is some information cut off that I would
like to see.
An atomic order 1 allocation failed and led to the OOM but it seems that
there is still ample memory available. Slab is in "fallback_alloc" so
something went wrong with the regular allocation attempt. Any use of
cpusets or cgroups?
A significant amount of memory has been allocated to reclaimable slabs.
I guess these are the socket buffers?
Feb 10 11:59:49 beosrv1-t kernel: [1968911.211777] Node 0 Normal
free:965164kB min:917952kB low:1147440kB high:1376928kB
active_anon:2742680kB inactive_anon:293184kB active_file:4801512kB
inactive_file:11129708kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:21719040kB mlocked:0kB dirty:600kB
writeback:0kB mapped:26356kB shmem:4896kB slab_reclaimable:1780208kB <-----!!
slab_unreclaimable:199576kB kernel_stack:1576kB pagetables:22956kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
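
For reading lines like this one, a small sketch that pulls the kB fields
out of the zone report and compares free memory with the watermarks may
help. The function name parse_zone_kb and the regular expression are my
own; the numbers are copied from the log line above, with the timestamp
and the "<-----!!" marker dropped.

```python
import re

# Zone line from the log above, joined onto one line.
ZONE_LINE = (
    "Node 0 Normal free:965164kB min:917952kB low:1147440kB high:1376928kB "
    "active_anon:2742680kB inactive_anon:293184kB active_file:4801512kB "
    "inactive_file:11129708kB unevictable:0kB isolated(anon):0kB "
    "isolated(file):0kB present:21719040kB mlocked:0kB dirty:600kB "
    "writeback:0kB mapped:26356kB shmem:4896kB slab_reclaimable:1780208kB "
    "slab_unreclaimable:199576kB kernel_stack:1576kB pagetables:22956kB "
    "unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 "
    "all_unreclaimable? no"
)

def parse_zone_kb(line):
    """Collect every 'name:<number>kB' field into a dict of ints."""
    pairs = re.findall(r"([A-Za-z_]+(?:\([a-z]+\))?):(\d+)kB", line)
    return {name: int(value) for name, value in pairs}

zone = parse_zone_kb(ZONE_LINE)
headroom_kb = zone["free"] - zone["min"]

print("free above min watermark:", headroom_kb, "kB")
print("free below low watermark:", zone["free"] < zone["low"])
print("reclaimable slab:", zone["slab_reclaimable"], "kB")
```

On these numbers the zone sits below its low watermark and only a few
tens of MB above min, while about 1.7GB is held in reclaimable slab. That
does not by itself explain the order 1 failure, but it shows how much the
large reserve setting has raised the watermarks.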
Could you try to reduce the number of network buffers?
Attachment:
kern.log.gz
Description: GNU Zip compressed data