Re: [PATCH v2] mm: dmapool: use provided gfp flags for all dma_alloc_coherent() calls

From: Soeren Moch
Date: Tue Jan 15 2013 - 22:25:04 EST


On 16.01.2013 03:40, Jason Cooper wrote:
Soeren,

On Wed, Jan 16, 2013 at 01:17:59AM +0100, Soeren Moch wrote:
On 15.01.2013 22:56, Jason Cooper wrote:
On Tue, Jan 15, 2013 at 03:16:17PM -0500, Jason Cooper wrote:
If my understanding is correct, one of the drivers (most likely one)
either asks for too small of a dma buffer, or is not properly
deallocating blocks from the per-device pool. Either case leads to
exhaustion, and falling back to the atomic pool. Which subsequently
gets wiped out as well.
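
The exhaustion scenario described above can be illustrated with a small user-space toy model. This is only a sketch and not kernel code: the names toy_pool, toy_alloc and toy_run, as well as the pool sizes, are hypothetical and chosen purely for illustration. It assumes, as in the hypothesis above, that a request is served from the per-device pool first and falls back to a shared atomic pool once the device pool is empty.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed-size pool: only tracks how many blocks are free. */
struct toy_pool {
    int free_blocks;
};

/* Take one block from the pool; returns false when it is exhausted. */
static bool toy_pool_take(struct toy_pool *p)
{
    if (p->free_blocks == 0)
        return false;
    p->free_blocks--;
    return true;
}

/* Return one block to the pool it was taken from. */
static void toy_pool_give(struct toy_pool *p)
{
    p->free_blocks++;
}

/*
 * Try the per-device pool first, then fall back to the shared atomic
 * pool. Returns the pool that served the request, or NULL when both
 * pools are exhausted.
 */
static struct toy_pool *toy_alloc(struct toy_pool *dev, struct toy_pool *atomic)
{
    if (toy_pool_take(dev))
        return dev;
    if (toy_pool_take(atomic))
        return atomic;
    return NULL;
}

/*
 * Perform `rounds` allocations. A well-behaved driver gives every block
 * back; a leaky one never does. Returns the number of failed requests.
 */
static int toy_run(int rounds, bool leaky, int dev_blocks, int atomic_blocks)
{
    struct toy_pool dev = { dev_blocks };
    struct toy_pool atomic = { atomic_blocks };
    int failures = 0;

    for (int i = 0; i < rounds; i++) {
        struct toy_pool *src = toy_alloc(&dev, &atomic);

        if (src == NULL) {
            failures++;
            continue;
        }
        if (!leaky)
            toy_pool_give(src);
    }
    return failures;
}
```

With 8 device blocks and 4 atomic blocks, toy_run(100, false, 8, 4) returns 0, while toy_run(100, true, 8, 4) returns 88: the leaky case drains the device pool, then the atomic pool, and every later request fails. A driver that asks for too small a pool behaves the same way in this model, only with a smaller dev_blocks value.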

If my hunch is right, could you please try each of the three dvb drivers
in turn and see which one (or more than one) causes the error?

In fact I use only 2 types of DVB sticks: em28xx usb bridge plus drxk
demodulator, and dib0700 usb bridge plus dib7000p demod.

I would bet on em28xx causing the error, but this is not thoroughly
tested. Unfortunately testing with removed sticks is not easy, because
this is a production system and disabling some services for the long
time we need to trigger this error will certainly result in unhappy
users.

Just out of curiosity, what board is it?

The kirkwood board? A modified Guruplug Server Plus.

I will see what I can do here. Is there an easy way to track the buffer
usage without having to wait for complete exhaustion?

DMA_API_DEBUG

OK, maybe I can try this.

In linux-3.5.x there is no such problem. Can we use all available memory
for dma buffers here on armv5 architectures, in contrast to newer
kernels?

Were the loads exactly the same when you tested 3.5.x?

Exactly the same, yes.

I looked at the
changes from v3.5 to v3.7.1 for all four drivers you mentioned as well
as sata_mv.

The biggest thing I see is that all of the media drivers got shuffled
around into their own subdirectories after v3.5. 'git show -M 0c0d06c'
shows it was a clean copy of all the files.

What would be most helpful is if you could do a git bisect between
v3.5.x (working) and the oldest version where you know it started
failing (v3.7.1 or earlier if you know it).

I did not bisect it, but Marek mentioned earlier that commit
e9da6e9905e639b0f842a244bc770b48ad0523e9 in Linux v3.6-rc1 introduced
new code for dma allocations. This is probably the root cause of the
new (mis-)behavior (according to my tests, 3.6.0 no longer works).
I'm not very familiar with the arm mm code, and from the patch itself I
cannot understand what's different. Maybe CONFIG_CMA is now the default
for armv5 as well (not only v6)? But I might be totally wrong here,
maybe one of the mm experts can explain the difference?

Regards,
Soeren




