Re: 2.6.23-mm1

From: Jens Axboe
Date: Mon Oct 15 2007 - 02:50:44 EST


On Mon, Oct 15 2007, Milan Broz wrote:
> Andrew Morton wrote:
> > On Sun, 14 Oct 2007 21:12:08 +0200 "Torsten Kaiser" <just.for.lkml@xxxxxxxxxxxxxx> wrote:
> ...
> >> 354036 Page allocated via order 0, mask 0x11202
> >> 1 (PFN/Block always differ) PFN 3072 Block 6 type 0 Flags
> >> 354338 [0xffffffff80266373] mempool_alloc+83
> >> 354338 [0xffffffff80266373] mempool_alloc+83
> >> 354025 [0xffffffff802bb389] bio_alloc_bioset+185
> >> 354058 [0xffffffff804d2b40] kcryptd_do_crypt+0
> >> 354052 [0xffffffff804d2cc7] kcryptd_do_crypt+391
> >> 354058 [0xffffffff804d2b40] kcryptd_do_crypt+0
> >> 354052 [0xffffffff80245d3c] run_workqueue+204
> >> 354062 [0xffffffff802467b0] worker_thread+0
> >>
> >> I'm using dm-crypt with CONFIG_CRYPTO_TWOFISH_X86_64
> >>
> >>> The other info shows a tremendous memory leak, not via slab. Looks like
> >>> someone is running alloc_pages() directly and isnb't giving them back.
> >> Blaming it on dm-crypt looks right, as the leak seems to happens, if
> >> there is (heavy) disk activity.
> >> (updatedb just ate ~500 Mb)
> >>
> >
> > Yup, it does appear that dm-crypt is leaking. Let's add some cc's.
>
> More precisely - change below from git-block.patch update
> caused that pages are not deallocated at all.
> (cc-ing Jens)
>
> -static int crypt_endio(struct bio *clone, unsigned int done, int error)
> +static void crypt_endio(struct bio *clone, int error)
> ...
> - * free the processed pages, even if
> - * it's only a partially completed write
> + * free the processed pages
> */
> - if (!read_io)
> - crypt_free_buffer_pages(cc, clone, done);
> -
> - /* keep going - not finished yet */
> - if (unlikely(clone->bi_size))
> - return 1;
> -
> - if (!read_io)
> + if (!read_io) {
> + crypt_free_buffer_pages(cc, clone, clone->bi_size);
> goto out;
> + }
>
> clone->bi_size is zero here now, so crypt_free_buffer_pages will not
> work correctly (previously there was count of processed bytes).
>
> But because it seems that bio cannot be processed partially now, we can
> simplify crypt_free_buffer_pages to always remove all allocated pages.

Neil, this doesn't look very good. dm-crypt needs to know the clone io
size, so ->bi_size was definitely used properly in this context before.
Now it's gone. Suggestions on how to fix that up?

I've been less than impressed with the bi_end_io() patchset so far, it's
been full of typos and bad conversions. I'm tempted to revert the whole
thing, clearly it wasn't ready for merge.

--
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/