Re: Regression from 2.6.36

From: Eric Dumazet
Date: Thu Apr 07 2011 - 07:57:19 EST


On Thursday, 07 April 2011 at 19:21 +0800, Américo Wang wrote:
> On Thu, Apr 7, 2011 at 6:19 PM, Jiri Slaby <jslaby@xxxxxxx> wrote:
> > Cced few people.
> >
> > Also the series which introduced this were discussed at:
> > http://lkml.org/lkml/2010/5/3/53


> >
>
> I guess this is due to that lots of fdt are allocated by kmalloc(),
> not vmalloc(), and we kfree() them in rcu callback.
>
> How about deferring all of the removal to workqueue? This may
> hurt performance I think.
>
> Anyway, like the patch below... makes sense?
>
> Not-yet-signed-off-by: WANG Cong <xiyou.wangcong@xxxxxxxxx>
>
> ---
> diff --git a/fs/file.c b/fs/file.c
> index 0be3447..34dc355 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -96,20 +96,14 @@ void free_fdtable_rcu(struct rcu_head *rcu)
> container_of(fdt, struct files_struct, fdtab));
> return;
> }
> - if (!is_vmalloc_addr(fdt->fd) && !is_vmalloc_addr(fdt->open_fds)) {
> - kfree(fdt->fd);
> - kfree(fdt->open_fds);
> - kfree(fdt);
> - } else {
> - fddef = &get_cpu_var(fdtable_defer_list);
> - spin_lock(&fddef->lock);
> - fdt->next = fddef->next;
> - fddef->next = fdt;
> - /* vmallocs are handled from the workqueue context */
> - schedule_work(&fddef->wq);
> - spin_unlock(&fddef->lock);
> - put_cpu_var(fdtable_defer_list);
> - }
> +
> + fddef = &get_cpu_var(fdtable_defer_list);
> + spin_lock(&fddef->lock);
> + fdt->next = fddef->next;
> + fddef->next = fdt;
> + schedule_work(&fddef->wq);
> + spin_unlock(&fddef->lock);
> + put_cpu_var(fdtable_defer_list);
> }


Nope, this makes no sense at all.

It's probably the other way around: we want to free those blocks ASAP.

A fix would be to make alloc_fdmem() use vmalloc() if size is more than
4 pages, or whatever limit is reached.
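
A rough illustration of that cutoff, as a userspace sketch and not the
actual fs/file.c code. The 4-page limit is the figure suggested above; the
4096-byte page size is an assumption, and fdmem_pick() and the SKETCH_*
names are made up for this example. In the kernel the two outcomes would
correspond to kmalloc() and vmalloc().

```c
#include <stddef.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE     4096UL
/* Cutoff suggested in the mail: anything above 4 pages goes to vmalloc(). */
#define SKETCH_KMALLOC_LIMIT (4 * SKETCH_PAGE_SIZE)

/* Which allocator a request would be routed to (hypothetical names). */
enum fdmem_kind { FDMEM_KMALLOC, FDMEM_VMALLOC };

/* Pick the allocator for a request of 'size' bytes. */
static enum fdmem_kind fdmem_pick(size_t size)
{
	if (size > SKETCH_KMALLOC_LIMIT)
		return FDMEM_VMALLOC;	/* large table: avoid a high-order kmalloc */
	return FDMEM_KMALLOC;		/* small table: can be kfree()d directly */
}
```

With such a cutoff, only small tables would be kfree()d from the RCU
callback, while large ones would take the existing vmalloc path.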

We had a similar memory problem in fib_trie in the past: we force a
synchronize_rcu() every XXX Mbytes allocated to make sure we don't have
too much RAM waiting to be freed in RCU queues.
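
The idea can be modelled in userspace roughly as below. This is only a
sketch of the accounting, not the fib_trie code: the 1 MB limit is a
placeholder picked for the example, sketch_synchronize() is a stand-in
that merely counts where synchronize_rcu() would be called, and all
sketch_* names are hypothetical.

```c
#include <stddef.h>

/* Placeholder limit chosen for this example; not the value used in fib_trie. */
#define SKETCH_SYNC_LIMIT (1024UL * 1024UL)

static size_t pending_bytes;		/* bytes handed to deferred free so far */
static unsigned int sync_calls;		/* how many times we "waited" */

/* Stand-in for synchronize_rcu(): only records that a wait would happen. */
static void sketch_synchronize(void)
{
	sync_calls++;
}

/*
 * Account for 'size' bytes queued for deferred freeing. Once the backlog
 * reaches the limit, reset the counter and wait for a grace period so the
 * queued memory can actually be released.
 */
static void sketch_defer_free(size_t size)
{
	pending_bytes += size;
	if (pending_bytes >= SKETCH_SYNC_LIMIT) {
		pending_bytes = 0;
		sketch_synchronize();
	}
}
```

The point of the counter is to bound how much memory can sit in the RCU
queues at any time, at the cost of an occasional blocking wait.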






