Re: [PATCH RFC] audit: move the tree pruning to a dedicated thread

From: Paul Moore
Date: Tue Dec 09 2014 - 11:33:41 EST


On Thursday, December 04, 2014 12:39:21 PM Imre Palik wrote:
> From: "Palik, Imre" <imrep@xxxxxxxxx>
>
> When file auditing is enabled, during a low memory situation, a memory
> allocation with __GFP_FS can lead to pruning the inode cache. Which can,
> in turn lead to audit_tree_freeing_mark() being called. This can call
> audit_schedule_prune(), that tries to fork a pruning thread, and
> waits until the thread is created. But forking needs memory, and the
> memory allocations there are done with __GFP_FS.
>
> So we are waiting merrily for some __GFP_FS memory allocations to complete,
> while holding some filesystem locks. This can take a while ...
>
> This patch creates a single thread for pruning the tree from
> audit_tree_init(), and thus avoids the deadlock that the on-demand thread
> creation can cause.
>
> An alternative approach would be to move the thread creation outside of the
> lock. This would assume that other layers of the filesystem code don't
> hold any locks, and it would need some rewrite of the code to limit the
> amount of threads possibly spawned.
>
> Reported-by: Matt Wilson <msw@xxxxxxxxxx>
> Cc: Matt Wilson <msw@xxxxxxxxxx>
> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Imre Palik <imrep@xxxxxxxxx>
> ---
> kernel/audit_tree.c | 53 ++++++++++++++++++++++++++++++++---------------
> 1 file changed, 35 insertions(+), 18 deletions(-)

Sorry for the delay; we've changed maintainers recently and some patches/issues
were lost in the handoff. Some comments below ...

> diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c
> index 0caf1f8..cf6db88 100644
> --- a/kernel/audit_tree.c
> +++ b/kernel/audit_tree.c
> @@ -37,6 +37,7 @@ struct audit_chunk {
>
> static LIST_HEAD(tree_list);
> static LIST_HEAD(prune_list);
> +static struct task_struct *prune_thread;
>
> /*
> * One struct chunk is attached to each inode of interest.
> @@ -806,30 +807,39 @@ int audit_tag_tree(char *old, char *new)
> */
> static int prune_tree_thread(void *unused)
> {
> - mutex_lock(&audit_cmd_mutex);
> - mutex_lock(&audit_filter_mutex);
> + for (;;) {
> + set_current_state(TASK_INTERRUPTIBLE);
> + if (list_empty(&prune_list))
> + schedule();
> + __set_current_state(TASK_RUNNING);
>
> - while (!list_empty(&prune_list)) {
> - struct audit_tree *victim;
> + mutex_lock(&audit_cmd_mutex);
> + mutex_lock(&audit_filter_mutex);
>
> - victim = list_entry(prune_list.next, struct audit_tree, list);
> - list_del_init(&victim->list);
> + while (!list_empty(&prune_list)) {
> + struct audit_tree *victim;
>
> - mutex_unlock(&audit_filter_mutex);
> + victim = list_entry(prune_list.next,
> + struct audit_tree, list);
> + list_del_init(&victim->list);
>
> - prune_one(victim);
> + mutex_unlock(&audit_filter_mutex);
>
> - mutex_lock(&audit_filter_mutex);
> - }
> + prune_one(victim);
>
> - mutex_unlock(&audit_filter_mutex);
> - mutex_unlock(&audit_cmd_mutex);
> + mutex_lock(&audit_filter_mutex);
> + }
> +
> + mutex_unlock(&audit_filter_mutex);
> + mutex_unlock(&audit_cmd_mutex);
> + }
> return 0;
> }
>
> static void audit_schedule_prune(void)
> {
> - kthread_run(prune_tree_thread, NULL, "audit_prune_tree");
> + BUG_ON(!prune_thread);

I don't really like the BUG_ON() here. If we can't guarantee that the thread
is still alive, we should look into some fallback approach so that we can
still prune the tree. I imagine something could be done with the parameter to
prune_tree_thread() to indicate if it is running in a dedicated thread or not.
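
To make the idea concrete, here is a rough userspace model, not kernel code.
The thread function's argument selects between the long-lived worker and a
one-shot pass, and the scheduling function falls back to the one-shot pass
when no worker exists. Every name here (prune_worker, schedule_prune,
ONE_SHOT, wait_for_work, the counters) is a hypothetical stand-in. In the
kernel the fallback would presumably be the old on-demand kthread_run(),
which would still carry the allocation problem the patch describes.

```c
/* Userspace model only: counters stand in for prune_list and the kthread. */
static int pending;        /* entries waiting on the modelled prune_list */
static int pruned;         /* entries pruned so far */
static int worker_alive;   /* nonzero if the dedicated worker was created */
static int worker_wakeups; /* models wake_up_process(prune_thread) */
static int one_shot_runs;  /* how often the fallback path was taken */

#define ONE_SHOT ((void *)1) /* hypothetical marker passed as thread argument */

/* Drain the list once; shared by both modes. */
static void drain_prune_list(void)
{
	while (pending > 0) {
		pending--;
		pruned++;
	}
}

/* Models sleeping until woken; the model never blocks, so it reports "stop". */
static int wait_for_work(void)
{
	return 0;
}

/* Thread body: arg == ONE_SHOT means "drain once and exit". */
static int prune_worker(void *arg)
{
	int one_shot = (arg == ONE_SHOT);

	if (one_shot)
		one_shot_runs++;
	do {
		drain_prune_list();
	} while (!one_shot && wait_for_work());
	return 0;
}

static void schedule_prune(void)
{
	if (worker_alive) {
		worker_wakeups++; /* the worker drains the list when it runs */
		return;
	}
	/* Fallback in place of BUG_ON(): run a one-shot pass. */
	prune_worker(ONE_SHOT);
}
```

The point of the model is only the control flow: a missing worker degrades to
the old behaviour instead of crashing.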

> + wake_up_process(prune_thread);
> }
>
> /*
> @@ -896,9 +906,10 @@ static void evict_chunk(struct audit_chunk *chunk)
> for (n = 0; n < chunk->count; n++)
> list_del_init(&chunk->owners[n].list);
> spin_unlock(&hash_lock);
> + mutex_unlock(&audit_filter_mutex);
> if (need_prune)
> audit_schedule_prune();
> - mutex_unlock(&audit_filter_mutex);
> +
> }
>
> static int audit_tree_handle_event(struct fsnotify_group *group,
> @@ -938,10 +949,16 @@ static int __init audit_tree_init(void)
> {
> int i;
>
> - audit_tree_group = fsnotify_alloc_group(&audit_tree_ops);
> - if (IS_ERR(audit_tree_group))
> - audit_panic("cannot initialize fsnotify group for rectree watches");
> -
> + prune_thread = kthread_create(prune_tree_thread, NULL,
> + "audit_prune_tree");
> + if (IS_ERR(prune_thread)) {
> + audit_panic("cannot start thread audit_prune_tree");

Only in the most extreme configurations is audit_panic() an actual panic().
This goes hand in hand with the comment above regarding the case where the
pruning thread may not exist.

> + } else {
> + wake_up_process(prune_thread);
> + audit_tree_group = fsnotify_alloc_group(&audit_tree_ops);
> + if (IS_ERR(audit_tree_group))
> + audit_panic("cannot initialize fsnotify group for rectree watches");
> + }

The above doesn't really need to be in an else block, does it?
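
Roughly what I have in mind, again as a userspace model rather than kernel
code: report a failed thread start, then allocate the group regardless.
start_worker() stands in for kthread_run(), which creates and wakes the
thread in one step; the *_ok switches, report_failure() and tree_init_model()
are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Userspace model only; the two *_ok switches are hypothetical test knobs. */
static int create_worker_ok = 1;
static int alloc_group_ok = 1;
static int failures_reported;

static int worker_obj, group_obj; /* dummy objects to point at */
static int *prune_thread;         /* NULL models a failed thread start */
static int *tree_group;           /* NULL models a failed group allocation */
static int worker_woken;

/* Stand-in for audit_panic(): report, count, and keep going. */
static void report_failure(const char *msg)
{
	failures_reported++;
	fprintf(stderr, "audit: %s\n", msg);
}

/* Models kthread_run(): create the worker and wake it, or return NULL. */
static int *start_worker(void)
{
	if (!create_worker_ok)
		return NULL;
	worker_woken = 1;
	return &worker_obj;
}

static int tree_init_model(void)
{
	prune_thread = start_worker();
	if (!prune_thread)
		report_failure("cannot start thread audit_prune_tree");

	/* Not nested in an else: runs whether or not the thread exists. */
	tree_group = alloc_group_ok ? &group_obj : NULL;
	if (!tree_group)
		report_failure("cannot initialize fsnotify group for rectree watches");

	return 0;
}
```

With this shape a failed thread start leaves prune_thread unset, so whatever
schedules pruning has to cope with that case.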

> for (i = 0; i < HASH_SIZE; i++)
> INIT_LIST_HEAD(&chunk_hash_heads[i]);

--
paul moore
www.paul-moore.com
