[bug] very high non-preempt latency in context_struct_compute_av()

From: Ingo Molnar
Date: Mon Jun 04 2007 - 07:28:15 EST



a simple ssh login triggers a ~130 msecs non-preemptible latency even
with CONFIG_PREEMPT enabled, on a fast Core2Duo CPU (!).

the latency is caused by a _very_ long loop in the SELinux code:

sshd-4828 0.N.. 465894us : avtab_search_node (context_struct_compute_av)
sshd-4828 0.N.. 465895us : cond_compute_av (context_struct_compute_av)
sshd-4828 0.N.. 465895us : avtab_search_node (cond_compute_av)
sshd-4828 0.N.. 465895us : avtab_search_node (context_struct_compute_av)
sshd-4828 0.N.. 465896us : cond_compute_av (context_struct_compute_av)
sshd-4828 0.N.. 465896us : avtab_search_node (cond_compute_av)
sshd-4828 0.N.. 465896us : avtab_search_node (context_struct_compute_av)
sshd-4828 0.N.. 465896us : cond_compute_av (context_struct_compute_av)
sshd-4828 0.N.. 465896us : avtab_search_node (cond_compute_av)

it is triggered like this:

sshd-4828 0..s. 462986us : tasklet_action (__do_softirq)
sshd-4828 0..s. 462986us : rcu_process_callbacks (tasklet_action)
sshd-4828 0..s. 462986us : __rcu_process_callbacks (rcu_process_callbacks)
sshd-4828 0..s. 462987us : __rcu_process_callbacks (rcu_process_callbacks)
sshd-4828 0D.s. 462987us : _local_bh_enable (__do_softirq)
sshd-4828 0DN.. 462987us : idle_cpu (irq_exit)
sshd-4828 0.N.. 462988us : avtab_search_node (context_struct_compute_av)
sshd-4828 0.N.. 462989us : cond_compute_av (context_struct_compute_av)

and it ends like this after 130 msecs:

sshd-4828 0.N.. 594068us : ebitmap_contains (constraint_expr_eval)
sshd-4828 0.N.. 594068us : ebitmap_get_bit (constraint_expr_eval)
sshd-4828 0.N.. 594068us : constraint_expr_eval (context_struct_compute_av)
sshd-4828 0.N.. 594068us : ebitmap_contains (constraint_expr_eval)
sshd-4828 0.N.. 594068us : ebitmap_get_bit (constraint_expr_eval)
sshd-4828 0.N.. 594070us : sprintf (sel_write_user)
sshd-4828 0.N.. 594070us : vsnprintf (sprintf)
sshd-4828 0.N.. 594072us : number (vsnprintf)
sshd-4828 0.N.. 594072us : security_sid_to_context (sel_write_user)
sshd-4828 0.N.. 594073us : _read_lock (security_sid_to_context)
sshd-4828 0.N.. 594073us : sidtab_search (security_sid_to_context)
sshd-4828 0.N.. 594073us : context_struct_to_string (security_sid_to_context)
sshd-4828 0.N.. 594074us : mls_compute_context_len (context_struct_to_string)
sshd-4828 0.N.. 594075us : ebitmap_cmp (mls_compute_context_len)
sshd-4828 0.N.. 594075us : __kmalloc (context_struct_to_string)
sshd-4828 0.N.. 594076us : sprintf (context_struct_to_string)
sshd-4828 0.N.. 594077us : vsnprintf (sprintf)

i've got the full trace saved away (200 MB ...) and can upload it, but
it's just repetitions of the above lines.

The distribution is Fedora 7, v2.6.21 (but also happens in recent -git)
and a simple 'ssh localhost' login is enough to trigger this. It
triggers every time and this is causing audio skipping in certain apps.
It is even visible in glxgears smoothness: a small 'bump' is visible in
the otherwise smooth rotation of glxgears. Enabling CONFIG_PREEMPT does
not fix this issue as the function runs under spinlocks. (enabling
CONFIG_PREEMPT_RT in -rt fixes the issue - but that still leaves us with
the huge 130 msecs cost of that function.)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/