Re: [PATCH RFC tip/core/rcu 1/9] rcu: Add call_rcu_tasks()
From: Paul E. McKenney
Date: Wed Jul 30 2014 - 10:23:26 EST
On Wed, Jul 30, 2014 at 03:23:39PM +0200, Mike Galbraith wrote:
> On Tue, 2014-07-29 at 11:06 -0700, Paul E. McKenney wrote:
> > On Tue, Jul 29, 2014 at 07:33:32PM +0200, Peter Zijlstra wrote:
>
> > > FWIW it's _the_ thing that makes nohz_full uninteresting for me. The
> > > required overhead is insane. But yes, there are people willing to pay
> > > that, etc.
> >
> > It would indeed be good to reduce the overhead. I could imagine all sorts
> > of insane approaches involving assuming that CPU write buffers flush in
> > bounded time, though CPU vendors seem unwilling to make guarantees in
> > this area. ;-)
> >
> > Or is something other than rcu_user_enter() and rcu_user_exit() causing
> > the pain here?
>
> Border guards stamping visas. Note native_sched_clock().

Thank you for running this!

So the delta accounting is much of the pain. Hmmm...

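To put rough numbers on that, here is a back-of-the-envelope sketch in C. It assumes the KHz figures that pipe-test reports below are thousands of loop iterations per second, and both helper names are made up for illustration. Converting each rate to a per-iteration time suggests that nohz_full=3 roughly doubles the cost of an iteration.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert a pipe-test rate in KHz (assumed to be
 * thousands of iterations per second) to microseconds per iteration.
 */
double usec_per_iteration(double khz)
{
	assert(khz > 0.0);
	return 1000.0 / khz;
}

/*
 * Hypothetical helper: extra microseconds per iteration when the rate
 * drops from base_khz to nohz_khz.
 */
double added_usec(double base_khz, double nohz_khz)
{
	return usec_per_iteration(nohz_khz) - usec_per_iteration(base_khz);
}
```

With the quoted rates, usec_per_iteration(604.2) is about 1.66 and usec_per_iteration(303.5) is about 3.29, so added_usec(604.2, 303.5) comes to roughly 1.64 microseconds per iteration.
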
Thanx, Paul
> echo 0 > sched_wakeup_granularity_ns
> taskset -c 3 pipe-test 1
>
> CONFIG_NO_HZ_FULL=y  604.2 KHz       CONFIG_NO_HZ_FULL=y, nohz_full=3  303.5 KHz
> 10.45%  __schedule                    8.74%  native_sched_clock
> 10.03%  system_call                   5.63%  __schedule
>  4.86%  _raw_spin_lock_irqsave        4.75%  _raw_spin_lock
>  4.51%  __switch_to                   4.35%  reschedule_interrupt
>  4.31%  copy_user_generic_string      3.91%  _raw_spin_unlock_irqrestore
>  3.50%  pipe_read                     3.35%  system_call
>  3.02%  pipe_write                    2.73%  context_tracking_user_exit
>  2.76%  mutex_lock                    2.30%  _raw_spin_lock_irqsave
>  2.30%  native_sched_clock            2.08%  context_tracking_user_enter
>  2.27%  copy_page_to_iter_iovec       1.94%  __switch_to
>  2.16%  mutex_unlock                  1.88%  copy_user_generic_string
>  2.15%  _raw_spin_unlock_irqrestore   1.80%  account_system_time
>  1.86%  copy_page_from_iter_iovec     1.77%  rcu_eqs_enter_common.isra.42
>  1.85%  vfs_write                     1.60%  pipe_read
>  1.67%  new_sync_read                 1.58%  pipe_write
>  1.61%  new_sync_write                1.39%  mutex_lock
>  1.49%  vfs_read                      1.37%  enqueue_task_fair
>  1.47%  fsnotify                      1.25%  rcu_eqs_exit_common.isra.43
>  1.43%  __fget_light                  1.14%  get_vtime_delta
>  1.36%  enqueue_task_fair             1.11%  flat_send_IPI_mask
>  1.28%  finish_task_switch            1.07%  tracesys
>  1.26%  dequeue_task_fair             1.03%  dequeue_task_fair
>  1.25%  __sb_start_write              1.01%  copy_page_to_iter_iovec
>  1.22%  _raw_spin_lock_irq            1.01%  int_check_syscall_exit_work
>  1.20%  try_to_wake_up                0.97%  vfs_write
>  1.16%  update_curr                   0.94%  __context_tracking_task_switch
>  1.05%  __fsnotify_parent             0.93%  mutex_unlock
>  1.03%  pick_next_task_fair           0.88%  copy_page_from_iter_iovec
>  1.02%  sys_write                     0.87%  new_sync_write
>  1.01%  sys_read                      0.86%  __fget_light
>  1.00%  __wake_up_sync_key            0.85%  __sb_start_write
>  0.93%  __wake_up_common              0.85%  int_ret_from_sys_call
>  0.92%  copy_page_to_iter             0.83%  syscall_trace_leave
>  0.90%  check_preempt_wakeup          0.78%  new_sync_read
>  0.90%  __srcu_read_lock              0.78%  account_user_time
>  0.89%  put_prev_task_fair            0.76%  update_curr
>  0.88%  copy_page_from_iter           0.74%  fsnotify
>  0.82%  __sb_end_write                0.73%  try_to_wake_up
>  0.76%  __percpu_counter_add          0.71%  finish_task_switch
>  0.74%  prepare_to_wait               0.70%  _raw_spin_lock_irq
>  0.72%  touch_atime                   0.69%  __wake_up_sync_key
>  0.71%  pipe_wait                     0.69%  __tick_nohz_task_switch
>
> pinned endless stat("/", &buf)
>
> CONFIG_NO_HZ_FULL=y                  CONFIG_NO_HZ_FULL=y, nohz_full=3
> 17.13%  system_call                   8.78%  system_call
> 11.20%  kmem_cache_alloc              8.52%  native_sched_clock
>  7.14%  lockref_get_not_dead          6.02%  context_tracking_user_exit
>  7.10%  kmem_cache_free               4.53%  kmem_cache_alloc
>  6.42%  path_init                     4.46%  _raw_spin_lock
>  5.69%  copy_user_generic_string      4.13%  copy_user_generic_string
>  5.25%  lockref_put_or_lock           4.01%  kmem_cache_free
>  4.14%  strncpy_from_user             3.36%  context_tracking_user_enter
>  3.99%  path_lookupat                 3.25%  lockref_get_not_dead
>  3.12%  complete_walk                 3.25%  lockref_put_or_lock
>  2.91%  getname_flags                 2.86%  rcu_eqs_enter_common.isra.42
>  2.88%  cp_new_stat                   2.84%  path_init
>  2.79%  vfs_fstatat                   2.56%  rcu_eqs_exit_common.isra.43
>  2.59%  user_path_at_empty            2.52%  int_check_syscall_exit_work
>  1.93%  link_path_walk                2.51%  tracesys
>  1.81%  generic_fillattr              2.08%  cp_new_stat
>  1.75%  dput                          2.00%  syscall_trace_leave
>  1.71%  filename_lookup.isra.50       1.75%  complete_walk
>  1.66%  mntput                        1.69%  path_lookupat
>  1.45%  vfs_getattr_nosec             1.58%  strncpy_from_user
>  1.04%  final_putname                 1.56%  get_vtime_delta
>  1.02%  SYSC_newstat                  1.34%  int_with_check
>
> CONFIG_NO_HZ_FULL=y, nohz_full=3
> - 8.53% [kernel] [k] native_sched_clock
>    - native_sched_clock
>       - 96.76% local_clock
>          - get_vtime_delta
>             - 51.95% vtime_account_user
>                  99.96% context_tracking_user_exit
>                     syscall_trace_enter
>                     tracesys
>                     __xstat64
>                     __libc_start_main
>             - 48.05% __vtime_account_system
>                  99.96% vtime_user_enter
>                     context_tracking_user_enter
>                     syscall_trace_leave
>                     int_check_syscall_exit_work
>                     __xstat64
>                     __libc_start_main
>       - 3.23% get_vtime_delta
>            52.96% vtime_account_user
>               context_tracking_user_exit
>               syscall_trace_enter
>               tracesys
>               __xstat64
>               __libc_start_main
>            47.04% __vtime_account_system
>               vtime_user_enter
>               context_tracking_user_enter
>               syscall_trace_leave
>               int_check_syscall_exit_work
>               __xstat64
>               __libc_start_main
>
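
For anyone wanting to reproduce the second test above, here is a minimal sketch of a stat("/", &buf) loop. This is not Mike's actual program: the function name and the bounded iteration count are assumptions made for illustration (the original loop was endless), and CPU pinning is left to taskset, as in the pipe-test run.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: call stat() on path the given number of times
 * and return how many calls succeeded.  Each call is one round trip
 * from user space into the kernel and back, which is the transition
 * the profiles above are exercising.
 */
long stat_loop(const char *path, long iterations)
{
	struct stat buf;
	long ok = 0;

	assert(path != NULL && iterations >= 0);
	for (long i = 0; i < iterations; i++) {
		if (stat(path, &buf) == 0)
			ok++;
	}
	return ok;
}
```

Calling stat_loop("/", n) from a small main() with a large n, and running the resulting binary under taskset -c 3, should approximate the pinned test.
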