3.4: nohz load accounting still has some problems

From: Tao Ma
Date: Thu May 24 2012 - 22:52:09 EST


Hi Peter,
With 3.4 we still have problems with nohz load accounting. On one of our
production systems, the number of running processes is around 16, but
loadavg is only around 8-10. Without your fix c308b56b5, it is
less than 1... Your patch does help, but it is not good enough. :(

After some investigation, it seems that we still have a hole, but we are
not sure about it and haven't figured out how to resolve it. So maybe you
have a good idea of whether our analysis is correct and how to fix it.

So in general, after your fix c308b56b5, we fold the nohz remainder
into the global one after all the CPUs have calculated the real value. But
there is still a case where this goes wrong. See:
cpu     0       1                       2
        calc    calc

                idle and update
                calc_load_tasks_idle

                                        calc

Now when cpu2 calculates load, it will use calc_load_tasks_idle which
has been changed by cpu1. So the load isn't accurate any more. Am I
missing something here?
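
To make the interleaving above concrete, here is a minimal toy model in C.
It is only a sketch of the scenario as described, not the kernel code: the
names global_tasks, idle_pending, sample() and go_idle() are hypothetical
stand-ins, and it assumes that a sampling CPU drains the shared idle counter
as part of its own sample. Each CPU has one runnable task at the moment it
samples, so an accurate count for this window would be 3.

```c
#include <assert.h>

#define NR_CPUS 3

/* Toy state; all names here are hypothetical, not the kernel's. */
static long global_tasks;          /* stands in for calc_load_tasks */
static long idle_pending;          /* stands in for calc_load_tasks_idle */
static long nr_running[NR_CPUS];   /* tasks currently runnable per CPU */
static long reported[NR_CPUS];     /* what each CPU last contributed */

/* Delta between what this CPU has now and what it last reported. */
static long fold_active(int cpu)
{
	long delta = nr_running[cpu] - reported[cpu];

	reported[cpu] = nr_running[cpu];
	return delta;
}

/* Periodic sample on a busy CPU: also drains what idle CPUs left behind. */
static void sample(int cpu)
{
	long delta = fold_active(cpu) + idle_pending;

	idle_pending = 0;
	global_tasks += delta;
}

/* CPU enters nohz idle: its delta goes to the shared idle counter. */
static void go_idle(int cpu)
{
	nr_running[cpu] = 0;
	idle_pending += fold_active(cpu);
}

/* Replays the interleaving from the timeline and returns the global count. */
static long run_scenario(void)
{
	int cpu;

	global_tasks = 0;
	idle_pending = 0;
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		nr_running[cpu] = 1;
		reported[cpu] = 0;
	}

	sample(0);   /* global_tasks = 1 */
	sample(1);   /* global_tasks = 2 */
	go_idle(1);  /* idle_pending = -1 */
	sample(2);   /* folds +1 and -1, so global_tasks stays 2 */

	return global_tasks;
}
```

In this model run_scenario() returns 2 rather than 3, because cpu1's
decrement is folded in by cpu2 within the same sampling window in which
cpu1 had already reported its task as running.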

Thanks
Tao