On Mon, Mar 04, 2013 at 11:31:46PM +0530, Raghavendra K T wrote:
> This patch series further filters for a better vcpu candidate to yield to
> in the PLE handler. The main idea is to record the preempted vcpus using
> preempt notifiers and iterate over only those preempted vcpus in the
> handler. Note that the vcpus which were in a spinloop during pause loop
> exit are already filtered.

The %improvement and patch series look good.

> Thanks Jiannan, Avi for bringing up the idea and Gleb, PeterZ for
> precious suggestions during the discussion.
> Thanks Srikar for suggesting to avoid the rcu lock while checking task
> state, which has improved overcommit cases.
>
> There are basically two approaches for the implementation.
>
> Method 1: Uses a per vcpu preempt flag (this series).
>
> Method 2: We keep a bitmap of preempted vcpus. Using this we can easily
> iterate over preempted vcpus.
>
> Note that method 2 needs an extra index variable to identify/map bitmap to
> vcpu and it also needs static vcpu allocation.

We definitely don't want something that requires static vcpu allocation.
I think it'd be better to add another counter for the vcpu bit assignment.
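
To make the comparison concrete, here is a rough userspace sketch of the two
methods, with the bit index handed out from a counter at vcpu creation rather
than tied to a static vcpu slot. This is not the posted patch: struct vm,
struct vcpu, MAX_VCPUS and every function name below are made up for
illustration, the sched_out/sched_in helpers only stand in for preempt
notifier callbacks, and locking and atomic bit operations are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VCPUS 64	/* hypothetical limit: one 64-bit word as bitmap */

struct vcpu {
	int  bit;	/* Method 2: bit index, assigned from a counter */
	bool preempted;	/* Method 1: per vcpu preempt flag */
};

struct vm {
	struct vcpu *by_bit[MAX_VCPUS];	/* bit index -> vcpu lookup */
	int next_bit;			/* counter for bit assignment */
	uint64_t preempted_map;		/* Method 2: preempted vcpus */
};

/* Register a (possibly dynamically allocated) vcpu and give it a bit. */
static void vm_register_vcpu(struct vm *vm, struct vcpu *v)
{
	assert(vm->next_bit < MAX_VCPUS);
	v->bit = vm->next_bit++;
	v->preempted = false;
	vm->by_bit[v->bit] = v;
}

/* Stand-in for the sched_out notifier: record the preemption. */
static void vcpu_sched_out(struct vm *vm, struct vcpu *v)
{
	v->preempted = true;
	vm->preempted_map |= UINT64_C(1) << v->bit;
}

/* Stand-in for the sched_in notifier: the vcpu is running again. */
static void vcpu_sched_in(struct vm *vm, struct vcpu *v)
{
	v->preempted = false;
	vm->preempted_map &= ~(UINT64_C(1) << v->bit);
}

/* Method 1: walk every vcpu and test its flag. */
static struct vcpu *pick_method1(const struct vm *vm, const struct vcpu *me)
{
	for (int i = 0; i < vm->next_bit; i++) {
		struct vcpu *v = vm->by_bit[i];

		if (v != me && v->preempted)
			return v;
	}
	return NULL;
}

/* Method 2: look only at set bits, skipping the spinning vcpu itself. */
static struct vcpu *pick_method2(const struct vm *vm, const struct vcpu *me)
{
	uint64_t map = vm->preempted_map & ~(UINT64_C(1) << me->bit);
	int i = 0;

	if (!map)
		return NULL;
	while (!(map & 1)) {	/* find the lowest set bit */
		map >>= 1;
		i++;
	}
	return vm->by_bit[i];
}
```

Both pickers return the first preempted vcpu other than the caller, or NULL
if there is none. The real handler would still need its other checks (for
example the spinloop filtering mentioned above) before yielding.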

> I am also posting the Method 2 approach for reference in case it is of
> interest.

I guess the interest in Method 2 would come from perf numbers. Did you try
comparing Method 1 vs. Method 2?