On Wed, Apr 08, 2015 at 07:41:49AM -0700, Alexander Duyck wrote:
> On 04/08/2015 01:42 AM, Michael S. Tsirkin wrote:
> > On Tue, Apr 07, 2015 at 05:47:42PM -0700, Alexander Duyck wrote:
> > > This change makes it so that instead of using smp_wmb/rmb which varies
> > > depending on the kernel configuration we can use dma_wmb/rmb which for
> > > most architectures should be equal to or slightly more strict than
> > > smp_wmb/rmb.
> > >
> > > The advantage to this is that these barriers are available to uniprocessor
> > > builds as well so the performance should improve under such a
> > > configuration.
> > >
> > > Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxx>
> >
> > Well the generic implementation has:
> >
> > #define dma_rmb() rmb()
> > #define dma_wmb() wmb()
> >
> > So for these arches you are slightly speeding up UP but slightly hurting SMP -
> > I think we did benchmark the difference as measurable in the past.
>
> The generic implementation for the smp_ barriers does the same thing when
> CONFIG_SMP is defined. The only spot where there should be an appreciable
> difference between the two is on ARM where we define the dma_ barriers as
> being in the outer shareable domain, and for the smp_ barriers they are
> inner shareable domain.
>
> > Additionally, isn't this relying on undocumented behaviour?
> > The documentation says:
> > "These are for use with consistent memory"
> > and virtio does not bother to request consistent memory
>
> Consistent in this case represents memory that exists within one coherency
> domain. So in the case of x86 for instance this represents writes only to
> system memory. If you mix writes to system memory and device memory (PIO)
> then you should be using the full wmb/rmb to guarantee ordering between the
> two.
>
> > One wonders whether these will always be strong enough.
>
> For the purposes of weak barriers they should be, and they are only slightly
> stronger than SMP in one case so odds are strength will not be the issue.
> As far as speed I would suspect that the difference between inner and outer
> shareable domain should be negligible compared to the difference between a
> dsb() and a dmb().

Maybe it's safe, and maybe there's no performance impact. But what's
the purpose of the patch? From the commit log, it sounds like it's an
optimization, but it's not an obvious win, and it's not accompanied by