Re: [PATCH] efifb: allow user to disable write combined mapping.

From: Linus Torvalds
Date: Tue Jul 18 2017 - 21:15:55 EST


On Tue, Jul 18, 2017 at 5:00 PM, Dave Airlie <airlied@xxxxxxxxx> wrote:
>
> More digging:
> Single CPU system:
> Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
> 01:00.1 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200EH
>
> Now I can't get efifb to load on this (due to it being remote and I've
> no idea how to make my install efi onto it), but booting with no
> framebuffer, and running the tests on the mga, show no slowdown on this.

Is it actually using write-combining memory without a frame buffer,
though? I don't think it is. So the lack of slowdown might be just
from that.

> Now I'm starting to wonder if it's something that only happens on
> multi-socket systems.

Hmm. I guess that's possible, of course.

[ Wild and crazy handwaving... ]

Without write combining, all the uncached writes will be fully
serialized and there is no buffering in the chip write buffers. There
will be at most one outstanding PCI transaction in the uncore write
buffer.

In contrast, _with_ write combining, the write buffers in the uncore
can fill up.

But why should that matter? Maybe memory ordering. When one of the
cores (doesn't matter *which* core) wants to get a cacheline for
exclusive use (i.e. it did a write to it), it will need to invalidate
cachelines in other cores. However, the uncore now has all those PCI
writes buffered, and the write ordering says that they should happen
before the memory writes. So before it can give the core exclusive
ownership of the new cacheline, it needs to wait for all those
buffered writes to be pushed out, so that no other CPU can see the new
write *before* the device saw the old writes.

But I'm not convinced this is any different in a multi-socket
situation than it is in a single-socket one. The other cores on the
same socket should not be able to see the writes out of order
_either_.

And honestly, I think PCI write posting rules make the above crazy
handwaving completely bogus anyway. Writes _can_ be posted, so the
memory ordering isn't actually that tight.

I dunno. I really think it would be good if somebody inside Intel
would look at it.

Linus