Re: [BISECTED] v4.15-rc: Boot regression on x86_64/AMD

From: Linus Torvalds
Date: Tue Jan 09 2018 - 14:18:15 EST


On Tue, Jan 9, 2018 at 2:37 AM, Christian KÃnig
<christian.koenig@xxxxxxx> wrote:
>
> I tested a bit with Aaro and came up with the attached patch, it adds a 16GB
> guard between the end of memory and the new window for the PCIe root hub.
> But I agree with you that this is just a hack and not a real solution.

Guys, that last statement makes no sense.

The *real* hack was that original patch that caused problems.

I mean, just look at it. It has a completely made up - and bad -
default start, and then it tries to forcibly just create the maximum
window it possibly can. Well, not quite, but almost.

Now *THAT* is hacky, and fragile, and a bad idea. It's a fundamentally
bad idea exactly because it assumes

(a) we have perfect knowledge

(b) that window that wasn't even enabled or configured in the first
place should be the maximum window.

both of those assumptions seem completely bogus, and seem to have no
actual reason.

This comment in that code really does say it all:

/* Just grab the free area behind system memory for this */

very lackadaisical.

I agree that the 16GB guard is _also_ random, but it's certainly not
less random or hacky.

But I really wonder why you want to be that close to memory at all.

What was wrong with the patch thgat just put it the hell out of any
normal memory range, and just changed the default start from one
random (and clearly bad) value to _another_ random but at least
out-of-the-way value?

IOW, this change

- res->start = 0x100000000ull;
+ res->start = 0xbd00000000ull;

really has a relatively solid explanation for it: "pick a high address
that is likely out of the way". That's *much* better than "pick an
address that is right after memory".

Now, could there be a better high address to pick? Probably. It would
be really nice to have a *reason* for the address to be picked.

But honestly, even "it doesn't break Aaro's machine" is a better
reason than many, in the absence of other reasons.

For example, was there a reason for that random 756GB address? Is the
limit of the particular AMD 64-bit bar perhaps at the 1TB mark (and
that "res->end" value is because "close to it, but not at the top")?

So I think "just above RAM" is a _horrible_ default starting point.
The random 16GB guard is _better_, but it honestly doesn't seem any
better than the simpler original patch.

A starting point like "halfway to from the hardware limit" would
actually be a better reason. Or just "we picked an end-point, let's
pick a starting point that gives us a _sufficient_ - but not excessive
- window".

Or any number of other possible starting points. Almost _anything_ is
better than "end of RAM".

That "above end of RAM" might be a worst-case fall-back value (and in
fact, I think that _is_ pretty close to what the PCI code uses for the
case of "we don't have any parent at all, so we'll just have to assume
it's a transparent bridge"), but I don't think it's necessarily what
you should _strive_ for.

So the hackyness of whatever the fix is really should be balanced with
the hackyness of the original patch that introduced this problem.

Linus