Re: x86 git tree broken (bisected)
From: Yinghai Lu
Date: Sun Apr 13 2008 - 14:07:41 EST
On Sun, Apr 13, 2008 at 9:12 AM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
>
> On Sunday, 13 of April 2008, Yinghai Lu wrote:
> > On Fri, Apr 11, 2008 at 1:51 PM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> > >
> > > On Friday, 11 of April 2008, Yinghai Lu wrote:
> > > > On Fri, Apr 11, 2008 at 12:26 PM, Rafael J. Wysocki <rjw@xxxxxxx> wrote:
> > > > > On Friday, 11 of April 2008, Rafael J. Wysocki wrote:
> > > > > > On Thursday, 10 of April 2008, Ingo Molnar wrote:
> > > > > > >
> > > > > > > * Ingo Molnar <mingo@xxxxxxx> wrote:
> > > > > > >
> > > > > > > > > > > First, the X server doesn't want to start (it says it couldn't
> > > > > > > > > > > mmap the framebuffer).
> > > > > > > > > >
> > > > > > > > > > could you send your .config?
> > > > > > > > >
> > > > > > > > > Attached.
> > > > > > > >
> > > > > > > > could you disable this option:
> > > > > > > >
> > > > > > > > CONFIG_NONPROMISC_DEVMEM=y
> > > > > > > >
> > > > > > > > does it help with the X problem?
> > > > > >
> > > > > > That didn't help.
> > > > > >
> > > > > > > btw., Xorg works fine here on a comparable AMD system - but i use a
> > > > > > > rather new distro (Fedora 8) which has Xorg 7.2.
> > > > > >
> > > > > > My system is an OpenSUSE 10.3 and it has Xorg 7.2 as well.
> > > > > >
> > > > > > I think the problem is somehow related to the Radeon.
> > > > >
> > > > > The bisection turned up commit ea1441bdf53692c3dc1fd2658addcf1205629661
> > > > > "x86: use bus conf in NB conf fun1 to get bus range on, on 64-bit" as the one
> > > > > causing problems.
> > > > >
> > > > > Unfortunately, I can't revert cleanly it, because there are two more commits
> > > > > depending on it in a highly nontrivial fashion, so I have reverted all three
> > > > > commits
> > > > >
> > > > > a365998cd2cecfb827469dbd57c29602c106cb83
> > > > > 44f7f90fbe7a3a99aab082f765346514b7b5c705
> > > > > ea1441bdf53692c3dc1fd2658addcf1205629661
> > > > >
> > > > > and X starts again. Also, suspend to RAM works from under X.
> > > >
> > > > please keep the three patches and applied the two attached debug patches.
> > > >
> > > > i wonder if there is some io allocation overlapping with your system.
> > >
> > > Attached is a boot dmesg output from the current x86 git tree with your two
> > > patches applied.
> > >
> > can you try to apply the patch i sent to you about agp bridge order
> > reading for buggy silicon?
> >
> > Please boot kernel with "debug"...
> >
> > I want to verify if you can get
> >
> > "
> > Aperture conflicts with PCI mapping.
> > "
> >
> > in your boot log...
>
> It's not present in there:
>
> rafael@albercik:~> grep Aperture failing-with-patch-dmesg.log
> Aperture too small (32 MB)
> Aperture from AGP @ de000000 size 4096 MB (APSIZE 0)
> Aperture too small (0 MB)
> agpgart: Aperture pointing to RAM
> agpgart: Aperture from AGP @ de000000 size 4096 MB
> agpgart: Aperture too small (0 MB)
>
Did you apply the patch, like the one attached, that I sent you in another mail?
YH
[PATCH] x86_64: agp_gart size checking for buggy device
While looking at Rafael J. Wysocki's <rjw@xxxxxxx> system boot log,
I found some odd output:
Node 0: aperture @ de000000 size 32 MB
Aperture too small (32 MB)
AGP bridge at 00:04:00
Aperture from AGP @ de000000 size 4096 MB (APSIZE 0)
Aperture too small (0 MB)
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 4000000
...
agpgart: Detected AGP bridge 20
agpgart: Aperture pointing to RAM
agpgart: Aperture from AGP @ de000000 size 4096 MB
agpgart: Aperture too small (0 MB)
agpgart: No usable aperture found.
agpgart: Consider rebooting with iommu=memaper=2 to get a good aperture.
This means the BIOS allocated the correct GART on the NB and the AGP bridge, but because of
a bug in the silicon (the AGP bridge reports the wrong order; it wants 4G),
the kernel rejects that allocation because the size is only 32M, and tries to get another
64M for the GART. Later, fix_northbridge() cannot revert that change because it still reads
the wrong size from the AGP bridge.
So double check the order from the AGP bridge before calling aperture_valid().
Signed-off-by: Yinghai Lu <yhlu.kernel@xxxxxxxxx>
diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index 479926d..9f86778 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -138,6 +138,7 @@ static __u32 __init read_agp(int num, int slot, int func, int cap, u32 *order)
int nbits;
u32 aper_low, aper_hi;
u64 aper;
+ u32 old_order;
printk(KERN_INFO "AGP bridge at %02x:%02x:%02x\n", num, slot, func);
apsizereg = read_pci_config_16(num, slot, func, cap + 0x14);
@@ -146,6 +147,9 @@ static __u32 __init read_agp(int num, int slot, int func, int cap, u32 *order)
return 0;
}
+ /* old_order could be the value from NB gart setting */
+ old_order = *order;
+
apsize = apsizereg & 0xfff;
/* Some BIOS use weird encodings not in the AGPv3 table. */
if (apsize & 0xff)
@@ -159,6 +163,16 @@ static __u32 __init read_agp(int num, int slot, int func, int cap, u32 *order)
aper_hi = read_pci_config(num, slot, func, 0x14);
aper = (aper_low & ~((1<<22)-1)) | ((u64)aper_hi << 32);
+ /*
+ * On some buggy chips APSIZE is 0, which means the bridge wants 4G.
+ * Double check that order and, if it is bogus, trust the AMD NB setting.
+ */
+ if (aper + (32UL<<(20 + *order)) > 0x100000000UL) {
+ printk(KERN_INFO "Aperture size %u MB (APSIZE %x) is not right, use setting from NB\n",
+ 32 << *order, apsizereg);
+ *order = old_order;
+ }
+
printk(KERN_INFO "Aperture from AGP @ %Lx size %u MB (APSIZE %x)\n",
aper, 32 << *order, apsizereg);
diff --git a/drivers/char/agp/amd64-agp.c b/drivers/char/agp/amd64-agp.c
index 9d82045..288d1f5 100644
--- a/drivers/char/agp/amd64-agp.c
+++ b/drivers/char/agp/amd64-agp.c
@@ -312,6 +312,17 @@ static __devinit int fix_northbridge(struct pci_dev *nb, struct pci_dev *agp,
pci_read_config_dword(agp, 0x10, &aper_low);
pci_read_config_dword(agp, 0x14, &aper_hi);
aper = (aper_low & ~((1<<22)-1)) | ((u64)aper_hi << 32);
+
+ /*
+ * On some buggy chips APSIZE is 0, which means the bridge wants 4G.
+ * Double check that order and, if it is bogus, trust the AMD NB setting.
+ */
+ if (aper + (32UL<<(20 + order)) > 0x100000000UL) {
+ printk(KERN_INFO "Aperture size %u MB is not right, use setting from NB\n",
+ 32 << order);
+ order = nb_order;
+ }
+
printk(KERN_INFO PFX "Aperture from AGP @ %Lx size %u MB\n", aper, 32 << order);
if (order < 0 || !aperture_valid(aper, (32*1024*1024)<<order))
return -1;