Re: kernel oops and panic in acpi_atomic_read under 2.6.39.3. calltrace included

From: Huang Ying
Date: Sun Sep 04 2011 - 22:50:23 EST


On 09/03/2011 07:32 AM, rick@xxxxxxxxxxxx wrote:
> Hi Huang,
>
> Sorry for the delay in my response. Hurricane Irene delayed our testing a
> bit.
>
> I had to switch the 5620 CPUS I had for 5670s. After 4 days of running
> (it was usually about 2 before) I finally got this output in dmesg:
>
> [337296.365930] GHES: gar accessed: 0, 0xbf7b9370
> [337296.365936] ACPI atomic read mem: addr 0xbf7b9370 mapped to ffffc90013ee8370
>
> It is not mapped to 0 as expected, but it didn't crash now!

But I don't think this patch fixed the issue. It may have just hidden
it. Do you have time to try the new patch attached?
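
To illustrate what the new check in acpi_atomic_read_mem() is meant to
catch, here is a minimal userspace sketch, not kernel code. It assumes the
fast lookup returns NULL when the requested physical range was never
pre-mapped. All names in it (struct premap, lookup_fast, read_byte) are
hypothetical stand-ins, and the single table entry only reuses the address
from the dmesg output above as sample data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one pre-mapped physical range. */
struct premap {
	uint64_t paddr;		/* start of the physical range */
	uint64_t size;		/* length of the range in bytes */
	uint8_t *vaddr;		/* where the range is "mapped" */
};

/* Fake backing store; only the first byte is non-zero. */
static uint8_t backing[16] = { 0x5a };

/* Sample table with one entry (address borrowed from the log above). */
static struct premap table[] = {
	{ 0xbf7b9370ULL, sizeof(backing), backing },
};

/*
 * Return the mapped address for [paddr, paddr + len), or NULL when the
 * range is not fully covered by a pre-mapped entry.
 */
static uint8_t *lookup_fast(uint64_t paddr, uint64_t len)
{
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		const struct premap *m = &table[i];

		if (paddr >= m->paddr && len <= m->size &&
		    paddr - m->paddr <= m->size - len)
			return m->vaddr + (paddr - m->paddr);
	}
	return NULL;
}

/*
 * Read one byte. A failed lookup is reported loudly instead of being
 * dereferenced, which is the userspace analogue of the panic() check.
 */
static int read_byte(uint64_t paddr, uint8_t *val)
{
	uint8_t *addr;

	assert(val != NULL);
	addr = lookup_fast(paddr, 1);
	if (!addr) {
		fprintf(stderr, "addr %#llx is not mapped!\n",
			(unsigned long long)paddr);
		return -1;
	}
	*val = *addr;
	return 0;
}
```

With this sketch, read_byte() succeeds for an address inside the table
entry and fails with a message for any address outside it, so an unmapped
access is reported at the lookup instead of faulting later in the read.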

Best Regards,
Huang Ying

---
drivers/acpi/apei/ghes.c | 6 ++++++
drivers/acpi/atomicio.c | 2 ++
2 files changed, 8 insertions(+)

--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -299,6 +299,9 @@ static struct ghes *ghes_new(struct acpi
 		return ERR_PTR(-ENOMEM);
 	ghes->generic = generic;
 	rc = acpi_pre_map_gar(&generic->error_status_address);
+	pr_info(GHES_PFX "gar mapped: %d, %#llx\n",
+		generic->error_status_address.space_id,
+		generic->error_status_address.address);
 	if (rc)
 		goto err_free;
 	error_block_length = generic->error_block_length;
@@ -398,6 +401,9 @@ static int ghes_read_estatus(struct ghes
 	u32 len;
 	int rc;

+	pr_err(GHES_PFX "gar accessed: %d, %#llx\n",
+	       g->error_status_address.space_id,
+	       g->error_status_address.address);
 	rc = acpi_atomic_read(&buf_paddr, &g->error_status_address);
 	if (rc) {
 		if (!silent && printk_ratelimit())
--- a/drivers/acpi/atomicio.c
+++ b/drivers/acpi/atomicio.c
@@ -270,6 +270,8 @@ static int acpi_atomic_read_mem(u64 padd

 	rcu_read_lock();
 	addr = __acpi_ioremap_fast(paddr, width);
+	if (!addr)
+		panic("ACPI atomic read mem: addr %#llx is not mapped!\n", paddr);
 	switch (width) {
 	case 8:
 		*val = readb(addr);