Re: [PATCH 2.6.38-10-generic] device driver: fix oops in radeondriver due to incorrect value from hardware

From: Michel Dänzer
Date: Thu Sep 08 2011 - 09:51:20 EST


On Wed, 2011-08-10 at 09:27 -0400, Alex Deucher wrote:
> 2011/8/10 Michel Dänzer <michel@xxxxxxxxxxx>:
> > On Tue, 2011-08-09 at 23:52 +0530, Mayank Rungta wrote:
> >> Added a check for the radeon ring buffer write index in r600.c which
> >> reads 0xffffffff on resume. This results in an Oops during
> >> radeon_ring_write. Masking the value averts this.
> >>
> >> This problem does not appear to be fixed in the 3.0 r600.c either.
> >>
> >> Detailed analysis of the problem can be found at -
> >>
> >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/820746/
> >>
> >> ---
> >>
> >> BUG: unable to handle kernel paging request at fa501ffc - Oops at
> >> r600_cp_start+0x48/0x380 in r600_cp_resume+0x345/0x580 [radeon]
> >>
> >> drivers/gpu/drm/radeon/r600.c
> >>
> >>
> >>
> >> --- linux-2.6.38/drivers/gpu/drm/radeon/r600.c.orig 2011-08-05 15:39:40.824612700 +0530
> >> +++ linux-2.6.38/drivers/gpu/drm/radeon/r600.c 2011-08-08 05:29:21.744417857 +0530
> >> @@ -2218,6 +2218,8 @@ int r600_cp_resume(struct radeon_device
> >>
> >> rdev->cp.rptr = RREG32(CP_RB_RPTR);
> >> rdev->cp.wptr = RREG32(CP_RB_WPTR);
> >> + /* protect against crazy HW on resume */
> >> + rdev->cp.wptr &= rdev->cp.ptr_mask;
> >
> > Although the same workaround is already in r100.c, I wonder if we
> > shouldn't rather try and eliminate all reads from the CP_RB_WPTR
> > register, at least other than for debugging purposes. Alex, what do you
> > think?
>
> Either this or reset the registers to 0 or a saved value on resume
> rather than reading from them.
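
For reference, a minimal standalone sketch of the two options discussed
above. This is not the driver code: ring_sketch, sanitize_wptr, ring_reset
and ring_write are hypothetical names, and the ring size is an assumed
power of two, so that ptr_mask is size - 1. It only illustrates why an
all-ones read is harmless once masked, and what "reset to 0 or a saved
value" could look like.

```c
#include <assert.h>
#include <stdint.h>

#define RING_DWORDS 1024u   /* assumed ring size in dwords; must be a power of two */

/* Hypothetical stand-in for the ring state kept by the driver. */
struct ring_sketch {
	uint32_t buf[RING_DWORDS];
	uint32_t wptr;      /* next dword index to write */
	uint32_t ptr_mask;  /* RING_DWORDS - 1 */
};

/* Option 1: keep reading the register, but mask whatever comes back.
 * A bogus 0xffffffff read then maps to a valid index inside the ring. */
static uint32_t sanitize_wptr(uint32_t hw_value, uint32_t ptr_mask)
{
	return hw_value & ptr_mask;
}

/* Option 2: do not trust the register at all on resume; restart from 0
 * or from a value saved before suspend. */
static void ring_reset(struct ring_sketch *r, uint32_t saved_wptr)
{
	r->ptr_mask = RING_DWORDS - 1u;
	r->wptr = saved_wptr & r->ptr_mask;
}

/* Write one dword and advance the write index, wrapping via the mask. */
static void ring_write(struct ring_sketch *r, uint32_t v)
{
	assert(r->wptr < RING_DWORDS); /* an unmasked 0xffffffff would fail here */
	r->buf[r->wptr++] = v;
	r->wptr &= r->ptr_mask;
}
```

With option 1 the index is always in range, but it may still not match
where the hardware really is; option 2 avoids depending on the read at
all.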

The patch below is what I had in mind. Does this fix the problem above?