Re: [PATCH 2/6] gpu: host1x: Fix syncpoint wait return value
From: Daniel Vetter
Date: Tue Jun 11 2013 - 08:09:38 EST
On Tue, Jun 11, 2013 at 1:43 PM, Terje Bergström <tbergstrom@xxxxxxxxxx> wrote:
> On 11.06.2013 14:00, Daniel Vetter wrote:
>> We don't use the EAGAIN ioctl restarting to resubmit the batchbuffer
>> which blew up the gpu (that one has been submitted already in a
>> different ioctl call), but to be able to restart the ioctl after the
>> reset has completed: We need to kick every thread which is potentially
>> holding GEM locks and make sure that we block them (at a point where
>> they don't hold any locks) until the reset handler completed. To avoid
>> a validation nightmare we use the same codepaths as we use for signal
>> interrupts, so ioctl restarting is a very natural fit for this.
>>
>> Resubmitting victim workloads when a gpu crash happened is something
>> the reset handler would do (kernel work item currently), not any
>> userspace process doing an ioctl. But atm we don't resubmit victimized
>> workloads.
>
> I don't understand the end-to-end of how resubmit is supposed to work.
> User space is not supposed to resubmit, but still EAGAIN is returned to
> user space, and drmIoctl() in user space just calls the came ioctl
> again. Sounds like drmIoctl() is completely wrong.
Maybe it wasn't clear, but -EAGAIN does _not_ resubmit work. -EAGAIN
is used to restart the ioctl if we had to kick a thread (to make sure
it doesn't hold any locks), e.g. for a blocking wait on oustanding
rendering. The codepaths taken work exactly as if the thread is
interrupt with a signal.
> In Tegra, when a job blows up, we reset the involved units, and set the
> pushbuffer pointer of host1x to point to the next job, and re-enable
> units. There's no need for anybody to resubmit anything, as kernel
> already has them.
Yeah, that's how it works in i915.ko, too.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/