Re: PROBLEM: kernel BUG at drivers/gpu/drm/drm_fops.c:146!

From: Dave Airlie
Date: Thu Jan 29 2009 - 20:43:35 EST


On Fri, Jan 30, 2009 at 11:20 AM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, 30 Jan 2009 11:06:47 +1000 Dave Airlie <airlied@xxxxxxxxx> wrote:
>
>> On Fri, Jan 30, 2009 at 10:30 AM, Andrew Morton
>> <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>> > (cc's added)
>> >
>> > On Wed, 21 Jan 2009 13:27:48 +0100
>> > Sami Kerola <kerolasa@xxxxxx> wrote:
>> >
>> >> I compiled the Torvalds git kernel 2.6.29-rc2-00013 and I got an oops.
>> >> The oops happens when ever X starts. Initially I was booting with run
>> >> level 5 and it hung. I tried to use run level to 3 and an operating
>> >> system started just fine. When I type startx the hung happen again.
>> >> Please let me know if you need some more information besides oops from
>> >> messages file and lspci output.
>> >>
>> >>
>> >> Jan 21 08:53:58 lelux kernel: ------------[ cut here ]------------
>> >> Jan 21 08:53:58 lelux kernel: kernel BUG at drivers/gpu/drm/drm_fops.c:146!
>> >
>> > I assume that 2.6.28 didn't do this?
>>
>> This is a userspace race between udev and libdrm, I'm not sure we can do
>> anything in the kernel other than BUG, maybe we should just WARN instead.
>>
>> Basically, libdrm creates devices nodes, the initial drm opening gets that, udev
>> comes along when the module is loaded and re-creates the device node,
>> when AIGLX opens the device
>> it can't figure out wtf just happened, as the inode->i_mapping we use
>> to store the GEM device mmap ranges is different.
>>
>> I think building libdrm with --enable-udev is the correct answer, and
>> maybe switching this to a WARN so it doesn't blow up.
>>
>> maybe we shouldn't be storing the inode mapping like this? anyone any
>> better idea?
>>
>
> hm, I'm a bit surprised to see the drm code using `struct
> address_space' and read_mapping_page() and unmap_mapping_range() and
> such. I thought those only worked with regular files and pagecache :)
>
> Is it possible to briefly explain what's going on there?
>
> What instance of address_space_operations does ->dev_mapping actually
> point at?

Okay a bit tired and headache coming on but I'll try, maybe jbarnes
can help out,

We need to provide mappings to userspace that are backed by memory
that can move around behind the mappings.

So userspace wants a mapping for a GEM object via the AGP/GTT aperture
instead of directly to the backing pages.
Now as the GEM object is backed by shmem we can't use the shmem file
descriptor we have to tie the mapping to without
hacking up the shmem mmap functionality which seemed like a bad plan.

So GEM uses the device inode to setup the mappings on. We just use a
simple linear allocator to split up the device inodes address space
and assign chunks to handles for different objects. The userspace app
then uses the handle via mmap to get access to the VMAs. Now when GEM
wants to move that object out of the GTT or to another area of the GTT
we need some way to invalidate it, so we use unmap_mapping_range
which destroys all the mappings for the object in all the VMA for all
the processes mapping it currently

GEM's read_mapping_page is distinct from this and is to do with the
shmem interfaceing.

Not sure if this explains it or just make it worse.

Dave.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/