Re: kernel BUG at lib/radix-tree.c:473!

From: Jaya Kumar
Date: Sun Aug 17 2008 - 08:09:20 EST


On Thu, Aug 14, 2008 at 6:48 PM, Markus Armbruster <armbru@xxxxxxxxxx> wrote:
> Jeremy Fitzhardinge <jeremy@xxxxxxxx> writes:
>
>> Hugh Dickins wrote:
>>> As you can see, I'm still groping towards the right answer.
>>> The driver probably needs to provide its own backing_dev_info
>>> (or point to a suitable default), and its own address_space_ops,
>>> and perhaps more (there should be examples elsewhere). But whether
>>> it is actually wrong, or whether I was wrong to mess it up, I've
>>> not yet decided.
>>>
>>
>> My understanding is that the driver is doing something a bit clever:
>> it uses the page dirty flags to determine which parts of the
>> framebuffer have been written to, and uses that information to
>> minimize the amount of stuff that needs to be copied out. The writes
>
> Yes.
>
>> to the pages are not expected to generate actual page faults.
>>
>> But I haven't really looked at it closely, and I'm not at all familiar
>> with the vm at this layer. I'm not sure how it actually allocates the
>> framebuffer memory for example (vmalloc? incrementally on faults?).
>
> vmalloc()
>
>> I'm hoping Markus will leap in, since he wrote this stuff. Or, gasp,
>> I'll read the code myself.
>
> The actual cleverness is in fb_defio[*], which was written by Jaya
> Kumar (cc'ed). I merely ripped out the old, somewhat racy cleverness
> I inherited from Anthony Liguori (which you can still admire in Xen's
> 2.6.18 kernel), and switched over to use fb_defio instead. Because
> one instance of clever code is enough.
>
> My understanding of fb_defio's inner workings is rather limited I
> fear. I'm just using it.
>
> Jaya, could you help?
>

I will try my best. Ok, I read through the thread. My current
understanding is as follows:

- Jeremy observed this issue when starting Xorg with Xen pvfb on
  2.6.27-rc1.
- Ian bisected it to 14fcc23fdc78e9d32372553ccf21758a9bd56fa1.
- Peter pointed out that, judging from the trace, we may be dirtying a
  page that is not in the page cache.
- Hugh mentioned that, prior to the bisected patch, the faulting page
  may have had a .set_page_dirty that was OK, but now it doesn't.
- Jeremy pointed out that the fault is one page into the /dev/fb0
  mapping.
- Hugh mentioned:
  "The driver probably needs to provide its own backing_dev_info
  (or point to a suitable default), and its own address_space_ops,
  and perhaps more (there should be examples elsewhere). But whether
  it is actually wrong, or whether I was wrong to mess it up, I've
  not yet decided."

In defio, the page mapping is provided through the vm_file that got
set up during mmap:

    page->mapping = vma->vm_file->f_mapping;
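
To make that concrete, here is a rough userspace model of the idea. It
is not kernel code: every struct below is a simplified stand-in, and all
the model_* names are made up for illustration. It only sketches how a
page picks up the file's mapping at fault time, and how dirtying the
page then dispatches through whatever a_ops that mapping carries.

```c
/* Userspace model only: simplified stand-ins, not the kernel's types. */
#include <stddef.h>

struct page;

struct address_space_operations {
	int (*set_page_dirty)(struct page *page);
};

struct address_space {
	const struct address_space_operations *a_ops;
};

struct file {
	struct address_space *f_mapping;
};

struct vm_area {
	struct file *vm_file;
};

struct page {
	struct address_space *mapping;
	unsigned long index;
	int dirty;
};

/* Model of the fault path: the page borrows the file's mapping. */
static int model_fault(struct vm_area *vma, struct page *page,
		       unsigned long pgoff)
{
	if (!vma->vm_file)
		return -1;	/* no file, so no mapping to hand out */
	page->mapping = vma->vm_file->f_mapping;
	page->index = pgoff;
	return 0;
}

/* Hypothetical fallback used when the mapping supplies no callback. */
static int model_default_set_page_dirty(struct page *page)
{
	page->dirty = 1;
	return 1;
}

/* Hypothetical driver-supplied callback, for illustration only. */
static int model_driver_set_page_dirty(struct page *page)
{
	page->dirty = 2;	/* distinct value so the chosen path shows */
	return 1;
}

/* Model of dirtying: use the mapping's a_ops callback when present. */
static int model_set_page_dirty(struct page *page)
{
	if (page->mapping && page->mapping->a_ops &&
	    page->mapping->a_ops->set_page_dirty)
		return page->mapping->a_ops->set_page_dirty(page);
	return model_default_set_page_dirty(page);
}
```

In this model, which dirtying routine runs depends entirely on the a_ops
of the mapping the page inherited from the file, which is why a change
to the inode's a_ops could plausibly change what happens to these pages.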

I haven't figured out yet how setting inode->i_mapping->a_ops affects
this. I will pull tip, test with metronomefb to see if I can reproduce
the issue when starting Xfbdev on it, and start debugging from there.

Thanks,
jaya