Re: RFC: Block reservation for hugetlbfs
From: David Gibson
Date: Thu Feb 23 2006 - 23:44:49 EST
On Wed, Feb 22, 2006 at 02:09:09PM +1100, Nick Piggin wrote:
> David Gibson wrote:
> >On Wed, Feb 22, 2006 at 11:38:42AM +1100, Nick Piggin wrote:
> >
> >>David Gibson wrote:
> >>
> >>>On Tue, Feb 21, 2006 at 03:18:59PM +1100, Nick Piggin wrote:
> >>
> >>>>This introduces
> >>>>tree_lock(r) -> hugetlb_lock
> >>>>
> >>>>And we already have
> >>>>hugetlb_lock -> lru_lock
> >>>>
> >>>>So we now have tree_lock(r) -> lru_lock, which would deadlock
> >>>>against lru_lock -> tree_lock(w), right?
> >>>>
> >>>
> >>>>From a quick glance it looks safe, but I'd _really_ rather not
> >>>
> >>>>introduce something like this.
> >>>
> >>>
> >>>Urg.. good point. I hadn't even thought of that consequence - I was
> >>>more worried about whether I need i_lock or i_mutex to protect my
> >>>updates to i_blocks.
> >>>
> >>>Would hugetlb_lock -> tree_lock(r) be any preferable (I think that's a
> >>>possible alternative).
> >>>
> >>
> >>Yes I think that should avoid the introduction of new lock dependency.
> >
> >
> >Err... "Yes" appears to contradict the rest of you statement, since my
> >suggestion would still introduce a lock dependency, just a different
> >one one. It is not at all obvious to me how to avoid a lock
> >dependency entirely.
> >
>
> I mean a new core mm lock depenency (ie. lru_lock -> tree_lock).
>
> But I must have been smoking something last night: for the life
> of me I can't see why I thought there was already a hugetlb_lock
> -> lru_lock dependency in there...?!
>
> So I retract my statement. What you have there seems OK.
Sadly, you weren't smoking something, and it's not OK. As akpm
pointed out later, the lru_lock dependecy is via __free_pages() which
is called from update_and_free_page() with hugetlb_lock held.
> >Also, any thoughts on whether I need i_lock or i_mutex or something
> >else would be handy..
>
> I'm not much of an fs guy. How come you don't use i_size?
i_size is already used for a hard limit on the file size - faulting a
page beyond i_size will SIGBUS, whereas faulting a page beyond
i_blocks just isn't guaranteed. In particular, we always extend
i_size when makiing a new mapping, whereas we only extend i_blocks
(and thus reserve pages) on a SHARED mapping (because space is being
guaranteed for things in the mapping, not for a random processes
MAP_PRIVATE copy).
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/