On Wed, 21 Mar 2001, Linus Torvalds wrote:
> On Wed, 21 Mar 2001, Linus Torvalds wrote:
> >
> > The deadlock implies that somebody scheduled with page_table_lock held.
> > Which would be really bad.
>
> ..and it is probably do_swap_page().
>
> Despite the name, "lookup_swap_cache()" does more than a lookup - it will
> wait for the page that it looked up. And we call it with the
> page_table_lock held in do_swap_page().
Darn, you're too quick.  (just figured it out and was about to report:)
Trace; c012785b <__lock_page+83/ac>
Trace; c012789b <lock_page+17/1c>
Trace; c01279a1 <__find_lock_page+81/f0>
Trace; c013008b <lookup_swap_cache+4b/164>
Trace; c0125362 <do_swap_page+12/1cc>
Trace; c012571f <handle_mm_fault+77/c4>
Trace; c01148b4 <do_page_fault+0/426>
Trace; c0114a17 <do_page_fault+163/426>
Trace; c01148b4 <do_page_fault+0/426>
Trace; c011581e <schedule+3e6/5ec>
Trace; c010908c <error_code+34/3c>
> Ho humm. Does the appended patch fix it for you? Looks obvious enough, but
> this bug is actually hidden on true SMP, and I'm too lazy to test with
> "num_cpus=1" or something..
I'm sure it will, but will be back in a few with confirmation.
>
> 		Linus
>
> -----
> diff -u --recursive --new-file pre6/linux/mm/memory.c linux/mm/memory.c
> --- pre6/linux/mm/memory.c	Tue Mar 20 23:13:03 2001
> +++ linux/mm/memory.c	Wed Mar 21 22:21:27 2001
> @@ -1031,18 +1031,20 @@
>  	struct vm_area_struct * vma, unsigned long address,
>  	pte_t * page_table, swp_entry_t entry, int write_access)
>  {
> -	struct page *page = lookup_swap_cache(entry);
> +	struct page *page;
>  	pte_t pte;
>
> +	spin_unlock(&mm->page_table_lock);
> +	page = lookup_swap_cache(entry);
>  	if (!page) {
> -		spin_unlock(&mm->page_table_lock);
>  		lock_kernel();
>  		swapin_readahead(entry);
>  		page = read_swap_cache(entry);
>  		unlock_kernel();
> -		spin_lock(&mm->page_table_lock);
> -		if (!page)
> +		if (!page) {
> +			spin_lock(&mm->page_table_lock);
>  			return -1;
> +		}
>
>  		flush_page_to_ram(page);
>  		flush_icache_page(vma, page);
> @@ -1053,13 +1055,13 @@
>  	 * Must lock page before transferring our swap count to already
>  	 * obtained page count.
>  	 */
> -	spin_unlock(&mm->page_table_lock);
>  	lock_page(page);
> -	spin_lock(&mm->page_table_lock);
>
>  	/*
> -	 * Back out if somebody else faulted in this pte while we slept.
> +	 * Back out if somebody else faulted in this pte while we
> +	 * released the page table lock.
>  	 */
> +	spin_lock(&mm->page_table_lock);
>  	if (pte_present(*page_table)) {
>  		UnlockPage(page);
>  		page_cache_release(page);
(oh well, I had the right idea at least)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
This archive was generated by hypermail 2b29 : Fri Mar 23 2001 - 21:00:16 EST