Re: Your rename change and loopback

Andrea Arcangeli (andrea@suse.de)
Tue, 25 May 1999 03:50:42 +0200 (CEST)


Sorry for the late reply, but I lost the email in the backlog :-).

On Sat, 22 May 1999, Alexander Viro wrote:

>On Sat, 22 May 1999, H.J. Lu wrote:
>
>> 1. 2.2.9 + the patch above. It seems ok so far.
>> 2. 2.2.6 + the patch above. I am getting kernel messages:
>> Attempt to refile free buffer
>> Attempt to refile free buffer
>> Attempt to refile free buffer
>> Attempt to refile free buffer
>> Attempt to refile free buffer
>> Attempt to refile free buffer

The patch that you tried is not yet correct enough. If you slept, then
bhnext could have been refiled in the meantime. So you must at least check
that bhnext->b_list still matches nlist.
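
To make the check concrete, here is a minimal user-space sketch of the
idea. It is not the kernel code: struct buf, lists[], push(), refile(),
process(), victim and walk_list() are all hypothetical names made up for
this illustration, and the "sleep" is simulated by process() refiling
another buffer. Only b_list, nlist and bhnext are names taken from the
discussion above.

```c
#include <stddef.h>

#define NR_LIST 2

/* Hypothetical, simplified stand-in for a buffer head. */
struct buf {
    int b_list;          /* index of the list the buffer is on right now */
    int visited;         /* set once process() has handled the buffer */
    struct buf *next;    /* next buffer on the same list, NULL at the end */
};

static struct buf *lists[NR_LIST];   /* heads of the simulated lists */
static struct buf *victim;           /* buffer refiled during the fake sleep */

/* Put bh at the head of the given list. */
static void push(struct buf *bh, int list)
{
    bh->b_list = list;
    bh->next = lists[list];
    lists[list] = bh;
}

/* Move bh from its current list to new_list. */
static void refile(struct buf *bh, int new_list)
{
    struct buf **pp = &lists[bh->b_list];

    while (*pp && *pp != bh)
        pp = &(*pp)->next;
    if (*pp)
        *pp = bh->next;
    push(bh, new_list);
}

/* Handle one buffer; the "sleep" is modelled by refiling the victim. */
static void process(struct buf *bh)
{
    if (bh->visited)
        return;
    bh->visited = 1;
    if (victim) {
        refile(victim, 1);
        victim = NULL;
    }
}

/* Walk list nlist and return how many times the walk had to restart. */
static int walk_list(int nlist)
{
    struct buf *bh, *bhnext;
    int restarts = 0;

repeat:
    for (bh = lists[nlist]; bh; bh = bhnext) {
        bhnext = bh->next;      /* saved before we may sleep */
        process(bh);            /* may sleep: bhnext can be refiled */
        if (bhnext && bhnext->b_list != nlist) {
            /* bhnext left our list, so its next pointer is useless */
            restarts++;
            goto repeat;
        }
    }
    return restarts;
}
```

Without the b_list test the walk would follow bhnext onto the other list
and silently skip the rest of nlist. Restarting from the head is only one
possible reaction; it is safe here because process() ignores buffers it
has already handled.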

>a) While irrelevant in this situation, anti-deadlock protection (allowance
>for 2-fold amount of requests) will break violently for loopback over
>loopback.

bdflush can't block in mark_buffer_dirty(), so it won't deadlock as long as
you don't run many sync or unmount commands by hand at the same time.

You are right to point out that such WRITEA-write_cmd code wouldn't be
robust enough, and I think I'll remove it. It's only a hack. BTW, if you
run 200 sync in parallel you can still get deadlocks with loop devices;
there's nothing that enforces safety.

Also, if you use an ext2fs mounted on a loop device in _sync_ mode, you
should be able to deadlock (too lazy to try to reproduce it now, though :-),
because the write requests will be generated by potentially NR_REQUEST
different tasks. They may generate all the requests at the same time, and
then they may all go to sleep because there are too many dirty buffers,
waking bdflush and waiting for it to complete, and bdflush will block
because there are no request structs available.
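
The cycle can be shown with a toy model. This is only a sketch under
strong assumptions, not kernel code: struct pool, get_request() and
bdflush_can_run() are hypothetical names, the value given to NR_REQUEST
is purely illustrative, and the model assumes (as in the scenario above)
that no sleeping writer gives its request back before bdflush has made
progress.

```c
#include <stdbool.h>

#define NR_REQUEST 8   /* illustrative value only, not the kernel's */

/* Hypothetical pool of free request structs. */
struct pool {
    int free;
};

/* Try to take one request; false means the caller would have to block. */
static bool get_request(struct pool *p)
{
    if (p->free == 0)
        return false;
    p->free--;
    return true;
}

/*
 * Each of `writers` sync-mode tasks takes one request and then sleeps
 * waiting for bdflush. Returns true if bdflush can still get a request.
 */
static bool bdflush_can_run(int writers)
{
    struct pool p = { NR_REQUEST };
    int i;

    for (i = 0; i < writers; i++) {
        if (!get_request(&p))
            break;              /* this writer would block as well */
    }
    /* All holders are asleep waiting for bdflush: nothing gets freed. */
    return get_request(&p);
}
```

With NR_REQUEST - 1 writers bdflush still finds a request; with
NR_REQUEST or more writers it finds none, and since every holder is
waiting for bdflush itself, nobody can make progress.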

Also, in 2.2.9 and 2.3.3 sync_old_buffers() can generate a deadlock, because
it may block in mark_buffer_dirty() waiting for bdflush, which will in turn
wait for a free request (all requests being in use by the block device).

But if you don't hit the deadlock you should run _fine_ (and the deadlock
can't happen with my code if you don't run sync by hand and don't mount
the fs in sync mode).

>one, though. Could somebody comment on those? Andrea? You've dealt with
>buffer.c lately...

I would ask you to try to reproduce the odd-buffer error with my latest
buffer patch against 2.2.9 applied, which fixes all the races I know about:

ftp://e-mind.com/pub/andrea/kernel-patches/buffer-2.2.9-N.gz

That patch still contains the WRITEA trick, but it won't make any
difference at runtime (since kflushd is already the only one that can't
deadlock), so you can try the patch safely. In the meantime I'll
remove the WRITEA trick, since it should change nothing and the loop device
should still be able to deadlock at any time when running many `sync' by
hand. It has to be fixed in a robust way, not with the WRITEA hack. Maybe
I'll think about it tomorrow...

Andrea Arcangeli
