Re: EXT3 deadlock in 2.4.22 and 2.4.23-pre7 - quota related?

From: Jan Kara
Date: Tue Oct 28 2003 - 19:03:44 EST


> Neil Brown <neilb@xxxxxxxxxxxxxxx> wrote:
> >
> >
> > Hi all, and particularly Andrew and Stephen,
> >
> > I recently "upgraded' one of my NFS fileservers from (patched)2.4.18
> > to 2.4.23-pre7 (in order to resolve a HIMEM related memory pressure
> > problem).
> >
> > Unfortunately I have experienced what appears to be a deadlock.
> >
>
> I don't recall a time when quotas on ext3 were not deadlocky :(
Sadly yes... The problems with journal_begin/end are constant..

> > ...
> >
> > rquotad Call Trace: [sleep_on+75/124]
> > [start_this_handle+205/368] [journal_start+149/196]
> > [ext3_dirty_inode+116/268] [__mark_inode_dirty+50/168]
> > [update_atime+75/80] [do_generic_file_read+1158/1172]
> > [generic_file_read+147/400] [file_read_actor+0/224]
> > [do_get_write_access+1382/1420] [v1_read_dqblk+121/196]
> > [read_dqblk+76/128] [dqget+344/484] [vfs_get_dqblk+21/64]
> > [v1_get_dqblk+39/172] [link_path_walk+2680/2956]
> > [do_compat_quotactl+417/688] [resolve_dev+89/108]
> > [sys_quotactl+166/275] [system_call+51/56]
>
> read_dqblk() took dqio_sem, then ext3_dirty_inode() did journal_start().
>
> > At the same time, "sync" is running:
> >
> > sync Call Trace: [__down+109/208] [__down_failed+8/12]
> > [.text.lock.dquot+73/286] [ext3_sync_dquot+337/462]
> > [vfs_quota_sync+102/372] [sync_dquots_dev+194/260]
> > [fsync_dev+66/128] [sys_sync+7/16] [system_call+51/56]
>
> ext3_sync_dquot() did journal_start(), then someone (commit_dqblk?) tried
> to take dqio_sem.
>
> Probably it is not too hard to flip the ordering in the latter case, but all
> other code paths need to be checked, and perhaps even documented.
Actually I think we need to flip the first part - ext3_sync_dquot() is
there exactly because we needed to handle the sync() case separately..

> Jan, didn't we fix this very problem in 2.6 recently?
If I remember right last 2.6 problem was a bit different - the problem
was with getting some locks when handling quota files which should not
be acquired...

Honza

--
Jan Kara <jack@xxxxxxx>
SuSE CR Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/