On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote:
> So, yes, seems to be an inode allocation problem.
>
> Ok. I am now running with the patch enabled on two machines - but as
> the issue occurred only 2 times in nearly 2 months on two machines...

I'm pretty sure the ENOSPC problem which you both found is an inode
allocation problem. Some of you seem to have an easier time
reproducing it than others; could you try this patch, and periodically
scan your system logs for the message "ext4: find_group_flex failed,
fallback succeeded"? If the problem goes away for you, and you find
the occasional aforementioned message in your system log, that will
confirm what I suspect, which is that the bug is in fs/ext4/ialloc.c's
find_group_flex() function. (If I'm wrong, the fallback code will
activate only when the filesystem is genuinely out of inodes, which
should be very rare.)
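
For anyone who would rather not grep by hand, here is a rough sketch of
the periodic log scan in Python. The message string is the one quoted
above; the function names, the log path in the usage note, and the
assumption that the kernel log is a plain text file are all mine and
not part of the patch.

```python
# The message text quoted in this mail. The rest of each log line
# (timestamp, hostname, any trailing detail) is not assumed here.
MESSAGE = "ext4: find_group_flex failed, fallback succeeded"


def find_fallback_lines(lines):
    """Return every line (newline stripped) that contains the message."""
    return [line.rstrip("\n") for line in lines if MESSAGE in line]


def scan_log(path):
    """Scan one log file and return the matching lines.

    `path` is whatever file your syslog daemon writes kernel messages
    to; that location varies between distributions.
    """
    # errors="replace" keeps stray non-UTF-8 bytes from aborting the scan.
    with open(path, errors="replace") as log:
        return find_fallback_lines(log)
```

Something like `scan_log("/var/log/kern.log")` run from cron would do,
with the path adjusted to wherever your kernel messages actually land.
A non-empty result on a machine that no longer hits ENOSPC is the
confirmation asked for above.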
More comments are in the patch header. My current long-term plan for
dealing with this is to enhance find_group_orlov() and
find_group_other() to understand flex_bg's.