On Wed 12-04-23 20:41:21, Baokun Li wrote:
If extent status tree update fails, we have inconsistency between what isWhen I was looking through this patch, I've realized there's a problem with
stored in the extent status tree and what is stored on disk. And that can
cause even data corruption issues in some cases.
For extents that cannot be dropped we use __GFP_NOFAIL to allocate memory.
And with the above logic, the undo operation in __es_remove_extent that
may cause inconsistency if the split extent fails is unnecessary, so we
remove it as well.
Suggested-by: Jan Kara<jack@xxxxxxx>
Signed-off-by: Baokun Li<libaokun1@xxxxxxxxxx>
my plan :-|. See below...
static struct extent_status *I have remembered that the combination of GFP_ATOMIC and GFP_NOFAIL is
ext4_es_alloc_extent(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t len,
- ext4_fsblk_t pblk)
+ ext4_fsblk_t pblk, int nofail)
{
struct extent_status *es;
- es = kmem_cache_alloc(ext4_es_cachep, GFP_ATOMIC);
+ gfp_t gfp_flags = GFP_ATOMIC;
+
+ if (nofail)
+ gfp_flags |= __GFP_NOFAIL;
+
+ es = kmem_cache_alloc(ext4_es_cachep, gfp_flags);
if (es == NULL)
return NULL;
discouraged because the kernel has no sane way of refilling reserves for
atomic allocations when in atomic context. So this combination can result
in lockups.
So what I think we'll have to do is that we'll just have to return error
from __es_insert_extent() and __es_remove_extent() and in the callers we
drop the i_es_lock, allocate needed status entries (one or two depending on
the desired operation) with GFP_KERNEL | GFP_NOFAIL, get the lock again and
pass the preallocated entries into __es_insert_extent /
__es_remove_extent(). It's a bit ugly but we can at least remove those
__es_shrink() calls which are not pretty either.
Honza