Re: [PATCH] f2fs: fix unaligned field offset in 32-bits platform

From: Jaegeuk Kim
Date: Fri Mar 10 2023 - 16:08:47 EST


On 03/10, David Laight wrote:
> From: Jaegeuk Kim
> > Sent: 09 March 2023 23:55
> >
> > On 03/08, David Laight wrote:
> > > From: Chao Yu <chao@xxxxxxxxxx>
> > > > Sent: 07 March 2023 15:14
> > > >
> > > > F2FS-fs (dm-x): inconsistent rbtree, cur(3470333575168) next(3320009719808)
> > > > ------------[ cut here ]------------
> > > > kernel BUG at fs/f2fs/gc.c:602!
> > > > Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
> > > > PC is at get_victim_by_default+0x13c0/0x1498
> > > > LR is at f2fs_check_rb_tree_consistence+0xc4/0xd4
> > > > ....
> > > > [<c04d98b0>] (get_victim_by_default) from [<c04d4f44>] (f2fs_gc+0x220/0x6cc)
> > > > [<c04d4f44>] (f2fs_gc) from [<c04d4780>] (gc_thread_func+0x2ac/0x708)
> > > > [<c04d4780>] (gc_thread_func) from [<c015c774>] (kthread+0x1a8/0x1b4)
> > > > [<c015c774>] (kthread) from [<c01010b4>] (ret_from_fork+0x14/0x20)
> > > >
> > > > The reason is that struct rb_entry has the __packed attribute but
> > > > struct victim_entry does not, so f2fs_check_rb_tree_consistence
> > > > parses the key field of struct rb_entry at the wrong offset. The
> > > > memory layouts of struct rb_entry and struct victim_entry on a
> > > > 32-bit platform are described below:
> > > >
> > > > struct rb_entry {
> > > > [0] struct rb_node rb_node;
> > > > union {
> > > > struct {...};
> > > > [12] unsigned long long key;
> > > > } __packed;
> > >
> > > This __packed removes the 4-byte pad before the union.
> > > I bet it should be removed...
> >
> > struct rb_node {
> >     unsigned long __rb_parent_color;
> >     struct rb_node *rb_right;
> >     struct rb_node *rb_left;
> > } __attribute__((aligned(sizeof(long))));
> >
> > Hmm, isn't this aligned to 32bits originally? Why does 32bits pad 4-bytes
> > if we remove __packed?
>
> That attribute is entirely pointless.
> IIRC It is a request to increase the alignment to that of long.
> It wouldn't have any effect even if 'packed' was also specified.
> (Unless you are building for 64-bit windows.)
>
> The 'issue' is that on arm32 (unlike x86) 'long long' has
> 64bit alignment.
> So without the __packed on the union there is a 4 byte hole
> before the union.
>
> ...
> > IIUC, the problem comes since we access the same object with two structures
> > to handle rb_tree.
> >
> > E.g.,
> >
> > [struct extent_node]           [struct rb_entry]
> > struct rb_node rb_node;        struct rb_node rb_node;
> >                                union {
> > struct extent_info ei;           struct {
> >   unsigned int fofs;               unsigned int ofs;
> >   unsigned int len;                unsigned int len;
> >                                  };
> >                                  unsigned long long key;
> >                                } __packed;
> >
> > So, I think if we get a different offset of fofs or ofs between in
> > extent_node and rb_entry, further work'll access a wrong memory since
> > we simply cast the object pointer between two.
>
> That example actually happens to work.
> But replace 'unsigned int' with 'long long' and it all fails.
>
> That is horribly broken (by design).
> You can't do that and expect it to work at all.
> This is another case where you don't want __packed, but to
> request a compilation error if padding was added.
>
> The safest 'fix' (it is still broken by design) is to change the
> alignment of rb_node to be that of 'long long' and remove the
> __packed from the union.
> That adds a 4 byte pad to rb_node on some, but not all, 32bit
> architectures.
> Then all the code that used the above pattern is (probably) ok.
>
> You can use (if I've spelt them right) aligned(Alignof(long long))
> but not aligned(__alignof(long long)) because the latter returns
> the preferred alignment not the actual alignment (8 not 4 on x86).
> I think Alignof() is ok for current kernels, but not for anything
> that might get backported to stable.
> You can use __alignof(struct {long long x;}).
>
> Actually this should also work:
> struct rb_node {
>     union {
>         long long alignment;
>         struct {
>             unsigned long __rb_parent_color;
>             struct rb_node *rb_right;
>             struct rb_node *rb_left;
>         };
>     };
> };

Thank you for the explanation. IMHO, it'd be better to keep the existing rb_node
as-is for all the other components, since the problem is a wrong design in f2fs.
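
To make the layout issue concrete, here is a minimal userspace sketch, not
f2fs code. All names in it (node32, entry_packed, entry_natural,
hole_before_union, print_layout) are hypothetical, and node32 only models a
12-byte rb_node as seen on a 32-bit build. It assumes GCC or Clang, since it
uses __attribute__((packed)).

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a 32-bit struct rb_node: three 4-byte words. */
struct node32 {
    uint32_t parent_color;
    uint32_t right;
    uint32_t left;
};

/* Generic view: the union is packed, so it directly follows the node. */
struct entry_packed {
    struct node32 node;
    union {
        struct {
            uint32_t ofs;
            uint32_t len;
        };
        uint64_t key;
    } __attribute__((packed));
};

/* Same members without the attribute: the union takes the alignment of
 * its 64-bit member, so the compiler may insert padding after the node. */
struct entry_natural {
    struct node32 node;
    union {
        struct {
            uint32_t ofs;
            uint32_t len;
        };
        uint64_t key;
    };
};

/* Bytes of padding that appear before the union once packed is dropped. */
size_t hole_before_union(void)
{
    return offsetof(struct entry_natural, key) -
           offsetof(struct entry_packed, key);
}

/* Print both layouts so the mismatch is visible on the build target. */
void print_layout(void)
{
    printf("packed:  key at %zu\n", offsetof(struct entry_packed, key));
    printf("natural: key at %zu\n", offsetof(struct entry_natural, key));
    printf("hole:    %zu bytes\n", hole_before_union());
}
```

Where the 64-bit type is 8-byte aligned (arm32, most 64-bit ABIs), the packed
view has key at 12 and the natural view has it at 16, so a pointer cast between
the two reads the wrong bytes. On i386 both views give 12, which is why the
problem stays hidden there.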

I posted three patches to remove this buggy rb_entry sharing.
https://lore.kernel.org/lkml/20230310210454.2350881-1-jaegeuk@xxxxxxxxxx/T/#t
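
Independent of those patches, the compile-time check David mentions (fail the
build if padding shows up, rather than hide it with __packed) could be
sketched as below. This is only an illustration under assumptions: the names
node_aligned, generic_entry, extent_like and shared_payload_offset are
hypothetical, and plain C11 _Static_assert is used where kernel code would
use static_assert() or BUILD_BUG_ON().

```c
#include <stddef.h>

/* Node whose alignment is raised to that of long long through a union
 * member, following the alternative suggested in the quoted mail. */
struct node_aligned {
    union {
        long long alignment;    /* never read, only affects alignment */
        struct {
            unsigned long parent_color;
            struct node_aligned *right;
            struct node_aligned *left;
        };
    };
};

/* Generic view with a 64-bit key and no packed attribute. */
struct generic_entry {
    struct node_aligned node;
    union {
        struct {
            unsigned int ofs;
            unsigned int len;
        };
        unsigned long long key;
    };
};

/* Specific view that code casts to and from the generic one. */
struct extent_like {
    struct node_aligned node;
    unsigned int fofs;
    unsigned int len;
};

/* Break the build if the two views ever disagree on the layout. */
_Static_assert(offsetof(struct generic_entry, ofs) ==
               offsetof(struct extent_like, fofs),
               "views disagree on the first payload field");
_Static_assert(offsetof(struct generic_entry, len) ==
               offsetof(struct extent_like, len),
               "views disagree on the second payload field");
_Static_assert(offsetof(struct generic_entry, key) ==
               sizeof(struct node_aligned),
               "padding was inserted between the node and the union");

/* Offset shared by both views, handy for a debug print. */
size_t shared_payload_offset(void)
{
    return offsetof(struct generic_entry, key);
}
```

With the node already aligned for long long, no hole can appear before the
union, so the assertions hold without __packed. The cost is that the node
grows by 4 bytes on those 32-bit architectures where long long is 8-byte
aligned.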

>
> David
>