Re: [PATCH 4.14 024/110] btrfs: use proper endianness accessors for super_copy

From: Anand Jain
Date: Fri Mar 16 2018 - 12:19:31 EST

Next message: Mike Kravetz: "Re: [PATCH v3] hugetlbfs: check for pgoff value overflow"
Previous message: Sinan Kaya: "[PATCH v3 09/18] fm10k: Eliminate duplicate barriers on weakly-ordered archs"
In reply to: Anand Jain: "Re: [PATCH 4.14 024/110] btrfs: use proper endianness accessors for super_copy"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 03/16/2018 02:55 AM, Christoph Biedl wrote:

Greg Kroah-Hartman wrote...

4.14-stable review patch. If anyone has any objections, please let me know.

commit 3c181c12c431fe33b669410d663beb9cceefcd1b upstream.

(...)

If the filesystem is always used on a same endian host, this will not
be a problem.

From my observations I cannot quite subscribe to that.

On big-endian systems, this change introduces severe corruption,
resulting in complete loss of the data on the used block device.

Thanks for the report.

That's really bad, and my mistake. I am investigating how it happened. Our on-disk root bytenr values are stored little-endian, so using cpu_to_le for the write on a big-endian arch is the correct thing to do. If it fails, there must be something I have overlooked, and I am looking into it. Thanks again for the report.
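To make the reasoning about cpu_to_le concrete, here is a minimal userspace sketch in Python. It is only a model, not the btrfs code: `cpu_to_le64()` here is a hand-written stand-in for the kernel helper of that name, `store_u64()` and `read_le64()` are hypothetical helpers, and `BYTENR` is a made-up value. It shows that a big-endian host needs exactly one conversion before a native store, and that zero or two conversions leave a byte-swapped value on disk.

```python
def bswap64(value: int) -> int:
    """Reverse the byte order of an unsigned 64-bit integer."""
    return int.from_bytes(value.to_bytes(8, "little"), "big")


def cpu_to_le64(value: int, host_big_endian: bool) -> int:
    """Model of cpu_to_le64(): identity on little-endian hosts,
    byte swap on big-endian hosts."""
    return bswap64(value) if host_big_endian else value


def store_u64(value: int, host_big_endian: bool) -> bytes:
    """Bytes that end up in memory (and later on disk) when the host
    stores a 64-bit integer in its native byte order."""
    return value.to_bytes(8, "big" if host_big_endian else "little")


def read_le64(raw: bytes) -> int:
    """Interpret 8 bytes as a little-endian on-disk field."""
    return int.from_bytes(raw, "little")


BYTENR = 0x0000000012345000  # hypothetical block address, illustration only

# Big-endian host, converted exactly once: the on-disk value is correct.
once = read_le64(store_u64(cpu_to_le64(BYTENR, True), True))

# Big-endian host, never converted: the on-disk value is byte-swapped.
never = read_le64(store_u64(BYTENR, True))

# Big-endian host, converted twice: the swaps cancel, so this behaves
# like "never converted" and the on-disk value is byte-swapped again.
twice = read_le64(store_u64(cpu_to_le64(cpu_to_le64(BYTENR, True), True), True))

# Little-endian host: the conversion is a no-op, so mistakes stay hidden.
le_host = read_le64(store_u64(cpu_to_le64(BYTENR, False), False))
```

On a little-endian host every variant produces the same bytes, which is why a mismatch of this kind would only show up on big-endian machines.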

Fsck won't be able to figure out the correct on-disk bytenr either.

If there isn't any backup, we could try to find the correct pointers manually. However, restoring from a backup is the much better approach.

-Anand

Steps to reproduce (tested on ppc/powerpc and parisc/hppa):

# mkfs.btrfs $DEV
# mount $DEV /mnt/tmp/
# umount /mnt/tmp/

This simple umount corrupts the file system:

# mount $DEV /mnt/tmp/
mount: /mnt/tmp: wrong fs type, bad option, bad superblock on $DEV, missing codepage or helper program, or other error.

# dmesg:
BTRFS critical (device <dev>): unable to find logical 4294967296 length 4096
BTRFS critical (device <dev>): unable to find logical 4294967296 length 4096
BTRFS critical (device <dev>): unable to find logical 18102363734671360 length 16384
BTRFS error (device <dev>): failed to read chunk root
BTRFS error (device <dev>): open_ctree failed

Also fsck is of no help:

# btrfsck $DEV
Couldn't map the block 18102363734671360
No mapping for 18102363734671360-18102363734687744
Couldn't map the block 18102363734671360
bytenr mismatch, want=18102363734671360, have=0
ERROR: cannot read chunk root
ERROR: cannot open file system
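The block addresses in the dmesg and btrfsck output above can be checked with a little arithmetic. The following Python sketch byte-swaps the reported 64-bit values. Reading them as byte-swapped addresses is an inference from the numbers, not something stated in the report, and `bswap64()` is a hypothetical helper written for this illustration.

```python
def bswap64(value: int) -> int:
    """Reverse the byte order of an unsigned 64-bit integer."""
    return int.from_bytes(value.to_bytes(8, "little"), "big")


# Values copied from the dmesg and btrfsck output.
reported_chunk_root = 18102363734671360
reported_other = 4294967296
range_end = 18102363734687744

swapped_chunk_root = bswap64(reported_chunk_root)
swapped_other = bswap64(reported_other)

# The btrfsck range spans exactly one 16384-byte block, matching the
# "length 16384" in dmesg.
block_len = range_end - reported_chunk_root

print(f"{reported_chunk_root:#018x} -> {swapped_chunk_root:#018x} ({swapped_chunk_root})")
print(f"{reported_other:#018x} -> {swapped_other:#018x} ({swapped_other})")
print(f"block length: {block_len}, aligned: {swapped_chunk_root % block_len == 0}")
```

After the swap, 18102363734671360 becomes 22036480 (0x1504000), which is a multiple of 16384, and 4294967296 becomes 16777216 (0x1000000). Both are plausible addresses for a freshly created file system, which is consistent with the values having been written in the wrong byte order.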

Trying mount or fsck on a little-endian system does not help either. So
I consider the data on that device lost - luckily I use btrfs only for
files where a backup exists all the time.

Reverting that change restored the previous error-free behaviour. I
didn't check HEAD, i.e. v4.16-rc5, since the upstream commit was the last
that affected these files. Still, I could give this a try if anybody
wishes.

Cheers,

Christoph
