Re: linux-next regression: IO errors with ext4 and xen-blkfront

From: Jens Axboe
Date: Fri Oct 22 2010 - 04:18:37 EST


On 2010-10-21 02:09, Jeremy Fitzhardinge wrote:
> On 10/20/2010 05:04 PM, Jeremy Fitzhardinge wrote:
>> Hi,
>>
>> When doing some regression testing with Xen on linux-next, I'm finding
>> that my domains are failing to get through the boot sequence due to IO
>> errors:
>>
>> Remounting root filesystem in read-write mode: EXT4-fs (dm-0): re-mounted. Opts: (null)
>> [ OK ]
>> Mounting local filesystems: EXT3-fs: barriers not enabled
>> kjournald starting. Commit interval 5 seconds
>> EXT3-fs (xvda1): using internal journal
>> EXT3-fs (xvda1): mounted filesystem with writeback data mode
>> SELinux: initialized (dev xvda1, type ext3), uses xattr
>> SELinux: initialized (dev xenfs, type xenfs), uses genfs_contexts
>> [ OK ]
>> Enabling local filesystem quotas: [ OK ]
>> Enabling /etc/fstab swaps: Adding 917500k swap on /dev/mapper/vg_f1364-lv_swap. Priority:-1 extents:1 across:917500k
>> [ OK ]
>> SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses genfs_contexts
>> Entering non-interactive startup
>> Starting monitoring for VG vg_f1364: 2 logical volume(s) in volume group "vg_f1364" monitored
>> [ OK ]
>> ip6tables: Applying firewall rules: [ OK ]
>> iptables: Applying firewall rules: [ OK ]
>> Bringing up loopback interface: [ OK ]
>> Bringing up interface eth0:
>> Determining IP information for eth0... done.
>> [ OK ]
>> Starting auditd: [ OK ]
>> end_request: I/O error, dev xvda, sector 0
>> end_request: I/O error, dev xvda, sector 0
>> end_request: I/O error, dev xvda, sector 9675936
>> Aborting journal on device dm-0-8.
>> Starting portreserve: EXT4-fs error (device dm-0): ext4_journal_start_sb:259: Detected aborted journal
>> EXT4-fs (dm-0): Remounting filesystem read-only
>> [ OK ]
>> Starting system logger: EXT4-fs (dm-0): error count: 4
>> EXT4-fs (dm-0): initial error at 1286479997: ext4_journal_start_sb:251
>> EXT4-fs (dm-0): last error at 1287618175: ext4_journal_start_sb:259
>>
>>
>> I haven't tried to bisect this yet (which will be awkward because
>> linux-next had also introduced various Xen bootcrashing bugs), but I
>> wonder if you have any thoughts about what may be happening here. I
>> guess an obvious candidate is the barrier changes in the storage
>> subsystem, but I still get the same errors if I mount root with barrier=0.
>
> Hm. I get the same errors, but the system boots to login prompt rather
> than hanging at that point above, and seems generally happy. So perhaps
> barriers are the key.

To test that theory, can you try pulling the two other main parts of the
pending block patches, which do not include the barrier changes, and see
if that works?

git://git.kernel.dk/linux-2.6-block.git for-2.6.37/core
git://git.kernel.dk/linux-2.6-block.git for-2.6.37/drivers

and if that works, then pull

git://git.kernel.dk/linux-2.6-block.git for-2.6.37/barrier

and see how that fares.
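
As a rough sketch of that sequence (not a tested recipe), the pulls could
be scripted as below. It assumes you run it from a checkout of the tree
you are testing. The pull_branch helper and the DRY_RUN switch are
illustrative names only; by default the script just prints the commands
it would run.

```shell
#!/bin/sh
# Sketch only: pull the pending block branches in the suggested order.
# pull_branch and DRY_RUN are hypothetical helpers, not part of any tool.
REPO="git://git.kernel.dk/linux-2.6-block.git"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = actually pull

pull_branch() {
    # $1 is the branch name in the block tree
    if [ "$DRY_RUN" = "1" ]; then
        echo "git pull $REPO $1"
    else
        git pull "$REPO" "$1"
    fi
}

# Step 1: the two non-barrier branches.
pull_branch for-2.6.37/core && pull_branch for-2.6.37/drivers

# Build and boot the result here. Only if the I/O errors are gone,
# continue with step 2 by uncommenting the next line:
# pull_branch for-2.6.37/barrier
```

Keeping the barrier branch as a separate, manual step means a failure
after step 2 points at the barrier changes rather than at the rest of
the block patches.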

--
Jens Axboe
