Re: attempting to format brd device results in OOM kills
From: Jeff Layton
Date: Sun Jun 18 2017 - 18:43:58 EST
On Sun, 2017-06-18 at 16:27 -0600, Jens Axboe wrote:
> On 06/18/2017 04:21 PM, Jens Axboe wrote:
> > On 06/18/2017 10:30 AM, Jeff Layton wrote:
> > > I've run across a regression from v4.11. If I boot a v4.12-rc1 or later
> > > kernel, make a large brd device and try to format it, it quickly slows
> > > down to a crawl and then the OOM killer kicks in.
> > >
> > > I ran a bisect and it landed here:
> > >
> > > commit f09a06a193d942a12c1a33c153388b3962222006 (HEAD, refs/bisect/bad)
> > > Author: Christoph Hellwig <hch@xxxxxx>
> > > Date: Wed Apr 5 19:21:16 2017 +0200
> > >
> > > brd: remove discard support
> > >
> > > It's just a in-driver reimplementation of writing zeroes to the pages,
> > > which fails if the discards aren't page aligned.
> > >
> > > Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> > > Reviewed-by: Hannes Reinecke <hare@xxxxxxxx>
> > > Signed-off-by: Jens Axboe <axboe@xxxxxx>
> > >
> > >
> > > I've been reproducing it in a VM with ~8G allocated to it:
> > >
> > > I have a modprobe.d file with this in it:
> > >
> > > options brd rd_nr=1 rd_size=1073741824
> > >
> > > I then just:
> > >
> > > # modprobe brd
> > > # mkfs -t ext2 /dev/ram0
> > >
> > > It keels over pretty quickly after that.
> >
> > Just checked, and creating a 1TB ram disk and then running mkfs.ext2 on it
> > writes 16851MiB of data. I can't say I'm surprised you OOM, if you run that
> > in a 8G VM, as you're about 8G short.
> >
> > I'm puzzled as to why the discard change would make any difference, however.
>
> Reverted the patch, and I see identical behavior. The only difference is that
> the whole device is trimmed first, as expected. But it still writes ~16G
> afterwards.
>
> Are you sure this commit is what broke things for you? Honestly, I don't see
> how it could ever work with 1TB ram disk, 8G of RAM, and 16G of data written.
>
My mistake! My brd rd_size parameter was too large by a factor of 1024
(I missed that it was in kbytes and not bytes). With it sanely sized to
1G (as I had actually intended), it works fine.
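
For anyone who wants to double-check the unit mix-up, here is a minimal
sketch of the arithmetic in Python. It assumes only what is stated above,
namely that brd takes rd_size in units of 1024 bytes; the helper names are
made up for illustration and are not part of any kernel interface.

```python
# Sketch of the rd_size unit arithmetic. Assumption: brd interprets
# rd_size in KiB (1024-byte units), as noted in the message above.
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4


def brd_size_bytes(rd_size_kib: int) -> int:
    """Bytes of RAM disk that a given rd_size value asks for (hypothetical helper)."""
    return rd_size_kib * KIB


def rd_size_for_bytes(size_bytes: int) -> int:
    """rd_size value needed for a RAM disk of size_bytes (hypothetical helper)."""
    return size_bytes // KIB


# The value from the modprobe.d file, mistakenly written as a byte count.
configured_bytes = brd_size_bytes(1073741824)

# The value that would have produced the intended 1G device.
intended_rd_size = rd_size_for_bytes(1 * GIB)

print(f"configured device size: {configured_bytes // TIB} TiB")
print(f"rd_size for a 1 GiB device: {intended_rd_size}")
```

So rd_size=1073741824 asks for a 1 TiB device, while a 1 GiB device
corresponds to rd_size=1048576.
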

It's interesting that the older kernel survives this and the newer one
doesn't, but since it's such a pathological setup I'm not too worried
about it.

As for that commit: no, I'm not sure that's what "broke" it for me.
That's where the bisect landed (and I think I did it right), but I
didn't independently verify whether reverting it helps.

Anyway, here's the bisect log if you're interested:

$ git bisect log
# bad: [2ea659a9ef488125eb46da6eb571de5eae5c43f6] Linux 4.12-rc1
# good: [a351e9b9fc24e982ec2f0e76379a49826036da12] Linux 4.11
git bisect start 'v4.12-rc1' 'v4.11'
# bad: [221656e7c4ce342b99c31eca96c1cbb6d1dce45f] Merge tag 'sound-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad 221656e7c4ce342b99c31eca96c1cbb6d1dce45f
# bad: [8d65b08debc7e62b2c6032d7fe7389d895b92cbc] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect bad 8d65b08debc7e62b2c6032d7fe7389d895b92cbc
# good: [cec381919818a9a0cb85600b3c82404bdd38cf36] Merge tag 'mac80211-next-for-davem-2017-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
git bisect good cec381919818a9a0cb85600b3c82404bdd38cf36
# bad: [6dc2cce9321198172cd96f955a5fc798a4cc35a6] Merge branch 'x86-process-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 6dc2cce9321198172cd96f955a5fc798a4cc35a6
# bad: [477d7caeede0e3a933368440fc877b12c25dbb6d] Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration
git bisect bad 477d7caeede0e3a933368440fc877b12c25dbb6d
# bad: [a5695a79088653c73c92ae8d48658cbc49f31884] coda: Convert to separately allocated bdi
git bisect bad a5695a79088653c73c92ae8d48658cbc49f31884
# good: [ee056f98126170ca8b16b9a4a6e20aae7c5c184e] blk-mq-sched: provide hooks for initializing hardware queue data
git bisect good ee056f98126170ca8b16b9a4a6e20aae7c5c184e
# bad: [2a79efd833dd51c4362af655b9b011393c423f18] lightnvm: fix some WARN() messages
git bisect bad 2a79efd833dd51c4362af655b9b011393c423f18
# bad: [48920ff2a5a940cd07d12cc79e4a2c75f1185aee] block: remove the discard_zeroes_data flag
git bisect bad 48920ff2a5a940cd07d12cc79e4a2c75f1185aee
# good: [ee472d835c264a4cb77f8cf878603e1e40f3559e] block: add a flags argument to (__)blkdev_issue_zeroout
git bisect good ee472d835c264a4cb77f8cf878603e1e40f3559e
# good: [19372e2769179ddd154a0d6fbbdb719eb5d0af12] loop: implement REQ_OP_WRITE_ZEROES
git bisect good 19372e2769179ddd154a0d6fbbdb719eb5d0af12
# bad: [5d1429fead5beacce6df052c31b28a97a11e250b] mmc: remove the discard_zeroes_data flag
git bisect bad 5d1429fead5beacce6df052c31b28a97a11e250b
# bad: [93c1defedcae701512957c279b850659d1dae78f] rbd: remove the discard_zeroes_data flag
git bisect bad 93c1defedcae701512957c279b850659d1dae78f
# bad: [f09a06a193d942a12c1a33c153388b3962222006] brd: remove discard support
git bisect bad f09a06a193d942a12c1a33c153388b3962222006
# first bad commit: [f09a06a193d942a12c1a33c153388b3962222006] brd: remove discard support

Anyway, sorry for the noise!
--
Jeff Layton <jlayton@xxxxxxxxxx>