Re: [PATCH] ceph: fix corruption when using page_count 0 page in rbd

From: Ilya Dryomov
Date: Tue May 06 2014 - 12:22:38 EST


On Wed, Apr 23, 2014 at 8:35 AM, Chunwei Chen <tuxoko@xxxxxxxxx> wrote:
> It has been reported that using ZFSonLinux on rbd will result in memory
> corruption. The bug report can be found here:
>
> https://github.com/zfsonlinux/spl/issues/241
> http://tracker.ceph.com/issues/7790
>
> The reason is that ZFS will send pages with page_count 0 into rbd, which in
> turn sends them to tcp_sendpage. However, tcp_sendpage cannot deal with
> page_count 0, as it will do get_page and put_page, and erroneously free the
> page.
>
> This type of issue has been noted before, and handled in iscsi, drbd,
> etc. So, rbd should also handle this. This fix addresses the issue by falling
> back to the slower sendmsg when page_count 0 is detected.
>
> Cc: Sage Weil <sage@xxxxxxxxxxx>
> Cc: Yehuda Sadeh <yehuda@xxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Chunwei Chen <tuxoko@xxxxxxxxx>
> ---
> net/ceph/messenger.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 4f55f9c..9a964e7 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -557,7 +557,7 @@ static int ceph_tcp_sendmsg(struct socket *sock, struct kvec *iov,
>  	return r;
>  }
>
> -static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
> +static int __ceph_tcp_sendpage(struct socket *sock, struct page *page,
>  		     int offset, size_t size, bool more)
>  {
>  	int flags = MSG_DONTWAIT | MSG_NOSIGNAL | (more ? MSG_MORE : MSG_EOR);
> @@ -570,6 +570,24 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
>  	return ret;
>  }
>
> +static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
> +		     int offset, size_t size, bool more)
> +{
> +	int ret;
> +	struct kvec iov;
> +
> +	/* sendpage cannot properly handle pages with page_count == 0,
> +	 * we need to fallback to sendmsg if that's the case */
> +	if (page_count(page) >= 1)
> +		return __ceph_tcp_sendpage(sock, page, offset, size, more);
> +
> +	iov.iov_base = kmap(page) + offset;
> +	iov.iov_len = size;
> +	ret = ceph_tcp_sendmsg(sock, &iov, 1, size, more);
> +	kunmap(page);
> +
> +	return ret;
> +}
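
For anyone following along, here is a rough userspace model of the failure
mode described in the commit message and of the guard the patch adds. It is
only a sketch: struct model_page and every model_* function are hypothetical
stand-ins invented for illustration, and the real get_page()/put_page() work
on atomic counts and compound pages, which this ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what the model needs. */
struct model_page {
	int refcount;	/* models page_count(page) */
	bool freed;	/* set once the page would go back to the allocator */
};

static void model_get_page(struct model_page *p)
{
	p->refcount++;
}

/* Dropping the last reference frees the page, as put_page() would. */
static void model_put_page(struct model_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->freed = true;
}

/* Zero-copy path: takes a temporary reference and drops it again. */
static void model_sendpage(struct model_page *p)
{
	model_get_page(p);
	model_put_page(p);
}

/* Copy path: reads the bytes without touching the refcount. */
static void model_sendmsg(const struct model_page *p)
{
	(void)p;
}

/*
 * Guarded wrapper mirroring the check in the patch.
 * Returns true when the zero-copy path was taken.
 */
static bool model_send(struct model_page *p)
{
	if (p->refcount >= 1) {
		model_sendpage(p);
		return true;
	}
	model_sendmsg(p);
	return false;
}
```

Calling model_sendpage() directly on a page whose refcount is 0 takes the
count from 0 to 1 and back to 0, so the model marks the page as freed even
though its real owner still uses it. Going through model_send() instead
leaves such a page untouched, while a page with refcount 1 still takes the
zero-copy path and ends up with its count unchanged.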

Looks good to me. Have you tested it with a ZFS version from before the
"Fix crash when using ZFS on Ceph rbd" change?

Thanks,

Ilya