Re: [PATCH v2] ipc: Modify message queue accounting to reflect both total user data and auxiliary kernel data

From: Davidlohr Bueso
Date: Thu Jun 25 2015 - 14:21:47 EST


[CC'ing akpm as he handles such changes]

On Thu, 2015-06-25 at 09:23 +0200, Michael Kerrisk (man-pages) wrote:
> On 25 June 2015 at 07:47, Davidlohr Bueso <dave@xxxxxxxxxxxx> wrote:
> > On Tue, 2015-06-23 at 00:25 +0200, Marcus Gelderie wrote:
> >> A while back, the message queue implementation in the kernel was
> >> improved to use btrees to speed up retrieval of messages (commit
> >> d6629859b36). The patch introducing the improved kernel handling of
> >> message queues (using btrees) has, as a by-product, changed the
> >> meaning of the QSIZE field in the pseudo-file created for the queue.
> >> Before, this field reflected the size of the user-data in the queue.
> >> Since then, it also takes kernel data structures into account. For
> >> example, if 13 bytes of user data are in the queue, on my machine the
> >> file reports a size of 61 bytes.
> >
> > Good catch, and a nice opportunity to make the mq manpage more specific
> > wrt queue sizes.
> >
> > [...]
> >
> >> Reporting the size of the message queue in kernel has its merits, but
> >> doing so in the QSIZE field of the pseudo file corresponding to the
> >> queue is a breaking change, as mentioned above. This patch therefore
> >> returns the QSIZE field to its original meaning. At the same time,
> >> it introduces a new field QKERSIZE that reflects the size of the queue
> >> in kernel (user data + kernel data).
> >
> > Hmmm I'm not sure about this. What are the specific benefits of having
> > QKERSIZE? We don't export in-kernel data like this in any other ipc
> > (posix or sysv) mechanism, afaik. Plus, we do not compromise kernel data
> > structures like this, as we would break userspace if later we change
> > posix_msg_tree_node. So NAK to this.
> >
> > I would just remove the extra
> > + info->qsize += sizeof(struct posix_msg_tree_node);
> >
> > bits from d6629859b36 (along with -stable v3.5), plus a patch updating
> > the manpage that this field only reflects user data.
>
> I've been hoping that Doug would jump into this discussion...
>
> If I recall/understand Doug correctly (see
> http://thread.gmane.org/gmane.linux.man/7050/focus=1797645 ),

Ah so we _had_ this conversation in the past.

> his
> rationale for the QSIZE change was that it then revealed a value that
> was closer to what was being used to account against the
> RLIMIT_MSGQUEUE resource limit. (Even with these changes, the QSIZE
> value was not 100% accurate for accounting against RLIMIT_MSGQUEUE,
> since some pieces of kernel overhead were still not being accounted
> for. Nevertheless, it's much closer than the old (pre 3.5) QSIZE for
> some corner cases.) Thus, Marcus's rationale for preserving this info
> as QKERSIZE.
>
> Now whether QKERSIZE is actually useful or used by anyone is another
> question. As far as I know, there was no user request that drove the
> change. But Doug can perhaps say something to this. QSIZE should I
> think definitely be fixed (reverted to pre-3.5 behavior). I'm agnostic
> about QKERSIZE.

Right, and we seem to have agreed in the previous thread that the
QSIZE change should be reverted to its original meaning. We also
agree on the main reason why: it unnecessarily exposes kernel
implementation details to userland -- and as such I, at least, still
don't buy much into the idea of QKERSIZE either.
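
To make the accounting difference concrete, here is a minimal userspace
sketch of the two QSIZE interpretations. The struct definitions are
hypothetical stand-ins that only approximate the layout of
posix_msg_tree_node (they are not the kernel's own types), and the
assumption that one node is charged per priority level in use is mine,
not something stated in this thread.

```c
#include <stddef.h>

/* Stand-ins for the kernel's rb_node and list_head (assumed layout:
 * three and two pointer-sized words, respectively). */
struct rb_node_sketch {
	unsigned long parent_color;
	void *right;
	void *left;
};

struct list_head_sketch {
	void *next;
	void *prev;
};

/* Hypothetical mirror of struct posix_msg_tree_node. */
struct tree_node_sketch {
	struct rb_node_sketch rb_node;
	struct list_head_sketch msg_list;
	int priority;
};

/* QSIZE as originally defined: sum of user payload bytes only. */
static size_t qsize_user_only(const size_t *msg_lens, size_t n)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += msg_lens[i];
	return total;
}

/* QSIZE after d6629859b36: payload plus sizeof(node) for each tree
 * node in use (assumed here: one node per distinct priority). */
static size_t qsize_with_overhead(const size_t *msg_lens, size_t n,
				  size_t nodes_in_use)
{
	return qsize_user_only(msg_lens, n) +
	       nodes_in_use * sizeof(struct tree_node_sketch);
}
```

With a single 13-byte message and one node in use, the first function
returns 13 and the second returns 13 plus the node size. On a typical
64-bit build the stand-in struct is 48 bytes, which would be consistent
with the 61 bytes Marcus reported, and it shows why the second value
shifts whenever the node layout changes.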

Thanks,
Davidlohr
