Re: [PATCH] tcp: fix potential corner case issue in segmentation (Was: Re: [PATCH] Do not use TSO/GSO when there is urgent data)

From: Petr Tesarik
Date: Fri Nov 21 2008 - 10:51:12 EST


Dne Friday 21 of November 2008 14:07:32 Ilpo JÃrvinen napsal(a):
> On Fri, 21 Nov 2008, Petr Tesarik wrote:
> > This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=12014
> >
> > Since most (if not all) implementations of TSO and even the in-kernel
> > software GSO do not update the urgent pointer when splitting a large
> > segment, it is necessary to turn off TSO/GSO for all outgoing traffic
> > with the URG pointer set.
>
> Good observation, I totally missed this possibility of T/GSO while
> looking.
>
> > Looking at tcp_current_mss (and the preceding comment) I even think
> > this was the original intention. However, this approach is insufficient,
> > because TSO/GSO is turned off only for newly created frames, not for
> > frames which were already pending at the arrival of a message with
> > MSG_OOB set. These frames were created when TSO/GSO was enabled,
> > so they may be large, and they will have the urgent pointer set
> > in tcp_transmit_skb().
> >
> > With this patch, such large packets will be fragmented again before
> > going to the transmit routine.
>
> I wonder if there's some corner case which still fails to fragment
> in tcp_retransmit_xmit's in skb->len <= cur_mss case if cur_mss
> grew very recently (and therefore skb-len now fits to a single segment).

This shouldn't be a problem, because TSO only applies to packets which are
larger than MSS, so the problematic case is when cur_mss gets smaller, not
when it grows. In other words, the original implementation of
tcp_retransmit_xmit() could never make use of TSO/GSO, anyway...

>[...]
> > Signed-off-by: Petr Tesarik <ptesarik@xxxxxxx>
> > CC: Jan Sembera <jsembera@xxxxxxx>
> > CC: Ilpo Jarvinen <ilpo.jarvinen@xxxxxxxxxxx>
> >
> > --
> > tcp_output.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -722,7 +722,8 @@ static void tcp_queue_skb(struct sock *sk, struct
> > sk_buff *skb)
> > static void tcp_set_skb_tso_segs(struct sock *sk, struct sk_buff *skb,
> > unsigned int mss_now)
> > {
> > - if (skb->len <= mss_now || !sk_can_gso(sk)) {
> > + if (skb->len <= mss_now || !sk_can_gso(sk) ||
> > + tcp_urg_mode(tcp_sk(sk))) {
> > /* Avoid the costly divide in the normal
> > * non-TSO case.
> > */
> > @@ -1163,7 +1164,9 @@ static int tcp_init_tso_segs(struct sock *sk,
> > struct sk_buff *skb,
> > {
> > int tso_segs = tcp_skb_pcount(skb);
> >
> > - if (!tso_segs || (tso_segs > 1 && tcp_skb_mss(skb) != mss_now)) {
> > + if (!tso_segs ||
> > + (tso_segs > 1 && (tcp_skb_mss(skb) != mss_now ||
> > + tcp_urg_mode(tcp_sk(sk))))) {
> > tcp_set_skb_tso_segs(sk, skb, mss_now);
> > tso_segs = tcp_skb_pcount(skb);
> > }
>
> It's a bit intrusive but I couldn't immediately come up with alternative
> that would have worked (came up with some not working ones :-)).

Yes, I also noticed that. We could add some more code to tcp_mark_urg(), e.g.
walk sk_write_queue and adjust the pending SKBs there...

Is it OK to simply set all skb->gso_segs to zero, and let the next call to
tcp_init_tso_segs redo them? I mean, will tcp_init_tso_segs() be always
called on all SKBs which are in the write queue at the time tcp_mark_urg() is
called?

Petr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/