Re: [PATCH/RFC] Re: recvmmsg() timeout behavior strangeness [RESEND]

From: Michael Kerrisk (man-pages)
Date: Mon Jun 16 2014 - 05:59:18 EST


Hi Arnaldo,

Things have gone quiet ;-). What's the current state of this patch?

Thanks,

Michael


On Thu, May 29, 2014 at 4:17 PM, Arnaldo Carvalho de Melo
<acme@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, May 29, 2014 at 02:06:04PM +0000, David Laight wrote:
>> From: 'Arnaldo Carvalho de Melo'
>> ...
>> > > I remember some discussions from an XNET standards meeting (I've forgotten
>> > > exactly which errors on which calls were being discussed).
>> > > My recollection is that you return success with a partial transfer
>> > > count for ANY error that happens after some data has been transferred.
>> > > The actual error will be returned when it happens again on the next
>> > > system call - Note the AGAIN, not a saved error.
>
>> > A saved error, for the right entity, in the recvmmsg case, that
>> > basically is batching multiple recvmsg syscalls, doesn't sound like a
>> > problem, i.e. the idea is to, as much as possible, mimic what multiple
>> > recvmsg calls would do, but reduce its in/out kernel (and inside kernel
>> > subsystems) overhead.
>
>> > Perhaps we can have something in between, i.e. for things like EFAULT,
>> > we should report straight away, effectively dropping whatever datagrams
>> > successfully received in the current batch, do you agree?
>
>> Not unreasonable - EFAULT shouldn't happen unless the application
>> is buggy.
>
> Ok.
>
>> > For transient errors the existing mechanism, fixed so that only per
>> > socket errors are saved for later, as today, could be kept?
>
>> I don't think it is ever necessary to save an errno value for the
>> next system call at all.
>> Just process the next system call and see what happens.
>
>> If the call returns with less than the maximum number of datagrams
>> and with a non-zero timeout left - then the application can infer
>> that it was terminated by an abnormal event of some kind.
>> This might be a signal.
>
> Then it could use getsockopt(SO_ERROR) perhaps? I.e. we don't return the
> error on the next call, but we provide a way for the app to retrieve the
> reason for the smaller than expected batch?
>
>> I'm not sure if an ICMP error on a connected datagram socket could
>> generate a 'disconnect'. It might happen if the interface is being
>> used for something like SCTP.
>> In either case the next call will detect the error.
>
> - Arnaldo
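
For anyone catching up on the thread, here is a rough user-space sketch of
the semantics discussed above. It is not kernel code and does not call
recvmmsg(); it emulates a batch with repeated non-blocking recv() calls, on
the assumption that a batch should behave like multiple recvmsg() calls. The
names recv_batch and pending_socket_error are made up for illustration. An
error after at least one datagram ends the batch with a partial count and is
not saved (David's suggestion); the caller may then try SO_ERROR (Arnaldo's
suggestion), though whether the kernel would report the reason there is
exactly what is still open.

```python
import socket


def recv_batch(sock, max_datagrams, bufsize=2048):
    """Emulate a batch of recvmsg() calls in user space (hypothetical helper).

    Returns the list of datagrams received. An error that occurs after at
    least one datagram has arrived ends the batch early and is not saved;
    an error on the very first receive is raised to the caller.
    """
    datagrams = []
    for _ in range(max_datagrams):
        try:
            # MSG_DONTWAIT keeps the sketch from blocking on an empty queue.
            data = sock.recv(bufsize, socket.MSG_DONTWAIT)
        except OSError:
            if not datagrams:
                raise  # nothing transferred yet: report the error directly
            break      # partial batch: return what we have, drop the error
        datagrams.append(data)
    return datagrams


def pending_socket_error(sock):
    """Read (and clear) the socket's pending error; 0 means none pending."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
```

A caller that gets fewer datagrams than it asked for can check
pending_socket_error(sock), or simply call recv_batch() again and let the
next call hit the error if the condition persists.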



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/