
Re: MSG_EOR flag

To: Steve Whitehouse <Steve@xxxxxxxxxxx>
Subject: Re: MSG_EOR flag
From: Henner Eisen <eis@xxxxxxxxxxxxx>
Date: 07 Mar 2000 19:36:11 +0100
Cc: netdev@xxxxxxxxxxx
In-reply-to: Steve Whitehouse's message of "Thu, 2 Mar 2000 21:21:41 +0000 (GMT)"
References: <200003022121.VAA28963@xxxxxxxxxxxxxx>
Sender: owner-netdev@xxxxxxxxxxx
>>>>> "Steve" == Steve Whitehouse <steve@xxxxxxxxxxxxxx> writes:

    Steve> You can return partial packets, but each call to recvmsg()
    Steve> must only return part (or whole) of one record. It must
    Steve> never return parts of more than one record in a single
    Steve> call. MSG_EOR is only set by recvmsg() on the final part of
    Steve> a record.

    Steve> You can just set MSG_EOR (as you suggest below) on each
    Steve> recvmsg() call if each call always results in a whole
    Steve> record being copied to the user.  This is very unlikely to
    Steve> result in correct behaviour though... you've no idea (from
    Steve> the kernel side) how big a buffer a user is going to give
    Steve> you to put the data in. Unlike SOCK_DGRAM you must not
    Steve> discard records which don't fit in the buffer, but must keep
    Steve> the part not yet sent to the user so the user can request
    Steve> it later.
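
To make the rule quoted above concrete, here is a minimal sketch in plain C of
how a receive path could hand out one record in parts. It is only a model
under stated assumptions: struct record and deliver_part() are hypothetical
names invented for illustration and are not taken from the kernel or the X.25
code. A zero-length record is reported as 0 bytes with MSG_EOR set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>   /* MSG_EOR */

/* Hypothetical model of one queued record; not an actual kernel structure. */
struct record {
    const char *data;   /* full record payload */
    size_t len;         /* total length of the record */
    size_t off;         /* bytes already handed to the user */
};

/*
 * Copy the next part of the record at the head of the queue into buf.
 * The copy never crosses into the following record. MSG_EOR is set in
 * *flags only when the copied part completes the record.
 * Returns the number of bytes copied (0 if nothing is queued).
 */
static size_t deliver_part(struct record *q, size_t nrec, size_t *head,
                           char *buf, size_t buflen, int *flags)
{
    assert(buf != NULL && buflen > 0);
    *flags = 0;
    if (*head >= nrec)
        return 0;                       /* nothing queued */

    struct record *r = &q[*head];
    size_t left = r->len - r->off;
    size_t n = left < buflen ? left : buflen;

    memcpy(buf, r->data + r->off, n);
    r->off += n;                        /* unread remainder stays queued */

    if (r->off == r->len) {             /* this was the final part */
        *flags |= MSG_EOR;
        (*head)++;                      /* next call starts the next record */
    }
    return n;
}
```

With a 10-byte record followed by a 3-byte record and an 8-byte user buffer,
successive calls would yield 8 bytes without MSG_EOR, then 2 bytes with
MSG_EOR, then the 3-byte record with MSG_EOR. Nothing is discarded, unlike
the SOCK_DGRAM case.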

Yes, I know, but the current method of reassembling X.25 packets by means
of the M-bit can fail anyway (currently, this will trigger an
X.25 reset for the virtual connection). I agree that this is broken,
but that's how it is implemented today. It is certainly worth fixing, but
this needs some deeper changes in the X.25 code which cannot be applied
as a last-minute patch for 2.4.

    Steve> If you are looking for the "quick fix" for 2.2.xx though
    Steve> I'd certainly support your suggestion of always returning
    Steve> MSG_EOR from recvmsg() over the current behaviour. So long

o.k.

    Steve> as the applications always have a large enough buffer size,
    Steve> which from your comments I gather they do, then everything
    Steve> should work fine.

A related question is how to handle message boundaries in read()
and write() consistently. If write() in 2.3.x implicitly sets MSG_EOR,
which many protocol families interpret as `each write() should generate a
single, complete message in terms of the underlying protocol',
I think read() from a SOCK_SEQPACKET socket should behave consistently. That
means it should only return once the last fragment has been received (unless
the read buffer is too small, in which case read() should return
an error). But as Linux maps every read() to recvmsg() internally, the
socket layer only sees a recvmsg() call and cannot determine whether
it originated from a read(). Thus, it will be necessary to add a flag
to recvmsg() which is always set when recvmsg() is called on behalf of read().
This flag would request that recvmsg() return only when either the
final part of the message has arrived or the receive buffer size is exceeded.
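
The semantics I have in mind can be emulated in user space, which may make
the proposal easier to discuss. The following is a rough sketch only, not a
patch: read_whole_record(), rx_fn and fake_rx() are hypothetical names made up
for this mail. rx_fn has the same shape as recvmsg(), so recvmsg itself could
be passed in; fake_rx() merely stands in for a link that delivers the record
"HELLOWORLD" in 4-byte fragments. Returning EMSGSIZE for a too-small buffer,
and ECONNRESET when the stream ends mid-record, are assumptions of this
sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Same shape as recvmsg(2), so recvmsg itself can be passed in. */
typedef ssize_t (*rx_fn)(int fd, struct msghdr *msg, int flags);

/*
 * Collect parts until MSG_EOR is reported. Returns the record length,
 * or -1 with errno = EMSGSIZE if buf fills up before the final part.
 */
static ssize_t read_whole_record(rx_fn rx, int fd, char *buf, size_t cap)
{
    size_t got = 0;
    assert(buf != NULL && cap > 0);

    for (;;) {
        if (got == cap) {               /* buffer full, record not complete */
            errno = EMSGSIZE;
            return -1;
        }
        struct iovec iov = { .iov_base = buf + got, .iov_len = cap - got };
        struct msghdr msg;
        memset(&msg, 0, sizeof msg);
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        ssize_t n = rx(fd, &msg, 0);
        if (n < 0)
            return -1;                  /* errno was set by rx */
        got += (size_t)n;
        if (msg.msg_flags & MSG_EOR)
            return (ssize_t)got;        /* final part has arrived */
        if (n == 0) {                   /* end of stream without MSG_EOR */
            if (got == 0)
                return 0;
            errno = ECONNRESET;         /* closed mid-record (sketch choice) */
            return -1;
        }
    }
}

/* Stand-in receive function: hands out "HELLOWORLD" in 4-byte fragments. */
static ssize_t fake_rx(int fd, struct msghdr *msg, int flags)
{
    static const char rec[] = "HELLOWORLD";
    static size_t off;                  /* bytes of rec already delivered */
    const size_t len = sizeof rec - 1;
    size_t n = len - off;
    (void)fd;
    (void)flags;

    if (n > 4)
        n = 4;                          /* fragment size of the fake link */
    if (n > msg->msg_iov[0].iov_len)
        n = msg->msg_iov[0].iov_len;
    memcpy(msg->msg_iov[0].iov_base, rec + off, n);
    off += n;
    msg->msg_flags = 0;
    if (off == len) {                   /* last fragment: mark end of record */
        msg->msg_flags = MSG_EOR;
        off = 0;
    }
    return (ssize_t)n;
}
```

Calling read_whole_record(fake_rx, -1, buf, 16) would return the complete
10-byte record, while a 6-byte buffer would fail with EMSGSIZE. A flag on
recvmsg() would let the kernel do the same without the application looping.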

Is this what MSG_WAITALL is intended for?

Henner



