>>>>> "Steve" == Steve Whitehouse <steve@xxxxxxxxxxxxxx> writes:
Steve> You can return partial packets, but each call to recvmsg()
Steve> must only return part (or whole) of one record. It must
Steve> never return parts of more than one record in a single
Steve> call. MSG_EOR is only set by recvmsg() on the final part of
Steve> a record.
Steve> You can just set MSG_EOR (as you suggest below) on each
Steve> recvmsg() call if each call always results in a whole
Steve> record being copied to the user. This is very unlikely to
Steve> result in correct behaviour though... you've no idea (from
Steve> the kernel side) how big a buffer a user is going to give
Steve> you to put the data in. Unlike SOCK_DGRAM you must not
Steve> discard records which don't fit in the buffer, but must keep
Steve> the part not yet sent to the user so the user can request
Steve> it later.
Yes, I know, but the current method of re-assembling X.25 packets by means
of the M-bit can fail anyway (currently, this will trigger an
X.25 reset for the virtual connection). I agree that this is broken,
but that's how it is implemented today. It is certainly worth fixing, but
this needs some deeper changes in the X.25 code which cannot be applied
as a last-minute patch for 2.4.
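
To make the required semantics concrete, here is a rough sketch of the
delivery rule Steve describes above: hand out at most one record (or part
of one) per call, keep the undelivered remainder, and flag only the final
part. It is a plain user-space simulation, not the X.25 code; the names
sketch_record, sketch_deliver and SKETCH_EOR are made up for illustration,
with SKETCH_EOR standing in for MSG_EOR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_EOR 0x1  /* hypothetical stand-in for MSG_EOR */

/* One reassembled record plus how much of it the user has already taken. */
struct sketch_record {
    const char *data;
    size_t len;
    size_t consumed;
};

/*
 * Copy at most buflen bytes from the record at the head of the queue.
 * A single call never crosses a record boundary, and the part that did
 * not fit stays queued for the next call. SKETCH_EOR is set in *flags
 * only when the final part of the record is handed out.
 * Returns the number of bytes copied (0 if the queue is empty).
 */
static size_t sketch_deliver(struct sketch_record *q, size_t nrec,
                             size_t *head, char *buf, size_t buflen,
                             int *flags)
{
    assert(q && head && buf && flags);

    *flags = 0;
    if (*head >= nrec || buflen == 0)
        return 0;

    struct sketch_record *r = &q[*head];
    size_t left = r->len - r->consumed;
    size_t n = left < buflen ? left : buflen;

    memcpy(buf, r->data + r->consumed, n);
    r->consumed += n;

    if (r->consumed == r->len) {
        *flags |= SKETCH_EOR;   /* final part of this record */
        (*head)++;              /* next call starts a new record */
    }
    return n;
}
```

With a 4-byte buffer and queued records "abcdef" and "xy", three calls
would return "abcd" (no flag), "ef" (flag set) and "xy" (flag set); nothing
is discarded and no call mixes data from two records.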
Steve> If you are looking for the "quick fix" for 2.2.xx though
Steve> I'd certainly support your suggestion of always returning
Steve> MSG_EOR from recvmsg() over the current behaviour. So long
Steve> as the applications always have a large enough buffer size,
Steve> which from your comments I gather they do, then everything
Steve> should work fine.
A related question is how to handle message boundaries in read()
and write() consistently. If write() in 2.3.x implicitly sets MSG_EOR,
which many protocol families interpret as `each write() should generate a
single, complete message in terms of the underlying protocol', then
I think read() from a SOCK_SEQPACKET socket should behave consistently. That
means it should only return once the last fragment has been received (unless
the read buffer is too small, in which case read() should return
an error). But as Linux maps every read() to recvmsg() internally, the
socket layer only sees a recvmsg() call and cannot determine whether
it originated from a read(). Thus, it will be necessary to add a flag
to recvmsg() which is always set when recvmsg() is called on behalf of read().
This flag would request that recvmsg() return only when either the
final part of the message has arrived or the receive buffer size is exceeded.
Is this what MSG_WAITALL is intended for?
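
To illustrate what I mean, here is a rough user-space sketch of the
behaviour such a flag would request. It is only a simulation under my own
assumptions: struct frag and read_whole_message() are invented names, the
`more' field plays the role of the X.25 M-bit, -EMSGSIZE is just my guess
at a suitable error for a too-small buffer, and -EAGAIN stands in for the
point where a blocking implementation would sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* One received fragment; `more' is non-zero if further fragments follow. */
struct frag {
    const char *data;
    size_t len;
    int more;
};

/*
 * Assemble one complete message starting at fragment *next.
 * Returns the message length once the final fragment (more == 0) has
 * been copied, and advances *next past it.
 * Returns -EMSGSIZE if the message does not fit into buf, and -EAGAIN
 * if the final fragment has not arrived yet; in both cases *next is
 * left untouched so that nothing is lost.
 */
static long read_whole_message(const struct frag *f, size_t nfrag,
                               size_t *next, char *buf, size_t buflen)
{
    assert(f && next && buf);

    size_t used = 0;
    for (size_t i = *next; i < nfrag; i++) {
        if (f[i].len > buflen - used)
            return -EMSGSIZE;           /* receive buffer exceeded */

        memcpy(buf + used, f[i].data, f[i].len);
        used += f[i].len;

        if (!f[i].more) {               /* final part of the message */
            *next = i + 1;
            return (long)used;
        }
    }
    return -EAGAIN;                     /* message still incomplete */
}
```

The point of the sketch is only that the caller sees either a whole
message or an error, never a partial one, which mirrors what write()
promises in the other direction.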