netdev

Re: select says I can read, but recvfrom hangs

To: "D. Hugh Redelmeier" <hugh@xxxxxxxxxx>
Subject: Re: select says I can read, but recvfrom hangs
From: Andi Kleen <ak@xxxxxx>
Date: Fri, 22 Jun 2001 13:09:29 +0200
Cc: Andi Kleen <ak@xxxxxx>, netdev@xxxxxxxxxxx
In-reply-to: <Pine.LNX.4.33.0106211521040.8383-100000@redshift.mimosa.com>; from hugh@mimosa.com on Thu, Jun 21, 2001 at 09:42:24PM +0200
References: <20010621182650.50332@colin.muc.de> <Pine.LNX.4.33.0106211521040.8383-100000@redshift.mimosa.com>
Sender: owner-netdev@xxxxxxxxxxx
On Thu, Jun 21, 2001 at 09:42:24PM +0200, D. Hugh Redelmeier wrote:
> This code assumes that for every error return from the recvfrom either
> there is a MSG_ERRQUEUE message, or at least it is safe to try to read
> one -- it won't block.  As far as I can tell, this has never blocked.
> But the number of errors isn't high, so testing hasn't been intense.

It should not block.

> Is it the case that a queued MSG_ERRQUEUE message will cause select to
> say that there is something to read?  I'd expect so.

Yes. select folds error conditions into the read/write sets, while poll
gives you more accurate events, with error and read reported separately.
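
To illustrate the difference, here is a minimal sketch of using poll to tell "data to read" apart from "error pending". The names classify_socket, demo_readable, SOCK_READABLE and SOCK_ERROR are hypothetical, made up for this example; they are not from the code under discussion.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical result bits for this sketch. */
enum { SOCK_READABLE = 1, SOCK_ERROR = 2 };

/*
 * Hypothetical helper: ask poll() what is pending on one socket.
 * Returns -1 if poll() failed, otherwise a bitmask of SOCK_READABLE
 * and SOCK_ERROR. A result of 0 means the call timed out or only
 * events this sketch ignores (e.g. POLLHUP) were reported.
 */
int classify_socket(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n, result = 0;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;    /* POLLERR is reported even if not requested */
    pfd.revents = 0;

    n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;           /* -1: poll failed, 0: timed out */

    if (pfd.revents & POLLIN)
        result |= SOCK_READABLE;   /* normal data can be read */
    if (pfd.revents & POLLERR)
        result |= SOCK_ERROR;      /* a socket error is pending */
    return result;
}

/* Usage sketch: one datagram queued on a local socket pair. */
int demo_readable(void)
{
    int sv[2], r;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    if (send(sv[0], "x", 1, 0) != 1)
        r = -1;
    else
        r = classify_socket(sv[1], 1000);
    close(sv[0]);
    close(sv[1]);
    return r;
}
```

With select, both conditions would simply show up as the descriptor being set in the read set, so the caller has to attempt the read to find out which one it was.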


> 
> This code assumes that if there is a queued MSG_ERRQUEUE message, an
> attempt to recvfrom with flags = 0 will not hang, but instead produce
> an error return.  Is this wrong?  If it is wrong, it contradicts my
> understanding of an answer that Andi gave me last fall.  This could
> explain the hang that we are observing.

It is right. As long as there is an errqueue message, the pending error
of the socket is regenerated, and it should be returned by recvmsg or
reported by select.

In this case it does look like a subtle kernel bug, although I cannot
see it in 2.2.19 on a quick review of the source.

Is the affected machine an SMP box?

-Andi

-- 
Life would be so much easier if we could just look at the source code.
