To: state-threads@xxxxxxxxxxx
Subject: Re: st_netfd_poll returns within the requested timeout with timed out error
From: Gene Shekhtman <gsh@xxxxxxxxxx>
Date: Mon, 15 Oct 2001 16:05:45 -0700
Organization: Abeona Networks, Inc.
References: <01092411323403.07213@bergsee.bio.uva.nl>
Sender: owner-state-threads@xxxxxxxxxxx

John Val wrote:

> During a computation I send the numerical results line after line to the
> client. Typically 1000 lines are sent in one computation. On the client I
> noticed that some lines were not sent correctly.

I think that John's problem is caused by the fact that ST's time
resolution is really the time interval between scheduling points.
That interval may be as large as 1 second in some situations
(e.g., when a single thread does a lot of CPU-intensive work
continuously without a context switch).

Take a look at the _st_vp_check_clock() function in sched.c.
The elapsed time is calculated as (now - _st_this_vp.last_clock), and
if (elapsed >= thread->sleep), the thread is popped from the sleep
queue.  So, for example, if elapsed is 20ms and the timeout is 10ms,
the thread will wake up as soon as _st_vp_check_clock() is called.
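
To make that check concrete, here is a minimal sketch in C.  It is
not the actual sched.c code: the struct, the usec_t type and the
check_clock() name are made up for illustration, and only the
comparison (elapsed >= sleep) is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

typedef long long usec_t;   /* microseconds; hypothetical stand-in type */

/* Hypothetical sleeper record, not ST's real thread structure. */
struct sleeper {
    usec_t sleep;       /* time left until the timeout expires */
    int    timed_out;   /* set to 1 once the timeout is treated as expired */
};

/*
 * Model of the check made at a scheduling point.
 * *last_clock is when the previous check ran; now is the current time.
 * Every sleeper whose remaining time is covered by the elapsed interval
 * is marked as timed out; the others have their remaining time reduced.
 * Returns the number of sleepers that timed out in this call.
 */
size_t check_clock(struct sleeper *q, size_t n, usec_t *last_clock, usec_t now)
{
    assert(last_clock != NULL);
    assert(q != NULL || n == 0);
    assert(now >= *last_clock);

    usec_t elapsed = now - *last_clock;
    size_t woken = 0;

    *last_clock = now;
    for (size_t i = 0; i < n; i++) {
        if (q[i].timed_out)
            continue;
        if (elapsed >= q[i].sleep) {
            /* The whole interval counts, however long it really was. */
            q[i].sleep = 0;
            q[i].timed_out = 1;
            woken++;
        } else {
            q[i].sleep -= elapsed;
        }
    }
    return woken;
}
```

With one sleeper at 10000us and another at 50000us, a single call
with 20000us elapsed marks the first as timed out and leaves the
second with 30000us remaining.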

In John's case the specified I/O timeout of 10ms is less than the
time interval between scheduling points.  John's application issues
about 1000 st_write()s before filling the output socket buffer and
going to select() with a 10ms timeout.  The Linux select(2) man page
says:

   timeout is an *upper* bound on the amount of time elapsed
   before select returns.

If, for example, the ~1000 st_write()s took 8ms and select() timed
out after 4ms, the elapsed time is 12ms (> 10ms), so st_write()
returns with a timeout error even though it waited for only 4ms.
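
The same arithmetic as a small C sketch.  The names (wait_result,
charge_timeout) are invented for this example and are not part of
ST; the point is only that the time charged against the timeout is
the busy time since the last clock update plus the time spent blocked.

```c
#include <assert.h>

typedef long long usec_t;   /* microseconds; hypothetical stand-in type */

struct wait_result {
    usec_t charged;     /* time counted against the timeout */
    usec_t blocked;     /* time actually spent blocked in select() */
    int    timed_out;   /* 1 if the timeout is reported as expired */
};

/*
 * busy_before: time spent running since the clock was last updated
 *              (e.g. the ~1000 writes that did not block).
 * blocked:     time actually spent waiting in select().
 */
struct wait_result charge_timeout(usec_t timeout, usec_t busy_before,
                                  usec_t blocked)
{
    assert(timeout >= 0 && busy_before >= 0 && blocked >= 0);

    struct wait_result r;
    r.blocked = blocked;
    r.charged = busy_before + blocked;
    r.timed_out = (r.charged >= timeout);
    return r;
}
```

charge_timeout(10000, 8000, 4000) gives charged = 12000 and
timed_out = 1 with only 4000us blocked, while
charge_timeout(10000, 0, 4000) does not time out.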

I think that there is a misunderstanding around the whole timeout
issue.  In 99.9% of all cases I/O timeouts are used either to detect
connection failure or to prevent a peer from holding an idle
connection for too long.  So for most applications realistic I/O
timeouts should be on the order of seconds.  Also, I can't see any
point in retrying after a timeout has happened.  If someone wants to
retry after a timeout, why not increase the timeout value to begin
with?  If an application for some reason wants to know the number of
bytes successfully transferred, it should use the st_*_resid()
functions Mike added in the 1.3a version.
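
A rough sketch of how a residual count yields the number of bytes
transferred.  This does not call ST: mock_write_resid() is a
hypothetical stand-in, and it assumes the resid argument is
value-result (bytes to write on entry, bytes left unwritten on
return).  Check st.h and the ST reference for the real signatures
before relying on this.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a *_resid style write.
 * capacity models how many bytes the socket accepts before the
 * timeout fires.  On entry *resid is the number of bytes to write;
 * on return it is the number of bytes NOT written.
 * Returns 0 on success, -1 to model a timeout.
 */
int mock_write_resid(size_t capacity, size_t *resid)
{
    assert(resid != NULL);

    if (*resid <= capacity) {
        *resid = 0;
        return 0;
    }
    *resid -= capacity;
    return -1;
}

/* Bytes transferred = bytes requested minus bytes left over. */
size_t bytes_transferred(size_t requested, size_t resid)
{
    assert(resid <= requested);
    return requested - resid;
}
```

If 1000 bytes are requested and only 600 fit before the modeled
timeout, the call returns -1 with resid = 400, and
bytes_transferred(1000, 400) is 600.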

--Gene



