
Re: hangs running dbench with 16 clients

To: Steven Pratt <slpratt@xxxxxxxxxxxxxx>
Subject: Re: hangs running dbench with 16 clients
From: Steve Lord <lord@xxxxxxx>
Date: 19 Jun 2003 10:40:06 -0500
Cc: linux-xfs@xxxxxxxxxxx
In-reply-to: <3EF1D6AA.4050200@austin.ibm.com>
Organization:
References: <3EF1C932.4080706@austin.ibm.com> <1056034152.1772.92.camel@jen.americas.sgi.com> <3EF1D6AA.4050200@austin.ibm.com>
Sender: linux-xfs-bounce@xxxxxxxxxxx
On Thu, 2003-06-19 at 10:28, Steven Pratt wrote:
> Steve Lord wrote:
> 
> >On Thu, 2003-06-19 at 09:31, Steven Pratt wrote:
> >  
> >
> >>While trying to do performance regression testing on 2.5.xx kernels, we 
> >>are seeing hangs quite often when running xfs with 16 clients.  No 
> >>error messages, just that dbench does not complete.  XFS is the only 
> >>filesystem that causes this behavior with dbench.  It is very 
> >>reproducible on the 2.5.7x kernels.  Has anyone else seen this?  What 
> >>kind of debug information would you need?
> >>
> >>Steve
> >>    
> >>
> >
> >Hi Steve,
> >
> >Is this the performance regression tests which get run on 2.5 kernels?
> >I have seen the results, never a comment about hangs before.
> >
> Yes, these are the nightly regression runs that we have been doing on the 
> 2.5 kernel series.  We have been more worried about changes in 
> performance than functional issues to this point (that is our mission). 
> Since I am running multiple runs of dbench and averaging results, having 
> one run die still allows us to get numbers.  I just looked over the 
> results again and I see problems on both single-client and 16-client 
> runs.  I seem to remember that we had problems more often on 64-client 
> runs but dropped that version for lack of runtime.  I should also note that 
> this may not be as reproducible as I first mentioned; it seems more like 1 
> out of 8 runs or so dies.

OK, I have dbench 16 running in a loop here. If it keeps going for
a couple of hours I will send you a diff from 2.5.72 and we can see
if it fixes things for you. I will try a second box with a vanilla
Linus kernel and see if I can hit it there. So far no hangs, but there
do seem to be occasional temporary stalls, which is odd.
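
For anyone wanting to reproduce this, a loop like the one described can be
sketched roughly as below. This is only an illustration, not the script in
use here: the function name run_in_loop, the run count and the timeout are
all made up, and the only thing taken from this thread is the "dbench 16"
invocation. A run that exceeds the timeout is counted as a hang.

```python
import subprocess


def run_in_loop(cmd, runs, timeout_s):
    """Run cmd `runs` times and return (completed, failed, hung) counts.

    A run counts as hung if it does not exit within timeout_s seconds.
    """
    completed = failed = hung = 0
    for _ in range(runs):
        try:
            result = subprocess.run(
                cmd,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the child before raising this.
            hung += 1
            continue
        if result.returncode == 0:
            completed += 1
        else:
            failed += 1
    return completed, failed, hung


# Hypothetical usage; the 8 runs and 30-minute timeout are assumptions:
# print(run_in_loop(["dbench", "16"], runs=8, timeout_s=1800))
```

One caveat: the timeout only kills the top-level process. If a client is
stuck inside the kernel it may not be killable at all, in which case the
loop itself can block and the box has to be inspected by hand.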

> 
> >Unfortunately, we have been somewhat tied up with 2.4 things and
> >2.5 is not getting as much air time - and kdb does not work which
> >makes debugging much more interesting. Let me try it out and see
> >what happens, and details about the setup you see this in would
> >be appreciated.
> >
> Understand about 2.5 work. Here is a link to the test machine setup 
> information.
> http://ltcperf.ncsa.uiuc.edu/data/2.5.71/2.4.20-vs-2.5.71/HardwareSoftware.html
> 

Hmm, I don't have 8 cpus in all the boxes I can use for this right now,
never mind 8 cpus in each!

Steve

-- 

Steve Lord                                      voice: +1-651-683-3511
Principal Engineer, Filesystem Software         email: lord@xxxxxxx

