From: Keith Owens
To: Michael Sinz
Cc: Linux XFS List
Subject: Re: Strange behavior on the 2.4.18 XFS tree?
In-reply-to: Your message of "Thu, 16 May 2002 13:57:44 -0400."
 <3CE3F318.1080500@wgate.com>
Date: Fri, 17 May 2002 10:16:30 +1000
Message-ID: <5939.1021594590@kao2.melbourne.sgi.com>

On Thu, 16 May 2002 13:57:44 -0400, Michael Sinz wrote:
>Well, given the "transient" nature of the problem, I can not be sure it
>really "fixed" it but doing the sync did look like it helped - a lot.
>
>I could not even do an "ls -l" of a directory during the "stall" period.
>My already running system monitors (one of which is xosview) showed no
>CPU usage (well, almost none) and almost no disk I/O (mostly 0)
>
>All of memory was used - something on the order of: (but not exactly as
>this was run after the system unblocked)
>
>             total       used       free     shared    buffers     cached
>Mem:        577396     567464       9932          0          0     233460
>-/+ buffers/cache:     334004     243392
>Swap:      2048276       1964    2046312
>
>(Yes, mozilla was running and so was the find in the background)
>
>The find (or other major filesystem traversal) is what helps trigger this
>condition. Where is the system hung if there is no disk I/O and no CPU
>used (unless the CPU usage is not being accounted for?)
>
>BTW - once I did sync things worked as expected (ls ran quickly, etc)

That is the same symptom I have seen. Something has a lock and is not
moving, probably waiting on an event. Forcing some disk flush activity
causes enough events to get everything going again.
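
One way to look for "something has a lock and is not moving" during a stall is to list the tasks sitting in uninterruptible sleep (state D) and what they are waiting on. The script below is only a rough sketch under assumptions that are not from this thread: it assumes the usual Linux /proc layout, and it assumes /proc/<pid>/wchan is available, which it may not be on older kernels such as 2.4.18 (there it just prints "?"). The helper names parse_stat and blocked_tasks are made up for illustration.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left])
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm, wchan) for tasks in uninterruptible sleep (D)."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # task exited, or the entry could not be parsed
        if state != "D":
            continue
        try:
            with open(os.path.join(proc, entry, "wchan")) as f:
                wchan = f.read().strip() or "?"
        except OSError:
            wchan = "?"  # wchan is not exposed on this kernel
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(f"{pid:>7} {comm:<20} {wchan}")
```

Running it from an already open shell while the stall is in progress, and again after the sync, may show whether the find and the ls are parked on the same wait channel. That would only narrow the search; it does not prove which lock is involved.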