Message-ID: <3DB4EE7B.998FCADA@ch.sauter-bc.com>
Date: Tue, 22 Oct 2002 08:21:47 +0200
From: Simon Matter
Organization: Sauter AG, Basel
To: Luben Tuikov
Cc: linux-xfs, Eric Sandeen
Subject: Re: PATCH: sleeping while holding a lock in _pagebuf_free_bh()::page_buf.c
References: <3DB49424.9E4CAC0F@splentec.com>

Luben Tuikov wrote:
>
> Problem: on an SMP system, BANG#@!, the unthinkable happens.
> Solution: never sleep when holding a lock.
>
> This patch applies to CVS code as of about 18:30 EDT
> on Mon Oct 21 (today), and is self-explanatory.
>
> This patch fixes the problem of the mount going into D state
> indefinitely when the RAID is syncing and mount is run
> right after mkfs.xfs (from shell script, no sleep between,
> low system load, SMP).

I had some trouble with one of my servers after a disk in the SoftRAID failed.
IIRC it was like this: I replaced the broken disk and added it to the RAID
volume using a boot CD. Before the volume had finished syncing, I rebooted.
The box came up but hung when it tried to mount some XFS volumes on the
syncing RAID. My solution was to boot with init=/bin/sh, wait for the sync
to complete, and then do a normal boot.

Could this be the bug you found here?

Simon

>
> If you know of similar incidents in other parts of the code
> those should be fixed, probably ASAP.
>
> Please apply,
> --
> Luben
>
> --- linux/fs/xfs/pagebuf/page_buf.c.orig Mon Oct 21 19:13:01 2002
> +++ linux/fs/xfs/pagebuf/page_buf.c Mon Oct 21 19:13:05 2002
> @@ -710,7 +710,7 @@
>                  pb_resv_bh_cnt++;
>
>                  if (waitqueue_active(&pb_resv_bh_wait)) {
> -                        wake_up(&pb_resv_bh_wait);
> +                        wake_up_sync(&pb_resv_bh_wait);
>                  }
>          }
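
For anyone who wants to script the "wait for the sync to complete" step of the workaround above instead of watching it by hand, here is a rough userspace sketch in C. It rests on assumptions that are not stated in this thread: that the SoftRAID is Linux md, that its status is readable from /proc/mdstat, and that a running rebuild shows up there as the word "resync" or "recovery". The function names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if a chunk of mdstat-style text mentions
 * a running resync or recovery, 0 otherwise.
 * Assumption: the md driver uses the words "resync" or "recovery" in its
 * progress line and nowhere else in the status output.
 */
int md_text_shows_sync(const char *text)
{
    return strstr(text, "resync") != NULL ||
           strstr(text, "recovery") != NULL;
}

/*
 * Hypothetical helper: scans a status file line by line.
 * Returns 1 if a sync appears to be running, 0 if not,
 * and -1 if the file could not be opened.
 */
int md_sync_in_progress(const char *path)
{
    char line[512];
    int busy = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (md_text_shows_sync(line)) {
            busy = 1;
            break;
        }
    }

    fclose(fp);
    return busy;
}
```

A boot script could call md_sync_in_progress("/proc/mdstat") in a loop with a sleep between polls and only mount the XFS volumes once it returns 0. This only sidesteps the hang; it does not replace the patch quoted above.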