Fwd: Re: xfsdump hanging in uninterruptible sleep

To: linux-xfs@xxxxxxxxxxx
Subject: Fwd: Re: xfsdump hanging in uninterruptible sleep
From: Bernd Strieder <strieder@xxxxxxxxxxxxxxxxxxxx>
Date: Thu, 30 Oct 2003 17:11:10 +0100
Organization: Universitaet Kaiserlautern
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: KMail/1.5.3

Hello,

my first mail bounced, so this is a second attempt at forwarding my 
own message. I had some conversation with Steve Lord on the subject, 
but I have not been able to reach him directly. The hanging xfsdump 
process is still there, in case you need more info. The machine 
will have to be rebooted as soon as possible to get the backup 
working again. If it becomes unstable it will be rebooted 
immediately, since quite a few people rely on it. 

Bernd Strieder
--- Begin Message ---
To: Steve Lord <lord@xxxxxxx>
Subject: Re: xfsdump hanging in uninterruptible sleep
From: Bernd Strieder <strieder@xxxxxxxxxxxxxxxxxxxx>
Date: Wed, 29 Oct 2003 15:48:00 +0100
In-reply-to: <1064535652.1529.25.camel@xxxxxxxxxxxxxxxxxxxxxxx>
Organization: Universitaet Kaiserlautern
References: <200309252314.39433.strieder@xxxxxxxxxxxxxxxxxxxx> <1064535652.1529.25.camel@xxxxxxxxxxxxxxxxxxxxxxx>
User-agent: KMail/1.5.3

Hello,

On Friday, 26 September 2003 02:20 you wrote:
> On Thu, 2003-09-25 at 16:14, Bernd Strieder wrote:
> > Hello
> >
> > The filesystem in question is on a 36GB SCSI drive, the only
> > partition on this drive. The fs contains 22 GB of data, and
> > quotas are used with few exceptions. The filesystem is
> > exported to about 15 Linux clients (1 Debian, the others
> > SuSE 8.2 and 8.1) and 1 OpenBSD client. The server is a P3
> > 2-way SMP machine with 4GB of RAM.
> >
> > Occasionally xfsdump hangs in uninterruptible sleep. It
> > happens from time to time, but I have not been able to
> > trigger the problem deliberately.
> >
> > Usually xfsdump is run at night, writing about 20 GB via rmt
> > to a Sun box with a DLT streamer attached to it. Judging
> > from the backup logs I have, when the problem happens it
> > must be at the beginning of the dump.
> >
> > I have not found a way to kill xfsdump in this state. The
> > machine has to be rebooted, otherwise the xfsdump started
> > the following night gets into the same state. There are no
> > diagnostics anywhere in the system: nothing in syslog, on
> > the console, or in dmesg. ps says xfsdump is in lock_p.
> >
> > The problem happens with the SuSE kernels delivered with
> > SuSE 8.1 and 8.2, with all Linus kernels with XFS patched
> > in, and with -ac kernels. All kernels I have tried show
> > the problem. Before the update to SuSE 8.2, three months
> > once passed between two hangs, using a vanilla 2.4.21
> > kernel with XFS 1.2 patched in.
> >
> > I have tried xfsdump to /dev/null while putting the system
> > under load (more disk load, more network load), but I could
> > not trigger the problem. The tape drive takes about
> > 5MB/sec; when dumping to /dev/zero the rate is about
> > 15MB/sec, which should put more stress on the system.
> >
> > Twice, the hanging xfsdump went unnoticed for some days and
> > the system became unstable, with the kernel NFSd hanging,
> > which made a reboot mandatory.
> >
> > Any ideas?
> >
> > Bernd Strieder
>
> Using sysrq to get a stack trace of the xfsdump thread in the
> kernel will give some pointers to where it is hanging. Since
> it happens when you use the real tape rather than the dummy
> one, there is a fair chance that the tape end of the dump is
> where the problem lies.
>
> You need a kernel with sysrq enabled, and you need to turn it
> on, then there is an option to dump stacks of all kernel
> threads.

After 1 month it happened again... I used sysrq 't' with level 9 
and piped the output of dmesg into ksymoops. The hanging xfsdump 
is included, though possibly not all processes are. The file is 
attached. Since any further xfsdump will fail, the system will 
eventually have to be rebooted to get our backup running again. 

If you need more information and there is any chance of getting 
it from the running system, please let me know; otherwise the 
machine should be rebooted as soon as possible. I will try to 
leave it in its current state as long as possible.
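
To see at a glance which processes are stuck while the machine is still up, the ps listing can be filtered for state D (uninterruptible sleep) together with the wait channel. What follows is only a minimal sketch, not something that was run on the server: the function names `find_uninterruptible` and `run_ps` are made up for illustration, and the column set `pid,stat,wchan:32,comm` assumes a procps-style ps.

```python
import subprocess


def find_uninterruptible(ps_output):
    """Return (pid, wchan, command) for every process in state D.

    Expects the text printed by a ps call with the columns
    pid, stat, wchan, comm (in that order) and one header row.
    """
    hung = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)         # command may contain spaces
        if len(fields) < 4:
            continue                         # ignore blank or short lines
        pid, stat, wchan, comm = fields
        if stat.startswith("D"):             # D = uninterruptible sleep
            hung.append((int(pid), wchan, comm.strip()))
    return hung


def run_ps():
    """Run ps and return its output (assumes a procps-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,wchan:32,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `find_uninterruptible(run_ps())` on the affected machine should list the hanging xfsdump with its wait channel, the same information as the lock_p that ps reported in the original message. It only shows where a process sleeps, so it does not replace the sysrq stack trace.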

# cat /proc/version
Linux version 2.4.22-xfs (root@doyle) (gcc version 3.3.1 (SuSE 
Linux)) #3 SMP Fri Sep 19 13:50:09 CEST 2003

# strings xfs.o

SGI XFS snapshot 2.4.22-2003-09-03_04:09_UTC with ACLs, no debug 
enabled

# xfsdump --help
xfsdump: version 2.2.6 (dump format 3.0)



Bernd Strieder

Attachment: xfsdump.syms
Description: Text document


--- End Message ---