Mysterious hangs -- what to do?

To: Linux xfs mailing list <linux-xfs@xxxxxxxxxxx>
Subject: Mysterious hangs -- what to do?
From: Joshua Baker-LePain <jlb17@xxxxxxxx>
Date: Tue, 18 Dec 2001 13:30:12 -0500 (EST)
Sender: owner-linux-xfs@xxxxxxxxxxx
The setup: I'm running Red Hat 7.1 with kernel-smp-2.4.5-SGI_XFS_1.0.1 on a
Dell Precision 610 (dual PIII Xeon 550s, 1GB Registered ECC SDRAM).  There 
is a 9GB system disk on the internal aic7xxx controller and an external 
560GB hardware RAID on an Initio a100u2w.  The RAID is the only XFS 
partition, and is NFS served to about 15 clients.

The prelude: Last Monday (the 10th), literally minutes after I noted that 
the system had a 110+ day uptime, the system spontaneously rebooted (I 
know, I know -- I shouldn't have checked the uptime).  No messages in the 
logs, nothing.  It wasn't a clean shutdown, though, as the system partitions
had to be fscked and the RAID went through an XFS recovery.  I thought to
myself "maybe somebody screwed up and hit the big red button," and let it 
go.  It wasn't a power blip -- the system is on a UPS.

The issue: This morning, I came in to find the system hung.  It responded 
to pings, but that's it.  There were no messages on the console, and I 
certainly couldn't log in.  Alt-SysRq-m showed that it wasn't out of 
memory.  Alt-SysRq-t showed too much stuff to capture or look at 
intelligently (no serial console).  I wrote down the Alt-SysRq-p output, 
and tried to Sync-Sync-Unmount-Boot the thing via SysRq, but nothing 
doing.  So, I hit the big red button and was (again) very thankful for the 
5-second XFS recovery.

The question(s): What can I do next time this happens (as I'm assuming it 
will)?  I'll get a serial console hooked up ASAP (once I figure out 
how), so that will help.  Also, is the Alt-SysRq-p info good for anything?  
There are /var/log/ksyms.? files from the time of both "crashes", if that
will help decode the registers.
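
For reference, matching an address from the Alt-SysRq-p output against one
of those ksyms files could be sketched roughly as below.  This is only an
illustration under assumptions: it assumes each line of the file has the
form "hex-address symbol-name [module]", as /proc/ksyms does on 2.4
kernels, and the function names (load_symbols, resolve) and the sample
addresses are made up for the example.  Since ksyms lists only exported
symbols, the nearest match may not be the function that actually contains
the address; a System.map for the running kernel would be more complete.

```python
import bisect


def load_symbols(lines):
    """Parse 'hexaddr name [module]' lines into a sorted list of (addr, name).

    Assumption: the input follows the /proc/ksyms layout of 2.4 kernels.
    Lines that do not start with a hex address are skipped.
    """
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            addr = int(parts[0], 16)
        except ValueError:
            continue
        # Keep the optional "[module]" suffix as part of the name.
        symbols.append((addr, " ".join(parts[1:])))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the closest symbol at or below addr.

    Returns None if addr lies below every known symbol.
    """
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base


if __name__ == "__main__":
    # Hypothetical sample data, NOT taken from a real crash.
    sample = [
        "c0100000 example_start",
        "c0100100 example_helper [examplemod]",
    ]
    table = load_symbols(sample)
    print(resolve(table, 0xC0100104))
```

To try it on a real file, the lines could be read with something like
load_symbols(open(path)), where path names one of the saved ksyms files,
and the EIP value written down from the console passed to resolve().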

Thanks.

-- 
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University