| To: | linux-xfs@xxxxxxxxxxx |
|---|---|
| Subject: | Random Kernel Panic on Boot |
| From: | Mike Baptiste <mike.baptiste@xxxxxxxx> |
| Date: | Thu, 27 Dec 2001 20:52:22 -0500 |
| Organization: | Duke University, Pratt School of Engineering |
| Sender: | owner-linux-xfs@xxxxxxxxxxx |
| User-agent: | Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.7) Gecko/20011221 |

I've got a new system I'm working to put into production, and I have one
nagging problem that I'm trying to resolve. Perhaps one of you has seen
this or can point me in the right direction. I know this may not be an
XFS problem, but since I'm using the XFS installer I figured I'd start
here.

I've got a Dual Athlon server (APPRO 1124 - Tyan Thunder K7 firmware 2.08)
with a Mylex AcceleRAID 170 (32MB) with dual 18GB and dual 36GB Seagate
Cheetahs in a RAID1 config.

DAC960: ***** DAC960 RAID Driver Version 2.4.10 of 23 July 2001 *****
DAC960: Copyright 1998-2001 by Leonard N. Zubkoff <lnz@xxxxxxxxxxxxx>
DAC960#0: Configuring Mylex AcceleRAID 170 PCI RAID Controller
DAC960#0:   Firmware Version: 6.00-13, Channels: 1, Memory Size: 32MB
DAC960#0:   PCI Bus: 0, Device: 8, Function: 1, I/O Address: Unassigned
DAC960#0:   PCI Address: 0xF4004000 mapped at 0xF8832000, IRQ Channel: 10
DAC960#0:   Controller Queue Depth: 512, Maximum Blocks per Command: 2048
DAC960#0:   Driver Queue Depth: 511, Scatter/Gather Limit: 128 of 257 Segments
DAC960#0:   Physical Devices:
DAC960#0:     0:0  Vendor: SEAGATE  Model: ST318406LC  Revision: 0108
DAC960#0:          Wide Synchronous at 160 MB/sec
DAC960#0:          Serial Number: 3FE00G5L00002216GSGY
DAC960#0:          Disk Status: Online, 35807232 blocks
DAC960#0:     0:1  Vendor: SEAGATE  Model: ST318406LC  Revision: 0108
DAC960#0:          Wide Synchronous at 160 MB/sec
DAC960#0:          Serial Number: 3FE00KDJ00007216A142
DAC960#0:          Disk Status: Online, 35807232 blocks
DAC960#0:     0:2  Vendor: SEAGATE  Model: ST336706LC  Revision: 0108
DAC960#0:          Wide Synchronous at 160 MB/sec
DAC960#0:          Serial Number: 3FD08JH200007216E1CD
DAC960#0:          Disk Status: Online, 71651328 blocks
DAC960#0:     0:3  Vendor: SEAGATE  Model: ST336706LC  Revision: 0108
DAC960#0:          Wide Synchronous at 160 MB/sec
DAC960#0:          Serial Number: 3FD08H4000002217FADN
DAC960#0:          Disk Status: Online, 71651328 blocks
DAC960#0:     0:7  Vendor: MYLEX  Model: AcceleRAID 170  Revision: 0600
DAC960#0:          Wide Synchronous at 160 MB/sec
DAC960#0:          Serial Number:
DAC960#0:     0:9  Vendor: QLogic  Model: GEM359  Revision: 1.07
DAC960#0:          Asynchronous
DAC960#0:          Serial Number: 1
DAC960#0:   Logical Drives:
DAC960#0:     /dev/rd/c0d0: RAID-1, Online, 35807232 blocks
DAC960#0:                   Logical Device Initialized, BIOS Geometry: 255/63
DAC960#0:                   Stripe Size: 64KB, Segment Size: 8KB
DAC960#0:                   Read Cache Disabled, Write Cache Disabled
DAC960#0:     /dev/rd/c0d1: RAID-1, Online, 71651328 blocks
DAC960#0:                   Logical Device Initialized, BIOS Geometry: 255/63
DAC960#0:                   Stripe Size: 64KB, Segment Size: 8KB
DAC960#0:                   Read Cache Disabled, Write Cache Disabled

I installed RH 7.2 using the SGI XFS Installer with no problems. I
configured a fairly normal setup for partitions (all XFS):

Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/rd/c0d0p2         4188164   1296152   2892012  31% /
/dev/rd/c0d0p1           59428      8476     50952  15% /boot
none                    513540         0    513540   0% /dev/shm
/dev/vg00/homelv       9432384       180   9432204   1% /home
/dev/vg00/varlv        9432384     42548   9389836   1% /var
/dev/rd/c0d0p5          260240     53040    207200  21% /tmp

Note the LVM partitions were added later (/var & /home used to be
contained in c0d0p2), and this problem showed up prior to that, so
disregard LVM in this case.

The problem is this: when booting either the smp or enterprise kernels
(stock RH 2.4.9-XFS from the XFS installer CD), I get random kernel
panics. Maybe 3 out of 4 boots, at varying points in Interactive boot
(from LVM activation forward), I'll get a Kernel Panic from the DAC960
driver:

DAC960#0: SegmentNumber != SegmentCount

These panics never happen in the same place, but it's always after the
root filesystem has been mounted, while the init.d scripts are running
(though a couple of times I've seen it happen during LVM activation in
rc.sysinit).

I've never seen it happen booting the uniprocessor kernel. I've got APIC
disabled due to the AMD Interrupt Errata (#22) with the 760MP chipset.
The RAID Segment size is 8KB.

If the system boots without a panic, it's been rock solid - no super
heavy load, but lots of compiling and pkg installation without a single
hiccup. But rebooting is always a shot in the dark: the majority of the
time it'll panic, but if I hard reset once or twice, it'll boot normally
and run fine.

Any ideas? Could this be XFS related? I searched all over with Google
and MARC and couldn't find any mention of a problem like this. I know
Mylex 170's are fairly common in Linux configs, so if this was a common
non-XFS problem I figure I'd have come across something, but then again,
maybe not :)

Mike

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Mike Baptiste             202 Hudson Hall, Box 90271, Durham, NC 27708
Director of Information Technology              mike.baptiste@xxxxxxxx
Pratt School of Engineering @ Duke University       Phone:919-660-5404