| To: | linux-xfs@xxxxxxxxxxx |
|---|---|
| Subject: | XFS corruption on md raid-0 arrays |
| From: | Mark Watts <m.watts@xxxxxxxxxxxxxxxxx> |
| Date: | Sun, 14 Nov 2004 16:02:49 +0000 |
| Sender: | linux-xfs-bounce@xxxxxxxxxxx |
| User-agent: | Mozilla Thunderbird 0.9 (X11/20041103) |
I'm running XFS on Mandrake 10.0 (2.6.3 kernel). System is Athlon XP, 1GB RAM, 2 x 40GB Maxtor ATA/66 drives connected to a SiI680 IDE controller.

I have several Linux software RAID partitions, each being a raid-0 array with an XFS filesystem. /boot is on a small ext3 partition.

Today, while using the system (which had been running just fine for a week), the XFS driver decided to shut down /dev/md1, citing 'corruptions in memory data' or some such error. dmesg showed a stack trace.

Thinking it might be fixable with a reboot, I rebooted, only to have /dev/md0 get shut down too. When the system comes back up, I get the following on a serial console.

Is this an XFS issue or is something wrong with the md arrays? And the big question: is it fixable?

Cheers,

Mark.

Linux version 2.6.3-16mdk-i686-up-4GB (qateam@xxxxxxxxxxxxxxxxxxxxxxxx) (gcc version 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk4
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000d4000 - 00000000000da000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
BIOS-e820: 000000003fff0000 - 000000003fff8000 (ACPI data)
BIOS-e820: 000000003fff8000 - 0000000040000000 (ACPI NVS)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000fb940
hm, page 000fb000 reserved twice.
hm, page 000fc000 reserved twice.
hm, page 000f6000 reserved twice.
hm, page 000f7000 reserved twice.
On node 0 totalpages: 262128
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 225280 pages, LIFO batch:16
HighMem zone: 32752 pages, LIFO batch:7
DMI 2.3 present.
ACPI: RSDP (v000 AMI ) @ 0x000fa9d0
ACPI: RSDT (v001 AMIINT VIA_K7 0x00000010 MSFT 0x00000097) @ 0x3fff0000
ACPI: FADT (v001 AMIINT VIA_K7 0x00000011 MSFT 0x00000097) @ 0x3fff0030
ACPI: MADT (v001 AMIINT VIA_K7 0x00000009 MSFT 0x00000097) @ 0x3fff00c0
ACPI: DSDT (v001 VIA VIA_K7 0x00001000 MSFT 0x0100000d) @ 0x00000000
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 6:8 APIC version 16
ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
IOAPIC[0]: Assigned apic_id 2
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, IRQ 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI BALANCE SET
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Built 1 zonelists
Kernel command line: BOOT_IMAGE=263i686up4G-16 ro root=900 devfs=mount splash=silent ide=reverse console=ttyS0
bootsplash: silent mode.
ide_setup: ide=reverse : Enabled support for IDE inverse scan order.
Initializing CPU#0
PID hash table entries: 4096 (order 12: 32768 bytes)
Detected 1760.202 MHz processor.
Using pmtmr for high-res timesource
Console: colour VGA+ 80x25
Memory: 1033256k/1048512k available (1955k kernel code, 14348k reserved, 850k data, 292k init, 131008k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 3481.60 BogoMIPS
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
checking if image is initramfs...it isn't (no cpio magic); looks like an initrd
Freeing initrd memory: 478k freed
CPU: CLK_CTL MSR was 6003d22f. Reprogramming to 2003d22f
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Athlon(tm) XP 2100+ stepping 01
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000080
ESR value after enabling vector: 00000000
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 1759.0019 MHz.
..... host bus clock speed is 270.0618 MHz.
NET: Registered protocol family 16
EISA bus registered
PCI: PCI BIOS revision 2.10 entry at 0xfdaf1, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040211
Looking for DSDT in initrd ... not found!
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Power Resource [URP1] (off)
ACPI: Power Resource [URP2] (off)
ACPI: Power Resource [FDDP] (off)
ACPI: Power Resource [LPTP] (off)
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 *6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 *10 11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
PnPBIOS: Disabled
testing the IO APIC.......................
.................................... done.
ACPI: No IRQ known for interrupt pin A of device 0000:00:11.1 - using IRQ 255
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off'
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
apm: overridden by ACPI.
ikconfig 0.7 with /proc/config*
highmem bounce pool size: 64 pages
VFS: Disk quotas dquot_6.5.1
devfs: 2004-01-31 Richard Gooch (rgooch@xxxxxxxxxxxxx)
devfs: boot_options: 0x1
Initializing Cryptographic API
PCI: Via IRQ fixup for 0000:00:10.2, from 6 to 5
PCI: Via IRQ fixup for 0000:00:10.0, from 11 to 5
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
pty: 1024 Unix98 ptys configured
Serial: 8250/16550 driver $Revision: 1.90 $ 20 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 32000K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:11.1
ACPI: No IRQ known for interrupt pin A of device 0000:00:11.1 - using IRQ 255
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8235 (rev 00) IDE UDMA133 controller on pci0000:00:11.1
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
hda: MAXTOR 6L020J1, ATA DISK drive
Using anticipatory io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: SONY CD-RW CRX300E, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
SiI680: IDE controller at PCI slot 0000:00:07.0
SiI680: chipset revision 1
SiI680: BASE CLOCK == 133
SiI680: 100% native mode on irq 19
ide2: MMIO-DMA , BIOS settings: hde:pio, hdf:pio
ide3: MMIO-DMA , BIOS settings: hdg:pio, hdh:pio
hde: Maxtor 94098U8, ATA DISK drive
ide2 at 0xf8807f80-0xf8807f87,0xf8807f8a on irq 19
hdg: Maxtor 94098U8, ATA DISK drive
ide3 at 0xf8807fc0-0xf8807fc7,0xf8807fca on irq 19
hda: max request size: 128KiB
hda: 40132503 sectors (20547 MB) w/1819KiB Cache, CHS=39813/16/63, UDMA(133)
/dev/ide/host0/bus0/target0/lun0: p1
hde: max request size: 64KiB
hde: 80041248 sectors (40981 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(66)
/dev/ide/host2/bus0/target0/lun0: p1 p2 < p5 p6 p7 >
hdg: max request size: 64KiB
hdg: 80041248 sectors (40981 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(66)
/dev/ide/host2/bus1/target0/lun0: p1 p2 < p5 p6 p7 >
mice: PS/2 mouse device common for all mice
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
input: AT Translated Set 2 keyboard on isa0060/serio0
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
EISA: Probing bus 0 at eisa0
NET: Registered protocol family 2
IP: routing cache hash table of 8192 buckets, 64Kbytes
TCP: Hash tables configured (established 262144 bind 65536)
NET: Registered protocol family 1
BIOS EDD facility v0.13 2004-Mar-09, 3 devices found
Please report your BIOS at http://linux.dell.com/edd/results.html
ACPI: (supports S0 S1 S4 S5)
md: Autodetecting RAID arrays.
md: autorun ...
md: considering hdg7 ...
md: adding hdg7 ...
md: hdg6 has different UUID to hdg7
md: adding hde7 ...
md: hde6 has different UUID to hdg7
md: created mdX
md: bind<hde7>
md: bind<hdg7>
md: running: <hdg7><hde7>
md: personality 2 is not loaded!
md :do_md_run() returned -22
md: md1 stopped.
md: unbind<hdg7>
md: export_rdev(hdg7)
md: unbind<hde7>
md: export_rdev(hde7)
md: considering hdg6 ...
md: adding hdg6 ...
md: adding hde6 ...
md: created md0
md: bind<hde6>
md: bind<hdg6>
md: running: <hdg6><hde6>
md: personality 2 is not loaded!
md :do_md_run() returned -22
md: md0 stopped.
md: unbind<hdg6>
md: export_rdev(hdg6)
md: unbind<hde6>
md: export_rdev(hde6)
md: ... autorun DONE.
RAMDISK: Compressed image found at block 0
VFS: Mounted root (ext2 filesystem).
Mounted devfs on /dev Red Hat nash verSCSI subsystem initialized sion 3.5.18-mdk starting Loading scsi_mod.ko module Loading aic7xxx.ko module scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36 <Adaptec 3960D Ultra160 SCSI adapter> aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
<Adaptec 3960D Ultra160 SCSI adapter>
aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBsLoading sd_mod.kmd: raid0 personality registered as nr 2 o module LoadinSGI XFS with ACLs, large block numbers, no debug enabled g raid0.ko modulSGI XFS Quota Management subsystem e Loading xfs.kmd: Autodetecting RAID arrays. o module Mountimd: autorun ... md: considering hde6 ... ng /proc filesysmd: adding hde6 ... tem Creating demd: adding hdg6 ... vice files Mounmd: hde7 has different UUID to hde6 md: hdg7 has different UUID to hde6 md: created md0 md: bind<hdg6> md: bind<hde6> md: running: <hde6><hdg6> ting sysfs Actimd0: setting max_sectors to 128, segment boundary to 32767 raid0: looking at hde6 raid0: comparing hde6(5119488) with hde6(5119488) raid0: END raid0: ==> UNIQUE raid0: 1 zones raid0: looking at hdg6 raid0: comparing hdg6(5119488) with hde6(5119488) raid0: EQUAL raid0: FINAL 1 zones raid0: done. raid0 : md_size is 10238976 blocks. raid0 : conf->hash_spacing is 10238976 blocks. raid0 : nb_zone is 1. raid0 : Allocating 4 bytes for hash. vating md devicemd: considering hde7 ... s mknod: failedmd: adding hde7 ... to create /dev/md: adding hdg7 ... md: created md1 md: bind<hdg7> md: bind<hde7> md: running: <hde7><hdg7> md0: 17 mknod: md1: setting max_sectors to 128, segment boundary to 32767 raid0: looking at hde7 raid0: comparing hde7(33979584) with hde7(33979584) raid0: END raid0: ==> UNIQUE raid0: 1 zones raid0: looking at hdg7 raid0: comparing hdg7(33979584) with hde7(33979584) raid0: EQUAL raid0: FINAL 1 zones raid0: done. raid0 : md_size is 67959168 blocks. raid0 : conf->hash_spacing is 67959168 blocks. raid0 : nb_zone is 1. raid0 : Allocating 4 bytes for hash. md: ... autorun DONE. failed to createXFS mounting filesystem md0 /dev/md/0: 17 Creating root device Mounting root filesystem Starting XFS recovery on filesystem: md0 (dev: md0) Filesystem "md0": XFS internal error xlog_valid_rec_header(1) at line 3485 of file fs/xfs/xfs_log_recover.c. 
Caller 0xf89c
Call Trace:
[<f89bd40b>] 0xf89bd40b
[<f89bd5ec>] 0xf89bd5ec
[<f89bd5ec>] 0xf89bd5ec
[<c02686dd>] i8042_interrupt+0xad/0x140
[<f89bdf8b>] 0xf89bdf8b
[<f89be0b6>] 0xf89be0b6
[<f89be2d5>] 0xf89be2d5
[<f89b5064>] 0xf89b5064
[<f89bfcad>] 0xf89bfcad
[<f89d571c>] 0xf89d571c
[<f89db00d>] 0xf89db00d
[<f89bf080>] 0xf89bf080
[<f89b0ebe>] 0xf89b0ebe
[<f89c7afd>] 0xf89c7afd
[<f89dbce1>] 0xf89dbce1
[<f89dba9f>] 0xf89dba9f
[<c01d38f6>] snprintf+0x26/0x30
[<c018bc37>] disk_name+0xa7/0xc0
[<c015fb10>] sb_set_blocksize+0x20/0x60
[<c015f511>] get_sb_bdev+0x131/0x160
[<c0165663>] real_lookup+0xd3/0x100
[<f89dbc8f>] 0xf89dbc8f
[<f89dba00>] 0xf89dba00
[<c015f789>] do_kern_mount+0x89/0x110
[<c0173a84>] do_add_mount+0x84/0x190
[<c014149b>] __alloc_pages+0x9b/0x360
[<c0173e12>] do_mount+0x182/0x1d0
[<c0141793>] __get_free_pages+0x33/0x40
[<c0173c11>] copy_mount_options+0x81/0x100
[<c01741be>] sys_mount+0x8e/0xd0
[<c010b11d>] sysenter_past_esp+0x52/0x71
XFS: log mount/recovery failed
XFS: log mount failed
mount: error 22
XFS mounting filesystem md0
mounting xfs flags defaults
well, retrying without the option flags
Starting XFS recovery on filesystem: md0 (dev: md0)
Filesystem "md0": XFS internal error xlog_valid_rec_header(1) at line 3485 of file fs/xfs/xfs_log_recover.c.
Caller 0xf89c
Call Trace:
[<f89bd40b>] 0xf89bd40b
[<f89bd5ec>] 0xf89bd5ec
[<f89bd5ec>] 0xf89bd5ec
[<c02686dd>] i8042_interrupt+0xad/0x140
[<f89bdf8b>] 0xf89bdf8b
[<f89be0b6>] 0xf89be0b6
[<f89be2d5>] 0xf89be2d5
[<f89b5064>] 0xf89b5064
[<f89bfcad>] 0xf89bfcad
[<f89d571c>] 0xf89d571c
[<f89db00d>] 0xf89db00d
[<f89bf080>] 0xf89bf080
[<f89b0ebe>] 0xf89b0ebe
[<f89c7afd>] 0xf89c7afd
[<f89dbce1>] 0xf89dbce1
[<f89dba9f>] 0xf89dba9f
[<c01d38f6>] snprintf+0x26/0x30
[<c018bc37>] disk_name+0xa7/0xc0
[<c015fb10>] sb_set_blocksize+0x20/0x60
[<c015f511>] get_sb_bdev+0x131/0x160
[<f89dbc8f>] 0xf89dbc8f
[<f89dba00>] 0xf89dba00
[<c015f789>] do_kern_mount+0x89/0x110
[<c0173a84>] do_add_mount+0x84/0x190
[<c014149b>] __alloc_pages+0x9b/0x360
[<c0173e12>] do_mount+0x182/0x1d0
[<c0141793>] __get_free_pages+0x33/0x40
[<c0173c11>] copy_mount_options+0x81/0x100
[<c01741be>] sys_mount+0x8e/0xd0
[<c010b11d>] sysenter_past_esp+0x52/0x71
XFS: log mount/recovery failed
XFS: log mount failed
mount: error 22
XFS mounting filesystem md0
mounting xfs
well, retrying read-only without any flag
Starting XFS recovery on filesystem: md0 (dev: md0)
Filesystem "md0": XFS internal error xlog_valid_rec_header(1) at line 3485 of file fs/xfs/xfs_log_recover.c.
Caller 0xf89c
Call Trace:
[<f89bd40b>] 0xf89bd40b
[<f89bd5ec>] 0xf89bd5ec
[<f89bd5ec>] 0xf89bd5ec
[<c02686dd>] i8042_interrupt+0xad/0x140
[<f89bdf8b>] 0xf89bdf8b
[<f89be0b6>] 0xf89be0b6
[<f89be2d5>] 0xf89be2d5
[<f89b5064>] 0xf89b5064
[<f89bfcad>] 0xf89bfcad
[<f89d571c>] 0xf89d571c
[<f89db00d>] 0xf89db00d
[<f89bf080>] 0xf89bf080
[<f89b0ebe>] 0xf89b0ebe
[<f89c7afd>] 0xf89c7afd
[<f89dbce1>] 0xf89dbce1
[<f89dba9f>] 0xf89dba9f
[<c01d38f6>] snprintf+0x26/0x30
[<c018bc37>] disk_name+0xa7/0xc0
[<c015fb10>] sb_set_blocksize+0x20/0x60
[<c015f511>] get_sb_bdev+0x131/0x160
[<f89dbc8f>] 0xf89dbc8f
[<f89dba00>] 0xf89dba00
[<c015f789>] do_kern_mount+0x89/0x110
[<c0173a84>] do_add_mount+0x84/0x190
[<c014149b>] __alloc_pages+0x9b/0x360
[<c0173e12>] do_mount+0x182/0x1d0
[<c0141793>] __get_free_pages+0x33/0x40
[<c0173c11>] copy_mount_options+0x81/0x100
[<c01741be>] sys_mount+0x8e/0xd0
[<c010b11d>] sysenter_past_esp+0x52/0x71
XFS: log mount/recovery failed
XFS: log mount failed
mount: error 22 mounting xfs
pivotroot: pivot_root(/sysroot,/sysroot/initrd) failed: 2
Remounting devfs at correct place if necessary
Mounted devfs on /dev
Freeing unused kernel memory: 292k freed
Kernel panic: No init found. Try passing init= option to kernel.
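
A note for anyone reading the console output above: two of its numeric details can be checked mechanically. The sketch below (Python, with illustrative variable names that are not part of the original report) sums the member sizes printed on the "raid0: comparing ..." lines and compares them with the "md_size" values, and looks up the symbolic name of error number 22. All figures are copied from the log; the check only shows that the reported array sizes are self-consistent and says nothing about the cause of the log recovery failure.

```python
import errno

# Member sizes in blocks, copied from the "raid0: comparing ..." lines.
members = {
    "md0": {"hde6": 5119488, "hdg6": 5119488},
    "md1": {"hde7": 33979584, "hdg7": 33979584},
}

# Array sizes as reported by the driver ("raid0 : md_size is N blocks.").
reported = {"md0": 10238976, "md1": 67959168}


def striped_size(parts):
    """Total size of a raid-0 array built from equal-sized members."""
    return sum(parts.values())


for name, parts in members.items():
    total = striped_size(parts)
    status = "matches" if total == reported[name] else "differs from"
    print(f"{name}: {total} blocks, {status} the reported md_size")

# "do_md_run() returned -22" and "mount: error 22" both carry errno 22.
print("errno 22 is", errno.errorcode[22])
```

On Linux, errno 22 is EINVAL ("Invalid argument"), so the mount failure is reported to userspace as an invalid-argument error rather than, say, an I/O error.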