
Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5

To: Harry Mangalam <harry.mangalam@xxxxxxx>
Subject: Re: XFS on ARM-based Linux on USR8700 NAS appliance w/ mdadm/RAID5
From: Eric Sandeen <sandeen@xxxxxxxxxxx>
Date: Mon, 23 Feb 2009 14:53:07 -0600
Cc: xfs@xxxxxxxxxxx
In-reply-to: <200902231243.33897.harry.mangalam@xxxxxxx>
References: <200902231243.33897.harry.mangalam@xxxxxxx>
User-agent: Thunderbird 2.0.0.19 (Macintosh/20081209)
Harry Mangalam wrote:
> Here's an unusual (long) tale of woe.
> 
> We had a USRobotics 8700 NAS appliance with 4 SATA disks in RAID5:
>  <http://www.usr.com/support/product-template.asp?prod=8700>
> which was a fine (if crude) ARM-based Linux NAS until it stroked out 
> at some point, leaving us with a degraded RAID5 and comatose NAS 
> device.
> 
> We'd like to get the files back of course and I've moved the disks to 
> a Linux PC, hooked them up to a cheap Silicon Image 4x SATA 
> controller and brought up the whole frankenmess with mdadm.  It 
> reported a clean but degraded array.
> 
> (much mdadm stuff deleted)
> 
> Shortening this up considerably, I was able to get the RAID5 
> reconstituted with a new disk, but was not so fortunate with the 
> filesystem.
> 
> The docs and files on the USR web site imply that the native 
> filesystem was originally XFS, but when I try to mount it as such, I 
> can't:

...snip...

> and when I check dmesg:
> [  245.008000] SGI XFS with ACLs, security attributes, realtime, large 
> block numbers, no debug enabled
> [  245.020000] SGI XFS Quota Management subsystem
> [  245.020000] XFS: SB read failed
> [  327.696000] md: md0 stopped.
> [  327.696000] md: unbind<sdc1>
> [  327.696000] md: export_rdev(sdc1)
> [  327.696000] md: unbind<sde1>
> [  327.696000] md: export_rdev(sde1)
> [  327.696000] md: unbind<sdd1>
> [  327.696000] md: export_rdev(sdd1)
> [  439.660000] XFS: bad magic number
> [  439.660000] XFS: SB validate failed
> 
> Repeated attempts just produce the last 2 lines again.  This implies 
> that the superblock is bad, and xfs_repair confirms it:
> xfs_repair /dev/md1
>         - creating 2 worker thread(s)
> Phase 1 - find and verify superblock...
> bad primary superblock - bad magic number !!!
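
As a quick cross-check of that bad-magic report, you can dump the first
sector of the array and look for the "XFSB" signature by hand.  A minimal
sketch, assuming the array device is /dev/md1 as in the xfs_repair run
above:

# dd if=/dev/md1 bs=512 count=1 2>/dev/null | hexdump -C | head -4

On a healthy XFS filesystem the first four bytes are the ASCII magic
"XFSB" (0x58465342); anything else means either the primary superblock
really is damaged or the filesystem simply doesn't start at offset 0 of
the array.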

The main badness I know of on some of the ARM NAS implementations is a
change made because of the differing structure alignment under the old
ARM ABI; the change that went into some vendor trees actually modified
the on-disk format rather than fixing the alignment up properly.  (This
should be fixed upstream now.)  There are other odd problems with cache
aliasing too.

However, this wouldn't cause a superblock mis-read like this.  If you do
get it mounted, you *may* run into what looks like directory corruption
on the PC, though, due to the alignment issue.

Anyway, first, I'd look around for "XFSB" in the first few blocks of your
RAID and see whether the array might have been rebuilt out of order.

# dd if=/dev/md0 bs=4k count=32 | hexdump -C | grep XFSB

or so...
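
If that doesn't turn anything up, a rough follow-up is to scan the start
of each member separately.  This is only a sketch, assuming old 0.90 md
metadata (so component data starts at offset 0 of each partition) and the
member partitions /dev/sdc1, /dev/sdd1 and /dev/sde1 from the dmesg
output above:

# for d in /dev/sdc1 /dev/sdd1 /dev/sde1; do echo "== $d"; dd if=$d bs=4k count=32 2>/dev/null | hexdump -C | grep XFSB; done

Whichever member shows "XFSB" right at offset 0 should be holding the
first chunk of the array, which is a hint about the correct device order
when re-assembling.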

-Eric
