<div dir="ltr">Hi Brain,<div>Below is the root inode data. I'm currently running xfs_metadump and will send you a link to the file. </div><div>Cheers!</div><div>David</div><div><br></div><div><br></div><div><br></div><div><br></div><div><div><div>xfs_db> sb</div><div>xfs_db> p rootino</div><div>rootino = 1024</div><div>xfs_db> inode 1024</div><div>xfs_db> p</div><div>core.magic = 0</div><div>core.mode = 0</div><div>core.version = 0</div><div>core.format = 0 (dev)</div><div>core.uid = 0</div><div>core.gid = 0</div><div>core.flushiter = 0</div><div>core.atime.sec = Thu Jan 1 10:00:00 1970</div><div>core.atime.nsec = 000000000</div><div>core.mtime.sec = Thu Jan 1 10:00:00 1970</div><div>core.mtime.nsec = 000000000</div><div>core.ctime.sec = Thu Jan 1 10:00:00 1970</div><div>core.ctime.nsec = 000000000</div><div>core.size = 0</div><div>core.nblocks = 0</div><div>core.extsize = 0</div><div>core.nextents = 0</div><div>core.naextents = 0</div><div>core.forkoff = 0</div><div>core.aformat = 0 (dev)</div><div>core.dmevmask = 0</div><div>core.dmstate = 0</div><div>core.newrtbm = 0</div><div>core.prealloc = 0</div><div>core.realtime = 0</div><div>core.immutable = 0</div><div>core.append = 0</div><div>core.sync = 0</div><div>core.noatime = 0</div><div>core.nodump = 0</div><div>core.rtinherit = 0</div><div>core.projinherit = 0</div><div>core.nosymlinks = 0</div><div>core.extsz = 0</div><div>core.extszinherit = 0</div><div>core.nodefrag = 0</div><div>core.filestream = 0</div><div>core.gen = 0</div><div>next_unlinked = 0</div><div>u.dev = 0</div></div></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 7 January 2015 at 10:16, Brian Foster <span dir="ltr"><<a href="mailto:bfoster@redhat.com" target="_blank">bfoster@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Wed, Jan 07, 2015 at 07:34:37AM +1100, David Raffelt wrote:<br>
> Hi Brian and Stefan,
> Thanks for your reply. I checked the status of the array after the rebuild
> (and before the reset).
>
> md0 : active raid6 sdd1[8] sdc1[4] sda1[3] sdb1[7] sdi1[5] sde1[1]
> 14650667520 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6]
> [UUUUUU_]
>
> However given that I've never had any problems before with mdadm rebuilds I
> did not think to check the data before rebooting. Note that the array is
> still in this state. Before the reboot I tried to run a smartctl check on
> the failed drives and it could not read them. When I rebooted I did not
> actually replace any drives, I just power cycled to see if I could
> re-access the drives that were thrown out of the array. According to
> smartctl they are completely fine.
>
> I guess there is no way I can re-add the old drives and remove the newly
> synced drive? Even though I immediately kicked all users off the system
> when I got the mdadm alert, it's possible a small amount of data was
> written to the array during the resync.
>
> It looks like the filesystem was not unmounted properly before reboot:
> Jan 06 09:11:54 server systemd[1]: Failed unmounting /export/data.
> Jan 06 09:11:54 server systemd[1]: Shutting down.
>
> Here are the mount errors in the log after rebooting:
> Jan 06 09:15:17 server kernel: XFS (md0): Mounting Filesystem
> Jan 06 09:15:17 server kernel: XFS (md0): Corruption detected. Unmount and
> run xfs_repair
> Jan 06 09:15:17 server kernel: XFS (md0): Corruption detected. Unmount and
> run xfs_repair
> Jan 06 09:15:17 server kernel: XFS (md0): Corruption detected. Unmount and
> run xfs_repair
> Jan 06 09:15:17 server kernel: XFS (md0): metadata I/O error: block 0x400
> ("xfs_trans_read_buf_map") error 117 numblks 16
> Jan 06 09:15:17 server kernel: XFS (md0): xfs_imap_to_bp:
> xfs_trans_read_buf() returned error 117.
> Jan 06 09:15:17 server kernel: XFS (md0): failed to read root inode
>

So it fails to read the root inode. You could also try to read said
inode via xfs_db (e.g., 'sb', 'p rootino', 'inode <ino#>', 'p') and see
what it shows.
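Untested, but non-interactively that would look something along these
lines, assuming the fs is on /dev/md0 and substituting whatever inode
number 'p rootino' reports:

  xfs_db -r -c "sb 0" -c "p rootino" /dev/md0
  xfs_db -r -c "inode <rootino#>" -c "p" /dev/md0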

Are you able to run xfs_metadump against the fs? If so and you're
willing/able to make the dump available somewhere (compressed), I'd be
interested to take a look to see what might be causing the difference in
behavior between repair and xfs_db.
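For what it's worth, something along these lines should produce a dump you
can then compress and upload (the output path here is only an example;
metadump copies metadata only and obfuscates file names by default):

  xfs_metadump -g /dev/md0 /tmp/md0.metadump
  xz /tmp/md0.metadump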

Brian

> xfs_repair -n -L also complains about a bad magic number.
>
> Unfortunately this 15TB RAID was part of a 45TB GlusterFS distributed
> volume. It was only ever meant to be a scratch drive for intermediate
> scientific results, however inevitably most users used it to store lots of
> data. Oh well.
>
> Thanks again,
> Dave
>
> On 6 January 2015 at 23:47, Brian Foster <bfoster@redhat.com> wrote:
>
> > On Tue, Jan 06, 2015 at 05:12:14PM +1100, David Raffelt wrote:
> > > Hi again,
> > > Some more information.... the kernel log shows the following errors were
> > > occurring after the RAID recovery, but before I reset the server.
> > >
> >
> > By after the raid recovery, you mean after the two drives had failed out
> > and 1 hot spare was activated and resync completed? It certainly seems
> > like something went wrong in this process. The output below looks like
> > it's failing to read in some inodes. Is there any stack trace output
> > that accompanies these error messages to confirm?
> >
> > I suppose I would try to verify that the array configuration looks sane,
> > but after the hot spare resync and then one or two other drive
> > replacements (was the hot spare ultimately replaced?), it's hard to say
> > whether it might be recoverable.
> >
> > Brian
> >
> > > Jan 06 00:00:27 server kernel: XFS (md0): Corruption detected. Unmount
> > and
> > > run xfs_repair
> > > Jan 06 00:00:27 server kernel: XFS (md0): Corruption detected. Unmount
> > and
> > > run xfs_repair
> > > Jan 06 00:00:27 server kernel: XFS (md0): Corruption detected. Unmount
> > and
> > > run xfs_repair
> > > Jan 06 00:00:27 server kernel: XFS (md0): metadata I/O error: block
> > > 0x36b106c00 ("xfs_trans_read_buf_map") error 117 numblks 16
> > > Jan 06 00:00:27 server kernel: XFS (md0): xfs_imap_to_bp:
> > > xfs_trans_read_buf() returned error 117.
> > >
> > >
> > > Thanks,
> > > Dave
> >
>
>
> --
> *David Raffelt (PhD)*
> Postdoctoral Fellow
>
> The Florey Institute of Neuroscience and Mental Health
> Melbourne Brain Centre - Austin Campus
> 245 Burgundy Street
> Heidelberg Vic 3084
> Ph: +61 3 9035 7024
> www.florey.edu.au
-- 
David Raffelt (PhD)
Postdoctoral Fellow

The Florey Institute of Neuroscience and Mental Health
Melbourne Brain Centre - Austin Campus
245 Burgundy Street
Heidelberg Vic 3084
Ph: +61 3 9035 7024
www.florey.edu.au