On 2012-01-25, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> the xfs_info output would be really handy for determining what path
> through the directory code XFS was taking when the crash occurred.
No problem, here it is. The device is an LVM volume. Unfortunately
I've mounted and umounted the drive a few times since the reboot, so I
don't know how helpful this will actually be. I can attempt to repeat
the symptoms then try an xfs_info before attempting anything else. (I
ended up killing the xfs_repair -n to get this sooner, so I still do
not have any information from that. So far it's on phase 4, which is
taking a very long time; I think the reshape is stealing IO cycles,
which it's not really supposed to. It hasn't reported any errors so far.)
meta-data=/dev/XXXXXXXX    isize=256    agcount=57, agsize=61034784 blks
         =                 sectsz=512   attr=0
data     =                 bsize=4096   blocks=3417949184, imaxpct=25
         =                 sunit=0      swidth=0 blks, unwritten=1
naming   =version 2        bsize=4096
log      =internal         bsize=4096   blocks=32768, version=1
         =                 sectsz=512   sunit=0 blks, lazy-count=0
realtime =none             extsz=4096   blocks=0, rtextents=0
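
As a quick sanity check on the geometry above, here is a minimal Python sketch that works out the filesystem size from the printed numbers. It uses only values copied from the xfs_info output; the function names are made up for illustration, and it assumes that every allocation group except the last one is exactly agsize blocks long.

```python
# Values copied from the xfs_info output above.
BLOCK_SIZE = 4096        # data bsize, in bytes
BLOCKS = 3417949184      # data blocks
AGCOUNT = 57             # number of allocation groups
AGSIZE = 61034784        # blocks per (full) allocation group


def fs_bytes(blocks, bsize):
    """Total size of the data section in bytes."""
    return blocks * bsize


def last_ag_blocks(blocks, agcount, agsize):
    """Blocks left for the final AG, assuming all earlier AGs are full."""
    return blocks - (agcount - 1) * agsize


total = fs_bytes(BLOCKS, BLOCK_SIZE)
print(f"data section: {total} bytes ({total / 2**40:.2f} TiB)")
print(f"last AG: {last_ag_blocks(BLOCKS, AGCOUNT, AGSIZE)} blocks")
```

If the assumption holds, the last allocation group comes out much smaller than the others, which would fit a filesystem that has been grown at some point. That is a guess from the numbers, not something the output itself states.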
> I'd be worried about those IO errors - I don't think that they were
> the cause of the oops, but it implies that the underlying device is
> bad in some way. That may have something to do with the reshape in
> progress, which makes me worry about whether the reshape is actually
> keeping your data safe....
Yes, that was my worry as well. Fortunately this is a backup that can
be recreated, but I'd hate to lose my primary store then find out the
backup is hosed.
> As it is, the kernel crashed reading a directory buffer. It's hard
> to say what went wrong - can you take the kernel image and run:
> $ gdb <path/to/kernel>
> (gdb) l *(xfs_da_do_buf+0x43e)
> And post the output so we can see what line number in the code the
> crash occurred at? That might provide a bit more of a clue to what
> the problem is.
Does my kernel need debugging symbols compiled in? Because my kernel
doesn't seem to want to cooperate with gdb:
# gdb /boot/vmlinuz-2.6.39-4.el5.elrepo
GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5_7.1)
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
"/boot/vmlinuz-2.6.39-4.el5.elrepo": not in executable format: File format not recognized
(gdb) l *(xfs_da_do_buf+0x43e)
No symbol table is loaded. Use the "file" command.
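
The "not in executable format" message suggests gdb was handed something other than an ELF file. A vmlinuz under /boot is usually a compressed boot image rather than the uncompressed vmlinux that gdb can read. The sketch below is one way to check which kind of file a given path is. It assumes the standard ELF magic bytes and the "HdrS" signature at offset 0x202 used by the x86 Linux boot protocol; classify_kernel_file is a hypothetical helper name, not part of any existing tool.

```python
import sys

ELF_MAGIC = b"\x7fELF"          # first four bytes of any ELF file
BZIMAGE_MAGIC = b"HdrS"         # x86 boot-protocol header signature
BZIMAGE_MAGIC_OFFSET = 0x202    # where that signature sits in the image


def classify_kernel_file(path):
    """Return 'elf', 'bzimage', or 'unknown' based on magic bytes only."""
    with open(path, "rb") as f:
        head = f.read(BZIMAGE_MAGIC_OFFSET + len(BZIMAGE_MAGIC))
    if head[:len(ELF_MAGIC)] == ELF_MAGIC:
        return "elf"
    end = BZIMAGE_MAGIC_OFFSET + len(BZIMAGE_MAGIC)
    if head[BZIMAGE_MAGIC_OFFSET:end] == BZIMAGE_MAGIC:
        return "bzimage"
    return "unknown"


if __name__ == "__main__":
    # Usage: python check_kernel.py /boot/vmlinuz-... /path/to/vmlinux
    for p in sys.argv[1:]:
        print(p, classify_kernel_file(p))
```

A result of "elf" only says the container format is right. Whether the file also carries the debug information needed to map an address to a source line is a separate question this check does not answer.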
My compiling skills are generally confined to ./configure;make;make
install, so I'm not sure where to go next. If debugging support needs to be
compiled into the kernel, that may be problematic--it looks like ELrepo
doesn't provide the same kernel with debug options, so I'd have to build
one myself to get that. (Wow, I haven't built a kernel in over five