On Thu, Mar 04, 2010 at 03:20:39PM +0100, Michael Weissenbacher wrote:
> Hi Christoph/Dave!
>> Also when you next rebuild the kernel please make sure to include
>> CONFIG_KALLSYMS in the configuration, possibly CONFIG_KALLSYMS_ALL too.
>> This will help greatly with decoding any kind of warning / oops.
> Thanks for this information. Unfortunately my current kernel was built
> without CONFIG_KALLSYMS. I'm now recompiling with CONFIG_KALLSYMS and
> CONFIG_KALLSYMS_ALL set. I reckon that my old traces can't be
> ksymoops'ed even if I enable that kernel option now? I will see if I can
> get a fresh trace then (even though i hope it won't happen again).
Yeah, that seems to be the case.
>> Was there anything else in the logs prior to the oops messages
>> that might indicate errors were occurring?
> Unfortunately everything in the logs is dandy until the error happens.
> It seems that xfs_fsr randomly stops at some files and then locks up the
> whole /var partition. I searched for the inode numbers where xfs_fsr
> stopped and one time it was "/var/log/xfs_fsr.log" and the other time it
> was "/var/spool/imap/x/user/xxxx/cyrus.cache" (username obfuscated).
If you've got the inode numbers, then you're running with the verbose
flag set? Do you still have the logs for those inodes that it hung on?
> What's interesting is that I have the no-defrag flag set on the whole
> /var/log directory and still it seemed to hang on that log file.
xfs_fsr doesn't do directory traversals to find files for defrag -
it uses the more efficient bulkstat + open-by-handle method to visit every
inode in the filesystem once. As a result, it will still open inodes
that have the nodefrag flag set on them, but will then ignore them once
it finds the flag is set.
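As a rough sketch of that last check (this is not the xfs_fsr source,
just an illustration): the flag can only be read from an inode that is
already open, which is why the open happens first. The helper names
below are made up, and I'm assuming the generic FS_IOC_FSGETXATTR ioctl
and FS_XFLAG_NODEFRAG bit from a recent <linux/fs.h>; older headers
spell these with an XFS_ prefix.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FSGETXATTR, FS_XFLAG_NODEFRAG */

/* Pure check: is the no-defrag bit set in an inode's extended flags? */
static int xflags_say_nodefrag(uint32_t xflags)
{
    return (xflags & FS_XFLAG_NODEFRAG) != 0;
}

/*
 * Hypothetical helper, called on an inode that is already open.
 * Returns 1 if the file should be skipped, 0 if it may be
 * defragmented, and -1 if the flags could not be read.
 */
static int should_skip_defrag(int fd)
{
    struct fsxattr fsx;

    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
        return -1;
    return xflags_say_nodefrag(fsx.fsx_xflags);
}
```

Anything that blocks between the open and this check would stall the
scan on a file that is marked nodefrag.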
If xfs_fsr hung before it checked the nodefrag flag, then there are
only a few things it could get stuck on:
1. fsync() of the file
2. file lock checks
A trace would tell us which one it was....