> Hi,
>
> Doing any kind of work on an xfs mounted partition results in processes
> stuck in sv_wait... This occurs on both IDE and SCSI, with and without
> kio (cluster or plain) enabled.
>
> Although rm or cp will be uninterruptibly stuck in sv_wait, they do come
> to life occasionally and perform some I/O. But most of the time is spent
> sleeping, without any I/O getting done. Here's a backtrace of rm:
>
OK, so this one was tracked down to the issues XFS has with the gcc 2.95
compiler (we really must deal with that sometime). In general, though, a
thread that is stuck at this point and not moving is waiting for log space.
This is a legitimate place to block, but not for long. If there is a
long-term sleep in xlog_grant_log_space, then something has usually gone
wrong in another thread and we are failing to flush old metadata, which
would free log space for reuse. Using the xfsidbg module from kdb we can
find a few things out.

Feeding the first parameter of xlog_grant_log_space into the xlog command
gives us the log structure:
[1]kdb> xlog 0xc231f3e0
xlog at 0xc231f3e0
&flushsm: 0xc231f3e0 tic_cnt: 101 tic_tcnt: 102
freelist: 0xc3ddfc58 tail: 0xc3ddfc08 ICLOG: 0xc52f0000
&icloglock: 0xc231f414 tail_lsn: 0x4e00003d98 last_sync_lsn: 0x4e00004018
mp: 0xc42dc800 xbuf: 0xc6ef6280 roundoff: 376 l_covered_state: need
flags: log 0x0 <> dev: 0x804 logBBstart: 4771360 logsize: 65536000
logBBsize: 128000
curr_cycle: 78 prev_cycle: 78 curr_block: 16455 prev_block: 16408
iclog_bak: 0xc231f464 iclog_size: 0x8000 (32768) num iclogs: 4
&grant_lock: 0xc231f474 resHeadQ: 0x00000000 wrHeadQ: 0x00000000
GResCycle: 78 GResBytes: 8424584 GWrCycle: 78 GWrBytes: 8424584
GResBlocks: 16455 GResRemain: 0 GWrBlocks: 16455 GWrRemain: 0
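
As an aside on reading the numbers above: an LSN packs the log cycle into
its upper 32 bits and the block number into its lower 32 bits, which is why
tail_lsn 0x4e00003d98 shows up as [4e:3d98] in the item list. The sketch
below is only an illustration in Python, not part of xfsidbg. The helper
names are made up here, and the wrap handling assumes the head is at most
one cycle ahead of the tail. It decodes the two LSNs and estimates how much
of the log lies between the tail and the current write position.

```python
BBSIZE = 512  # bytes per basic block; logsize / logBBsize in the dump (65536000 / 128000)


def split_lsn(lsn):
    """Split a 64-bit LSN into (cycle, block): cycle is the high 32 bits."""
    return lsn >> 32, lsn & 0xFFFFFFFF


def blocks_between(tail, head, log_bb_size):
    """Basic blocks from tail to head (hypothetical helper).

    Assumes the head is in the same cycle as the tail or exactly one
    cycle ahead of it.
    """
    (tail_cycle, tail_block), (head_cycle, head_block) = tail, head
    if head_cycle == tail_cycle:
        return head_block - tail_block
    # The head has wrapped past the end of the log into the next cycle.
    return log_bb_size - tail_block + head_block


# Values copied from the xlog output above.
tail = split_lsn(0x4e00003d98)   # tail_lsn
sync = split_lsn(0x4e00004018)   # last_sync_lsn
head = (78, 16455)               # curr_cycle, curr_block
used = blocks_between(tail, head, 128000)

print("tail: cycle %d block %d" % tail)
print("last sync: cycle %d block %d" % sync)
print("in use: %d of %d blocks (%d bytes)" % (used, 128000, used * BBSIZE))
```

For this dump the tail decodes to cycle 78, block 15768, and last_sync_lsn
to block 16408 (the prev_block value), so only 687 of the 128000 blocks sit
between tail and head. Together with the empty resHeadQ and wrHeadQ, that
suggests this particular log was not short of space.
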
Taking the mp field from this structure, we can look at the active item
list (AIL), which is the dirty metadata in memory that has already been
written into the log. Writing these items back to disk is what frees log
space:
[1]kdb> xail 0xc42dc800
AIL for mp 0xc42dc800, oldest first
[0] type inode flags: 0x1 <in ail > lsn [4e:3d98]
inode 0xc68fca9c logged 1 flags: 0x0 <> format: 0x5 <core dexts >
lastfield: 0x5 <core dexts >
[1] type inode flags: 0x1 <in ail > lsn [4e:3d98]
inode 0xc68fc684 logged 1 flags: 0x0 <> format: 0x0 <>
lastfield: 0x5 <core dexts >
[2] type inode flags: 0x1 <in ail > lsn [4e:3d98]
inode 0xc3edd28c logged 1 flags: 0x0 <> format: 0x0 <>
lastfield: 0x3 <core ddata >
[3] type inode flags: 0x1 <in ail > lsn [4e:3ed8]
inode 0xc4cc20a0 logged 1 flags: 0x0 <> format: 0x0 <>
lastfield: 0x3 <core ddata >
[4] type inode flags: 0x1 <in ail > lsn [4e:4018]
inode 0xc68fc478 logged 1 flags: 0x0 <> format: 0x0 <>
lastfield: 0x1 <core >
[5] type inode flags: 0x1 <in ail > lsn [4e:4018]
inode 0xc68fc890 logged 1 flags: 0x0 <> format: 0x0 <>
lastfield: 0x5 <core dexts >
[6] type inode flags: 0x1 <in ail > lsn [4e:4018]
inode 0xc3ffed28 logged 1 flags: 0x0 <> format: 0x0 <>
lastfield: 0x1 <core >
[7] type inode flags: 0x1 <in ail > lsn [4e:4018]
inode 0xc68fcca8 logged 1 flags: 0x0 <> format: 0x3 <core ddata >
lastfield: 0x3 <core ddata >
So if this had been a hang, one possibility would be that a thread was
blocked while holding a lock on one of these inodes.
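
When the list is long, the items worth checking first are the ones whose
lsn equals tail_lsn, since the tail of the log cannot move until the oldest
items are written back. The following is a rough Python sketch for picking
those out of a saved copy of the xail output. The function name and the
regular expression are assumptions made for illustration, and the dump is
trimmed to the fields the parser actually reads.

```python
import re

# xail output from above, trimmed to the header line and inode address.
AIL_DUMP = """\
[0] type inode flags: 0x1 <in ail > lsn [4e:3d98]
inode 0xc68fca9c logged 1
[1] type inode flags: 0x1 <in ail > lsn [4e:3d98]
inode 0xc68fc684 logged 1
[2] type inode flags: 0x1 <in ail > lsn [4e:3d98]
inode 0xc3edd28c logged 1
[3] type inode flags: 0x1 <in ail > lsn [4e:3ed8]
inode 0xc4cc20a0 logged 1
[4] type inode flags: 0x1 <in ail > lsn [4e:4018]
inode 0xc68fc478 logged 1
"""

# Matches one item: index, type, lsn as [cycle:block] in hex, inode address.
ENTRY_RE = re.compile(
    r"\[(\d+)\] type (\w+) flags: \S+ <[^>]*> "
    r"lsn \[([0-9a-f]+):([0-9a-f]+)\]\s+inode (0x[0-9a-f]+)"
)


def items_at_tail(dump, tail_lsn):
    """Return inode addresses of AIL items whose lsn equals tail_lsn."""
    found = []
    for _index, _kind, cycle, block, addr in ENTRY_RE.findall(dump):
        lsn = (int(cycle, 16) << 32) | int(block, 16)
        if lsn == tail_lsn:
            found.append(addr)
    return found


# tail_lsn taken from the xlog output earlier in this mail.
for addr in items_at_tail(AIL_DUMP, 0x4e00003d98):
    print(addr)
```

Here that selects entries [0] to [2], so those three inodes would be the
first places to look for a stuck lock holder.
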
This concludes XFS debugging 101.
Steve