hey,
We're seeing a condition where processes accessing an XFS filesystem
hang (XFS 1.2.0 on linux-ia64 2.4.20).
The first wedged process is in an unlink call. It has downed
&nd.dentry->d_inode->i_sem in sys_unlink() and then gone to sleep:
specifically, xlog_regrant_write_log_space() called _sv_wait().
The other processes are blocked waiting for this semaphore.
Backtraces of the first process and one of the other processes are below.
crash> bt 22241
PID: 22241 TASK: e0000040c8a68000 CPU: 1 COMMAND: "nwchem"
#0 [BSP:e0000040c8a69618] schedule at e000000004477270
#1 [BSP:e0000040c8a695e8] schedule_timeout at e000000004475e60
#2 [BSP:e0000040c8a69578] xlog_regrant_write_log_space at e00000000
#3 [BSP:e0000040c8a69530] xfs_log_reserve at e0000000046ab690
#4 [BSP:e0000040c8a694c8] xfs_trans_reserve at e0000000046c44f0
#5 [BSP:e0000040c8a69430] xfs_itruncate_finish at e0000000046a14f0
#6 [BSP:e0000040c8a693d8] xfs_inactive at e0000000046d34d0
#7 [BSP:e0000040c8a693b0] vn_rele at e0000000046fa560
#8 [BSP:e0000040c8a69398] linvfs_clear_inode at e0000000046f8570
#9 [BSP:e0000040c8a69370] clear_inode at e000000004517d50
#10 [BSP:e0000040c8a69338] iput at e000000004519850
#11 [BSP:e0000040c8a69318] d_delete at e000000004513e20
#12 [BSP:e0000040c8a692e8] vfs_unlink at e000000004500000
#13 [BSP:e0000040c8a69270] sys_unlink at e000000004500320
#14 [BSP:e0000040c8a69270] ia64_ret_from_syscall at e00000000440e360
EFRAME: e0000040c8a6fe70
B0: 400000000234eca0 CR_IIP: 2000000002cdac20
CR_IPSR: 0000141308526010 CR_IFS: 0000000000000000
AR_PFS: c00000000000030a AR_RSC: 000000000000000f
AR_UNAT: 0000000000000000 AR_RNAT: 0000000000000000
AR_CCV: 0000000000000001 AR_FPSR: 0009804c8a74437f
LOADRS: 0000000002f00000 AR_BSPSTORE: 600000ff80000dd0
B6: 2000000002cdac20 B7: 400000000254a900
PR: 002000000000227d R1: 2000000002d3e1c8
R2: 0000000000000000 R3: 600000ffffff1498
R8: 8000000000000000 R9: 600000ffffff14a8
R10: 0000000000000000 R11: 60000000009900e0
R12: 600000ffffff1040 R13: 20000000027ea080
R14: 600000000094c248 R15: 0000000000000408
R16: 2000000002cdac20 R17: 2000000002d99f90
R18: 2000000002d99f98 R19: 6000000053727020
R20: 2000000002d99e68 R21: 2000000002d9a698
R22: 2000000002d99f80 R23: 0000000000000012
R24: 0000000000000090 R25: 2000000002d99e70
R26: 60000000537270a8 R27: 6000000003e5dad8
R28: 000000000000003b R29: 600000000094dbf8
R30: 0000000000000000 R31: 600000ffffff11c8
F6: 1003e60000000531c82f0 F7: 1003e0000000000000008
F8: 1003e60000000530e1380 F9: 1003e0000000000000033
crash> bt 22242
PID: 22242 TASK: e000000159608000 CPU: 0 COMMAND: "nwchem"
#0 [BSP:e000000159609338] schedule at e000000004477270
#1 [BSP:e0000001596092e8] __down at e000000004429df0
#2 [BSP:e000000159609270] sys_unlink at e000000004500260
#3 [BSP:e000000159609270] ia64_ret_from_syscall at e00000000440e360
EFRAME: e00000015960fe50
B0: 0000000000000000 CR_IIP: ed96deac00000026
CR_IPSR: e00000011c85a009 CR_IFS: 0000000000000010
AR_PFS: 0000141308526010 AR_RSC: 2000000002cdac20
AR_UNAT: 0000000000000000 AR_RNAT: 0000000000000000
AR_CCV: 000000000000003b AR_FPSR: 600000000094dbf8
LOADRS: 0000000000000000 AR_BSPSTORE: 0000000000000000
B6: 000000000000000f B7: 600000ffffff11c8
PR: c00000000000030a R1: 600000ff80000dd0
R2: 002000000000227d R3: 2000000002cdac20
R8: 600000ffffff1040 R9: 20000000027ea080
R10: 600000000094c248 R11: 0000000000000408
R12: 0000000002f00000 R13: 2000000002d3e1c8
R14: 0000000000000000 R15: 600000ffffff1498
R16: 8000000000000000 R17: 600000ffffff14a8
R18: 0000000000000000 R19: 60000000009900e0
R20: 2000000002cdac20 R21: 2000000002d99ed0
R22: 2000000002d99ed8 R23: 600000005370c380
R24: 2000000002d99e68 R25: 2000000002d9a698
R26: 2000000002d99ec0 R27: 0000000000000006
R28: 0000000000000030 R29: 2000000002d99e70
R30: 600000005370c3a8 R31: 6000000003e5dad8
F6: 9804c8a74437f0000000000000001 F7: 400000000254a900400000000234eca0
F8: 1003e60000000531c4270 F9: 1003e0000000000000008
Here are the mount options used:
/dev/md9 on /scratch type xfs (rw,biosize=16,logbufs=8,logbsize=32768)
--
---------------------------
dann frazier
Hewlett-Packard
Linux and Open Source Lab
dannf@xxxxxx
(970) 898-0800