
Re: high load lockup




Hmm, I didn't know Linux could survive a stack this deep!

Going back into XFS from the memory-freeing path like this is the real
problem. You probably have a multi-thread deadlock here: the
xfs_trans_read_buf call is blocked on a locked buffer. The
upper layer thread which did the original memory allocation is not
holding any buffer locks at this point, but it will be holding an
inode lock. Possibly you have another thread somewhere which is
holding the buffer and is attempting to lock an inode. The Irix memory
freeing code is quite different from the Linux code, and XFS is a little
sensitive to getting this exactly right.
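
To make the suspected lock-order inversion concrete, here is a minimal
sketch of it as a wait-for cycle. This is a toy model, not XFS or pagebuf
code: the names thread_state, owner_of and in_deadlock are hypothetical,
and each thread is assumed to hold at most one lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the inode lock and the buffer lock. */
enum resource { RES_NONE = -1, RES_INODE_LOCK, RES_BUFFER_LOCK };

struct thread_state {
    int holds;    /* resource currently held, or RES_NONE */
    int waits_on; /* resource the thread is blocked on, or RES_NONE */
};

/* Index of the thread holding `res`, or -1 if nobody holds it. */
static int owner_of(const struct thread_state *t, size_t n, int res)
{
    for (size_t i = 0; i < n; i++)
        if (t[i].holds == res)
            return (int)i;
    return -1;
}

/* Follow the wait-for chain from `start`; getting back to `start`
 * means the thread is part of a deadlock cycle. */
static bool in_deadlock(const struct thread_state *t, size_t n, size_t start)
{
    assert(start < n);
    size_t cur = start;
    for (size_t steps = 0; steps < n; steps++) {
        if (t[cur].waits_on == RES_NONE)
            return false;           /* not blocked, chain ends */
        int next = owner_of(t, n, t[cur].waits_on);
        if (next < 0)
            return false;           /* lock is free, no deadlock */
        if ((size_t)next == start)
            return true;            /* cycle closes on the start thread */
        cur = (size_t)next;
    }
    return false;
}
```

With one thread holding the inode lock and waiting on the buffer, and a
second holding the buffer and waiting on the inode, in_deadlock reports
a cycle for both threads.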

Try changing this line in xfs_iread (fs/xfs/xfs_inode.c):

        int             alloc_mode = tp ? KM_SLEEP : KM_SLEEP_IO;

to this:

	int		alloc_mode = KM_SLEEP;

That will prevent this allocation from ever going into prune_dcache,
which of course raises the chance that it will fail and we will
die a horrible death.
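
As a rough illustration of what the change does, here is a toy model of
the mode selection. Only the names KM_SLEEP and KM_SLEEP_IO and the two
expressions come from the lines above; the enum values, the helper
functions, and the assumption that only KM_SLEEP_IO lets memory reclaim
call back into the filesystem are mine, inferred from the backtrace
(tp is 0 there, and the allocation ended up in prune_dcache).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; the real flag values are not shown in this thread. */
enum km_mode { KM_SLEEP, KM_SLEEP_IO };

/* Assumption: only KM_SLEEP_IO lets reclaim re-enter the filesystem. */
static bool reclaim_may_reenter_fs(enum km_mode mode)
{
    return mode == KM_SLEEP_IO;
}

/* Mode selection as in the original xfs_iread line. */
static enum km_mode iread_alloc_mode_original(const void *tp)
{
    return tp ? KM_SLEEP : KM_SLEEP_IO;
}

/* Mode selection after the suggested change: tp no longer matters. */
static enum km_mode iread_alloc_mode_patched(const void *tp)
{
    (void)tp;
    return KM_SLEEP;
}

/* Self-check of the model for both the tp == NULL and tp != NULL cases. */
static void check_alloc_modes(void)
{
    int dummy_tp = 0;
    assert(reclaim_may_reenter_fs(iread_alloc_mode_original(NULL)));
    assert(!reclaim_may_reenter_fs(iread_alloc_mode_original(&dummy_tp)));
    assert(!reclaim_may_reenter_fs(iread_alloc_mode_patched(NULL)));
    assert(!reclaim_may_reenter_fs(iread_alloc_mode_patched(&dummy_tp)));
}
```

In this model the non-transaction case (tp == NULL) is the only one that
can re-enter the filesystem, and the patched selection closes it off.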

I hate to request it, since it sounds like it will be huge, but can
you send me the output of the kdb command bta for one of these hangs?

Steve


> I've got a lockup in XFS under very high nfs load (SPECsfs with 5000
> OPS). The lockup is reproducible. With 2.4.7-xfs (the oss.sgi.com CVS
> tree July 24th) I get the following backtrace in kdb:
> 
> [0]kdb> btp 484
>     EBP       EIP         Function(args)
> 0xf05a9704 0x80114952 schedule+0x406 (0xd3a3e0e0, 0x1)
>                                kernel .text 0x80100000 0x8011454c 0x80114b60
> 0xf05a9730 0x80105d34 __down+0x78
>                                kernel .text 0x80100000 0x80105cbc 0x80105d94
> 0xf05a9744 0x80105ef3 __down_failed+0xb (0xf05a97f4, 0xf888a92a, 0xd3a3e0e0, 
> 0x60c000, 0xedaaef60)
>                                kernel .text 0x80100000 0x80105ee8 0x80105efc
>            0xf888ae0a [pagebuf].text.lock+0x10c
>                                pagebuf .text.lock 0xf888acfe 0xf888acfe 0xf888aea0
> 0xf05a974c 0xf888a81f [pagebuf]_pagebuf_grab_lock+0x13 (0xd3a3e0e0, 0x60c000)
>                                pagebuf .text 0xf8886060 0xf888a80c 0xf888a824
> 0xf05a97f4 0xf888a92a [pagebuf]_pagebuf_find_lockable_buffer+0x106 (0xedaaef60, 0x60c000, 0x0, 0x2000, 0x2201)
>                                pagebuf .text 0xf8886060 0xf888a824 0xf888aa18
> 0xf05a9824 0xf888aa49 [pagebuf]_pagebuf_get_lockable_buffer+0x31 (0xedaaef60,
>  0x60c000, 0x0, 0x2000, 0x2201)
>                                pagebuf .text 0xf8886060 0xf888aa18 0xf888aaf8
> 0xf05a9860 0xf8886ef2 [pagebuf]pagebuf_get+0x8e (0xedaaef60, 0x60c000, 0x0, 0x2000, 0x2201)
>                                pagebuf .text 0xf8886060 0xf8886e64 0xf8886fb4
> 0xf05a9890 0xf88ddf74 [xfs]xfs_trans_read_buf+0x48 (0xf0325400, 0x0, 0xf0325564, 0x3060, 0x0)
>                                xfs .text 0xf8894060 0xf88ddf2c 0xf88de230
> 0xf05a98ec 0xf88c9697 [xfs]xfs_itobp+0xfb (0xf0325400, 0x0, 0xe36fd7d0, 0xf05a9930, 0xf05a9934)
>                                xfs .text 0xf8894060 0xf88c959c 0xf88c975c
> 0xf05a9938 0xf88cc500 [xfs]xfs_iflush+0xa8 (0xe36fd7d0, 0x5)
>                                xfs .text 0xf8894060 0xf88cc458 0xf88cc86c
> 0xf05a994c 0xf88cd74e [xfs]xfs_inode_item_push+0x12 (0xe50a38f8)
>                                xfs .text 0xf8894060 0xf88cd73c 0xf88cd760
> 0xf05a998c 0xf88dd928 [xfs]xfs_trans_push_ail+0x120 (0xf0325400, 0x784a, 0xbf)
>                                xfs .text 0xf8894060 0xf88dd808 0xf88dda0c
> 0xf05a99d4 0xf88d0703 [xfs]xlog_grant_push_ail+0x14b (0xf0325400, 0x2c5b8)
>                                xfs .text 0xf8894060 0xf88d05b8 0xf88d0710
> 0xf05a99f0 0xf88cfadb [xfs]xfs_log_reserve+0x3f (0xf0325400, 0x2abb8, 0x2, 0xe6b46e54, 0x69)
>                                xfs .text 0xf8894060 0xf88cfa9c 0xf88cfb24
> 0xf05a9a1c 0xf88dc88a [xfs]xfs_trans_reserve+0x76 (0xe6b46e20, 0x0, 0x2abb8, 
> 0x0, 0x4)
>                                xfs .text 0xf8894060 0xf88dc814 0xf88dc934
> 0xf05a9a58 0xf88a6300 [xfs]xfs_bmap_finish+0xac (0xf05a9b28, 0xf05a9ac0, 0xffffffff, 0xffffffff, 0xf05a9ab0)
>                                xfs .text 0xf8894060 0xf88a6254 0xf88a63ac
> 0xf05a9acc 0xf88cb0a6 [xfs]xfs_itruncate_finish+0x22e (0xf05a9b28, 0xb08742b4, 0x400, 0x0, 0x0)
>                                xfs .text 0xf8894060 0xf88cae78 0xf88cb19c
> 0xf05a9b54 0xf88e2ba1 [xfs]xfs_inactive_free_eofblocks+0x209 (0xf0325400, 0xb08742b4)
>                                xfs .text 0xf8894060 0xf88e2998 0xf88e2be8
> 0xf05a9b84 0xf88e32be [xfs]xfs_inactive+0x10e (0xb08742cc, 0x0)
>                                xfs .text 0xf8894060 0xf88e31b0 0xf88e360c
> 0xf05a9ba0 0xf88f2325 [xfs]vn_put+0x49 (0xd7f3394c)
>                                xfs .text 0xf8894060 0xf88f22dc 0xf88f2398
> 0xf05a9bac 0xf88f1499 [xfs]linvfs_put_inode+0x19 (0xd7f33840)
>                                xfs .text 0xf8894060 0xf88f1480 0xf88f14a0
> 0xf05a9bc0 0x8014b875 iput+0x2d (0xd7f33840)
>                                kernel .text 0x80100000 0x8014b848 0x8014b9d8
> 0xf05a9bdc 0x801491c1 prune_dcache+0xe9 (0x1c209)
>                                kernel .text 0x80100000 0x801490d8 0x80149250
> 0xf05a9be8 0x80149526 shrink_dcache_memory+0x22 (0x6, 0xf0, 0xf0, 0x1)
>                                kernel .text 0x80100000 0x80149504 0x80149538
> 0xf05a9c0c 0x8012f758 do_try_to_free_pages+0x28 (0xf0, 0x1, 0x0)
>                                kernel .text 0x80100000 0x8012f730 0x8012f78c
> 0xf05a9c20 0x8012f8bc try_to_free_pages+0x24 (0xf0)
>                                kernel .text 0x80100000 0x8012f898 0x8012f8c8
> 0xf05a9c50 0x801306e7 __alloc_pages+0x1d3
>                                kernel .text 0x80100000 0x80130514 0x80130790
> 0xf05a9c5c 0x8013050b _alloc_pages+0x1b
>                                kernel .text 0x80100000 0x801304f0 0x80130514
> 0xf05a9c64 0x8013079d __get_free_pages+0xd
>                                kernel .text 0x80100000 0x80130790 0x801307ac
> 0xf05a9c88 0x8012caf3 kmem_cache_grow+0xe3 (0xf020a970, 0xf0, 0x0, 0x6)
>                                kernel .text 0x80100000 0x8012ca10 0x8012cc80
> 0xf05a9cb0 0x8012d0e2 kmem_cache_zalloc+0xa6 (0xf020a970, 0xf0, 0xeda92358)
>                                kernel .text 0x80100000 0x8012d03c 0x8012d11c
> 0xf05a9ccc 0xf888f7ef [xfs_support]kmem_zone_zalloc+0x43 (0xf020a970, 0x4)
>                                xfs_support .text 0xf888f060 0xf888f7ac 0xf888f840
> 0xf05a9cf0 0xf88ca721 [xfs]xfs_iread+0x2d (0xf0325400, 0x0, 0x340102b, 0x0, 0xf05a9d38)
>                                xfs .text 0xf8894060 0xf88ca6f4 0xf88ca890
> 0xf05a9d3c 0xf88c87cc [xfs]xfs_iget_core+0x214 (0xbd3c13cc, 0xf0325400, 0x0, 
> 0x340102b, 0x0)
>                                xfs .text 0xf8894060 0xf88c85b8 0xf88c8ae0
> 0xf05a9d80 0xf88c8b4e [xfs]xfs_iget+0x6e (0xf0325400, 0x0, 0x340102b, 0x0, 0x0)
>                                xfs .text 0xf8894060 0xf88c8ae0 0xf88c8c14
> 0xf05a9df0 0xf88def23 [xfs]xfs_dir_lookup_int+0x137 (0x0, 0xc74296b4, 0x5, 0x80f8e43c, 0xf05a9e7c)
>                                xfs .text 0xf8894060 0xf88dedec 0xf88df0b0
> 0xf05a9e38 0xf88e36a2 [xfs]xfs_lookup+0x96 (0xc74296b4, 0x80f8e43c, 0xf05a9e78, 0xf05a9e7c, 0x0)
>                                xfs .text 0xf8894060 0xf88e360c 0xf88e370c
> 0xf05a9e88 0xf88ec6a4 [xfs]linvfs_lookup+0x64 (0xe6ac70c0, 0x80f8e3e0)
>                                xfs .text 0xf8894060 0xf88ec640 0xf88ec6f4
> 0xf05a9ea8 0x8014168a lookup_hash+0xaa (0xf05a9ec0, 0x9b2a8ba0)
>                                kernel .text 0x80100000 0x801415e0 0x801416e4
> 0xf05a9ecc 0x80141737 lookup_one_len+0x53 (0x8c8428b4, 0x9b2a8ba0, 0xd)
>                                kernel .text 0x80100000 0x801416e4 0x80141750
> 0xf05a9f04 0x8017f48f nfsd_lookup+0x34f (0xf7932600, 0xf7932400, 0x8c8428b4, 
> 0xd, 0xf7932200)
>                                kernel .text 0x80100000 0x8017f140 0x8017f5bc
> 0xf05a9f2c 0x8017ce7e nfsd_proc_lookup+0x86 (0xf7932600, 0xf7932400, 0xf7932200)
>                                kernel .text 0x80100000 0x8017cdf8 0x8017ce94
> 0xf05a9f4c 0x8017c759 nfsd_dispatch+0xc5 (0xf7932600, 0xf0584014)
>                                kernel .text 0x80100000 0x8017c694 0x8017c7f0
> 0xf05a9fa8 0x802593ca svc_process+0x2ca (0xf0e89f60, 0xf7932600)
>                                kernel .text 0x80100000 0x80259100 0x80259650
> 
> 
> We get a similar lockup with a 2.4.8pre7 kernel:
> 
> 
> [1]kdb> btp 506
>     EBP       EIP         Function(args)
> 0xf6899868 0x80113342 schedule+0x476 (0xd7e37380, 0x1)
>                                kernel .text 0x80100000 0x80112ecc
> 0x80113550
> 0xf6899894 0x80105b54 __down+0x78
>                                kernel .text 0x80100000 0x80105adc
> 0x80105bb4
> 0xf68998a8 0x80105d13 __down_failed+0xb (0xf6899958, 0xf889a94a,
> 0xd7e37380, 0x4026e000, 
> 0xf3e3ad20)
>                                kernel .text 0x80100000 0x80105d08
> 0x80105d1c
>            0xf889ae2a [pagebuf].text.lock+0x10c
>                                pagebuf .text.lock 0xf889ad1e 0xf889ad1e
> 0xf889aec0
> 0xf68998b0 0xf889a83f [pagebuf]_pagebuf_grab_lock+0x13 (0xd7e37380,
> 0x4026e000)
>                                pagebuf .text 0xf8896060 0xf889a82c
> 0xf889a844
> 0xf6899958 0xf889a94a [pagebuf]_pagebuf_find_lockable_buffer+0x106 (0xf3e3ad20, 0x4026e000, 0x6, 0x2000, 0x22200)
>                                pagebuf .text 0xf8896060 0xf889a844
> 0xf889aa38
> 0xf6899988 0xf889aa69 [pagebuf]_pagebuf_get_lockable_buffer+0x31
> (0xf3e3ad20, 0x4026e000,
>  0x6, 0x2000, 0x22200)
>                                pagebuf .text 0xf8896060 0xf889aa38
> 0xf889ab18
> 0xf68999c4 0xf8896ef2 [pagebuf]pagebuf_get+0x8e (0xf3e3ad20, 0x4026e000, 0x6, 0x2000, 0x22200)
>                                pagebuf .text 0xf8896060 0xf8896e64
> 0xf8896fb4
> 0xf68999f0 0xf88ede6c [xfs]xfs_trans_get_buf+0xfc (0xe5ba5a60, 0xf1e6a164, 0x3201370, 0x0, 0x2000)
>                                xfs .text 0xf88a4060 0xf88edd70 0xf88edeb4
> 0xf6899b3c 0xf88d4e76 [xfs]xfs_ialloc_ag_alloc+0x546 (0xe5ba5a60,
> 0xecd48b40, 0xf6899bc4)
>                                xfs .text 0xf88a4060 0xf88d4930 0xf88d51f4
> 0xf6899be8 0xf88d5599 [xfs]xfs_dialloc+0x149 (0xe5ba5a60, 0x6400095, 0x0,
> 0x81b6, 0x1)
>                                xfs .text 0xf88a4060 0xf88d5450 0xf88d5cb8
> 0xf6899c4c 0xf88da9cf [xfs]xfs_ialloc+0x6f (0xe5ba5a60, 0x8559c890,
> 0x81b6, 0x1, 0x0)
>                                xfs .text 0xf88a4060 0xf88da960 0xf88dacf4
> 0xf6899cc4 0xf88ef117 [xfs]xfs_dir_ialloc+0x67 (0xf6899d70, 0x8559c890,
> 0x81b6, 0x1, 0x0)
>                                xfs .text 0xf88a4060 0xf88ef0b0 0xf88ef2c8
> 0xf6899d9c 0xf88f3b45 [xfs]xfs_create+0x439 (0x8559c8a8, 0x8b89679c,
> 0xf6899de8, 0x0, 0x0)
>                                xfs .text 0xf88a4060 0xf88f370c 0xf88f4150
> 0xf6899e58 0xf88fc4b5 [xfs]linvfs_common_cr+0x99 (0x982620a0, 0x8b896740,
> 0x81b6, 0x1, 0x0)
>                                xfs .text 0xf88a4060 0xf88fc41c 0xf88fc624
> 0xf6899e74 0xf88fc63c [xfs]linvfs_create+0x18 (0x982620a0, 0x8b896740,
> 0x81b6)
>                                xfs .text 0xf88a4060 0xf88fc624 0xf88fc640
> 0xf6899e9c 0x80142164 vfs_create+0xd8 (0x982620a0, 0x8b896740, 0x81b6)
>                                kernel .text 0x80100000 0x8014208c
> 0x801421d4
> 0xf6899ec8 0x80182eb3 nfsd_create+0x27b (0xf68a7600, 0xf68a7400,
> 0x970730b4, 0xd, 0xf68a7498)
>                                kernel .text 0x80100000 0x80182c38
> 0x80182f7c
> 0xf6899f2c 0x8017fd37 nfsd_proc_create+0x3f3 (0xf68a7600, 0xf68a7400,
> 0xf68a7200)
>                                kernel .text 0x80100000 0x8017f944
> 0x8017fe74
> 0xf6899f4c 0x8017ef79 nfsd_dispatch+0xc5 (0xf68a7600, 0xf6894014)
>                                kernel .text 0x80100000 0x8017eeb4
> 0x8017f010
> 0xf6899fa8 0x80241aaa svc_process+0x2ca (0xf731ed20, 0xf68a7600)
>                                kernel .text 0x80100000 0x802417e0
> 0x80241d30
> 0xf6899fec 0x8017ed37 nfsd+0x1af
>                                kernel .text 0x80100000 0x8017eb88
> 0x8017eeb4
>            0x801055c7 kernel_thread+0x23
>                                kernel .text 0x80100000 0x801055a4
> 0x801055dc
> 
> 
> 
> The system has 8 XFS filesystems of size 32G each, each fs is mounted
> with a 16M logdev on a trd ramdisk and noatime.
> 
> Any ideas? pagebuf bug perhaps?
> 
> Cheers, Tridge