On Thu, Dec 28, 2000 at 11:00:08AM -0800, Rajagopal Ananthanarayanan wrote:
> Marcelo Tosatti wrote:
> >
> > On Tue, 26 Dec 2000, Marcelo Tosatti wrote:
> >
> > > The correct solution to your problem is to not pass __GFP_IO in the
> > > allocation flag passed to __alloc_pages.
> > >
> > > This way the allocation routines will not try to do any kind of IO and
> > > will not wait for kswapd.
> > >
>
> The problem still happens with the patch, now under a different scenario.
> The issue is that _any_ memory allocation under a FS lock can wait for
> kswapd ... and kswapd can, in turn, wait for a FS lock while pruning the
> dcache.
> Following is a typical backtrace of (1) kswapd (2) a process waiting for
> memory while trying to copy-in a part of user memory. Obviously, it will
> be impossible to fix all these allocations to not have GFP_IO, so an alternate
> strategy in __alloc_pages that does not wait for kswapd is one likely
> solution.
> Another possibility is to have a separate daemon for doing the pruning work.

The separate daemon (kiod) has just been removed from 2.2 because it was
broken -- it prevents write throttling for memory hogs, causing spurious
oom conditions.
Andrea Arcangeli (cc'ed) did that work for 2.2; perhaps he can suggest a good
solution for 2.4/XFS too.

-Andi

> A third possibility is to let FS's iput return failure when it can't get
> locks, thus keeping kswapd from waiting indefinitely.
>
> What do you think?
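
The third possibility can be illustrated with a small userspace model. This
is only a sketch under stated assumptions: it uses POSIX mutexes rather than
the XFS inode lock, and every name in it (model_inode, model_iput_nowait,
model_prune) is hypothetical and does not exist in the kernel. It shows a
pruning pass that skips an entry whose lock is busy instead of sleeping on it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode guarded by a filesystem lock. */
struct model_inode {
    pthread_mutex_t lock;
    int cached_pages;
};

/*
 * Hypothetical non-blocking put: drop the cached pages only if the lock
 * is free right now. Returns 0 on success, or -EBUSY if the lock is
 * held, so the caller never sleeps waiting for it.
 */
static int model_iput_nowait(struct model_inode *ino)
{
    assert(ino != NULL);
    if (pthread_mutex_trylock(&ino->lock) != 0)
        return -EBUSY;
    ino->cached_pages = 0;
    pthread_mutex_unlock(&ino->lock);
    return 0;
}

/*
 * Pruning pass: busy inodes are skipped rather than waited on.
 * Returns the number of inodes that were actually released.
 */
static int model_prune(struct model_inode *inodes, int n)
{
    int freed = 0;

    for (int i = 0; i < n; i++)
        if (model_iput_nowait(&inodes[i]) == 0)
            freed++;
    return freed;
}
```

In this model a writer holding one inode's lock costs the pruning pass that
one inode only; the pass still returns and can be retried later. Whether the
real iput path can tolerate a failure return is a separate question that the
sketch does not answer.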
>
> -----------------
> [1]kdb> btp 3
> EBP EIP Function(args)
> 0xc1147e34 0xc0112832 schedule+0x416
> kernel .text 0xc0100000 0xc011241c 0xc0112a40
> 0xc486db0e [xfs_support]lock_wait+0xa6 (0xc1d535f0, 0xc1d53608,
> 0x0)
> xfs_support .text 0xc486d060 0xc486da68
> 0xc486db3c
> 0xc486dbac [xfs_support]mraccessf_Rsmp_c4d68361+0x48 (0xc1d535e4,
> 0x288)
> xfs_support .text 0xc486d060 0xc486db64
> 0xc486dbc4
> 0xc48ac176 [xfs]xfs_ilock_ra+0x8a (0xc1d53558, 0x8)
> xfs .text 0xc4873060 0xc48ac0ec 0xc48ac180
> 0xc48ac193 [xfs]xfs_ilock+0x13 (0xc48c60d4, 0xc1d53558, 0x8)
> xfs .text 0xc4873060 0xc48ac180 0xc48ac198
> 0xc48c60d4 [xfs]xfs_inactive_free_eofblocks+0xb8 (0xc2f41000,
> 0xc1d53558)
> xfs .text 0xc4873060 0xc48c601c 0xc48c62ec
> 0xc48c6ad8 [xfs]xfs_inactive+0x110 (0xc1d53570, 0x0)
> xfs .text 0xc4873060 0xc48c69c8 0xc48c6e48
> 0xc48d5558 [xfs]vn_put+0x44 (0xc3547bb4)
> xfs .text 0xc4873060 0xc48d5514 0xc48d557c
> 0xc48d4693 [xfs]linvfs_put_inode+0x17 (0xc3547ac0)
> xfs .text 0xc4873060 0xc48d467c 0xc48d4698
> 0xc014a10b iput+0x2b (0xc3547ac0)
> kernel .text 0xc0100000 0xc014a0e0 0xc014a250
> 0xc014829d prune_dcache+0xb9 (0xd)
> kernel .text 0xc0100000 0xc01481e4 0xc0148324
> 0xc01485e9 shrink_dcache_memory+0x21 (0x5, 0x4)
> kernel .text 0xc0100000 0xc01485c8 0xc01485f8
> 0xc012d8ca refill_inactive+0xe2 (0x4, 0x0, 0x6, 0x4, 0x6)
> kernel .text 0xc0100000 0xc012d7e8 0xc012d940
> 0xc012d9a2 do_try_to_free_pages+0x62 (0x4, 0x0, 0xc1161fb4)
> kernel .text 0xc0100000 0xc012d940 0xc012d9c8
> 0xc012da56 kswapd+0x8e
> kernel .text 0xc0100000 0xc012d9c8 0xc012db00
> 0xc01074cb kernel_thread+0x23
> kernel .text 0xc0100000 0xc01074a8 0xc01074d8
> ------------------
>
> A random process waiting for memory:
>
> ------------------
> [1]kdb> btp 9759
> EBP EIP Function(args)
> 0xc18c9c74 0xc0112832 schedule+0x416
> kernel .text 0xc0100000 0xc011241c 0xc0112a40
> 0xc012dbbb wakeup_kswapd+0xbb (0x1)
> kernel .text 0xc0100000 0xc012db00 0xc012dbd8
> 0xc012e8ee __alloc_pages+0x246
> kernel .text 0xc0100000 0xc012e6a8 0xc012e9a0
> 0xc012e9b4 __get_free_pages+0x14
> kernel .text 0xc0100000 0xc012e9a0 0xc012e9c4
> 0xc012f0b5 read_swap_cache_async+0x31 (0x1f3f00, 0x1, 0x1f3f00)
> kernel .text 0xc0100000 0xc012f084 0xc012f120
> 0xc0122966 do_swap_page+0x4a (0xc21a4360, 0xc3141960, 0x804b3a0,
> 0xc23a012c, 0x1f3f00)
> kernel .text 0xc0100000 0xc012291c 0xc0122a88
> 0xc0122d9b handle_mm_fault+0x143 (0xc21a4360, 0xc3141960,
> 0x804b3a0, 0x0, 0xc18c8000)
> kernel .text 0xc0100000 0xc0122c58 0xc0122e00
> 0xc0111937 do_page_fault+0x14f (0xc18c9df4, 0x0, 0x0, 0x380,
> 0x804c1a0)
> kernel .text 0xc0100000 0xc01117e8 0xc0111c00
> 0xc01090d8 error_code+0x34
> kernel .text 0xc0100000 0xc01090a4 0xc01090e0
> Interrupt registers:
> eax = 0x00000000 ebx = 0x00000000 ecx = 0x00000380 edx = 0x0804c1a0
> esi = 0x0804b3a0 edi = 0xc04dc200 esp = 0xc18c9e28 eip = 0xc01e97dc
> ebp = 0x00000000 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010202
> xds = 0x08040018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xc18c9df4
> 0xc01e97dc __generic_copy_from_user+0x30 (0xc04dc200, 0x804b3a0,
> 0xe00)
> kernel .text 0xc0100000 0xc01e97ac 0xc01e97e8
> 0xc4865464 [pagebuf]pagebuf_generic_file_write_Rsmp_1a301b89+0x344
> (0xc1f03d80, 0x804b3a0, 0x1000, 0xc18c9f88,
> 0xc18c9f3c)
> pagebuf .text 0xc4860060 0xc4865120 0xc486560c
> 0xc48d026e [xfs]xfs_rdwr+0x6e (0xc3bd9bd4, 0xc1f03d80, 0x804b3a0,
> 0x1000, 0xc18c9f88)
> xfs .text 0xc4873060 0xc48d0200 0xc48d027c
> 0xc48d111f [xfs]xfs_write+0x19b (0xc3bd9bd4, 0xc18c9f7c, 0x0, 0x0,
> 0x0)
> xfs .text 0xc4873060 0xc48d0f84 0xc48d11f8
> 0xc48cd727 [xfs]linvfs_write+0xf3 (0xc1f03d80, 0x804b3a0, 0x1000,
> 0xc1f03da0)
> xfs .text 0xc4873060 0xc48cd634 0xc48cd750
> 0xc013401a sys_write+0x8e (0xb, 0x804b3a0, 0x1000, 0x38, 0x10c3)
> kernel .text 0xc0100000 0xc0133f8c 0xc0134050
> 0xc0108fa3 system_call+0x33
> kernel .text 0xc0100000 0xc0108f70 0xc0108fa8
> ---------------------
>
> ananth.
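
The two backtraces together describe a cycle: the writer holds a filesystem
lock and sleeps in the allocator waiting for kswapd, while kswapd sleeps on
that same lock. The first option raised in the quoted mail, an allocation path
that does not wait for kswapd, can be modeled roughly as follows. This is a
userspace sketch only; MODEL_MAY_WAIT, model_pool and model_alloc are
hypothetical names and are not the kernel's GFP flags or __alloc_pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAY_WAIT 0x1u /* hypothetical: caller allows waiting on reclaim */

/* Hypothetical page pool plus the one fact that matters for the cycle. */
struct model_pool {
    size_t free_pages;
    bool reclaimer_needs_callers_lock; /* reclaimer is stuck on our lock */
};

enum model_result { MODEL_OK, MODEL_FAILED, MODEL_WOULD_DEADLOCK };

static enum model_result model_alloc(struct model_pool *pool,
                                     unsigned int flags)
{
    assert(pool != NULL);
    if (pool->free_pages == 0) {
        /* Without the wait flag, fail at once so the caller can back off. */
        if (!(flags & MODEL_MAY_WAIT))
            return MODEL_FAILED;
        /* A real waiter would sleep forever here; the model reports it. */
        if (pool->reclaimer_needs_callers_lock)
            return MODEL_WOULD_DEADLOCK;
        /* Otherwise the reclaimer makes progress and frees one page. */
        pool->free_pages += 1;
    }
    pool->free_pages--;
    return MODEL_OK;
}
```

The trade-off in the model is that a caller without the wait flag has to cope
with allocation failure while holding its lock, which is the price of never
sleeping on the reclaimer.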