
Re: grab_cache_page deadlock | was Re: set_buffer_dirty_uptodate

To: Rajagopal Ananthanarayanan <ananth@xxxxxxx>
Subject: Re: grab_cache_page deadlock | was Re: set_buffer_dirty_uptodate
From: Andi Kleen <ak@xxxxxxx>
Date: Thu, 28 Dec 2000 22:03:53 +0100
Cc: Marcelo Tosatti <marcelo@xxxxxxxxxxxxxxxx>, linux-xfs@xxxxxxxxxxx, andrea@xxxxxxx
In-reply-to: <3A4B8DB8.35A004CD@sgi.com>; from ananth@sgi.com on Thu, Dec 28, 2000 at 11:00:08AM -0800
References: <Pine.LNX.4.21.0012271445500.11471-100000@freak.distro.conectiva> <3A4B8DB8.35A004CD@sgi.com>
Sender: owner-linux-xfs@xxxxxxxxxxx
User-agent: Mutt/1.2.5i
On Thu, Dec 28, 2000 at 11:00:08AM -0800, Rajagopal Ananthanarayanan wrote:
> Marcelo Tosatti wrote:
> > 
> > On Tue, 26 Dec 2000, Marcelo Tosatti wrote:
> > 
> > > The correct solution to your problem is to not pass __GFP_IO in the
> > > allocation flag passed to __alloc_pages.
> > >
> > > This way the allocation routines will not try to do any kind of IO and
> > > will not wait for kswapd.
> > >
> 
> The problem still happens with the patch, now under a different scenario.
> The issue is that _any_ memory allocation under a FS lock can wait for
> kswapd ... and kswapd can, in turn, wait for a FS lock while pruning the
> dcache. Following is a typical backtrace of (1) kswapd (2) a process
> waiting for memory while trying to copy-in a part of user memory.
> Obviously, it will be impossible to fix all these allocations to not have
> GFP_IO, so an alternate strategy in __alloc_pages that does not wait for
> kswapd is one likely solution.
> Another possibility is to have a separate daemon for doing the pruning work.

The separate daemon (kiod) has just been removed from 2.2 because it was
broken -- it prevents write throttling for memory hooks, causing spurious
oom conditions. 

Andrea Arcangeli (cc'ed) did that work for 2.2; perhaps he can suggest a good
solution for 2.4/XFS too.


-Andi


> A third possibility is to let the FS's iput return failure when it can't
> get locks, thus keeping kswapd from waiting indefinitely.
> 
> What do you think?
> 
> -----------------
> [1]kdb> btp 3
>     EBP       EIP         Function(args)
> 0xc1147e34 0xc0112832 schedule+0x416
>                                kernel .text 0xc0100000 0xc011241c 0xc0112a40
>            0xc486db0e [xfs_support]lock_wait+0xa6 (0xc1d535f0, 0xc1d53608, 0x0)
>                                xfs_support .text 0xc486d060 0xc486da68 0xc486db3c
>            0xc486dbac [xfs_support]mraccessf_Rsmp_c4d68361+0x48 (0xc1d535e4, 0x288)
>                                xfs_support .text 0xc486d060 0xc486db64 0xc486dbc4
>            0xc48ac176 [xfs]xfs_ilock_ra+0x8a (0xc1d53558, 0x8)
>                                xfs .text 0xc4873060 0xc48ac0ec 0xc48ac180
>            0xc48ac193 [xfs]xfs_ilock+0x13 (0xc48c60d4, 0xc1d53558, 0x8)
>                                xfs .text 0xc4873060 0xc48ac180 0xc48ac198
>            0xc48c60d4 [xfs]xfs_inactive_free_eofblocks+0xb8 (0xc2f41000, 0xc1d53558)
>                                xfs .text 0xc4873060 0xc48c601c 0xc48c62ec
>            0xc48c6ad8 [xfs]xfs_inactive+0x110 (0xc1d53570, 0x0)
>                                xfs .text 0xc4873060 0xc48c69c8 0xc48c6e48
>            0xc48d5558 [xfs]vn_put+0x44 (0xc3547bb4)
>                                xfs .text 0xc4873060 0xc48d5514 0xc48d557c
>            0xc48d4693 [xfs]linvfs_put_inode+0x17 (0xc3547ac0)
>                                xfs .text 0xc4873060 0xc48d467c 0xc48d4698
>            0xc014a10b iput+0x2b (0xc3547ac0)
>                                kernel .text 0xc0100000 0xc014a0e0 0xc014a250
>            0xc014829d prune_dcache+0xb9 (0xd)
>                                kernel .text 0xc0100000 0xc01481e4 0xc0148324
>            0xc01485e9 shrink_dcache_memory+0x21 (0x5, 0x4)
>                                kernel .text 0xc0100000 0xc01485c8 0xc01485f8
>            0xc012d8ca refill_inactive+0xe2 (0x4, 0x0, 0x6, 0x4, 0x6)
>                                kernel .text 0xc0100000 0xc012d7e8 0xc012d940
>            0xc012d9a2 do_try_to_free_pages+0x62 (0x4, 0x0, 0xc1161fb4)
>                                kernel .text 0xc0100000 0xc012d940 0xc012d9c8
>            0xc012da56 kswapd+0x8e
>                                kernel .text 0xc0100000 0xc012d9c8 0xc012db00
>            0xc01074cb kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc01074a8 0xc01074d8
> ------------------
> 
> A random process waiting for memory:
> 
> ------------------
> [1]kdb> btp 9759
>     EBP       EIP         Function(args)
> 0xc18c9c74 0xc0112832 schedule+0x416
>                                kernel .text 0xc0100000 0xc011241c 0xc0112a40
>            0xc012dbbb wakeup_kswapd+0xbb (0x1)
>                                kernel .text 0xc0100000 0xc012db00 0xc012dbd8
>            0xc012e8ee __alloc_pages+0x246
>                                kernel .text 0xc0100000 0xc012e6a8 0xc012e9a0
>            0xc012e9b4 __get_free_pages+0x14
>                                kernel .text 0xc0100000 0xc012e9a0 0xc012e9c4
>            0xc012f0b5 read_swap_cache_async+0x31 (0x1f3f00, 0x1, 0x1f3f00)
>                                kernel .text 0xc0100000 0xc012f084 0xc012f120
>            0xc0122966 do_swap_page+0x4a (0xc21a4360, 0xc3141960, 0x804b3a0, 0xc23a012c, 0x1f3f00)
>                                kernel .text 0xc0100000 0xc012291c 0xc0122a88
>            0xc0122d9b handle_mm_fault+0x143 (0xc21a4360, 0xc3141960, 0x804b3a0, 0x0, 0xc18c8000)
>                                kernel .text 0xc0100000 0xc0122c58 0xc0122e00
>            0xc0111937 do_page_fault+0x14f (0xc18c9df4, 0x0, 0x0, 0x380, 0x804c1a0)
>                                kernel .text 0xc0100000 0xc01117e8 0xc0111c00
>            0xc01090d8 error_code+0x34
>                                kernel .text 0xc0100000 0xc01090a4 0xc01090e0
> Interrupt registers:
> eax = 0x00000000 ebx = 0x00000000 ecx = 0x00000380 edx = 0x0804c1a0 
> esi = 0x0804b3a0 edi = 0xc04dc200 esp = 0xc18c9e28 eip = 0xc01e97dc 
> ebp = 0x00000000 xss = 0x00000018 xcs = 0x00000010 eflags = 0x00010202 
> xds = 0x08040018 xes = 0x00000018 origeax = 0xffffffff &regs = 0xc18c9df4
>            0xc01e97dc __generic_copy_from_user+0x30 (0xc04dc200, 0x804b3a0, 0xe00)
>                                kernel .text 0xc0100000 0xc01e97ac 0xc01e97e8
>            0xc4865464 [pagebuf]pagebuf_generic_file_write_Rsmp_1a301b89+0x344 (0xc1f03d80, 0x804b3a0, 0x1000, 0xc18c9f88, 0xc18c9f3c)
>                                pagebuf .text 0xc4860060 0xc4865120 0xc486560c
>            0xc48d026e [xfs]xfs_rdwr+0x6e (0xc3bd9bd4, 0xc1f03d80, 0x804b3a0, 0x1000, 0xc18c9f88)
>                                xfs .text 0xc4873060 0xc48d0200 0xc48d027c
>            0xc48d111f [xfs]xfs_write+0x19b (0xc3bd9bd4, 0xc18c9f7c, 0x0, 0x0, 0x0)
>                                xfs .text 0xc4873060 0xc48d0f84 0xc48d11f8
>            0xc48cd727 [xfs]linvfs_write+0xf3 (0xc1f03d80, 0x804b3a0, 0x1000, 0xc1f03da0)
>                                xfs .text 0xc4873060 0xc48cd634 0xc48cd750
>            0xc013401a sys_write+0x8e (0xb, 0x804b3a0, 0x1000, 0x38, 0x10c3)
>                                kernel .text 0xc0100000 0xc0133f8c 0xc0134050
>            0xc0108fa3 system_call+0x33
>                                kernel .text 0xc0100000 0xc0108f70 0xc0108fa8
> ---------------------
> 
> ananth.
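
The third possibility quoted above, an iput that reports failure instead of
sleeping on a filesystem lock, can be sketched in user space roughly as
follows. This is only a model under stated assumptions: pthread mutexes stand
in for the XFS inode lock, and model_inode, model_try_iput and model_prune are
hypothetical names made up for illustration, not kernel or XFS interfaces.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode guarded by a filesystem lock. */
struct model_inode {
    pthread_mutex_t lock;   /* models the per-inode FS lock */
    bool reclaimed;         /* set once the reclaimer has released it */
};

/*
 * Models an iput() variant that fails instead of sleeping:
 * returns true if the inode was reclaimed, false if the lock was
 * busy or the inode had already been reclaimed.
 */
static bool model_try_iput(struct model_inode *ip)
{
    if (pthread_mutex_trylock(&ip->lock) != 0)
        return false;               /* lock held elsewhere: do not block */

    bool did_work = !ip->reclaimed;
    ip->reclaimed = true;
    pthread_mutex_unlock(&ip->lock);
    return did_work;
}

/*
 * Models the pruning pass run by the reclaimer (kswapd in the thread):
 * walk the candidates and skip any whose lock cannot be taken at once.
 * Returns the number of inodes reclaimed in this pass.
 */
static size_t model_prune(struct model_inode *inodes, size_t n)
{
    assert(inodes != NULL || n == 0);

    size_t freed = 0;
    for (size_t i = 0; i < n; i++) {
        if (model_try_iput(&inodes[i]))
            freed++;
    }
    return freed;
}
```

A caller that holds the lock of one entry and then runs model_prune over the
array sees that entry skipped rather than blocking; a later pass picks it up
once the lock has been released. The cost of this design is that reclaim can
make less progress per pass, so the caller needs some retry policy.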
