Deadlock on xfs_do_force_shutdown

To: linux-xfs@xxxxxxxxxxx
Subject: Deadlock on xfs_do_force_shutdown
From: jim@xxxxxxxxxxxxxxxxxx
Date: Mon, 18 Jul 2005 15:45:52 +0100
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Demon-WebMail/2.0
Hi,

I got a deadlock on 2.6.10 where (due to some fs corruption somewhere -- not 
the point of this e-mail) xfs_trans_delete_ail called xfs_do_force_shutdown 
while holding AIL_LOCK.  Further down the same call chain, xfs_trans_tail_ail 
was called, which went for AIL_LOCK again...

The code path in question (though perhaps there are other possible ones) looks 
like:
xfs_trans_delete_ail holds AIL_LOCK
-> calls xfs_do_force_shutdown
-> calls xfs_log_force_umount
-> calls xlog_state_sync_all
-> calls xlog_state_release_iclog
-> calls xlog_assign_tail_lsn
-> calls xfs_trans_tail_ail
-> tries to take AIL_LOCK

A sample backtrace I got (seen due to memory shortages as it happens, but this 
too is a separate problem) was:
Call Trace:<IRQ> <ffffffff80159260>{__alloc_pages+816}
       <ffffffff801592fe>{__get_free_pages+14}
       <ffffffff8015cbc1>{cache_grow+273}
       <ffffffff8015d0d8>{cache_alloc_refill+440}
       <ffffffff8015caa6>{kmem_cache_alloc+54}
       <ffffffff802f87ec>{alloc_skb+44}
       <ffffffffa001029e>{:e1000:e1000_alloc_rx_buffers+110}
       <ffffffffa0012b8d>{:e1000:e1000_clean+1869}
       <ffffffff802fecd4>{net_rx_action+132}
       <ffffffff8013a931>{__do_softirq+113}
       <ffffffff8013a9e5>{do_softirq+53}
       <ffffffff8011124f>{do_IRQ+63}
       <ffffffff8010e9cd>{ret_from_intr+0}
        <EOI> <ffffffff80135d9d>{printk+141}
       <ffffffff8011ce10>{flat_send_IPI_mask+0}
       <ffffffff8035ed37>{.text.lock.spinlock+0}
       <ffffffff802150a1>{xfs_trans_tail_ail+33}
       <ffffffff8020909e>{xlog_assign_tail_lsn+30}
       <ffffffff80209d69>{xlog_state_release_iclog+57}
       <ffffffff8020b0a1>{xlog_state_sync_all+209}
       <ffffffff801fa0a6>{xfs_cmn_err+214}
       <ffffffff8020c422>{xfs_log_force_umount+322}
       <ffffffff80222ea0>{pagebuf_iodone_work+0}
       <ffffffff8021fc14>{xfs_do_force_shutdown+132}
       <ffffffff8021538b>{xfs_trans_delete_ail+219}
       <ffffffff8021538b>{xfs_trans_delete_ail+219}
       <ffffffff8035e9d7>{__up_wakeup+53}
       <ffffffff801e885c>{xfs_buf_iodone+44}
       <ffffffff801e806a>{xfs_buf_do_callbacks+42}
       <ffffffff801e8742>{xfs_buf_iodone_callbacks+322}
       <ffffffff801313c3>{__wake_up+67}
       <ffffffff80222ea0>{pagebuf_iodone_work+0}
       <ffffffff80146450>{worker_thread+496}
       <ffffffff80131300>{default_wake_function+0}
       <ffffffff80131300>{default_wake_function+0}
       <ffffffff8014a840>{keventd_create_kthread+0}
       <ffffffff80146260>{worker_thread+0}
       <ffffffff8014a840>{keventd_create_kthread+0}
       <ffffffff8014a7f9>{kthread+217}
       <ffffffff8010ef77>{child_rip+8}
       <ffffffff8014a840>{keventd_create_kthread+0}
       <ffffffff8014a720>{kthread+0}
       <ffffffff8010ef6f>{child_rip+0}

The dmesg said:
Filesystem "sdf1": xfs_trans_delete_ail: attempting to delete a log item that is not in the AIL
xfs_force_shutdown(sdf1,0x8) called from line 382 of file fs/xfs/xfs_trans_ail.c.  Return address = 0xffffffff8021538b

Soon after the first CPU deadlocked, every other CPU on my system locked up 
going for the same AIL_LOCK.  It'd be great if this particular deadlock case 
could be fixed so that fs problems like this don't bring entire systems down.

Cheers,

Jim Minter <jim@xxxxxxxxxxxxxxxxxx>

