
XFS: Observed Crash followed by deadlock of khubd/sync/XFS

To: xfs@xxxxxxxxxxx
Subject: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
From: Amit Sahrawat <amit.sahrawat83@xxxxxxxxx>
Date: Thu, 8 Sep 2011 16:35:28 +0530
Kernel Version:
Target: ARM

Observed while doing:
1. Copy some file (any size; I tried 10MB and 100MB) to the XFS partition.
2. After the copy, run ‘sync’.
3. Immediately unplug the device.
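
For repeatability, the copy and sync steps can be scripted. The following is only a minimal sketch: the function name, the file name and the mount point in the usage note are hypothetical, and the unplug step still has to be done by hand right after the script returns.

```python
import os


def copy_and_sync(target_dir, size_mb=10, name="testfile.bin"):
    """Write size_mb MiB of zeros into target_dir, then flush all filesystems."""
    path = os.path.join(target_dir, name)
    chunk = b"\0" * (1024 * 1024)  # 1 MiB per write
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(chunk)
    # os.sync() wraps sync(2); it is only available on Unix-like systems.
    if hasattr(os, "sync"):
        os.sync()
    return path
```

Usage would be something like copy_and_sync("/mnt/usb", size_mb=100), where "/mnt/usb" stands in for wherever the XFS partition is mounted.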

usb 2-1.4: USB disconnect, address 4
end_request: I/O error, dev sda, sector 5696908
I/O error in filesystem ("sda3") meta-data dev sda3 block 0x56ed8c
  ("xlog_iodone") error 5 buf count 1024
xfs_force_shutdown(sda3,0x2) called from line 945 of file
fs/xfs/xfs_log.c.  Return address = 0xc018ac20
Filesystem "sda3": Log I/O Error Detected.  Shutting down filesystem: sda3
Please umount the filesystem, and rectify the problem(s)
XFS: Unable to update superblock counters. Freespace may not be
correct on next mount.
Unable to handle kernel NULL pointer dereference at virtual address 00000014
pgd = e42d4000
[00000014] *pgd=8b8d8031, *pte=00000000, *ppte=00000000
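
The numeric values in the log above can be cross-checked with a short script. This is only a sketch: the helper name is made up, and the flag value SHUTDOWN_LOG_IO_ERROR = 0x2 is assumed from the XFS headers of that era and should be verified against the actual tree. The small fault address (0x14) is consistent with reading a structure member through a NULL base pointer, though the log alone does not say which structure.

```python
import errno

# Assumed value of the XFS shutdown flag; verify against fs/xfs in your tree.
SHUTDOWN_LOG_IO_ERROR = 0x2


def decode_log_values():
    """Decode the raw numbers that appear in the kernel messages."""
    return {
        # "error 5" in the xlog_iodone message
        "errno_name": errno.errorcode[5],
        # "block 0x56ed8c", for comparison with "sector 5696908"
        "block_decimal": int("56ed8c", 16),
        # faulting virtual address 00000014, read as a byte offset from NULL
        "fault_offset": int("00000014", 16),
        # flags argument passed to xfs_force_shutdown(sda3,0x2)
        "log_io_error_flag_set": bool(0x2 & SHUTDOWN_LOG_IO_ERROR),
    }
```

The decoded block number matches the sector reported by end_request, so both messages refer to the same failed write.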

Main Backtrace:
[<c0189d88>] (xfs_log_move_tail+0x0/0x1b4)
[<c0198b78>] (xfs_trans_ail_delete+0x0/0x17c)
[<c016eaf8>] (xfs_buf_iodone+0x0/0x48)
[<c016ea98>] (xfs_buf_do_callbacks+0x0/0x3c)
[<c016eb7c>] (xfs_buf_iodone_callbacks+0x0/0x18c)
[<c01a2f98>] (xfs_buf_iodone_work+0x0/0x7c)
[<c01a3014>] (xfs_buf_ioend+0x0/0x9c)
[<c01a36f8>] (xfs_bioerror+0x0/0x54)
[<c01a374c>] (xfs_bdstrat_cb+0x0/0x6c)
[<c01a3158>] (xfs_flush_buftarg+0x0/0x18c)
[<c01a32e4>] (xfs_free_buftarg+0x0/0x78)
[<c01aa8d0>] (xfs_close_devices+0x0/0x68)
[<c01aa938>] (xfs_fs_put_super+0x0/0x88)
[<c00ab2b4>] (generic_shutdown_super+0x0/0x120)
[<c00ab3d4>] (kill_block_super+0x0/0x4c)
[<c00aa3ac>] (deactivate_locked_super+0x0/0x5c)
[<c00aa598>] (deactivate_super+0x0/0x60)
[<c00c1fec>] (mntput_no_expire+0x0/0xe8)
[<c00c2424>] (sys_umount+0x0/0x334) from [<c001ef80>]
---[ end trace 6bf95bedb3092162 ]---
Segmentation fault

Plugging the USB device in again does not work, because the ‘umount’
process that crashed never returned properly and the lock is still held.
When I check the state of ‘khubd’ and ‘sync’, both are in the ‘D’
(TASK_UNINTERRUPTIBLE) state, and their back-traces at that point are:

For khubd:
[<c02f6524>] (schedule+0x0/0x50c)
[<c02f8988>] (__down_read+0x0/0x130)
[<c02f7ee4>] (down_read+0x0/0x14)
[<c00c30a4>] (get_super+0x0/0x104)
[<c00eed70>] (fsync_bdev+0x0/0x44)
[<c01df914>] (invalidate_partition+0x0/0x3c)
[<c010a384>] (del_gendisk+0x0/0xec)
[<c0228bb8>] (sd_remove+0x0/0xc8)

[<c02147f8>] (__device_release_driver+0x0/0xac)
[<c0214994>] (device_release_driver+0x0/0x30)
[<c0213de4>] (bus_remove_device+0x0/0x8c)
[<c0212308>] (device_del+0x0/0x160)
[<c0225fbc>] (__scsi_remove_device+0x0/0x90)
[<c0223328>] (scsi_forget_host+0x0/0xbc)
[<c021cccc>] (scsi_remove_host+0x0/0x18c)
[<bf15fe14>] (quiesce_and_remove_host+0x0/0xe4)
[<bf15ff7c>] (usb_stor_disconnect+0x0/0x28)
[<bf11e594>] (usb_unbind_interface+0x0/0xdc)
[<c02147f8>] (__device_release_driver+0x0/0xac)
[<c0214994>] (device_release_driver+0x0/0x30)
[<c0213de4>] (bus_remove_device+0x0/0x8c)
[<c0212308>] (device_del+0x0/0x160)
[<bf11bc48>] (usb_disable_device+0x0/0x17c)
[<bf116488>] (usb_disconnect+0x0/0x158)
[<bf1167b8>] (hub_thread+0x0/0x1094)
[<c005a7d8>] (kthread+0x0/0x8c)

For sync:
[<c02f6524>] (schedule+0x0/0x50c)
[<c02f8988>] (__down_read+0x0/0x130)
[<c02f7ee4>] (down_read+0x0/0x14)
[<c00c31a8>] (iterate_supers+0x0/0xfc)
[<c00e4690>] (sync_filesystems+0x0/0x2c)
[<c00e47c4>] (sys_sync+0x0/0x44)

Both are stuck waiting to acquire the semaphore ‘sb->s_umount’.
During umount (which gets called when the device is unplugged), this
semaphore is taken in deactivate_super() and released in
generic_shutdown_super() via ‘up_write(&sb->s_umount)’. Because of the
“NULL pointer dereference” crash, that release is never reached, so the
semaphore stays held.
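
The hang can be modelled in user space. This is only an analogy, not kernel code: a plain threading.Lock stands in for the rw-semaphore ‘sb->s_umount’, a Python exception stands in for the oops, and all function names are invented for illustration. The real waiters block forever in down_read(); the model uses a timeout so that it terminates.

```python
import threading


def umount_task(s_umount):
    """Take the lock, then 'crash' before the release is reached."""
    s_umount.acquire()  # models down_write(&sb->s_umount)
    try:
        log = None
        log.tail  # AttributeError: stand-in for the NULL dereference
    except AttributeError:
        return  # task is gone; the release below never runs
    s_umount.release()  # models up_write(&sb->s_umount)


def waiter(name, s_umount, results):
    """Models khubd / sync trying to take the same lock."""
    got = s_umount.acquire(timeout=0.2)
    results[name] = got
    if got:
        s_umount.release()


def run_model():
    s_umount = threading.Lock()
    results = {}
    t = threading.Thread(target=umount_task, args=(s_umount,))
    t.start()
    t.join()
    threads = [
        threading.Thread(target=waiter, args=(n, s_umount, results))
        for n in ("khubd", "sync")
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results
```

In this model neither waiter ever gets the lock, which mirrors both tasks sitting in the ‘D’ state.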

For the “NULL pointer dereference” crash, the PC is at
xfs_log_move_tail(), while accessing ‘log’.

Changing the condition there only moves the crash to other places.

Has anyone observed this scenario? Please advise.

Thanks & Regards,
Amit Sahrawat
