Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS

To: xfs@xxxxxxxxxxx
Subject: Re: XFS: Observed Crash followed by deadlock of khubd/sync/XFS
From: Amit Sahrawat <amit.sahrawat83@xxxxxxxxx>
Date: Thu, 8 Sep 2011 22:58:35 +0530
In-reply-to: <CADDb1s2QDk7y+JgMikoje35LviYQwzpgFihndUPuZx2VXYV4Ew@xxxxxxxxxxxxxx>
References: <CADDb1s2QDk7y+JgMikoje35LviYQwzpgFihndUPuZx2VXYV4Ew@xxxxxxxxxxxxxx>
This is very hard to reproduce naturally, so here is a way to make it
easier to debug. Add an msleep() in the kernel's xfs_unmountfs() just
before the call to xfs_log_sbcount(), with a print before the sleep.
Unplug the USB device the moment the print appears, and the same
scenario will be reproduced.
The crash prints the backtrace and returns to the normal shell, but when
the process state is checked, khubd is shown in the TASK_UNINTERRUPTIBLE
('D') state.
If sync is then issued, it also goes into the 'D' state. The backtrace
for each task is the same as in the previous mail.

Thanks & Regards,
Amit Sahrawat

On Thu, Sep 8, 2011 at 4:35 PM, Amit Sahrawat <amit.sahrawat83@xxxxxxxxx> wrote:
> Kernel Version: 2.6.39.4
> Target: ARM
>
> Observed while doing:
> Copy a file (any size; I tried 10MB and 100MB) to the XFS partition.
> After the copy, run ‘sync’.
> Then immediately unplug the device.
>
> usb 2-1.4: USB disconnect, address 4
> end_request: I/O error, dev sda, sector 5696908
> I/O error in filesystem ("sda3") meta-data dev sda3 block 0x56ed8c
>  ("xlog_iodone") error 5 buf count 1024
> xfs_force_shutdown(sda3,0x2) called from line 945 of file
> fs/xfs/xfs_log.c.  Return address = 0xc018ac20
> Filesystem "sda3": Log I/O Error Detected.  Shutting down filesystem: sda3
> Please umount the filesystem, and rectify the problem(s)
> XFS: Unable to update superblock counters. Freespace may not be
> correct on next mount.
> Unable to handle kernel NULL pointer dereference at virtual address 00000014
> pgd = e42d4000
> [00000014] *pgd=8b8d8031, *pte=00000000, *ppte=00000000
>
> Main Backtrace:
> [<c0189d88>] (xfs_log_move_tail+0x0/0x1b4)
> [<c0198b78>] (xfs_trans_ail_delete+0x0/0x17c)
> [<c016eaf8>] (xfs_buf_iodone+0x0/0x48)
> [<c016ea98>] (xfs_buf_do_callbacks+0x0/0x3c)
> [<c016eb7c>] (xfs_buf_iodone_callbacks+0x0/0x18c)
> [<c01a2f98>] (xfs_buf_iodone_work+0x0/0x7c)
> [<c01a3014>] (xfs_buf_ioend+0x0/0x9c)
> [<c01a36f8>] (xfs_bioerror+0x0/0x54)
> [<c01a374c>] (xfs_bdstrat_cb+0x0/0x6c)
> [<c01a3158>] (xfs_flush_buftarg+0x0/0x18c)
> [<c01a32e4>] (xfs_free_buftarg+0x0/0x78)
> [<c01aa8d0>] (xfs_close_devices+0x0/0x68)
> [<c01aa938>] (xfs_fs_put_super+0x0/0x88)
> [<c00ab2b4>] (generic_shutdown_super+0x0/0x120)
> [<c00ab3d4>] (kill_block_super+0x0/0x4c)
> [<c00aa3ac>] (deactivate_locked_super+0x0/0x5c)
> [<c00aa598>] (deactivate_super+0x0/0x60)
> [<c00c1fec>] (mntput_no_expire+0x0/0xe8)
> [<c00c2424>] (sys_umount+0x0/0x334) from [<c001ef80>]
> (ret_fast_syscall+0x0/0x30)
> ---[ end trace 6bf95bedb3092162 ]---
> Segmentation fault
> #>
>
> Plugging the USB device in again does not work, because the ‘umount’
> process that crashed never returned properly and the lock is still
> held.
> When I check the state of ‘khubd’ and ‘sync’, both are in the ‘D’
> (TASK_UNINTERRUPTIBLE) state. Their backtraces at that point are
> shown below.
>
> For Khubd:
> Backtrace:
> [<c02f6524>] (schedule+0x0/0x50c)
> [<c02f8988>] (__down_read+0x0/0x130)
> [<c02f7ee4>] (down_read+0x0/0x14)
> [<c00c30a4>] (get_super+0x0/0x104)
> [<c00eed70>] (fsync_bdev+0x0/0x44)
> [<c01df914>] (invalidate_partition+0x0/0x3c)
> [<c010a384>] (del_gendisk+0x0/0xec)
> [<c0228bb8>] (sd_remove+0x0/0xc8)
>
> [<c02147f8>] (__device_release_driver+0x0/0xac)
> [<c0214994>] (device_release_driver+0x0/0x30)
> [<c0213de4>] (bus_remove_device+0x0/0x8c)
> [<c0212308>] (device_del+0x0/0x160)
> [<c0225fbc>] (__scsi_remove_device+0x0/0x90)
> [<c0223328>] (scsi_forget_host+0x0/0xbc)
> [<c021cccc>] (scsi_remove_host+0x0/0x18c)
> [<bf15fe14>] (quiesce_and_remove_host+0x0/0xe4
> [<bf15ff7c>] (usb_stor_disconnect+0x0/0x28
> [<bf11e594>] (usb_unbind_interface+0x0/0xdc
> [<c02147f8>] (__device_release_driver+0x0/0xac)
> [<c0214994>] (device_release_driver+0x0/0x30)
> [<c0213de4>] (bus_remove_device+0x0/0x8c)
> [<c0212308>] (device_del+0x0/0x160)
> [<bf11bc48>] (usb_disable_device+0x0/0x17c
> [<bf116488>] (usb_disconnect+0x0/0x158
> [<bf1167b8>] (hub_thread+0x0/0x1094
> [<c005a7d8>] (kthread+0x0/0x8c)
>
>
>
> For Sync:
> Backtrace:
> [<c02f6524>] (schedule+0x0/0x50c)
> [<c02f8988>] (__down_read+0x0/0x130)
> [<c02f7ee4>] (down_read+0x0/0x14)
> [<c00c31a8>] (iterate_supers+0x0/0xfc)
> [<c00e4690>] (sync_filesystems+0x0/0x2c)
> [<c00e47c4>] (sys_sync+0x0/0x44)
>
> Both are stuck waiting to acquire the semaphore ‘sb->s_umount’.
> During umount (which gets called when a device is unplugged) the flow is:
> sys_umount() … deactivate_super() -> deactivate_locked_super() -> kill_block_super() -> generic_shutdown_super()
> This semaphore is taken in deactivate_super() and released in
> generic_shutdown_super() by ‘up_write(&sb->s_umount)’, but because of
> the NULL pointer dereference crash that call is never reached.
>
> For the NULL pointer dereference crash, the PC is shown in
> xfs_log_move_tail(), while accessing ‘log’:
> if (XLOG_FORCED_SHUTDOWN(log))
>                return;
>
> Changing the condition moves the crash to other places.
>
> Has anyone observed this scenario? Please advise.
>
> Thanks & Regards,
> Amit Sahrawat
>
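
The fault address 00000014 in the oops is consistent with reading a
structure member through a NULL ‘log’ pointer: the address touched is
the base pointer plus the member's offset. The sketch below shows only
that arithmetic. struct demo_log is a hypothetical layout invented for
the example (it is not copied from fs/xfs/xfs_log_priv.h), and it
assumes, without confirmation from the report, that
XLOG_FORCED_SHUTDOWN() reads a flags member of the log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real kernel structure: five 32-bit
 * members are placed ahead of the flags word purely so that it lands
 * at offset 0x14, matching the faulting address in the oops. */
struct demo_log {
    uint32_t pad[5];    /* bytes 0x00..0x13 */
    uint32_t l_flags;   /* byte 0x14 */
};

/* Address the CPU would touch when reading base->l_flags. */
uintptr_t flags_fault_address(uintptr_t base)
{
    return base + offsetof(struct demo_log, l_flags);
}
```

With a NULL base, flags_fault_address(0) evaluates to 0x14, the same
value as the reported virtual address. This suggests checking which
member of the real structure sits at that offset on the 32-bit ARM
target, and why ‘log’ is NULL at that point.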
