
Re: shutdown umount hangs

To: Utz Lehmann <leh@xxxxxxxxxx>
Subject: Re: shutdown umount hangs
From: Steve Lord <lord@xxxxxxx>
Date: Thu, 05 Apr 2001 09:12:22 -0500
Cc: cattelan@xxxxxxxxxxx, linux-xfs@xxxxxxxxxxx
In-reply-to: Message from Utz Lehmann <leh@tecosim.de> of "Thu, 05 Apr 2001 14:46:27 +0200." <20010405144627.A1152@tecosim.de>
Sender: owner-linux-xfs@xxxxxxxxxxx
OK, you made it a lot further through unmount than before; previously
you were stuck at the start of unmount. From this point there are two
disk I/Os left until you are unmounted, and you are waiting for one of
them to complete. I am not sure yet why it is not completing. Russell
changed the code in a different direction than we had discussed, so I
need to go look at what he did. If you have the kdbm_pg module in the
kernel when this happens, can you take the first argument of
pagebuf_iowait and run the pb command on it? From the stack below that
would be:

kdb> pb 0xcf724180
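
(A minimal sketch, not part of kdb itself: when a backtrace has been
pasted through mail and wrapped, the address can be pulled out with a
short script. The helper name first_arg and the SAMPLE text are
assumptions made for illustration; the sample lines are copied from the
stack below.)

```python
import re
from typing import Optional

# A few lines copied from the quoted backtrace, including the mail quote
# markers and the line wrap inside the pagebuf_iowait argument list.
SAMPLE = """\
> 0xcfc4de70 0xc0105a2f __down+0x5f
>            0xc015cfea pagebuf_iowait+0x2a (0xcf724180, 0xcfc95800, 0xcf724180
> , 0xcf724180)
>            0xc01aa992 xfs_unmountfs_writesb+0x92 (0xcfc95800)
"""


def first_arg(backtrace: str, func: str) -> Optional[str]:
    """Return the first argument kdb printed for `func`, or None.

    Hypothetical helper: it only understands the
    'name+0xOFFSET (arg, arg, ...)' layout seen in this thread.
    """
    # Drop the "> " quote prefix, then glue the lines back together so
    # that values split by the 80-column wrap become whole again.
    lines = [re.sub(r"^> ?", "", line) for line in backtrace.splitlines()]
    joined = "".join(lines)
    match = re.search(
        re.escape(func) + r"\+0x[0-9a-f]+ \((0x[0-9a-f]+)", joined
    )
    return match.group(1) if match else None


addr = first_arg(SAMPLE, "pagebuf_iowait")
if addr is not None:
    # Print the command to type at the kdb prompt.
    print(f"kdb> pb {addr}")
```

Run on the sample, this prints the same pb command as above. A frame
that kdb printed without arguments (such as __down here) yields None.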

Thanks for trying this stuff out for us.

Steve


> Hi
> 
> cattelan@xxxxxxxxxxx [cattelan@xxxxxxxxxxx] wrote:
> > Updated patch.... 
> > Realized something while driving home, small bug in the list
> > walking code.
> 
> 
> The patch doesn't work .-(
> But the backtraces are different:
> 
> 
> Entering kdb (current=0xc0358000, pid 0) due to Keyboard Entry
> kdb> ps
> Task Addr  Pid      Parent   [*] cpu  State Thread     Command
> 0xc15fe000 00000001 00000000  0  000  stop  0xc15fe260 init
> 0xc15f0000 00000002 00000001  0  000  stop  0xc15f0260 keventd
> 0xc15ec000 00000003 00000001  0  000  stop  0xc15ec260 kswapd
> 0xc15ea000 00000004 00000001  0  000  stop  0xc15ea260 kreclaimd
> 0xc15e8000 00000005 00000001  0  000  stop  0xc15e8260 bdflush
> 0xc15e6000 00000006 00000001  0  000  stop  0xc15e6260 kupdated
> 0xc15c0000 00000007 00000001  0  000  stop  0xc15c0260 scsi_eh_0
> 0xc157e000 00000008 00000001  0  000  stop  0xc157e260 mdrecoveryd
> 0xc1572000 00000009 00000001  0  000  stop  0xc1572260 pagebuf_daemon
> 0xcdc2c000 00000999 00000001  0  000  stop  0xcdc2c260 rc
> 0xcdc86000 00001381 00000999  0  000  stop  0xcdc86260 S20reboot
> 0xcfc4c000 00001402 00001381  0  000  stop  0xcfc4c260 umount
> kdb> btp 1402
>     EBP       EIP         Function(args)
> 0xcfc4de58 0xc01108ae schedule+0x2de (0xcf724180)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
> 0xcfc4de70 0xc0105a2f __down+0x5f
>                                kernel .text 0xc0100000 0xc01059d0 0xc0105a80
>            0xc0105b94 __down_failed+0x8 (0xcf724180, 0xc01aa992, 0xcf724180, 0xcfc95800, 0xcf724180)
>                                kernel .text 0xc0100000 0xc0105b8c 0xc0105b98
>            0xc0270fe5 stext_lock+0x9cd
>                                kernel .text.lock 0xc0270618 0xc0270618 0xc02717c0
>            0xc015cfea pagebuf_iowait+0x2a (0xcf724180, 0xcfc95800, 0xcf724180, 0xcf724180)
>                                kernel .text 0xc0100000 0xc015cfc0 0xc015cff0
>            0xc01aa992 xfs_unmountfs_writesb+0x92 (0xcfc95800)
>                                kernel .text 0xc0100000 0xc01aa900 0xc01aa9e0
>            0xc01aa85a xfs_unmountfs+0x5a (0xcfc95800, 0x3, 0xc03ac360)
>                                kernel .text 0xc0100000 0xc01aa800 0xc01aa8b0
>            0xc01b2f48 xfs_unmount+0x168 (0xcfc95800, 0x0, 0xc03ac360)
>                                kernel .text 0xc0100000 0xc01b2de0 0xc01b2f60
>            0xc01bdf1a fs_dounmount+0x5a (0xcfc95800, 0x0, 0x0, 0xc03ac360, 0xcf7b6248)
>                                kernel .text 0xc0100000 0xc01bdec0 0xc01bdf40
>            0xc01c5288 linvfs_put_super+0x58 (0xcf917e00)
>                                kernel .text 0xc0100000 0xc01c5230 0xc01c5300
>            0xc0134237 kill_super+0x87 (0xcf917e00, 0x0, 0xcf7c33c0, 0xffffffff, 0xcfb339c0)
> more> 
>                                kernel .text 0xc0100000 0xc01341b0 0xc01342f0
>            0xc0134641 do_umount+0x1c1 (0xcf7c33c0, 0x0, 0x0)
>                                kernel .text 0xc0100000 0xc0134480 0xc0134650
>            0xc0134716 sys_umount+0xc6 (0x8052428, 0x0)
>                                kernel .text 0xc0100000 0xc0134650 0xc0134750
>            0xc013475c sys_oldumount+0xc (0x8052428, 0x804ee27, 0x8052468, 0x8052429, 0x804ee20)
>                                kernel .text 0xc0100000 0xc0134750 0xc0134760
>            0xc0106f17 system_call+0x33
>                                kernel .text 0xc0100000 0xc0106ee4 0xc0106f1c
> kdb> bta
> Stack traceback for pid 1
>     EBP       EIP         Function(args)
> 0xc15ffefc 0xc01108ae schedule+0x2de (0xc15fff10)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
> 0xc15fff24 0xc011059a schedule_timeout+0x7a
>                                kernel .text 0xc0100000 0xc0110520 0xc01105c0
>            0xc013d613 do_select+0x93 (0xb, 0xc15fffa8, 0xc15fffa4)
>                                kernel .text 0xc0100000 0xc013d580 0xc013d790
>            0xc013dbf2 sys_select+0x432 (0xb, 0xbffff92c, 0x0, 0x0, 0xbffff874)
>                                kernel .text 0xc0100000 0xc013d7c0 0xc013dd70
>            0xc0106f17 system_call+0x33
>                                kernel .text 0xc0100000 0xc0106ee4 0xc0106f1c
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 2
>     EBP       EIP         Function(args)
> 0xc15f1fa8 0xc01108ae schedule+0x2de (0x700)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
>            0xc011dd05 context_thread+0x115
>                                kernel .text 0xc0100000 0xc011dbf0 0xc011ddb0
>            0xc01054b3 kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc0105490 0xc01054c0
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 3
>     EBP       EIP         Function(args)
> 0xc15edf90 0xc01108ae schedule+0x2de (0xc15edfa4)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
> 0xc15edfb8 0xc011059a schedule_timeout+0x7a (0x10f00, 0xc027bb71, 0xc15ec239)
>                                kernel .text 0xc0100000 0xc0110520 0xc01105c0
> 0xc15edfdc 0xc0110c16 interruptible_sleep_on_timeout+0x46 (0xc15fffbc)
>                                kernel .text 0xc0100000 0xc0110bd0 0xc0110c40
>            0xc01293b9 kswapd+0xe9
>                                kernel .text 0xc0100000 0xc01292d0 0xc01293e0
>            0xc01054b3 kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc0105490 0xc01054c0
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 4
>     EBP       EIP         Function(args)
> 0xc15ebfb0 0xc01108ae schedule+0x2de (0x10f00)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
> 0xc15ebfcc 0xc0110bb0 interruptible_sleep_on+0x40 (0x10f00, 0xc15fffb0)
>                                kernel .text 0xc0100000 0xc0110b70 0xc0110bd0
>            0xc012948b kreclaimd+0x5b
>                                kernel .text 0xc0100000 0xc0129430 0xc0129510
>            0xc01054b3 kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc0105490 0xc01054c0
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 5
>     EBP       EIP         Function(args)
> 0xc15e9fd8 0xc01108ae schedule+0x2de (0x10f00)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
>            0xc0132c3e bdflush+0xce
>                                kernel .text 0xc0100000 0xc0132b70 0xc0132c50
>            0xc01054b3 kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc0105490 0xc01054c0
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 6
>     EBP       EIP         Function(args)
> 0xc15e7f88 0xc01108ae schedule+0x2de
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
>            0xc0133a83 __wait_on_super+0x73 (0xcf917e00)
>                                kernel .text 0xc0100000 0xc0133a10 0xc0133aa0
>            0xc0133ae5 sync_supers+0x45 (0x0)
>                                kernel .text 0xc0100000 0xc0133aa0 0xc0133b50
>            0xc0132a67 sync_old_buffers+0x7 (0x10f00)
>                                kernel .text 0xc0100000 0xc0132a60 0xc0132aa0
>            0xc0132d2c kupdate+0xdc
>                                kernel .text 0xc0100000 0xc0132c50 0xc0132d30
>            0xc01054b3 kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc0105490 0xc01054c0
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 7
>     EBP       EIP         Function(args)
> 0xc15c1f78 0xc01108ae schedule+0x2de (0xc15c20c0, 0xc15c0000)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
>            0xc0105b00 __down_interruptible+0x80
>                                kernel .text 0xc0100000 0xc0105a80 0xc0105b50
>            0xc0105b9f __down_failed_interruptible+0x7 (0x100, 0xc15fff68, 0xc15fffc0, 0xc15c20c0, 0x0)
>                                kernel .text 0xc0100000 0xc0105b98 0xc0105ba4
>            0xc02713c3 stext_lock+0xdab
>                                kernel .text.lock 0xc0270618 0xc0270618 0xc02717c0
>            0xc01f65de scsi_error_handler+0xbe
>                                kernel .text 0xc0100000 0xc01f6520 0xc01f6650
>            0xc01054b3 kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc0105490 0xc01054c0
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 8
>     EBP       EIP         Function(args)
> 0xc157ffb4 0xc01108ae schedule+0x2de
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
>            0xc022d654 md_thread+0x104
>                                kernel .text 0xc0100000 0xc022d550 0xc022d6c0
>            0xc01054b3 kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc0105490 0xc01054c0
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 9
>     EBP       EIP         Function(args)
> 0xc1573f90 0xc01108ae schedule+0x2de (0xcea60980)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
> 0xc1573fac 0xc0110bb0 interruptible_sleep_on+0x40 (0xc1573fdc, 0xc1573fdc, 0xf00)
>                                kernel .text 0xc0100000 0xc0110b70 0xc0110bd0
>            0xc015d604 pagebuf_daemon+0xd4
>                                kernel .text 0xc0100000 0xc015d530 0xc015d760
>            0xc01054b3 kernel_thread+0x23
>                                kernel .text 0xc0100000 0xc0105490 0xc01054c0
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 999
>     EBP       EIP         Function(args)
> 0xcdc2df80 0xc01108ae schedule+0x2de (0xcdc2c000)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
>            0xc0115caf sys_wait4+0x37f (0xffffffff, 0xbffff8a0, 0x0, 0x0, 0x0)
>                                kernel .text 0xc0100000 0xc0115930 0xc0115ce0
>            0xc0106f17 system_call+0x33
>                                kernel .text 0xc0100000 0xc0106ee4 0xc0106f1c
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 1381
>     EBP       EIP         Function(args)
> 0xcdc87f80 0xc01108ae schedule+0x2de (0xcdc86000)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
>            0xc0115caf sys_wait4+0x37f (0xffffffff, 0xbffffa00, 0x0, 0x0, 0x0)
>                                kernel .text 0xc0100000 0xc0115930 0xc0115ce0
>            0xc0106f17 system_call+0x33
>                                kernel .text 0xc0100000 0xc0106ee4 0xc0106f1c
> Enter <q> to end, <cr> to continue:
> Stack traceback for pid 1402
>     EBP       EIP         Function(args)
> 0xcfc4de58 0xc01108ae schedule+0x2de (0xcf724180)
>                                kernel .text 0xc0100000 0xc01105d0 0xc0110a10
> 0xcfc4de70 0xc0105a2f __down+0x5f
>                                kernel .text 0xc0100000 0xc01059d0 0xc0105a80
>            0xc0105b94 __down_failed+0x8 (0xcf724180, 0xc01aa992, 0xcf724180, 0xcfc95800, 0xcf724180)
>                                kernel .text 0xc0100000 0xc0105b8c 0xc0105b98
>            0xc0270fe5 stext_lock+0x9cd
>                                kernel .text.lock 0xc0270618 0xc0270618 0xc02717c0
>            0xc015cfea pagebuf_iowait+0x2a (0xcf724180, 0xcfc95800, 0xcf724180, 0xcf724180)
>                                kernel .text 0xc0100000 0xc015cfc0 0xc015cff0
>            0xc01aa992 xfs_unmountfs_writesb+0x92 (0xcfc95800)
>                                kernel .text 0xc0100000 0xc01aa900 0xc01aa9e0
>            0xc01aa85a xfs_unmountfs+0x5a (0xcfc95800, 0x3, 0xc03ac360)
>                                kernel .text 0xc0100000 0xc01aa800 0xc01aa8b0
>            0xc01b2f48 xfs_unmount+0x168 (0xcfc95800, 0x0, 0xc03ac360)
>                                kernel .text 0xc0100000 0xc01b2de0 0xc01b2f60
>            0xc01bdf1a fs_dounmount+0x5a (0xcfc95800, 0x0, 0x0, 0xc03ac360, 0xcf7b6248)
>                                kernel .text 0xc0100000 0xc01bdec0 0xc01bdf40
>            0xc01c5288 linvfs_put_super+0x58 (0xcf917e00)
>                                kernel .text 0xc0100000 0xc01c5230 0xc01c5300
> more> 
>            0xc0134237 kill_super+0x87 (0xcf917e00, 0x0, 0xcf7c33c0, 0xffffffff, 0xcfb339c0)
>                                kernel .text 0xc0100000 0xc01341b0 0xc01342f0
>            0xc0134641 do_umount+0x1c1 (0xcf7c33c0, 0x0, 0x0)
>                                kernel .text 0xc0100000 0xc0134480 0xc0134650
>            0xc0134716 sys_umount+0xc6 (0x8052428, 0x0)
>                                kernel .text 0xc0100000 0xc0134650 0xc0134750
>            0xc013475c sys_oldumount+0xc (0x8052428, 0x804ee27, 0x8052468, 0x8052429, 0x804ee20)
>                                kernel .text 0xc0100000 0xc0134750 0xc0134760
>            0xc0106f17 system_call+0x33
>                                kernel .text 0xc0100000 0xc0106ee4 0xc0106f1c
> Enter <q> to end, <cr> to continue:
> kdb> reboot
> 
> 
> 
> 
> 
> 
> btw: perhaps this is interesting:
> 
> This is a working shutdown (kernel from 2001-03-02):
> 
> Master Resource Control: previous runlevel: 5, switching to runlevel: 6
> Shutting down httpd done
> Shutting down inetd done
> Shutting down service automount done
> Shutting down CRON daemon done
> Shutting down kernel based NFS server done
> Shutting down nis_cachemgr done
> Shutting down Name Service Cache Daemon done
> Shutting down service gdm done
> Shutting down process accounting: done
> Shutting down service at daemon: done
> Shutting down identd service done
> Shutting down xntpd: done
> Shutting down ypbind done
> Shutting down lpd done
> umount: /raid/users: device is busy
> Shutting down service usbmgr done
> Shutting down RPC portmap daemon done
> Shutting down SSH daemon: done
> Shutting down syslog services done
> Shutting down routing done
> Shutting down network device eth0 done
> Saving random seed done
> Running /etc/init.d/halt.local
>  done
> Sending all procRPC: sendmsg returned error 101
> esses the TERM snfs: RPC call returned error 101
> ignal...
> RPC: sendmsg returned error 101
> nfs: RPC call returned error 101
> RPC: sendmsg returned error 101
> nfs: RPC call returned error 101
> RPC: sendmsg returned error 101
> nfs: RPC call returned error 101
>  done
> Sending all procmd: recovery thread got woken up ...
> esses the KILL smd: recovery thread finished ...
> ignal...
> mdrecoveryd(7) flushing signals.
>  done
> Turning off swap
>  done
> Unmounting file systems
> umount: /raid: device is busy
> shmfs umounted
> /dev/sda2 umounted
> /dev/vg00/tmp umounted
> /dev/vg00/opt umounted
> /dev/vg00/var umounted
> /dev/vg00/usr umounted
> devpts umounted
> /dev/sda1 umounted
>  done
> Oops: umount failed :-(  --  trying to remount readonly...
> /dev/sda1 on / type xfs (ro)
> automount(pid767) on /raid type autofs (ro,fd=5,pgrp=767,minproto=2,maxproto=4)
> extra sync...
> ... hope now it's ok to reboot.
> proc umounted
> vgchange -- volume group "vg00" successfully deactivated
> 
> Please stand by while rebooting the system...
> stopping all md devices.
> Restarting system.
> 
> 
> 
> This is from a non-working one (2001-04-05 with pb_patch):
> 
> Master Resource Control: previous runlevel: 3, switching to runlevel: 6
> Shutting down httpd done
> Shutting down inetd done
> Shutting down service automount done
> Shutting down CRON daemon done
> Shutting down kernel based NFS server done
> Shutting down nis_cachemgr done
> Shutting down Name Service Cache Daemon done
> Shutting down process accounting: done
> Shutting down service at daemon: done
> Shutting down identd service done
> Shutting down xntpd: done
> Shutting down ypbind done
> Shutting down lpd done
> Shutting down service usbmgr done
> Shutting down RPC portmap daemon done
> Shutting down SSH daemon: done
> Shutting down syslog services done
> Shutting down routing done
> Shutting down network device eth0 done
> Saving random seed done
> Running /etc/init.d/halt.local
>  done
> Sending all processes the TERM signal...
> Sending all procmd: recovery thread got woken up ...
> esses the KILL smd: recovery thread finished ...
> ignal...
> mdrecoveryd(8) flushing signals.
>  done
> Turning off swap
>  done
> Unmounting file systems
> shmfs umounted
> /dev/sda2 umounted
> /dev/vg00/tmp umounted
> /dev/vg00/opt umounted
> 
> (hangs)
> 
> 
> The "/raid/users: device is busy" and RPC errors were normal, but I didn't
> see them on any of the non-working kernels. Maybe this is another story. I
> tested it without nfs/automounter and it hangs too.
> 
> 
> utz


