To: David Chinner <dgc@xxxxxxx>
Subject: [xfs-masters] Re: 2.6.18.3 also 2.6.19 XFS xfs_force_shutdown (was: XFS internal error [...])
From: Shinichiro HIDA <shinichiro@xxxxxxxxxxxxx>
Date: Thu, 14 Dec 2006 18:21:49 +0900
Cc: linux-kernel@xxxxxxxxxxxxxxx, xfs@xxxxxxxxxxx, xfs-masters@xxxxxxxxxxx, Keith Owens <kaos@xxxxxxx>
In-reply-to: <20061213062502.GT44411608@xxxxxxxxxxxxxxxxx>
Organization: petite auberge Stained Glass
References: <9a8748490611280749k5c97d21bx2e499d2209d27dfe@xxxxxxxxxxxxxx> <20061129013214.GH44411608@xxxxxxxxxxxxxxxxx> <9a8748490611290117oc0ba880v1a6407bc4f41088f@xxxxxxxxxxxxxx> <20061130020734.GB37654165@xxxxxxxxxxxxxxxxx> <87bqm89y6g.wl%shinichiro@xxxxxxxxxxxxx> <20061213062502.GT44411608@xxxxxxxxxxxxxxxxx>
Reply-to: xfs-masters@xxxxxxxxxxx
Sender: xfs-masters-bounce@xxxxxxxxxxx
User-agent: Wanderlust/2.15.5 (Almost Unreal) EMIKO/1.14.1 (Choanoflagellata) FLIM/1.14.8 (Shijō) APEL/10.6 EasyPG/0.0.8 Emacs/22.0.91 (i686-pc-linux-gnu) MULE/5.0 (SAKAKI)

Hi,

;; Sorry for the late reply, and thanks for following up.

>>>>> In <20061213062502.GT44411608@xxxxxxxxxxxxxxxxx> 
>>>>>   David Chinner <dgc@xxxxxxx> wrote:
> On Wed, Dec 13, 2006 at 02:12:23PM +0900, Shinichiro HIDA wrote:
> > Hi,
> > 
> > I met same problem on my 2 machines, 2.6.19 (Debian unstable) also
> > 2.6.18.3 (Debian stable),

> The trace:

> > ;; [1] lune: debian unstable with 2.6.19
> > Dec 12 21:31:25 lune kernel:  [<c0297b70>] xfs_da_do_buf+0x340/0xa10

[...]

> Should have been preceded with some other output explaining the
> reason for the shutdown.

I found the syslog entries from a little before the trace above,
starting at Dec 12 21:15. I would like to provide a little more
information.

;; These logs are on lune (Debian Sid (unstable) kernel 2.6.19)
;; xfsprogs 2.8.11-1 (debian package)

#  mount |grep hdf5
/dev/hdf5 on /usr type xfs (rw)

# hdparm -I /dev/hdf
/dev/hdf:

ATA device, with non-removable media
        Model Number:       ST3120022A                              
        Serial Number:      3LJ0P66Y            
        Firmware Revision:  3.54    
Standards:
        Used: ATA/ATAPI-6 T13 1410D revision 2 
        Supported: 6 5 4 
Configuration:
        Logical         max     current
        cylinders       16383   65535
        heads           16      1
        sectors/track   63      63
        --
        CHS current addressable sectors:    4128705
        LBA    user addressable sectors:  234441648
        LBA48  user addressable sectors:  234441648
        device size with M = 1024*1024:      114473 MBytes
        device size with M = 1000*1000:      120034 MBytes (120 GB)
Capabilities:
        LBA, IORDY(can be disabled)
        Standby timer values: spec'd by Standard
        R/W multiple sector transfer: Max = 16  Current = 16
        Recommended acoustic management value: 128, current value: 0
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5 
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4 
             Cycle time: no flow control=240ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
                Security Mode feature set
           *    Power Management feature set
           *    Write cache
           *    Look-ahead
           *    Host Protected Area feature set
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    DOWNLOAD_MICROCODE
                SET_MAX security extension
           *    48-bit Address feature set
           *    Device Configuration Overlay feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
Security: 
                supported
        not     enabled
        not     locked
        not     frozen
        not     expired: security count
        not     supported: enhanced erase
HW reset results:
        CBLID- above Vih
        Device num = 1 determined by CSEL
Checksum: correct


# xfs_info /usr
meta-data=/dev/hdf5              isize=256    agcount=29, agsize=262144 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=7500339, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=0
naming   =version 2              bsize=4096  
log      =internal               bsize=4096   blocks=1200, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
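
;; As a cross-check of the xfs_info numbers above, here is a minimal
;; sketch (not part of the original report) that derives the data size
;; and the allocation group layout from blocks, bsize and agsize. The
;; function name xfs_layout is hypothetical; it only does arithmetic on
;; the printed values and does not talk to the filesystem.

```python
def xfs_layout(blocks, bsize, agsize):
    """Derive size and AG layout from the values xfs_info prints.

    blocks: data blocks, bsize: block size in bytes,
    agsize: allocation group size in blocks.
    """
    total_bytes = blocks * bsize
    # Every AG is agsize blocks long except possibly the last one.
    full_ags, last_ag_blocks = divmod(blocks, agsize)
    agcount = full_ags + (1 if last_ag_blocks else 0)
    return {
        "bytes": total_bytes,
        "gib": round(total_bytes / 2**30, 2),
        "agcount": agcount,
        "last_ag_blocks": last_ag_blocks or agsize,
    }


# Values copied from the xfs_info output for /dev/hdf5 on lune.
layout = xfs_layout(blocks=7500339, bsize=4096, agsize=262144)
print(layout["bytes"])    # 30721388544
print(layout["agcount"])  # 29, matching agcount=29 above
```

;; The computed agcount agrees with the reported one: 28 full groups
;; plus a shorter final group of 160307 blocks.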


;; from syslog
Dec 12 21:15:05 lune kernel: XFS mounting filesystem hdf5
Dec 12 21:15:05 lune kernel: Ending clean XFS mount for filesystem: hdf5
Dec 12 21:15:05 lune kernel: Filesystem "hde7": Disabling barriers, not supported by the underlying device
Dec 12 21:15:05 lune kernel: XFS mounting filesystem hde7
Dec 12 21:15:05 lune kernel: Ending clean XFS mount for filesystem: hde7
Dec 12 21:15:05 lune kernel: XFS mounting filesystem hdf7
Dec 12 21:15:05 lune kernel: Ending clean XFS mount for filesystem: hdf7
Dec 12 21:15:05 lune kernel: XFS mounting filesystem hdf9
Dec 12 21:15:05 lune kernel: Ending clean XFS mount for filesystem: hdf9
Dec 12 21:15:05 lune kernel: Filesystem "hde6": Disabling barriers, not supported by the underlying device
Dec 12 21:15:05 lune kernel: XFS mounting filesystem hde6
Dec 12 21:15:05 lune kernel: Ending clean XFS mount for filesystem: hde6
Dec 12 21:15:05 lune kernel: XFS mounting filesystem hdf6
Dec 12 21:15:05 lune kernel: Ending clean XFS mount for filesystem: hdf6
Dec 12 21:15:05 lune kernel: XFS mounting filesystem hdf8
Dec 12 21:15:05 lune kernel: Ending clean XFS mount for filesystem: hdf8
Dec 12 21:15:05 lune kernel: scsi 1:0:0:0: Direct-Access     EPSON    PM-860PT Storage 1.10 PQ: 0 ANSI: 2
Dec 12 21:15:05 lune kernel: scsi 1:0:0:0: Attached scsi generic sg0 type 0
Dec 12 21:15:05 lune kernel: usb-storage: device scan complete
Dec 12 21:15:05 lune kernel: sd 1:0:0:0: Attached scsi removable disk sda
Dec 12 21:15:05 lune kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 100 Mbps Full Duplex
Dec 12 21:15:05 lune kernel: NET: Registered protocol family 10
Dec 12 21:15:05 lune kernel: lo: Disabled Privacy Extensions
Dec 12 21:15:11 lune kernel: eth0: no IPv6 routers present
Dec 12 21:15:38 lune nfsd[3248]: nfssvc: writting fds to kernel failed: errno 0 (Success)
Dec 12 21:15:38 lune nfsd[3248]: nfssvc: writting fds to kernel failed: errno 0 (Success)
Dec 12 21:15:38 lune kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Dec 12 21:15:38 lune kernel: NFSD: starting 90-second grace period
Dec 12 21:15:43 lune kernel: device eth0 entered promiscuous mode
Dec 12 21:15:43 lune kernel: audit(1165925743.896:2): dev=eth0 prom=256 old_prom=0 auid=4294967295
Dec 12 21:15:53 lune ntpd[3754]: kernel time sync status 0040
[...] ;; only logged by daemons, nothing from kernel.
Dec 12 21:31:01 lune cyrus/imap[6008]: starttls: TLSv1 with cipher AES128-SHA (128/128 bits new) no authentication
Dec 12 21:31:01 lune cyrus/imap[6008]: login: mars.stained-g [192.168.1.14] shinichiro CRAM-MD5+TLS User logged in
Dec 12 21:31:05 lune dhcpd: DHCPREQUEST for 192.168.1.12 from *:*:*:*:*:* via eth0
Dec 12 21:31:05 lune dhcpd: DHCPACK on 192.168.1.12 to *:*:*:*:*:* via eth0
Dec 12 21:31:25 lune dhcpd: DHCPREQUEST for 192.168.1.100 from *:*:*:*:*:* via eth0
Dec 12 21:31:25 lune dhcpd: DHCPACK on 192.168.1.100 to *:*:*:*:*:* via eth0
Dec 12 21:31:25 lune kernel: xfs_da_do_buf: bno 16777216
Dec 12 21:31:25 lune kernel: dir: inode 9078346
Dec 12 21:31:25 lune kernel: Filesystem "hdf5": XFS internal error xfs_da_do_buf(1) at line 1995 of file fs/xfs/xfs_da_btree.c.  Caller 0xc02982ec
Dec 12 21:31:25 lune kernel:  [<c0297b70>] xfs_da_do_buf+0x340/0xa10
Dec 12 21:31:25 lune kernel:  [<c02982ec>] xfs_da_read_buf+0x3c/0x40
Dec 12 21:31:25 lune kernel:  [<c02a3e28>] xfs_dir2_leafn_lookup_int+0x2e8/0x540
Dec 12 21:31:25 lune kernel:  [<c02a3e28>] xfs_dir2_leafn_lookup_int+0x2e8/0x540
Dec 12 21:31:25 lune kernel:  [<c029e3bd>] xfs_dir2_data_log_unused+0x6d/0x90
Dec 12 21:31:25 lune kernel:  [<c02982ec>] xfs_da_read_buf+0x3c/0x40
Dec 12 21:31:25 lune kernel:  [<c02a1f38>] xfs_dir2_node_removename+0x368/0x5b0
Dec 12 21:31:25 lune kernel:  [<c02a1f38>] xfs_dir2_node_removename+0x368/0x5b0
Dec 12 21:31:25 lune kernel:  [<c029bca9>] xfs_dir_removename+0x119/0x120
Dec 12 21:31:25 lune kernel:  [<c02bb21d>] xfs_log_reserve+0x33d/0x550
Dec 12 21:31:25 lune kernel:  [<c02c89fb>] xfs_trans_ijoin+0x3b/0x90
Dec 12 21:31:25 lune kernel:  [<c02d22de>] xfs_remove+0x34e/0x560
Dec 12 21:31:25 lune kernel:  [<c02dc98a>] xfs_vn_unlink+0x3a/0x70
Dec 12 21:31:25 lune kernel:  [<c02c7a6e>] xfs_trans_unlocked_item+0x3e/0x60
Dec 12 21:31:25 lune kernel:  [<c02ad078>] xfs_iunlock+0x98/0xc0
Dec 12 21:31:25 lune kernel:  [<c02cdfdf>] xfs_access+0x4f/0x60
Dec 12 21:31:25 lune kernel:  [<c02dccd6>] xfs_vn_permission+0x26/0x30
Dec 12 21:31:25 lune kernel:  [<c0168f03>] may_delete+0x63/0x170
Dec 12 21:31:25 lune kernel:  [<c0169594>] vfs_unlink+0x94/0x100
Dec 12 21:31:25 lune kernel:  [<c016b561>] do_unlinkat+0xd1/0x160
Dec 12 21:31:25 lune kernel:  [<c0102fe9>] sysenter_past_esp+0x56/0x79
Dec 12 21:31:25 lune kernel:  =======================
Dec 12 21:31:25 lune kernel: Filesystem "hdf5": XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c.  Caller 0xc02d234f
Dec 12 21:31:25 lune kernel:  [<c02c6ccb>] xfs_trans_cancel+0x10b/0x140
Dec 12 21:31:25 lune kernel:  [<c02d234f>] xfs_remove+0x3bf/0x560
Dec 12 21:31:25 lune kernel:  [<c02d234f>] xfs_remove+0x3bf/0x560
Dec 12 21:31:25 lune kernel:  [<c02dc98a>] xfs_vn_unlink+0x3a/0x70
Dec 12 21:31:25 lune kernel:  [<c02c7a6e>] xfs_trans_unlocked_item+0x3e/0x60
Dec 12 21:31:25 lune kernel:  [<c02ad078>] xfs_iunlock+0x98/0xc0
Dec 12 21:31:25 lune kernel:  [<c02cdfdf>] xfs_access+0x4f/0x60
Dec 12 21:31:25 lune kernel:  [<c02dccd6>] xfs_vn_permission+0x26/0x30
Dec 12 21:31:25 lune kernel:  [<c0168f03>] may_delete+0x63/0x170
Dec 12 21:31:25 lune kernel:  [<c0169594>] vfs_unlink+0x94/0x100
Dec 12 21:31:25 lune kernel:  [<c016b561>] do_unlinkat+0xd1/0x160
Dec 12 21:31:25 lune kernel:  [<c0102fe9>] sysenter_past_esp+0x56/0x79
Dec 12 21:31:25 lune kernel:  =======================
Dec 12 21:31:25 lune kernel: xfs_force_shutdown(hdf5,0x8) called from line 1139 of file fs/xfs/xfs_trans.c.  Return address = 0xc02c6cf4
Dec 12 21:31:25 lune kernel: Filesystem "hdf5": Corruption of in-memory data detected.  Shutting down filesystem: hdf5
Dec 12 21:31:25 lune kernel: Please umount the filesystem, and rectify the problem(s)
Dec 12 21:32:20 lune kernel: xfs_force_shutdown(hdf5,0x1) called from line 424 of file fs/xfs/xfs_rw.c.  Return address = 0xc02d47f9
Dec 12 21:32:26 lune kernel: xfs_force_shutdown(hdf5,0x1) called from line 424 of file fs/xfs/xfs_rw.c.  Return address = 0xc02d47f9
Dec 12 21:32:31 lune kernel: Kernel logging (proc) stopped.
Dec 12 21:32:31 lune kernel: Kernel log daemon terminating.
Dec 12 21:34:38 lune kernel: klogd 1.4.1#20, log source = /proc/kmsg started.
Dec 12 21:34:38 lune kernel: Linux version 2.6.19 (shinichiro@xxxxxxxxxxxxxx) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-20)) #1 Tue Dec 12 19:50:48 JST 2006
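
;; To pick the shutdown events out of a long syslog like the one above,
;; something along these lines may help. It is only a sketch, not a tool
;; used in this report: the names unwrap and shutdown_events are
;; hypothetical, the regular expressions are assumptions based on the
;; message format seen in this log, and the flag value is reported as a
;; number without trying to interpret it. Lines that a mail client has
;; wrapped are joined back onto the record they belong to first.

```python
import re

# A syslog record starts with a timestamp such as "Dec 12 21:31:25 ".
RECORD_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")

# Assumed message shape, taken from the xfs_force_shutdown lines above.
SHUTDOWN = re.compile(
    r"xfs_force_shutdown\((?P<fs>[^,]+),(?P<flags>0x[0-9a-fA-F]+)\) "
    r"called from line (?P<line>\d+) of file (?P<file>\S+?)\.\s"
)


def unwrap(lines):
    """Join continuation lines (no leading timestamp) onto the previous record."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] = records[-1].rstrip() + " " + line.strip()
    return records


def shutdown_events(lines):
    """Return (timestamp, filesystem, flags, source file, source line) tuples."""
    events = []
    for rec in unwrap(lines):
        m = SHUTDOWN.search(rec)
        if m:
            events.append(
                (rec[:15], m["fs"], int(m["flags"], 16), m["file"], int(m["line"]))
            )
    return events


if __name__ == "__main__":
    sample = [
        "Dec 12 21:31:25 lune kernel: xfs_force_shutdown(hdf5,0x8) called from line 1139 ",
        "of file fs/xfs/xfs_trans.c.  Return address = 0xc02c6cf4",
    ]
    for event in shutdown_events(sample):
        print(event)
```

;; Reading /var/log/syslog with open(...) and passing the file object to
;; shutdown_events() should work the same way, since it only iterates
;; over lines.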

On another machine (named mars, Debian sarge (stable); I am writing
this on mars), whose filesystem was fixed by xfs_repair run from
another kernel on another disk, I now get this:

# xfs_info /usr
meta-data=/usr                   isize=256    agcount=16, agsize=468896 blks
         =                       sectsz=512  
data     =                       bsize=4096   blocks=7502336, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096  
log      =internal               bsize=4096   blocks=3663, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0



> Did these machines run 2.6.17.x where x<= 6?
> i.e. is this problem:

> http://oss.sgi.com/projects/xfs/faq.html#dir2

Yes, I could boot this machine (lune) with 2.6.17.6.

> The one you are tripping over?

I will try to check this later.

Thank you.

-- 
  Shinichiro HIDA  shinichiro@xxxxxxxxxxxxx
  GPG fingerprint = 5F2D 1656 FFF6 F691 A51C  5E61 E416 D398 470C 1CE9

