
XFS on large block device problem

To: XFS List <linux-xfs@xxxxxxxxxxx>
Subject: XFS on large block device problem
From: Frank Hellmann <frank@xxxxxxxxxxxxx>
Date: Thu, 26 Feb 2004 18:58:50 +0100
Organization: Optical Art Film- und Special-Effects GmbH
Sender: linux-xfs-bounce@xxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2.1) Gecko/20030225
Hi!

I just upgraded one of our Linux boxes (Redhat 9 + usual 2.6 and glibc patches) to kernel 2.6.3 and wanted to try out the large block device support. First attempt, beware... :-)

I set up an md0 device consisting of four ~1.5TB devices with mkraid /dev/md0 and a simple /etc/raidtab.

I had some issues making the filesystem on /dev/md0. mkfs.xfs complained about a non-clean md0 device and would not let me do anything with it.

Upgrading to xfsprogs 2.6.3 and the mdadm 1.5 tools didn't change that.

Running mkfs.ext3 first, then mounting and unmounting the device, seems to cure that. At least after that I could mkfs.xfs it.

Unfortunately, while restoring (with tar, not xfsrestore) some demo data back onto the drive, I got a lot of xfs/kernel messages in the logfile. The tar process stopped with errors after about 200M of restore and the last files got corrupted. pdflush shows high activity (~95%).

xfs_repair reports, besides a lot of other problems, a couple of:

primary/secondary superblock XX conflict - AG superblock geometry info
conflicts with filesystem geometry

which really makes me wonder what's going on... See the attached files for further info. Unfortunately I currently don't have any debugging tools on that machine...

Everything is peachy (with LBD) if I stay below the usual 2TB limit with the 2.6.3 kernel.
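
For reference, the usual 2TB figure corresponds to what a 32-bit sector number can address with 512-byte sectors. A quick sketch of that arithmetic follows; the 512-byte sector size is the conventional value and is assumed here, and the function name is purely illustrative:

```python
# Largest device addressable with an N-bit sector number.
# SECTOR_SIZE = 512 is the conventional sector size (an assumption here).
SECTOR_SIZE = 512

def max_device_bytes(sector_bits: int, sector_size: int = SECTOR_SIZE) -> int:
    """Return the number of bytes addressable with sector_bits-wide sector numbers."""
    return (1 << sector_bits) * sector_size

limit_32 = max_device_bytes(32)
print(limit_32)           # 2199023255552 bytes
print(limit_32 // 2**40)  # 2 (TiB)
```

Anything larger than that needs 64-bit sector numbers, which is what the large block device (LBD) option is for.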

Am I missing anything for XFS LBD support? Any ideas?


                Cheers,
                        Frank...

--
--------------------------------------------------------------------------
Frank Hellmann          Optical Art GmbH           Waterloohain 7a
Digital Cinema          http://www.opticalart.de   22769 Hamburg
frank@xxxxxxxxxxxxx     Tel: ++49 40 5111051       Fax: ++49 40 43169199
...

Feb 26 17:19:17 machine kernel: XFS mounting filesystem md0
Feb 26 17:19:17 machine kernel: XFS mounting filesystem md1
Feb 26 17:23:38 machine kernel: st0: Block limits 1 - 16777215 bytes.
Feb 26 17:24:00 machine kernel: 0x0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Feb 26 17:24:00 machine kernel: Filesystem "md0": XFS internal error xfs_alloc_read_agf at line 2201 of file fs/xfs/xfs_alloc.c.  Caller 0xc01dfcdd
Feb 26 17:24:00 machine kernel: Call Trace:
Feb 26 17:24:00 machine kernel:  [<c01e0174>] xfs_alloc_read_agf+0x1b9/0x22e
Feb 26 17:24:00 machine kernel:  [<c01dfcdd>] xfs_alloc_fix_freelist+0x464/0x47a
Feb 26 17:24:00 machine last message repeated 2 times
Feb 26 17:24:00 machine kernel:  [<c023943f>] xfs_trans_log_buf+0x6e/0xa7
Feb 26 17:24:00 machine kernel:  [<c01dfe62>] xfs_alloc_log_agf+0x58/0x5c
Feb 26 17:24:00 machine kernel:  [<c01ddaa3>] xfs_alloc_ag_vextent+0xaf/0x143
Feb 26 17:24:00 machine kernel:  [<c01e04b8>] xfs_alloc_vextent+0x2cf/0x4fc
Feb 26 17:24:00 machine kernel:  [<c01f0d33>] xfs_bmap_alloc+0xc8b/0x1c05
Feb 26 17:24:00 machine kernel:  [<c01edb1e>] xfs_bmap_add_extent_delay_real+0x114b/0x1692
Feb 26 17:24:00 machine kernel:  [<c01fd06f>] xfs_bmbt_get_state+0x2f/0x3b
Feb 26 17:24:00 machine kernel:  [<c01f5f73>] xfs_bmapi+0xfd0/0x169c
Feb 26 17:24:00 machine kernel:  [<c011f62c>] recalc_task_prio+0xb2/0x1ea
Feb 26 17:24:00 machine kernel:  [<c011fb8a>] try_to_wake_up+0x1f1/0x294
Feb 26 17:24:00 machine kernel:  [<c01216a9>] __wake_up_common+0x38/0x57
Feb 26 17:24:00 machine kernel:  [<c01212d1>] schedule+0x387/0x6c5
Feb 26 17:24:00 machine kernel:  [<c0227638>] xfs_log_reserve+0xd7/0xdc
Feb 26 17:24:00 machine kernel:  [<c02245ef>] xfs_iomap_write_allocate+0x2c1/0x504
Feb 26 17:24:00 machine kernel:  [<c0166aad>] bio_alloc+0xcb/0x19c
Feb 26 17:24:00 machine kernel:  [<c0166173>] submit_bh+0xa1/0x1e5
Feb 26 17:24:00 machine kernel:  [<c021cbb3>] xfs_iunlock+0x3d/0x77
Feb 26 17:24:00 machine kernel:  [<c02238fd>] xfs_iomap+0x415/0x54a
Feb 26 17:24:00 machine kernel:  [<c0245fa4>] map_blocks+0x7a/0x15c
Feb 26 17:24:00 machine kernel:  [<c024719a>] page_state_convert+0x50c/0x6a0
Feb 26 17:24:00 machine kernel:  [<c024957b>] pagebuf_iorequest+0xb6/0x14e
Feb 26 17:24:00 machine kernel:  [<c024f739>] xfs_bdstrat_cb+0x42/0x48
Feb 26 17:24:00 machine kernel:  [<c0249087>] pagebuf_iostart+0x54/0xac
Feb 26 17:24:00 machine kernel:  [<c0221ae6>] xfs_iflush+0x272/0x535
Feb 26 17:24:01 machine kernel:  [<c0247a1a>] linvfs_writepage+0x60/0x10c
Feb 26 17:24:01 machine kernel:  [<c0186132>] mpage_writepages+0x2ca/0x3b4
Feb 26 17:24:01 machine kernel:  [<c0243305>] xfs_inode_flush+0x26b/0x2b6
Feb 26 17:24:01 machine kernel:  [<c02479ba>] linvfs_writepage+0x0/0x10c
Feb 26 17:24:01 machine kernel:  [<c01469a7>] do_writepages+0x36/0x38
Feb 26 17:24:01 machine kernel:  [<c01842a0>] __sync_single_inode+0xf7/0x283
Feb 26 17:24:01 machine kernel:  [<c01846b7>] sync_sb_inodes+0x1b1/0x295
Feb 26 17:24:01 machine kernel:  [<c0184820>] writeback_inodes+0x85/0x129
Feb 26 17:24:01 machine kernel:  [<c014664f>] background_writeout+0xbc/0xfa
Feb 26 17:24:01 machine kernel:  [<c0146e9e>] __pdflush+0x106/0x22c
Feb 26 17:24:01 machine kernel:  [<c0146fc4>] pdflush+0x0/0x13
Feb 26 17:24:01 machine kernel:  [<c0146fd3>] pdflush+0xf/0x13
Feb 26 17:24:01 machine kernel:  [<c0146593>] background_writeout+0x0/0xfa
Feb 26 17:24:01 machine kernel:  [<c010929c>] kernel_thread_helper+0x0/0xb
Feb 26 17:24:01 machine kernel:  [<c01092a1>] kernel_thread_helper+0x5/0xb

...
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
bad on-disk superblock 9 - bad magic number
primary/secondary superblock 9 conflict - AG superblock geometry info
conflicts with filesystem geometry
bad magic # 0x0 for agf 9
bad version # 0 for agf 9
bad sequence # 0 for agf 9
bad length 0 for agf 9, should be 45329664
bad magic # 0x0 for agi 9
bad version # 0 for agi 9
bad sequence # 0 for agi 9
bad length # 0 for agi 9, should be 45329664
reset bad sb for ag 9
reset bad agf for ag 9
reset bad agi for ag 9
bad agbno 0 for btbno root, agno 9
bad agbno 0 for btbcnt root, agno 9
bad agbno 0 for inobt root, agno 9
bad on-disk superblock 10 - bad magic number
primary/secondary superblock 10 conflict - AG superblock geometry info
conflicts with filesystem geometry
bad magic # 0x0 for agf 10
bad version # 0 for agf 10
bad sequence # 0 for agf 10
bad length 0 for agf 10, should be 45329664
bad magic # 0x0 for agi 10
bad version # 0 for agi 10
bad sequence # 0 for agi 10
bad length # 0 for agi 10, should be 45329664
reset bad sb for ag 10
reset bad agf for ag 10
reset bad agi for ag 10
bad agbno 0 for btbno root, agno 10
bad agbno 0 for btbcnt root, agno 10
bad agbno 0 for inobt root, agno 10
bad on-disk superblock 11 - bad magic number
primary/secondary superblock 11 conflict - AG superblock geometry info
conflicts with filesystem geometry
bad magic # 0x0 for agf 11
bad version # 0 for agf 11
bad sequence # 0 for agf 11
bad length 0 for agf 11, should be 45329664
bad magic # 0x0 for agi 11
bad version # 0 for agi 11
bad sequence # 0 for agi 11
bad length # 0 for agi 11, should be 45329664
reset bad sb for ag 11
reset bad agf for ag 11
reset bad agi for ag 11
bad agbno 0 for btbno root, agno 11
bad agbno 0 for btbcnt root, agno 11
bad agbno 0 for inobt root, agno 11
bad on-disk superblock 21 - bad magic number
primary/secondary superblock 21 conflict - AG superblock geometry info
conflicts with filesystem geometry
bad magic # 0x0 for agf 21
bad version # 0 for agf 21
bad sequence # 0 for agf 21
bad length 0 for agf 21, should be 45329664
bad magic # 0x0 for agi 21
bad version # 0 for agi 21
bad sequence # 0 for agi 21
bad length # 0 for agi 21, should be 45329664
reset bad sb for ag 21
reset bad agf for ag 21
reset bad agi for ag 21
bad agbno 0 for btbno root, agno 21
bad agbno 0 for btbcnt root, agno 21
bad agbno 0 for inobt root, agno 21
bad on-disk superblock 22 - bad magic number
primary/secondary superblock 22 conflict - AG superblock geometry info
conflicts with filesystem geometry
bad magic # 0x0 for agf 22
bad version # 0 for agf 22
bad sequence # 0 for agf 22
bad length 0 for agf 22, should be 45329664
bad magic # 0x0 for agi 22
bad version # 0 for agi 22
bad sequence # 0 for agi 22
bad length # 0 for agi 22, should be 45329664
reset bad sb for ag 22
reset bad agf for ag 22
reset bad agi for ag 22
bad agbno 0 for btbno root, agno 22
bad agbno 0 for btbcnt root, agno 22
bad agbno 0 for inobt root, agno 22
bad on-disk superblock 23 - bad magic number
primary/secondary superblock 23 conflict - AG superblock geometry info
conflicts with filesystem geometry
bad magic # 0x0 for agf 23
bad version # 0 for agf 23
bad sequence # 0 for agf 23
bad length 0 for agf 23, should be 45329664
bad magic # 0x0 for agi 23
bad version # 0 for agi 23
bad sequence # 0 for agi 23
bad length # 0 for agi 23, should be 45329664
reset bad sb for ag 23
reset bad agf for ag 23
reset bad agi for ag 23
bad agbno 0 for btbno root, agno 23
bad agbno 0 for btbcnt root, agno 23
bad agbno 0 for inobt root, agno 23
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
error following ag 9 unlinked list
error following ag 10 unlinked list
error following ag 11 unlinked list
error following ag 21 unlinked list
error following ag 22 unlinked list
error following ag 23 unlinked list
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - clear lost+found (if it exists) ...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - agno = 16
        - agno = 17
        - agno = 18
        - agno = 19
        - agno = 20
        - agno = 21
        - agno = 22
        - agno = 23
        - agno = 24
        - agno = 25
        - agno = 26
        - agno = 27
        - agno = 28
        - agno = 29
        - agno = 30
        - agno = 31
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - ensuring existence of lost+found directory
        - traversing filesystem starting at / ...
        - traversal finished ...
        - traversing all unattached subtrees ...
        - traversals finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
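
To see where the affected allocation groups sit on the device, the AG length reported by xfs_repair above (45329664 blocks) can be turned into byte offsets. This is only a minimal sketch: it assumes a 4096-byte filesystem block size (the mkfs.xfs default; the value actually used is not shown in the output), and the names below are illustrative.

```python
# Byte offset at which each allocation group (AG) starts.
AG_BLOCKS = 45329664               # from "should be 45329664" in the xfs_repair output
BLOCK_SIZE = 4096                  # ASSUMPTION: mkfs.xfs default, not shown in the output
BAD_AGS = [9, 10, 11, 21, 22, 23]  # AGs reported with zeroed headers
TIB = 2**40

def ag_start_bytes(agno: int) -> int:
    """Return the byte offset of the first block of AG number agno."""
    return agno * AG_BLOCKS * BLOCK_SIZE

for agno in BAD_AGS:
    start = ag_start_bytes(agno)
    print(f"AG {agno:2d} starts at {start} bytes ({start / TIB:.2f} TiB)")
```

Under these assumptions, 32 AGs of this size add up to about 5.9 TB, which is at least consistent with four ~1.5TB devices. AGs 9-11 would start between roughly 1.5 and 1.9 TiB into the device, and AGs 21-23 between roughly 3.5 and 3.9 TiB.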