http://bugme.osdl.org/show_bug.cgi?id=2936
Summary: XFS internal error xfs_alloc_read_agf
Kernel Version: 2.6.6
Status: NEW
Severity: normal
Owner: xfs-masters@xxxxxxxxxxx
Submitter: leonardo@xxxxxxxxxxxxxxxxxxxxx
Distribution: Debian Sarge
Hardware Environment:
- CPU: Intel(R) Pentium(R) 4 CPU 2.40GHz stepping 09
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
Detected 2398.381 MHz processor.
- hda: ST340014A, ATA DISK drive (Seagate 40G)
Using cfq io scheduler
hda: max request size: 1024KiB
hda: 78165360 sectors (40020 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(100)
- SGI XFS with ACLs, security attributes, realtime, no debug enabled
SGI XFS Quota Management subsystem
- VP_IDE: VIA vt8235 (rev 00) IDE UDMA133 controller on pci0000:00:11.1
Software Environment:
- xfs mount 2.12-7
- xfs_repair 2.6.11-1
- rsync 2.6.2-1
- gcc 4:3.3.4-1
- binutils 2.14.90.0.7-8
- glibc-2.3.2.ds1-13
Problem Description:
This system ran for a long time (2 months) without any problem as a terminal
server. When I needed to update the programs on this machine, I installed them
(deb packages) on another machine, which has the same hardware and kernel but
1.5 GB of RAM, and afterwards I rsynced all the files (/lib/, /usr/, /var/,
etc.); this process takes about 15 minutes.
The problem occurred during this process (the rsync), and the kernel printed a
lot of messages like this on the terminal:
.......
[<c01af5b5>] xfs_alloc_read_agf+0xd7/0x1d1
[<c01af1ef>] xfs_alloc_fix_freelist+0x37c/0x3fe
[<c01af1ef>] xfs_alloc_fix_freelist+0x37c/0x3fe
[<c01af1ef>] xfs_alloc_fix_freelist+0x37c/0x3fe
[<c01af7a5>] xfs_alloc_vextent+0xf6/0x37f
[<c01ddd5a>] xfs_ialloc_ag_alloc+0x147/0x5d5
[<c020743d>] pagebuf_get+0x159/0x181
[<c01faf58>] xfs_trans_read_buf+0x243/0x312
[<c01df92e>] xfs_ialloc_read_agi+0x7a/0x10d
[<c01de59a>] xfs_dialloc+0x125/0x9f2
[<c020715d>] _pagebuf_find+0x53/0x1af
[<c01ee6b0>] xlog_grant_log_space+0x113/0x33c
[<c01e519f>] xfs_ialloc+0x62/0x437
[<c01fbfe7>] xfs_dir_ialloc+0x82/0x26e
[<c01f95c0>] xfs_trans_reserve+0x7d/0x199
[<c0200e6e>] xfs_create+0x279/0x6a0
[<c01abcde>] xfs_acl_vhasacl_default+0x36/0x42
[<c020af73>] linvfs_mknod+0x304/0x399
[<c01cf04f>] xfs_dir2_lookup+0xfb/0xfd
[<c020b5bb>] linvfs_setattr+0xfa/0x146
[<c020b475>] linvfs_permission+0x0/0x13
[<c020b484>] linvfs_permission+0xf/0x13
[<c014d98e>] vfs_create+0x8d/0xf2
[<c014df37>] open_namei+0x355/0x3a4
[<c0141b8a>] filp_open+0x2d/0x4e
[<c0141f2d>] sys_open+0x4d/0x78
[<c0103b4d>] sysenter_past_esp+0x52/0x71
......... (the same trace occurs many times, ending the same way)
......... (in some places a trace with a different ending occurs, as below)
[<c01af5b5>] xfs_alloc_read_agf+0xd7/0x1d1
[<c01af1ef>] xfs_alloc_fix_freelist+0x37c/0x3fe
[<c01af1ef>] xfs_alloc_fix_freelist+0x37c/0x3fe
[<c01af1ef>] xfs_alloc_fix_freelist+0x37c/0x3fe
[<c01138ae>] recalc_task_prio+0x8f/0x183
[<c01139fe>] activate_task+0x5c/0x6f
[<c01af7a5>] xfs_alloc_vextent+0xf6/0x37f
[<c01ddd5a>] xfs_ialloc_ag_alloc+0x147/0x5d5
[<c020743d>] pagebuf_get+0x159/0x181
[<c01faf58>] xfs_trans_read_buf+0x243/0x312
[<c01df92e>] xfs_ialloc_read_agi+0x7a/0x10d
[<c01de59a>] xfs_dialloc+0x125/0x9f2
[<c029f29c>] ip_rcv_finish+0x0/0x230
[<c02921cc>] nf_hook_slow+0xbb/0x105
[<c029f29c>] ip_rcv_finish+0x0/0x230
[<c029f085>] ip_rcv+0x39d/0x43c
[<c01ee6b0>] xlog_grant_log_space+0x113/0x33c
[<c01e519f>] xfs_ialloc+0x62/0x437
[<c01fbfe7>] xfs_dir_ialloc+0x82/0x26e
[<c01f95c0>] xfs_trans_reserve+0x7d/0x199
[<c0200e6e>] xfs_create+0x279/0x6a0
[<c01abcde>] xfs_acl_vhasacl_default+0x36/0x42
[<c020af73>] linvfs_mknod+0x304/0x399
[<c012aa70>] file_read_actor+0x0/0xca
[<c01cf04f>] xfs_dir2_lookup+0xfb/0xfd
[<c020b475>] linvfs_permission+0x0/0x13
[<c020b484>] linvfs_permission+0xf/0x13
[<c014d98e>] vfs_create+0x8d/0xf2
[<c014df37>] open_namei+0x355/0x3a4
[<c0141b8a>] filp_open+0x2d/0x4e
[<c0141f2d>] sys_open+0x4d/0x78
[<c0103b4d>] sysenter_past_esp+0x52/0x71
................
After a reboot, the kernel and lilo seemed fine, but when the root filesystem
was being mounted this error was shown:
XFS mounting filesystem hda1
Starting XFS recovery on filesystem: hda1 (dev: hda1)
[<c01b2d19>] xfs_alloc_read_agf+0xd7/0x1d1
[<c01b29be>] xfs_alloc_fix_freelist+0x3e7/0x3fe
[<c01b29be>] xfs_alloc_fix_freelist+0x3e7/0x3fe
[<c01b29be>] xfs_alloc_fix_freelist+0x3e7/0x3fe
[<c012d940>] buffered_rmqueue+0xc6/0x151
[<c012dc84>] __alloc_pages+0x2b9/0x2f5
[<c01f1e14>] xlog_grant_log_space+0x113/0x33c
[<c01b321b>] xfs_free_extent+0x89/0xd4
[<c0131345>] cache_alloc_refill+0x130/0x1c8
[<c01f6b26>] xlog_recover_process_efi+0x167/0x1b6
[<c01f6bc6>] xlog_recover_process_efis+0x51/0x53
[<c01f7ff0>] xlog_recover_finish+0x1d/0xad
[<c01f003d>] xfs_log_mount_finish+0x17/0x18
[<c01f9700>] xfs_mountfs+0x818/0xea4
[<c01f893a>] xfs_xlatesb+0x43/0x1d7
[<c020b968>] xfs_setsize_buftarg+0x33/0x6b
[<c020052f>] xfs_mount+0x2ce/0x53d
[<c0210f6e>] vfs_mount+0x22/0x2a
[<c0210ddc>] linvfs_fill_super+0x7e/0x1c9
[<c021d58f>] snprintf+0x1f/0x27
[<c016cbec>] disk_name+0x5c/0xa5
[<c0147aeb>] get_sb_bdev+0xf9/0x124
[<c0210f42>] linvfs_get_sb+0x1b/0x25
[<c0210d5e>] linvfs_fill_super+0x0/0x1c9
[<c0147ce4>] do_kern_mount+0x7a/0xeb
[<c0158693>] do_add_mount+0x68/0x14a
[<c0158975>] do_mount+0x14f/0x194
[<c021e0ca>] __copy_from_user_ll+0x54/0x58
[<c021e147>] copy_from_user+0x34/0x61
[<c01587ce>] copy_mount_options+0x59/0xb1
[<c0158ca3>] sys_mount+0x7a/0xb7
[<c03c0c4e>] do_mount_root+0x27/0x98
[<c03c0d08>] mount_block_root+0x49/0xf4
[<c0100399>] init+0x0/0xf3
[<c03c0ed3>] mount_devfs+0x2f/0x33
[<c03c0dfb>] prepare_namespace+0x22/0xcb
[<c0100399>] init+0x0/0xf3
[<c0100399>] init+0x0/0xf3
[<c0100399>] init+0x0/0xf3
[<c0100487>] init+0xee/0xf3
[<c0102244>] kernel_thread_helper+0x0/0xb
[<c0102249>] kernel_thread_helper+0x5/0xb
Ending XFS recovery on filesystem: hda1 (dev: hda1)
VFS: Mounted root (xfs filesystem) readonly.
Mounted devfs on /dev
Freeing unused kernel memory: 160k freed
[<c01b2d19>] xfs_alloc_read_agf+0xd7/0x1d1
[<c01b2b45>] xfs_alloc_pagf_init+0x1f/0x3e
[<c01b2b45>] xfs_alloc_pagf_init+0x1f/0x3e
[<c01b2b45>] xfs_alloc_pagf_init+0x1f/0x3e
[<c01e1a76>] xfs_ialloc_ag_select+0x12a/0x28d
[<c01e2596>] xfs_dialloc+0x9bd/0x9f2
[<c012a4fe>] find_or_create_page+0x1c/0x9f
[<c012a205>] wake_up_page+0xe/0x2e
[<c020a704>] _pagebuf_lookup_pages+0x1fe/0x2d9
[<c01c323d>] xfs_bmap_search_extents+0x5c/0x71
[<c020a92b>] _pagebuf_find+0xbd/0x1af
[<c01f1e14>] xlog_grant_log_space+0x113/0x33c
[<c01e8903>] xfs_ialloc+0x62/0x437
[<c01ff74b>] xfs_dir_ialloc+0x82/0x26e
[<c01fcd24>] xfs_trans_reserve+0x7d/0x199
[<c02045d2>] xfs_create+0x279/0x6a0
[<c01af442>] xfs_acl_vhasacl_default+0x36/0x42
[<c020e6d7>] linvfs_mknod+0x304/0x399
[<c01d74b3>] xfs_dir2_leaf_lookup+0x2b/0xbd
[<c01d30b0>] xfs_dir2_isleaf+0x20/0x60
[<c01d279d>] xfs_dir2_lookup+0xe5/0xfd
[<c0104510>] common_interrupt+0x18/0x20
[<c012b2c6>] filemap_nopage+0x1c8/0x2f4
[<c020ebd9>] linvfs_permission+0x0/0x13
[<c020ebe8>] linvfs_permission+0xf/0x13
[<c014d99a>] vfs_create+0x8d/0xf2
[<c014df43>] open_namei+0x355/0x3a4
[<c0141b22>] filp_open+0x2d/0x4e
[<c0141eb5>] sys_open+0x4d/0x78
[<c0103b51>] sysenter_past_esp+0x52/0x71
I then booted this machine from a bootable CD distribution to recover from this
situation (this distribution has kernel *2.4.26*).
I tried to repair the filesystem with xfs_repair, but the tool stopped in pass 2
and told me to "mount and umount to replay the log, or use -L to zero the log"
(something like that).
After I tried to mount it (mount /dev/hda1 /mnt/restore), this _other version_
of the kernel panicked with this message:
.............
SGI XFS with realtime, no debug enabled
SGI XFS Quota Management subsystem
XFS mounting filesystem ide0(3,1)
Starting XFS recovery on filesystem: ide0(3,1) (dev: ide0(3,1))
0x0: 58 41 47 46 00 00 00 01 00 00 00 0d 00 09 51 23
Filesystem "ide0(3,1)": XFS internal error xfs_alloc_read_agf at line 2201 of
file xfs_alloc.c. Caller 0xf8ba94c4
ef01fb98 f8bd3fc8 00000001 00000000 00000000 f8bd40bd f8c10584 00000001
ef753000 f8c1052e 00000899 f8ba94c4 ef753000 f8ba9c4f f8c10584 00000001
ef753000 eef31200 f8c1052e 00000899 f8ba94c4 ef753000 ef0dfc40 ef0dfc40
Call Trace: [<f8bd3fc8>] [<f8bd40bd>] [<f8c10584>] [<f8c1052e>] [<f8ba94c4>]
[<f8ba9c4f>] [<f8c10584>] [<f8c1052e>] [<f8ba94c4>] [<f8ba94c4>] [<c013630e>]
[<c013679a>] [<c0133bbc>] [<f8baa0b2>] [<f8be9b7b>] [<f8be9bf2>] [<f8beafa8>]
[<f8be3580>] [<f8becb7e>] [<f8c1b578>] [<f8bdfc2e>] [<f8bf3b46>] [<f8c035ad>]
[<f8c0329e>] [<c0142986>] [<c014336c>] [<f8c1be8c>] [<f8c1be8c>] [<c0155ba6>]
[<c014355c>] [<f8c1be8c>] [<c0156bd6>] [<c0156e5a>] [<c0156cd4>] [<c015722b>]
[<c0108997>]
0x0: 58 41 47 46 00 00 00 01 00 00 00 0d 00 09 51 23
Filesystem "ide0(3,1)": XFS internal error xfs_alloc_read_agf at line 2201 of
file xfs_alloc.c. Caller 0xf8ba94c4
ef01fa88 f8bd3fc8 00000001 00000000 00000000 f8bd40bd f8c10584 00000001
ef753000 f8c1052e 00000899 f8ba94c4 ef753000 f8ba9c4f f8c10584 00000001
ef753000 eef31200 f8c1052e 00000899 f8ba94c4 ef753000 ef0df798 ef0df798
Call Trace: [<f8bd3fc8>] [<f8bd40bd>] [<f8c10584>] [<f8c1052e>] [<f8ba94c4>]
[<f8ba9c4f>] [<f8c10584>] [<f8c1052e>] [<f8ba94c4>] [<f8ba94c4>] [<f8baa0b2>]
[<f8bb9aee>] [<f8bdcf07>] [<f8bf726b>] [<f8c1be00>] [<f8c03eda>] [<f8c02bf8>]
[<c0153712>] [<c01542c6>] [<f8be9f0d>] [<f8beafc7>] [<f8be3580>] [<f8becb7e>]
[<f8c1b578>] [<f8bdfc2e>] [<f8bf3b46>] [<f8c035ad>] [<f8c0329e>] [<c0142986>]
[<c014336c>] [<f8c1be8c>] [<f8c1be8c>] [<c0155ba6>] [<c014355c>] [<f8c1be8c>]
[<c0156bd6>] [<c0156e5a>] [<c0156cd4>] [<c015722b>] [<c0108997>]
xfs_force_shutdown(ide0(3,1),0x8) called from line 4049 of file xfs_bmap.c.
Return address = 0xf8c037f1
Filesystem "ide0(3,1)": Corruption of in-memory data detected. Shutting down
filesystem: ide0(3,1)
Please umount the filesystem, and rectify the problem(s)
Ending XFS recovery on filesystem: ide0(3,1) (dev: ide0(3,1))
................
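The hex line printed before each "XFS internal error" message is the start of the AGF block that failed verification. As a rough sketch, those 16 bytes can be decoded by hand, assuming the usual XFS on-disk layout in which the AGF begins with four big-endian 32-bit fields (magic number, version, AG sequence number, AG length in filesystem blocks). The variable names are only illustrative.

```python
import struct

# First 16 bytes of the AGF block, copied from the 2.4.26 log above.
dump = "58 41 47 46 00 00 00 01 00 00 00 0d 00 09 51 23"
raw = bytes.fromhex(dump.replace(" ", ""))

# Assumption: XFS metadata is big-endian and the AGF starts with
# magic, version, sequence number and length, each 32 bits wide.
magic, version, seqno, length = struct.unpack(">4sIII", raw)

print("magic  :", magic)    # ASCII tag of the block type
print("version:", version)
print("seqno  :", seqno)    # allocation group number
print("length :", length)   # AG size in filesystem blocks
```

Decoded this way, the header reads as magic "XAGF", version 1, allocation group 13, length 610595 blocks. These leading fields look plausible, which suggests (but does not prove) that the check in xfs_alloc_read_agf failed on a field further into the block than the 16 bytes shown in the log.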
After that I unmounted it and removed the xfs module with rmmod, but these
messages appeared:
...........
kmem_cache_destroy: Can't free all objects eeff4a28
kmem_cache_destroy: Can't free all objects eeff4934
............
Afterwards, when I tried to modprobe xfs again, this occurred:
............
SGI XFS with realtime, no debug enabled
kernel BUG at slab.c:815!
invalid operand: 0000
CPU: 0
EIP: 0010:[<c01333db>] Not tainted
EFLAGS: 00010246
eax: 00000000 ebx: eeff4eec ecx: eeff4f58 edx: eeff4a94
esi: eeff4a8d edi: f8c14474 ebp: c0352e10 esp: ee55de84
ds: 0018 es: 0018 ss: 0018
Process modprobe (pid: 3052, stackpage=ee55d000)
Stack: 00000000 00000000 ef0b2ea4 ffffffea eeff4f0c ee55dea0 00000004 00000064
f8c07219 f8c14467 00000104 00000010 00000000 00000000 00000000 f8bf3420
00000104 f8c14467 00000094 f8c1445a 00000010 f8c14450 00000150 f8c14443
Call Trace: [<f8c07219>] [<f8c14467>] [<f8bf3420>] [<f8c14467>] [<f8c1445a>]
[<f8c14450>] [<f8c14443>] [<f8c033f8>] [<c01367e8>] [<c0136809>] [<c011c89d>]
[<f8ba5060>] [<c0108997>]
Code: 0f 0b 2f 03 a0 46 27 c0 8b 12 81 fa 4c ac 2b c0 75 d3 a1 4c
..................
I saw the warning
"Corruption of in-memory data detected.",
so I switched to another machine to test (the other Pentium 4 with the same
configuration), and the same problems occurred. After running memtest86+ for
2 days, nothing was reported (no memory errors).
This machine also does backups, with a lot of bz2 files; none of them appears
corrupted.
After trying to repair, I dumped 128MB of this broken filesystem into an image
with dd if=/dev/hda1 of=xfs_bug.img bs=1024k count=100
I don't know if I did the right thing, or if it is useful, but I can upload the
image; just send me an email to request it (64 MB bzipped).
So, to repair it I ran xfs_repair -L /dev/hda1, and this fixed the problem.
Nothing in the filesystem appeared corrupted after the repair.
(I ran rsync again, with -b --backup-dir=/tmp/ to see the differences, and
nothing looked wrong.)
Steps to reproduce:
I'm very sorry, but I can't reproduce it. After a lot of overwriting with rsync,
the XFS filesystem ended up with a "stable" bug, where I could not mount or
repair it without zeroing the log.
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.