Submitter : nathans *Status : closed
Assigned Engineer : nathans *Fixed By : nathans
*Fixed By Domain : engr *Closed Date : 08/31/00
Priority : 2 *Modified Date : 08/31/00
*Modified User : nathans *Modified User Domain : engr
*Fix Description :
From: nathan scott <nathans@xxxxxxxxxxxxxxxxxxxxxxx> (TAKE)
Date: Aug 31 2000 05:10:03PM
[pvnews version: 1.71]
----------------------------
Modid: 2.4.0-test1-xfs:slinx:73464a
Date: Thu Aug 31 17:07:32 PDT 2000
Workarea: snort:/build4/nathans/linux-xfs
Author: nathans
The following file(s) were checked into:
bonnie.engr.sgi.com:/isms/slinx/2.4.0-test1-xfs
cmd/xfs/repair/phase2.c - 1.28
- for internal logs, use data device fd for zeroing.
revert spaces back to tabs (cosmetic).
cmd/xfs/repair/phase6.c - 1.57
- fix bad memory access - see bug 800728.
Description :
I've been hunting this bug down for a couple of days now, and
have finally reproduced it reliably. The problem exists in
all versions of repair. libefence (Electric Fence) was critical in
finding this: an access past the end of some malloc'd memory.
To reproduce, first populate a prototype file with 1000 root inode
entries, as in QA test 030, then:
stress$ sudo ../sim/mkfs/mkfs_xfs -f -p /tmp/proto1000 /dev/hda7
xfs: using dummy primary network address
meta-data=/dev/hda7 isize=256 agcount=8, agsize=31878 blks
data = bsize=4096 blocks=255023, imaxpct=25
= sunit=0 swidth=0 blks, unwritten=0
naming =version 2 bsize=4096
log =internal log bsize=4096 blocks=1200
realtime =none extsz=65536 blocks=0, rtextents=0
stress$ sudo gdb ../sim/repair/xfs_repair
GNU gdb 19991004
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) run /dev/hda7
Starting program:
/home/nathans/isms/linux-xfs/cmd/xfs/stress/../sim/repair/xfs_repair /dev/hda7
Electric Fence 2.0.5 Copyright (C) 1987-1998 Bruce Perens.
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- clear lost+found (if it exists) ...
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- ensuring existence of lost+found directory
- traversing filesystem starting at / ...
Program received signal SIGSEGV, Segmentation fault.
0x807e368 in longform_dir2_entry_check_data (mp=0x4015acac, ip=0x4034fe5c,
num_illegal=0xbffff60c, need_dot=0xbffff618, stack=0xbffff688,
current_irec=0x40365fcc, current_ino_offset=0, bpp=0xbffff534,
hashtab=0x404b4ffc, freetabp=0xbffff528, da_bno=0, isblock=0)
at phase6.c:1879
1879 if (ptr + XFS_DIR2_DATA_ENTSIZE(dep->namelen) > endptr)
(gdb) where
#0 0x807e368 in longform_dir2_entry_check_data (mp=0x4015acac, ip=0x4034fe5c,
num_illegal=0xbffff60c, need_dot=0xbffff618, stack=0xbffff688,
current_irec=0x40365fcc, current_ino_offset=0, bpp=0xbffff534,
hashtab=0x404b4ffc, freetabp=0xbffff528, da_bno=0, isblock=0)
at phase6.c:1879
#1 0x80820c3 in longform_dir2_entry_check (mp=0x4015acac, ino=128,
ip=0x4034fe5c, num_illegal=0xbffff60c, need_dot=0xbffff618,
stack=0xbffff688, irec=0x40365fcc, ino_offset=0) at phase6.c:2712
#2 0x80841d3 in process_dirstack (mp=0x4015acac, stack=0xbffff688)
at phase6.c:3546
#3 0x808559d in phase6 (mp=0x4015acac) at phase6.c:3974
#4 0x808f90c in main (argc=2, argv=0xbffffad4) at xfs_repair.c:604
(gdb) l
1874 (char *)dup - (char *)d)
1875 break;
1876 ptr += INT_GET(dup->length, ARCH_CONVERT);
1877 }
1878 dep = (xfs_dir2_data_entry_t *)ptr;
1879 if (ptr + XFS_DIR2_DATA_ENTSIZE(dep->namelen) > endptr)
1880 break;
1881 if (INT_GET(*XFS_DIR2_DATA_ENTRY_TAG_P(dep),
ARCH_CONVERT) != (char *)dep - (char *)d)
1882 break;
1883 ptr += XFS_DIR2_DATA_ENTSIZE(dep->namelen);
(gdb) p ptr
$1 = 0x403cf000 <Address 0x403cf000 out of bounds>
(gdb) call __fswab16(dup->length)
$3 = 32
(gdb) p dup
$4 = (xfs_dir2_data_unused_t *) 0x403cefe0
(gdb) p *dup
$5 = {freetag = 65535, length = 8192, tag = 0}
(gdb)
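The numbers in the gdb session above can be checked by hand. gdb
prints dup->length as 8192 (0x2000) because the field is stored
big-endian on disk and the host is little-endian i386; swapping the
two bytes gives 0x0020 = 32, and 0x403cefe0 + 32 = 0x403cf000, which
is the out-of-bounds address gdb reports for ptr. A minimal sketch of
that arithmetic follows. The helper names swab16 and
next_after_unused are illustrative only and are not taken from the
repair sources; the code assumes a little-endian host.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit value (big-endian disk field read
 * on a little-endian CPU). Hypothetical helper, not the kernel's. */
static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Address reached after skipping an unused region that starts at
 * 'addr' and whose raw on-disk length field reads 'raw_len'. */
static uint32_t next_after_unused(uint32_t addr, uint16_t raw_len)
{
	return addr + swab16(raw_len);
}
```

With the values from the session, swab16(8192) is 32 and
next_after_unused(0x403cefe0, 8192) is 0x403cf000.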
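The "+=" on line 1876 leaves ptr at the first byte _after_ the end
of our malloc'd region (i.e. ptr == endptr). That is OK as long as
we don't dereference it, but line 1879 does: it reads dep->namelen
to compute the entry size before the comparison against endptr can
break out of the loop.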
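The general pattern for avoiding this kind of overrun is to compare
the cursor against the end pointer before reading any header byte
from it. The sketch below illustrates that on a deliberately
simplified, made-up record layout. It is NOT the real XFS dir2
format and NOT the change checked into phase6.c 1.57; FREE_TAG,
count_entries and the layout are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical record layout (not XFS):
 *   free record:  byte 0 = 0xff, byte 1 = total record length (>= 2)
 *   entry record: byte 0 = name length n (0..254), then n name bytes
 */
#define FREE_TAG 0xffu

/* Count entry records in [buf, buf + size); -1 if a record overruns. */
static int count_entries(const uint8_t *buf, size_t size)
{
	const uint8_t *ptr = buf;
	const uint8_t *endptr = buf + size;
	int entries = 0;

	assert(buf != NULL || size == 0);

	while (ptr < endptr) {
		size_t left = (size_t)(endptr - ptr);

		if (*ptr == FREE_TAG) {
			/* Need both header bytes, and the record must fit. */
			if (left < 2 || ptr[1] < 2 || ptr[1] > left)
				return -1;
			ptr += ptr[1];
			/* ptr may now equal endptr: fine to hold, not to
			 * read. The loop condition re-checks it before the
			 * next header byte is touched. */
			continue;
		}
		/* ptr < endptr here, so reading the length byte is safe. */
		if (left < 1u + *ptr)
			return -1;
		ptr += 1u + *ptr;
		entries++;
	}
	return entries;
}
```

For a 6-byte buffer { 2, 'a', 'b', 0xff, 3, 0 } the free record ends
exactly at endptr, as in the failing case above, and the walk stops
there with one entry counted instead of reading the byte beyond the
buffer.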