TAKE 800728 - repair makes bad memory access in dir2 checks

To: nathans@xxxxxxxxxxxxxxxxxxxx
Subject: TAKE 800728 - repair makes bad memory access in dir2 checks
From: pv@xxxxxxxxxxxxxxxxxxxxxx (nathans@xxxxxxxxxxxx)
Date: Thu, 31 Aug 2000 17:10:04 -0700 (PDT)
Cc: linux-xfs@xxxxxxxxxxx
Reply-to: sgi.bugs.xfs@xxxxxxxxxxxxxxxxx
Sender: owner-linux-xfs@xxxxxxxxxxx
 Submitter : nathans                  *Status : closed                      
 Assigned Engineer : nathans          *Fixed By : nathans                   
*Fixed By Domain : engr               *Closed Date : 08/31/00               
 Priority : 2                         *Modified Date : 08/31/00             
*Modified User : nathans              *Modified User Domain : engr          
*Fix Description :
From: nathan scott <nathans@xxxxxxxxxxxxxxxxxxxxxxx> (TAKE)
Date: Aug 31 2000 05:10:03PM
[pvnews version: 1.71]
----------------------------

Modid:  2.4.0-test1-xfs:slinx:73464a
Date:  Thu Aug 31 17:07:32 PDT 2000
Workarea:  snort:/build4/nathans/linux-xfs
Author:  nathans

The following file(s) were checked into:
  bonnie.engr.sgi.com:/isms/slinx/2.4.0-test1-xfs

cmd/xfs/repair/phase2.c - 1.28
        - for internal logs, use data device fd for zeroing.
          revert spaces back to tabs (cosmetic).

cmd/xfs/repair/phase6.c - 1.57
        - fix bad memory access - see bug 800728.
Description :
I've been hunting this bug down for a couple of days now, and
have finally reproduced it reliably.  The problem exists in
all versions of repair.  libefence was critical in finding
this... an access past the end of some malloc'd memory.

First, populate a prototype file with 1000 root inode entries,
as in QA test 030, then run mkfs and repair as follows:

stress$ sudo ../sim/mkfs/mkfs_xfs -f -p /tmp/proto1000 /dev/hda7
xfs: using dummy primary network address
meta-data=/dev/hda7              isize=256    agcount=8, agsize=31878 blks
data     =                       bsize=4096   blocks=255023, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=0
naming   =version 2              bsize=4096  
log      =internal log           bsize=4096   blocks=1200
realtime =none                   extsz=65536  blocks=0, rtextents=0
stress$ sudo gdb ../sim/repair/xfs_repair
GNU gdb 19991004
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) run /dev/hda7 
Starting program: /home/nathans/isms/linux-xfs/cmd/xfs/stress/../sim/repair/xfs_repair /dev/hda7

  Electric Fence 2.0.5 Copyright (C) 1987-1998 Bruce Perens.
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - clear lost+found (if it exists) ...
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
# of bmap records in inode 128 greater than max (4096, max - 254)
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - ensuring existence of lost+found directory
        - traversing filesystem starting at / ... 

Program received signal SIGSEGV, Segmentation fault.
0x807e368 in longform_dir2_entry_check_data (mp=0x4015acac, ip=0x4034fe5c, 
    num_illegal=0xbffff60c, need_dot=0xbffff618, stack=0xbffff688, 
    current_irec=0x40365fcc, current_ino_offset=0, bpp=0xbffff534, 
    hashtab=0x404b4ffc, freetabp=0xbffff528, da_bno=0, isblock=0)
    at phase6.c:1879
1879                    if (ptr + XFS_DIR2_DATA_ENTSIZE(dep->namelen) > endptr)
(gdb) where
#0  0x807e368 in longform_dir2_entry_check_data (mp=0x4015acac, ip=0x4034fe5c, 
    num_illegal=0xbffff60c, need_dot=0xbffff618, stack=0xbffff688, 
    current_irec=0x40365fcc, current_ino_offset=0, bpp=0xbffff534, 
    hashtab=0x404b4ffc, freetabp=0xbffff528, da_bno=0, isblock=0)
    at phase6.c:1879
#1  0x80820c3 in longform_dir2_entry_check (mp=0x4015acac, ino=128, 
    ip=0x4034fe5c, num_illegal=0xbffff60c, need_dot=0xbffff618, 
    stack=0xbffff688, irec=0x40365fcc, ino_offset=0) at phase6.c:2712
#2  0x80841d3 in process_dirstack (mp=0x4015acac, stack=0xbffff688)
    at phase6.c:3546
#3  0x808559d in phase6 (mp=0x4015acac) at phase6.c:3974
#4  0x808f90c in main (argc=2, argv=0xbffffad4) at xfs_repair.c:604
(gdb) l
1874                                (char *)dup - (char *)d)
1875                                    break;
1876                            ptr += INT_GET(dup->length, ARCH_CONVERT);
1877                    }
1878                    dep = (xfs_dir2_data_entry_t *)ptr;
1879                    if (ptr + XFS_DIR2_DATA_ENTSIZE(dep->namelen) > endptr)
1880                            break;
1881                    if (INT_GET(*XFS_DIR2_DATA_ENTRY_TAG_P(dep), ARCH_CONVERT) != (char *)dep - (char *)d)
1882                            break;
1883                    ptr += XFS_DIR2_DATA_ENTSIZE(dep->namelen);
(gdb) p ptr
$1 = 0x403cf000 <Address 0x403cf000 out of bounds>
(gdb) call __fswab16(dup->length)
$3 = 32
(gdb) p dup
$4 = (xfs_dir2_data_unused_t *) 0x403cefe0
(gdb) p *dup
$5 = {freetag = 65535, length = 8192, tag = 0}
(gdb) 

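The "+=" on line 1876 advances ptr by the byte-swapped length of
the unused entry: dup is at 0x403cefe0 and its length is 32, so
ptr becomes 0x403cf000, the first byte _after_ the end of our
malloc'd region (i.e. endptr).  Pointing there is OK as long as
we don't dereference it, but line 1879 reads dep->namelen before
its bounds check can take effect.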
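
To illustrate the pattern (this is not the actual phase6.c change),
here is a minimal sketch in C of a record walk that re-tests the
pointer against endptr after skipping an unused region, and checks
that a header fits before reading any field from it.  The record
layout, the constants, and the names get16 and count_entries are
made up for the example; they are not the XFS on-disk format or
the repair code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layout (NOT the real XFS dir2 format):
 *   unused record: 2-byte tag 0xffff, 2-byte big-endian length, padding
 *   entry record:  2-byte id, 1-byte namelen, namelen bytes of name
 */
enum { FREE_TAG = 0xffff, UNUSED_HDR = 4, ENTRY_HDR = 3 };

/* Read a big-endian 16-bit value from two bytes. */
static uint16_t get16(const unsigned char *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Walk the block and count entries.  Returns -1 if any record would
 * run past the end of the buffer. */
int count_entries(const unsigned char *buf, size_t len)
{
	const unsigned char *ptr = buf;
	const unsigned char *endptr = buf + len;
	int entries = 0;

	assert(buf != NULL);
	while (ptr < endptr) {
		size_t left = (size_t)(endptr - ptr);

		if (left >= UNUSED_HDR && get16(ptr) == FREE_TAG) {
			size_t ulen = get16(ptr + 2);

			if (ulen < UNUSED_HDR || ulen > left)
				return -1;
			ptr += ulen;
			/* ptr may now equal endptr; go back to the loop
			 * test instead of reading through it. */
			continue;
		}
		/* Make sure the header fits before reading namelen. */
		if (left < ENTRY_HDR)
			return -1;
		if ((size_t)ENTRY_HDR + ptr[2] > left)
			return -1;
		ptr += ENTRY_HDR + ptr[2];
		entries++;
	}
	return entries;
}
```

With a block whose last record is an unused region ending exactly
at endptr (the case hit above), the walk stops at the loop test
and never touches the byte past the buffer.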
