
[solved] xfsrestore segfault (was: Re: xfsdump segfault)

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: [solved] xfsrestore segfault (was: Re: xfsdump segfault)
From: Daniel Browning <db@xxxxxxxxx>
Date: Sat, 16 Feb 2013 19:47:13 -0800
Cc: xfs@xxxxxxxxxxx
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <201302161606.02693.db@xxxxxxxxx>
References: <201302152220.46756.db@xxxxxxxxx> <20130216065111.GV26694@dastard> <201302161606.02693.db@xxxxxxxxx>
User-agent: KMail/1.12.4 (Linux/2.6.32-279.22.1.el6.x86_64; KDE/4.3.4; x86_64; ; )
On Saturday 16 February 2013 4:06:02 pm Daniel Browning wrote:
> So it appears that a socket is causing the problem. What should I
> try next?

Turns out it was already fixed in git HEAD, so take this email as
my personal vote for releasing 3.1.3 sooner rather than later,
for whatever that's worth. :)

commit 59ad060e8f8c39406838c37360efb5f6c09a9041
Author: Alex Elder <elder@xxxxxxxxxxx>
Date:   Thu Dec 27 08:10:14 2012 -0600

    xfsdump: fix format string in restore_spec()
    
    Nigel Tamplin reported getting a seg fault in xfsrestore when a path
    name was too long.  He correctly diagnosed that the problem was due
    to an extra "%s" format specifier in the format value passed to a
    call to mlog().  This patch corrects that.
    
    Signed-off-by: Alex Elder <elder@xxxxxxxxxxx>
    Reported-by: Nigel Tamplin <ntamplin@xxxxxxxxxxxxxxx>
    Tested-by: Nigel Tamplin <ntamplin@xxxxxxxxxxxxxxx>
    Signed-off-by: Ben Myers <bpm@xxxxxxx>

diff --git a/restore/content.c b/restore/content.c
index edd00ed..54d933c 100644
--- a/restore/content.c
+++ b/restore/content.c
@@ -7796,7 +7796,7 @@ restore_spec( filehdr_t *fhdrp, rv_t *rvp, char *path )
            if ( strlen( path ) >= sizeof( addr.sun_path )) {
                mlog( MLOG_VERBOSE | MLOG_WARNING, _(
                      "pathname too long for bind of "
-                     "%s ino %llu %s: %s: discarding\n"),
+                     "%s ino %llu %s: discarding\n"),
                      printstr,
                      fhdrp->fh_stat.bs_ino,
                      path );
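
To illustrate the class of bug this commit fixes: a printf-style format with more conversion specifiers than arguments makes vfprintf() read a variadic argument that was never passed, which is undefined behavior and can segfault as it did here. The following is only a minimal sketch, not xfsdump code. The names log_to_buf(), count_conversions() and PRINTF_LIKE are hypothetical and introduced purely for illustration. It assumes GCC or Clang for the format attribute, and the counter is deliberately simplified (it ignores "*" widths).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical macro: asks GCC/Clang to check arguments against the format. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, arg_idx) \
    __attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, arg_idx)
#endif

/* Hypothetical stand-in for a logging wrapper; this is NOT mlog(). */
static int log_to_buf(char *buf, size_t len, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

static int log_to_buf(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int n;

    assert(buf != NULL && fmt != NULL);
    va_start(args, fmt);
    n = vsnprintf(buf, len, fmt, args); /* consumes one argument per specifier */
    va_end(args);
    return n;
}

/* Count specifiers that consume an argument; "%%" consumes none.
 * Simplified: "*" width/precision arguments are not counted. */
static int count_conversions(const char *fmt)
{
    int n = 0;

    assert(fmt != NULL);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%')
            continue;
        fmt++;              /* look at the character after '%' */
        if (*fmt == '\0')
            break;          /* stray '%' at end of string */
        if (*fmt != '%')
            n++;
    }
    return n;
}
```

With this sketch, count_conversions() returns 4 for the old format string and 3 for the patched one, matching the three arguments (printstr, bs_ino, path) that the call passes. When a wrapper is declared with the format attribute and called with a literal format, building with -Wall should make the compiler flag such a mismatch. A format wrapped in a translation macro, as in this hunk, may need the macro annotated as well before the check applies.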


Dave Chinner helped me further on IRC and suggested gdb. When I
got the stacktrace, Nathan Scott on the channel used it to find
the problem area and the commit that fixed it. I applied just
that one commit to 3.1.2, recompiled, and confirmed that it is
fixed.

For the sake of completeness I'll post the stacktrace anyway
(with a few ellipses for anonymization).

Thanks again everyone,
--
DB

/sbin/xfsrestore: truncating secondary/[...]/mod_interchange.html from 0 to 22744
/sbin/xfsrestore: drive_simple read( want 32 )
/sbin/xfsrestore: drive_simple return_read_buf( returning 32 )
/sbin/xfsrestore: xlate_extenthdr
/sbin/xfsrestore: read extent hdr size 23040 offset 0 type 4 flags 00000001
/sbin/xfsrestore: read extent hdr type DATA offset 0 sz 23040 flags 1
/sbin/xfsrestore: drive_simple read( want 23040 )
/sbin/xfsrestore: drive_simple return_read_buf( returning 23040 )
/sbin/xfsrestore: drive_simple read( want 32 )
/sbin/xfsrestore: drive_simple return_read_buf( returning 32 )
/sbin/xfsrestore: xlate_extenthdr
/sbin/xfsrestore: read extent hdr size 0 offset 0 type 0 flags 00000001
/sbin/xfsrestore: read extent hdr type LAST offset 0 sz 0 flags 1
/sbin/xfsrestore: drive_simple get_mark( )
/sbin/xfsrestore: drive_simple read( want 256 )
/sbin/xfsrestore: drive_simple return_read_buf( returning 256 )
/sbin/xfsrestore: xlate_bstat
/sbin/xfsrestore: xlate_bstat: pre-xlate
        bs_ino 1705965962454368256
        bs_mode  20060200000
/sbin/xfsrestore: xlate_bstat: post-xlate
        bs_ino 98814635031
        bs_mode  140600
/sbin/xfsrestore: xlate_filehdr: pre-xlate
        fh_offset 0
        fh_flags 33554432
        fh_checksum 3550492224
/sbin/xfsrestore: xlate_filehdr: post-xlate
        fh_offset 0
        fh_flags 2
        fh_checksum 1077321939
/sbin/xfsrestore: read file hdr off 0 flags 0x2 ino 98814635031 mode 0x0000c180
/sbin/xfsrestore: restoring secondary/[...]/etc/socket (98814635031 2812586625)
/sbin/xfsrestore: restoring UNIX domain socket ino 98814635031 secondary/[...]/etc/socket

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff211e700 (LWP 23393)]
0x00000033fd8478de in vfprintf () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.6.x86_64 glibc-2.12-1.80.el6_3.7.x86_64 libattr-2.4.44-7.el6.x86_64 libuuid-2.17.2-12.7.el6_3.x86_64
(gdb) backtrace
#0  0x00000033fd8478de in vfprintf () from /lib64/libc.so.6
#1  0x000000000041b462 in mlog_va (levelarg=<value optimized out>, 
    fmt=0x448068 "pathname too long for bind of %s ino %llu %s: %s: discarding\n", args=0x7ffff211b530) at mlog.c:453
#2  0x000000000041b696 in mlog (levelarg=<value optimized out>, 
    fmt=<value optimized out>) at mlog.c:365
#3  0x000000000042a1dd in restore_spec (cp=<value optimized out>, 
    linkpr=<value optimized out>, 
    path1=0x7fffec0008c0 "secondary/[...]/etc/socket", 
    path2=<value optimized out>) at content.c:7797
#4  restore_file_cb (cp=<value optimized out>, linkpr=<value optimized out>, 
    path1=0x7fffec0008c0 "secondary/[...]/etc/socket", 
    path2=<value optimized out>) at content.c:7299
#5  0x0000000000439d10 in tree_cb_links (ino=98814635031, gen=2812586625, 
    ctime=1358666583, mtime=1322084463, funcp=0x429560 <restore_file_cb>, 
    contextp=0x7ffff211cca0, 
    path1=0x7fffec0008c0 "secondary/[...]/etc/socket", 
    path2=0x7fffec0028d0 "../custom") at tree.c:1870
#6  0x0000000000430f50 in restore_file (thrdix=0) at content.c:7219
#7  applynondirdump (thrdix=0) at content.c:3459
#8  content_stream_restore (thrdix=0) at content.c:2543
#9  0x0000000000418ee6 in childmain (arg1=<value optimized out>) at main.c:1443
#10 0x0000000000407680 in cldmgr_entry (arg1=0x657420) at cldmgr.c:235
#11 0x00000033fdc07851 in start_thread () from /lib64/libpthread.so.0
#12 0x00000033fd8e811d in clone () from /lib64/libc.so.6
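
For context on why the warning path was reached at all: the check in the patched hunk fires when the socket's pathname does not fit in the sun_path member of struct sockaddr_un (typically 108 bytes on Linux), so bind() could not be given that path. Below is a minimal sketch of the same length check in isolation. The helper names fits_in_sun_path() and make_unix_addr() are hypothetical and are not part of xfsrestore.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: nonzero if path plus its NUL fits in sun_path. */
static int fits_in_sun_path(const char *path)
{
    struct sockaddr_un addr;

    assert(path != NULL);
    /* Same comparison as the hunk, inverted: reject strlen >= sizeof. */
    return strlen(path) < sizeof(addr.sun_path);
}

/* Hypothetical helper: fill in an AF_UNIX address for path.
 * Returns 0 on success, -1 if the path is too long to bind. */
static int make_unix_addr(struct sockaddr_un *addr, const char *path)
{
    if (!fits_in_sun_path(path))
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, strlen(path) + 1); /* copy with NUL */
    return 0;
}
```

A caller would log a warning and skip the socket when make_unix_addr() returns -1, which is the branch where the faulty format string lived.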
