Re: Ooops in Kernel 2.6.26.2

To: lachlan@xxxxxxx
Subject: Re: Ooops in Kernel 2.6.26.2
From: Lachlan McIlroy <lachlan@xxxxxxx>
Date: Mon, 11 Aug 2008 17:57:34 +1000
Cc: Sven Geggus <lists@xxxxxxxxxxxxxxxxxxxxx>, xfs@xxxxxxxxxxx
In-reply-to: <489FECCD.6050703@xxxxxxx>
References: <20080808180938.GA3760@xxxxxxxxxxxxxxxxx> <489FECCD.6050703@xxxxxxx>
Reply-to: lachlan@xxxxxxx
Sender: xfs-bounce@xxxxxxxxxxx
User-agent: Thunderbird 2.0.0.16 (X11/20080707)
The ticket allocation code was reworked in 2.6.26 and we now free
tickets, whereas before we cached them, which is why the
use-after-free went undetected until now.
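
As a rough illustration of why caching can hide this kind of bug, here is a toy user-space model. It is not the old XFS ticket cache: `struct cached_ticket`, `struct ticket_cache`, `cache_put()`, `cache_get()` and `stale_flags_after_put()` are hypothetical names made up for this sketch. When "put" only pushes the object onto a free list, a later read through the stale pointer still hits valid memory, so nothing faults even though the access is logically wrong.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; not the kernel's ticket structures. */
struct cached_ticket {
    unsigned int t_flags;
    struct cached_ticket *t_next;   /* free-list link */
};

struct ticket_cache {
    struct cached_ticket *free_list;
};

/* "Release" a ticket by pushing it onto the free list; memory stays valid. */
static void cache_put(struct ticket_cache *cache, struct cached_ticket *ticket)
{
    assert(cache != NULL && ticket != NULL);
    ticket->t_next = cache->free_list;
    cache->free_list = ticket;
}

/* Pop a ticket for reuse, or return NULL if the cache is empty. */
static struct cached_ticket *cache_get(struct ticket_cache *cache)
{
    struct cached_ticket *ticket = cache->free_list;

    if (ticket != NULL)
        cache->free_list = ticket->t_next;
    return ticket;
}

/*
 * Mimics the buggy pattern: put the ticket, then read its flags.
 * With a cache the read quietly succeeds; if cache_put() called
 * free() instead, the same read would be a use-after-free.
 */
static unsigned int stale_flags_after_put(struct ticket_cache *cache,
                                          struct cached_ticket *ticket)
{
    cache_put(cache, ticket);
    return ticket->t_flags;     /* stale access, but memory is still live */
}
```

In this model the stale read returns the old flags without complaint, which is roughly the sense in which the problem could go unnoticed while tickets were cached.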

This patch should do the trick.

--- a/fs/xfs/xfs_log.c  2008-08-11 17:47:18.000000000 +1000
+++ b/fs/xfs/xfs_log.c  2008-08-11 17:53:24.000000000 +1000
@@ -336,15 +364,12 @@ xfs_log_done(xfs_mount_t  *mp,
        } else {
                xlog_trace_loggrant(log, ticket, "xfs_log_done: (permanent)");
                xlog_regrant_reserve_log_space(log, ticket);
-       }
-
-       /* If this ticket was a permanent reservation and we aren't
-        * trying to release it, reset the inited flags; so next time
-        * we write, a start record will be written out.
-        */
-       if ((ticket->t_flags & XLOG_TIC_PERM_RESERV) &&
-           (flags & XFS_LOG_REL_PERM_RESERV) == 0)
+               /* If this ticket was a permanent reservation and we aren't
+                * trying to release it, reset the inited flags; so next time
+                * we write, a start record will be written out.
+                */
                ticket->t_flags |= XLOG_TIC_INITED;
+       }

        return lsn;
}       /* xfs_log_done */
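
For reference, the control flow after the patch can be sketched in user space roughly as follows. This is a simplified model under stated assumptions, not the kernel code: `struct ticket`, `ticket_alloc()`, `log_done_sketch()` and the `TIC_*` / `LOG_*` constants are hypothetical stand-ins, and log space accounting and tracing are left out. The point it shows is that the ticket is only touched on the branch where it has not been freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the flag bits; not the kernel definitions. */
#define TIC_INITED          0x1u
#define TIC_PERM_RESERV     0x2u
#define LOG_REL_PERM_RESERV 0x1u

struct ticket {
    unsigned int t_flags;
};

/* Allocate a ticket with the given flags; returns NULL on failure. */
static struct ticket *ticket_alloc(unsigned int t_flags)
{
    struct ticket *ticket = malloc(sizeof(*ticket));

    if (ticket != NULL)
        ticket->t_flags = t_flags;
    return ticket;
}

/*
 * Returns true if the ticket was freed (the caller must not use it
 * again), false if it was kept and marked for a new start record.
 */
static bool log_done_sketch(struct ticket *ticket, unsigned int flags)
{
    assert(ticket != NULL);

    if ((ticket->t_flags & TIC_PERM_RESERV) == 0 ||
        (flags & LOG_REL_PERM_RESERV)) {
        /* Non-permanent, or release explicitly requested: free it. */
        free(ticket);
        return true;
    }

    /* Permanent reservation kept alive: safe to touch the ticket here. */
    ticket->t_flags |= TIC_INITED;
    return false;
}
```

A permanent ticket passed with `flags == 0` comes back with the inited bit set; passing the release flag, or a non-permanent ticket, frees it and nothing dereferences it afterwards.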


Lachlan McIlroy wrote:
Sven,

I'm not exactly sure what caused this panic, but it could be a
use-after-free problem.  In xfs_log_done() (the function that we panicked
in) we call xlog_ticket_put(), which frees the ticket, and then we
access the ticket again a few lines later.

    if ((ticket->t_flags & XLOG_TIC_PERM_RESERV) == 0 ||
        (flags & XFS_LOG_REL_PERM_RESERV)) {
        /*
         * Release ticket if not permanent reservation or a specific
         * request has been made to release a permanent reservation.
         */
        xlog_trace_loggrant(log, ticket, "xfs_log_done: (non-permanent)");
        xlog_ungrant_log_space(log, ticket);
        xlog_ticket_put(log, ticket);          <=== freed the ticket here
    } else {
        xlog_trace_loggrant(log, ticket, "xfs_log_done: (permanent)");
        xlog_regrant_reserve_log_space(log, ticket);
    }

    /* If this ticket was a permanent reservation and we aren't
     * trying to release it, reset the inited flags; so next time
     * we write, a start record will be written out.
     */
    if ((ticket->t_flags & XLOG_TIC_PERM_RESERV) && <=== accessed it again here
        (flags & XFS_LOG_REL_PERM_RESERV) == 0)
        ticket->t_flags |= XLOG_TIC_INITED;

Lachlan

Sven Geggus wrote:
Hi there,

I've got a (somewhat) reproducible bug with Kernel 2.6.26.2

* Do an xfs_repair on the device (a dm-crypt in this case) just in
  case
* mount the device
* try to remove a given file (rm foo.html)

XFS mounting filesystem dm-0
Ending clean XFS mount for filesystem: dm-0
BUG: unable to handle kernel paging request at f3040f5f
IP: [<c01fd723>] xfs_log_done+0x86/0xa7
*pde = 35dea163 *pte = 33040160
Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC
Modules linked in: radeon drm rfcomm l2cap sym53c8xx scsi_transport_spi snd_via82xx 8139too snd_mpu401_uart snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer via_agp snd_page_alloc agpgart

Pid: 4043, comm: rm Not tainted (2.6.26.2 #1)
EIP: 0060:[<c01fd723>] EFLAGS: 00010282 CPU: 0
EIP is at xfs_log_done+0x86/0xa7
EAX: 00000282 EBX: f3040f30 ECX: 00000000 EDX: f31d5c58
ESI: f304def0 EDI: 00000001 EBP: f31a5dcc ESP: f31a5db4
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rm (pid: 4043, ti=f31a4000 task=f31d5c58 task.ti=f31a4000)
Stack: f47f8bf0 00000002 00000001 f31a5e04 0000000c f321ed88 f31a5ed4 c0206c8e 00000001 f3040f30 f321edb8 f321edb8 00000001 00000000 00000004 f31a5e0c f47f8bf0 c020e7c9 c020e7c9 00000000 f321eed0 00000010 00000013 f47e2ab0
Call Trace:
 [<c0206c8e>] ? _xfs_trans_commit+0x20c/0x36a
 [<c020e7c9>] ? kmem_zone_alloc+0x49/0x8f
 [<c020e7c9>] ? kmem_zone_alloc+0x49/0x8f
 [<c0207bbd>] ? xfs_trans_log_inode+0x14/0x2f
 [<c020d0ac>] ? xfs_remove+0x204/0x286
 [<c0214ba6>] ? xfs_vn_unlink+0x2e/0x4d
 [<c015e605>] ? vfs_unlink+0x5d/0xac
 [<c01601ab>] ? do_unlinkat+0x9b/0x133
 [<c0160253>] ? sys_unlink+0x10/0x12
 [<c0102cbe>] ? syscall_call+0x7/0xb
 =======================
Code: f6 43 2f 02 74 08 f7 c7 01 00 00 00 74 14 89 da 89 f0 e8 46 d7 ff ff 89 da 89 f0 e8 9e d9 ff ff eb 09 89 da 89 f0 e8 f6 fe ff ff <8a> 43 2f a8 02 74 0b 83 e7 01 75 06 83 c8 01 88 43 2f 8b 45 ec
EIP: [<c01fd723>] xfs_log_done+0x86/0xa7 SS:ESP 0068:f31a5db4
---[ end trace fdfcf6b8ebb2164b ]---

Rebooting into my former kernel (2.6.25.10) I was now able to remove
the file without a kernel crash.

Unfortunately, this may well mean that I cannot reproduce this bug
anymore...

Sven





