From owner-xfs@oss.sgi.com Wed Nov 1 15:51:31 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 01 Nov 2006 15:51:38 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA1NpUaG032113 for ; Wed, 1 Nov 2006 15:51:31 -0800 X-ASG-Debug-ID: 1162420629-18514-10-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.mmtab.se (unknown [212.209.150.85]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8515C4EA816 for ; Wed, 1 Nov 2006 14:37:09 -0800 (PST) Received: from [10.0.0.145] ([212.209.150.84]) (authenticated bits=0) by mail.mmtab.se (8.13.7/8.13.7) with ESMTP id kA1Mb71W003236 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 1 Nov 2006 23:37:08 +0100 Message-ID: <4549218B.2020807@mmtab.se> Date: Wed, 01 Nov 2006 23:36:59 +0100 From: Per Mellander User-Agent: Thunderbird 1.5.0.7 (Windows/20060909) MIME-Version: 1.0 To: xfs@oss.sgi.com X-ASG-Orig-Subj: xfs_growfs after lvextend don't increase mounted size. Subject: xfs_growfs after lvextend don't increase mounted size. Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 1.10 X-Barracuda-Spam-Status: No, SCORE=1.10 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=BSF_SC1_SA036d X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24778 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 1.10 BSF_SC1_SA036d Custom Rule SA036d X-archive-position: 9514 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: per.m@mmtab.se Precedence: bulk X-list: xfs Content-Length: 1909 Lines: 61 Hi! I've got a 6.5TB xfs filesystem on a LVM2 volume. I wanted to increase the size of the fs so I added another ~10TB to the volume. 
Every step taken was successful (i.e. no errors), but the filesystem size remained unchanged even after the xfs_growfs.

What I did was the following: I extended the LVM volume using pvcreate, vgextend and finally lvextend.

pvscan gives:

  PV /dev/sda1   VG vg01   lvm2 [1.59 TB / 0 free]
  PV /dev/sdb1   VG vg01   lvm2 [1.59 TB / 0 free]
  PV /dev/sdc1   VG vg01   lvm2 [1.59 TB / 0 free]
  PV /dev/sdd1   VG vg01   lvm2 [1.59 TB / 0 free]
  PV /dev/sde1   VG vg01   lvm2 [2.00 TB / 0 free]
  PV /dev/sdf1   VG vg01   lvm2 [2.00 TB / 0 free]
  PV /dev/sdg1   VG vg01   lvm2 [2.00 TB / 0 free]
  PV /dev/sdh1   VG vg01   lvm2 [2.00 TB / 0 free]
  PV /dev/sdi1   VG vg01   lvm2 [1.55 TB / 0 free]
  Total: 9 [15.91 TB] / in use: 9 [15.91 TB] / in no VG: 0 [0 ]

and lvdisplay:

  --- Logical volume ---
  LV Name                /dev/vg01/lv01
  VG Name                vg01
  LV UUID                rDSEQ3-DdhV-oLci-nNYT-dMdX-cppA-7v1r3e
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                15.91 TB
  Current LE             4169996
  Segments               9
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:0

After xfs_growfs /vol1 (mounted fs), df -h gives:

  /dev/mapper/vg01-lv01  6.4T  6.4T   36G 100% /vol1

The fs remains at its previous size!

  # cat /sys/block/dm-0/size
  34160607232

which is equal to 15.91 TB.

Have I missed something when I extended the volume / grew the xfs fs?

The system running xfs is Fedora Core 4, 2.6.15-1.1831_FC4smp, lvm2-2.01.08-2.1, xfsprogs-2.6.13-4.

Per

_________________________________
This email has been ClamScanned !
www.clamav.net From owner-xfs@oss.sgi.com Wed Nov 1 19:43:56 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 01 Nov 2006 19:44:07 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kA23hraG015861 for ; Wed, 1 Nov 2006 19:43:55 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA29220; Thu, 2 Nov 2006 14:42:59 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kA23gv7Y22495852; Thu, 2 Nov 2006 14:42:57 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kA23gsYm22416042; Thu, 2 Nov 2006 14:42:54 +1100 (AEDT) Date: Thu, 2 Nov 2006 14:42:54 +1100 From: David Chinner To: Per Mellander Cc: xfs@oss.sgi.com Subject: Re: xfs_growfs after lvextend don't increase mounted size. Message-ID: <20061102034254.GC11034@melbourne.sgi.com> References: <4549218B.2020807@mmtab.se> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4549218B.2020807@mmtab.se> User-Agent: Mutt/1.4.2.1i X-archive-position: 9518 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 840 Lines: 27 On Wed, Nov 01, 2006 at 11:36:59PM +0100, Per Mellander wrote: > Hi! > > I've got a 6.5TB xfs filesystem on a LVM2 volume. I wanted to increase > the size of the fs so I added another ~10TB to the volume. Every step > taken was successfull, (ie no errors) but the filesystem size remained > unchanged even after the xfs_growfs. There's a 32 bit overflow in the growfs code (and transaction code on 32 bit systems) so you can't grow by more than 2TB at a time. 
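As a rough illustration of the 2TB-per-call limit described above, the following Python sketch works out a sequence of intermediate sizes, in filesystem blocks, that each stay under 2TB of growth. It is only a sketch under stated assumptions: a 4096-byte filesystem block, "2TB" read as 2 TiB, and a hypothetical starting size of 6.5 TiB. The function name growfs_steps is made up for this example. The device size is the 512-byte sector count Per reported from /sys/block/dm-0/size. xfs_growfs -D takes its size in filesystem blocks; the real current size and block size should be read from xfs_growfs -n rather than assumed.

```python
def growfs_steps(current_fsb, device_bytes, block_size=4096,
                 max_step_bytes=2 * 1024**4):
    """Return target sizes (in filesystem blocks) for growing in several steps.

    Assumptions (not from the original thread): block_size is 4096 bytes and
    the per-call limit is 2 TiB. Each step grows by one block less than the
    limit so that no single step reaches 2 TiB.
    """
    target_fsb = device_bytes // block_size          # whole blocks that fit on the device
    step_fsb = max_step_bytes // block_size - 1      # stay strictly under the limit
    sizes = []
    size = current_fsb
    while size < target_fsb:
        size = min(size + step_fsb, target_fsb)      # never exceed the device size
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    sectors = 34160607232                 # from /sys/block/dm-0/size (512-byte sectors)
    current = int(6.5 * 1024**4) // 4096  # hypothetical: 6.5 TiB in 4096-byte blocks
    for fsb in growfs_steps(current, sectors * 512):
        print(fsb)
```

Each printed value would then be passed in turn as the size argument of xfs_growfs -D against the mounted filesystem (/vol1 in Per's case), checking the reported size after every step.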
I've got a fix under test for this at the moment. Can you see if you can grow using: # xfs_growfs -D FSB = filesystem block size, and the current size is also in FSB. You can get both the current size and the FSB from xfs_growfs -n Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Nov 1 22:08:00 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 01 Nov 2006 22:08:09 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA267xaG011292 for ; Wed, 1 Nov 2006 22:08:00 -0800 X-ASG-Debug-ID: 1162443331-27234-621-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from wx-out-0506.google.com (wx-out-0506.google.com [66.249.82.233]) by cuda.sgi.com (Spam Firewall) with ESMTP id 57CDA4E771A for ; Wed, 1 Nov 2006 20:55:31 -0800 (PST) Received: by wx-out-0506.google.com with SMTP id h29so27101wxd for ; Wed, 01 Nov 2006 20:55:31 -0800 (PST) Received: by 10.70.90.12 with SMTP id n12mr10854229wxb; Wed, 01 Nov 2006 19:57:04 -0800 (PST) Received: from ?192.168.23.241? 
( [208.195.10.2]) by mx.google.com with ESMTP id h14sm1810249wxd.2006.11.01.19.57.04; Wed, 01 Nov 2006 19:57:04 -0800 (PST) Message-ID: <45496C84.4090309@novacell.com> Date: Wed, 01 Nov 2006 19:56:52 -0800 From: John Novak User-Agent: Mozilla Thunderbird 1.0.6 (Windows/20050716) X-Accept-Language: en-us, en MIME-Version: 1.0 To: xfs@oss.sgi.com X-ASG-Orig-Subj: xfs over iscsi via initiator on CentOS 4.4 Subject: xfs over iscsi via initiator on CentOS 4.4 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24802 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9519 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jnovak@novacell.com Precedence: bulk X-list: xfs Content-Length: 470 Lines: 16

Does anyone have information about running an xfs file system over iscsi from the CentOS 4.4 iscsi initiator?

The only post I could find about this is one from August 2004 indicating a known issue. (http://oss.sgi.com/archives/xfs/2004-08/msg00155.html)

I tried this on CentOS 4.4 with kernel 2.6.9-42.0.3.plus.c4smp running on a dual core AMD proc. I'm seeing the same symptom as reported in this earlier post.
Any pointers are appreciated, Best, John Novak From owner-xfs@oss.sgi.com Thu Nov 2 09:26:58 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 09:27:06 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA2HQvaG013977 for ; Thu, 2 Nov 2006 09:26:58 -0800 X-ASG-Debug-ID: 1162488370-28353-201-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from rrzmta2.rz.uni-regensburg.de (rrzmta2.rz.uni-regensburg.de [132.199.1.17]) by cuda.sgi.com (Spam Firewall) with ESMTP id EDF404ED0B6 for ; Thu, 2 Nov 2006 09:26:11 -0800 (PST) Received: from rrzmta2.rz.uni-regensburg.de (localhost [127.0.0.1]) by localhost (Postfix) with SMTP id B5BCF6C6AB for ; Thu, 2 Nov 2006 18:26:15 +0100 (CET) Received: from pc51072.physik.uni-regensburg.de (pc51072.physik.uni-regensburg.de [132.199.98.129]) by rrzmta2.rz.uni-regensburg.de (Postfix) with ESMTP id B04656C67E for ; Thu, 2 Nov 2006 18:26:15 +0100 (CET) Received: by pc51072.physik.uni-regensburg.de (Postfix, from userid 28561) id 883F4507058; Thu, 2 Nov 2006 18:26:08 +0100 (CET) Date: Thu, 2 Nov 2006 18:26:08 +0100 From: Christian Guggenberger To: xfs@oss.sgi.com X-ASG-Orig-Subj: mount failed after xfs_growfs beyond 16 TB Subject: mount failed after xfs_growfs beyond 16 TB Message-ID: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> Reply-To: christian.guggenberger@physik.uni-regensburg.de Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-Barracuda-Spam-Score: 0.50 X-Barracuda-Spam-Status: No, SCORE=0.50 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=BSF_RULE7568M X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24854 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.50 BSF_RULE7568M BODY: Custom Rule 7568M X-archive-position: 
9523 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.guggenberger@physik.uni-regensburg.de Precedence: bulk X-list: xfs Content-Length: 2978 Lines: 78 Hi, a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on top of lvm2 to 17TB. (I am not even sure if that's supposed work with linux-2.6, 32bit) used kernel seems to be debian sarge's 2.6.8 xfs_growfs seemed to succeed (AFAIK..) however, the fs shut down: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c. Caller 0xf89978a8 [__crc_pm_idle+550816/2056674] xfs_free_ag_extent+0x454/0x78a [xfs] [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs] [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs] [__crc_pm_idle+553757/2056674] xfs_alloc_read_agf+0xbe/0x1e4 [xfs] [__crc_pm_idle+764480/2056674] xfs_growfs_data_private+0xd80/0xec0 [xfs] [pty_write+305/307] pty_write+0x131/0x133 [opost+154/428] opost+0x9a/0x1ac [__crc_pm_idle+765024/2056674] xfs_growfs_data+0x3f/0x5e [xfs] [__crc_pm_idle+972873/2056674] xfs_ioctl+0x256/0x860 [xfs] [tty_write+436/788] tty_write+0x1b4/0x314 [write_chan+0/538] write_chan+0x0/0x21a [__crc_pm_idle+968754/2056674] linvfs_ioctl+0x78/0x101 [xfs] [sys_ioctl+315/675] sys_ioctl+0x13b/0x2a3 [syscall_call+7/11] syscall_call+0x7/0xb xfs_force_shutdown(dm-1,0x8) called from line 1088 of file fs/xfs/xfs_trans.c. Return address = 0xf8a01c3c Filesystem "dm-1": Corruption of in-memory data detected. Shutting down filesystem: dm-1 Please umount the filesystem, and rectify the problem(s) xfs_force_shutdown(dm-1,0x1) called from line 353 of file fs/xfs/xfs_rw.c. Return address = 0xf8a01c3c mounting fails with: XFS: SB sanity check 2 failed Filesystem "dm-1": XFS internal error xfs_mount_validate_sb(4) at line 277 of file fs/xfs/xfs_mount.c. 
Caller 0xf89e568c [__crc_pm_idle+872883/2056674] xfs_mount_validate_sb+0x21d/0x39a [xfs] [__crc_pm_idle+874509/2056674] xfs_readsb+0xee/0x1f9 [xfs] [__crc_pm_idle+874509/2056674] xfs_readsb+0xee/0x1f9 [xfs] [__crc_pm_idle+908971/2056674] xfs_mount+0x282/0x5d4 [xfs] [__crc_pm_idle+989973/2056674] vfs_mount+0x34/0x38 [xfs] [__crc_pm_idle+989973/2056674] vfs_mount+0x34/0x38 [xfs] [__crc_pm_idle+989534/2056674] linvfs_fill_super+0xa1/0x1ee [xfs] [snprintf+39/43] snprintf+0x27/0x2b [disk_name+169/171] disk_name+0xa9/0xab [sb_set_blocksize+46/93] sb_set_blocksize+0x2e/0x5d [get_sb_bdev+262/313] get_sb_bdev+0x106/0x139 [__crc_pm_idle+989914/2056674] linvfs_get_sb+0x2f/0x36 [xfs] [__crc_pm_idle+989373/2056674] linvfs_fill_super+0x0/0x1ee [xfs] [do_kern_mount+162/354] do_kern_mount+0xa2/0x162 [do_new_mount+115/181] do_new_mount+0x73/0xb5 [do_mount+370/446] do_mount+0x172/0x1be [copy_mount_options+99/188] copy_mount_options+0x63/0xbc [sys_mount+212/344] sys_mount+0xd4/0x158 [syscall_call+7/11] syscall_call+0x7/0xb XFS: SB validate failed XFS: SB sanity check 2 failed and finally, xfs_repair stops at bad primary superblock: inconsistent file geometrie information found candidate secondary superblock... 
superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29 thanks in advance, - Christian From owner-xfs@oss.sgi.com Thu Nov 2 10:39:59 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 10:40:05 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA2IdwaG023870 for ; Thu, 2 Nov 2006 10:39:59 -0800 X-ASG-Debug-ID: 1162492753-26243-917-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7F998D1BB033 for ; Thu, 2 Nov 2006 10:39:13 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kA2IcdjR009922; Thu, 2 Nov 2006 13:38:39 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kA2IcYL8032472; Thu, 2 Nov 2006 13:38:34 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kA2IcXmw030023; Thu, 2 Nov 2006 13:38:34 -0500 Message-ID: <454A3B28.7010405@sandeen.net> Date: Thu, 02 Nov 2006 12:38:32 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.7 (X11/20060913) MIME-Version: 1.0 To: christian.guggenberger@physik.uni-regensburg.de CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: mount failed after xfs_growfs beyond 16 TB Subject: Re: mount failed after xfs_growfs beyond 16 TB References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> In-Reply-To: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 
3.0.24856 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9525 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 4040 Lines: 121 Christian Guggenberger wrote: > Hi, > > a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on > top of lvm2 to 17TB. (I am not even sure if that's supposed work with > linux-2.6, 32bit) If you have CONFIG_LBD enabled (do you?), it should in theory, barring bugs :) > used kernel seems to be debian sarge's 2.6.8 hmm old.... > xfs_growfs seemed to succeed (AFAIK..) trace below looks like not... > however, the fs shut down: > > XFS internal error > XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c. Caller > 0xf89978a8 > [__crc_pm_idle+550816/2056674] xfs_free_ag_extent+0x454/0x78a [xfs] > [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs] > [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs] > [__crc_pm_idle+553757/2056674] xfs_alloc_read_agf+0xbe/0x1e4 [xfs] in the growfs thread here > [__crc_pm_idle+764480/2056674] xfs_growfs_data_private+0xd80/0xec0 [xfs] > [pty_write+305/307] pty_write+0x131/0x133 > [opost+154/428] opost+0x9a/0x1ac > [__crc_pm_idle+765024/2056674] xfs_growfs_data+0x3f/0x5e [xfs] > [__crc_pm_idle+972873/2056674] xfs_ioctl+0x256/0x860 [xfs] > [tty_write+436/788] tty_write+0x1b4/0x314 > [write_chan+0/538] write_chan+0x0/0x21a > [__crc_pm_idle+968754/2056674] linvfs_ioctl+0x78/0x101 [xfs] > [sys_ioctl+315/675] sys_ioctl+0x13b/0x2a3 > [syscall_call+7/11] syscall_call+0x7/0xb > xfs_force_shutdown(dm-1,0x8) called > from line 1088 of file fs/xfs/xfs_trans.c. Return address = 0xf8a01c3c > Filesystem "dm-1": Corruption of > in-memory data detected. 
Shutting down filesystem: dm-1 > Please umount the filesystem, and > rectify the problem(s) > xfs_force_shutdown(dm-1,0x1) called > from line 353 of file fs/xfs/xfs_rw.c. Return address = 0xf8a01c3c > > mounting fails with: > > XFS: SB sanity check 2 failed This is checking: if (unlikely( sbp->sb_dblocks == 0 || sbp->sb_dblocks > (xfs_drfsbno_t)sbp->sb_agcount * sbp->sb_agblocks || sbp->sb_dblocks < (xfs_drfsbno_t)(sbp->sb_agcount - 1) * sbp->sb_agblocks + XFS_MIN_AG_BLOCKS)) { xfs_fs_mount_cmn_err(flags, "SB sanity check 2 failed"); return XFS_ERROR(EFSCORRUPTED); } can you point xfs_db -r /dev/dm-1 and then: xfs_db> sb 0 xfs_db> p let's see what you've got. Also how big does /proc/partitions think your new device is? > Filesystem "dm-1": XFS internal error > xfs_mount_validate_sb(4) at line 277 of file fs/xfs/xfs_mount.c. Caller > 0xf89e568c > [__crc_pm_idle+872883/2056674] xfs_mount_validate_sb+0x21d/0x39a [xfs] > [__crc_pm_idle+874509/2056674] xfs_readsb+0xee/0x1f9 [xfs] > [__crc_pm_idle+874509/2056674] xfs_readsb+0xee/0x1f9 [xfs] > [__crc_pm_idle+908971/2056674] xfs_mount+0x282/0x5d4 [xfs] > [__crc_pm_idle+989973/2056674] vfs_mount+0x34/0x38 [xfs] > [__crc_pm_idle+989973/2056674] vfs_mount+0x34/0x38 [xfs] > [__crc_pm_idle+989534/2056674] linvfs_fill_super+0xa1/0x1ee [xfs] > [snprintf+39/43] snprintf+0x27/0x2b > [disk_name+169/171] disk_name+0xa9/0xab > [sb_set_blocksize+46/93] sb_set_blocksize+0x2e/0x5d > [get_sb_bdev+262/313] get_sb_bdev+0x106/0x139 > [__crc_pm_idle+989914/2056674] linvfs_get_sb+0x2f/0x36 [xfs] > [__crc_pm_idle+989373/2056674] linvfs_fill_super+0x0/0x1ee [xfs] > [do_kern_mount+162/354] do_kern_mount+0xa2/0x162 > [do_new_mount+115/181] do_new_mount+0x73/0xb5 > [do_mount+370/446] do_mount+0x172/0x1be > [copy_mount_options+99/188] copy_mount_options+0x63/0xbc > [sys_mount+212/344] sys_mount+0xd4/0x158 > [syscall_call+7/11] syscall_call+0x7/0xb > XFS: SB validate failed > XFS: SB sanity check 2 failed > > and finally, xfs_repair stops at > > bad 
primary superblock: inconsistent file geometrie information > > found candidate secondary superblock... > superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29 hmm that offset is about 9.4 terabytes. any kernel messages when this happens? rval 29 is ESPIPE / illegal seek. -Eric > thanks in advance, > > - Christian > > > From owner-xfs@oss.sgi.com Thu Nov 2 13:24:27 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 13:24:34 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA2LOQaG014015 for ; Thu, 2 Nov 2006 13:24:27 -0800 X-ASG-Debug-ID: 1162499093-7718-584-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.172]) by cuda.sgi.com (Spam Firewall) with ESMTP id 37B28D1BB00B for ; Thu, 2 Nov 2006 12:24:53 -0800 (PST) Received: by ug-out-1314.google.com with SMTP id q2so203705uge for ; Thu, 02 Nov 2006 12:24:50 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:from:to:subject:date:message-id:mime-version:content-type:x-mailer:thread-index:x-mimeole:disposition-notification-to; b=S5Tr40n0VzXEbKOkkycMjw7BF8QbKC9892Tyr5kDCalgIATlIZkCRB22e9QYQ1EOrBr5Hz4oic6GmWlOfZFGxI5B//eoi94Jw/EFAIPsPfeHfR1SSgQ9vxqjTHQqxGAnCN45VwWWb3pKwXzbMFQl3aHZ0JJvPGbq7MbpmUUBXYg= Received: by 10.67.27.3 with SMTP id e3mr1329187ugj.1162498723188; Thu, 02 Nov 2006 12:18:43 -0800 (PST) Received: from home ( [84.94.68.186]) by mx.google.com with ESMTP id q1sm2665537uge.2006.11.02.12.18.40; Thu, 02 Nov 2006 12:18:42 -0800 (PST) From: "Uri Rotshtein" To: X-ASG-Orig-Subj: Adding xfs to kernel 2.4.20 Subject: Adding xfs to kernel 2.4.20 Date: Thu, 2 Nov 2006 22:18:15 +0300 Message-ID: MIME-Version: 1.0 X-Mailer: Microsoft Office Outlook 11 Thread-Index: Acb95DpcKk7+x66/RvCKU7nzmnSqUQ== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2962 X-Barracuda-Spam-Score: 0.00 
X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24864 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit X-archive-position: 9526 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: rotshtein@gmail.com Precedence: bulk X-list: xfs Content-Length: 142 Lines: 15 Hi, I'm using kernel 2.4.20 with Redhat 9 and would like to try the XFS filesystem. 10x, Uri. [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Nov 2 14:28:04 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 14:28:08 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA2MS2aG019617 for ; Thu, 2 Nov 2006 14:28:04 -0800 X-ASG-Debug-ID: 1162506436-22227-286-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id 329204EC724 for ; Thu, 2 Nov 2006 14:27:16 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kA2MRErb012526; Thu, 2 Nov 2006 17:27:14 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kA2MREeT028105; Thu, 2 Nov 2006 17:27:14 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kA2MRBhi019988; Thu, 2 Nov 2006 17:27:12 -0500 Message-ID: <454A70BE.8010906@sandeen.net> Date: Thu, 02 Nov 2006 16:27:10 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.7 
(X11/20060913) MIME-Version: 1.0 To: Uri Rotshtein CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Adding xfs to kernel 2.4.20 Subject: Re: Adding xfs to kernel 2.4.20 References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24874 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9528 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 271 Lines: 16 Uri Rotshtein wrote: > Hi, > > > > I'm using kernel 2.4.20 with Redhat 9 and would like to try the XFS > filesystem. > > you might look at ftp://oss.sgi.com/projects/xfs/testing/RHEL3/ although I'm not sure why you'd want to use such an old codebase :) -eric From owner-xfs@oss.sgi.com Thu Nov 2 16:42:50 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 16:42:54 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kA30gjaG007494 for ; Thu, 2 Nov 2006 16:42:49 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA02613; Fri, 3 Nov 2006 11:41:50 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kA30fl7Y23303094; Fri, 3 Nov 2006 11:41:48 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kA30fgCF22779868; Fri, 3 Nov 2006 11:41:42 +1100 (AEDT) Date: Fri, 3 Nov 2006 11:41:42 +1100 From: David Chinner 
To: Christian Guggenberger Cc: xfs@oss.sgi.com Subject: Re: mount failed after xfs_growfs beyond 16 TB Message-ID: <20061103004142.GI8394166@melbourne.sgi.com> References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> User-Agent: Mutt/1.4.2.1i X-archive-position: 9529 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1586 Lines: 52 On Thu, Nov 02, 2006 at 06:26:08PM +0100, Christian Guggenberger wrote: > Hi, > > a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on > top of lvm2 to 17TB. (I am not even sure if that's supposed work with > linux-2.6, 32bit) Not supported - any metadata access past 16TB will wrap the 32 bit page cache index for the metadata address space and you'll corrupt the filesystem. > used kernel seems to be debian sarge's 2.6.8 > > xfs_growfs seemed to succeed (AFAIK..) > > however, the fs shut down: > > XFS internal error > XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c. Caller > 0xf89978a8 > [__crc_pm_idle+550816/2056674] xfs_free_ag_extent+0x454/0x78a [xfs] > [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs] > [__crc_pm_idle+555561/2056674] xfs_free_extent+0xea/0x10f [xfs] > [__crc_pm_idle+553757/2056674] xfs_alloc_read_agf+0xbe/0x1e4 [xfs] > [__crc_pm_idle+764480/2056674] xfs_growfs_data_private+0xd80/0xec0 [xfs] No, growfs failed trying to extend the data partition and shut down the filesystem. > mounting fails with: > > XFS: SB sanity check 2 failed ..... > and finally, xfs_repair stops at > > bad primary superblock: inconsistent file geometrie information Probably because growfs failed part way through and left inconsistent state behind. > found candidate secondary superblock... 
> superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29 Does LVM2 even support volumes larger than 16TB on 32 bit machines? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 2 16:58:22 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 16:58:26 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kA30wIaG009337 for ; Thu, 2 Nov 2006 16:58:20 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA03056; Fri, 3 Nov 2006 11:57:26 +1100 Date: Fri, 03 Nov 2006 10:58:31 +1000 From: Timothy Shimmin To: peyytmek@gmx.de cc: xfs@oss.sgi.com Subject: Re: Xfs-mailinglist question (xfs mounting problem, hdb1 just freezes) Message-ID: <3B2B6490C980DD2C8B4C9645@timothy-shimmins-power-mac-g5.local> In-Reply-To: <200611012255.46008.peyytmek@gmx.de> References: <200611012255.46008.peyytmek@gmx.de> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Disposition: inline Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id kA30wMaG009349 X-archive-position: 9530 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 4619 Lines: 145 Hi there, Sorry about not getting back to you. The inode details you sent me look reasonable to me (AFAICS). However, the inode that we are processing which has problems is probably coming from the log. Basically, during log replay we reinstate metadata such as inodes, inode buffers, and later process an unlinked-list which this inode appears to be on. 
It's during the truncating (as part of inactivating the inode) that we have problems when looking at its extents.

You should be able to see this inode in the log if you use:
# xfs_logprint -ti device
(or possibly I guess
# xfs_logprint -tibo device
- if the inode is in a buffer of inodes)
I'd be curious to see this.

My thoughts for a plan of action were either:
(1) forget the log and do an "xfs_repair -L" to zero it out and repair, or
(2) somehow try to do a mount which avoids the unlinked list processing (an older kernel (pre May 26) would do this because of a bug in recovery); then unmount, then do a normal "xfs_repair", or
(3) we try to stop this particular inode from being processed in the unlinked-list inode processing during mount; unmount; repair and then mount again.

The simplest is to do option (1). However, it means that any other outstanding metadata changes would be lost.
Option (2) might be a goer but you need such a kernel, and in that case it won't do any unlinked processing for a group of such inodes (ones hashed to that bucket in the AGI's). I'm unsure on how to stop this unlinked processing otherwise.
Option (3) I'm unsure how to do. Also, there could be more inodes in the same boat which could cause recovery to crash.

Maybe others have suggestions.

Regards,
Tim.

--On 1 November 2006 10:55:43 PM +0000 peyytmek@gmx.de wrote:

> Hello,
> Thanks for your last answer. Since you didn't answer my email for 2 weeks i
> thought you might have deleted it accidently
>
> Here's the email conversation:
>
>> > Hello.
>> > Thanks for your answer.
>> >
>> > That's what i have: dmesg print with kernel-2.6.16-gentoo-r3 and an print
>> > of xfs_bg.
>> >
>> >> You could print out the offending inode with xfs_db to show us
>> >> what it looks like: $xfs_db -r /dev/hdb1 -c "inode 950759" -c "print".
>> >
>> > I don't know what you mean with it but i added it anyway.
>> > (done with kernel-2.6.18-gentoo if it matters)
>> >
>> > xfs_db:
>> >
>> > CLX ~ # xfs_db -r /dev/hdb1 -c "inode 950759" -c "print"
>> > core.magic = 0x494e
>> > core.mode = 0100644
>> > core.version = 1
>> > core.format = 3 (btree)
>> > core.nlinkv1 = 0
>> > core.uid = 1000
>> > core.gid = 100
>> > core.flushiter = 0
>> > core.atime.sec = Sun Aug 27 14:56:52 2006
>> > core.atime.nsec = 657389250
>> > core.mtime.sec = Sun Aug 27 16:29:40 2006
>> > core.mtime.nsec = 080196250
>> > core.ctime.sec = Thu Oct  5 01:17:40 2006
>> > core.ctime.nsec = 976565958
>> > core.size = 32071862
>> > core.nblocks = 7833
>> > core.extsize = 0
>> > core.nextents = 28
>> > core.naextents = 0
>> > core.forkoff = 0
>> > core.aformat = 2 (extents)
>> > core.dmevmask = 0
>> > core.dmstate = 0
>> > core.newrtbm = 0
>> > core.prealloc = 0
>> > core.realtime = 0
>> > core.immutable = 0
>> > core.append = 0
>> > core.sync = 0
>> > core.noatime = 0
>> > core.nodump = 0
>> > core.rtinherit = 0
>> > core.projinherit = 0
>> > core.nosymlinks = 0
>> > core.extsz = 0
>> > core.extszinherit = 0
>> > core.gen = 0
>> > next_unlinked = null
>> > u.bmbt.level = 1
>> > u.bmbt.numrecs = 1
>> > u.bmbt.keys[1] = [startoff] 1:[0]
>> > u.bmbt.ptrs[1] = 1:185933
>>
>> And now:
>>
>> xfs_db -r /dev/hdb1 -c "fsb 185933" -c "type bmapbtd" -c "p"
>>
>> to look at the 28 extent records.
>>
>> --Tim
>
> Here's the content of my last Email
>
>
> Hello, thanks again for your fast answer
> Sorry for the double post last time.
> here it comes > > > CLX ~ # xfs_db -r /dev/hdb1 -c "fsb 185933" -c "type bmapbtd" -c "p" > magic = 0x424d4150 > level = 0 > numrecs = 27 > leftsib = null > rightsib = null > recs[1-27] = [startoff,startblock,blockcount,extentflag] 1:[0,185637,16,0] 2: > [16,185537,8,0] 3:[24,185718,8,0] 4:[32,185706,8,0] 5:[40,185836,8,0] 6: > [48,185848,16,0] 7:[64,185865,16,0] 8:[80,185882,8,0] 9:[96,185899,16,0] 10: > [112,185916,16,0] 11:[340,185934,2,0] 12:[342,4768704,1320,0] 13: > [1662,4770389,239,0] 14:[1901,4770919,264,0] 15:[2165,4771391,165,0] 16: > [2330,4771860,227,0] 17:[2557,4861204,351,0] 18:[2908,4861800,257,0] 19: > [3165,4862282,349,0] 20:[3514,4862934,230,0] 21:[3744,4863506,383,0] 22: > [4127,4864141,348,0] 23:[4475,4864871,228,0] 24:[4703,4865358,268,0] 25: > [4971,4865882,593,0] 26:[5564,4866818,339,0] 27:[5903,4867729,1928,0] From owner-xfs@oss.sgi.com Thu Nov 2 18:42:36 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 18:42:41 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kA32gXaG021241 for ; Thu, 2 Nov 2006 18:42:34 -0800 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA06154; Fri, 3 Nov 2006 13:41:41 +1100 Message-ID: <454AAC6B.7010406@sgi.com> Date: Fri, 03 Nov 2006 13:41:47 +1100 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.7 (X11/20060909) MIME-Version: 1.0 To: jgl@johngroves.net CC: linux-xfs@oss.sgi.com, John Groves , Dean Roehrich Subject: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory References: <4547DA70.4040107@Groves.net> <4547EDFD.8020407@sgi.com> <454A94A6.6040907@johngroves.net> In-Reply-To: <454A94A6.6040907@johngroves.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9531 X-ecartis-version: Ecartis v1.0.0 Sender: 
xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 2779 Lines: 66 John Groves wrote: > Thanks for your replies, Vlad, although I don't find anything so far > that helps with my problem (my paths are not long, and my calls to > dm_path_to_handle have been running in production environments for a > couple of years). As far as I can see, dm_path_to_handle does not > work on a directory (?), although it works perfectly on a file. I > will try to dig deeper into this over the next few days, but here is a > somewhat clearer explanation of the behavior I am seeing. > > I have updated to the latest kernel from SGI's CVS server, but the > problem is still there. I am tracing through kernel code, and will be > happy to pull together some test code that demonstrates the problem, > or to post a patch if I figure it out, but this will take a few days. > > The sequence in which I find this problem is: > > 1. Receive a pre-rename event > 2. Use the first handle parameter to resolve the pre-rename parent > directory path (not via dm_handle_to_path -- I had to roll my own > mechanisms for turning handles into paths). > 3. Concatenate the first name parameter to the first parent > directory path, to get the relative path from mount point to actual > file being renamed. > 4. Call dm_path_to_handle on that path, hoping to get the handle of > the file-being-renamed. > > If the renamed-thing is a file, this works. If it's a directory, > dm_path_to_handle fails. > > With my dmapi event handler installed and running, I can reproduce it > by doing the following in the root directory of the filesystem: > > mkdir -p x/y/z > mv x/y x/w > > In the pre-rename event, prior to responding to the event, my handler > correctly determines that x/y is being renamed to x/w, but > dm_path_to_handle does not return the handle of x/y. 
My post-rename > event handler also correctly resolves the paths, but dm_path_to_handle > does not return the handle of x/w. > > If x/y is a file (rather than a directory) it all works properly. > > Let me know if you can think of anything specific I should look at, or > of a different way of getting the handle of the renamed thingy. Hi John, I did try this on my dmapi filesystem: emu:/mnt/scratch1/dmapi_test # mkdir -p x/y/z emu:/mnt/scratch1/dmapi_test # /home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd/path_to_handle x/y 5d1111a90e4800000e00000003000000d903400000000000 emu:/mnt/scratch1/dmapi_test # mv x/y x/w emu:/mnt/scratch1/dmapi_test # /home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd/path_to_handle x/w 5d1111a90e4800000e00000003000000d903400000000000 emu:/mnt/scratch1/dmapi_test # I also tried path_to_handle with relative path to a directory it worked fine too. When you say dm_path_to_handle fails, what is the error returned? Regards, Vlad From owner-xfs@oss.sgi.com Thu Nov 2 19:00:34 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 19:00:38 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kA330SaG023868 for ; Thu, 2 Nov 2006 19:00:33 -0800 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA06623; Fri, 3 Nov 2006 13:59:38 +1100 Message-ID: <454AB0A0.7050309@sgi.com> Date: Fri, 03 Nov 2006 13:59:44 +1100 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.7 (X11/20060909) MIME-Version: 1.0 To: John Groves CC: jgl@johngroves.net, linux-xfs@oss.sgi.com, Dean Roehrich Subject: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory References: <4547DA70.4040107@Groves.net> <4547EDFD.8020407@sgi.com> <454A94A6.6040907@johngroves.net> <454AAC6B.7010406@sgi.com> <454AAF31.8050104@Groves.net> In-Reply-To: 
<454AAF31.8050104@Groves.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9532 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 1477 Lines: 52 John Groves wrote: > Vlad Apostolov wrote: > >> Hi John, >> >> I did try this on my dmapi filesystem: >> >> emu:/mnt/scratch1/dmapi_test # mkdir -p x/y/z >> emu:/mnt/scratch1/dmapi_test # >> /home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd/path_to_handle >> x/y >> 5d1111a90e4800000e00000003000000d903400000000000 >> emu:/mnt/scratch1/dmapi_test # mv x/y x/w >> emu:/mnt/scratch1/dmapi_test # >> /home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd/path_to_handle >> x/w >> 5d1111a90e4800000e00000003000000d903400000000000 >> emu:/mnt/scratch1/dmapi_test # >> >> I also tried path_to_handle with relative path to a directory it >> worked fine too. When you say >> dm_path_to_handle fails, what is the error returned? >> >> Regards, >> Vlad >> > > Vlad, > > This was my bad -- I need to go back to programming school ;-). > > The function in question was dealing in mount-point-relative paths, > not full paths, and I didn't notice the distinction. Passing a full > path to dm_path_to_handle fixed it. As for thinking it behaved > differently for a directory than for a file -- I've been smoking a > batch of bad crack ;-). Calling dm_path_to_handle also failed with > relative paths to files -- I just didn't notice because it wasn't > fatal on that code path. > > Thanks for responding and looking into it, though. > > Regards, > John > No problems John, Just a note that dm_path_to_handle works fine with relative paths on my machine. 
Regards, Vlad From owner-xfs@oss.sgi.com Thu Nov 2 19:01:50 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 19:01:54 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA331maG024120 for ; Thu, 2 Nov 2006 19:01:50 -0800 X-ASG-Debug-ID: 1162522861-30622-124-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from n034.sc1.cp.net (smtpout1453.sc1.he.tucows.com [64.97.157.153]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4C77FD1B2073 for ; Thu, 2 Nov 2006 19:01:01 -0800 (PST) Received: from [192.168.2.120] (70.112.81.243) by n034.sc1.cp.net (7.2.069.1) (authenticated as john@groves.net) id 454A86AE00007334; Fri, 3 Nov 2006 02:53:33 +0000 Message-ID: <454AAF31.8050104@Groves.net> Date: Thu, 02 Nov 2006 20:53:37 -0600 From: John Groves User-Agent: Mozilla Thunderbird 1.0.7 (Windows/20050923) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vlad Apostolov CC: jgl@johngroves.net, linux-xfs@oss.sgi.com, Dean Roehrich X-ASG-Orig-Subj: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory Subject: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory References: <4547DA70.4040107@Groves.net> <4547EDFD.8020407@sgi.com> <454A94A6.6040907@johngroves.net> <454AAC6B.7010406@sgi.com> In-Reply-To: <454AAC6B.7010406@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24892 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9533 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: John@Groves.net Precedence: bulk X-list: xfs 
Content-Length: 1279 Lines: 41 Vlad Apostolov wrote: > Hi John, > > I did try this on my dmapi filesystem: > > emu:/mnt/scratch1/dmapi_test # mkdir -p x/y/z > emu:/mnt/scratch1/dmapi_test # > /home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd/path_to_handle x/y > 5d1111a90e4800000e00000003000000d903400000000000 > emu:/mnt/scratch1/dmapi_test # mv x/y x/w > emu:/mnt/scratch1/dmapi_test # > /home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd/path_to_handle x/w > 5d1111a90e4800000e00000003000000d903400000000000 > emu:/mnt/scratch1/dmapi_test # > > I also tried path_to_handle with relative path to a directory it > worked fine too. When you say > dm_path_to_handle fails, what is the error returned? > > Regards, > Vlad > Vlad, This was my bad -- I need to go back to programming school ;-). The function in question was dealing in mount-point-relative paths, not full paths, and I didn't notice the distinction. Passing a full path to dm_path_to_handle fixed it. As for thinking it behaved differently for a directory than for a file -- I've been smoking a batch of bad crack ;-). Calling dm_path_to_handle also failed with relative paths to files -- I just didn't notice because it wasn't fatal on that code path. Thanks for responding and looking into it, though. 
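
The distinction described above, between a path relative to the mount point and a full path, can be sketched as follows. This is only an illustration in Python, not DMAPI code: the mount point /mnt/dmapi and the helper name full_path are made up for the example, and it assumes the caller builds an absolute path first and only then hands it to dm_path_to_handle.

```python
import posixpath


def full_path(mount_point, rel_path):
    """Turn a mount-point-relative path into an absolute path.

    Both arguments are hypothetical inputs for this sketch; nothing here
    calls into DMAPI.
    """
    # Drop any leading slash so join() cannot discard the mount point.
    joined = posixpath.join(mount_point, rel_path.lstrip("/"))
    # Collapse "." and ".." components and duplicate slashes.
    return posixpath.normpath(joined)


# "/mnt/dmapi" is an assumed mount point, "x/y" the renamed directory.
print(full_path("/mnt/dmapi", "x/y"))
```

A relative path is resolved against the current working directory of the calling process, so a long-running event handler cannot count on a mount-point-relative path resolving to anything inside the filesystem it is monitoring.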
Regards, John From owner-xfs@oss.sgi.com Thu Nov 2 20:41:05 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 02 Nov 2006 20:41:11 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA34f3aG007783 for ; Thu, 2 Nov 2006 20:41:05 -0800 X-ASG-Debug-ID: 1162525317-22063-155-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ruth.realtime.net (mercury.realtime.net [205.238.132.86]) by cuda.sgi.com (Spam Firewall) with ESMTP id EA791D1B7DD9 for ; Thu, 2 Nov 2006 19:41:57 -0800 (PST) Received: from [192.168.2.120] (cpe-70-112-81-243.austin.res.rr.com [70.112.81.243]) by realtime.net (Realtime Communications Advanced E-Mail Services V9.2) with ESMTP id 24509592-1817707 for multiple; Thu, 02 Nov 2006 19:00:20 -0600 Message-ID: <454A94A6.6040907@johngroves.net> Date: Thu, 02 Nov 2006 19:00:22 -0600 From: John Groves Reply-To: jgl@johngroves.net User-Agent: Mozilla Thunderbird 1.0.7 (Windows/20050923) X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux-xfs@oss.sgi.com CC: Vlad Apostolov , John Groves , Dean Roehrich X-ASG-Orig-Subj: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory Subject: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory References: <4547DA70.4040107@Groves.net> <4547EDFD.8020407@sgi.com> In-Reply-To: <4547EDFD.8020407@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Authenticated-User: jg@bga.com X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24892 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9535 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com 
X-original-sender: jgl@johngroves.net Precedence: bulk X-list: xfs Content-Length: 3219 Lines: 83 Thanks for your replies, Vlad, although I don't find anything so far that helps with my problem (my paths are not long, and my calls to dm_path_to_handle have been running in production environments for a couple of years). As far as I can see, dm_path_to_handle does not work on a directory (?), although it works perfectly on a file. I will try to dig deeper into this over the next few days, but here is a somewhat clearer explanation of the behavior I am seeing. I have updated to the latest kernel from SGI's CVS server, but the problem is still there. I am tracing through kernel code, and will be happy to pull together some test code that demonstrates the problem, or to post a patch if I figure it out, but this will take a few days. The sequence in which I find this problem is: 1. Receive a pre-rename event 2. Use the first handle parameter to resolve the pre-rename parent directory path (not via dm_handle_to_path -- I had to roll my own mechanisms for turning handles into paths). 3. Concatenate the first name parameter to the first parent directory path, to get the relative path from mount point to actual file being renamed. 4. Call dm_path_to_handle on that path, hoping to get the handle of the file-being-renamed. If the renamed-thing is a file, this works. If it's a directory, dm_path_to_handle fails. With my dmapi event handler installed and running, I can reproduce it by doing the following in the root directory of the filesystem: mkdir -p x/y/z mv x/y x/w In the pre-rename event, prior to responding to the event, my handler correctly determines that x/y is being renamed to x/w, but dm_path_to_handle does not return the handle of x/y. My post-rename event handler also correctly resolves the paths, but dm_path_to_handle does not return the handle of x/w. If x/y is a file (rather than a directory) it all works properly. 
Let me know if you can think of anything specific I should look at, or of a different way of getting the handle of the renamed thingy. Thanks, John Groves Vlad Apostolov wrote: > John Groves wrote: > >> I'm running up against a difficult situation because dm_path_to_handle >> does not return a handle, if the path is to a directory. Is this a >> known issue, or perhaps fixed in a recent version? Or is there >> another way get the handle of a directory by path? When any file type >> is renamed, I (for various reasons) *must* know not just the old & new >> parent handles, but also the handle of the renamed thingy. If the >> thingy is a directory, I'm stuck at the moment. >> >> My test system has dmapi 2.2.1-5, which I don't think is current, but >> I can't seem to get access to the oss.sgi.com server to check. >> >> Any advice or info appreciated. I'm willing to try and submit a >> patch, but I'd appreciate first knowing whether there was a specific >> reason or problem that led to the current behavior. >> >> Thanks, >> John Groves >> > Hi John, > > If your path is longer than 2000 characters dm_path_to_handle used to fail. > This bug was fixed in August 2006. 
Please update your tree from here: > > http://oss.sgi.com/projects/xfs/download.html > > Regards, > Vlad > > > > From owner-xfs@oss.sgi.com Fri Nov 3 01:32:59 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 03 Nov 2006 01:33:03 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA39WraG017853 for ; Fri, 3 Nov 2006 01:32:59 -0800 X-ASG-Debug-ID: 1162546326-5754-577-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from rrzmta2.rz.uni-regensburg.de (rrzmta2.rz.uni-regensburg.de [132.199.1.17]) by cuda.sgi.com (Spam Firewall) with ESMTP id 15071D1B8CD7 for ; Fri, 3 Nov 2006 01:32:06 -0800 (PST) Received: from rrzmta2.rz.uni-regensburg.de (localhost [127.0.0.1]) by localhost (Postfix) with SMTP id 960C16CED8; Fri, 3 Nov 2006 10:32:07 +0100 (CET) Received: from pc51072.physik.uni-regensburg.de (pc51072.physik.uni-regensburg.de [132.199.98.129]) by rrzmta2.rz.uni-regensburg.de (Postfix) with ESMTP id 44D196CEAD; Fri, 3 Nov 2006 10:32:07 +0100 (CET) Received: by pc51072.physik.uni-regensburg.de (Postfix, from userid 28561) id EE04F507058; Fri, 3 Nov 2006 10:32:03 +0100 (CET) Date: Fri, 3 Nov 2006 10:32:03 +0100 From: Christian Guggenberger To: Eric Sandeen , dgc@sgi.com Cc: christian.guggenberger@physik.uni-regensburg.de, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: mount failed after xfs_growfs beyond 16 TB Subject: Re: mount failed after xfs_growfs beyond 16 TB Message-ID: <20061103093203.GA18010@pc51072.physik.uni-regensburg.de> Reply-To: christian.guggenberger@physik.uni-regensburg.de References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> <454A3B28.7010405@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <454A3B28.7010405@sandeen.net> User-Agent: Mutt/1.5.9i X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 
KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24916 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9536 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.guggenberger@physik.uni-regensburg.de Precedence: bulk X-list: xfs Content-Length: 2157 Lines: 107 Eric, Dave, > > xfs_db> sb 0 > xfs_db> p > > let's see what you've got. > xfs_db: read failed: Invalid argument xfs_db: data size check failed xfs_db> sb 0 xfs_db> p magicnum = 0x58465342 blocksize = 4096 dblocks = 18446744070056148512 rblocks = 0 rextents = 0 uuid = 27d35a50-724e-440b-ae1a-79f934f7915a logstart = 2147483652 rootino = 128 rbmino = 129 rsumino = 130 rextsize = 16 agblocks = 84976608 agcount = 570 rbmblocks = 0 logblocks = 32768 versionnum = 0x30c4 sectsize = 512 inodesize = 256 inopblock = 16 fname = "\000\000\000\000\000\000\000\000\000\000\000\000" blocklog = 12 sectlog = 9 inodelog = 8 inopblog = 4 agblklog = 27 rextslog = 0 inprogress = 0 imax_pct = 25 icount = 1298880 ifree = 376826 fdblocks = 18446744067363131928 frextents = 0 uquotino = 131 gquotino = null qflags = 0x7 flags = 0 shared_vn = 0 inoalignmt = 2 unit = 0 width = 0 dirblklog = 0 logsectlog = 0 logsectsize = 0 logsunit = 0 features2 = 0 xfs_db> > Also how big does /proc/partitions think your new device is? > it thinks it's 26983133184 blocks, which seems to be correct: --- Logical volume --- LV Name /dev/data/project VG Name data LV UUID 4RIXaW-QxWj-KOr5-CysS-TmLF-Jebu-lPyPOU LV Write Access read/write LV Status available # open 1 LV Size 25.13 TB Current LE 6587679 Segments 4 Allocation inherit Read ahead sectors 0 Block device 254:1 note, the fs was first grown with (originally mounted on /data/projects) xfs_growfs -D 4294966000 /data/projects which succeeded. 
a further xfs_growfs -D 4300000000 /data/projects shut the fs down. > > found candidate secondary superblock... > > superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29 > > hmm that offset is about 9.4 terabytes. > > any kernel messages when this happens? > > rval 29 is ESPIPE / illegal seek. not that I know of, unfortunately. As Dave already stated that > 16TB is not supported on 32bits - is there any way to step back ? cheers. - Christian From owner-xfs@oss.sgi.com Fri Nov 3 04:35:26 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 03 Nov 2006 04:35:38 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kA3CZJaG012067 for ; Fri, 3 Nov 2006 04:35:24 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id XAA19682; Fri, 3 Nov 2006 23:34:25 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kA3CYM7Y23496703; Fri, 3 Nov 2006 23:34:23 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kA3CYIOe23474843; Fri, 3 Nov 2006 23:34:18 +1100 (AEDT) Date: Fri, 3 Nov 2006 23:34:18 +1100 From: David Chinner To: Christian Guggenberger Cc: Eric Sandeen , dgc@sgi.com, xfs@oss.sgi.com Subject: Re: mount failed after xfs_growfs beyond 16 TB Message-ID: <20061103123418.GP8394166@melbourne.sgi.com> References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> <454A3B28.7010405@sandeen.net> <20061103093203.GA18010@pc51072.physik.uni-regensburg.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061103093203.GA18010@pc51072.physik.uni-regensburg.de> User-Agent: Mutt/1.4.2.1i X-archive-position: 9537 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com 
Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 3181 Lines: 130 On Fri, Nov 03, 2006 at 10:32:03AM +0100, Christian Guggenberger wrote: > Eric, Dave, > > > > > xfs_db> sb 0 > > xfs_db> p > > > > let's see what you've got. > > > > xfs_db: read failed: Invalid argument > xfs_db: data size check failed > xfs_db> sb 0 > xfs_db> p > magicnum = 0x58465342 > blocksize = 4096 > dblocks = 18446744070056148512 That looks like an overflow to me ;) > fdblocks = 18446744067363131928 Free space gone kaboom too... > frextents = 0 > uquotino = 131 > gquotino = null > qflags = 0x7 > flags = 0 > shared_vn = 0 > inoalignmt = 2 > unit = 0 > width = 0 > dirblklog = 0 > logsectlog = 0 > logsectsize = 0 > logsunit = 0 > features2 = 0 > xfs_db> > > > Also how big does /proc/partitions think your new device is? > > > it thinks it's 26983133184 blocks, which seems to be correct: > > --- Logical volume --- > LV Name /dev/data/project > VG Name data > LV UUID 4RIXaW-QxWj-KOr5-CysS-TmLF-Jebu-lPyPOU > LV Write Access read/write > LV Status available > # open 1 > LV Size 25.13 TB > Current LE 6587679 > Segments 4 > Allocation inherit > Read ahead sectors 0 > Block device 254:1 > > note, the fs was first grown with (originally mounted on /data/projects) > > xfs_growfs -D 4294966000 /data/projects > which succeeded. Which is just less than 16TB: 0x1ffeffaf0000 > a further > > xfs_growfs -D 4300000000 /data/projects Which is just more than 16TB: 0x2008ccb00000 > shut the fs down. Probably corrupted metadata in the first couple of AGs... > > > found candidate secondary superblock... > > > superblock read failed, offset 10093861404672, size 2048, ag 0, rval 29 > > > > hmm that offset is about 9.4 terabytes. With a size of 25.13TiB in the LVM, 9.4TB is ~(25.13 - 16)TiB That's a 32 bit overflow as well... > As Dave already stated that > 16TB is not supported on 32bits - is there > any way to step back ? xfs_db mojo.... 
;)

Note - no guarantee this will work - practise on an expendable sparse loopback filesystem image by making a filesystem of slightly less than 16TB then growing it to corrupt it the same way and then fixing it up successfully.

Once it's corrupted, unmount and run xfs_db in expert mode.
The superblock:

blocksize = 4096
dblocks = 18446744070056148512
...
agblocks = 84976608
agcount = 570

An AG is ~43.5GB, so 570 AGs is 24.8TB. It's too big, and we will only shrink by whole AGs. Hence we have to correct agcount and dblocks.

So, 404 AGs gives:

dblocks = agblocks * agcount
        = 84976608 * 404 * 512 bytes
        = 0xFFC853B0000 bytes, which is under 16TiB
        = 4291318704 blocks

Now you need to zero fdblocks, and now you should be able to run xfs_repair to fix it up. Don't be surprised if repair runs out of memory - you'll have to hope Barry gets finished with the memory reduction work he's doing soon or get a 64 bit machine to fix that problem. A 64bit machine wouldn't have the 16TB limit, either ;)

Good luck....

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Nov 3 06:55:34 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 03 Nov 2006 06:55:41 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA3EtWaG028004 for ; Fri, 3 Nov 2006 06:55:34 -0800 X-ASG-Debug-ID: 1162565684-4894-99-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id D9ECCD1B33AE for ; Fri, 3 Nov 2006 06:54:44 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id C20F818022E2F; Fri, 3 Nov 2006 08:54:43 -0600 (CST) Message-ID: <454B5833.9030008@sandeen.net> Date: Fri, 03 Nov 2006 08:54:43 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.7 (Macintosh/20060909) MIME-Version: 1.0 To: David Chinner CC: Christian Guggenberger , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: mount failed after xfs_growfs beyond 16 TB Subject: Re: mount failed after xfs_growfs beyond 16 TB References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> <20061103004142.GI8394166@melbourne.sgi.com> In-Reply-To: <20061103004142.GI8394166@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24940 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9538 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: 
sandeen@sandeen.net
Precedence: bulk
X-list: xfs
Content-Length: 1319
Lines: 36

David Chinner wrote:
> On Thu, Nov 02, 2006 at 06:26:08PM +0100, Christian Guggenberger wrote:
>> Hi,
>>
>> a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on
>> top of lvm2 to 17TB. (I am not even sure if that's supposed to work with
>> linux-2.6, 32bit)
>
> Not supported - any metadata access past 16TB will wrap the 32 bit page cache
> index for the metadata address space and you'll corrupt the filesystem.

Ohhhh right. I've been in x86_64 land for too long, sorry for the earlier false assertion.... :(

xfs guys, if it's not there already (and I don't see it from a quick look..) growfs -really- should refuse (in the kernel) to grow a filesystem past 16T on a 32-bit machine, just as we refuse to mount one.

something like this in xfs_growfs_data_private:

#if XFS_BIG_BLKNOS     /* Limited by ULONG_MAX of page cache index */
        if (unlikely((nb >> (PAGE_SHIFT - sbp->sb_blocklog)) > ULONG_MAX)) {
#else                  /* Limited by UINT_MAX of sectors */
        if (unlikely((nb << (sbp->sb_blocklog - BBSHIFT)) > UINT_MAX)) {
#endif
                cmn_err(CE_WARN,
                        "new filesystem size too large for this system.");
                return XFS_ERROR(E2BIG);
        }

and something similar in xfs_growfs_rt ?
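
To see what the proposed check would do with the two sizes from this thread, here is a rough userspace model in Python. It is a sketch, not kernel code: the function name growfs_size_ok is made up, and the constants assume a 32-bit kernel with 4 KiB pages (PAGE_SHIFT = 12, ULONG_MAX = 2^32 - 1) and 512-byte basic blocks (BBSHIFT = 9).

```python
# Assumed constants for a 32-bit kernel with 4 KiB pages.
PAGE_SHIFT = 12             # log2 of the page size
BBSHIFT = 9                 # log2 of the 512-byte basic block size
ULONG_MAX_32 = 2**32 - 1    # unsigned long on a 32-bit machine
UINT_MAX = 2**32 - 1


def growfs_size_ok(nb, sb_blocklog, big_blknos=True):
    """Model of the proposed limit check.

    nb is the requested size in filesystem blocks, sb_blocklog is log2 of
    the filesystem block size. Assumes sb_blocklog is between BBSHIFT and
    PAGE_SHIFT, so neither shift count is negative.
    """
    if big_blknos:
        # Limited by the page cache index: count pages, compare to ULONG_MAX.
        return (nb >> (PAGE_SHIFT - sb_blocklog)) <= ULONG_MAX_32
    # Limited by the sector count: count 512-byte sectors.
    return (nb << (sb_blocklog - BBSHIFT)) <= UINT_MAX


# The two xfs_growfs -D values from this thread, with 4096-byte blocks.
for nb in (4294966000, 4300000000):
    print(nb, "allowed" if growfs_size_ok(nb, 12) else "rejected")
```

Under these assumptions the first size passes and the second is rejected, which matches the observation that the first grow succeeded and the second one shut the filesystem down.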
-Eric From owner-xfs@oss.sgi.com Fri Nov 3 06:58:48 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 03 Nov 2006 06:58:55 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA3EwiaG028538 for ; Fri, 3 Nov 2006 06:58:48 -0800 X-ASG-Debug-ID: 1162565878-10765-48-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ruth.realtime.net (mercury.realtime.net [205.238.132.86]) by cuda.sgi.com (Spam Firewall) with ESMTP id BE293D1AFBEA for ; Fri, 3 Nov 2006 06:57:58 -0800 (PST) Received: from [192.168.2.120] (cpe-70-112-81-243.austin.res.rr.com [70.112.81.243]) by realtime.net (Realtime Communications Advanced E-Mail Services V9.2) with ESMTP id 24628936-1817707 for multiple; Fri, 03 Nov 2006 08:57:36 -0600 Message-ID: <454B58E4.6000802@johngroves.net> Date: Fri, 03 Nov 2006 08:57:40 -0600 From: John Groves Reply-To: jgl@johngroves.net User-Agent: Mozilla Thunderbird 1.0.7 (Windows/20050923) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vlad Apostolov CC: John Groves , linux-xfs@oss.sgi.com, Dean Roehrich X-ASG-Orig-Subj: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory Subject: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory References: <4547DA70.4040107@Groves.net> <4547EDFD.8020407@sgi.com> <454A94A6.6040907@johngroves.net> <454AAC6B.7010406@sgi.com> <454AAF31.8050104@Groves.net> <454AB0A0.7050309@sgi.com> In-Reply-To: <454AB0A0.7050309@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Authenticated-User: jg@bga.com X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24940 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 
X-archive-position: 9539 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jgl@johngroves.net Precedence: bulk X-list: xfs Content-Length: 452 Lines: 18 Vlad Apostolov wrote: > Just a note that dm_path_to_handle works fine with relative paths on my > machine. In your case, could it be applying the "current working directory" from the process context to resolve a full path? Mine is a daemon, and the relative paths are not valid relative to the "cwd" in which the daemon was started. ...just a thought. Otherwise, for the moment I may have to just accept it as weird... Thanks again, John From owner-xfs@oss.sgi.com Fri Nov 3 07:45:40 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 03 Nov 2006 07:45:46 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA3FjaaG001958 for ; Fri, 3 Nov 2006 07:45:40 -0800 X-ASG-Debug-ID: 1162568690-11424-544-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from rrzmta2.rz.uni-regensburg.de (rrzmta2.rz.uni-regensburg.de [132.199.1.17]) by cuda.sgi.com (Spam Firewall) with ESMTP id 835F34ED2C1 for ; Fri, 3 Nov 2006 07:44:50 -0800 (PST) Received: from rrzmta2.rz.uni-regensburg.de (localhost [127.0.0.1]) by localhost (Postfix) with SMTP id EC4A56CEB1; Fri, 3 Nov 2006 16:44:54 +0100 (CET) Received: from pc51072.physik.uni-regensburg.de (pc51072.physik.uni-regensburg.de [132.199.98.129]) by rrzmta2.rz.uni-regensburg.de (Postfix) with ESMTP id 9868069AEC; Fri, 3 Nov 2006 16:44:54 +0100 (CET) Received: by pc51072.physik.uni-regensburg.de (Postfix, from userid 28561) id 4CDF4507058; Fri, 3 Nov 2006 16:44:48 +0100 (CET) Date: Fri, 3 Nov 2006 16:44:48 +0100 From: Christian Guggenberger To: David Chinner Cc: Christian Guggenberger , Eric Sandeen , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: mount failed after xfs_growfs beyond 16 TB Subject: Re: mount failed after xfs_growfs beyond 16 
TB Message-ID: <20061103154448.GA26647@pc51072.physik.uni-regensburg.de> Reply-To: christian.guggenberger@physik.uni-regensburg.de References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> <454A3B28.7010405@sandeen.net> <20061103093203.GA18010@pc51072.physik.uni-regensburg.de> <20061103123418.GP8394166@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061103123418.GP8394166@melbourne.sgi.com> User-Agent: Mutt/1.5.9i X-Barracuda-Spam-Score: 0.50 X-Barracuda-Spam-Status: No, SCORE=0.50 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=BSF_RULE7568M X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24942 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.50 BSF_RULE7568M BODY: Custom Rule 7568M X-archive-position: 9540 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.guggenberger@physik.uni-regensburg.de Precedence: bulk X-list: xfs Content-Length: 1753 Lines: 84 > > xfs_db mojo.... ;) > > Note - no guarantee this will work - practise on an expendable > sparse loopback filesystem image by making a filesystem of slightly less > than 16TB then growing it to corrupt it the same way and then fixing it up > successfully. > > Once it's corrupted, unmount and run xfs_db in expert mode. > The superblock: > > blocksize = 4096 > dblocks = 18446744070056148512 > ... > agblocks = 84976608 > agcount = 570 > > An AG is ~43.5GB, so 570 AGs is 24.8TB. It's too big, and > we will only shrink by whole AGs. Hence we have to correct > agcount and dblocks. Isn't the AG size 'agblocks * blocksize' == ~324 GB here ? Got further input on a secondary superblock from the colleague: looks more reasonable, I'd say. Is there a way to manually recover sb0 from sb1 ?
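(A quick check of the arithmetic behind that question, as a sketch only: the constants are copied from the superblock values quoted above, and the variable names are made up for illustration. Multiplying agblocks by the 4096-byte block size gives roughly 324 GiB per AG, whereas multiplying by 512 bytes happens to reproduce the ~43.5GB and 24.8TB figures.)

```python
# Values copied from the superblock output quoted above.
BLOCKSIZE = 4096        # bytes per filesystem block
AGBLOCKS = 84976608     # filesystem blocks per allocation group (AG)
AGCOUNT = 570           # agcount reported by the damaged primary superblock

GIB = 2 ** 30           # binary gigabyte
GB = 10 ** 9            # decimal gigabyte
TB = 10 ** 12           # decimal terabyte

# AG size using the real block size.
ag_bytes = AGBLOCKS * BLOCKSIZE
print(f"AG size with 4096-byte blocks: {ag_bytes / GIB:.1f} GiB")

# The same block count multiplied by 512 bytes instead (assumption: this is
# how the smaller figures in the quoted text may have been derived).
ag_bytes_512 = AGBLOCKS * 512
print(f"AG size with 512-byte units:   {ag_bytes_512 / GB:.1f} GB")
print(f"{AGCOUNT} such AGs:                 {AGCOUNT * ag_bytes_512 / TB:.1f} TB")
```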
(btw, I still hope they get access to an 64bit system with recent xfsprogs and kernel, soon) xfs_db: read failed: Invalid argument xfs_db: data size check failed xfs_db> sb 1 xfs_db> p magicnum = 0x58465342 blocksize = 4096 dblocks = 4294966000 rblocks = 0 rextents = 0 uuid = 27d35a50-724e-440b-ae1a-79f934f7915a logstart = 2147483652 rootino = 128 rbmino = 129 rsumino = 130 rextsize = 16 agblocks = 84976608 agcount = 51 rbmblocks = 0 logblocks = 32768 versionnum = 0x30c4 sectsize = 512 inodesize = 256 inopblock = 16 fname = "\000\000\000\000\000\000\000\000\000\000\000\000" blocklog = 12 sectlog = 9 inodelog = 8 inopblog = 4 agblklog = 27 rextslog = 0 inprogress = 0 imax_pct = 25 icount = 1298880 ifree = 376828 fdblocks = 1601952378 frextents = 0 uquotino = 131 gquotino = null qflags = 0x7 flags = 0 shared_vn = 0 inoalignmt = 2 unit = 0 width = 0 dirblklog = 0 logsectlog = 0 logsectsize = 0 logsunit = 0 features2 = 0 cheers. - Christian From owner-xfs@oss.sgi.com Fri Nov 3 07:55:39 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 03 Nov 2006 07:55:47 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA3FtdaG003056 for ; Fri, 3 Nov 2006 07:55:39 -0800 X-ASG-Debug-ID: 1162569293-21532-120-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id 486F4D1BD465 for ; Fri, 3 Nov 2006 07:54:53 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 3B4DA18BCD2B7; Fri, 3 Nov 2006 09:54:52 -0600 (CST) Message-ID: <454B664B.6030701@sandeen.net> Date: Fri, 03 Nov 2006 09:54:51 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.7 (Macintosh/20060909) MIME-Version: 1.0 To: christian.guggenberger@physik.uni-regensburg.de CC: David 
Chinner , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: mount failed after xfs_growfs beyond 16 TB Subject: Re: mount failed after xfs_growfs beyond 16 TB References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> <454A3B28.7010405@sandeen.net> <20061103093203.GA18010@pc51072.physik.uni-regensburg.de> <20061103123418.GP8394166@melbourne.sgi.com> <20061103154448.GA26647@pc51072.physik.uni-regensburg.de> In-Reply-To: <20061103154448.GA26647@pc51072.physik.uni-regensburg.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24944 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9541 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 933 Lines: 31 Christian Guggenberger wrote: >> xfs_db mojo.... ;) >> >> Note - no guarantee this will work - practise on an expendable >> sparse loopback filesystem image by making a filesystem of slightly less >> than 16TB then growing it to corrupt it the same way and then fixing it up >> successfully. >> >> Once it's corrupted, unmount and run xfs_db in expert mode. >> The superblock: >> >> blocksize = 4096 >> dblocks = 18446744070056148512 >> ... >> agblocks = 84976608 >> agcount = 570 >> >> An AG is ~43.5GB, so 570 AGs is 24.8TB. It's too big, and >> we will only shrink by whole AGs. Hence we have to correct >> agcount and dblocks. > > Isn't the AG size 'agblocks * blocksize' == ~324 GB here ? > > Got further input on a secondary superblock from the colleague: > looks more reasonable, I'd say. Is there a way to manually recover sb0 > from sb1 ?
you can copy it over field-by-field.... not sure if there's an easier way. -Eric From owner-xfs@oss.sgi.com Fri Nov 3 16:18:15 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 03 Nov 2006 16:18:18 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA40IDaG020038 for ; Fri, 3 Nov 2006 16:18:15 -0800 X-ASG-Debug-ID: 1162599447-4960-471-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.gmx.net (mail.gmx.net [213.165.64.20]) by cuda.sgi.com (Spam Firewall) with SMTP id 048A7D1B5B61 for ; Fri, 3 Nov 2006 16:17:27 -0800 (PST) Received: (qmail invoked by alias); 04 Nov 2006 00:17:25 -0000 Received: from port-212-202-77-183.dynamic.qsc.de (EHLO clx) [212.202.77.183] by mail.gmx.net (mp043) with SMTP; 04 Nov 2006 01:17:25 +0100 X-Authenticated: #20522298 From: peyytmek@gmx.de To: Timothy Shimmin X-ASG-Orig-Subj: Re: Xfs-mailinglist question (xfs mounting problem, hdb1 just freezes) Subject: Re: Xfs-mailinglist question (xfs mounting problem, hdb1 just freezes) Date: Sat, 4 Nov 2006 02:14:29 +0000 User-Agent: KMail/1.9.1 References: <200611012255.46008.peyytmek@gmx.de> <3B2B6490C980DD2C8B4C9645@timothy-shimmins-power-mac-g5.local> In-Reply-To: <3B2B6490C980DD2C8B4C9645@timothy-shimmins-power-mac-g5.local> Cc: xfs@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Disposition: inline Message-Id: <200611040214.29997.peyytmek@gmx.de> X-Y-GMX-Trusted: 0 X-Barracuda-Spam-Score: 0.55 X-Barracuda-Spam-Status: No, SCORE=0.55 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=NO_REAL_NAME X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.24976 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.55 NO_REAL_NAME From: does not include a real name Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 
8bit by oss.sgi.com id kA40IFaG020042 X-archive-position: 9545 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: peyytmek@gmx.de Precedence: bulk X-list: xfs Content-Length: 5424 Lines: 160 On Friday 03 November 2006 00:58, you wrote: Hello again Thanks for your email First of all, everytime I do xfs_logprint /dev/hdb1, with -ti or -tibo or mounting the partition with a kernel pre May 26 (kernel-2.16.3-gentoo-r3) or even xfs_repair -L /dev/hdb1 it's always the same. my terminal just freezes There is also nothing in /var/log/messages Well, I'm going to try my hdd in another computer with a different ide-controller. maybe that's the problem althought i doubt it would really work that way out. Bye, D.H. > Hi there, > > Sorry about not getting back to you. > > The inode details you sent me look reasonable to me (AFAICS). > However, the inode that we are processing which has problems > is probably coming from the log. > Basically, during log replay we reinstate metadata such as inodes, > inode buffers, > and later process an unlinked-list which this inode appears to be on. > It's during the truncating (as part of inactivating the inode) that we have > problems when looking at its extents. > > You should be able to see this inode in the log if you use: > # xfs_logprint -ti device > (or possibly I guess > # xfs_logrpint -tibo device - if the inode is in a buffer of inodes) > I'd be curious to see this. 
> > My thoughts for a plan of action were either: > (1) forget the log and do an "xfs_repair -L" to zero it out and repair > or > (2) somehow try to do a mount which avoids the unlinked list processing > (an older kernel (pre May 26) would do this because of a bug in recovery); > then unmount, then do a normal "xfs_repair" > or > (3) we try to stop this particular inode from being processed in the > unlinked-list inode processing during mount; unmount; repair and then mount > again > > The simplest is to do option (1). However, it means that any other > outstanding metadata changes would be lost. > Option (2) might be a goer but you need such a kernel and in that case it > won't do any unlinked processing for a group of such inodes (one's hashed > to that bucket in the AGI's). I'm unsure on how to stop this unlinked > processing otherwise. > Option (3) I'm unsure as how to do. Also, there could be more inodes in > the same boat which could cause recovery to crash. > > May be others have suggestions. > > Regards, > Tim. > > --On 1 November 2006 10:55:43 PM +0000 peyytmek@gmx.de wrote: > > Hello, > > Thanks for your last answer. Since you didn't answer my email for 2 weeks > > i thought you might have deleted it accidently > > > > Here's the email conversation: > >> > Hello. > >> > Thanks for your answer. > >> > > >> > That's what i have: dmesg print with kernel-2.6.16-gentoo-r3 and an > >> > print of xfs_bg. > >> > > >> >> You could print out the offending inode with xfs_db to show us > >> >> what it looks like: $xfs_db -r /dev/hdb1 -c "inode 950759" -c > >> >> "print". > >> > > >> > I don't know what you mean with it but i added it anyway. 
(done with > >> > kernel-2.6.18-gentoo if it matters) > >> > > >> > xfs_db: > >> > > >> > CLX ~ # xfs_db -r /dev/hdb1 -c "inode 950759" -c "print" > >> > core.magic = 0x494e > >> > core.mode = 0100644 > >> > core.version = 1 > >> > core.format = 3 (btree) > >> > core.nlinkv1 = 0 > >> > core.uid = 1000 > >> > core.gid = 100 > >> > core.flushiter = 0 > >> > core.atime.sec = Sun Aug 27 14:56:52 2006 > >> > core.atime.nsec = 657389250 > >> > core.mtime.sec = Sun Aug 27 16:29:40 2006 > >> > core.mtime.nsec = 080196250 > >> > core.ctime.sec = Thu Oct  5 01:17:40 2006 > >> > core.ctime.nsec = 976565958 > >> > core.size = 32071862 > >> > core.nblocks = 7833 > >> > core.extsize = 0 > >> > core.nextents = 28 > >> > core.naextents = 0 > >> > core.forkoff = 0 > >> > core.aformat = 2 (extents) > >> > core.dmevmask = 0 > >> > core.dmstate = 0 > >> > core.newrtbm = 0 > >> > core.prealloc = 0 > >> > core.realtime = 0 > >> > core.immutable = 0 > >> > core.append = 0 > >> > core.sync = 0 > >> > core.noatime = 0 > >> > core.nodump = 0 > >> > core.rtinherit = 0 > >> > core.projinherit = 0 > >> > core.nosymlinks = 0 > >> > core.extsz = 0 > >> > core.extszinherit = 0 > >> > core.gen = 0 > >> > next_unlinked = null > >> > u.bmbt.level = 1 > >> > u.bmbt.numrecs = 1 > >> > u.bmbt.keys[1] = [startoff] 1:[0] > >> > u.bmbt.ptrs[1] = 1:185933 > >> > >> And now: > >> > >> xfs_db -r /dev/hadb1 -c "fsb 185933" -c "type bmapbtd" -c "p" > >> > >> to look at the 28 extent records. > >> > >> --Tim > > > > Here's the content of my last Email > > > > > > Hello, thanks again for your fast answer > > Sorry for the double post last time. 
> > here it comes > > > > > > CLX ~ # xfs_db -r /dev/hdb1 -c "fsb 185933" -c "type bmapbtd" -c "p" > > magic = 0x424d4150 > > level = 0 > > numrecs = 27 > > leftsib = null > > rightsib = null > > recs[1-27] = [startoff,startblock,blockcount,extentflag] > > 1:[0,185637,16,0] 2: [16,185537,8,0] 3:[24,185718,8,0] 4:[32,185706,8,0] > > 5:[40,185836,8,0] 6: [48,185848,16,0] 7:[64,185865,16,0] > > 8:[80,185882,8,0] 9:[96,185899,16,0] 10: [112,185916,16,0] > > 11:[340,185934,2,0] 12:[342,4768704,1320,0] 13: [1662,4770389,239,0] > > 14:[1901,4770919,264,0] 15:[2165,4771391,165,0] 16: [2330,4771860,227,0] > > 17:[2557,4861204,351,0] 18:[2908,4861800,257,0] 19: [3165,4862282,349,0] > > 20:[3514,4862934,230,0] 21:[3744,4863506,383,0] 22: [4127,4864141,348,0] > > 23:[4475,4864871,228,0] 24:[4703,4865358,268,0] 25: [4971,4865882,593,0] > > 26:[5564,4866818,339,0] 27:[5903,4867729,1928,0] From owner-xfs@oss.sgi.com Sat Nov 4 16:49:42 2006 Received: with ECARTIS (v1.0.0; list xfs); Sat, 04 Nov 2006 16:49:44 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA50nfaG023799 for ; Sat, 4 Nov 2006 16:49:42 -0800 X-ASG-Debug-ID: 1162687733-23291-601-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mail.gmx.net (mail.gmx.net [213.165.64.20]) by cuda.sgi.com (Spam Firewall) with SMTP id 7655A5075F7 for ; Sat, 4 Nov 2006 16:48:53 -0800 (PST) Received: (qmail invoked by alias); 05 Nov 2006 00:48:51 -0000 Received: from port-212-202-77-183.dynamic.qsc.de (EHLO clx) [212.202.77.183] by mail.gmx.net (mp019) with SMTP; 05 Nov 2006 01:48:51 +0100 X-Authenticated: #20522298 From: peyytmek@gmx.de To: Timothy Shimmin X-ASG-Orig-Subj: Re: Xfs-mailinglist question (xfs mounting problem, hdb1 just freezes) Subject: Re: Xfs-mailinglist question (xfs mounting problem, hdb1 just freezes) Date: Sun, 5 Nov 2006 00:48:43 +0000 User-Agent: KMail/1.9.1 References: <200611012255.46008.peyytmek@gmx.de> 
<3B2B6490C980DD2C8B4C9645@timothy-shimmins-power-mac-g5.local> <200611040214.29997.peyytmek@gmx.de> In-Reply-To: <200611040214.29997.peyytmek@gmx.de> Cc: xfs@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Disposition: inline Message-Id: <200611050048.43868.peyytmek@gmx.de> X-Y-GMX-Trusted: 0 X-Barracuda-Spam-Score: 0.55 X-Barracuda-Spam-Status: No, SCORE=0.55 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=NO_REAL_NAME X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25074 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.55 NO_REAL_NAME From: does not include a real name Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id kA50ngaG023805 X-archive-position: 9547 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: peyytmek@gmx.de Precedence: bulk X-list: xfs Content-Length: 6059 Lines: 178 On Saturday 04 November 2006 02:14, you wrote: Hi Well. I kinda managed to recover my data I used another computer for it and just plugged and xfs_repair /dev/hdb1 now it works just fine. Althought I have no idea what caused all the freezing if i accessed /dev/hdb1 on my server Thanks for help. Bye D.H. > On Friday 03 November 2006 00:58, you wrote: > > Hello again > Thanks for your email > > First of all, everytime I do xfs_logprint /dev/hdb1, with -ti or -tibo > or mounting the partition with a kernel pre May 26 > (kernel-2.16.3-gentoo-r3) or even xfs_repair -L /dev/hdb1 it's always the > same. my terminal just freezes > > There is also nothing in /var/log/messages > > Well, I'm going to try my hdd in another computer with a different > ide-controller. maybe that's the problem althought i doubt it would really > work that way out. > > Bye, > D.H. > > > Hi there, > > > > Sorry about not getting back to you. 
> > > > The inode details you sent me look reasonable to me (AFAICS). > > However, the inode that we are processing which has problems > > is probably coming from the log. > > Basically, during log replay we reinstate metadata such as inodes, > > inode buffers, > > and later process an unlinked-list which this inode appears to be on. > > It's during the truncating (as part of inactivating the inode) that we > > have problems when looking at its extents. > > > > You should be able to see this inode in the log if you use: > > # xfs_logprint -ti device > > (or possibly I guess > > # xfs_logrpint -tibo device - if the inode is in a buffer of inodes) > > I'd be curious to see this. > > > > My thoughts for a plan of action were either: > > (1) forget the log and do an "xfs_repair -L" to zero it out and repair > > or > > (2) somehow try to do a mount which avoids the unlinked list processing > > (an older kernel (pre May 26) would do this because of a bug in > > recovery); then unmount, then do a normal "xfs_repair" > > or > > (3) we try to stop this particular inode from being processed in the > > unlinked-list inode processing during mount; unmount; repair and then > > mount again > > > > The simplest is to do option (1). However, it means that any other > > outstanding metadata changes would be lost. > > Option (2) might be a goer but you need such a kernel and in that case it > > won't do any unlinked processing for a group of such inodes (one's hashed > > to that bucket in the AGI's). I'm unsure on how to stop this unlinked > > processing otherwise. > > Option (3) I'm unsure as how to do. Also, there could be more inodes in > > the same boat which could cause recovery to crash. > > > > May be others have suggestions. > > > > Regards, > > Tim. > > > > --On 1 November 2006 10:55:43 PM +0000 peyytmek@gmx.de wrote: > > > Hello, > > > Thanks for your last answer. 
Since you didn't answer my email for 2 > > > weeks i thought you might have deleted it accidently > > > > > > Here's the email conversation: > > >> > Hello. > > >> > Thanks for your answer. > > >> > > > >> > That's what i have: dmesg print with kernel-2.6.16-gentoo-r3 and an > > >> > print of xfs_bg. > > >> > > > >> >> You could print out the offending inode with xfs_db to show us > > >> >> what it looks like: $xfs_db -r /dev/hdb1 -c "inode 950759" -c > > >> >> "print". > > >> > > > >> > I don't know what you mean with it but i added it anyway. (done with > > >> > kernel-2.6.18-gentoo if it matters) > > >> > > > >> > xfs_db: > > >> > > > >> > CLX ~ # xfs_db -r /dev/hdb1 -c "inode 950759" -c "print" > > >> > core.magic = 0x494e > > >> > core.mode = 0100644 > > >> > core.version = 1 > > >> > core.format = 3 (btree) > > >> > core.nlinkv1 = 0 > > >> > core.uid = 1000 > > >> > core.gid = 100 > > >> > core.flushiter = 0 > > >> > core.atime.sec = Sun Aug 27 14:56:52 2006 > > >> > core.atime.nsec = 657389250 > > >> > core.mtime.sec = Sun Aug 27 16:29:40 2006 > > >> > core.mtime.nsec = 080196250 > > >> > core.ctime.sec = Thu Oct  5 01:17:40 2006 > > >> > core.ctime.nsec = 976565958 > > >> > core.size = 32071862 > > >> > core.nblocks = 7833 > > >> > core.extsize = 0 > > >> > core.nextents = 28 > > >> > core.naextents = 0 > > >> > core.forkoff = 0 > > >> > core.aformat = 2 (extents) > > >> > core.dmevmask = 0 > > >> > core.dmstate = 0 > > >> > core.newrtbm = 0 > > >> > core.prealloc = 0 > > >> > core.realtime = 0 > > >> > core.immutable = 0 > > >> > core.append = 0 > > >> > core.sync = 0 > > >> > core.noatime = 0 > > >> > core.nodump = 0 > > >> > core.rtinherit = 0 > > >> > core.projinherit = 0 > > >> > core.nosymlinks = 0 > > >> > core.extsz = 0 > > >> > core.extszinherit = 0 > > >> > core.gen = 0 > > >> > next_unlinked = null > > >> > u.bmbt.level = 1 > > >> > u.bmbt.numrecs = 1 > > >> > u.bmbt.keys[1] = [startoff] 1:[0] > > >> > u.bmbt.ptrs[1] = 1:185933 > > >> > > >> And 
now: > > >> > > >> xfs_db -r /dev/hadb1 -c "fsb 185933" -c "type bmapbtd" -c "p" > > >> > > >> to look at the 28 extent records. > > >> > > >> --Tim > > > > > > Here's the content of my last Email > > > > > > > > > Hello, thanks again for your fast answer > > > Sorry for the double post last time. > > > here it comes > > > > > > > > > CLX ~ # xfs_db -r /dev/hdb1 -c "fsb 185933" -c "type bmapbtd" -c "p" > > > magic = 0x424d4150 > > > level = 0 > > > numrecs = 27 > > > leftsib = null > > > rightsib = null > > > recs[1-27] = [startoff,startblock,blockcount,extentflag] > > > 1:[0,185637,16,0] 2: [16,185537,8,0] 3:[24,185718,8,0] > > > 4:[32,185706,8,0] 5:[40,185836,8,0] 6: [48,185848,16,0] > > > 7:[64,185865,16,0] > > > 8:[80,185882,8,0] 9:[96,185899,16,0] 10: [112,185916,16,0] > > > 11:[340,185934,2,0] 12:[342,4768704,1320,0] 13: [1662,4770389,239,0] > > > 14:[1901,4770919,264,0] 15:[2165,4771391,165,0] 16: > > > [2330,4771860,227,0] 17:[2557,4861204,351,0] 18:[2908,4861800,257,0] > > > 19: [3165,4862282,349,0] 20:[3514,4862934,230,0] > > > 21:[3744,4863506,383,0] 22: [4127,4864141,348,0] > > > 23:[4475,4864871,228,0] 24:[4703,4865358,268,0] 25: > > > [4971,4865882,593,0] 26:[5564,4866818,339,0] 27:[5903,4867729,1928,0] From owner-xfs@oss.sgi.com Sun Nov 5 14:38:04 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 05 Nov 2006 14:38:08 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kA5Mc0aG007297 for ; Sun, 5 Nov 2006 14:38:03 -0800 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA20766; Mon, 6 Nov 2006 09:37:05 +1100 Message-ID: <454E6795.1040900@sgi.com> Date: Mon, 06 Nov 2006 09:37:09 +1100 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.7 (X11/20060909) MIME-Version: 1.0 To: jgl@johngroves.net CC: John Groves , linux-xfs@oss.sgi.com, Dean Roehrich 
Subject: Re: XFS dmapi: dm_path_to_handle fails if the path is a directory References: <4547DA70.4040107@Groves.net> <4547EDFD.8020407@sgi.com> <454A94A6.6040907@johngroves.net> <454AAC6B.7010406@sgi.com> <454AAF31.8050104@Groves.net> <454AB0A0.7050309@sgi.com> <454B58E4.6000802@johngroves.net> In-Reply-To: <454B58E4.6000802@johngroves.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9549 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 1374 Lines: 44 John Groves wrote: > > > Vlad Apostolov wrote: > >> Just a note that dm_path_to_handle works fine with relative paths on >> my machine. > > In your case, could it be applying the "current working directory" > from the process context to resolve a full path? Mine is a daemon, > and the relative paths are not valid relative to the "cwd" in which > the daemon was started. > > ...just a thought. Otherwise, for the moment I may have to just > accept it as weird... > > Thanks again, > John I am using xfs-cmds/xfstests/dmapi/src/suite1/cmd/path_to_handle and the relative path passed as an argument is directly given to dm_path_to_handle(). The current working directory I can't really explain why it doesn't work in your case. 
Here is an example of a directory path to handle I get: emu:/home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd # ./path_to_handle /mnt/scratch1/dmapi 5d1111a90e4800000e0000006e0000008300000000000000 emu:/home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd # ./path_to_handle ../../../../../../../../../mnt/scratch1/dmapi 5d1111a90e4800000e0000006e0000008300000000000000 emu:/home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd # cd / emu:/ # /home/vapo/isms/xfs-cmds/xfstests/dmapi/src/suite1/cmd/path_to_handle ../../../../../../../../../mnt/scratch1/dmapi 5d1111a90e4800000e0000006e0000008300000000000000 Regards, Vlad From owner-xfs@oss.sgi.com Sun Nov 5 17:15:09 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 05 Nov 2006 17:15:12 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kA61F6aG024803 for ; Sun, 5 Nov 2006 17:15:08 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA24098; Mon, 6 Nov 2006 12:14:06 +1100 Date: Mon, 06 Nov 2006 11:15:22 +1000 From: Timothy Shimmin To: Eric Sandeen cc: David Chinner , Christian Guggenberger , xfs@oss.sgi.com Subject: Re: mount failed after xfs_growfs beyond 16 TB Message-ID: In-Reply-To: <454B5833.9030008@sandeen.net> References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> <20061103004142.GI8394166@melbourne.sgi.com> <454B5833.9030008@sandeen.net> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9552 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1664 Lines: 48 Good idea, Eric. I've created a pv. 
I noticed this was taken from xfs_mount_validate_sb() for the dblocks test. I guess it would be nice to abstract this test in a macro for use in multiple places. Cheers, Tim. --On 3 November 2006 8:54:43 AM -0600 Eric Sandeen wrote: > David Chinner wrote: >> On Thu, Nov 02, 2006 at 06:26:08PM +0100, Christian Guggenberger wrote: >>> Hi, >>> >>> a colleague recently tried to grow a 16 TB filesystem (x86, 32bit) on >>> top of lvm2 to 17TB. (I am not even sure if that's supposed to work with >>> linux-2.6, 32bit) >> >> Not supported - any metadata access past 16TB will wrap the 32 bit page cache >> index for the metadata address space and you'll corrupt the filesystem. > > > Ohhhh right. I've been in x86_64 land for too long, sorry for the earlier false assertion.... :( > > xfs guys, if it's not there already (and I don't see it from a quick look..) growfs -really- > should refuse (in the kernel) to grow a filesystem past 16T on a 32-bit machine, just as we > refuse to mount one. Something like this in xfs_growfs_data_private: > ># if XFS_BIG_BLKNOS /* Limited by ULONG_MAX of page cache index */ > if (unlikely( > (nb >> (PAGE_SHIFT - sbp->sb_blocklog)) > ULONG_MAX)) { ># else /* Limited by UINT_MAX of sectors */ > if (unlikely( > (nb << (sbp->sb_blocklog - BBSHIFT)) > UINT_MAX)) { ># endif > cmn_err(CE_WARN, > "new filesystem size too large for this system."); > return XFS_ERROR(E2BIG); > } > > and something similar in xfs_growfs_rt ?
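(To put numbers on the 16T limit, here is a rough Python model of the XFS_BIG_BLKNOS branch of the check above. It is only an illustration, not kernel code: growfs_size_ok is a made-up name, and the 4 KiB page size (PAGE_SHIFT = 12) and the 32-bit ULONG_MAX are assumptions for an x86 32-bit machine.)

```python
PAGE_SHIFT = 12             # assumed: 4 KiB pages
ULONG_MAX_32 = 2**32 - 1    # assumed: unsigned long is 32 bits wide

def growfs_size_ok(nb, blocklog, page_shift=PAGE_SHIFT, ulong_max=ULONG_MAX_32):
    """Return True if a filesystem of nb blocks still fits the page cache index.

    Mirrors the shape of: (nb >> (PAGE_SHIFT - sb_blocklog)) > ULONG_MAX
    Assumes blocklog <= page_shift (block size not larger than the page size).
    """
    return (nb >> (page_shift - blocklog)) <= ulong_max

if __name__ == "__main__":
    blocklog = 12           # 4096-byte filesystem blocks
    tib = 2**40
    for size_tib in (15, 16, 17):
        nb = (size_tib * tib) >> blocklog   # size expressed in filesystem blocks
        verdict = "ok" if growfs_size_ok(nb, blocklog) else "too large"
        print(f"{size_tib} TiB -> {verdict}")
```

With these assumptions the cut-off sits at 2^32 pages of 4 KiB each, which is where the 16T figure comes from.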
> > -Eric > From owner-xfs@oss.sgi.com Sun Nov 5 19:26:01 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 05 Nov 2006 19:26:05 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA63Q0aG001490 for ; Sun, 5 Nov 2006 19:26:01 -0800 X-ASG-Debug-ID: 1162783514-12542-499-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id C6FA7D1BD719 for ; Sun, 5 Nov 2006 19:25:14 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id F037B18BCD2B7; Sun, 5 Nov 2006 21:25:13 -0600 (CST) Message-ID: <454EAB19.6060901@sandeen.net> Date: Sun, 05 Nov 2006 21:25:13 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.7 (Macintosh/20060909) MIME-Version: 1.0 To: Timothy Shimmin CC: David Chinner , Christian Guggenberger , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: mount failed after xfs_growfs beyond 16 TB Subject: Re: mount failed after xfs_growfs beyond 16 TB References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de> <20061103004142.GI8394166@melbourne.sgi.com> <454B5833.9030008@sandeen.net> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25180 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9553 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 
408 Lines: 20 Timothy Shimmin wrote: > Good idea, Eric. > I've created a pv. > I noticed this was taken from xfs_mount_validate_sb() for the dblocks test. yep > I guess it would be nice to abstract this test in a macro for use in > multiple places. yep, it'd just need to be refactored a bit to support data only & rt only (for growfs), while mount wants to check both at the same time. -Eric > > Cheers, > Tim. From owner-xfs@oss.sgi.com Mon Nov 6 02:41:06 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 06 Nov 2006 02:41:11 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kA6Af6aG018301 for ; Mon, 6 Nov 2006 02:41:06 -0800 X-ASG-Debug-ID: 1162805334-9227-533-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from server1.spsn.net (server1.spsn.net [195.234.231.102]) by cuda.sgi.com (Spam Firewall) with ESMTP id 6CB08D1BDEE0 for ; Mon, 6 Nov 2006 01:28:54 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by server1.spsn.net (Postfix) with ESMTP id E4F8EACC01A for ; Mon, 6 Nov 2006 10:28:47 +0100 (CET) Received: from server1.spsn.net ([127.0.0.1]) by localhost (server1.spsn.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 82ck1zPyNP9p for ; Mon, 6 Nov 2006 10:28:47 +0100 (CET) Received: by server1.spsn.net (Postfix, from userid 102) id CC7D3ACC01C; Mon, 6 Nov 2006 10:28:47 +0100 (CET) Received: from saschatest.adtech.de (unknown [213.200.64.124]) by server1.spsn.net (Postfix) with ESMTP id 506E5ACC01A for ; Mon, 6 Nov 2006 10:28:47 +0100 (CET) From: Sascha Nitsch To: xfs@oss.sgi.com X-ASG-Orig-Subj: Weird performance decrease Subject: Weird performance decrease Date: Mon, 6 Nov 2006 10:28:08 +0100 User-Agent: KMail/1.9.5 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200611061028.08963.sgi@linuxhowtos.org> X-Bogosity: Spam, tests=bogofilter, 
spamicity=1.000000, version=1.1.1 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25204 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9554 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sgi@linuxhowtos.org Precedence: bulk X-list: xfs Content-Length: 2184 Lines: 62 Hi, I'm observing rather strange behaviour of the filesystem cache. I have a server running the following app scenario: a filesystem tree with a depth of 7 directories and 4-character directory names. The deepest directories contain files, with file sizes from 100 bytes to 5 KB. The filesystem is XFS. The app creates dirs in the tree and reads/writes files in the deepest dirs of the tree. CPU: Dual Xeon 3.0 GHz w/HT, 512 KB cache each, 2 GB RAM, SCSI HDD 15k RPM. At first, all is fine and extremely fast. After a while the buffer size is about 3.5 MB and the cache size about 618 MB. Up to that moment ~445000 directories and ~106000 files have been created. That's where the weird behaviour starts. The buffer size drops to ~200 KB and the cache size starts decreasing fast. This results in a drastic performance drop in my app (avg. read/write times increase from 0.3 ms to 4 ms): not a gradual increase, but a sudden jump. After that it gets steadily slower (19 ms and more). After running a while (with the cache size still shrinking) the buffer size stays at ~700 KB and the cache at about 400 MB. Performance is terrible, way slower than starting up with no cache. Restarting the app makes no difference, and neither does remounting the partition.
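(A rough sketch of this kind of workload, for anyone who wants to try something similar. This is not the poster's actual testcase, only an assumed approximation of the scenario described above: the random 4-letter directory names, the fixed file name "data" and the function names are all made up for illustration.)

```python
import os
import random
import string
import tempfile
import time

def random_name(rng, length=4):
    """Return a random lowercase directory name of the given length."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def write_one(root, rng, depth=7):
    """Create one small file `depth` directories below root; return elapsed seconds."""
    leaf_dir = os.path.join(root, *(random_name(rng) for _ in range(depth)))
    payload = os.urandom(rng.randint(100, 5 * 1024))   # 100 bytes .. 5 KB
    start = time.perf_counter()
    os.makedirs(leaf_dir, exist_ok=True)
    with open(os.path.join(leaf_dir, "data"), "wb") as fh:
        fh.write(payload)
    return time.perf_counter() - start

def run(root, count, seed=0):
    """Write `count` files under root and return the per-file timings."""
    rng = random.Random(seed)
    return [write_one(root, rng) for _ in range(count)]

if __name__ == "__main__":
    # Uses a temporary directory; point `root` at the XFS mount to test that.
    with tempfile.TemporaryDirectory() as root:
        timings = run(root, 200)
        avg_ms = 1000 * sum(timings) / len(timings)
        print(f"avg create+write time: {avg_ms:.3f} ms")
```

Logging the average over fixed-size batches instead of the whole run would make a sudden jump in latency easier to spot.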

Command used to create the fs:
mkfs.xfs -b size=512 -i maxpct=0 -l version=2 -n size=16k /dev/sdc
mounted with
mount /dev/sdc /data

I'm open to suggestions on mkfs calls, mount options and kernel tuning via
procfs.
I have a testcase to reproduce the problem. It happens after ~45 minutes.

xfs_info /data/
meta-data=/data                  isize=256    agcount=16, agsize=8960921 blks
         =                       sectsz=512
data     =                       bsize=512    blocks=143374736, imaxpct=0
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=16384
log      =internal               bsize=512    blocks=65536, version=2
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

kernel:
a 2.6.9-34.0.2.ELsmp #1 SMP Mon Jul 17 21:41:41 CDT 2006 i686 i686 i386
GNU/Linux

filesystem usage is < 1%

From owner-xfs@oss.sgi.com Mon Nov 6 03:32:59 2006
From: Sascha Nitsch
To: Ruben Rubio
Cc: xfs@oss.sgi.com
Subject: Re: Weird performance decrease
Date: Mon, 6 Nov 2006 12:31:26 +0100
References: <200611061028.08963.sgi@linuxhowtos.org>
 <454F16E4.6050907@rentalia.com>
In-Reply-To: <454F16E4.6050907@rentalia.com>
Message-Id: <200611061231.26623.sgi@linuxhowtos.org>

On Monday 06 November 2006 12:05, you wrote:
> I have seen that there are performance problems when there are some files
> in a directory and data is being added to the files.
>
> Are the files fragmented?
> Go to a directory where the files are listed, and
> xfs_bmap -v * | less
>
> Check out the results.

The files themselves are not fragmented. They are only ever (re)written in
one go, with no appends. But a couple of the directories have multiple
extents.

Examples:

0354:
 EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET          TOTAL
   0: [0..7]:          8964012..8964019        1 (3091..3098)           8
   1: [8..15]:         9013089..9013096        1 (52168..52175)         8

00f0:
 EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET          TOTAL
   0: [0..7]:          80648721..80648728      9 (432..439)             8
   1: [8..15]:         80654561..80654568      9 (6272..6279)           8
   2: [16..23]:        80662073..80662080      9 (13784..13791)         8
   3: [24..31]:        80669473..80669480      9 (21184..21191)         8
   4: [32..39]:        80677185..80677192      9 (28896..28903)         8
   5: [40..47]:        80685105..80685112      9 (36816..36823)         8
   6: [48..55]:        80692545..80692552      9 (44256..44263)         8
   7: [56..63]:        80700001..80700008      9 (51712..51719)         8
   8: [64..71]:        80708272..80708279      9 (59983..59990)         8
   9: [72..79]:        80716819..80716826      9 (68530..68537)         8

Some have up to 123 extents (found by checking some randomly picked
directories).

Would increasing the directory size help to avoid those extents?
I'm quite new when it comes to the internals of xfs; I just used it "as is"
and was happy.

Sascha

From owner-xfs@oss.sgi.com Mon Nov 6 04:34:47 2006
Message-ID: <454F16E4.6050907@rentalia.com>
Date: Mon, 06 Nov 2006 12:05:08 +0100
From: Ruben Rubio
To: Sascha Nitsch
CC: xfs@oss.sgi.com
Subject: Re: Weird performance decrease
References: <200611061028.08963.sgi@linuxhowtos.org>
In-Reply-To: <200611061028.08963.sgi@linuxhowtos.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I have seen that there are performance problems when there are some files
in a directory and data is being added to the files.

Are the files fragmented?
Go to a directory where the files are listed, and
xfs_bmap -v * | less

Check out the results.
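
The check suggested above can also be summarised per file instead of being read through a pager. What follows is only a minimal sketch, assuming the `xfs_bmap -v` output has the same layout as the sample rows posted in this thread (a name line ending in a colon, an `EXT:` header, then one row per extent). The helper name `count_extents` and the regular expression are illustrative and not part of xfsprogs.

```python
import re

# One extent row in the layout seen in this thread, for example:
#   0: [0..7]: 8964012..8964019 1 (3091..3098) 8
# Rows describing holes have a different shape and are not counted.
EXTENT_ROW = re.compile(
    r"^\s*\d+:\s+\[\d+\.\.\d+\]:\s+\d+\.\.\d+\s+\d+\s+\(\d+\.\.\d+\)\s+\d+\s*$"
)


def count_extents(bmap_output):
    """Return {name: number_of_extents} for text in the xfs_bmap -v layout."""
    counts = {}
    current = None
    for line in bmap_output.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("EXT:"):
            continue  # blank line or column header
        if EXTENT_ROW.match(line):
            if current is not None:
                counts[current] += 1
        elif stripped.endswith(":"):
            # A line such as "0354:" starts the listing for a new file or dir.
            current = stripped[:-1]
            counts[current] = 0
    return counts


# Sample rows copied from the thread (first two extents of each entry).
SAMPLE = """\
0354:
 EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
   0: [0..7]: 8964012..8964019 1 (3091..3098) 8
   1: [8..15]: 9013089..9013096 1 (52168..52175) 8
00f0:
 EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
   0: [0..7]: 80648721..80648728 9 (432..439) 8
   1: [8..15]: 80654561..80654568 9 (6272..6279) 8
"""

if __name__ == "__main__":
    for name, extents in sorted(count_extents(SAMPLE).items()):
        print(f"{name}: {extents} extent(s)")
```

Entries with more than one extent are the ones worth a closer look. To run it against a real directory, the captured output of `xfs_bmap -v *` would be passed in place of `SAMPLE`.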

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFTxbjIo1XmbAXRboRAuFBAKCDFC+FKGmIPEC7m2qPwntgAQO2pgCeJvZ1
fC5bypzpHkU7KMOwtwxObQI=
=mIEc
-----END PGP SIGNATURE-----

From owner-xfs@oss.sgi.com Mon Nov 6 05:43:00 2006
Date: Mon, 6 Nov 2006 14:41:48 +0100
From: Christian Guggenberger
To: Christian Guggenberger
Cc: David Chinner, Eric Sandeen, xfs@oss.sgi.com
Subject: Re: mount failed after xfs_growfs beyond 16 TB
Message-ID: <20061106134148.GA25180@pc51072.physik.uni-regensburg.de>
Reply-To: christian.guggenberger@physik.uni-regensburg.de
References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de>
 <454A3B28.7010405@sandeen.net>
 <20061103093203.GA18010@pc51072.physik.uni-regensburg.de>
 <20061103123418.GP8394166@melbourne.sgi.com>
 <20061103154448.GA26647@pc51072.physik.uni-regensburg.de>
In-Reply-To: <20061103154448.GA26647@pc51072.physik.uni-regensburg.de>

On Fri, Nov 03, 2006 at 04:44:48PM +0100, Christian Guggenberger wrote:
> >
> > xfs_db mojo.... ;)
> >
> > Note - no guarantee this will work - practise on an expendable
> > sparse loopback filesystem image by making a filesystem of slightly less
> > than 16TB then growing it to corrupt it the same way and then fixing it up
> > successfully.
> > ...
>
> (btw, I still hope they get access to a 64bit system with recent
> xfsprogs and kernel, soon)
>

For your info: with recent xfsprogs (2.8.11), repair (on a 32bit system)
succeeded. No xfs_db magic needed.

Thanks again for your help,
cheers.
 - Christian

From owner-xfs@oss.sgi.com Mon Nov 6 15:40:14 2006
Message-ID: <454FC5C3.8080803@agami.com>
Date: Mon, 06 Nov 2006 15:31:15 -0800
From: Shailendra Tripathi
To: Sascha Nitsch
CC: xfs@oss.sgi.com
Subject: Re: Weird performance decrease
References: <200611061028.08963.sgi@linuxhowtos.org>
In-Reply-To: <200611061028.08963.sgi@linuxhowtos.org>

Hi Sascha,
Did you look at the iostat -x output for the device? Please
verify the turnaround time of the device when you are getting the slowdown.

For example, if %b is close to 100, perhaps you are maxing out on
the disk I/O ops per sec.
Since you have only one disk, once I/O becomes random, the disk wouldn't
be able to do more than 200-250 disk ops per sec.

# iostat -x sda 1
                        extended device statistics
device mgr/s mgw/s    r/s    w/s    kr/s    kw/s   size queue   wait svc_t  %b
sda        3     7    3.5   18.5   157.0   112.9   12.3   0.2    9.7   1.7   4

Regards,
Shailendra

From owner-xfs@oss.sgi.com Tue Nov 7 00:18:48 2006
Date: Tue, 7 Nov 2006 19:17:48 +1100
From: David Chinner
To: Christian Guggenberger
Cc: David Chinner, Eric Sandeen, xfs@oss.sgi.com
Subject: Re: mount failed after xfs_growfs beyond 16 TB
Message-ID: <20061107081748.GA8394166@melbourne.sgi.com>
References: <20061102172608.GA27769@pc51072.physik.uni-regensburg.de>
 <454A3B28.7010405@sandeen.net>
 <20061103093203.GA18010@pc51072.physik.uni-regensburg.de>
 <20061103123418.GP8394166@melbourne.sgi.com>
 <20061103154448.GA26647@pc51072.physik.uni-regensburg.de>
In-Reply-To: <20061103154448.GA26647@pc51072.physik.uni-regensburg.de>

On Fri, Nov 03, 2006 at 04:44:48PM +0100, Christian Guggenberger wrote:
> > The superblock:
> >
> > blocksize = 4096
> > dblocks = 18446744070056148512
> > ...
> > agblocks = 84976608
> > agcount = 570
> >
> > An AG is ~43.5GB, so 570 AGs is 24.8TB. It's too big, and
> > we will only shrink by whole AGs. Hence we have to correct
> > agcount and dblocks.
>
> isn't the AG size 'agblocks * blocksize' == ~324 GB here ?

Yes, you are right - I was thinking of 512 byte blocks, which then
gave the right size that you grew to. Otherwise 570*324GB gives
200TB, which is somewhat larger than you apparently tried to
grow to...

Sorry for the misdirection, but I'm glad to see that you got
it fixed.

Cheers,

Dave.
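
The two AG size estimates discussed above differ only in the block size plugged into agblocks * blocksize. As a rough check of that arithmetic, here is a small sketch using the superblock values quoted in this message; the function names are illustrative only and do not come from xfsprogs. Both decimal (GB/TB) and binary (GiB/TiB) units are printed, because the figures in the thread appear to mix the two.

```python
def ag_size_bytes(agblocks, blocksize):
    """Size of one allocation group in bytes."""
    return agblocks * blocksize


def fs_size_bytes(agblocks, blocksize, agcount):
    """Total size if all agcount allocation groups were full-sized."""
    return ag_size_bytes(agblocks, blocksize) * agcount


# Values quoted from the superblock dump in this thread.
AGBLOCKS = 84976608
AGCOUNT = 570

if __name__ == "__main__":
    # 512 is the block size that was mistakenly assumed,
    # 4096 is the blocksize actually recorded in the superblock.
    for blocksize in (512, 4096):
        ag = ag_size_bytes(AGBLOCKS, blocksize)
        total = fs_size_bytes(AGBLOCKS, blocksize, AGCOUNT)
        print(
            f"blocksize={blocksize}: "
            f"AG = {ag / 1e9:.1f} GB ({ag / 2**30:.1f} GiB), "
            f"{AGCOUNT} AGs = {total / 1e12:.1f} TB ({total / 2**40:.1f} TiB)"
        )
```

With 512-byte blocks this reproduces the ~43.5 GB per AG and 24.8 TB total mentioned above. With the real 4096-byte blocks an AG comes out at about 324 GiB and the 570 AGs at roughly 198 TB, in line with the rounded 200TB figure.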
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

From owner-xfs@oss.sgi.com Tue Nov 7 02:46:36 2006
From: Sascha Nitsch
To: Shailendra Tripathi
Cc: xfs@oss.sgi.com
Subject: Re: Weird performance decrease
Date: Tue, 7 Nov 2006 11:44:32 +0100
References: <200611061028.08963.sgi@linuxhowtos.org>
 <454FC5C3.8080803@agami.com>
In-Reply-To: <454FC5C3.8080803@agami.com>
Message-Id: <200611071144.32925.sgi@linuxhowtos.org>

On Tuesday 07 November 2006 00:31, you wrote:
> Hi Sascha,
> Did you look at the iostat -x output for the device? Please
> verify the turnaround time of the device when you are getting the slowdown.
>
> For example, if %b is close to 100, perhaps you are maxing out on
> the disk I/O ops per sec.
> Since you have only one disk, once I/O becomes random, the disk wouldn't
> be able to do more than 200-250 disk ops per sec.
>
> # iostat -x sda 1
>                         extended device statistics
> device mgr/s mgw/s    r/s    w/s    kr/s    kw/s   size queue   wait svc_t  %b
> sda        3     7    3.5   18.5   157.0   112.9   12.3   0.2    9.7   1.7   4
>
>
> Regards,
> Shailendra

Hi Shailendra,

here are some measurements:

== startup (very high performance) ==

top:
Cpu0 :  0.2% us,  0.0% sy,  0.0% ni,  99.8% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu1 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu2 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu3 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   2074432k total,   133784k used,  1940648k free,    15652k buffers
Swap:  2618488k total,      160k used,  2618328k free,    53508k cached

iostat -x
Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz  await svctm %util
sdc       0.08   1.77 0.47 2.48   9.37  58.29  4.68 29.14    22.94     0.32 107.23  3.07  0.91

== shortly before performance drops ==

top:
Cpu0 :  0.0% us,  0.2% sy,  0.0% ni,  99.8% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu1 :  0.0% us,  0.2% sy,  0.0% ni,  99.8% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu2 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu3 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   2074432k total,  1342464k used,   731968k free,    17300k buffers
Swap:  2618488k total,      160k used,  2618328k free,   645180k cached

iostat -x
Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz  await svctm %util
sdc       0.08   1.96 0.47 2.55   9.35  63.53  4.68 31.77    24.11     0.35 115.27  3.05  0.92

== directly after drop ==

top:
Cpu0 :  0.0% us,  0.6% sy,  0.0% ni,  98.6% id,  0.6% wa,  0.2% hi,  0.0% si
Cpu1 :  0.0% us,  0.2% sy,  0.0% ni,  99.8% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu2 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu3 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   2074432k total,  1355704k used,   718728k free,      532k buffers
Swap:  2618488k total,      160k used,  2618328k free,   656548k cached

iostat -x
Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz  await svctm %util
sdc       0.08   1.96 0.47 2.56   9.36  63.87  4.68 31.93    24.16     0.35 115.32  3.05  0.93

Notice the drop in buffer size.

== after running for a while with slow performance ==

top:
Cpu0 :  0.0% us,  1.0% sy,  0.0% ni,  85.0% id, 14.0% wa,  0.0% hi,  0.0% si
Cpu1 :  0.0% us,  0.2% sy,  0.0% ni,  97.0% id,  2.8% wa,  0.0% hi,  0.0% si
Cpu2 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Cpu3 :  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   2074432k total,  1065216k used,  1009216k free,      292k buffers
Swap:  2618488k total,      160k used,  2618328k free,   296888k cached

iostat -x
Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz  await svctm %util
sdc       0.08   1.97 0.50 2.69  10.16  67.99  5.08 34.00    24.52     0.36 111.35  3.10  0.99

Without buffers and with little cache it's no wonder that the I/O wait
increases. But why do the buffers and cache get disabled and not rebuilt?

A note: the workload and the types of I/O operations are the same from the
first to the last second; nothing is changing.
What iostat fails to detect is that on average there are ~60 reads/s and ~60
writes/s.

The average read time starts at 30ns/read attempt (on a non-existing file,
but still pretty impressive).
The write time (including the creation of 4.3 directories/write on average)
starts at .3ms and stays at that speed until the drop.

After that it starts to increase to more than 19ms for read ops and 4ms for
write ops.

I'm absolutely running out of possible ideas and workarounds.

Regards,
Sascha

From owner-xfs@oss.sgi.com Tue Nov 7 10:38:39 2006
Message-ID: <4550D082.3030802@agami.com>
Date: Tue, 07 Nov 2006 10:29:22 -0800
From: Shailendra Tripathi
To: Sascha Nitsch
CC: xfs@oss.sgi.com
Subject: Re: Weird performance decrease
References: <200611061028.08963.sgi@linuxhowtos.org>
 <454FC5C3.8080803@agami.com>
 <200611071144.32925.sgi@linuxhowtos.org>
In-Reply-To: <200611071144.32925.sgi@linuxhowtos.org>

Hi Sascha,
Please run iostat continuously to monitor the disk performance, and watch
the await and %util fields for the device.
Looking at this, it does look like there is some bottleneck in the I/O path
(or the number of requests generated is too high). Please note that when you
run iostat just once, it gives the average stats for the device
(cumulative). So, you are not getting the real picture.

However, I can see that the average I/O response time is way too high
(await=115.27). This means that an I/O has been spending 115 ms on average
(too bad). That includes the time it spent in the disk I/O queue (called the
elevator queue) and the actual service time. Your disk is performing well, as
its service time is 3.05 ms.
This time has less to do with the availability of cache/buffers. Again, it
appears to me that the number of requests generated is overwhelming the
device. That is, the device has seen overwhelming I/O in the recent past.

For example, when I do this:
dd if=/dev/zero of=/tmp/1 bs=32k count=100000

I see this (below). Note that I see an await of 218 ms.
$ iostat -x hda6 avg-cpu: %user %nice %sys %iowait %idle 1.75 0.00 0.41 0.21 97.63 Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util hda6 0.00 0.77 0.00 0.06 0.00 6.59 0.00 3.30 114.62 0.01 218.10 2.22 0.01 > here are some measurements: > > == startup (very high performance) == > > top: > Cpu0 : 0.2% us, 0.0% sy, 0.0% ni, 99.8% id, 0.0% wa, 0.0% hi, 0.0% si > Cpu1 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si > Cpu2 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si > Cpu3 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si > Mem: 2074432k total, 133784k used, 1940648k free, 15652k buffers > Swap: 2618488k total, 160k used, 2618328k free, 53508k cached > > iostat -x > Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s > avgrq-sz avgqu-sz await svctm %util > sdc 0.08 1.77 0.47 2.48 9.37 58.29 4.68 29.14 > 22.94 0.32 107.23 3.07 0.91 > > == shortly before performance drops == > > top: > Cpu0 : 0.0% us, 0.2% sy, 0.0% ni, 99.8% id, 0.0% wa, 0.0% hi, 0.0% si > Cpu1 : 0.0% us, 0.2% sy, 0.0% ni, 99.8% id, 0.0% wa, 0.0% hi, 0.0% si > Cpu2 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si > Cpu3 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si > Mem: 2074432k total, 1342464k used, 731968k free, 17300k buffers > Swap: 2618488k total, 160k used, 2618328k free, 645180k cached > > iostat -x > Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s > avgrq-sz avgqu-sz await svctm %util > sdc 0.08 1.96 0.47 2.55 9.35 63.53 4.68 31.77 > 24.11 0.35 115.27 3.05 0.92 > > == directly after drop == > top: > Cpu0 : 0.0% us, 0.6% sy, 0.0% ni, 98.6% id, 0.6% wa, 0.2% hi, 0.0% si > Cpu1 : 0.0% us, 0.2% sy, 0.0% ni, 99.8% id, 0.0% wa, 0.0% hi, 0.0% si > Cpu2 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si > Cpu3 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si > Mem: 2074432k total, 1355704k used, 718728k free, 532k buffers > 
Swap: 2618488k total, 160k used, 2618328k free, 656548k cached
>
> iostat -x
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
> avgrq-sz avgqu-sz await svctm %util
> sdc 0.08 1.96 0.47 2.56 9.36 63.87 4.68 31.93
> 24.16 0.35 115.32 3.05 0.93
>
> notice the buffer size drop
>
> == after running for a while with slow performance ==
> top:
> Cpu0 : 0.0% us, 1.0% sy, 0.0% ni, 85.0% id, 14.0% wa, 0.0% hi, 0.0% si
> Cpu1 : 0.0% us, 0.2% sy, 0.0% ni, 97.0% id, 2.8% wa, 0.0% hi, 0.0% si
> Cpu2 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Cpu3 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
> Mem: 2074432k total, 1065216k used, 1009216k free, 292k buffers
> Swap: 2618488k total, 160k used, 2618328k free, 296888k cached
>
> iostat -x
> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
> avgrq-sz avgqu-sz await svctm %util
> sdc 0.08 1.97 0.50 2.69 10.16 67.99 5.08 34.00
> 24.52 0.36 111.35 3.10 0.99
>
> without buffers and low cache it's no wonder that the io wait increases.
> But why are the buffers and cache disabled and not rebuilt?
>
>
> A note: the workload and types of io operations are the same from the first to
> the last second, nothing is changing.
> What iostat fails to detect is that on average, there are ~60 reads/s and ~60
> writes/s.
>
> Average read time is starting at 30ns/read attempt (on a non-existent file,
> but still pretty impressive)
> write time (including average creation of 4.3 directories/write) starts
> at .3ms and it stays at that speed until the drop.
>
> After that it starts to increase to more than 19ms for read ops and 4ms for
> write ops.
>
> I'm absolutely running out of possible ideas and workarounds.
> > Regards, > Sascha > From owner-xfs@oss.sgi.com Thu Nov 9 17:11:19 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 09 Nov 2006 17:11:28 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAA1BFaG024354 for ; Thu, 9 Nov 2006 17:11:18 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA15958; Fri, 10 Nov 2006 12:10:24 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAA1AL7Y29331309; Fri, 10 Nov 2006 12:10:22 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAA1AIhq30071791; Fri, 10 Nov 2006 12:10:18 +1100 (AEDT) Date: Fri, 10 Nov 2006 12:10:18 +1100 From: David Chinner To: Russell Cattelan Cc: "Igor A. Valcov" , linux-kernel , xfs@oss.sgi.com Subject: Re: XFS filesystem performance drop in kernels 2.6.16+ Message-ID: <20061110011018.GP8394166@melbourne.sgi.com> References: <1163095715.5632.102.camel@xenon.msp.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1163095715.5632.102.camel@xenon.msp.redhat.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 9577 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1397 Lines: 37 On Thu, Nov 09, 2006 at 12:08:35PM -0600, Russell Cattelan wrote: > On Thu, 2006-11-09 at 20:30 +0300, Igor A. Valcov wrote: > > Hello, > > > > For one of our projects we have a test program that measures file > > system performance by writing up to 1000 files simultaneously. 
After
> > installing kernel v2.6.16 we noticed that XFS performance dropped by a
> > factor of 5 (tests that took around 4 minutes on kernel 2.6.15 now
> > take around 20 minutes to complete). We then checked all kernels
> > starting from 2.6.16 up to 2.6.19-rc5 with the same unpleasant result.
> > The funny thing about all this is that we chose XFS for that
> > particular project specifically because it was about 5 times faster
> > with the tests than the other file systems. Now they all take about
> > the same time.
> >
> > I also noticed that I/O barriers were introduced in v2.6.16 and
> > thought they may be the cause, but mounting the file system with
> > 'nobarrier' doesn't seem to affect the performance in any way.
> >
> > Any thoughts on the matter are appreciated.
> I would try verifying the problem on a non-IDE disk just
> to confirm the write barrier theory.
>
> Also file a bug.
> http://oss.sgi.com/bugzilla
> include test case and hardware description if possible.

and cc xfs@oss.sgi.com on XFS bug reports ;)

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 9 19:37:31 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 09 Nov 2006 19:37:37 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAA3bVaG015452 for ; Thu, 9 Nov 2006 19:37:31 -0800 X-ASG-Debug-ID: 1163129799-5311-246-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id A0A214E24B1 for ; Thu, 9 Nov 2006 19:36:39 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 9C91218BCD2B7; Thu, 9 Nov 2006 21:36:38 -0600 (CST) Message-ID: <4553F3C6.2030807@sandeen.net> Date: Thu, 09 Nov 2006 21:36:38 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: "Igor A. Valcov" CC: linux-kernel , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: XFS filesystem performance drop in kernels 2.6.16+ Subject: Re: XFS filesystem performance drop in kernels 2.6.16+ References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25558 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9579 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 297 Lines: 11 Igor A. 
Valcov wrote: > I also noticed that I/O barriers were introduced in v2.6.16 and > thought they may be the cause, but mounting the file system with > 'nobarrier' doesn't seem to affect the performance in any way. did this happen to be a remount with nobarrier, or a fresh mount? -Eric From owner-xfs@oss.sgi.com Fri Nov 10 04:02:44 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 04:02:52 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAAC2haG027197 for ; Fri, 10 Nov 2006 04:02:44 -0800 X-ASG-Debug-ID: 1163160116-30926-901-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from emailer.gwdg.de (emailer.gwdg.de [134.76.10.24]) by cuda.sgi.com (Spam Firewall) with ESMTP id E53DE5159E7 for ; Fri, 10 Nov 2006 04:01:56 -0800 (PST) Received: from linux01.gwdg.de ([134.76.13.21]) by mailer.gwdg.de with esmtps (TLSv1:DES-CBC3-SHA:168) (Exim 4.60) (envelope-from ) id 1GiV4g-0002vA-6u; Fri, 10 Nov 2006 13:01:34 +0100 Received: from linux01.gwdg.de (localhost [127.0.0.1]) by linux01.gwdg.de (8.13.3/8.13.3/SuSE Linux 0.7) with ESMTP id kAAC004q007001; Fri, 10 Nov 2006 13:00:02 +0100 Received: from localhost (jengelh@localhost) by linux01.gwdg.de (8.13.3/8.13.3/Submit) with ESMTP id kAABxx1h006995; Fri, 10 Nov 2006 12:59:59 +0100 Date: Fri, 10 Nov 2006 12:59:59 +0100 (MET) From: Jan Engelhardt To: Eric Sandeen cc: "Igor A. 
Valcov" , linux-kernel , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: XFS filesystem performance drop in kernels 2.6.16+ Subject: Re: XFS filesystem performance drop in kernels 2.6.16+ In-Reply-To: <4553F3C6.2030807@sandeen.net> Message-ID: References: <4553F3C6.2030807@sandeen.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25590 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9581 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jengelh@linux01.gwdg.de Precedence: bulk X-list: xfs Content-Length: 348 Lines: 14 >> I also noticed that I/O barriers were introduced in v2.6.16 and >> thought they may be the cause, but mounting the file system with >> 'nobarrier' doesn't seem to affect the performance in any way. > > > did this happen to be a remount with nobarrier, or a fresh mount? 
For the barrier stuff, see http://lkml.org/lkml/2006/5/19/33 -`J' -- From owner-xfs@oss.sgi.com Fri Nov 10 05:41:04 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 05:41:16 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAADf3aG009993 for ; Fri, 10 Nov 2006 05:41:04 -0800 X-ASG-Debug-ID: 1163164592-21642-97-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.185]) by cuda.sgi.com (Spam Firewall) with ESMTP id E6480D1CB302 for ; Fri, 10 Nov 2006 05:16:33 -0800 (PST) Received: by nf-out-0910.google.com with SMTP id x30so1076706nfb for ; Fri, 10 Nov 2006 05:16:32 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=Hjbu5mV/CRP0+IPJY4CsjrF8luK1QBH8UjtUK3mkDuzaG0CXa+Ayp4Rf0YEYmx7BQOHHMvS1j6df7xmZdpwHbuU1i7Ki+kAMj3XotB32OGz1MOXYjt9UQWeLveLvpArfaCrSTvq1FvY3121ClQL62q7V4ov2exnSLI5ViT5smZ8= Received: by 10.82.106.14 with SMTP id e14mr337756buc.1163164592075; Fri, 10 Nov 2006 05:16:32 -0800 (PST) Received: by 10.82.153.10 with HTTP; Fri, 10 Nov 2006 05:16:26 -0800 (PST) Message-ID: Date: Fri, 10 Nov 2006 16:16:27 +0300 From: "Igor A. 
Valcov" To: linux-kernel@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: XFS filesystem performance drop in kernels 2.6.16+ Subject: Re: XFS filesystem performance drop in kernels 2.6.16+ In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <4553F3C6.2030807@sandeen.net> X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25596 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9582 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: viaprog@gmail.com Precedence: bulk X-list: xfs Content-Length: 1880 Lines: 90 Below is a simplified version of the test program, and results of testing different kernels/filesystems/mount options. The results are a little different from the ones described in the initial post (this time performance decreased "only" 2 times), but the general tendency is clearly the same. 
============ 2.6.19-rc5-git2 ============

mount -t xfs -o noatime,barrier /dev/sdc1 /mnt/disc
real 16m40.516s
user 0m17.989s
sys 9m36.320s

mount -t xfs -o noatime,nobarrier /dev/sdc1 /mnt/disc
real 15m40.212s
user 0m17.549s
sys 9m29.692s

mount -t ext3 -o noatime /dev/sdc1 /mnt/disc
real 49m44.728s
user 0m27.678s
sys 14m15.689s

============ 2.6.14.6 ============

mount -t xfs -o noatime /dev/sdc1 /mnt/disc
real 9m58.974s
user 0m17.373s
sys 8m4.850s

mount -t ext3 -o noatime /dev/sdc1 /mnt/disc
real 49m7.526s
user 0m26.278s
sys 12m37.627s

========================================

#include <stdio.h>   /* sprintf */
#include <fcntl.h>   /* open, O_* flags */
#include <unistd.h>  /* write, fsync, close */

#define __BYTES 8192
#define __FILES 1000

char buf [__BYTES];

int main ()
{
    char fname [1024];
    int nFiles [__FILES];
    int f, i;

    /* Fill buf */
    for (i = 0; i < __BYTES; i++)
        buf [i] = i % 128;

    /* Create and open files */
    for (f = 0; f < __FILES; f++) {
        sprintf (fname, "/mnt/disc/storage/file-%d", f);
        nFiles [f] = open (fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    }

    for (i = 0; i < 262144; i++) {
        /* Write data to a big file */
        write (nFiles [0], buf, __BYTES);

        /* Write data to small files */
        for (f = 1; f < __FILES; f++)
            write (nFiles [f], &f, sizeof (f));
    }

    /* Flush and close every file */
    for (f = 0; f < __FILES; f++) {
        fsync (nFiles [f]);
        close (nFiles [f]);
    }

    return 0;
}

========================================

-- Igor A.
Valcov From owner-xfs@oss.sgi.com Fri Nov 10 09:15:16 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 09:15:25 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAAHFEaG014172 for ; Fri, 10 Nov 2006 09:15:16 -0800 X-ASG-Debug-ID: 1163178866-14276-366-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id D9DC05163E4 for ; Fri, 10 Nov 2006 09:14:26 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kAAHEOqr021637; Fri, 10 Nov 2006 12:14:24 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAAHENcf029969; Fri, 10 Nov 2006 12:14:23 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAAHENOp029481; Fri, 10 Nov 2006 12:14:23 -0500 Message-ID: <4554B36E.9030006@sandeen.net> Date: Fri, 10 Nov 2006 11:14:22 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: "Igor A. 
Valcov" CC: linux-kernel@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: XFS filesystem performance drop in kernels 2.6.16+ Subject: Re: XFS filesystem performance drop in kernels 2.6.16+ References: <4553F3C6.2030807@sandeen.net> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25610 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9583 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 501 Lines: 13 Igor A. Valcov wrote: > Below is a simplified version of the test program, and results of > testing different kernels/filesystems/mount options. The results are a > little different from the ones described in the initial post (this > time performance decreased "only" 2 times), but the general tendency > is clearly the same. I imagine that I know the answer, but to be sure you might put some time checks into your test app to see -which- portion of the test is taking the bulk of the time. 
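A minimal sketch of what such time checks could look like, assuming the three phases of the posted test (open, write loop, fsync+close) are the portions worth separating. The helper names elapsed_ms and run_phases, the MAX_FILES limit, and the use of clock_gettime(CLOCK_MONOTONIC) are illustrative assumptions, not part of the original program:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() and fsync() */
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define MAX_FILES 1000

/* Milliseconds elapsed between two CLOCK_MONOTONIC samples. */
static double elapsed_ms(const struct timespec *t0, const struct timespec *t1)
{
	return (t1->tv_sec - t0->tv_sec) * 1000.0 +
	       (t1->tv_nsec - t0->tv_nsec) / 1e6;
}

/*
 * Hypothetical helper: runs the open, write and fsync+close phases of the
 * test in `dir` and stores the wall-clock time of each phase in ms[0..2].
 * Returns 0 on success, -1 if any file operation fails.
 */
static int run_phases(const char *dir, int nfiles, int iters, double ms[3])
{
	static char buf[8192];
	char fname[1024];
	int fds[MAX_FILES];
	struct timespec t0, t1, t2, t3;
	int f, i, rc = 0;

	assert(nfiles > 0 && nfiles <= MAX_FILES);

	/* Phase 1: create and open the files */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (f = 0; f < nfiles; f++) {
		snprintf(fname, sizeof(fname), "%s/file-%d", dir, f);
		fds[f] = open(fname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
		if (fds[f] < 0) {
			while (f-- > 0)	/* close what was opened so far */
				close(fds[f]);
			return -1;
		}
	}

	/* Phase 2: one big file plus many small files, as in the test */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	for (i = 0; i < iters; i++) {
		if (write(fds[0], buf, sizeof(buf)) < 0)
			rc = -1;
		for (f = 1; f < nfiles; f++)
			if (write(fds[f], &f, sizeof(f)) < 0)
				rc = -1;
	}

	/* Phase 3: flush and close */
	clock_gettime(CLOCK_MONOTONIC, &t2);
	for (f = 0; f < nfiles; f++) {
		if (fsync(fds[f]) < 0)
			rc = -1;
		close(fds[f]);
	}
	clock_gettime(CLOCK_MONOTONIC, &t3);

	ms[0] = elapsed_ms(&t0, &t1);
	ms[1] = elapsed_ms(&t1, &t2);
	ms[2] = elapsed_ms(&t2, &t3);
	return rc;
}
```

Calling run_phases("/mnt/disc/storage", 1000, 262144, ms) from main() and printing ms[0], ms[1] and ms[2] on each kernel would show whether the extra time goes into the creates, the writes, or the final fsync pass.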
-Eric From owner-xfs@oss.sgi.com Fri Nov 10 17:21:37 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 17:21:45 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAB1LVaG004456 for ; Fri, 10 Nov 2006 17:21:34 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA23865; Sat, 11 Nov 2006 12:20:38 +1100 Date: Sat, 11 Nov 2006 11:22:13 +1000 From: Timothy Shimmin To: sandeen@sandeen.net, xfs@oss.sgi.com Subject: Re: [PATCH] remove old irix log replay cases Message-ID: <1F3EC1BE248F9904011855D2@timothy-shimmins-power-mac-g5.local> In-Reply-To: <20061019020125.25D9718D90452@sandeen.net> References: <20061019020125.25D9718D90452@sandeen.net> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9587 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 7901 Lines: 247 Hi Eric, --On 18 October 2006 9:01:25 PM -0500 sandeen@sandeen.net wrote: > I think the irix 5.3 and 6.1 log handling can go, since linux > refuses outright to do anything with irix-style logs, no? > Yep this stuff can go. Could go on IRIX too :) > The remaining single-case switch statements might be a little > odd, but I suppose that they still show there's flexibility > for log format... Fine. > > Maybe the #defines could stay, or at least as comments, for > hysterical raisins... Get rid of 'em I say. 
Thanks, --Tim > > -Eric > > xfs_buf_item.h | 18 ---------------- > xfs_log_recover.c | 58 +----------------------------------------------------- > xfs_trans.h | 4 --- > 3 files changed, 3 insertions(+), 77 deletions(-) > > Signed-off-by: Eric Sandeen > > Index: xfs-linux-allpatches/xfs_buf_item.h > =================================================================== > --- xfs-linux-allpatches.orig/xfs_buf_item.h > +++ xfs-linux-allpatches/xfs_buf_item.h > @@ -21,23 +21,7 @@ > /* > * This is the structure used to lay out a buf log item in the > * log. The data map describes which 128 byte chunks of the buffer > - * have been logged. This structure works only on buffers that > - * reside up to the first TB in the filesystem. These buffers are > - * generated only by pre-6.2 systems and are known as XFS_LI_6_1_BUF. > - */ > -typedef struct xfs_buf_log_format_v1 { > - unsigned short blf_type; /* buf log item type indicator */ > - unsigned short blf_size; /* size of this item */ > - __int32_t blf_blkno; /* starting blkno of this buf */ > - ushort blf_flags; /* misc state */ > - ushort blf_len; /* number of blocks in this buf */ > - unsigned int blf_map_size; /* size of data bitmap in words */ > - unsigned int blf_data_map[1];/* variable size bitmap of */ > - /* regions of buffer in this item */ > -} xfs_buf_log_format_v1_t; > - > -/* > - * This is a form of the above structure with a 64 bit blkno field. > + * have been logged. > * For 6.2 and beyond, this is XFS_LI_BUF. We use this to log everything. 
> */ > typedef struct xfs_buf_log_format_t { > Index: xfs-linux-allpatches/xfs_log_recover.c > =================================================================== > --- xfs-linux-allpatches.orig/xfs_log_recover.c > +++ xfs-linux-allpatches/xfs_log_recover.c > @@ -1514,7 +1514,6 @@ xlog_recover_reorder_trans( > { > xlog_recover_item_t *first_item, *itemq, *itemq_next; > xfs_buf_log_format_t *buf_f; > - xfs_buf_log_format_v1_t *obuf_f; > ushort flags = 0; > > first_item = itemq = trans->r_itemq; > @@ -1522,29 +1521,16 @@ xlog_recover_reorder_trans( > do { > itemq_next = itemq->ri_next; > buf_f = (xfs_buf_log_format_t *)itemq->ri_buf[0].i_addr; > - switch (ITEM_TYPE(itemq)) { > - case XFS_LI_BUF: > - flags = buf_f->blf_flags; > - break; > - case XFS_LI_6_1_BUF: > - case XFS_LI_5_3_BUF: > - obuf_f = (xfs_buf_log_format_v1_t*)buf_f; > - flags = obuf_f->blf_flags; > - break; > - } > > switch (ITEM_TYPE(itemq)) { > case XFS_LI_BUF: > - case XFS_LI_6_1_BUF: > - case XFS_LI_5_3_BUF: > + flags = buf_f->blf_flags; > if (!(flags & XFS_BLI_CANCEL)) { > xlog_recover_insert_item_frontq(&trans->r_itemq, > itemq); > break; > } > case XFS_LI_INODE: > - case XFS_LI_6_1_INODE: > - case XFS_LI_5_3_INODE: > case XFS_LI_DQUOT: > case XFS_LI_QUOTAOFF: > case XFS_LI_EFD: > @@ -1583,7 +1569,6 @@ xlog_recover_do_buffer_pass1( > xfs_buf_cancel_t *nextp; > xfs_buf_cancel_t *prevp; > xfs_buf_cancel_t **bucket; > - xfs_buf_log_format_v1_t *obuf_f; > xfs_daddr_t blkno = 0; > uint len = 0; > ushort flags = 0; > @@ -1594,13 +1579,6 @@ xlog_recover_do_buffer_pass1( > len = buf_f->blf_len; > flags = buf_f->blf_flags; > break; > - case XFS_LI_6_1_BUF: > - case XFS_LI_5_3_BUF: > - obuf_f = (xfs_buf_log_format_v1_t*)buf_f; > - blkno = (xfs_daddr_t) obuf_f->blf_blkno; > - len = obuf_f->blf_len; > - flags = obuf_f->blf_flags; > - break; > } > > /* > @@ -1746,7 +1724,6 @@ xlog_recover_do_buffer_pass2( > xlog_t *log, > xfs_buf_log_format_t *buf_f) > { > - xfs_buf_log_format_v1_t *obuf_f; > xfs_daddr_t blkno 
= 0; > ushort flags = 0; > uint len = 0; > @@ -1757,13 +1734,6 @@ xlog_recover_do_buffer_pass2( > flags = buf_f->blf_flags; > len = buf_f->blf_len; > break; > - case XFS_LI_6_1_BUF: > - case XFS_LI_5_3_BUF: > - obuf_f = (xfs_buf_log_format_v1_t*)buf_f; > - blkno = (xfs_daddr_t) obuf_f->blf_blkno; > - flags = obuf_f->blf_flags; > - len = (xfs_daddr_t) obuf_f->blf_len; > - break; > } > > return xlog_check_buffer_cancelled(log, blkno, len, flags); > @@ -1799,7 +1769,6 @@ xlog_recover_do_inode_buffer( > int inodes_per_buf; > xfs_agino_t *logged_nextp; > xfs_agino_t *buffer_nextp; > - xfs_buf_log_format_v1_t *obuf_f; > unsigned int *data_map = NULL; > unsigned int map_size = 0; > > @@ -1808,12 +1777,6 @@ xlog_recover_do_inode_buffer( > data_map = buf_f->blf_data_map; > map_size = buf_f->blf_map_size; > break; > - case XFS_LI_6_1_BUF: > - case XFS_LI_5_3_BUF: > - obuf_f = (xfs_buf_log_format_v1_t*)buf_f; > - data_map = obuf_f->blf_data_map; > - map_size = obuf_f->blf_map_size; > - break; > } > /* > * Set the variables corresponding to the current region to > @@ -1912,7 +1875,6 @@ xlog_recover_do_reg_buffer( > int i; > int bit; > int nbits; > - xfs_buf_log_format_v1_t *obuf_f; > unsigned int *data_map = NULL; > unsigned int map_size = 0; > int error; > @@ -1922,12 +1884,6 @@ xlog_recover_do_reg_buffer( > data_map = buf_f->blf_data_map; > map_size = buf_f->blf_map_size; > break; > - case XFS_LI_6_1_BUF: > - case XFS_LI_5_3_BUF: > - obuf_f = (xfs_buf_log_format_v1_t*)buf_f; > - data_map = obuf_f->blf_data_map; > - map_size = obuf_f->blf_map_size; > - break; > } > bit = 0; > i = 1; /* 0 is the buf format structure */ > @@ -2160,7 +2116,6 @@ xlog_recover_do_buffer_trans( > int pass) > { > xfs_buf_log_format_t *buf_f; > - xfs_buf_log_format_v1_t *obuf_f; > xfs_mount_t *mp; > xfs_buf_t *bp; > int error; > @@ -2197,13 +2152,6 @@ xlog_recover_do_buffer_trans( > len = buf_f->blf_len; > flags = buf_f->blf_flags; > break; > - case XFS_LI_6_1_BUF: > - case XFS_LI_5_3_BUF: > - obuf_f 
= (xfs_buf_log_format_v1_t*)buf_f; > - blkno = obuf_f->blf_blkno; > - len = obuf_f->blf_len; > - flags = obuf_f->blf_flags; > - break; > default: > xfs_fs_cmn_err(CE_ALERT, log->l_mp, > "xfs_log_recover: unknown buffer type 0x%x, logdev %s", > @@ -2830,9 +2778,7 @@ xlog_recover_do_trans( > * where xfs_daddr_t is 32-bits but mount will warn us > * off a > 1 TB filesystem before we get here. > */ > - if ((ITEM_TYPE(item) == XFS_LI_BUF) || > - (ITEM_TYPE(item) == XFS_LI_6_1_BUF) || > - (ITEM_TYPE(item) == XFS_LI_5_3_BUF)) { > + if ((ITEM_TYPE(item) == XFS_LI_BUF)) { > if ((error = xlog_recover_do_buffer_trans(log, item, > pass))) > break; > Index: xfs-linux-allpatches/xfs_trans.h > =================================================================== > --- xfs-linux-allpatches.orig/xfs_trans.h > +++ xfs-linux-allpatches/xfs_trans.h > @@ -39,13 +39,9 @@ typedef struct xfs_trans_header { > /* > * Log item types. > */ > -#define XFS_LI_5_3_BUF 0x1234 /* v1 bufs, 1-block inode buffers */ > -#define XFS_LI_5_3_INODE 0x1235 /* 1-block inode buffers */ > #define XFS_LI_EFI 0x1236 > #define XFS_LI_EFD 0x1237 > #define XFS_LI_IUNLINK 0x1238 > -#define XFS_LI_6_1_INODE 0x1239 /* 4K non-aligned inode bufs */ > -#define XFS_LI_6_1_BUF 0x123a /* v1, 4K inode buffers */ > #define XFS_LI_INODE 0x123b /* aligned ino chunks, var-size ibufs */ > #define XFS_LI_BUF 0x123c /* v2 bufs, variable sized inode bufs */ > #define XFS_LI_DQUOT 0x123d > From owner-xfs@oss.sgi.com Fri Nov 10 19:20:53 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 19:20:59 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAB3KqaG013566 for ; Fri, 10 Nov 2006 19:20:53 -0800 X-ASG-Debug-ID: 1163215204-11887-800-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id DD039D1CB308 for ; Fri, 10 Nov 2006 19:20:04 
-0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 063E018D9035E for ; Fri, 10 Nov 2006 21:20:04 -0600 (CST) Message-ID: <45554163.9060608@sandeen.net> Date: Fri, 10 Nov 2006 21:20:03 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: xfs mailing list X-ASG-Orig-Subj: [PATCH] remove flag-less mraccessf/mrupdatef Subject: [PATCH] remove flag-less mraccessf/mrupdatef Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25651 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9588 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 2654 Lines: 83 mraccessf & mrupdatef are supposed to be the "flags" versions of the functions, but they a) ignore the flags parameter completely, and b) are never called directly, only via the flag-less defines anyway So, drop the #define indirection, and rename mraccessf to mraccess, etc. 
Signed-off-by: Eric Sandeen linux-2.4/mrlock.c | 4 ++-- linux-2.4/mrlock_rwsem.h | 6 ++---- linux-2.6/mrlock.h | 6 ++---- 3 files changed, 6 insertions(+), 10 deletions(-) Index: xfs-linux-allpatches/linux-2.6/mrlock.h =================================================================== --- xfs-linux-allpatches.orig/linux-2.6/mrlock.h +++ xfs-linux-allpatches/linux-2.6/mrlock.h @@ -31,15 +31,13 @@ typedef struct { do { (mrp)->mr_writer = 0; init_rwsem(&(mrp)->mr_lock); } while (0) #define mrlock_init(mrp, t,n,s) mrinit(mrp, n) #define mrfree(mrp) do { } while (0) -#define mraccess(mrp) mraccessf(mrp, 0) -#define mrupdate(mrp) mrupdatef(mrp, 0) -static inline void mraccessf(mrlock_t *mrp, int flags) +static inline void mraccess(mrlock_t *mrp) { down_read(&mrp->mr_lock); } -static inline void mrupdatef(mrlock_t *mrp, int flags) +static inline void mrupdate(mrlock_t *mrp) { down_write(&mrp->mr_lock); mrp->mr_writer = 1; Index: xfs-linux-allpatches/linux-2.4/mrlock.c =================================================================== --- xfs-linux-allpatches.orig/linux-2.4/mrlock.c +++ xfs-linux-allpatches/linux-2.4/mrlock.c @@ -116,7 +116,7 @@ mrlock(mrlock_t *mrp, int type, int flag /* ARGSUSED */ void -mraccessf(mrlock_t *mrp, int flags) +mraccess(mrlock_t *mrp) { MRLOCK(mrp); if(mrp->mr_writes_waiting > 0) { @@ -135,7 +135,7 @@ mraccessf(mrlock_t *mrp, int flags) /* ARGSUSED */ void -mrupdatef(mrlock_t *mrp, int flags) +mrupdate(mrlock_t *mrp) { MRLOCK(mrp); while(mrp->mr_count) { Index: xfs-linux-allpatches/linux-2.4/mrlock_rwsem.h =================================================================== --- xfs-linux-allpatches.orig/linux-2.4/mrlock_rwsem.h +++ xfs-linux-allpatches/linux-2.4/mrlock_rwsem.h @@ -31,15 +31,13 @@ typedef struct { ( (mrp)->mr_writer = 0, init_rwsem(&(mrp)->mr_lock) ) #define mrlock_init(mrp, t,n,s) mrinit(mrp, n) #define mrfree(mrp) do { } while (0) -#define mraccess(mrp) mraccessf(mrp, 0) -#define mrupdate(mrp) mrupdatef(mrp, 0) -static 
inline void mraccessf(mrlock_t *mrp, int flags) +static inline void mraccess(mrlock_t *mrp) { down_read(&mrp->mr_lock); } -static inline void mrupdatef(mrlock_t *mrp, int flags) +static inline void mrupdate(mrlock_t *mrp) { down_write(&mrp->mr_lock); mrp->mr_writer = 1; From owner-xfs@oss.sgi.com Fri Nov 10 19:29:49 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 19:29:57 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAB3TmaG014577 for ; Fri, 10 Nov 2006 19:29:49 -0800 X-ASG-Debug-ID: 1163215740-11435-905-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id 1DA91D1CF284 for ; Fri, 10 Nov 2006 19:29:01 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 4B38018CF23E6 for ; Fri, 10 Nov 2006 21:29:00 -0600 (CST) Message-ID: <4555437B.4040904@sandeen.net> Date: Fri, 10 Nov 2006 21:28:59 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] remove unused filp from ioctl functions Subject: [PATCH] remove unused filp from ioctl functions Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25651 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9589 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: 
sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 4768 Lines: 174 There's a struct file * passed around the ioctl code that is never used... just takes up precious stack space near as I can tell :) Signed-off-by: Eric Sandeen linux-2.4/xfs_ioctl.c | 16 +++++----------- linux-2.6/xfs_ioctl.c | 16 +++++----------- 2 files changed, 10 insertions(+), 22 deletions(-) Index: xfs-linux-allpatches/linux-2.4/xfs_ioctl.c =================================================================== --- xfs-linux-allpatches.orig/linux-2.4/xfs_ioctl.c +++ xfs-linux-allpatches/linux-2.4/xfs_ioctl.c @@ -349,7 +349,6 @@ STATIC int xfs_readlink_by_handle( xfs_mount_t *mp, unsigned long arg, - struct file *parfilp, struct inode *parinode) { int error; @@ -400,7 +399,6 @@ STATIC int xfs_fssetdm_by_handle( xfs_mount_t *mp, unsigned long arg, - struct file *parfilp, struct inode *parinode) { int error; @@ -442,7 +440,6 @@ STATIC int xfs_attrlist_by_handle( xfs_mount_t *mp, unsigned long arg, - struct file *parfilp, struct inode *parinode) { int error; @@ -563,7 +560,6 @@ STATIC int xfs_attrmulti_by_handle( xfs_mount_t *mp, unsigned long arg, - struct file *parfilp, struct inode *parinode) { int error; @@ -683,7 +679,6 @@ xfs_ioc_xattr( STATIC int xfs_ioc_getbmap( bhv_desc_t *bdp, - struct file *filp, int flags, unsigned int cmd, unsigned long arg); @@ -779,7 +774,7 @@ xfs_ioctl( case XFS_IOC_GETBMAP: case XFS_IOC_GETBMAPA: - return xfs_ioc_getbmap(bdp, filp, ioflags, cmd, arg); + return xfs_ioc_getbmap(bdp, ioflags, cmd, arg); case XFS_IOC_GETBMAPX: return xfs_ioc_getbmapx(bdp, arg); @@ -793,16 +788,16 @@ xfs_ioctl( return xfs_open_by_handle(mp, arg, filp, inode); case XFS_IOC_FSSETDM_BY_HANDLE: - return xfs_fssetdm_by_handle(mp, arg, filp, inode); + return xfs_fssetdm_by_handle(mp, arg, inode); case XFS_IOC_READLINK_BY_HANDLE: - return xfs_readlink_by_handle(mp, arg, filp, inode); + return xfs_readlink_by_handle(mp, arg, inode); case XFS_IOC_ATTRLIST_BY_HANDLE: - return
xfs_attrlist_by_handle(mp, arg, filp, inode); + return xfs_attrlist_by_handle(mp, arg, inode); case XFS_IOC_ATTRMULTI_BY_HANDLE: - return xfs_attrmulti_by_handle(mp, arg, filp, inode); + return xfs_attrmulti_by_handle(mp, arg, inode); case XFS_IOC_SWAPEXT: { error = xfs_swapext((struct xfs_swapext *)arg); @@ -1258,7 +1253,6 @@ xfs_ioc_xattr( STATIC int xfs_ioc_getbmap( bhv_desc_t *bdp, - struct file *filp, int ioflags, unsigned int cmd, unsigned long arg) Index: xfs-linux-allpatches/linux-2.6/xfs_ioctl.c =================================================================== --- xfs-linux-allpatches.orig/linux-2.6/xfs_ioctl.c +++ xfs-linux-allpatches/linux-2.6/xfs_ioctl.c @@ -352,7 +352,6 @@ STATIC int xfs_readlink_by_handle( xfs_mount_t *mp, void __user *arg, - struct file *parfilp, struct inode *parinode) { int error; @@ -403,7 +402,6 @@ STATIC int xfs_fssetdm_by_handle( xfs_mount_t *mp, void __user *arg, - struct file *parfilp, struct inode *parinode) { int error; @@ -445,7 +443,6 @@ STATIC int xfs_attrlist_by_handle( xfs_mount_t *mp, void __user *arg, - struct file *parfilp, struct inode *parinode) { int error; @@ -566,7 +563,6 @@ STATIC int xfs_attrmulti_by_handle( xfs_mount_t *mp, void __user *arg, - struct file *parfilp, struct inode *parinode) { int error; @@ -686,7 +682,6 @@ xfs_ioc_xattr( STATIC int xfs_ioc_getbmap( bhv_desc_t *bdp, - struct file *filp, int flags, unsigned int cmd, void __user *arg); @@ -785,7 +780,7 @@ xfs_ioctl( case XFS_IOC_GETBMAP: case XFS_IOC_GETBMAPA: - return xfs_ioc_getbmap(bdp, filp, ioflags, cmd, arg); + return xfs_ioc_getbmap(bdp, ioflags, cmd, arg); case XFS_IOC_GETBMAPX: return xfs_ioc_getbmapx(bdp, arg); @@ -799,16 +794,16 @@ xfs_ioctl( return xfs_open_by_handle(mp, arg, filp, inode); case XFS_IOC_FSSETDM_BY_HANDLE: - return xfs_fssetdm_by_handle(mp, arg, filp, inode); + return xfs_fssetdm_by_handle(mp, arg, inode); case XFS_IOC_READLINK_BY_HANDLE: - return xfs_readlink_by_handle(mp, arg, filp, inode); + return 
xfs_readlink_by_handle(mp, arg, inode); case XFS_IOC_ATTRLIST_BY_HANDLE: - return xfs_attrlist_by_handle(mp, arg, filp, inode); + return xfs_attrlist_by_handle(mp, arg, inode); case XFS_IOC_ATTRMULTI_BY_HANDLE: - return xfs_attrmulti_by_handle(mp, arg, filp, inode); + return xfs_attrmulti_by_handle(mp, arg, inode); case XFS_IOC_SWAPEXT: { error = xfs_swapext((struct xfs_swapext __user *)arg); @@ -1278,7 +1273,6 @@ xfs_ioc_xattr( STATIC int xfs_ioc_getbmap( bhv_desc_t *bdp, - struct file *filp, int ioflags, unsigned int cmd, void __user *arg) From owner-xfs@oss.sgi.com Fri Nov 10 20:26:31 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 20:26:39 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAB4QUaG023465 for ; Fri, 10 Nov 2006 20:26:31 -0800 X-ASG-Debug-ID: 1163219142-22111-47-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3AA9E4DE9EF for ; Fri, 10 Nov 2006 20:25:43 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 3F7B618CF23E6 for ; Fri, 10 Nov 2006 21:52:19 -0600 (CST) Message-ID: <455548F2.4060500@sandeen.net> Date: Fri, 10 Nov 2006 21:52:18 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] remove unused xflags parameter from sync routines Subject: [PATCH] remove unused xflags parameter from sync routines Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 1.05 X-Barracuda-Spam-Status: No, SCORE=1.05 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=BSF_RULE_7582B X-Barracuda-Spam-Report: Code version 
3.02, rules version 3.0.25657 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 1.05 BSF_RULE_7582B BODY: Custom Rule 7582B X-archive-position: 9590 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 3627 Lines: 110 xfs_syncsub & xfs_sync_inodes have an xflags parameter which is never used. -Eric Signed-off-by: Eric Sandeen quota/xfs_qm_syscalls.c | 2 +- xfs_mount.h | 4 ++-- xfs_vfsops.c | 18 +++++------------- 3 files changed, 8 insertions(+), 16 deletions(-) Index: xfs-linux-allpatches/quota/xfs_qm_syscalls.c =================================================================== --- xfs-linux-allpatches.orig/quota/xfs_qm_syscalls.c +++ xfs-linux-allpatches/quota/xfs_qm_syscalls.c @@ -134,7 +134,7 @@ xfs_qm_quotactl( break; case Q_XQUOTASYNC: - return (xfs_sync_inodes(mp, SYNC_DELWRI, 0, NULL)); + return (xfs_sync_inodes(mp, SYNC_DELWRI, NULL)); default: break; Index: xfs-linux-allpatches/xfs_mount.h =================================================================== --- xfs-linux-allpatches.orig/xfs_mount.h +++ xfs-linux-allpatches/xfs_mount.h @@ -587,8 +587,8 @@ extern struct xfs_buf *xfs_getsb(xfs_mou extern int xfs_readsb(xfs_mount_t *, int); extern void xfs_freesb(xfs_mount_t *); extern void xfs_do_force_shutdown(bhv_desc_t *, int, char *, int); -extern int xfs_syncsub(xfs_mount_t *, int, int, int *); -extern int xfs_sync_inodes(xfs_mount_t *, int, int, int *); +extern int xfs_syncsub(xfs_mount_t *, int, int *); +extern int xfs_sync_inodes(xfs_mount_t *, int, int *); extern xfs_agnumber_t xfs_initialize_perag(struct bhv_vfs *, xfs_mount_t *, xfs_agnumber_t); extern void xfs_xlatesb(void *, struct xfs_sb *, int, __int64_t); Index: xfs-linux-allpatches/xfs_vfsops.c =================================================================== --- 
xfs-linux-allpatches.orig/xfs_vfsops.c +++ xfs-linux-allpatches/xfs_vfsops.c @@ -640,7 +640,7 @@ xfs_quiesce_fs( * we can write the unmount record. */ do { - xfs_syncsub(mp, SYNC_REMOUNT|SYNC_ATTR|SYNC_WAIT, 0, NULL); + xfs_syncsub(mp, SYNC_REMOUNT|SYNC_ATTR|SYNC_WAIT, NULL); pincount = xfs_flush_buftarg(mp->m_ddev_targp, 1); if (!pincount) { delay(50); @@ -886,24 +886,20 @@ xfs_sync( if (unlikely(flags == SYNC_QUIESCE)) return xfs_quiesce_fs(mp); else - return xfs_syncsub(mp, flags, 0, NULL); + return xfs_syncsub(mp, flags, NULL); } /* * xfs sync routine for internal use * * This routine supports all of the flags defined for the generic vfs_sync - * interface as explained above under xfs_sync. In the interests of not - * changing interfaces within the 6.5 family, additional internally- - * required functions are specified within a separate xflags parameter, - * only available by calling this routine. + * interface as explained above under xfs_sync. * */ int xfs_sync_inodes( xfs_mount_t *mp, int flags, - int xflags, int *bypassed) { xfs_inode_t *ip = NULL; @@ -1412,17 +1408,13 @@ xfs_sync_inodes( * xfs sync routine for internal use * * This routine supports all of the flags defined for the generic vfs_sync - * interface as explained above under xfs_sync. In the interests of not - * changing interfaces within the 6.5 family, additional internally- - * required functions are specified within a separate xflags parameter, - * only available by calling this routine. + * interface as explained above under xfs_sync. 
* */ int xfs_syncsub( xfs_mount_t *mp, int flags, - int xflags, int *bypassed) { int error = 0; @@ -1444,7 +1436,7 @@ xfs_syncsub( if (flags & SYNC_BDFLUSH) xfs_finish_reclaim_all(mp, 1); else - error = xfs_sync_inodes(mp, flags, xflags, bypassed); + error = xfs_sync_inodes(mp, flags, bypassed); } /* From owner-xfs@oss.sgi.com Fri Nov 10 22:33:42 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 22:33:49 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAB6XdaG031745 for ; Fri, 10 Nov 2006 22:33:40 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA28876; Sat, 11 Nov 2006 17:32:45 +1100 Date: Sat, 11 Nov 2006 16:34:22 +1000 From: Timothy Shimmin To: torvalds@osdl.org cc: akpm@osdl.org, xfs@oss.sgi.com Subject: XFS update for 2.6.19 Message-ID: <76708453565F2C45356A563E@timothy-shimmins-power-mac-g5.local> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9591 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 4403 Lines: 139 Hi Linus, Please pull from: git://oss.sgi.com:8090/xfs/xfs-2.6 It contains some bug fixes (notably inode iunpin fix), a couple of patches Andrew Morton was carrying and a few other fixups/support. 
This will update the following files: fs/xfs/linux-2.6/xfs_ioctl.c | 5 +-- fs/xfs/linux-2.6/xfs_super.c | 4 ++- fs/xfs/support/move.c | 2 + fs/xfs/support/move.h | 2 + fs/xfs/xfs_dir2.c | 2 + fs/xfs/xfs_iget.c | 51 +++++++++------------------------ fs/xfs/xfs_inode.c | 64 ++++++++++++++++++++++++------------------ fs/xfs/xfs_inode.h | 41 --------------------------- fs/xfs/xfs_vnodeops.c | 33 +++++++++------------- 9 files changed, 72 insertions(+), 132 deletions(-) through these commits: commit de5ab811cd0fd738d78c24b970337121ea4c1e07 Author: David Chinner Date: Sat Nov 11 15:23:35 2006 +1100 [XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h SGI-PV: 957005 SGI-Modid: xfs-linux-melb:xfs-kern:27398a Signed-off-by: David Chinner Signed-off-by: Michal Piotrowski Signed-off-by: Tim Shimmin commit d716594813bf897e6f0255b03d0b43cc00f46c0c Author: David Chinner Date: Sat Nov 11 15:18:35 2006 +1100 [XFS] Prevent a deadlock when xfslogd unpins inodes. The previous fixes for the use after free in xfs_iunpin left a nasty log deadlock when xfslogd unpinned the inode and dropped the last reference to the inode. the ->clear_inode() method can issue transactions, and if the log was full, the transaction could push on the log and get stuck trying to push the inode it was currently unpinning. To fix this, we provide xfs_iunpin a guarantee that it will always have a valid xfs_inode <-> linux inode link or a particular flag will be set on the inode. We then use log forces during lookup to ensure transactions are completed before we recycle the inode. This ensures that xfs_iunpin will never use the linux inode after it is being freed, and any lookup on an inode on the reclaim list will wait until it is safe to attach a new linux inode to the xfs inode. 
SGI-PV: 956832 SGI-Modid: xfs-linux-melb:xfs-kern:27359a Signed-off-by: David Chinner Signed-off-by: Shailendra Tripathi Signed-off-by: Takenori Nagano Signed-off-by: Tim Shimmin commit 283e9cb79215cc159386daba72363469640a4ddf Author: David Chinner Date: Sat Nov 11 15:07:41 2006 +1100 [XFS] Clean up i_flags and i_flags_lock handling. SGI-PV: 956832 SGI-Modid: xfs-linux-melb:xfs-kern:27358a Signed-off-by: David Chinner Signed-off-by: Nathan Scott Signed-off-by: Tim Shimmin commit 4debb1aff0c7a37a37a1fa200958d5b89668e474 Author: Vlad Apostolov Date: Sat Nov 11 15:07:34 2006 +1100 [XFS] 956664: dm_read_invis() changes i_atime SGI-PV: 956664 SGI-Modid: xfs-linux-melb:xfs-kern:27315a Signed-off-by: Vlad Apostolov Signed-off-by: Sam Vaughan Signed-off-by: Tim Shimmin commit 9945a9f7fa5b394ea46ffb7da12846418bc418a9 Author: Vlad Apostolov Date: Sat Nov 11 15:07:28 2006 +1100 [XFS] rename uio_read() to xfs_uio_read() SGI-PV: 957004 SGI-Modid: xfs-linux-melb:xfs-kern:27231a Signed-off-by: Vlad Apostolov Signed-off-by: Tim Shimmin commit f8bfdc2384a51054d50a1fe3564b39c0f13327a6 Author: Tim Shimmin Date: Sat Nov 11 15:07:19 2006 +1100 [XFS] Keep lockdep happy. 
SGI-PV: 956964 SGI-Modid: xfs-linux-melb:xfs-kern:27200a Signed-off-by: Tim Shimmin Signed-off-by: David Chinner Signed-off-by: Eric Sandeen commit f534f2a21d5b2979913bdbae05e74a1d015a62ef Author: Vlad Apostolov Date: Sat Nov 11 14:17:37 2006 +1100 [XFS] 956618: Linux crashes on boot with XFS-DMAPI filesystem when CONFIG_XFS_TRACE is on SGI-PV: 956618 SGI-Modid: xfs-linux-melb:xfs-kern:27196a Signed-off-by: Vlad Apostolov Signed-off-by: Tim Shimmin Thanks, --Tim From owner-xfs@oss.sgi.com Fri Nov 10 22:55:25 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 22:55:33 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAB6tMaG000965 for ; Fri, 10 Nov 2006 22:55:24 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA29172; Sat, 11 Nov 2006 17:54:29 +1100 Date: Sat, 11 Nov 2006 16:56:05 +1000 From: Timothy Shimmin To: torvalds@osdl.org cc: akpm@osdl.org, xfs@oss.sgi.com Subject: Re: XFS update for 2.6.19 Message-ID: <7434A759AFB4695DB1CB3A53@timothy-shimmins-power-mac-g5.local> In-Reply-To: <76708453565F2C45356A563E@timothy-shimmins-power-mac-g5.local> References: <76708453565F2C45356A563E@timothy-shimmins-power-mac-g5.local> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9592 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 4865 Lines: 151 Just noticed a Makefile change from the 1st commit was missing. I'll git reset hard and redo the commits. Will send another email when ok. 
--Tim --On 11 November 2006 4:34:22 PM +1000 Timothy Shimmin wrote: > Hi Linus, > > Please pull from: > git://oss.sgi.com:8090/xfs/xfs-2.6 > > It contains some bug fixes (notably inode iunpin fix), > a couple of patches Andrew Morton was carrying and > a few other fixups/support. > > This will update the following files: > > fs/xfs/linux-2.6/xfs_ioctl.c | 5 +-- > fs/xfs/linux-2.6/xfs_super.c | 4 ++- > fs/xfs/support/move.c | 2 + > fs/xfs/support/move.h | 2 + > fs/xfs/xfs_dir2.c | 2 + > fs/xfs/xfs_iget.c | 51 +++++++++------------------------ > fs/xfs/xfs_inode.c | 64 ++++++++++++++++++++++++------------------ > fs/xfs/xfs_inode.h | 41 --------------------------- > fs/xfs/xfs_vnodeops.c | 33 +++++++++------------- > 9 files changed, 72 insertions(+), 132 deletions(-) > > through these commits: > > commit de5ab811cd0fd738d78c24b970337121ea4c1e07 > Author: David Chinner > Date: Sat Nov 11 15:23:35 2006 +1100 > > [XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h > > SGI-PV: 957005 > SGI-Modid: xfs-linux-melb:xfs-kern:27398a > > Signed-off-by: David Chinner > Signed-off-by: Michal Piotrowski > Signed-off-by: Tim Shimmin > > > commit d716594813bf897e6f0255b03d0b43cc00f46c0c > Author: David Chinner > Date: Sat Nov 11 15:18:35 2006 +1100 > > [XFS] Prevent a deadlock when xfslogd unpins inodes. > > The previous fixes for the use after free in xfs_iunpin left a nasty log > deadlock when xfslogd unpinned the inode and dropped the last reference to > the inode. the ->clear_inode() method can issue transactions, and if the > log was full, the transaction could push on the log and get stuck trying > to push the inode it was currently unpinning. > > To fix this, we provide xfs_iunpin a guarantee that it will always have a > valid xfs_inode <-> linux inode link or a particular flag will be set on > the inode. We then use log forces during lookup to ensure transactions are > completed before we recycle the inode. 
This ensures that xfs_iunpin will > never use the linux inode after it is being freed, and any lookup on an > inode on the reclaim list will wait until it is safe to attach a new linux > inode to the xfs inode. > > SGI-PV: 956832 > SGI-Modid: xfs-linux-melb:xfs-kern:27359a > > Signed-off-by: David Chinner > Signed-off-by: Shailendra Tripathi > Signed-off-by: Takenori Nagano > Signed-off-by: Tim Shimmin > > > commit 283e9cb79215cc159386daba72363469640a4ddf > Author: David Chinner > Date: Sat Nov 11 15:07:41 2006 +1100 > > [XFS] Clean up i_flags and i_flags_lock handling. > > SGI-PV: 956832 > SGI-Modid: xfs-linux-melb:xfs-kern:27358a > > Signed-off-by: David Chinner > Signed-off-by: Nathan Scott > Signed-off-by: Tim Shimmin > > > commit 4debb1aff0c7a37a37a1fa200958d5b89668e474 > Author: Vlad Apostolov > Date: Sat Nov 11 15:07:34 2006 +1100 > > [XFS] 956664: dm_read_invis() changes i_atime > > SGI-PV: 956664 > SGI-Modid: xfs-linux-melb:xfs-kern:27315a > > Signed-off-by: Vlad Apostolov > Signed-off-by: Sam Vaughan > Signed-off-by: Tim Shimmin > > > commit 9945a9f7fa5b394ea46ffb7da12846418bc418a9 > Author: Vlad Apostolov > Date: Sat Nov 11 15:07:28 2006 +1100 > > [XFS] rename uio_read() to xfs_uio_read() > > SGI-PV: 957004 > SGI-Modid: xfs-linux-melb:xfs-kern:27231a > > Signed-off-by: Vlad Apostolov > Signed-off-by: Tim Shimmin > > > commit f8bfdc2384a51054d50a1fe3564b39c0f13327a6 > Author: Tim Shimmin > Date: Sat Nov 11 15:07:19 2006 +1100 > > [XFS] Keep lockdep happy. 
> > SGI-PV: 956964 > SGI-Modid: xfs-linux-melb:xfs-kern:27200a > > Signed-off-by: Tim Shimmin > Signed-off-by: David Chinner > Signed-off-by: Eric Sandeen > > > commit f534f2a21d5b2979913bdbae05e74a1d015a62ef > Author: Vlad Apostolov > Date: Sat Nov 11 14:17:37 2006 +1100 > > [XFS] 956618: Linux crashes on boot with XFS-DMAPI filesystem when > CONFIG_XFS_TRACE is on > > SGI-PV: 956618 > SGI-Modid: xfs-linux-melb:xfs-kern:27196a > > Signed-off-by: Vlad Apostolov > Signed-off-by: Tim Shimmin > > Thanks, > --Tim > > From owner-xfs@oss.sgi.com Fri Nov 10 23:07:43 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 23:07:50 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAB77gaG002105 for ; Fri, 10 Nov 2006 23:07:43 -0800 X-ASG-Debug-ID: 1163227981-23681-90-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from smtp.osdl.org (smtp.osdl.org [65.172.181.4]) by cuda.sgi.com (Spam Firewall) with ESMTP id 91981D1CB31C for ; Fri, 10 Nov 2006 22:53:01 -0800 (PST) Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id kAB6qvoZ031275 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 10 Nov 2006 22:52:58 -0800 Received: from box (shell0.pdx.osdl.net [10.9.0.31]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with SMTP id kAB6qvtV020976; Fri, 10 Nov 2006 22:52:57 -0800 Date: Fri, 10 Nov 2006 22:52:57 -0800 From: Andrew Morton To: "Igor A. 
Valcov" Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: XFS filesystem performance drop in kernels 2.6.16+ Subject: Re: XFS filesystem performance drop in kernels 2.6.16+ Message-Id: <20061110225257.63f91851.akpm@osdl.org> In-Reply-To: References: <4553F3C6.2030807@sandeen.net> X-Mailer: Sylpheed version 2.2.7 (GTK+ 2.8.17; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.156 $ X-Scanned-By: MIMEDefang 2.36 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25667 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9593 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: xfs Content-Length: 700 Lines: 22 On Fri, 10 Nov 2006 16:16:27 +0300 "Igor A. Valcov" wrote: > Below is a simplified version of the test program, Boy, I hope not. The results of this test program are of very little interest. > for (i = 0; i < 262144; i++) { > /* Write data to a big file */ > write (nFiles [0], buf, __BYTES); > > /* Write data to small files */ > for (f = 1; f < __FILES; f++) > write (nFiles [f], &f, sizeof (f)); > } This sits in a loop doing write(fd, buf, 4). This is wildly inefficient - you'd get a 10x throughput benefit and maybe 100x reduction in CPU cost simply by switching to fwrite(). I suspect something went wrong here. 
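The point made above, that a loop of 4-byte write() calls pays the system call overhead on every record, can be illustrated with a minimal C sketch. This is not the original test program: the helper names, file paths and record counts are made up for illustration, and the sketch only shows that the two call patterns produce identical files. It makes no timing claim.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write `count` 4-byte records with one write(2)
 * system call per record, as the quoted inner loop does.
 * Returns 0 on success, -1 on error. */
static int write_records_unbuffered(const char *path, int count)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    for (int i = 0; i < count; i++) {
        if (write(fd, &i, sizeof(i)) != (ssize_t)sizeof(i)) {
            close(fd);
            return -1;
        }
    }
    return close(fd);
}

/* Hypothetical helper: write the same records through stdio. fwrite(3)
 * collects them in a user-space buffer, so the library issues far
 * fewer write(2) calls than there are records. */
static int write_records_buffered(const char *path, int count)
{
    FILE *fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    for (int i = 0; i < count; i++) {
        if (fwrite(&i, sizeof(i), 1, fp) != 1) {
            fclose(fp);
            return -1;
        }
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Returns 1 if both files have identical contents, 0 if they differ,
 * -1 if either file cannot be opened. */
static int files_identical(const char *a, const char *b)
{
    FILE *fa = fopen(a, "rb");
    FILE *fb = fopen(b, "rb");
    int same = -1;

    if (fa != NULL && fb != NULL) {
        int ca, cb;
        do {
            ca = fgetc(fa);
            cb = fgetc(fb);
        } while (ca == cb && ca != EOF);
        same = (ca == cb);
    }
    if (fa != NULL)
        fclose(fa);
    if (fb != NULL)
        fclose(fb);
    return same;
}
```

Calling each helper from a small main() and running it under `strace -c` should show one write(2) per record for the first helper and far fewer for the second. Any actual speedup depends on the kernel, filesystem and libc in use.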
From owner-xfs@oss.sgi.com Fri Nov 10 23:53:45 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 10 Nov 2006 23:53:52 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAB7rfaG005352 for ; Fri, 10 Nov 2006 23:53:43 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA00026; Sat, 11 Nov 2006 18:52:45 +1100 Date: Sat, 11 Nov 2006 17:54:21 +1000 From: Timothy Shimmin To: torvalds@osdl.org cc: akpm@osdl.org, xfs@oss.sgi.com Subject: XFS Update for 2.6.19 - take 2 Message-ID: X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9594 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 4838 Lines: 147 Hi Linus, I'll try again :) (Got the makefile changes this time and fixed up the diffstat below hopefully) Please pull from: git://oss.sgi.com:8090/xfs/xfs-2.6 It contains some bug fixes (notably inode iunpin fix), a couple of patches Andrew Morton was carrying and a few other fixups/support. 
This will update the following files: fs/xfs/Makefile-linux-2.6 | 17 +--------- fs/xfs/linux-2.6/xfs_buf.c | 4 +- fs/xfs/linux-2.6/xfs_dmapi_priv.h | 28 ++++++++++++++++ fs/xfs/linux-2.6/xfs_ioctl.c | 5 ++- fs/xfs/linux-2.6/xfs_super.c | 4 +- fs/xfs/support/debug.c | 4 +- fs/xfs/support/move.c | 2 + fs/xfs/support/move.h | 2 + fs/xfs/xfs.h | 23 +++++++++++++ fs/xfs/xfs_dir2.c | 2 + fs/xfs/xfs_dmapi.h | 22 +------------ fs/xfs/xfs_iget.c | 51 +++++++++++++++++++++-------- fs/xfs/xfs_inode.c | 64 ++++++++++++++++--------------------- fs/xfs/xfs_inode.h | 41 ++++++++++++++++++++++++ fs/xfs/xfs_vnodeops.c | 33 +++++++++++-------- 15 files changed, 189 insertions(+), 113 deletions(-) through these commits: commit 93c189c1148a5e39bcc8f62568f42a77f93477c5 Author: Vlad Apostolov Date: Sat Nov 11 18:03:49 2006 +1100 [XFS] 956618: Linux crashes on boot with XFS-DMAPI filesystem when CONFIG_XFS_TRACE is on SGI-PV: 956618 SGI-Modid: xfs-linux-melb:xfs-kern:27196a Signed-off-by: Vlad Apostolov Signed-off-by: Tim Shimmin commit 439b8434792d0b62e32ab1416f214a18a640cc03 Author: Tim Shimmin Date: Sat Nov 11 18:04:34 2006 +1100 [XFS] Keep lockdep happy. 
SGI-PV: 956964 SGI-Modid: xfs-linux-melb:xfs-kern:27200a Signed-off-by: Tim Shimmin Signed-off-by: David Chinner Signed-off-by: Eric Sandeen commit 70a505285f9859f77e07f7c12371b0d29ecf3d82 Author: Vlad Apostolov Date: Sat Nov 11 18:04:41 2006 +1100 [XFS] rename uio_read() to xfs_uio_read() SGI-PV: 957004 SGI-Modid: xfs-linux-melb:xfs-kern:27231a Signed-off-by: Vlad Apostolov Signed-off-by: Tim Shimmin commit 2e2e7bb1fd857b9fc83b0cd77b6b647ebb423301 Author: Vlad Apostolov Date: Sat Nov 11 18:04:47 2006 +1100 [XFS] 956664: dm_read_invis() changes i_atime SGI-PV: 956664 SGI-Modid: xfs-linux-melb:xfs-kern:27315a Signed-off-by: Vlad Apostolov Signed-off-by: Sam Vaughan Signed-off-by: Tim Shimmin commit 7a18c386078eaf17ae54595f66c0d64d9c1cb29c Author: David Chinner Date: Sat Nov 11 18:04:54 2006 +1100 [XFS] Clean up i_flags and i_flags_lock handling. SGI-PV: 956832 SGI-Modid: xfs-linux-melb:xfs-kern:27358a Signed-off-by: David Chinner Signed-off-by: Nathan Scott Signed-off-by: Tim Shimmin commit 4c60658e0f4e253cf275f12b7c76bf128515a774 Author: David Chinner Date: Sat Nov 11 18:05:00 2006 +1100 [XFS] Prevent a deadlock when xfslogd unpins inodes. The previous fixes for the use after free in xfs_iunpin left a nasty log deadlock when xfslogd unpinned the inode and dropped the last reference to the inode. the ->clear_inode() method can issue transactions, and if the log was full, the transaction could push on the log and get stuck trying to push the inode it was currently unpinning. To fix this, we provide xfs_iunpin a guarantee that it will always have a valid xfs_inode <-> linux inode link or a particular flag will be set on the inode. We then use log forces during lookup to ensure transactions are completed before we recycle the inode. This ensures that xfs_iunpin will never use the linux inode after it is being freed, and any lookup on an inode on the reclaim list will wait until it is safe to attach a new linux inode to the xfs inode. 
SGI-PV: 956832 SGI-Modid: xfs-linux-melb:xfs-kern:27359a Signed-off-by: David Chinner Signed-off-by: Shailendra Tripathi Signed-off-by: Takenori Nagano Signed-off-by: Tim Shimmin commit 050e714eb2bc662e9df6bf048ce86b4fbdd9bcd3 Author: David Chinner Date: Sat Nov 11 18:05:06 2006 +1100 [XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h SGI-PV: 957005 SGI-Modid: xfs-linux-melb:xfs-kern:27398a Signed-off-by: David Chinner Signed-off-by: Michal Piotrowski Signed-off-by: Tim Shimmin --Tim From owner-xfs@oss.sgi.com Sat Nov 11 02:58:42 2006 Received: with ECARTIS (v1.0.0; list xfs); Sat, 11 Nov 2006 02:58:50 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kABAwfaG024501 for ; Sat, 11 Nov 2006 02:58:42 -0800 X-ASG-Debug-ID: 1163242663-11852-324-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from emailer.gwdg.de (emailer.gwdg.de [134.76.10.24]) by cuda.sgi.com (Spam Firewall) with ESMTP id 7AD5A51737E for ; Sat, 11 Nov 2006 02:57:44 -0800 (PST) Received: from linux01.gwdg.de ([134.76.13.21]) by mailer.gwdg.de with esmtps (TLSv1:DES-CBC3-SHA:168) (Exim 4.60) (envelope-from ) id 1GiqY9-00001w-V2; Sat, 11 Nov 2006 11:57:26 +0100 Received: from linux01.gwdg.de (localhost [127.0.0.1]) by linux01.gwdg.de (8.13.3/8.13.3/SuSE Linux 0.7) with ESMTP id kABAttnf006291; Sat, 11 Nov 2006 11:55:57 +0100 Received: from localhost (jengelh@localhost) by linux01.gwdg.de (8.13.3/8.13.3/Submit) with ESMTP id kABAtso8006284; Sat, 11 Nov 2006 11:55:54 +0100 Date: Sat, 11 Nov 2006 11:55:53 +0100 (MET) From: Jan Engelhardt To: Andrew Morton cc: "Igor A. 
Valcov" , linux-kernel@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: XFS filesystem performance drop in kernels 2.6.16+ Subject: Re: XFS filesystem performance drop in kernels 2.6.16+ In-Reply-To: <20061110225257.63f91851.akpm@osdl.org> Message-ID: References: <4553F3C6.2030807@sandeen.net> <20061110225257.63f91851.akpm@osdl.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25681 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9595 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jengelh@linux01.gwdg.de Precedence: bulk X-list: xfs Content-Length: 749 Lines: 27 >> for (i = 0; i < 262144; i++) { >> /* Write data to a big file */ >> write (nFiles [0], buf, __BYTES); >> >> /* Write data to small files */ >> for (f = 1; f < __FILES; f++) >> write (nFiles [f], &f, sizeof (f)); >> } > >This sits in a loop doing write(fd, buf, 4). This is wildly inefficient - >you'd get a 10x throughput benefit and maybe 100x reduction in CPU cost >simply by switching to fwrite(). Well yes and no. The problem here is the syscall overhead. fwrite buffers things, so needless syscalls are avoided. The same could be done by changing the program logic and increasing the size argument to read/write. >I suspect something went wrong here. Design error. 
:) -`J' -- From owner-xfs@oss.sgi.com Sat Nov 11 03:09:19 2006 Received: with ECARTIS (v1.0.0; list xfs); Sat, 11 Nov 2006 03:09:27 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kABB9IaG025713 for ; Sat, 11 Nov 2006 03:09:19 -0800 X-ASG-Debug-ID: 1163242590-11896-301-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by cuda.sgi.com (Spam Firewall) with ESMTP id D4443515DA2 for ; Sat, 11 Nov 2006 02:56:31 -0800 (PST) Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1GiqXD-0001Cp-2K; Sat, 11 Nov 2006 10:56:27 +0000 Date: Sat, 11 Nov 2006 10:56:27 +0000 From: Christoph Hellwig To: Eric Sandeen Cc: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] remove unused filp from ioctl functions Subject: Re: [PATCH] remove unused filp from ioctl functions Message-ID: <20061111105627.GB3356@infradead.org> References: <4555437B.4040904@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4555437B.4040904@sandeen.net> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25681 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9596 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 318 Lines: 7 On Fri, Nov 10, 2006 at 09:28:59PM -0600, Eric Sandeen wrote: > There's a struct file * passed
around the ioctl code that is never > used... just takes up precious stack space near as I can tell :) While you're at it kill the inode parameter anyway; it can be retrieved from the vnode by trivial address arithmetic. From owner-xfs@oss.sgi.com Sat Nov 11 09:13:25 2006 Received: with ECARTIS (v1.0.0; list xfs); Sat, 11 Nov 2006 09:13:32 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kABHDOaG031710 for ; Sat, 11 Nov 2006 09:13:25 -0800 X-ASG-Debug-ID: 1163265156-20206-472-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id 38A8B517FDC for ; Sat, 11 Nov 2006 09:12:37 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id E98E018CF23E6; Sat, 11 Nov 2006 11:12:35 -0600 (CST) Message-ID: <45560483.5000303@sandeen.net> Date: Sat, 11 Nov 2006 11:12:35 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] remove unused filp from ioctl functions Subject: Re: [PATCH] remove unused filp from ioctl functions References: <4555437B.4040904@sandeen.net> <20061111105627.GB3356@infradead.org> In-Reply-To: <20061111105627.GB3356@infradead.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25705 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9599 X-ecartis-version: 
Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 400 Lines: 12 Christoph Hellwig wrote: > On Fri, Nov 10, 2006 at 09:28:59PM -0600, Eric Sandeen wrote: >> There's a stuct file * passed around the ioctl code that is never >> used... just takes up precious stack space near as I can tell :) > > While you're at it kill the inode paramater anyway, it can be retrieved > from the vnode by trivial address arithmetics. Hadn't noticed, let me look into that. -Eric From owner-xfs@oss.sgi.com Sun Nov 12 15:03:58 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 12 Nov 2006 15:04:07 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kACN3taG025962 for ; Sun, 12 Nov 2006 15:03:57 -0800 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA09444; Mon, 13 Nov 2006 10:02:57 +1100 Message-ID: <4557A823.6070609@sgi.com> Date: Mon, 13 Nov 2006 10:02:59 +1100 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.7 (X11/20060909) MIME-Version: 1.0 To: Timothy Shimmin CC: xfs-dev , xfs@oss.sgi.com Subject: Re: XFS Update for 2.6.19 - take 2 References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9604 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 3933 Lines: 131 I don't know what I have done wrong but some of the patches bellow signed by me a actually patches from the comunity that I just have verified and checked in our XFS source management system. 
I guess I didn't clearly identify the real authors of the fixes
(although I think I put their names in when I checked in the patches).

Vlad

Timothy Shimmin wrote:
>
>
> commit 93c189c1148a5e39bcc8f62568f42a77f93477c5
> Author: Vlad Apostolov
> Date: Sat Nov 11 18:03:49 2006 +1100
>
>     [XFS] 956618: Linux crashes on boot with XFS-DMAPI filesystem when
>     CONFIG_XFS_TRACE is on
>
>     SGI-PV: 956618
>     SGI-Modid: xfs-linux-melb:xfs-kern:27196a
>
>     Signed-off-by: Vlad Apostolov
>     Signed-off-by: Tim Shimmin
>
>
> commit 439b8434792d0b62e32ab1416f214a18a640cc03
> Author: Tim Shimmin
> Date: Sat Nov 11 18:04:34 2006 +1100
>
>     [XFS] Keep lockdep happy.
>
>     SGI-PV: 956964
>     SGI-Modid: xfs-linux-melb:xfs-kern:27200a
>
>     Signed-off-by: Tim Shimmin
>     Signed-off-by: David Chinner
>     Signed-off-by: Eric Sandeen
>
>
> commit 70a505285f9859f77e07f7c12371b0d29ecf3d82
> Author: Vlad Apostolov
> Date: Sat Nov 11 18:04:41 2006 +1100
>
>     [XFS] rename uio_read() to xfs_uio_read()
>
>     SGI-PV: 957004
>     SGI-Modid: xfs-linux-melb:xfs-kern:27231a
>
>     Signed-off-by: Vlad Apostolov
>     Signed-off-by: Tim Shimmin
>
>
> commit 2e2e7bb1fd857b9fc83b0cd77b6b647ebb423301
> Author: Vlad Apostolov
> Date: Sat Nov 11 18:04:47 2006 +1100
>
>     [XFS] 956664: dm_read_invis() changes i_atime
>
>     SGI-PV: 956664
>     SGI-Modid: xfs-linux-melb:xfs-kern:27315a
>
>     Signed-off-by: Vlad Apostolov
>     Signed-off-by: Sam Vaughan
>     Signed-off-by: Tim Shimmin
>
>
> commit 7a18c386078eaf17ae54595f66c0d64d9c1cb29c
> Author: David Chinner
> Date: Sat Nov 11 18:04:54 2006 +1100
>
>     [XFS] Clean up i_flags and i_flags_lock handling.
>
>     SGI-PV: 956832
>     SGI-Modid: xfs-linux-melb:xfs-kern:27358a
>
>     Signed-off-by: David Chinner
>     Signed-off-by: Nathan Scott
>     Signed-off-by: Tim Shimmin
>
>
> commit 4c60658e0f4e253cf275f12b7c76bf128515a774
> Author: David Chinner
> Date: Sat Nov 11 18:05:00 2006 +1100
>
>     [XFS] Prevent a deadlock when xfslogd unpins inodes.
> > The previous fixes for the use after free in xfs_iunpin left a nasty log > deadlock when xfslogd unpinned the inode and dropped the last > reference to > the inode. the ->clear_inode() method can issue transactions, and if the > log was full, the transaction could push on the log and get stuck trying > to push the inode it was currently unpinning. > > To fix this, we provide xfs_iunpin a guarantee that it will always have a > valid xfs_inode <-> linux inode link or a particular flag will be set on > the inode. We then use log forces during lookup to ensure transactions > are > completed before we recycle the inode. This ensures that xfs_iunpin will > never use the linux inode after it is being freed, and any lookup on an > inode on the reclaim list will wait until it is safe to attach a new > linux > inode to the xfs inode. > > SGI-PV: 956832 > SGI-Modid: xfs-linux-melb:xfs-kern:27359a > > Signed-off-by: David Chinner > Signed-off-by: Shailendra Tripathi > Signed-off-by: Takenori Nagano > Signed-off-by: Tim Shimmin > > > commit 050e714eb2bc662e9df6bf048ce86b4fbdd9bcd3 > Author: David Chinner > Date: Sat Nov 11 18:05:06 2006 +1100 > > [XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h > > SGI-PV: 957005 > SGI-Modid: xfs-linux-melb:xfs-kern:27398a > > Signed-off-by: David Chinner > Signed-off-by: Michal Piotrowski > Signed-off-by: Tim Shimmin > > --Tim > > From owner-xfs@oss.sgi.com Sun Nov 12 17:12:29 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 12 Nov 2006 17:12:36 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAD1CQaG009191 for ; Sun, 12 Nov 2006 17:12:27 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA12262; Mon, 13 Nov 2006 12:11:29 +1100 Date: Mon, 13 Nov 2006 11:13:12 +1000 From: Timothy Shimmin To: Vlad Apostolov cc: 
xfs-dev , xfs@oss.sgi.com Subject: Re: XFS Update for 2.6.19 - signed-off-by issues Message-ID: <8E9FE4A0A565B567E6AA0D02@timothy-shimmins-power-mac-g5.local> In-Reply-To: <4557A823.6070609@sgi.com> References: <4557A823.6070609@sgi.com> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9605 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 5615 Lines: 170 Hi Vlad, --On 13 November 2006 10:02:59 AM +1100 Vlad Apostolov wrote: > I don't know what I have done wrong but some of the patches bellow signed > by me a actually patches from the comunity that I just have verified and > checked in our XFS source management system. I guess, I didn't clearly identified > the real authors of the fixes (although I think I put their names when I checked in > the patch). > We'll address this going forward (since Vlad's patch commits were rather simple) and hopefully get it right next time :) Unless anyone has a strong objection or some other problems come up. >From the outside contributors point of view, a signed-off-by line in the contributed patch is appreciated but not necessary as we (SGI) will/should do this. >From SGI engineering point of view: There were 2 problems with the sgi ptools checkin Vlad was referring to AFAICT: 1. If the original author should be the git author then we need a "Signed-off-by:" clause in the body of the ptools checkin description. This was missing in the checkin description here. This will be the SGI engineer's responsibility to check if there is one already or add one in. This way our scripts know who the intended git author was and not who the ptools author who checked in the fix was. 2. 
If we want reviewers including outside reviewers as signed off then the reviewers should be placed on separate lines in the ptools "Inspected by" fields and need to be added to our developers.pl database (I can do the last bit as I did this time for a few mods/commits). However, I'll update the script to handle comma or space separated reviewer names for the future (the mods in question had comma separated ones). If anyone has a better plan then let me know. Thanks. I hope this clarifies things a bit. --Tim The sgi-ptools to git converstion scripts > Vlad > > Timothy Shimmin wrote: >> >> >> commit 93c189c1148a5e39bcc8f62568f42a77f93477c5 >> Author: Vlad Apostolov >> Date: Sat Nov 11 18:03:49 2006 +1100 >> >> [XFS] 956618: Linux crashes on boot with XFS-DMAPI filesystem when >> CONFIG_XFS_TRACE is on >> >> SGI-PV: 956618 >> SGI-Modid: xfs-linux-melb:xfs-kern:27196a >> >> Signed-off-by: Vlad Apostolov >> Signed-off-by: Tim Shimmin >> >> >> commit 439b8434792d0b62e32ab1416f214a18a640cc03 >> Author: Tim Shimmin >> Date: Sat Nov 11 18:04:34 2006 +1100 >> >> [XFS] Keep lockdep happy. 
>> >> SGI-PV: 956964 >> SGI-Modid: xfs-linux-melb:xfs-kern:27200a >> >> Signed-off-by: Tim Shimmin >> Signed-off-by: David Chinner >> Signed-off-by: Eric Sandeen >> >> >> commit 70a505285f9859f77e07f7c12371b0d29ecf3d82 >> Author: Vlad Apostolov >> Date: Sat Nov 11 18:04:41 2006 +1100 >> >> [XFS] rename uio_read() to xfs_uio_read() >> >> SGI-PV: 957004 >> SGI-Modid: xfs-linux-melb:xfs-kern:27231a >> >> Signed-off-by: Vlad Apostolov >> Signed-off-by: Tim Shimmin >> >> >> commit 2e2e7bb1fd857b9fc83b0cd77b6b647ebb423301 >> Author: Vlad Apostolov >> Date: Sat Nov 11 18:04:47 2006 +1100 >> >> [XFS] 956664: dm_read_invis() changes i_atime >> >> SGI-PV: 956664 >> SGI-Modid: xfs-linux-melb:xfs-kern:27315a >> >> Signed-off-by: Vlad Apostolov >> Signed-off-by: Sam Vaughan >> Signed-off-by: Tim Shimmin >> >> >> commit 7a18c386078eaf17ae54595f66c0d64d9c1cb29c >> Author: David Chinner >> Date: Sat Nov 11 18:04:54 2006 +1100 >> >> [XFS] Clean up i_flags and i_flags_lock handling. >> >> SGI-PV: 956832 >> SGI-Modid: xfs-linux-melb:xfs-kern:27358a >> >> Signed-off-by: David Chinner >> Signed-off-by: Nathan Scott >> Signed-off-by: Tim Shimmin >> >> >> commit 4c60658e0f4e253cf275f12b7c76bf128515a774 >> Author: David Chinner >> Date: Sat Nov 11 18:05:00 2006 +1100 >> >> [XFS] Prevent a deadlock when xfslogd unpins inodes. >> >> The previous fixes for the use after free in xfs_iunpin left a nasty log >> deadlock when xfslogd unpinned the inode and dropped the last >> reference to >> the inode. the ->clear_inode() method can issue transactions, and if the >> log was full, the transaction could push on the log and get stuck trying >> to push the inode it was currently unpinning. >> >> To fix this, we provide xfs_iunpin a guarantee that it will always have a >> valid xfs_inode <-> linux inode link or a particular flag will be set on >> the inode. We then use log forces during lookup to ensure transactions >> are >> completed before we recycle the inode. 
This ensures that xfs_iunpin will >> never use the linux inode after it is being freed, and any lookup on an >> inode on the reclaim list will wait until it is safe to attach a new >> linux >> inode to the xfs inode. >> >> SGI-PV: 956832 >> SGI-Modid: xfs-linux-melb:xfs-kern:27359a >> >> Signed-off-by: David Chinner >> Signed-off-by: Shailendra Tripathi >> Signed-off-by: Takenori Nagano >> Signed-off-by: Tim Shimmin >> >> >> commit 050e714eb2bc662e9df6bf048ce86b4fbdd9bcd3 >> Author: David Chinner >> Date: Sat Nov 11 18:05:06 2006 +1100 >> >> [XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h >> >> SGI-PV: 957005 >> SGI-Modid: xfs-linux-melb:xfs-kern:27398a >> >> Signed-off-by: David Chinner >> Signed-off-by: Michal Piotrowski >> Signed-off-by: Tim Shimmin >> >> --Tim >> >> From owner-xfs@oss.sgi.com Sun Nov 12 17:51:02 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 12 Nov 2006 17:51:10 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAD1p1aG012469 for ; Sun, 12 Nov 2006 17:51:02 -0800 X-ASG-Debug-ID: 1163381619-26315-778-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mailgate.mysql.com (mailgate-out2.mysql.com [213.136.52.68]) by cuda.sgi.com (Spam Firewall) with ESMTP id 64256D1D00E0 for ; Sun, 12 Nov 2006 17:33:39 -0800 (PST) Received: from localhost (localhost.localdomain [127.0.0.1]) by mailgate.mysql.com (8.13.4/8.13.4) with ESMTP id kAD1Xbtd004709 for ; Mon, 13 Nov 2006 02:33:37 +0100 Received: from mail.mysql.com ([10.222.1.99]) by localhost (mailgate.mysql.com [10.222.1.98]) (amavisd-new, port 10026) with LMTP id 22554-06 for ; Mon, 13 Nov 2006 02:33:37 +0100 (CET) Received: from [192.168.1.100] (ppp163-199.static.internode.on.net [150.101.163.199]) (authenticated bits=0) by mail.mysql.com (8.13.3/8.13.3) with ESMTP id kAD1XNiT032097 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO) for ; Mon, 13 Nov 2006 02:33:31 +0100 
X-ASG-Orig-Subj: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads Subject: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads From: Stewart Smith To: xfs@oss.sgi.com Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-yf2iUCshWZ67mA/P20R0" Organization: MySQL AB Date: Mon, 13 Nov 2006 12:33:22 +1100 Message-Id: <1163381602.11914.10.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25835 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9606 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stewart@mysql.com Precedence: bulk X-list: xfs Content-Length: 2249 Lines: 64 --=-yf2iUCshWZ67mA/P20R0 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable I recently (finally) wrote my patch to use the xfsctl to get better allocation for NDB disk data files (datafiles and undofiles). patch at: http://lists.mysql.com/commits/15088 This actually ends up giving us a rather nice speed boost in some of the test suite runs. The problem is: - two cluster nodes on 1 host (in the case of the mysql-test-run script) - each node has a complete copy of the database - ALTER TABLESPACE ADD DATAFILE / ALTER LOGFILEGROUP ADD UNDOFILE creates files on *both* nodes. We want to zero these out. - files are opened with O_SYNC (IIRC) The patch I committed uses XFS_IOC_RESVSP64 to allocate (unwritten) extents and then posix_fallocate to zero out the file (the glibc implementation of this call just writes zeros out). Now, ideally it would be beneficial (and probably faster) to have XFS do this in kernel. 
Asynchronously would be pretty cool too.. but hey :) The reason we don't want unwritten extents is that NDB has some realtime properties, and futzing about with extents and the like in the FS during transactions isn't such a good idea. So, this would lead me to try XFS_IOC_ALLOCSP64 - which doesn't have the "unwritten extents" warning that RESVSP64 does. However, with the two processes writing the files out, I get heavy fragmentation. Even with a RESVSP followed by ALLOCSP I get the same result. So it seems that ALLOCSP re-allocates extents (even if it doesn't have to) and really doesn't give you much (didn't do too much timing to see if it was any quicker). Is this expected behaviour? (it wasn't for me) --=20 Stewart Smith, Software Engineer MySQL AB, www.mysql.com Office: +14082136540 Ext: 6616 VoIP: 6616@sip.us.mysql.com Mobile: +61 4 3 8844 332 Jumpstart your cluster: http://www.mysql.com/consulting/packaged/cluster.html --=-yf2iUCshWZ67mA/P20R0 Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) iD8DBQBFV8tiKglWCUL+FDoRAl49AJ9H5/9h+KJmLLMeY0HBmxH89cRb5gCgo2ej IK6ccyggyyTBs7vVBroRbFo= =UFYp -----END PGP SIGNATURE----- --=-yf2iUCshWZ67mA/P20R0-- From owner-xfs@oss.sgi.com Sun Nov 12 20:11:08 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 12 Nov 2006 20:11:16 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAD4B6aG024479 for ; Sun, 12 Nov 2006 20:11:07 -0800 X-ASG-Debug-ID: 1163391018-26327-428-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mailgate.mysql.com (mailgate-out2.mysql.com [213.136.52.68]) by cuda.sgi.com (Spam Firewall) with ESMTP id EB01751A283 for ; Sun, 12 Nov 2006 20:10:18 -0800 (PST) Received: from localhost (localhost.localdomain [127.0.0.1]) by mailgate.mysql.com (8.13.4/8.13.4) with ESMTP id 
kAD49Q77003056; Mon, 13 Nov 2006 05:09:26 +0100 Received: from mail.mysql.com ([10.222.1.99]) by localhost (mailgate.mysql.com [10.222.1.98]) (amavisd-new, port 10026) with LMTP id 00971-03; Mon, 13 Nov 2006 05:09:25 +0100 (CET) Received: from [192.168.1.100] (ppp163-199.static.internode.on.net [150.101.163.199]) (authenticated bits=0) by mail.mysql.com (8.13.3/8.13.3) with ESMTP id kAD497Kg022268 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Mon, 13 Nov 2006 05:09:17 +0100 X-ASG-Orig-Subj: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads From: Stewart Smith To: Sam Vaughan Cc: xfs@oss.sgi.com In-Reply-To: <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> References: <1163381602.11914.10.camel@localhost.localdomain> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-rOZV1sXgWb3qfQz5RbKS" Organization: MySQL AB Date: Mon, 13 Nov 2006 15:09:02 +1100 Message-Id: <1163390942.14517.12.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1 X-Barracuda-Spam-Score: 0.50 X-Barracuda-Spam-Status: No, SCORE=0.50 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=BSF_RULE7568M X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25845 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.50 BSF_RULE7568M BODY: Custom Rule 7568M X-archive-position: 9607 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stewart@mysql.com Precedence: bulk X-list: xfs Content-Length: 2858 Lines: 78 --=-rOZV1sXgWb3qfQz5RbKS Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Mon, 2006-11-13 at 13:58 +1100, Sam Vaughan wrote: > Are the two processes in your test writing files to the 
same
> directory as each other? If so then their allocations will go into
> the same AG as the directory by default, hence the fragmentation. If
> you can limit yourself to an AG's worth of data per directory then
> you should be able to avoid fragmentation using the default
> allocator. If you need to reserve more than that per AG, then the
> files will most likely start interleaving again once they spill out
> of their original AGs. If that's the case then the upcoming
> filestreams allocator may be your best bet.

I do predict that the filestreams allocator will be useful for us (and
also on my MythTV box...).

The two processes write to their own directories.

The structure of the "filesystem" for the process (ndbd) is:

ndb_1_fs/ (the 1 refers to node id, so there is a ndb_2_fs for a 2 node
setup)
    D8/, D9/, D10/, D11/
        all have a DBLQH subdirectory. In here there are several
        S0.FragLog files (the number changes). These are 16MB
        files used for logging.
        We (currently) don't do any xfsctl allocation on these.
        We should though. In fact, we're writing them in a way
        to get holes (which probably affects performance).
        These files are write only (except during a full cluster
        restart - a very rare event).

    LCP/0/T0F0.Data
        (there is at least 0,1,2 for that first number,
        T0 is table 0 - can be thousands of tables.
        f0 is fragment 0, can be a few of them too, typically
        2-4 though)
        These are an on-disk copy of in-memory tables, variably
        sized files (as big or as small as tables in a DB)
        The above log files are for changes occurring during the
        writing of these files.

    datafile01.dat, undofile01.dat etc
        whatever files the user creates for disk based tables
        the datafiles and undofiles that I've done the special
        allocation for.
        Typical deployments will have anything from a few
        hundred MB per file to a few GB to many many GB.
"typical" installations are probably now evenly split between 1 process per physical machine and several (usually 2).=20 --=20 Stewart Smith, Software Engineer MySQL AB, www.mysql.com Office: +14082136540 Ext: 6616 VoIP: 6616@sip.us.mysql.com Mobile: +61 4 3 8844 332 Jumpstart your cluster: http://www.mysql.com/consulting/packaged/cluster.html --=-rOZV1sXgWb3qfQz5RbKS Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) iD8DBQBFV+/eKglWCUL+FDoRAgeaAJ9VyoAYPbdCbkiqDla2XjAAFkAQOACdHuCG XvoepUZ5I/+6U2xy2FgCNRs= =VKoX -----END PGP SIGNATURE----- --=-rOZV1sXgWb3qfQz5RbKS-- From owner-xfs@oss.sgi.com Sun Nov 12 20:52:12 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 12 Nov 2006 20:52:19 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAD4q8aG032719 for ; Sun, 12 Nov 2006 20:52:10 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA17739; Mon, 13 Nov 2006 15:51:15 +1100 Received: from [134.14.55.100] (cxfsmac10.melbourne.sgi.com [134.14.55.100]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAD4pE7Y32517879; Mon, 13 Nov 2006 15:51:15 +1100 (AEDT) In-Reply-To: <1163390942.14517.12.camel@localhost.localdomain> References: <1163381602.11914.10.camel@localhost.localdomain> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> <1163390942.14517.12.camel@localhost.localdomain> Mime-Version: 1.0 (Apple Message framework v752.2) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <12275452-56ED-4921-899F-EFF1C05B251A@sgi.com> Cc: xfs@oss.sgi.com Content-Transfer-Encoding: 7bit From: Sam Vaughan Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads Date: Mon, 13 Nov 
2006 15:53:54 +1100 To: Stewart Smith X-Mailer: Apple Mail (2.752.2) X-archive-position: 9608 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sjv@sgi.com Precedence: bulk X-list: xfs Content-Length: 7519 Lines: 190 On 13/11/2006, at 3:09 PM, Stewart Smith wrote: > On Mon, 2006-11-13 at 13:58 +1100, Sam Vaughan wrote: >> Are the two processes in your test writing files to the same >> directory as each other? If so then their allocations will go into >> the same AG as the directory by default, hence the fragmentation. If >> you can limit yourself to an AG's worth of data per directory then >> you should be able to avoid fragmentation using the default >> allocator. If you need to reserve more than that per AG, then the >> files will most likely start interleaving again once they spill out >> of their original AGs. If that's the case then the upcoming >> filestreams allocator may be your best bet. > > I do predict that the filestreams allocator will be useful for us (and > also on my MythTV box...). > > The two processes write to their own directories. > > The structure of the "filesystem" for the process (ndbd) is: > > ndb_1_fs/ (the 1 refers to node id, so there is a ndb_2_fs for a 2 > node > setup) > D8/, D9/, D10/, D11/ > all have a DBLQH subdirectory. In here there are several > S0.FragLog files (the number changes). These are 16MB > files used for logging. > We (currently) don't do any xfsctl allocation on these. > We should though. In fact, we're writing them in a way > to get holes (which probably affects performance). > These files are write only (except during a full cluster > restart - a very rare event). > > LCP/0/T0F0.Data > (there is at least 0,1,2 for that first number, > T0 is table 0 - can be thousands of tables. 
>         f0 is fragment 0, can be a few of them too, typically
>         2-4 though)
>         These are an on-disk copy of in-memory tables, variably
>         sized files (as big or as small as tables in a DB)
>         The above log files are for changes occurring during the
>         writing of these files.
>
>     datafile01.dat, undofile01.dat etc
>         whatever files the user creates for disk based tables
>         the datafiles and undofiles that I've done the special
>         allocation for.
>         Typical deployments will have anything from a few
>         hundred MB per file to a few GB to many many GB.
>
> "typical" installations are probably now evenly split between 1 process
> per physical machine and several (usually 2).

Just to be clear, are we talking about intra-file fragmentation, i.e.
file data laid out discontiguously on disk, or inter-file
fragmentation where each file is contiguous on disk but the files
from different processes are getting interleaved? Also, are there
just a couple of user data files, each of them potentially much
larger than the size of an AG, or do you split the data up into many
files, e.g. datafile01.dat ... datafile99.dat ...?

If you have the flexibility to break the data up at arbitrary points
into separate files, you could get optimal allocation behaviour by
starting a new directory as soon as the files in the current one are
large enough to fill an AG. The problem with the filestreams
allocator is that it will only dedicate an AG to a directory for a
fixed and short period of time after the last file was written to
it. This works well to limit the resource drain on AGs when running
file-per-frame video captures, but not so well with a database that
writes its data in a far less regimented and timely way.

The following two tests illustrate the standard allocation policy I'm
referring to here. I've simplified it to take advantage of the fact
that it's producing just one extent per file, but you can run
`xfs_bmap -v` over all the files to verify that's the case.

Standard SLES 10 kernel, standard mount options:

$ uname -r
2.6.16.21-0.8-smp
$ xfs_info .
meta-data=/dev/sdb8              isize=256    agcount=16, agsize=3267720 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=52283520, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=25529, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
$ mount | grep sdb8
/dev/sdb8 on /spare200 type xfs (rw)
$

Create two directories and start two processes off, one per
directory. The processes preallocate ten 100MB files each. The
result is that their data goes into separate AGs on disk, all nicely
contiguous:

$ mkdir a b
$ for dir in a b; do
>     for file in `seq 0 9`; do
>         touch $dir/$file
>         xfs_io -c 'allocsp 100m 0' $dir/$file
>     done &
> done; wait
[1] 5649
[2] 5650
$ for file in `seq 0 9`; do
>     bmap_a=`xfs_bmap -v a/$file | tail -1`
>     bmap_b=`xfs_bmap -v b/$file | tail -1`
>     ag_a=`echo $bmap_a | awk '{print $4}'`
>     ag_b=`echo $bmap_b | awk '{print $4}'`
>     br_a=`echo $bmap_a | awk '{printf "%-18s", $3}'`
>     br_b=`echo $bmap_b | awk '{printf "%-18s", $3}'`
>     echo a/$file: $ag_a "$br_a" b/$file: $ag_b "$br_b"
> done
a/0: 8 209338416..209543215 b/0: 9 235275936..235480735
a/1: 8 209543216..209748015 b/1: 9 235480736..235685535
a/2: 8 209748016..209952815 b/2: 9 235685536..235890335
a/3: 8 209952816..210157615 b/3: 9 235890336..236095135
a/4: 8 210157616..210362415 b/4: 9 236095136..236299935
a/5: 8 210362416..210567215 b/5: 9 236299936..236504735
a/6: 8 210567216..210772015 b/6: 9 236504736..236709535
a/7: 8 210772016..210976815 b/7: 9 236709536..236914335
a/8: 8 210976816..211181615 b/8: 9 236914336..237119135
a/9: 8 211181616..211386415 b/9: 9 237119136..237323935
$

Now do the same thing, except have the processes write their files
into the same directory using different file names. This time the
files are allocated on top of each other.

$ dir=c
$ mkdir $dir
$ for process in 1 2; do
>     for file in `seq 0 9`; do
>         touch $dir/$process.$file
>         xfs_io -c 'allocsp 100m 0' $dir/$process.$file
>     done &
> done; wait
[1] 5985
[2] 5986
$ for file in c/*; do
>     bmap=`xfs_bmap -v $file | tail -1`
>     ag=`echo $bmap | awk '{print $4}'`
>     br=`echo $bmap | awk '{printf "%-18s", $3}'`
>     echo $file: $ag "$br"
> done
c/1.0: 11 287559456..287764255
c/1.1: 11 287969056..288173855
c/1.2: 11 288378656..288583455
c/1.3: 11 288788256..288993055
c/1.4: 11 289197856..289402655
c/1.5: 11 289607456..289812255
c/1.6: 11 290017056..290221855
c/1.7: 11 290426656..290631455
c/1.8: 11 290836264..291041063
c/1.9: 11 291450664..291655463
c/2.0: 11 287764256..287969055
c/2.1: 11 288173856..288378655
c/2.2: 11 288583456..288788255
c/2.3: 11 288993056..289197855
c/2.4: 11 289402656..289607455
c/2.5: 11 289812256..290017055
c/2.6: 11 290221856..290426655
c/2.7: 11 290631464..290836263
c/2.8: 11 291041064..291245863
c/2.9: 11 291245864..291450663
$

Now in your case you're using different directories, so your files
are probably OK at the start of day. Once the AGs they start in fill
up though, the files for both processes will start getting allocated
from the next available AG. At that point, allocations that started
out looking like the first test above will end up looking like the
second.

The filestreams allocator will stop this from happening for
applications that write data regularly like video ingest servers, but
I wouldn't expect it to be a cure-all for your database app because
your writes could have large delays between them. Instead, I'd look
into ways to break up your data into AG-sized chunks, starting a new
directory every time you go over that magic size.
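
A rough sketch of what that "magic size" works out to on the test filesystem above, assuming the agsize and bsize values from the xfs_info output and ignoring space already taken by metadata, the internal log, or other files (so the real figure will be somewhat lower). The variable names are just for illustration:

```shell
# Values copied from the xfs_info output for /dev/sdb8 shown earlier.
agsize_blks=3267720   # agsize, in filesystem blocks
bsize=4096            # data block size, in bytes
file_mib=100          # size of each preallocated test file (xfs_io '100m')

# Work in KiB first so the intermediate value stays small.
ag_kib=$((agsize_blks * (bsize / 1024)))
ag_mib=$((ag_kib / 1024))

# Upper bound on how many such files one directory can hold before its
# allocations spill into another AG.
files_per_ag=$((ag_mib / file_mib))

echo "AG size: ${ag_mib} MiB, room for roughly ${files_per_ag} files of ${file_mib} MiB"
# prints: AG size: 12764 MiB, room for roughly 127 files of 100 MiB
```

So on this layout a directory would need to be retired well before it reaches about 12 GiB of preallocated data, which is the point at which the first test starts to look like the second.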
Sam From owner-xfs@oss.sgi.com Sun Nov 12 21:21:52 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 12 Nov 2006 21:22:05 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAD5LpaG002986 for ; Sun, 12 Nov 2006 21:21:52 -0800 X-ASG-Debug-ID: 1163395261-32229-140-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mailgate.mysql.com (mailgate-out2.mysql.com [213.136.52.68]) by cuda.sgi.com (Spam Firewall) with ESMTP id 542F33E9E8C; Sun, 12 Nov 2006 21:21:01 -0800 (PST) Received: from localhost (localhost.localdomain [127.0.0.1]) by mailgate.mysql.com (8.13.4/8.13.4) with ESMTP id kAD5KwSj003308; Mon, 13 Nov 2006 06:20:58 +0100 Received: from mail.mysql.com ([10.222.1.99]) by localhost (mailgate.mysql.com [10.222.1.98]) (amavisd-new, port 10026) with LMTP id 26488-04; Mon, 13 Nov 2006 06:20:57 +0100 (CET) Received: from [192.168.1.100] (ppp163-199.static.internode.on.net [150.101.163.199]) (authenticated bits=0) by mail.mysql.com (8.13.3/8.13.3) with ESMTP id kAD5Kpci001890 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Mon, 13 Nov 2006 06:20:54 +0100 X-ASG-Orig-Subj: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads From: Stewart Smith To: Sam Vaughan Cc: xfs@oss.sgi.com In-Reply-To: <12275452-56ED-4921-899F-EFF1C05B251A@sgi.com> References: <1163381602.11914.10.camel@localhost.localdomain> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> <1163390942.14517.12.camel@localhost.localdomain> <12275452-56ED-4921-899F-EFF1C05B251A@sgi.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-Sxd6HlS88lQbZ5tP45OJ" Organization: MySQL AB Date: Mon, 13 Nov 2006 16:20:50 +1100 Message-Id: <1163395250.14517.38.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1 X-Barracuda-Spam-Score: 0.00 
X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=
X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25853
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
X-archive-position: 9609
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: stewart@mysql.com
Precedence: bulk
X-list: xfs
Content-Length: 5396
Lines: 128

--=-Sxd6HlS88lQbZ5tP45OJ
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

On Mon, 2006-11-13 at 15:53 +1100, Sam Vaughan wrote:
> Just to be clear, are we talking about intra-file fragmentation, i.e.
> file data laid out discontiguously on disk, or inter-file
> fragmentation where each file is contiguous on disk but the files
> from different processes are getting interleaved?  Also, are there
> just a couple of user data files, each of them potentially much
> larger than the size of an AG, or do you split the data up into many
> files, e.g. datafile01.dat ... datafile99.dat ...?

an example:

/home/mysql/cluster/ndb_1_fs/datafile1.dat:
 EXT: FILE-OFFSET      BLOCK-RANGE          AG AG-OFFSET            TOTAL
   0: [0..63]:         32862376..32862439    8 (1405096..1405159)      64
   1: [64..127]:       32875992..32876055    8 (1418712..1418775)      64
   2: [128..191]:      33040112..33040175    8 (1582832..1582895)      64
   3: [192..255]:      33080136..33080199    8 (1622856..1622919)      64
   4: [256..319]:      33101416..33101479    8 (1644136..1644199)      64
   5: [320..383]:      33112624..33112687    8 (1655344..1655407)      64
   6: [384..447]:      32526608..32526671    8 (1069328..1069391)      64
   7: [448..511]:      31678920..31678983    8 (221640..221703)        64

/home/mysql/cluster/ndb_2_fs/datafile1.dat:
 EXT: FILE-OFFSET      BLOCK-RANGE          AG AG-OFFSET            TOTAL
   0: [0..63]:         32864704..32864767    8 (1407424..1407487)      64
   1: [64..127]:       32888544..32888607    8 (1431264..1431327)      64
   2: [128..191]:      33068832..33068895    8 (1611552..1611615)      64
   3: [192..255]:      33101168..33101231    8 (1643888..1643951)      64
   4: [256..319]:      33101656..33101719    8 (1644376..1644439)      64
   5: [320..383]:      33115784..33115847    8 (1658504..1658567)      64
   6: [384..447]:      33897200..33897263    8 (2439920..2439983)      64
   7: [448..511]:      33900896..33900959    8 (2443616..2443679)      64

on this fs:

         isize=256    agcount=32, agsize=491520 blks
         =            sectsz=512   attr=0
data     =            bsize=4096   blocks=15728640, imaxpct=25
         =            sunit=0      swidth=0 blks, unwritten=1
naming   =version 2   bsize=4096
log      =internal    bsize=4096   blocks=3840, version=1
         =            sectsz=512   sunit=0 blks
realtime =none        extsz=65536  blocks=0, rtextents=0

(somewhere between 5-15Gb free from this create IIRC)

these datafiles are fixed size, allocated by user. a DBA would run from
the SQL server something like:

CREATE TABLESPACE ts1
  ADD DATAFILE 'datafile.dat'
  USE LOGFILE GROUP lg1
  INITIAL_SIZE 1G
  ENGINE NDB;

to get a tablespace with 1GB data file (on each node).

we currently don't do any automatic extending.

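For scale, the numbers in the xfs_bmap and geometry output above can be checked with a little shell arithmetic. This is a sketch added for illustration, not part of the original message; the variable names are made up. It relies on xfs_bmap reporting offsets and block ranges in 512-byte units, while agsize is counted in filesystem blocks of bsize bytes.

```shell
#!/bin/sh
# Geometry copied from the output above.
agsize_blocks=491520    # agsize, in filesystem blocks
bsize=4096              # data bsize, in bytes

# Size of one allocation group.
ag_bytes=$((agsize_blocks * bsize))
ag_mib=$((ag_bytes / 1024 / 1024))

# Each extent listed above covers 64 units of 512 bytes.
extent_units=64
extent_kib=$((extent_units * 512 / 1024))

# Start of AG 8 in 512-byte units, and the AG-relative offset of the
# first extent of ndb_1_fs/datafile1.dat (BLOCK-RANGE 32862376..).
ag8_start=$((8 * agsize_blocks * (bsize / 512)))
first_offset=$((32862376 - ag8_start))

echo "AG size:            $ag_mib MiB"
echo "extent size:        $extent_kib KiB"
echo "AG 8 starts at:     $ag8_start"
echo "first AG-OFFSET:    $first_offset"
```

The computed AG-relative offset agrees with the AG-OFFSET column for extent 0, which is a quick sanity check on the units. The extents shown are 32 KiB each, against an AG of a little under 2 GiB.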
> If you have the flexibility to break the data up at arbitrary points
> into separate files, you could get optimal allocation behaviour by
> starting a new directory as soon as the files in the current one are
> large enough to fill an AG.  The problem with the filestreams
> allocator is that it will only dedicate an AG to a directory for a
> fixed and short period of time after the last file was written to
> it.  This works well to limit the resource drain on AGs when running
> file-per-frame video captures, but not so well with a database that
> writes its data in a far less regimented and timely way.

for the data and undo files, we're just not changing their size except
at creation time, so that's okay.

> Now in your case you're using different directories, so your files
> are probably OK at the start of day.  Once the AGs they start in fill
> up though, the files for both processes will start getting allocated
> from the next available AG.  At that point, allocations that started
> out looking like the first test above will end up looking like the
> second.
>
> The filestreams allocator will stop this from happening for
> applications that write data regularly like video ingest servers, but
> I wouldn't expect it to be a cure-all for your database app because
> your writes could have large delays between them.  Instead, I'd look
> into ways to break up your data into AG-sized chunks, starting a new
> directory every time you go over that magic size.

I'll have to check our writing behaviour for the files that change
sizes... but they're not too much of an issue (they're hardly ever read
back, so as long as writing them out is okay and reading isn't totally
abysmal, we don't have to worry).

--=20 Stewart Smith, Software Engineer MySQL AB, www.mysql.com Office: +14082136540 Ext: 6616 VoIP: 6616@sip.us.mysql.com Mobile: +61 4 3 8844 332 Jumpstart your cluster: http://www.mysql.com/consulting/packaged/cluster.html --=-Sxd6HlS88lQbZ5tP45OJ Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) iD8DBQBFWACxKglWCUL+FDoRAvMvAJ9xrLPWxGzuAk02gt2TwJu11pDUYwCbBWl8 in+PlEfZYHPHBODVw5yL1S0= =qt5j -----END PGP SIGNATURE----- --=-Sxd6HlS88lQbZ5tP45OJ-- From owner-xfs@oss.sgi.com Mon Nov 13 08:09:09 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 08:09:18 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kADG98aG021477 for ; Mon, 13 Nov 2006 08:09:09 -0800 X-ASG-Debug-ID: 1163432913-17390-369-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from moving-picture.com (mpc-26.sohonet.co.uk [193.203.82.251]) by cuda.sgi.com (Spam Firewall) with ESMTP id C81694A5B5D for ; Mon, 13 Nov 2006 07:48:33 -0800 (PST) Received: from minion.mpc.local ([172.16.11.112] helo=moving-picture.com) by moving-picture.com with esmtp (Exim 4.43) id 1Gje2h-0004bg-0C for xfs@oss.sgi.com; Mon, 13 Nov 2006 15:48:15 +0000 Message-ID: <455893BE.4010208@moving-picture.com> Date: Mon, 13 Nov 2006 15:48:14 +0000 From: James Pearson User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040524 X-Accept-Language: en-us, en MIME-Version: 1.0 To: xfs@oss.sgi.com X-ASG-Orig-Subj: Update for the RHEL4 XFS module? Subject: Update for the RHEL4 XFS module? Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Disclaimer: This email and any attachments are confidential, may be legally X-Disclaimer: privileged and intended solely for the use of addressee. 
If you X-Disclaimer: are not the intended recipient of this message, any disclosure, X-Disclaimer: copying, distribution or any action taken in reliance on it is X-Disclaimer: strictly prohibited and may be unlawful. If you have received X-Disclaimer: this message in error, please notify the sender and delete all X-Disclaimer: copies from your system. X-Disclaimer: X-Disclaimer: Email may be susceptible to data corruption, interception and X-Disclaimer: unauthorised amendment, and we do not accept liability for any X-Disclaimer: such corruption, interception or amendment or the consequences X-Disclaimer: thereof. X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25893 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9618 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: james-p@moving-picture.com Precedence: bulk X-list: xfs Content-Length: 216 Lines: 10 The RHEL4 XFS module code at is now about a year old. Is there any chance that it could be updated with something based on more recent code? 
Thanks James Pearson From owner-xfs@oss.sgi.com Mon Nov 13 08:15:54 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 08:16:03 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kADGFraG022699 for ; Mon, 13 Nov 2006 08:15:54 -0800 X-ASG-Debug-ID: 1163434504-20618-587-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id 954FF51C2B6 for ; Mon, 13 Nov 2006 08:15:04 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 7E987187B8C1D; Mon, 13 Nov 2006 10:15:03 -0600 (CST) Message-ID: <45589A06.7040503@sandeen.net> Date: Mon, 13 Nov 2006 10:15:02 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: James Pearson CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Update for the RHEL4 XFS module? Subject: Re: Update for the RHEL4 XFS module? References: <455893BE.4010208@moving-picture.com> In-Reply-To: <455893BE.4010208@moving-picture.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25893 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9619 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 726 Lines: 26 James Pearson wrote: > The RHEL4 XFS module code at > is now about a year old. 
> > Is there any chance that it could be updated with something based on > more recent code? > > Thanks > > James Pearson > > Funny you should ask, I just started looking at this last night :) It was originally based on the SLES9 xfs code; my plan is to simply update it to the latest sles9 xfs codebase, as that has been tended to with a goal of stability by the fine folks at sgi.... so, no bleeding-edge xfs for now (I don't particularly want to backport 2.6.18 xfs code to 2.6.9....) BTW I have test RHEL5 xfs rpms too, in very test-y state: http://sandeen.net/rhel5_xfs/ -Eric From owner-xfs@oss.sgi.com Mon Nov 13 08:35:33 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 08:35:42 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kADGZWaG029179 for ; Mon, 13 Nov 2006 08:35:33 -0800 X-ASG-Debug-ID: 1163434214-20847-547-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from houinet7.hou.moc.com (houinet7.hou.moc.com [192.70.218.25]) by cuda.sgi.com (Spam Firewall) with ESMTP id 9648A51B210 for ; Mon, 13 Nov 2006 08:10:15 -0800 (PST) Received: from pnors230.mgroupnet.com ([89.32.18.248]) by houinet7.hou.moc.com with Microsoft SMTPSVC(6.0.3790.1830); Mon, 13 Nov 2006 10:10:11 -0600 Received: from pnors220.mgroupnet.com ([89.32.18.245]) by pnors230.mgroupnet.com with Microsoft SMTPSVC(6.0.3790.1830); Mon, 13 Nov 2006 10:10:11 -0600 Received: from 89.60.92.102 ([89.60.92.102]) by pnors220.mgroupnet.com ([89.32.18.245]) with Microsoft Exchange Server HTTP-DAV ; Mon, 13 Nov 2006 16:10:11 +0000 Received: from houuc8 by pnors220.mgroupnet.com; 13 Nov 2006 10:10:10 -0600 X-ASG-Orig-Subj: RHEL 4 Compatible Kernel Module Code Subject: RHEL 4 Compatible Kernel Module Code From: "Stephen C. 
Rigler" To: xfs@oss.sgi.com Content-Type: multipart/mixed; boundary="=-8t7RM/+IJRZy4HtfBske" Date: Mon, 13 Nov 2006 10:10:10 -0600 Message-Id: <1163434210.25484.14.camel@houuc8> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-27.rhel4.6) X-OriginalArrivalTime: 13 Nov 2006 16:10:11.0180 (UTC) FILETIME=[32036AC0:01C7073E] X-Barracuda-Spam-Score: 1.53 X-Barracuda-Spam-Status: No, SCORE=1.53 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=RCVD_NUMERIC_HELO X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25893 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 1.53 RCVD_NUMERIC_HELO Received: contains an IP address used for HELO X-archive-position: 9620 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: srigler@marathonoil.com Precedence: bulk X-list: xfs Content-Length: 6128 Lines: 141 --=-8t7RM/+IJRZy4HtfBske Content-Type: text/plain Content-Transfer-Encoding: 7bit Greetings, We are using CentOS 4.4 along with the RHEL 4 compatible kernel modules (downloadable here: http://mirror.centos.org/centos/4.4/centosplus/x86_64/RPMS/). According to the CentOS mailing list, the person at SGI who had been backporting the xfs code for RHEL4/CentOS4 has left the company. Are there any plans to continue this work? It seems like we are getting bit by this bug: http://oss.sgi.com/bugzilla/show_bug.cgi?id=410 but it doesn't look like the fix has been backported to the RHEL4 kernel module. 
Thanks, Steve --=-8t7RM/+IJRZy4HtfBske Content-Disposition: inline Content-Description: Forwarded message - Re: [CentOS] Re: XFS Issues Content-Type: message/rfc822 Received: from pnors230.mgroupnet.com ([89.32.18.248]) by pnors220.mgroupnet.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 8 Nov 2006 16:47:32 -0600 Received: from fdyinet2.fin.moc.com ([89.2.42.65]) by pnors230.mgroupnet.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 8 Nov 2006 16:47:32 -0600 Received: from fdyinet2 ([65.219.124.100]) by fdyinet2.fin.moc.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 8 Nov 2006 17:47:29 -0500 Received: from plmler1.mail.eds.com (localhost [127.0.0.1]) by plmler1.mail.eds.com (8.13.8/8.12.10) with ESMTP id kA8Mkar7019397 for ; Wed, 8 Nov 2006 16:46:36 -0600 Received: from plmler1.mail.eds.com (localhost [127.0.0.1]) by plmler1.mail.eds.com (8.13.8/8.12.10) with ESMTP id kA8Mj85J014819 for ; Wed, 8 Nov 2006 16:45:08 -0600 Received: from mail.centos.org (mail.centos.org [72.21.40.12]) by plmler1.mail.eds.com (8.13.8/8.12.10) with ESMTP id kA8Mj88A014803 for ; Wed, 8 Nov 2006 16:45:08 -0600 X-EDS-Source-Ip: 72.21.40.12 X-EDS-Source-Name: mail.centos.org X-EDS-Reported-Name: mail.centos.org Received: from lists.centos.org (localhost.localdomain [127.0.0.1]) by mail.centos.org (Postfix) with ESMTP id E317BF3C443; Wed, 8 Nov 2006 22:44:54 +0000 (UTC) X-Original-To: centos@centos.org Delivered-To: centos@centos.org Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.184]) by mail.centos.org (Postfix) with ESMTP id B5D88F3C244 for ; Wed, 8 Nov 2006 22:44:50 +0000 (UTC) Received: by nf-out-0910.google.com with SMTP id n15so478150nfc for ; Wed, 08 Nov 2006 14:44:50 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:sender:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; 
b=Buu9Y3hRHhSe95zJw4sSTb3NpnVxBW/TY5VvHIluGSfodvFXuFCNV35rFHfgZV26jj9uBIAt2YnJifxqM7wSSHOwKCA5lUR3v5+TCThxPM5nA9pHB4V851BKIWMxBfYCzCSpzenaeRDuktyORajmEAzKGRvHbvUazbFZaJz3SPs= Received: by 10.49.3.10 with SMTP id f10mr2964749nfi.1163025889883; Wed, 08 Nov 2006 14:44:49 -0800 (PST) Received: by 10.48.212.2 with HTTP; Wed, 8 Nov 2006 14:44:49 -0800 (PST) Message-ID: Date: Wed, 8 Nov 2006 22:44:49 +0000 From: "James Pearson" To: "CentOS mailing list" Subject: Re: [CentOS] Re: XFS Issues In-Reply-To: <1163013603.25393.23.camel@houuc8> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <1163006260.25393.10.camel@houuc8> <1163013603.25393.23.camel@houuc8> X-Google-Sender-Auth: f30592bec6a19f7f X-BeenThere: centos@centos.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: CentOS mailing list List-Id: CentOS mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: centos-bounces@centos.org Errors-To: centos-bounces@centos.org X-Brightmail-Flag: NO X-Spam-Flag: NO X-Spam-Checker-Version: SpamAssassin 2.64 (2004-01-11) on plmler1.mail.eds.com X-Spam-Report: This mail has been scanned in order to control spam. The results are given below in the "Content Analysis" line. If the number of points is greater than required, then this mail is very probably spam. ---------------- Content Analysis: (0.0 points, 5.0 required) ---------------- X-Spam-Status: No, hits=0.0 required=5.0 tests=none X-Spam-Level: X-Envelope-Trace: plmler1.kA8Mj85J014819 X-Envelope-From: Return-Path: centos-bounces@centos.org X-OriginalArrivalTime: 08 Nov 2006 22:47:29.0293 (UTC) FILETIME=[DE920BD0:01C70387] On 08/11/06, Stephen C. Rigler wrote: > On Wed, 2006-11-08 at 11:06 -0800, Scott Silva wrote: > > > Is there any chance that the fix will make it into the centosplus > > > kernel-module-xfs? > > Why not try installing 2.6.9-42.0.3 and see if maybe it is fixed? 
> > > > I installed it on a different system and the modinfo for xfs still gives > the same information: > > description: SGI-XFS CVS-2004-10-17_05:00_UTC > > Has the module been updated since 2.6.9-34.0.1? The kernel-module-xfs code hasn't changed in a while - that date stamp is incorrect, but the code is more like a year old. Unfortunately, the person at SGI that packaged up the code in the XFS module for RHEL4/CentOS4 no longer works for SGI (I believe he now works for RedHat). So, I guess, unless someone else at SGI (or elsewhere) back ports the more recent XFS code to the RHEL4 kernel, you are out of luck ... Or you could use an up to date kernel.org kernel instead ... James Pearson _______________________________________________ CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos --=-8t7RM/+IJRZy4HtfBske-- From owner-xfs@oss.sgi.com Mon Nov 13 09:00:28 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 09:00:36 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kADH0QaG032654 for ; Mon, 13 Nov 2006 09:00:28 -0800 X-ASG-Debug-ID: 1163437169-30747-475-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from moving-picture.com (mpc-26.sohonet.co.uk [193.203.82.251]) by cuda.sgi.com (Spam Firewall) with ESMTP id E03D251BCCD for ; Mon, 13 Nov 2006 08:59:30 -0800 (PST) Received: from minion.mpc.local ([172.16.11.112] helo=moving-picture.com) by moving-picture.com with esmtp (Exim 4.43) id 1Gjf9D-00028a-1y; Mon, 13 Nov 2006 16:59:03 +0000 Message-ID: <4558A456.8070806@moving-picture.com> Date: Mon, 13 Nov 2006 16:59:02 +0000 From: James Pearson User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040524 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Eric Sandeen CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Update for the RHEL4 XFS module? Subject: Re: Update for the RHEL4 XFS module? 
References: <455893BE.4010208@moving-picture.com> <45589A06.7040503@sandeen.net> In-Reply-To: <45589A06.7040503@sandeen.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Disclaimer: This email and any attachments are confidential, may be legally X-Disclaimer: privileged and intended solely for the use of addressee. If you X-Disclaimer: are not the intended recipient of this message, any disclosure, X-Disclaimer: copying, distribution or any action taken in reliance on it is X-Disclaimer: strictly prohibited and may be unlawful. If you have received X-Disclaimer: this message in error, please notify the sender and delete all X-Disclaimer: copies from your system. X-Disclaimer: X-Disclaimer: Email may be susceptible to data corruption, interception and X-Disclaimer: unauthorised amendment, and we do not accept liability for any X-Disclaimer: such corruption, interception or amendment or the consequences X-Disclaimer: thereof. X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25897 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9622 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: james-p@moving-picture.com Precedence: bulk X-list: xfs Content-Length: 723 Lines: 27 Eric Sandeen wrote: > James Pearson wrote: > >> The RHEL4 XFS module code at >> is now about a year old. >> >> Is there any chance that it could be updated with something based on >> more recent code? 
>> >> Thanks >> >> James Pearson >> >> > > Funny you should ask, I just started looking at this last night :) > > It was originally based on the SLES9 xfs code; my plan is to simply > update it to the latest sles9 xfs codebase, as that has been tended to > with a goal of stability by the fine folks at sgi.... so, no > bleeding-edge xfs for now (I don't particularly want to backport 2.6.18 > xfs code to 2.6.9....) Thanks - that's good to know. James Pearson From owner-xfs@oss.sgi.com Mon Nov 13 09:45:31 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 09:45:38 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kADHjUaG005239 for ; Mon, 13 Nov 2006 09:45:31 -0800 X-ASG-Debug-ID: 1163439881-16586-239-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id 38DCA51C4BB for ; Mon, 13 Nov 2006 09:44:41 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kADHiWOJ013477; Mon, 13 Nov 2006 12:44:32 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kADHiW4O032294; Mon, 13 Nov 2006 12:44:32 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kADHiUx3001379; Mon, 13 Nov 2006 12:44:31 -0500 Message-ID: <4558AEFE.10108@sandeen.net> Date: Mon, 13 Nov 2006 11:44:30 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: geir.myrestrand@falconstor.com CC: James Pearson , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Update for the RHEL4 XFS module? Subject: Re: Update for the RHEL4 XFS module? 
References: <455893BE.4010208@moving-picture.com> <45589A06.7040503@sandeen.net> <4558AE11.2030108@falconstor.com> In-Reply-To: <4558AE11.2030108@falconstor.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25901 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9623 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 905 Lines: 26 Geir A. Myrestrand wrote: > Eric Sandeen wrote: >> James Pearson wrote: >>> The RHEL4 XFS module code at >>> is now about a year old. >>> >>> Is there any chance that it could be updated with something based on >>> more recent code? >>> >> Funny you should ask, I just started looking at this last night :) >> >> It was originally based on the SLES9 xfs code; my plan is to simply >> update it to the latest sles9 xfs codebase, as that has been tended to >> with a goal of stability by the fine folks at sgi.... so, no >> bleeding-edge xfs for now (I don't particularly want to backport 2.6.18 >> xfs code to 2.6.9....) >> > > Eric, please notify us via the list when it has been updated. > I would like this module updated too... Will do. Are there particular issues you're having, or is your code just feeling a bit old in general. 
;-) -Eric From owner-xfs@oss.sgi.com Mon Nov 13 10:07:48 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 10:07:56 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kADI7jaG007585 for ; Mon, 13 Nov 2006 10:07:47 -0800 X-ASG-Debug-ID: 1163439639-9625-790-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from evaldomino.Falconstor.com (mail1.falconstor.com [216.223.47.230]) by cuda.sgi.com (Spam Firewall) with ESMTP id CDD52D1D1CBA for ; Mon, 13 Nov 2006 09:40:39 -0800 (PST) Received: from [10.3.4.127] ([10.3.4.127]) by falconstormail.falconstor.net (Lotus Domino Release 5.0.11) with ESMTP id 2006111312335064:1173 ; Mon, 13 Nov 2006 12:33:50 -0500 Message-ID: <4558AE11.2030108@falconstor.com> Date: Mon, 13 Nov 2006 12:40:33 -0500 From: "Geir A. Myrestrand" Reply-To: geir.myrestrand@falconstor.com Organization: FalconStor Software, Inc. User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Eric Sandeen CC: James Pearson , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Update for the RHEL4 XFS module? Subject: Re: Update for the RHEL4 XFS module? 
References: <455893BE.4010208@moving-picture.com> <45589A06.7040503@sandeen.net> In-Reply-To: <45589A06.7040503@sandeen.net> X-MIMETrack: Itemize by SMTP Server on FalconstorMail/FalconStor(Release 5.0.11 |July 24, 2002) at 11/13/2006 12:33:50 PM, Serialize by Router on evaldomino/FalconStor(Release 5.0.11 |July 24, 2002) at 11/13/2006 12:41:28 PM, Serialize complete at 11/13/2006 12:41:28 PM Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=ISO-8859-1; format=flowed X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25899 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9625 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: geir.myrestrand@falconstor.com Precedence: bulk X-list: xfs Content-Length: 778 Lines: 27 Eric Sandeen wrote: > James Pearson wrote: >> The RHEL4 XFS module code at >> is now about a year old. >> >> Is there any chance that it could be updated with something based on >> more recent code? >> > > Funny you should ask, I just started looking at this last night :) > > It was originally based on the SLES9 xfs code; my plan is to simply > update it to the latest sles9 xfs codebase, as that has been tended to > with a goal of stability by the fine folks at sgi.... so, no > bleeding-edge xfs for now (I don't particularly want to backport 2.6.18 > xfs code to 2.6.9....) > Eric, please notify us via the list when it has been updated. I would like this module updated too... Thanks! -- Geir A. 
Myrestrand From owner-xfs@oss.sgi.com Mon Nov 13 10:07:45 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 10:07:52 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kADI7iaG007555 for ; Mon, 13 Nov 2006 10:07:45 -0800 X-ASG-Debug-ID: 1163440080-9649-972-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from evaldomino.Falconstor.com (mail1.falconstor.com [216.223.47.230]) by cuda.sgi.com (Spam Firewall) with ESMTP id E5A7CD1D2CDE for ; Mon, 13 Nov 2006 09:48:00 -0800 (PST) Received: from [10.3.4.127] ([10.3.4.127]) by falconstormail.falconstor.net (Lotus Domino Release 5.0.11) with ESMTP id 2006111312411071:1177 ; Mon, 13 Nov 2006 12:41:10 -0500 Message-ID: <4558AFC9.3010009@falconstor.com> Date: Mon, 13 Nov 2006 12:47:53 -0500 From: "Geir A. Myrestrand" Reply-To: geir.myrestrand@falconstor.com Organization: FalconStor Software, Inc. User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Eric Sandeen CC: James Pearson , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: Update for the RHEL4 XFS module? Subject: Re: Update for the RHEL4 XFS module? 
References: <455893BE.4010208@moving-picture.com> <45589A06.7040503@sandeen.net> <4558AE11.2030108@falconstor.com> <4558AEFE.10108@sandeen.net> In-Reply-To: <4558AEFE.10108@sandeen.net> X-MIMETrack: Itemize by SMTP Server on FalconstorMail/FalconStor(Release 5.0.11 |July 24, 2002) at 11/13/2006 12:41:10 PM, Serialize by Router on evaldomino/FalconStor(Release 5.0.11 |July 24, 2002) at 11/13/2006 12:48:49 PM, Serialize complete at 11/13/2006 12:48:49 PM Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=ISO-8859-1; format=flowed X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25899 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9624 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: geir.myrestrand@falconstor.com Precedence: bulk X-list: xfs Content-Length: 1075 Lines: 31 Eric Sandeen wrote: > Geir A. Myrestrand wrote: >> Eric Sandeen wrote: >>> James Pearson wrote: >>>> The RHEL4 XFS module code at >>>> is now about a year old. >>>> >>>> Is there any chance that it could be updated with something based on >>>> more recent code? >>>> >>> Funny you should ask, I just started looking at this last night :) >>> >>> It was originally based on the SLES9 xfs code; my plan is to simply >>> update it to the latest sles9 xfs codebase, as that has been tended to >>> with a goal of stability by the fine folks at sgi.... so, no >>> bleeding-edge xfs for now (I don't particularly want to backport 2.6.18 >>> xfs code to 2.6.9....) >>> >> Eric, please notify us via the list when it has been updated. >> I would like this module updated too... > > Will do. 
Are there particular issues you're having, or is your code > just feeling a bit old in general. ;-) No particular issue in my case, but there tend to be some bug fixes in any given 12-month time-span... ;-) -- Geir A. Myrestrand From owner-xfs@oss.sgi.com Mon Nov 13 10:10:47 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 10:10:54 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kADIAlaG008780 for ; Mon, 13 Nov 2006 10:10:47 -0800 X-ASG-Debug-ID: 1163441398-19523-261-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id 548D4D1D011F for ; Mon, 13 Nov 2006 10:09:58 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kADI9cjH028924; Mon, 13 Nov 2006 13:09:38 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kADI9XU1008726; Mon, 13 Nov 2006 13:09:33 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kADI9WJQ003525; Mon, 13 Nov 2006 13:09:33 -0500 Message-ID: <4558B4DC.3050406@sandeen.net> Date: Mon, 13 Nov 2006 12:09:32 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: "Stephen C. 
Rigler" CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RHEL 4 Compatible Kernel Module Code Subject: Re: RHEL 4 Compatible Kernel Module Code References: <1163434210.25484.14.camel@houuc8> In-Reply-To: <1163434210.25484.14.camel@houuc8> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25903 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9626 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1552 Lines: 37 Stephen C. Rigler wrote: > Greetings, > > We are using CentOS 4.4 along with the RHEL 4 compatible kernel modules > (downloadable here: > http://mirror.centos.org/centos/4.4/centosplus/x86_64/RPMS/). > > According to the CentOS mailing list, the person at SGI who had been > backporting the xfs code for RHEL4/CentOS4 has left the company. > > Are there any plans to continue this work? It seems like we are getting > bit by this bug: http://oss.sgi.com/bugzilla/show_bug.cgi?id=410 but it > doesn't look like the fix has been backported to the RHEL4 kernel > module. Hot topic today; see my reply on the centos list and other recent threads on this list :) I am planning to update the rpm package to include some bugfixes soon, but it may not help your problems. I had been tracking the sles9 xfs codebase as a fairly stable, bugfix-only xfs codebase for this era of kernels; at this point I don't -think- the extent changes you mentioned are in the sles9 codebase... sgi guys? 
It looks like you actually got a double-whammy; you probably have a very fragmented file, which caused a large memory allocation on read, which recursed into the filesystem, thereby blowing your stack (on x86_64!) near as I can tell. finding & defragging the fragmented source files may be your best bet for now. To avoid it in the future, perhaps you can use preallocation, if you have any control over the app which is writing these. For those interested, the original bug report was: http://lists.centos.org/pipermail/centos/2006-November/072221.html -Eric From owner-xfs@oss.sgi.com Mon Nov 13 16:02:32 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 16:02:40 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAE02SaG010567 for ; Mon, 13 Nov 2006 16:02:30 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA17053; Tue, 14 Nov 2006 11:01:36 +1100 Received: from [134.14.55.100] (cxfsmac10.melbourne.sgi.com [134.14.55.100]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAE01W7Y33835095; Tue, 14 Nov 2006 11:01:34 +1100 (AEDT) In-Reply-To: <1163395250.14517.38.camel@localhost.localdomain> References: <1163381602.11914.10.camel@localhost.localdomain> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> <1163390942.14517.12.camel@localhost.localdomain> <12275452-56ED-4921-899F-EFF1C05B251A@sgi.com> <1163395250.14517.38.camel@localhost.localdomain> Mime-Version: 1.0 (Apple Message framework v752.2) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <950D2C3E-11AE-4805-9286-65ECD880272D@sgi.com> Cc: xfs@oss.sgi.com Content-Transfer-Encoding: 7bit From: Sam Vaughan Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads Date: Tue, 14 Nov 2006 11:04:17 +1100 To: Stewart Smith X-Mailer: 
Apple Mail (2.752.2) X-archive-position: 9629 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sjv@sgi.com Precedence: bulk X-list: xfs Content-Length: 6546 Lines: 144 On 13/11/2006, at 4:20 PM, Stewart Smith wrote: > On Mon, 2006-11-13 at 15:53 +1100, Sam Vaughan wrote: >> Just to be clear, are we talking about intra-file fragmentation, i.e. >> file data laid out discontiguously on disk, or inter-file >> fragmentation where each file is continguous on disk but the files >> from different processes are getting interleaved? Also, are there >> just a couple of user data files, each of them potentially much >> larger than the size of an AG, or do you split the data up into many >> files, e.g. datafile01.dat ... datafile99.dat ...? > > an example: > > /home/mysql/cluster/ndb_1_fs/datafile1.dat: > EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL > 0: [0..63]: 32862376..32862439 8 (1405096..1405159) 64 > 1: [64..127]: 32875992..32876055 8 (1418712..1418775) 64 > 2: [128..191]: 33040112..33040175 8 (1582832..1582895) 64 > 3: [192..255]: 33080136..33080199 8 (1622856..1622919) 64 > 4: [256..319]: 33101416..33101479 8 (1644136..1644199) 64 > 5: [320..383]: 33112624..33112687 8 (1655344..1655407) 64 > 6: [384..447]: 32526608..32526671 8 (1069328..1069391) 64 > 7: [448..511]: 31678920..31678983 8 (221640..221703) 64 > /home/mysql/cluster/ndb_2_fs/datafile1.dat: > EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL > 0: [0..63]: 32864704..32864767 8 (1407424..1407487) 64 > 1: [64..127]: 32888544..32888607 8 (1431264..1431327) 64 > 2: [128..191]: 33068832..33068895 8 (1611552..1611615) 64 > 3: [192..255]: 33101168..33101231 8 (1643888..1643951) 64 > 4: [256..319]: 33101656..33101719 8 (1644376..1644439) 64 > 5: [320..383]: 33115784..33115847 8 (1658504..1658567) 64 > 6: [384..447]: 33897200..33897263 8 (2439920..2439983) 64 > 7: [448..511]: 33900896..33900959 8 (2443616..2443679) 64 Those extents are curiously 
uniform, all 32kB in size. The fact that both files' extents are in AG 8 suggests that the two directories ndb_1_fs and ndb_2_fs filled their original AGs and spilled out into other ones, which is when the interference would have started. Looking at the directory hierarchy in your last email, you might be better off if you could add another directory for the datafiles and undofiles to live in, so they don't end up sharing their AG with other stuff in their parent directory. > on this fs: > isize=256 agcount=32, agsize=491520 blks > = sectsz=512 attr=0 > data = bsize=4096 blocks=15728640, > imaxpct=25 > = sunit=0 swidth=0 blks, > unwritten=1 > naming =version 2 bsize=4096 > log =internal bsize=4096 blocks=3840, version=1 > = sectsz=512 sunit=0 blks > realtime =none extsz=65536 blocks=0, rtextents=0 OK, so you've got 32 2GB AGs, and the filesystem is much too small for the inode32 rotor to be involved. > (somewhere between 5-15Gb free from this create IIRC) > > these datafiles are fixed size, allocated by user. a DBA would run > from > the SQL server something like: > CREATE TABLESPACE ts1 > ADD DATAFILE 'datafile.dat' > USE LOGFILE GROUP lg1 > INITIAL_SIZE 1G > ENGINE NDB; > > to get a tablespace with 1GB data file (on each node). So your data file is half the size of an AG. That shouldn't be a problem but it'd be best to keep it to one or two of these files per directory if there's going to be much other concurrent allocation activity. > we currently don't do any automatic extending. > >> If you have the flexibility to break the data up at arbitrary points >> into separate files, you could get optimal allocation behaviour by >> starting a new directory as soon as the files in the current one are >> large enough to fill an AG. The problem with the filestreams >> allocator is that it will only dedicate an AG to a directory for a >> fixed and short period of time after the last file was written to >> it. 
This works well to limit the resource drain on AGs when running >> file-per-frame video captures, but not so well with a database that >> writes its data in a far less regimented and timely way. > > for the data and undo files, we're just not changing their size except > at creation time, so that's okay. I'd assumed that these files were being continually grown. If all this is happening at creation time then it shouldn't be too hard to make sure the files are cleanly allocated with just one extent. Does the following not work on your file system? $ touch a b $ for file in a b; do > xfs_io -c 'allocsp 1G 0' $file & > done; wait [1] 12312 [2] 12313 [1]- Done xfs_io -c 'allocsp 1G 0' $file [2]+ Done xfs_io -c 'allocsp 1G 0' $file $ xfs_bmap -v a b a: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL 0: [0..2097151]: 231732008..233829159 6 (11968856..14066007) 2097152 b: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL 0: [0..2097151]: 233829160..235926311 6 (14066008..16163159) 2097152 $ >> Now in your case you're using different directories, so your files >> are probably OK at the start of day. Once the AGs they start in fill >> up though, the files for both processes will start getting allocated >> from the next available AG. At that point, allocations that started >> out looking like the first test above will end up looking like the >> second. >> >> The filestreams allocator will stop this from happening for >> applications that write data regularly like video ingest servers, but >> I wouldn't expect it to be a cure-all for your database app because >> your writes could have large delays between them. Instead, I'd look >> into ways to break up your data into AG-sized chunks, starting a new >> directory every time you go over that magic size. > > I'll have to check our writing behaviour the files that change > sizes... 
> but they're not too much of an issue (they're hardly ever read > back, so > as long as writing them out is okay and reading isn't totally abismal, > we don't have to worry). That's handy. All in all it sounds like your requirements are very file system friendly in terms of getting optimum allocation. I'm not sure what could be causing all those 32kB extents. Sam From owner-xfs@oss.sgi.com Mon Nov 13 16:29:27 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 16:29:35 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAE0TOaG017321 for ; Mon, 13 Nov 2006 16:29:26 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA17646; Tue, 14 Nov 2006 11:28:33 +1100 Received: from [134.14.55.100] (cxfsmac10.melbourne.sgi.com [134.14.55.100]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAE0SU7Y33889313; Tue, 14 Nov 2006 11:28:31 +1100 (AEDT) In-Reply-To: <20061114002536.GA7846@tuatara.stupidest.org> References: <1163381602.11914.10.camel@localhost.localdomain> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> <1163390942.14517.12.camel@localhost.localdomain> <12275452-56ED-4921-899F-EFF1C05B251A@sgi.com> <1163395250.14517.38.camel@localhost.localdomain> <950D2C3E-11AE-4805-9286-65ECD880272D@sgi.com> <20061114002536.GA7846@tuatara.stupidest.org> Mime-Version: 1.0 (Apple Message framework v752.2) Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed Message-Id: <21DB0765-9CB9-4831-A099-A75C49696E28@sgi.com> Cc: Stewart Smith , xfs@oss.sgi.com Content-Transfer-Encoding: 7bit From: Sam Vaughan Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads Date: Tue, 14 Nov 2006 11:31:15 +1100 To: Chris Wedgwood X-Mailer: Apple Mail (2.752.2) X-archive-position: 9630 X-ecartis-version: Ecartis v1.0.0 Sender: 
xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sjv@sgi.com Precedence: bulk X-list: xfs Content-Length: 613 Lines: 18 On 14/11/2006, at 11:25 AM, Chris Wedgwood wrote: > On Tue, Nov 14, 2006 at 11:04:17AM +1100, Sam Vaughan wrote: > >> Those extents are curiously uniform, all 32kB in size. > > O_SYNC writes? I'm assuming from Stuart's original email that these files weren't written out with write(), but instead pre-allocated using allocsp: > So, this would lead me to try XFS_IOC_ALLOCSP64 - which doesn't > have the > "unwritten extents" warning that RESVSP64 does. However, with the two > processes writing the files out, I get heavy fragmentation. Even > with a > RESVSP followed by ALLOCSP I get the same result. From owner-xfs@oss.sgi.com Mon Nov 13 16:35:59 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 16:36:07 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAE0ZuaG018141 for ; Mon, 13 Nov 2006 16:35:58 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA17929; Tue, 14 Nov 2006 11:35:05 +1100 Received: from [134.14.55.100] (cxfsmac10.melbourne.sgi.com [134.14.55.100]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAE0Z47Y33893526; Tue, 14 Nov 2006 11:35:05 +1100 (AEDT) In-Reply-To: <21DB0765-9CB9-4831-A099-A75C49696E28@sgi.com> References: <1163381602.11914.10.camel@localhost.localdomain> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> <1163390942.14517.12.camel@localhost.localdomain> <12275452-56ED-4921-899F-EFF1C05B251A@sgi.com> <1163395250.14517.38.camel@localhost.localdomain> <950D2C3E-11AE-4805-9286-65ECD880272D@sgi.com> <20061114002536.GA7846@tuatara.stupidest.org> <21DB0765-9CB9-4831-A099-A75C49696E28@sgi.com> Mime-Version: 1.0 (Apple Message framework v752.2) Content-Type: 
text/plain; charset=US-ASCII; format=flowed Message-Id: <8085EBC6-B862-41C1-8DC7-A93ABED6E1C1@sgi.com> Cc: xfs@oss.sgi.com Content-Transfer-Encoding: 7bit From: Sam Vaughan Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads Date: Tue, 14 Nov 2006 11:37:49 +1100 To: Stewart Smith X-Mailer: Apple Mail (2.752.2) X-archive-position: 9631 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sjv@sgi.com Precedence: bulk X-list: xfs Content-Length: 119 Lines: 6 On 14/11/2006, at 11:31 AM, Sam Vaughan wrote: > I'm assuming from Stuart's original email Oops. s/Stuart/Stewart/ From owner-xfs@oss.sgi.com Mon Nov 13 16:52:07 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 16:52:14 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAE0q5aG019655 for ; Mon, 13 Nov 2006 16:52:06 -0800 X-ASG-Debug-ID: 1163463938-28463-160-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from smtp104.sbc.mail.mud.yahoo.com (smtp104.sbc.mail.mud.yahoo.com [68.142.198.203]) by cuda.sgi.com (Spam Firewall) with SMTP id BF0DB51D17F for ; Mon, 13 Nov 2006 16:25:38 -0800 (PST) Received: (qmail 88051 invoked from network); 14 Nov 2006 00:25:38 -0000 Received: from unknown (HELO stupidest.org) (cwedgwood@sbcglobal.net@24.5.75.45 with login) by smtp104.sbc.mail.mud.yahoo.com with SMTP; 14 Nov 2006 00:25:37 -0000 X-YMail-OSG: 3WxVU2kVM1mdqnF02OEWFSs3BjVUuYoYUVEs7DquvpXNsrsHJF3K0exV3mJoYSyBuw5uTwvwksSkcIUvuaMb Received: by tuatara.stupidest.org (Postfix, from userid 10000) id 612BA1827280; Mon, 13 Nov 2006 16:25:36 -0800 (PST) Date: Mon, 13 Nov 2006 16:25:36 -0800 From: Chris Wedgwood To: Sam Vaughan Cc: Stewart Smith , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads 
Message-ID: <20061114002536.GA7846@tuatara.stupidest.org> References: <1163381602.11914.10.camel@localhost.localdomain> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> <1163390942.14517.12.camel@localhost.localdomain> <12275452-56ED-4921-899F-EFF1C05B251A@sgi.com> <1163395250.14517.38.camel@localhost.localdomain> <950D2C3E-11AE-4805-9286-65ECD880272D@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <950D2C3E-11AE-4805-9286-65ECD880272D@sgi.com> X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25929 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9632 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cw@f00f.org Precedence: bulk X-list: xfs Content-Length: 136 Lines: 6 On Tue, Nov 14, 2006 at 11:04:17AM +1100, Sam Vaughan wrote: > Those extents are curiously uniform, all 32kB in size. O_SYNC writes? 
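A side note for readers following this thread: the "32kB" figure can be checked from the xfs_bmap -v output quoted earlier. xfs_bmap reports offsets and lengths in 512-byte basic blocks, so a TOTAL of 64 is 32 KiB, and the same arithmetic on the quoted xfs_info output (agsize=491520 blks, bsize=4096) gives the roughly 2GB allocation groups Sam mentions. The following is only a minimal sketch in Python: the helper names parse_extent and count_discontiguous are made up for illustration, and the sample rows are the first four extents of ndb_1_fs/datafile1.dat as posted.

```python
import re

BASIC_BLOCK = 512  # xfs_bmap reports offsets and lengths in 512-byte units

# Matches one extent row of `xfs_bmap -v`, for example:
#   0: [0..63]: 32862376..32862439 8 (1405096..1405159) 64
ROW = re.compile(
    r"^\s*(\d+):\s*\[(\d+)\.\.(\d+)\]:\s*(\d+)\.\.(\d+)"
    r"\s+(\d+)\s+\((\d+)\.\.(\d+)\)\s+(\d+)\s*$"
)


def parse_extent(line):
    """Hypothetical helper: turn one extent row into a small dict."""
    m = ROW.match(line)
    if m is None:
        raise ValueError("not an xfs_bmap -v extent row: %r" % line)
    (ext, _foff_start, _foff_end, blk_start, blk_end,
     ag, _ag_start, _ag_end, total) = map(int, m.groups())
    return {
        "ext": ext,
        "ag": ag,
        "start": blk_start,
        "end": blk_end,
        "blocks": total,
        "bytes": total * BASIC_BLOCK,
    }


def count_discontiguous(extents):
    """Count neighbouring extents that are not adjacent on disk."""
    return sum(
        1 for a, b in zip(extents, extents[1:]) if b["start"] != a["end"] + 1
    )


# Rows copied from the quoted output for ndb_1_fs/datafile1.dat.
sample = [
    "0: [0..63]: 32862376..32862439 8 (1405096..1405159) 64",
    "1: [64..127]: 32875992..32876055 8 (1418712..1418775) 64",
    "2: [128..191]: 33040112..33040175 8 (1582832..1582895) 64",
    "3: [192..255]: 33080136..33080199 8 (1622856..1622919) 64",
]

extents = [parse_extent(row) for row in sample]
sizes = {e["bytes"] for e in extents}
breaks = count_discontiguous(extents)

# AG size from the quoted xfs_info output: agsize (fs blocks) * bsize (bytes).
ag_bytes = 491520 * 4096

print("extent sizes in bytes:", sorted(sizes))
print("discontiguous neighbours:", breaks)
print("AG size in GiB:", ag_bytes / 2**30)
```

Every extent in the sample comes out at 32768 bytes and none is adjacent to its neighbour, which is the pattern being discussed. By contrast, the single 2097152-block extent in Sam's allocsp test is 2097152 * 512 bytes, i.e. exactly 1 GiB in one piece.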
From owner-xfs@oss.sgi.com Mon Nov 13 20:01:58 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 20:02:05 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAE41saG005647 for ; Mon, 13 Nov 2006 20:01:56 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA22903; Tue, 14 Nov 2006 15:00:59 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAE40t7Y33860386; Tue, 14 Nov 2006 15:00:56 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAE40rrE33896367; Tue, 14 Nov 2006 15:00:53 +1100 (AEDT) Date: Tue, 14 Nov 2006 15:00:53 +1100 From: David Chinner To: Martin Braun Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com Subject: Re: xfs kernel BUG again in 2.6.17.11 Message-ID: <20061114040053.GD8394166@melbourne.sgi.com> References: <44E1D9CA.30805@uni-hd.de> <20060816101122.E2740551@wobbly.melbourne.sgi.com> <44EB228F.6020903@uni-hd.de> <20060823134211.E2968256@wobbly.melbourne.sgi.com> <45583ABE.6080909@uni-hd.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45583ABE.6080909@uni-hd.de> User-Agent: Mutt/1.4.2.1i X-archive-position: 9633 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 6395 Lines: 151 On Mon, Nov 13, 2006 at 10:28:30AM +0100, Martin Braun wrote: > Hi , > > is it possible that the xfs kernel bug is in the 2.6.17.11 Kernel again? > we got obviously the same bug as with 2.6.17.8: It's likely that XFS is identical in those 2 releases. BTW, Martin, can you cc XFS bug reports to xfs@oss.sgi.com in future? 
> Nov 13 09:27:01 pers109 kernel: Access to block zero: fs: inode: > 637540399 start_block : 0 start_off : 23812530000000 blkcnt : 84 > extent-state : 0 Looks like you are managing to trigger an inode corruption of some sort. Have you managed to repair the filesystem since you first reported this problem? I don't know the history of the bug you are seeing other than what you included, so can you give us a more complete picture of your hardware and what sort of workload you are doing that triggers this problem? FWIW, are there any I/O errors being reported in dmesg or syslog? Cheers, Dave. > > On Tue, Aug 22, 2006 at 05:28:15PM +0200, Martin Braun wrote: > >> Hi Nathan, > >> > >> since I haven't repaired the fs we had a crash again (see below). > >> > >> unfortunately we copied at the time of the crash over iscsi some files > >> to an xfs-fs on a nas. > >> and the directory was completely deleted. neither a xfs-check or a > >> xfs_repair did find something. was that due to the combination of iscsi > >> and xfs? > > > > Sorry for not getting back to you earlier, I've been too busy. :( > > > > I think you will need to clear out the affected inode (looks like a > > form of corruption that repair doesn't know about today) - you'll > > need to forcibly remove that inode via xfs_db, something like: > > > > # xfs_db -x -c 'inode 35141650' -c 'write core.mode 0' /dev/sdc1 > > # xfs_repair /dev/sdc1 > > > > cheers. > > > > ps: Barry, looks like repair needs some work in this area... > > > >> Aug 22 12:48:12 pers109 kernel: Access to block zero: fs: inode: > >> 35141650 start_block : 0 start_off : 3a1531 blkcnt : c > >> extent-state : 0 > >> Aug 22 12:48:12 pers109 kernel: ------------[ cut here ]------------ > >> Aug 22 12:48:12 pers109 kernel: kernel BUG at :50307!
> >> Aug 22 12:48:12 pers109 kernel: invalid opcode: 0000 [#1] > >> Aug 22 12:48:12 pers109 kernel: SMP > >> Aug 22 12:48:12 pers109 kernel: Modules linked in: iscsi_tcp libiscsi > >> scsi_transport_iscsi > >> Aug 22 12:48:12 pers109 kernel: CPU: 0 > >> Aug 22 12:48:12 pers109 kernel: EIP: 0060:[] Not tainted VLI > >> Aug 22 12:48:12 pers109 kernel: EFLAGS: 00010246 (2.6.17.8 #5) > >> Aug 22 12:48:12 pers109 kernel: EIP is at cmn_err+0xa0/0xaa > >> Aug 22 12:48:12 pers109 kernel: eax: c048a2c4 ebx: c04359e4 ecx: > >> c047c9bc edx: 00000282 > >> Aug 22 12:48:12 pers109 kernel: esi: e595dcb0 edi: c056a120 ebp: > >> 00000000 esp: e595db70 > >> Aug 22 12:48:12 pers109 kernel: ds: 007b es: 007b ss: 0068 > >> Aug 22 12:48:12 pers109 kernel: Process smbd (pid: 25510, > >> threadinfo=e595c000 task=d9628a90) > >> Aug 22 12:48:12 pers109 kernel: Stack: c044497a c0427525 c056a120 > >> 00000282 f3507260 e595dcb0 00000000 d9f9de00 > >> Aug 22 12:48:12 pers109 kernel: c0202f0d 00000000 c04359e4 > >> f686cba0 02183812 00000000 00000000 00000000 > >> Aug 22 12:48:12 pers109 kernel: 003a1531 00000000 0000000c > >> 00000000 00000000 e595dcb0 00000000 00000000 > >> Aug 22 12:48:12 pers109 kernel: Call Trace: > >> Aug 22 12:48:12 pers109 kernel: > >> xfs_bmap_search_extents+0xf5/0xf7 xfs_bmapi+0x229/0x162c > >> Aug 22 12:48:12 pers109 kernel: dev_queue_xmit+0x1f4/0x26f > >> ip_output+0x189/0x270 > >> Aug 22 12:48:12 pers109 kernel: __do_softirq+0x6e/0xdc > >> do_IRQ+0x1e/0x24 > >> Aug 22 12:48:12 pers109 kernel: common_interrupt+0x1a/0x20 > >> xfs_zero_eof+0x1ca/0x340 > >> Aug 22 12:48:12 pers109 kernel: memcpy_toiovec+0x37/0x5c > >> file_update_time+0xa1/0xc0 > >> Aug 22 12:48:12 pers109 kernel: xfs_write+0x4ea/0xda5 > >> sock_aio_read+0x83/0x8e > >> Aug 22 12:48:12 pers109 kernel: fasync_helper+0x4b/0xd3 > >> copy_to_user+0x3c/0x4a > >> Aug 22 12:48:12 pers109 kernel: xfs_file_aio_write+0x8f/0x9a > >> do_sync_write+0xd5/0x130 > >> Aug 22 12:48:12 pers109 kernel: > >> 
autoremove_wake_function+0x0/0x4b vfs_write+0xcb/0x195 > >> Aug 22 12:48:12 pers109 kernel: sys_pwrite64+0x73/0x80 > >> sysenter_past_esp+0x54/0x75 > >> Aug 22 12:48:12 pers109 kernel: Code: c0 c7 44 24 08 20 a1 56 c0 c7 04 > >> 24 7a 49 44 c0 89 44 24 04 e8 ab eb eb ff b8 c4 a2 48 c > >> 0 8b 54 24 0c e8 fc 95 1a 00 85 ed 75 02 <0f> 0b 83 c4 10 5b 5e 5f 5d c3 > >> 55 b8 07 00 00 00 57 bf 20 a1 56 > >> Aug 22 12:48:12 pers109 kernel: EIP: [] cmn_err+0xa0/0xaa > >> SS:ESP 0068:e595db70 > >> > >> > >> > >> > >> > >> > >> Scott schrieb: > >>> Hi Martin, > >>> > >>> On Tue, Aug 15, 2006 at 04:27:22PM +0200, Martin Braun wrote: > >>>> ... > >>>> What does this bug mean? > >>>> ... > >>>> Aug 15 15:01:02 pers109 kernel: Access to block zero: fs: inode: > >>>> 254474718 start_block : 0 start_off : c0a0b0e8a099 > >>>> 0 blkcnt : 90000 extent-state : 0 > >>>> Aug 15 15:01:02 pers109 kernel: ------------[ cut here ]------------ > >>>> Aug 15 15:01:02 pers109 kernel: kernel BUG at :50307! > >>> It means XFS detected ondisk corruption in inode# 254474718, and > >>> paniced your system (stupidly; a fix for this is around, will be > >>> merged with the next mainline update). For me, a more interesting > >>> question is how that inode got into this state... have you had any > >>> crashes recently (i.e. has the filesystem journal needed to be > >>> replayed recently?) Can you send the output of: > >>> > >>> # xfs_db -c 'inode 254474718' -c print /dev/sdc1 > >>> > >>> You'll need to run xfs_repair on that filesystem to fix this up, > >>> but please send us that output first. > >>> > >>> thanks. 
> >>> > > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Mon Nov 13 22:41:29 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 13 Nov 2006 22:41:37 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAE6fPaG023892 for ; Mon, 13 Nov 2006 22:41:27 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA26934; Tue, 14 Nov 2006 17:40:21 +1100 Date: Tue, 14 Nov 2006 16:42:09 +1000 From: Timothy Shimmin To: Eric Sandeen cc: "Stephen C. Rigler" , xfs@oss.sgi.com Subject: Re: RHEL 4 Compatible Kernel Module Code Message-ID: <30F3263F8874E466D6206C56@timothy-shimmins-power-mac-g5.local> In-Reply-To: <4558B4DC.3050406@sandeen.net> References: <1163434210.25484.14.camel@houuc8> <4558B4DC.3050406@sandeen.net> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9634 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1758 Lines: 51 Hi Eric, --On 13 November 2006 12:09:32 PM -0600 Eric Sandeen wrote: > Stephen C. Rigler wrote: >> Greetings, >> >> We are using CentOS 4.4 along with the RHEL 4 compatible kernel modules >> (downloadable here: >> http://mirror.centos.org/centos/4.4/centosplus/x86_64/RPMS/). 
>> >> According to the CentOS mailing list, the person at SGI who had been >> backporting the xfs code for RHEL4/CentOS4 has left the company. >> >> Are there any plans to continue this work? It seems like we are getting >> bit by this bug: http://oss.sgi.com/bugzilla/show_bug.cgi?id=410 but it >> doesn't look like the fix has been backported to the RHEL4 kernel >> module. > > Hot topic today; see my reply on the centos list and other recent > threads on this list :) I am planning to update the rpm package to > include some bugfixes soon, but it may not help your problems. > > I had been tracking the sles9 xfs codebase as a fairly stable, > bugfix-only xfs codebase for this era of kernels; at this point I don't > -think- the extent changes you mentioned are in the sles9 codebase... > sgi guys? > If you are referring to Mandy's incore extent changes, I don't see them in sles9 or sles10. Having a quick look she has about 5 Mods. linux-2.4/xfs_ksyms.c | 7 linux-2.6/xfs_ksyms.c | 7 quota/xfs_qm.c | 9 xfs_bmap.c | 725 ++++++++++++---------------- xfs_bmap.h | 30 - xfs_bmap_btree.c | 10 xfs_bmap_btree.h | 8 xfs_inode.c | 1258 ++++++++++++++++++++++++++++++++++++++++++++------ xfs_inode.h | 71 ++ xfsidbg.c | 15 10 files changed, 1548 insertions(+), 592 deletions(-) Quite a bit of changes there for xfs_bmap.c and xfs_inode.c. 
--Tim From owner-xfs@oss.sgi.com Tue Nov 14 01:41:16 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 14 Nov 2006 01:41:23 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAE9fDaG016268 for ; Tue, 14 Nov 2006 01:41:16 -0800 X-ASG-Debug-ID: 1163496205-30553-769-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by cuda.sgi.com (Spam Firewall) with ESMTP id AAAC7D1CF2A9; Tue, 14 Nov 2006 01:23:26 -0800 (PST) Received: from serv25.ub.uni-heidelberg.de (serv25.ub.uni-heidelberg.de [147.142.186.75]) by relay.uni-heidelberg.de (8.13.4/8.13.1) with ESMTP id kAE9NLTG031194; Tue, 14 Nov 2006 10:23:22 +0100 Received: from localhost (localhost [127.0.0.1]) by serv25.ub.uni-heidelberg.de (Postfix) with ESMTP id 14C318BD24; Tue, 14 Nov 2006 10:23:22 +0100 (CET) Received: from serv25.ub.uni-heidelberg.de ([127.0.0.1]) by localhost (serv25.ub.uni-heidelberg.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 29318-10; Tue, 14 Nov 2006 10:23:20 +0100 (CET) Received: from [147.142.85.36] (pers16.ub.uni-heidelberg.de [147.142.85.36]) by serv25.ub.uni-heidelberg.de (Postfix) with ESMTP id 25AA78BCF7; Tue, 14 Nov 2006 10:23:20 +0100 (CET) Message-ID: <45598B07.6080401@uni-hd.de> Date: Tue, 14 Nov 2006 10:23:19 +0100 From: Martin Braun Reply-To: mbraun@uni-hd.de User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: David Chinner CC: linux-kernel@vger.kernel.org, xfs@oss.sgi.com X-ASG-Orig-Subj: Re: xfs kernel BUG again in 2.6.17.11 Subject: Re: xfs kernel BUG again in 2.6.17.11 References: <44E1D9CA.30805@uni-hd.de> <20060816101122.E2740551@wobbly.melbourne.sgi.com> <44EB228F.6020903@uni-hd.de> <20060823134211.E2968256@wobbly.melbourne.sgi.com> <45583ABE.6080909@uni-hd.de> <20061114040053.GD8394166@melbourne.sgi.com> In-Reply-To: <20061114040053.GD8394166@melbourne.sgi.com> Content-Type: 
text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.50 X-Barracuda-Spam-Status: No, SCORE=0.50 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=BSF_RULE7568M X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25963 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.50 BSF_RULE7568M BODY: Custom Rule 7568M X-archive-position: 9636 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mbraun@uni-hd.de Precedence: bulk X-list: xfs Content-Length: 14264 Lines: 332 Hi David, > Have you managed to repair the filesystem since you first > reported this problem? I don't know the history of the bug That's something I am not sure about. I used the newest xfs_repair tools, and they found and repaired some inodes. After that there weren't any crashes for about two months. > you are seeing other than what you included, so can you > give us a more complete picture of your hardware and > what sort of workload you are doing that triggers this > problem? The main workload of this machine is high Samba activity with few clients but many I/O tasks (e.g. Photoshop batch processing on many 3-6 MB images). The XFS partition is on an easy-RAID 16 P. The other partitions are ext3. There are also 2 iSCSI partitions with XFS. For hardware information, see below. After the crash I ran xfs_repair; it found a corrupt directory inode and moved it to lost+found as " 254474253". Normally the kernel freezes/hangs completely, but I found two new kernel BUGs (see below) in the log messages (without a freeze). The corresponding Java program was building a Lucene index from a MySQL database. It seems that xfs_repair (2.8.10) did not find all of the errors in the FS. Is there a way to be sure that the FS is clean? > > FWIW, are there any I/O errors being reported in dmesg or syslog?
There weren't any I/o errors. Nov 13 14:16:28 pers109 kernel: ------------[ cut here ]------------ Nov 13 14:16:28 pers109 kernel: kernel BUG at :29837! Nov 13 14:16:28 pers109 kernel: invalid opcode: 0000 [#1] Nov 13 14:16:28 pers109 kernel: SMP Nov 13 14:16:28 pers109 kernel: CPU: 2 Nov 13 14:16:28 pers109 kernel: EIP: 0060:[] Not tainted VLI Nov 13 14:16:28 pers109 kernel: EFLAGS: 00210202 (2.6.17.11 #1) Nov 13 14:16:28 pers109 kernel: EIP is at generic_delete_inode+0xf1/0xf9 Nov 13 14:16:28 pers109 kernel: eax: c2001e80 ebx: ecadeca0 ecx: 00000003 edx: ecadedd8 Nov 13 14:16:28 pers109 kernel: esi: 00000000 edi: ecadeca0 ebp: d8699f4c esp: d8699f18 Nov 13 14:16:28 pers109 kernel: ds: 007b es: 007b ss: 0068 Nov 13 14:16:28 pers109 kernel: Process java (pid: 15883, threadinfo=d8698000 task=d6c78a10) Nov 13 14:16:28 pers109 kernel: Stack: ecadeca0 00000000 00000000 ecadeca0 d7ce4000 c01720cd ecadeca0 c04738dc Nov 13 14:16:28 pers109 kernel: 00000000 c01683fc ecadeca0 f1862094 f1862094 c92b5114 c214c0c0 4859aa9a Nov 13 14:16:28 pers109 kernel: 00000008 d7ce4029 00000010 00000000 00000000 00000000 00000000 c214c0c0 Nov 13 14:16:28 pers109 kernel: Call Trace: Nov 13 14:16:28 pers109 kernel: iput+0x5f/0x74 do_unlinkat+0xc9/0x107 Nov 13 14:16:28 pers109 kernel: filp_close+0x44/0x6c sys_unlink+0x17/0x1b Nov 13 14:16:28 pers109 kernel: sysenter_past_esp+0x54/0x75 Nov 13 14:16:28 pers109 kernel: Code: f0 ff ff 8d 83 a8 00 00 00 c7 44 24 04 00 00 00 00 c7 44 24 08 00 00 00 00 89 04 24 e8 b1 fb fc ff 8 9 1c 24 e8 aa f1 ff ff eb 89 <0f> 0b 8d 74 26 00 eb c2 56 53 83 ec 0c 8b 5c 24 18 8b 53 04 8b Nov 13 14:16:28 pers109 kernel: EIP: [] generic_delete_inode+0xf1/0xf9 SS:ESP 0068:d8699f18 Nov 13 20:22:28 pers109 kernel: ------------[ cut here ]------------ Nov 13 20:22:28 pers109 kernel: kernel BUG at :29837! 
Nov 13 20:22:28 pers109 kernel: invalid opcode: 0000 [#2] Nov 13 20:22:28 pers109 kernel: SMP Nov 13 20:22:28 pers109 kernel: CPU: 3 Nov 13 20:22:28 pers109 kernel: EIP: 0060:[] Not tainted VLI Nov 13 20:22:28 pers109 kernel: EFLAGS: 00010202 (2.6.17.11 #1) Nov 13 20:22:28 pers109 kernel: EIP is at generic_delete_inode+0xf1/0xf9 Nov 13 20:22:28 pers109 kernel: eax: c2001f10 ebx: d6c586a0 ecx: 00000003 edx: d6c587d8 Nov 13 20:22:28 pers109 kernel: esi: 00000000 edi: d6c586a0 ebp: d2cd9f4c esp: d2cd9f18 Nov 13 20:22:28 pers109 kernel: ds: 007b es: 007b ss: 0068 Nov 13 20:22:28 pers109 kernel: Process java (pid: 19824, threadinfo=d2cd8000 task=d1f575a0) Nov 13 20:22:28 pers109 kernel: Stack: d6c586a0 00000000 00000000 d6c586a0 d5144000 c01720cd d6c586a0 c04738dc Nov 13 20:22:28 pers109 kernel: 00000000 c01683fc d6c586a0 dd28e794 dd28e794 f69dd894 c214c0c0 281c233e Nov 13 20:22:28 pers109 kernel: 00000009 d5144029 00000010 00000000 00000000 00000000 00000000 c214c0c0 Nov 13 20:22:28 pers109 kernel: Call Trace: Nov 13 20:22:28 pers109 kernel: iput+0x5f/0x74 do_unlinkat+0xc9/0x107 Nov 13 20:22:28 pers109 kernel: filp_close+0x44/0x6c sys_unlink+0x17/0x1b Nov 13 20:22:28 pers109 kernel: sysenter_past_esp+0x54/0x75 Nov 13 20:22:28 pers109 kernel: Code: f0 ff ff 8d 83 a8 00 00 00 c7 44 24 04 00 00 00 00 c7 44 24 08 00 00 00 00 89 04 24 e8 b1 fb fc ff 8 9 1c 24 e8 aa f1 ff ff eb 89 <0f> 0b 8d 74 26 00 eb c2 56 53 83 ec 0c 8b 5c 24 18 8b 53 04 8b ________ Hardware Info: ________ (Output of cpu0 from 4 (virtual, 2 physical cpus) cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 2 model name : Intel(R) Xeon(TM) CPU 2.80GHz stepping : 7 cpu MHz : 1595.120 cache size : 512 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 1 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 
ss ht tm pbe cid bogomips : 3193.91 __________ uname -a Linux pers109 2.6.17.11 #1 SMP Mon Aug 28 10:45:48 CEST 2006 i686 i686 i386 GNU/Linux ---------------------- cat /etc/SuSE-release SuSE Linux 9.3 (i586) VERSION = 9.3 --------------------- lspci 0000:00:00.0 Host bridge: Intel Corporation E7501 Memory Controller Hub (rev 01) 0000:00:02.0 PCI bridge: Intel Corporation E7500/E7501 Hub Interface B PCI-to-PCI Bridge (rev 01) 0000:00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #1) (rev 02) 0000:00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #2) (rev 02) 0000:00:1d.2 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #3) (rev 02) 0000:00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 42) 0000:00:1f.0 ISA bridge: Intel Corporation 82801CA LPC Interface Controller (rev 02) 0000:00:1f.1 IDE interface: Intel Corporation 82801CA Ultra ATA Storage Controller (rev 02) 0000:01:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04) 0000:01:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04) 0000:01:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04) 0000:01:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04) 0000:02:05.0 SCSI storage controller: Adaptec AIC-7902 U320 (rev 03) 0000:02:05.1 SCSI storage controller: Adaptec AIC-7902 U320 (rev 03) 0000:03:03.0 Ethernet controller: Intel Corporation 82544GC Gigabit Ethernet Controller (LOM) (rev 02) 0000:04:01.0 Ethernet controller: Intel Corporation 82540EM Gigabit Ethernet Controller (rev 02) 0000:04:02.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) cat /proc/scsi/scsi Attached devices: Host: scsi0 Channel: 00 Id: 00 Lun: 00 Vendor: IBM Model: DCAS-34330W Rev: S65A Type: Direct-Access ANSI SCSI revision: 02 Host: scsi1 Channel: 00 Id: 04 Lun: 00 Vendor: easyRAID Model: 16P Rev: 0001 Type: Direct-Access ANSI SCSI revision: 03 Host: scsi1 Channel: 00 Id: 04 Lun: 01 Vendor: easyRAID Model: 16P Rev: 0001 Type: 
Direct-Access ANSI SCSI revision: 03 Host: scsi1 Channel: 00 Id: 06 Lun: 00 Vendor: easyRAID Model: X16P Rev: 0001 Type: Direct-Access ANSI SCSI revision: 03 Host: scsi1 Channel: 00 Id: 06 Lun: 01 Vendor: easyRAID Model: X16P Rev: 0001 Type: Direct-Access ANSI SCSI revision: 03 Host: scsi2 Channel: 00 Id: 00 Lun: 00 Vendor: LITE-ON Model: LTR-48246K Rev: SKS7 Type: CD-ROM ANSI SCSI revision: ffffffff Host: scsi3 Channel: 00 Id: 00 Lun: 00 Vendor: HITACHI Model: DF600F Rev: 0000 Type: Direct-Access ANSI SCSI revision: 04 Host: scsi3 Channel: 00 Id: 00 Lun: 01 Vendor: HITACHI Model: DF600F Rev: 0000 Type: Direct-Access ANSI SCSI revision: 03 ------------ free total used free shared buffers cached Mem: 2075168 2022916 52252 0 4480 1848936 -/+ buffers/cache: 169500 1905668 Swap: 1959920 1782356 177564 > > Cheers, > > Dave. > >>> On Tue, Aug 22, 2006 at 05:28:15PM +0200, Martin Braun wrote: >>>> Hi Nathan, >>>> >>>> since I haven't repaired the fs we had a crash again (see below). >>>> >>>> unfortunately we copied at the time of the crash over iscsi some files >>>> to an xfs-fs on a nas. >>>> and the directory was completely deleted. neither a xfs-check or a >>>> xfs_repair did find something. was that due to the combination of iscsi >>>> and xfs? >>> Sorry for not getting back to you earlier, I've been too busy. :( >>> >>> I think you will need to clear out the affected inode (looks like a >>> form of corruption that repair doesn't know about today) - you'll >>> need to forcibly remove that inode via xfs_db, something like: >>> >>> # xfs_db -x -c 'inode 35141650' -c 'write core.mode 0' /dev/sdc1 >>> # xfs_repair /dev/sdc1 >>> >>> cheers. >>> >>> ps: Barry, looks like repair needs some work in this area... 
>>> >>>> Aug 22 12:48:12 pers109 kernel: Access to block zero: fs: inode: >>>> 35141650 start_block : 0 start_off : 3a1531 blkcnt : c >>>> extent-state : 0 >>>> Aug 22 12:48:12 pers109 kernel: ------------[ cut here ]------------ >>>> Aug 22 12:48:12 pers109 kernel: kernel BUG at :50307! >>>> Aug 22 12:48:12 pers109 kernel: invalid opcode: 0000 [#1] >>>> Aug 22 12:48:12 pers109 kernel: SMP >>>> Aug 22 12:48:12 pers109 kernel: Modules linked in: iscsi_tcp libiscsi >>>> scsi_transport_iscsi >>>> Aug 22 12:48:12 pers109 kernel: CPU: 0 >>>> Aug 22 12:48:12 pers109 kernel: EIP: 0060:[] Not tainted VLI >>>> Aug 22 12:48:12 pers109 kernel: EFLAGS: 00010246 (2.6.17.8 #5) >>>> Aug 22 12:48:12 pers109 kernel: EIP is at cmn_err+0xa0/0xaa >>>> Aug 22 12:48:12 pers109 kernel: eax: c048a2c4 ebx: c04359e4 ecx: >>>> c047c9bc edx: 00000282 >>>> Aug 22 12:48:12 pers109 kernel: esi: e595dcb0 edi: c056a120 ebp: >>>> 00000000 esp: e595db70 >>>> Aug 22 12:48:12 pers109 kernel: ds: 007b es: 007b ss: 0068 >>>> Aug 22 12:48:12 pers109 kernel: Process smbd (pid: 25510, >>>> threadinfo=e595c000 task=d9628a90) >>>> Aug 22 12:48:12 pers109 kernel: Stack: c044497a c0427525 c056a120 >>>> 00000282 f3507260 e595dcb0 00000000 d9f9de00 >>>> Aug 22 12:48:12 pers109 kernel: c0202f0d 00000000 c04359e4 >>>> f686cba0 02183812 00000000 00000000 00000000 >>>> Aug 22 12:48:12 pers109 kernel: 003a1531 00000000 0000000c >>>> 00000000 00000000 e595dcb0 00000000 00000000 >>>> Aug 22 12:48:12 pers109 kernel: Call Trace: >>>> Aug 22 12:48:12 pers109 kernel: >>>> xfs_bmap_search_extents+0xf5/0xf7 xfs_bmapi+0x229/0x162c >>>> Aug 22 12:48:12 pers109 kernel: dev_queue_xmit+0x1f4/0x26f >>>> ip_output+0x189/0x270 >>>> Aug 22 12:48:12 pers109 kernel: __do_softirq+0x6e/0xdc >>>> do_IRQ+0x1e/0x24 >>>> Aug 22 12:48:12 pers109 kernel: common_interrupt+0x1a/0x20 >>>> xfs_zero_eof+0x1ca/0x340 >>>> Aug 22 12:48:12 pers109 kernel: memcpy_toiovec+0x37/0x5c >>>> file_update_time+0xa1/0xc0 >>>> Aug 22 12:48:12 pers109 kernel: 
xfs_write+0x4ea/0xda5 >>>> sock_aio_read+0x83/0x8e >>>> Aug 22 12:48:12 pers109 kernel: fasync_helper+0x4b/0xd3 >>>> copy_to_user+0x3c/0x4a >>>> Aug 22 12:48:12 pers109 kernel: xfs_file_aio_write+0x8f/0x9a >>>> do_sync_write+0xd5/0x130 >>>> Aug 22 12:48:12 pers109 kernel: >>>> autoremove_wake_function+0x0/0x4b vfs_write+0xcb/0x195 >>>> Aug 22 12:48:12 pers109 kernel: sys_pwrite64+0x73/0x80 >>>> sysenter_past_esp+0x54/0x75 >>>> Aug 22 12:48:12 pers109 kernel: Code: c0 c7 44 24 08 20 a1 56 c0 c7 04 >>>> 24 7a 49 44 c0 89 44 24 04 e8 ab eb eb ff b8 c4 a2 48 c >>>> 0 8b 54 24 0c e8 fc 95 1a 00 85 ed 75 02 <0f> 0b 83 c4 10 5b 5e 5f 5d c3 >>>> 55 b8 07 00 00 00 57 bf 20 a1 56 >>>> Aug 22 12:48:12 pers109 kernel: EIP: [] cmn_err+0xa0/0xaa >>>> SS:ESP 0068:e595db70 >>>> >>>> >>>> >>>> >>>> >>>> >>>> Scott schrieb: >>>>> Hi Martin, >>>>> >>>>> On Tue, Aug 15, 2006 at 04:27:22PM +0200, Martin Braun wrote: >>>>>> ... >>>>>> What does this bug mean? >>>>>> ... >>>>>> Aug 15 15:01:02 pers109 kernel: Access to block zero: fs: inode: >>>>>> 254474718 start_block : 0 start_off : c0a0b0e8a099 >>>>>> 0 blkcnt : 90000 extent-state : 0 >>>>>> Aug 15 15:01:02 pers109 kernel: ------------[ cut here ]------------ >>>>>> Aug 15 15:01:02 pers109 kernel: kernel BUG at :50307! >>>>> It means XFS detected ondisk corruption in inode# 254474718, and >>>>> paniced your system (stupidly; a fix for this is around, will be >>>>> merged with the next mainline update). For me, a more interesting >>>>> question is how that inode got into this state... have you had any >>>>> crashes recently (i.e. has the filesystem journal needed to be >>>>> replayed recently?) Can you send the output of: >>>>> >>>>> # xfs_db -c 'inode 254474718' -c print /dev/sdc1 >>>>> >>>>> You'll need to run xfs_repair on that filesystem to fix this up, >>>>> but please send us that output first. >>>>> >>>>> thanks. 
>>>>> >> - >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ > -- Universitaetsbibliothek Heidelberg Tel: +49 6221 54-2580 Ploeck 107-109, D-69117 Heidelberg Fax: +49 6221 54-2623 From owner-xfs@oss.sgi.com Tue Nov 14 02:06:17 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 14 Nov 2006 02:06:25 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAEA6GaG018363 for ; Tue, 14 Nov 2006 02:06:17 -0800 X-ASG-Debug-ID: 1163498726-24286-265-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from raven.upol.cz (raven.upol.cz [158.194.120.4]) by cuda.sgi.com (Spam Firewall) with ESMTP id E2A16D1D2934; Tue, 14 Nov 2006 02:05:27 -0800 (PST) Received: from smtpgate (antivir1.upol.cz [158.194.108.127]) by raven.upol.cz (AIX4.3/8.9.3/8.9.3) with SMTP id LAA19928; Tue, 14 Nov 2006 11:13:47 +0100 Received: from flower (flower.upol.cz [158.194.64.22]) by smtpgate ([158.194.108.127]:25) (F-Secure Anti-Virus for Internet Mail 6.50.60 Release) with SMTP; Tue, 14 Nov 2006 10:05:13 -0000 (envelope-from ) Received: from olecom by flower with local (Exim 4.63) (envelope-from ) id 1GjvHA-0002xV-1q; Tue, 14 Nov 2006 10:12:20 +0000 To: Martin Braun , David Chinner , LKML , xfs@oss.sgi.com X-Posted-To: gmane.test X-ASG-Orig-Subj: Re: xfs kernel BUG again in 2.6.17.11 Subject: Re: xfs kernel BUG again in 2.6.17.11 References: <44E1D9CA.30805@uni-hd.de> <20060816101122.E2740551@wobbly.melbourne.sgi.com> <44EB228F.6020903@uni-hd.de> <20060823134211.E2968256@wobbly.melbourne.sgi.com> <45583ABE.6080909@uni-hd.de> <20061114040053.GD8394166@melbourne.sgi.com> <45598B07.6080401@uni-hd.de> Organization: Palacky University in Olomouc, experimental physics department. 
Date: Tue, 14 Nov 2006 10:12:19 +0000 Message-ID: User-Agent: slrn/0.9.8.1pl1 (Debian) From: Oleg Verych X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25967 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9637 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: olecom@flower.upol.cz Precedence: bulk X-list: xfs Content-Length: 1195 Lines: 43 Hallo. On 2006-11-14, Martin Braun wrote: > Hi David, > > >> Have you managed to repair the filesystem since you first >> reported this problem? I don't know the history of the bug [Well. Just to help (probably) new developers, after Nathan left SGI.] Here's the FAQ entry about the bug: http://oss.sgi.com/projects/xfs/faq.html#dir2 You can find fixes in .17 stable git tree. If it was really just sparse annotations, they were obviously fixed, I think. If not, maybe there are some new bugs. > that's something I am not sure about, I have used the newest xfs_repair > tools and it found and repaired some inodes. And for about two months > there weren't any crashes. + > It seems that xfs_repair (2.8.10), did not find all of the errors of the FS. > Is there a way to be sure that the FS is clean? As in the FAQ: ,-- ..... | Update: a fixed xfs_repair is now available; version 2.8.10 or later | of the xfsprogs package contains the fixed version. ..... | The xfs_check tool, or xfs_repair -n, should be able to detect any | directory corruption. `-- [] > Normally the Kernel freezes/hangs completely, but I found two new Do you mean panic or oops here, or just freeze?
____ From owner-xfs@oss.sgi.com Tue Nov 14 02:34:20 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 14 Nov 2006 02:34:28 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAEAYJaG020999 for ; Tue, 14 Nov 2006 02:34:20 -0800 X-ASG-Debug-ID: 1163500408-19139-87-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from relay.uni-heidelberg.de (relay.uni-heidelberg.de [129.206.100.212]) by cuda.sgi.com (Spam Firewall) with ESMTP id 4C75451ED20; Tue, 14 Nov 2006 02:33:29 -0800 (PST) Received: from serv25.ub.uni-heidelberg.de (serv25.ub.uni-heidelberg.de [147.142.186.75]) by relay.uni-heidelberg.de (8.13.4/8.13.1) with ESMTP id kAEAVh8C000619; Tue, 14 Nov 2006 11:31:43 +0100 Received: from localhost (localhost [127.0.0.1]) by serv25.ub.uni-heidelberg.de (Postfix) with ESMTP id E275A8BD24; Tue, 14 Nov 2006 11:31:43 +0100 (CET) Received: from serv25.ub.uni-heidelberg.de ([127.0.0.1]) by localhost (serv25.ub.uni-heidelberg.de [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 31424-10; Tue, 14 Nov 2006 11:31:42 +0100 (CET) Received: from [147.142.85.36] (pers16.ub.uni-heidelberg.de [147.142.85.36]) by serv25.ub.uni-heidelberg.de (Postfix) with ESMTP id B98008BCF7; Tue, 14 Nov 2006 11:31:42 +0100 (CET) Message-ID: <45599B0E.8050505@uni-hd.de> Date: Tue, 14 Nov 2006 11:31:42 +0100 From: Martin Braun Reply-To: mbraun@uni-hd.de User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: Oleg Verych CC: David Chinner , LKML , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: xfs kernel BUG again in 2.6.17.11 Subject: Re: xfs kernel BUG again in 2.6.17.11 References: <44E1D9CA.30805@uni-hd.de> <20060816101122.E2740551@wobbly.melbourne.sgi.com> <44EB228F.6020903@uni-hd.de> <20060823134211.E2968256@wobbly.melbourne.sgi.com> <45583ABE.6080909@uni-hd.de> <20061114040053.GD8394166@melbourne.sgi.com> <45598B07.6080401@uni-hd.de> In-Reply-To: Content-Type: text/plain; charset=UTF-8 
Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25969 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9638 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mbraun@uni-hd.de Precedence: bulk X-list: xfs Content-Length: 1330 Lines: 36 Hi Oleg, thanks for your response. > You can find fixes in .17 stable git tree. Yes, it is a 2.6.17.11 stable kernel. - By the way: we tried to set up kernel 2.6.18.2 on that machine but we got a weird time error: ntpdate shows two different times (the first run gives the correct time, the second run gives a time half an hour in the future), so we switched back to 2.6.17.11. > If it was really just sparse annotations, they were obviously > fixed, I think. If not, maybe there are some new bugs. > + >> It seems that xfs_repair (2.8.10), did not find all of the errors of the FS. >> Is there a way to be sure that the FS is clean? > > As in faq: > | Update: a fixed xfs_repair is now available; version 2.8.10 or later > | of the xfsprogs package contains the fixed version. > ..... > | The xfs_check tool, or xfs_repair -n, should be able to detect any > | directory corruption. However, the two Kernel BUGs occurred _after_ xfs_repair (version 2.8.10). >> Normally the Kernel freezes/hangs completely, but I found two new > > Do you mean panic or oops here, or just freeze? In detail: a Kernel BUG is written to /var/log/messages and after that the CPU load average climbs to 20-30; any attempts to shut down the system, kill processes, unmount filesystems, etc. are in vain. Then the system freezes completely: no keyboard, nothing.
cheers, martin From owner-xfs@oss.sgi.com Tue Nov 14 06:44:30 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 14 Nov 2006 06:44:38 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAEEiSaG019650 for ; Tue, 14 Nov 2006 06:44:30 -0800 X-ASG-Debug-ID: 1163515420-20113-272-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id B7003D1C909C for ; Tue, 14 Nov 2006 06:43:40 -0800 (PST) Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id AC7A8187B8C1E; Tue, 14 Nov 2006 08:43:39 -0600 (CST) Message-ID: <4559D61B.3060302@sandeen.net> Date: Tue, 14 Nov 2006 08:43:39 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Timothy Shimmin CC: "Stephen C. 
Rigler" , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: RHEL 4 Compatible Kernel Module Code Subject: Re: RHEL 4 Compatible Kernel Module Code References: <1163434210.25484.14.camel@houuc8> <4558B4DC.3050406@sandeen.net> <30F3263F8874E466D6206C56@timothy-shimmins-power-mac-g5.local> In-Reply-To: <30F3263F8874E466D6206C56@timothy-shimmins-power-mac-g5.local> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25983 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9641 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 607 Lines: 18 Timothy Shimmin wrote: > Hi Eric, > If you are referring to Mandy's incore extent changes, > I don't see them in sles9 or sles10. > > Having a quick look she has about 5 Mods. Ok, that's what I thought... pretty big change for a stable branch. Though I'm surprised that it's not in sles10! I do have an updated rhel4 module rpm going now; there were about 8 more patches that the sgi guys deemed worthy for sles9, so I merged them in. Tidying up the specfile and need to do some tests; if anyone wants a preview, let me know off-list. I hope to get it tested & available in a day or two. 
-Eric From owner-xfs@oss.sgi.com Tue Nov 14 09:44:22 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 14 Nov 2006 09:44:30 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAEHiKaG008662 for ; Tue, 14 Nov 2006 09:44:22 -0800 X-ASG-Debug-ID: 1163526211-4838-962-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ext.agami.com (64.221.212.177.ptr.us.xo.net [64.221.212.177]) by cuda.sgi.com (Spam Firewall) with ESMTP id BD3DBD1D5AE5 for ; Tue, 14 Nov 2006 09:43:31 -0800 (PST) Received: from agami.com ([192.168.168.135]) by ext.agami.com (8.12.5/8.12.5) with ESMTP id kAEHhV7M025044 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Tue, 14 Nov 2006 09:43:31 -0800 Received: from [10.123.4.231] (ind-1.agami.com [10.123.4.231]) (authenticated bits=0) by agami.com (8.12.11/8.12.11) with ESMTP id kAEHhBw1024689 for ; Tue, 14 Nov 2006 09:43:11 -0800 Message-ID: <4559FD92.4040203@agami.com> Date: Tue, 14 Nov 2006 09:32:02 -0800 From: Shailendra Tripathi User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: xfs@oss.sgi.com X-ASG-Orig-Subj: No Mails Subject: No Mails References: <1163095715.5632.102.camel@xenon.msp.redhat.com> <20061110011018.GP8394166@melbourne.sgi.com> In-Reply-To: <20061110011018.GP8394166@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.36 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.25995 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9642 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: 
stripathi@agami.com Precedence: bulk X-list: xfs Content-Length: 173 Lines: 5 Not getting any mail on the mailing list. I am not sure if there is no activity or I am not getting any mail. I suspect the latter, how do I re-enable myself. -shailendra From owner-xfs@oss.sgi.com Tue Nov 14 12:11:16 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 14 Nov 2006 12:11:24 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAEKBEaG022861 for ; Tue, 14 Nov 2006 12:11:16 -0800 X-ASG-Debug-ID: 1163535025-25780-403-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id E343B51BDF6 for ; Tue, 14 Nov 2006 12:10:25 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kAEKALT5016353; Tue, 14 Nov 2006 15:10:21 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAEKALOl018712; Tue, 14 Nov 2006 15:10:21 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAEKAJJZ022894; Tue, 14 Nov 2006 15:10:21 -0500 Message-ID: <455A22AB.3000802@sandeen.net> Date: Tue, 14 Nov 2006 14:10:19 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: Shailendra Tripathi CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: No Mails Subject: Re: No Mails References: <1163095715.5632.102.camel@xenon.msp.redhat.com> <20061110011018.GP8394166@melbourne.sgi.com> <4559FD92.4040203@agami.com> In-Reply-To: <4559FD92.4040203@agami.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 
QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26005 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9643 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 362 Lines: 14 Shailendra Tripathi wrote: > Not getting any mail on the mailing list. I am not sure if there is no > activity or I am not getting any mail. > I suspect the latter, how do I re-enable myself. > -shailendra > > odd, looks like you are not subscribed now... try following the instructions at http://oss.sgi.com/projects/xfs/mail.html to resubscribe... -Eric From owner-xfs@oss.sgi.com Tue Nov 14 12:46:14 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 14 Nov 2006 12:46:23 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAEKkEaG030582 for ; Tue, 14 Nov 2006 12:46:14 -0800 X-ASG-Debug-ID: 1163537124-23733-94-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from slurp.thebarn.com (cattelan-host202.dsl.visi.com [208.42.117.202]) by cuda.sgi.com (Spam Firewall) with ESMTP id 2AAD1D1D3129 for ; Tue, 14 Nov 2006 12:45:24 -0800 (PST) Received: from [10.0.0.12] (ease.thebarn.com [10.0.0.12]) (authenticated bits=0) by slurp.thebarn.com (8.13.8/8.13.8) with ESMTP id kAEKisXF088829; Tue, 14 Nov 2006 14:45:20 -0600 (CST) (envelope-from cattelan@thebarn.com) Message-ID: <455A2AC5.7030505@thebarn.com> Date: Tue, 14 Nov 2006 14:44:53 -0600 From: Russell Cattelan User-Agent: Mozilla Thunderbird 1.0.7 (Macintosh/20050923) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Shailendra Tripathi CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: No Mails Subject: Re: No Mails References: <1163095715.5632.102.camel@xenon.msp.redhat.com> 
<20061110011018.GP8394166@melbourne.sgi.com> <4559FD92.4040203@agami.com> In-Reply-To: <4559FD92.4040203@agami.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26007 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9644 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cattelan@thebarn.com Precedence: bulk X-list: xfs Content-Length: 1145 Lines: 30 Shailendra Tripathi wrote: > Not getting any mail on the mailing list. I am not sure if there is no > activity or I am not getting any mail. > I suspect the latter, how do I re-enable myself. > -shailendra > The list software will unsubscribe any address that has too many failures. Normally this is a good thing as the list maintainers don't have to keep pruning out dead email addresses. Unfortunately the list software has no way of knowing if the failure was due to a bad address or a rejected spam. In your case it turns out to be too many rejected spams. Unsub 0 10 Oct 22 - Nov 09 stripathi@agami.com 550 5.7.1 mail containing es.geocities.com rejected - sbl; see http://www.spamhaus.org/query/bl?ip=66.218.77.68 Unsub 0 10 Oct 22 - Nov 09 miken@agami.com 550 5.7.1 mail containing es.geocities.com rejected - sbl; see http://www.spamhaus.org/query/bl?ip=66.218.77.68 This may have something to do with the fact that oss's mail is running through a barracuda box. Guess you will have to resubscribe to the list. Sorry about the inconvenience.
-Russell Cattelan cattelan@xfs.org From owner-xfs@oss.sgi.com Tue Nov 14 16:01:29 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 14 Nov 2006 16:01:38 -0800 (PST) Received: from omx1.americas.sgi.com (omx1.americas.sgi.com [198.149.16.13]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAF01RaG010387 for ; Tue, 14 Nov 2006 16:01:29 -0800 Received: from internal-mail-relay1.corp.sgi.com (internal-mail-relay1.corp.sgi.com [198.149.32.52]) by omx1.americas.sgi.com (8.12.10/8.12.9/linux-outbound_gateway-1.1) with ESMTP id kAF00fnx003424 for ; Tue, 14 Nov 2006 18:00:41 -0600 Received: from [134.15.160.8] (vpn-emea-sw-emea-160-8.emea.sgi.com [134.15.160.8]) by internal-mail-relay1.corp.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id kAF00Ubj57518192; Tue, 14 Nov 2006 16:00:32 -0800 (PST) Message-ID: <455A589E.4040607@sgi.com> Date: Wed, 15 Nov 2006 00:00:30 +0000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com Organization: SGI User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.7.12) Gecko/20050920 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Vlad Apostolov CC: Shailendra Tripathi , xfs mailing list , xfs-dev@sgi.com Subject: Re: xfs_bmap_add_extent_delay_real: Uninited r[3] corrupts startoff References: <4529F8A8.6080900@agami.com> <452C44A2.7000907@sgi.com> In-Reply-To: <452C44A2.7000907@sgi.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9645 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Content-Length: 5574 Lines: 155 This should be all that's needed. This code handles the case where the middle portion of a delayed allocation is being converted and splits the extent into three. The r[1] extent is the rightmost extent that will remain a delayed allocation. 
Both br_startblock and br_state need to be set up and they will be the same as the original delayed allocation (PREV) so we just inherit those values.

Comments?

--- fs/xfs/xfs_bmap.c_1.358	2006-11-01 14:44:38.000000000 +0000
+++ fs/xfs/xfs_bmap.c	2006-11-02 13:22:41.000000000 +0000
@@ -1171,6 +1171,7 @@
 		xfs_bmap_trace_pre_update(fname, "0", ip, idx, XFS_DATA_FORK);
 		xfs_bmbt_set_blockcount(ep, temp);
 		r[0] = *new;
+		r[1] = PREV;
 		r[1].br_startoff = new_endoff;
 		temp2 = PREV.br_startoff + PREV.br_blockcount - new_endoff;
 		r[1].br_blockcount = temp2;

Lachlan

Vlad Apostolov wrote:
> Hi Shailendra,
>
> Shailendra Tripathi wrote:
>
>> Hi,
>> It appears that uninitialized r[3] in
>> xfs_bmap_add_extent_delay_real can potentially corrupt the startoff
>> for a particular case.
>>
>> This sequence is below:
>>
>> xfs_bmap_add_extent_delay_real (
>> ...
>>     xfs_bmbt_irec_t r[3]; /* neighbor extent entries */
>>
>>     case 0:
>>         /*
>>          * Filling in the middle part of a previous delayed allocation.
>>          * Contiguity is impossible here.
>>          * This case is avoided almost all the time.
>>          */
>>         temp = new->br_startoff - PREV.br_startoff;
>>         xfs_bmbt_set_blockcount(ep, temp);
>>         r[0] = *new;
>>         r[1].br_startoff = new_endoff;
>>         temp2 = PREV.br_startoff + PREV.br_blockcount - new_endoff;
>>         r[1].br_blockcount = temp2;
>>         xfs_bmap_insert_exlist(ip, idx + 1, 2, &r[0], XFS_DATA_FORK);
>>         ip->i_df.if_lastex = idx + 1;
>>         ip->i_d.di_nextents++;
>>
>> Look at extent r[1]. It does not set br_startblock. That is, it is any
>> random value. Now, look at xfs_bmbt_set_all. Though it sets the
>> blockcount later, the startoff does not get changed.
>>
>> #if XFS_BIG_BLKNOS
>>     ASSERT((s->br_startblock & XFS_MASK64HI(12)) == 0);
>>     r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) |
>>             ((xfs_bmbt_rec_base_t)s->br_startoff << 9) |
>>             ((xfs_bmbt_rec_base_t)s->br_startblock >> 43);
>> The top 21 bits are taken as they are. However, only 9 bits should be taken.
>> So, for random values, it corrupts the startoff, which spans bits 9-63.
>
> From the code inspection I agree with you that br_startblock doesn't
> appear to be initialized in this scenario. Otherwise I think the code
> looks good. If br_startblock is initialized it should be a value that
> fits in 52 bits out of 64 (this is what the ASSERT is for) and the top
> 12 bits will be 0.
> The r->l0 gets the top 21 bits of br_startblock, the most significant
> 12 bits of which are 0 and the least significant 9 could be non-zero.
> The r->l1 gets the remaining 43 (= 52-9 = 64-21) bits of br_startblock.
>
> I will open a bug report for the uninitialized br_startblock.
>
> Thank you for finding this problem.
>
> Regards,
> Vlad
>
>>
>>     r->l1 = ((xfs_bmbt_rec_base_t)s->br_startblock << 21) |
>>             ((xfs_bmbt_rec_base_t)s->br_blockcount &
>>             (xfs_bmbt_rec_base_t)XFS_MASK64LO(21));
>>
>> I have attached a small program which does the same thing as is
>> being done here. I would appreciate it if someone can verify that
>> this assertion is correct.
>>
>>
>> Regards,
>> Shailendra
>> ------------------------------------------------------------------------
>>
>> #include <stdio.h>
>> typedef unsigned long __uint64_t;
>> typedef struct xfs_bmbt_rec_64
>> {
>>         __uint64_t l0, l1;
>> } xfs_bmbt_rec_64_t;
>>
>> typedef __uint64_t xfs_bmbt_rec_base_t;
>> typedef xfs_bmbt_rec_64_t xfs_bmbt_rec_t, xfs_bmdr_rec_t;
>>
>> typedef enum {
>>         XFS_EXT_NORM, XFS_EXT_UNWRITTEN,
>>         XFS_EXT_DMAPI_OFFLINE
>> } xfs_exntst_t;
>>
>> typedef struct xfs_bmbt_irec
>> {
>>         __uint64_t br_startoff;     /* starting file offset */
>>         __uint64_t br_startblock;   /* starting block number */
>>         __uint64_t br_blockcount;   /* number of blocks */
>>         xfs_exntst_t br_state;      /* extent state */
>> } xfs_bmbt_irec_t;
>>
>> #define XFS_MASK64LO(n) (((__uint64_t)1 << (n)) - 1)
>> #define XFS_MASK64HI(n) ((__uint64_t)-1 << (64 - (n)))
>>
>> int main(void) {
>>         xfs_bmbt_irec_t s;
>>         xfs_bmbt_rec_t r;
>>         int extent_flag;
>>
>>         s.br_startoff = 0;
>>         s.br_blockcount = 5;
>>         s.br_startblock = 0xfffffffffffffff0;
>>         extent_flag = (s.br_state == XFS_EXT_NORM) ? 0 : 1;
>>
>>         printf("blockcount = 0x%llx\n", s.br_startblock);
>>         r.l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) |
>>                 ((xfs_bmbt_rec_base_t)s.br_startoff << 9) |
>>                 ((xfs_bmbt_rec_base_t)s.br_startblock >> 43);
>>         r.l1 = ((xfs_bmbt_rec_base_t)s.br_startblock << 21) |
>>                 ((xfs_bmbt_rec_base_t)s.br_blockcount &
>>                 (xfs_bmbt_rec_base_t)XFS_MASK64LO(21));
>>
>>         printf("l0 = 0x%llx l1 = 0x%llx\n", r.l0, r.l1);
>>
>>         r.l0 = (r.l0 & (xfs_bmbt_rec_base_t)XFS_MASK64HI(55)) |
>>                 (xfs_bmbt_rec_base_t)((__uint64_t)100 >> 43);
>>         r.l1 = (r.l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)) |
>>                 (xfs_bmbt_rec_base_t)((__uint64_t)100 << 21);
>>
>>         printf("l0 = 0x%llx l1 = 0x%llx\n", r.l0, r.l1);
>>         return 0;
>> }
>>
>
>
>

From owner-xfs@oss.sgi.com Tue Nov 14 16:44:59 2006
Message-ID: <455A600D.8010803@agami.com>
Date: Tue, 14 Nov 2006 16:32:13 -0800
From: Shailendra Tripathi
To: lachlan@sgi.com
CC: Vlad Apostolov, xfs mailing list, xfs-dev@sgi.com
Subject: Re: xfs_bmap_add_extent_delay_real: Uninited r[3] corrupts startoff
References: <4529F8A8.6080900@agami.com> <452C44A2.7000907@sgi.com> <455A589E.4040607@sgi.com>
In-Reply-To: <455A589E.4040607@sgi.com>
X-original-sender: stripathi@agami.com

Hi Lachlan,
I would prefer manual assignment here rather than struct assignment.
r[1].br_startoff and r[1].br_blockcount will be modified immediately,
so it is not worth assigning via (r[1] = PREV), as it executes extra
instructions. The compiler would most likely eliminate the extra
assignment, but why leave it to the wit of the compiler?

It should be like:

r[1].br_state = PREV.br_state;
r[1].br_startblock = 0; /* No fancy stuff required here as the aim here
                           is that br_startoff does not get anything random */

Regards,
Shailendra

Lachlan McIlroy wrote:
> This should be all that's needed.  This code handles the case where
> the middle portion of a delayed allocation is being converted and
> splits the extent into three.  The r[1] extent is the rightmost extent
> that will remain a delayed allocation.  Both br_startblock and
> br_state need to be setup and they will be the same as the original
> delayed allocation (PREV) so we just inherit those values.  Comments?
>
> --- fs/xfs/xfs_bmap.c_1.358	2006-11-01 14:44:38.000000000 +0000
> +++ fs/xfs/xfs_bmap.c	2006-11-02 13:22:41.000000000 +0000
> @@ -1171,6 +1171,7 @@
>                  xfs_bmap_trace_pre_update(fname, "0", ip, idx, XFS_DATA_FORK);
>                  xfs_bmbt_set_blockcount(ep, temp);
>                  r[0] = *new;
> +                r[1] = PREV;
>                  r[1].br_startoff = new_endoff;
>                  temp2 = PREV.br_startoff + PREV.br_blockcount - new_endoff;
>                  r[1].br_blockcount = temp2;
>
> Lachlan
>
> Vlad Apostolov wrote:
>> Hi Shailendra,
>>
>> Shailendra Tripathi wrote:
>>
>>> Hi,
>>> It appears that uninitialized r[3] in
>>> xfs_bmap_add_extent_delay_real can potentially corrupt the startoff
>>> for a particular case.
>>>
>>> This sequence is below:
>>>
>>> xfs_bmap_add_extent_delay_real (
>>> ...
>>>         xfs_bmbt_irec_t r[3];   /* neighbor extent entries */
>>>
>>>         case 0:
>>>                 /*
>>>                  * Filling in the middle part of a previous delayed allocation.
>>>                  * Contiguity is impossible here.
>>>                  * This case is avoided almost all the time.
>>>                  */
>>>                 temp = new->br_startoff - PREV.br_startoff;
>>>                 xfs_bmbt_set_blockcount(ep, temp);
>>>                 r[0] = *new;
>>>                 r[1].br_startoff = new_endoff;
>>>                 temp2 = PREV.br_startoff + PREV.br_blockcount - new_endoff;
>>>                 r[1].br_blockcount = temp2;
>>>                 xfs_bmap_insert_exlist(ip, idx + 1, 2, &r[0], XFS_DATA_FORK);
>>>                 ip->i_df.if_lastex = idx + 1;
>>>                 ip->i_d.di_nextents++;
>>>
>>> Look at extent r[1]. It does not set br_startblock. That is, it is
>>> any random value. Now, look at the xfs_bmbt_set_all. Though, it sets
>>> the blockcount later, the startoff does not get changed.
>>>
>>> #if XFS_BIG_BLKNOS
>>>         ASSERT((s->br_startblock & XFS_MASK64HI(12)) == 0);
>>>         r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) |
>>>                 ((xfs_bmbt_rec_base_t)s->br_startoff << 9) |
>>>                 ((xfs_bmbt_rec_base_t)s->br_startblock >> 43);
>>> Top 21 bits are taken as it is. However, only 9 bit should be taken.
>>> So, for random values, it corrupts the startoff which from 9-63 bits.
>>
>> From the code inspection I agree with you that br_startblock doesn't
>> appear to be initialized in this scenario. Otherwise I think the code
>> looks good.
>> If the br_startblock is initialized it should be a value that fits
>> in 52 bits out of 64 (this is what the ASSERT is for) and the top 12
>> bits will be 0.
>> The r->l0 gets the top 21 bits of br_startblock, the most significant
>> 12 bits of which are 0 and least significant 9 could be non 0. The
>> r->l1 gets the rest 43 (= 52-9 = 64-21) bits of br_startblock.
>>
>> I will open a bug report for the uninitialized br_startblock.
>>
>> Thank you for finding this problem.
>>
>> Regards,
>> Vlad
>>
>>>
>>>         r->l1 = ((xfs_bmbt_rec_base_t)s->br_startblock << 21) |
>>>                 ((xfs_bmbt_rec_base_t)s->br_blockcount &
>>>                 (xfs_bmbt_rec_base_t)XFS_MASK64LO(21));
>>>
>>> I have attached a small program which does the same thing as it is
>>> being done here. I would appreciate if someone can verify that
>>> assertion is correct.
>>>
>>>
>>> Regards,
>>> Shailendra
>>> ------------------------------------------------------------------------
>>>
>>>
>>> #include <stdio.h>
>>> typedef unsigned long __uint64_t;
>>> typedef struct xfs_bmbt_rec_64
>>> {
>>>         __uint64_t l0, l1;
>>> } xfs_bmbt_rec_64_t;
>>>
>>> typedef __uint64_t xfs_bmbt_rec_base_t;
>>> typedef xfs_bmbt_rec_64_t xfs_bmbt_rec_t, xfs_bmdr_rec_t;
>>>
>>> typedef enum {
>>>         XFS_EXT_NORM, XFS_EXT_UNWRITTEN,
>>>         XFS_EXT_DMAPI_OFFLINE
>>> } xfs_exntst_t;
>>>
>>> typedef struct xfs_bmbt_irec
>>> {
>>>         __uint64_t br_startoff;     /* starting file offset */
>>>         __uint64_t br_startblock;   /* starting block number */
>>>         __uint64_t br_blockcount;   /* number of blocks */
>>>         xfs_exntst_t br_state;      /* extent state */
>>> } xfs_bmbt_irec_t;
>>>
>>> #define XFS_MASK64LO(n) (((__uint64_t)1 << (n)) - 1)
>>> #define XFS_MASK64HI(n) ((__uint64_t)-1 << (64 - (n)))
>>>
>>> int main(void) {
>>>         xfs_bmbt_irec_t s;
>>>         xfs_bmbt_rec_t r;
>>>         int extent_flag;
>>>
>>>         s.br_startoff = 0;
>>>         s.br_blockcount = 5;
>>>         s.br_startblock = 0xfffffffffffffff0;
>>>         extent_flag = (s.br_state == XFS_EXT_NORM) ? 0 : 1;
>>>
>>>         printf("blockcount = 0x%llx\n", s.br_startblock);
>>>         r.l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) |
>>>                 ((xfs_bmbt_rec_base_t)s.br_startoff << 9) |
>>>                 ((xfs_bmbt_rec_base_t)s.br_startblock >> 43);
>>>         r.l1 = ((xfs_bmbt_rec_base_t)s.br_startblock << 21) |
>>>                 ((xfs_bmbt_rec_base_t)s.br_blockcount &
>>>                 (xfs_bmbt_rec_base_t)XFS_MASK64LO(21));
>>>
>>>         printf("l0 = 0x%llx l1 = 0x%llx\n", r.l0, r.l1);
>>>
>>>         r.l0 = (r.l0 & (xfs_bmbt_rec_base_t)XFS_MASK64HI(55)) |
>>>                 (xfs_bmbt_rec_base_t)((__uint64_t)100 >> 43);
>>>         r.l1 = (r.l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)) |
>>>                 (xfs_bmbt_rec_base_t)((__uint64_t)100 << 21);
>>>
>>>         printf("l0 = 0x%llx l1 = 0x%llx\n", r.l0, r.l1);
>>>         return 0;
>>> }
>>>
>>
>>
>>

From owner-xfs@oss.sgi.com Tue Nov 14 17:21:17 2006
Message-ID: <455A68AF.8030309@agami.com>
Date: Tue, 14 Nov 2006 17:09:03 -0800
From: Shailendra Tripathi
To: David Chinner
CC: xfs-dev@sgi.com, xfs@oss.sgi.com
Subject: Re: [RFC 0/3] Convert XFS inode hashes to radix trees
References: <20061003060610.GV3024@melbourne.sgi.com>
In-Reply-To: <20061003060610.GV3024@melbourne.sgi.com>
X-original-sender: stripathi@agami.com

Hi David,
I regret making comments and asking questions on this quite late
(somehow I missed to email). It does appear to me that using this
approach can potentially help in cluster hash list related
manipulations. However, this appears (to me) to come at the cost of
regular inode lookup. As of now, each of the hash buckets has its own
lock. This helps keep the xfs_iget operations from becoming hot. I have
not seen xfs_iget anywhere near the top in my profiling of Linux for
SPECFS. With the current code, the number of hash buckets can be
appropriately sized (based upon memory availability).

However, it appears to me that the radix trees (even with 15 of them)
can become a bottleneck. Let's assume that there are 600K inodes on a
reasonably big high-end system. Assuming fair distribution, each radix
tree will hold 600K/15 ~ 40K inodes. Insertion into and deletion from a
tree have to take the writer lock and, given their frequency, both
readers (lookups) and writers will be affected.
That means that if one tree is locked for insertion or deletion, access
to the remaining 40K inodes in that tree is serialized. However, in the
current design, by sacrificing a little extra memory, we can allocate
more hash buckets, so the number of inodes locked down at once can be
made pretty small. My knowledge of radix trees is a little limited, but
I think increasing the number of trees would be much more costly in
memory terms. Given its lower memory usage and its performance, I tend
to believe that a hash table is more scalable than a radix tree for
inode tables.

Have you done any performance testing with these patches? I am quite
curious to know the results. If not, maybe I can try to do some perf
testing with these changes, albeit on an old kernel tree.

Am I missing something here? Please let me know.

Thanks and Regards,
Shailendra

David Chinner wrote:
> One of the long standing problems with XFS on large machines and
> filesystems is the sizing of the inode cache hashes used by XFS to
> index the xfs_inode_t structures. The mount option ihashsize became
> a necessity because the default calculations simply can't get it
> right for all situations.
>
> On top of that, as we increase the size of the inode hash and cache
> more inodes, the inode cluster hash becomes the limiting factor,
> especially when we have sparse cluster population. The result of
> this is that we can always get to the point where either the ihash
> or the chash is a scalability or performance limitation.
>
> The following three patches replace the hashes with a more scalable
> solution that should not require tweaking in most situations.
>
> I chose a radix tree to replace the hash chains because of a neat
> alignment of XFS inode structures and the kernel radix tree fanout.
> XFS allocates inodes in clusters of 64 inodes and the radix tree
> keeps 64 sequential entries per node. That means all of the inodes
> in a cluster will always sit in the same node of the radix tree.
>
> Using this relationship, we completely remove the need for the
> cluster hash to track clusters because we can use a gang lookup on
> the radix tree to search for an existing inode in the cluster in an
> efficient manner.
>
> The following three patches sit on top of the recently posted
> i_flags cleanup patch.
> (http://marc.theaimsgroup.com/?l=linux-xfs&m=115985254820322&w=2)
>
> The first patch replaces the inode hash chains with radix trees. A
> single radix tree with a read/write lock does not provide enough
> parallelism to prevent performance regressions under simultaneous
> create/unlink workloads, so we hash the inode clusters into
> different radix trees each with their own read/write lock. The
> default is to create (2*ncpus)-1 radix trees up to a maximum of 15.
> At this point I have left the ihashsize mount option alone but
> limited the maximum number it can take to 128. If you specify more
> than 128 (i.e. everyone currently using this mount option), it
> falls back to the default.
>
> The second patch introduces a per-cluster object lock for chaining
> the inodes in the cluster together (for xfs_iflush()). The inode
> chain is currently locked by cluster hash chain lock, so we need
> some other method of locking if we are to remove the cluster hash
> altogether.
>
> The third patch removes the cluster hash and replaces it with some
> masking and a radix tree gang lookup.
>
> Overall, the patchset removes more than 200 lines of code from the
> xfs inode caching and lookup code and provides more consistent
> scalability for large numbers of cached inodes. The only down side
> is that it limits us to 32 bit inode numbers on 32 bit platforms due
> to the way the radix tree uses unsigned longs for its indexes.
>
> Comments, thoughts, etc are welcome.
>
> Cheers,
>
> Dave.
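
The sizing figures traded in the exchange above can be checked with a
few lines of C. This is only a sketch under stated assumptions: the
names are hypothetical and are not taken from the patches, the modulo
hash used to pick a tree is an assumption (the RFC only says clusters
are hashed into different trees), and inode numbers are assumed to be
aligned so that a cluster is 64 consecutive numbers.

```c
#include <stdint.h>

/* From the RFC text: XFS allocates inodes in clusters of 64. */
#define SKETCH_INODES_PER_CLUSTER 64u
/* From the RFC text: (2*ncpus)-1 trees by default, at most 15. */
#define SKETCH_MAX_TREES 15u

/* Default number of radix trees for a machine with ncpus CPUs. */
static unsigned int sketch_default_ntrees(unsigned int ncpus)
{
    unsigned int n;

    if (ncpus == 0)
        return 1;   /* guard against unsigned underflow below */
    n = 2u * ncpus - 1u;
    return n > SKETCH_MAX_TREES ? SKETCH_MAX_TREES : n;
}

/* First inode number of the cluster that holds ino (clear the low 6 bits). */
static uint64_t sketch_cluster_first_ino(uint64_t ino)
{
    return ino & ~(uint64_t)(SKETCH_INODES_PER_CLUSTER - 1u);
}

/*
 * ASSUMPTION: trees are chosen by cluster index modulo the tree count.
 * All inodes of one cluster therefore land in the same tree.
 */
static unsigned int sketch_tree_index(uint64_t ino, unsigned int ntrees)
{
    return (unsigned int)((ino / SKETCH_INODES_PER_CLUSTER) % ntrees);
}

/* Even-distribution estimate of cached inodes behind one tree lock. */
static uint64_t sketch_inodes_per_tree(uint64_t ninodes, unsigned int ntrees)
{
    return ninodes / ntrees;
}
```

With 600000 cached inodes and 15 trees, sketch_inodes_per_tree() gives
40000, which is the per-lock population the reply is worried about; the
hash table alternative shrinks that number by adding buckets instead.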
>

From owner-xfs@oss.sgi.com Tue Nov 14 17:22:19 2006
Message-ID: <455A6B92.9060805@sgi.com>
Date: Wed, 15 Nov 2006 01:21:22 +0000
From: Lachlan McIlroy
Reply-To: lachlan@sgi.com
Organization: SGI
To: Shailendra Tripathi
CC: Vlad Apostolov, xfs mailing list, xfs-dev@sgi.com
Subject: Re: xfs_bmap_add_extent_delay_real: Uninited r[3] corrupts startoff
References: <4529F8A8.6080900@agami.com> <452C44A2.7000907@sgi.com> <455A589E.4040607@sgi.com> <455A600D.8010803@agami.com>
In-Reply-To: <455A600D.8010803@agami.com>
X-original-sender: lachlan@sgi.com

I considered that approach but wasn't keen on setting br_startblock to 0
when it should be NULLSTARTBLOCK.  Subsequent calls to xfs_bmbt_set_all()
handle NULLSTARTBLOCK differently but the net result ends up being the
same and the startblock eventually gets overridden anyway.  I'll go with
your suggestion.

Shailendra Tripathi wrote:
> Hi Lachlan,
> I would prefer manual assignment here rather than struct assignment.
> r[1].br_startoff and r[1].br_blockcount will be modified immediately,
> so it is not worth assigning via (r[1] = PREV), as it executes extra
> instructions. The compiler would most likely eliminate the extra
> assignment, but why leave it to the wit of the compiler?
>
> It should be like:
>
> r[1].br_state = PREV.br_state;
> r[1].br_startblock = 0; /* No fancy stuff required here as the aim here
>                            is that br_startoff does not get anything random */
>
> Regards,
> Shailendra
>
> Lachlan McIlroy wrote:
>
>> This should be all that's needed.  This code handles the case where
>> the middle portion of a delayed allocation is being converted and
>> splits the extent into three.  The r[1] extent is the rightmost extent
>> that will remain a delayed allocation.  Both br_startblock and
>> br_state need to be setup and they will be the same as the original
>> delayed allocation (PREV) so we just inherit those values.  Comments?
>>
>> --- fs/xfs/xfs_bmap.c_1.358	2006-11-01 14:44:38.000000000 +0000
>> +++ fs/xfs/xfs_bmap.c	2006-11-02 13:22:41.000000000 +0000
>> @@ -1171,6 +1171,7 @@
>>                  xfs_bmap_trace_pre_update(fname, "0", ip, idx, XFS_DATA_FORK);
>>                  xfs_bmbt_set_blockcount(ep, temp);
>>                  r[0] = *new;
>> +                r[1] = PREV;
>>                  r[1].br_startoff = new_endoff;
>>                  temp2 = PREV.br_startoff + PREV.br_blockcount - new_endoff;
>>                  r[1].br_blockcount = temp2;
>>
>> Lachlan
>>
>> Vlad Apostolov wrote:
>>
>>> Hi Shailendra,
>>>
>>> Shailendra Tripathi wrote:
>>>
>>>> Hi,
>>>> It appears that uninitialized r[3] in
>>>> xfs_bmap_add_extent_delay_real can potentially corrupt the startoff
>>>> for a particular case.
>>>>
>>>> This sequence is below:
>>>>
>>>> xfs_bmap_add_extent_delay_real (
>>>> ...
>>>>         xfs_bmbt_irec_t r[3];   /* neighbor extent entries */
>>>>
>>>>         case 0:
>>>>                 /*
>>>>                  * Filling in the middle part of a previous delayed allocation.
>>>>                  * Contiguity is impossible here.
>>>>                  * This case is avoided almost all the time.
>>>>                  */
>>>>                 temp = new->br_startoff - PREV.br_startoff;
>>>>                 xfs_bmbt_set_blockcount(ep, temp);
>>>>                 r[0] = *new;
>>>>                 r[1].br_startoff = new_endoff;
>>>>                 temp2 = PREV.br_startoff + PREV.br_blockcount - new_endoff;
>>>>                 r[1].br_blockcount = temp2;
>>>>                 xfs_bmap_insert_exlist(ip, idx + 1, 2, &r[0], XFS_DATA_FORK);
>>>>                 ip->i_df.if_lastex = idx + 1;
>>>>                 ip->i_d.di_nextents++;
>>>>
>>>> Look at extent r[1]. It does not set br_startblock. That is, it is
>>>> any random value. Now, look at the xfs_bmbt_set_all. Though, it sets
>>>> the blockcount later, the startoff does not get changed.
>>>>
>>>> #if XFS_BIG_BLKNOS
>>>>         ASSERT((s->br_startblock & XFS_MASK64HI(12)) == 0);
>>>>         r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) |
>>>>                 ((xfs_bmbt_rec_base_t)s->br_startoff << 9) |
>>>>                 ((xfs_bmbt_rec_base_t)s->br_startblock >> 43);
>>>> Top 21 bits are taken as it is. However, only 9 bit should be taken.
>>>> So, for random values, it corrupts the startoff which from 9-63 bits.
>>>
>>>
>>> From the code inspection I agree with you that br_startblock doesn't
>>> appear to be initialized in this scenario. Otherwise I think the code
>>> looks good.
>>> If the br_startblock is initialized it should be a value that fits
>>> in 52 bits out of 64 (this is what the ASSERT is for) and the top 12
>>> bits will be 0.
>>> The r->l0 gets the top 21 bits of br_startblock, the most significant
>>> 12 bits of which are 0 and least significant 9 could be non 0. The
>>> r->l1 gets the rest 43 (= 52-9 = 64-21) bits of br_startblock.
>>>
>>> I will open a bug report for the uninitialized br_startblock.
>>>
>>> Thank you for finding this problem.
>>>
>>> Regards,
>>> Vlad
>>>
>>>>
>>>>         r->l1 = ((xfs_bmbt_rec_base_t)s->br_startblock << 21) |
>>>>                 ((xfs_bmbt_rec_base_t)s->br_blockcount &
>>>>                 (xfs_bmbt_rec_base_t)XFS_MASK64LO(21));
>>>>
>>>> I have attached a small program which does the same thing as it is
>>>> being done here. I would appreciate if someone can verify that
>>>> assertion is correct.
>>>>
>>>>
>>>> Regards,
>>>> Shailendra
>>>> ------------------------------------------------------------------------
>>>>
>>>>
>>>> #include <stdio.h>
>>>> typedef unsigned long __uint64_t;
>>>> typedef struct xfs_bmbt_rec_64
>>>> {
>>>>         __uint64_t l0, l1;
>>>> } xfs_bmbt_rec_64_t;
>>>>
>>>> typedef __uint64_t xfs_bmbt_rec_base_t;
>>>> typedef xfs_bmbt_rec_64_t xfs_bmbt_rec_t, xfs_bmdr_rec_t;
>>>>
>>>> typedef enum {
>>>>         XFS_EXT_NORM, XFS_EXT_UNWRITTEN,
>>>>         XFS_EXT_DMAPI_OFFLINE
>>>> } xfs_exntst_t;
>>>>
>>>> typedef struct xfs_bmbt_irec
>>>> {
>>>>         __uint64_t br_startoff;     /* starting file offset */
>>>>         __uint64_t br_startblock;   /* starting block number */
>>>>         __uint64_t br_blockcount;   /* number of blocks */
>>>>         xfs_exntst_t br_state;      /* extent state */
>>>> } xfs_bmbt_irec_t;
>>>>
>>>> #define XFS_MASK64LO(n) (((__uint64_t)1 << (n)) - 1)
>>>> #define XFS_MASK64HI(n) ((__uint64_t)-1 << (64 - (n)))
>>>>
>>>> int main(void) {
>>>>         xfs_bmbt_irec_t s;
>>>>         xfs_bmbt_rec_t r;
>>>>         int extent_flag;
>>>>
>>>>         s.br_startoff = 0;
>>>>         s.br_blockcount = 5;
>>>>         s.br_startblock = 0xfffffffffffffff0;
>>>>         extent_flag = (s.br_state == XFS_EXT_NORM) ? 0 : 1;
>>>>
>>>>         printf("blockcount = 0x%llx\n", s.br_startblock);
>>>>         r.l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) |
>>>>                 ((xfs_bmbt_rec_base_t)s.br_startoff << 9) |
>>>>                 ((xfs_bmbt_rec_base_t)s.br_startblock >> 43);
>>>>         r.l1 = ((xfs_bmbt_rec_base_t)s.br_startblock << 21) |
>>>>                 ((xfs_bmbt_rec_base_t)s.br_blockcount &
>>>>                 (xfs_bmbt_rec_base_t)XFS_MASK64LO(21));
>>>>
>>>>         printf("l0 = 0x%llx l1 = 0x%llx\n", r.l0, r.l1);
>>>>
>>>>         r.l0 = (r.l0 & (xfs_bmbt_rec_base_t)XFS_MASK64HI(55)) |
>>>>                 (xfs_bmbt_rec_base_t)((__uint64_t)100 >> 43);
>>>>         r.l1 = (r.l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)) |
>>>>                 (xfs_bmbt_rec_base_t)((__uint64_t)100 << 21);
>>>>
>>>>         printf("l0 = 0x%llx l1 = 0x%llx\n", r.l0, r.l1);
>>>>         return 0;
>>>> }
>>>>
>>>
>>>
>>>
>>>
>
>

From owner-xfs@oss.sgi.com Wed Nov 15 01:35:22 2006
Message-ID: <18773.62.159.242.114.1163581669.squirrel@groupware.pawisda.de>
Date: Wed, 15 Nov 2006 10:07:49 +0100 (CET)
Subject: SLES 9 x86: xfs_force_shutdown
From: "Dan Am"
To: xfs@oss.sgi.com
X-original-sender: xfs@lonx.net

Hello,
this has come up before. I have read up on some threads, especially
from February this year, and it is probably handled by now. (Is it?)

Here's the log:
Nov 14 15:31:05 l08arnfs01 kernel: xfs_force_shutdown(dm-0,0x8) called from line 1091 of file fs/xfs/xfs_trans.c. Return address = 0xffffffffa01035e8
Nov 14 15:42:55 l08arnfs01 kernel: xfs_force_shutdown(dm-0,0x1) called from line 353 of file fs/xfs/xfs_rw.c. Return address = 0xffffffffa01035e8

Novell has not yet patched up from our kernel version "2.6.5.7-244",
and in our environment I cannot switch to vanilla.

Size of device: 1T.

My current mount options are:
/dev/mapper/vg-l0 on /data type xfs (rw,logbufs=8,logbsize=32768,biosize=16)

Can I tweak something here, or elsewhere, so the effects of the bug are
lessened until Novell patches?

TIA
Best
Dan

From owner-xfs@oss.sgi.com Wed Nov 15 07:20:01 2006
Message-ID: <455B2FDD.3090303@sandeen.net>
Date: Wed, 15 Nov 2006 09:18:53 -0600
From: Eric Sandeen
To: Dan Am
CC: xfs@oss.sgi.com
Subject: Re: SLES 9 x86: xfs_force_shutdown
References: <18773.62.159.242.114.1163581669.squirrel@groupware.pawisda.de>
In-Reply-To: <18773.62.159.242.114.1163581669.squirrel@groupware.pawisda.de>
X-original-sender: sandeen@sandeen.net

Dan Am wrote:
> Hello,
> this has come up before. I have read up on some threads, especially
> from February this year, and it is probably handled by now. (Is it?)
>
> Here's the log:
> Nov 14 15:31:05 l08arnfs01 kernel: xfs_force_shutdown(dm-0,0x8) called
> from line 1091 of file fs/xfs/xfs_trans.c. Return address =
> 0xffffffffa01035e8
> Nov 14 15:42:55 l08arnfs01 kernel: xfs_force_shutdown(dm-0,0x1) called
> from line 353 of file fs/xfs/xfs_rw.c. Return address =
> 0xffffffffa01035e8

Any interesting messages before this, from xfs, or from storage
subsystems, or anywhere?

-Eric

From owner-xfs@oss.sgi.com Wed Nov 15 20:55:34 2006
Message-Id: <200611160454.AA04682@TNESG9305.tnes.nec.co.jp>
Date: Thu, 16 Nov 2006 13:54:26 +0900
To: xfs@oss.sgi.com
Subject: [PATCH] xfs_db initialization
From: Utako Kusaka
X-original-sender: utako@tnes.nec.co.jp

Hi,

This patch fixes a segmentation fault in xfs_db that occurs when a
corrupt file system is specified.
Signed-off-by: Utako Kusaka --- --- xfsprogs-2.8.15-orgn/libxfs/init.c 2006-10-18 01:10:14.000000000 +0900 +++ xfsprogs-2.8.15/libxfs/init.c 2006-11-16 10:03:18.575412805 +0900 @@ -595,8 +595,8 @@ libxfs_mount( fprintf(stderr, _("%s: data size check failed\n"), progname); if (!(flags & LIBXFS_MOUNT_DEBUGGER)) return NULL; - } - libxfs_putbuf(bp); + } else + libxfs_putbuf(bp); if (mp->m_logdev && mp->m_logdev != mp->m_dev) { d = (xfs_daddr_t) XFS_FSB_TO_BB(mp, mp->m_sb.sb_logblocks); @@ -610,7 +610,8 @@ libxfs_mount( if (!(flags & LIBXFS_MOUNT_DEBUGGER)) return NULL; } - libxfs_putbuf(bp); + if (bp) + libxfs_putbuf(bp); } /* Initialize realtime fields in the mount structure */ From owner-xfs@oss.sgi.com Thu Nov 16 13:17:27 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 13:17:35 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAGLHQaG013918 for ; Thu, 16 Nov 2006 13:17:27 -0800 X-ASG-Debug-ID: 1163711796-20890-271-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.188]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3A2F6D1E05F2 for ; Thu, 16 Nov 2006 13:16:36 -0800 (PST) Received: by nf-out-0910.google.com with SMTP id x30so1069256nfb for ; Thu, 16 Nov 2006 13:16:35 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:from:to:subject:date:user-agent:cc:mime-version:content-disposition:message-id:content-type:content-transfer-encoding; b=owGlxAYCKFJQjKufbgM7OvuMvSpbh/Erq6DdBwixUti9DBOz94kkqcQBX7GxJcPHou4b44umMz1fSae/i9VkDInBsDjwHnmrQ0pDvyIj/BjvhO+hCkfzN2brIa9cC2FnLqlZxMXP6RhLT1anwcW+u8pTAiSZTaU5bTUGARU/3IA= Received: by 10.49.41.18 with SMTP id t18mr1157787nfj.1163711795319; Thu, 16 Nov 2006 13:16:35 -0800 (PST) Received: from ?192.168.1.34? 
( [213.237.34.34]) by mx.google.com with ESMTP id l38sm9447475nfc.2006.11.16.13.16.33; Thu, 16 Nov 2006 13:16:34 -0800 (PST) From: Jesper Juhl To: linux-kernel@vger.kernel.org X-ASG-Orig-Subj: [PATCH][RFC][resend] potential NULL pointer deref in XFS on failed mount Subject: [PATCH][RFC][resend] potential NULL pointer deref in XFS on failed mount Date: Thu, 16 Nov 2006 22:18:26 +0100 User-Agent: KMail/1.9.4 Cc: xfs@oss.sgi.com, xfs-masters@oss.sgi.com, nathans@sgi.com, Jesper Juhl , Andrew Morton MIME-Version: 1.0 Content-Disposition: inline Message-Id: <200611162218.26945.jesper.juhl@gmail.com> Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26201 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9662 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jesper.juhl@gmail.com Precedence: bulk X-list: xfs Content-Length: 1583 Lines: 54 (got no reply on this when I originally send it on 20061031, so resending now that a bit of time has passed. The patch still applies cleanly to Linus' git tree as of today.) The Coverity checker spotted a potential problem in XFS. The problem is that if, in xfs_mount(), this code triggers: ... if (!mp->m_logdev_targp) goto error0; ... Then we'll end up calling xfs_unmountfs_close() with a NULL 'mp->m_logdev_targp'. This in turn will result in a call to xfs_free_buftarg() with its 'btp' argument == NULL. xfs_free_buftarg() dereferences 'btp' leading to a NULL pointer dereference and crash. 
I think this can happen, since the fatal call to xfs_free_buftarg() happens when 'm_logdev_targp != m_ddev_targp' and due to a check of 'm_ddev_targp' against NULL in xfs_mount() (and subsequent return if it is NULL) the two will never both be NULL when we hit the error0 label from the two lines cited above. Comments welcome (please keep me on Cc: on replies). Here's a proposed patch to fix this by testing 'btp' against NULL in xfs_free_buftarg(). Signed-off-by: Jesper Juhl --- fs/xfs/linux-2.6/xfs_buf.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/fs/xfs/linux-2.6/xfs_buf.c b/fs/xfs/linux-2.6/xfs_buf.c index db5f5a3..6ef1860 100644 --- a/fs/xfs/linux-2.6/xfs_buf.c +++ b/fs/xfs/linux-2.6/xfs_buf.c @@ -1450,6 +1450,9 @@ xfs_free_buftarg( xfs_buftarg_t *btp, int external) { + if (unlikely(!btp)) + return; + xfs_flush_buftarg(btp, 1); if (external) xfs_blkdev_put(btp->bt_bdev); From owner-xfs@oss.sgi.com Thu Nov 16 13:54:43 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 13:54:50 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAGLseaG017898 for ; Thu, 16 Nov 2006 13:54:41 -0800 X-ASG-Debug-ID: 1163714030-23530-42-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ext.agami.com (64.221.212.177.ptr.us.xo.net [64.221.212.177]) by cuda.sgi.com (Spam Firewall) with ESMTP id 5DE84D1E17CE; Thu, 16 Nov 2006 13:53:50 -0800 (PST) Received: from agami.com ([192.168.168.135]) by ext.agami.com (8.12.5/8.12.5) with ESMTP id kAGLZS7M027761 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 16 Nov 2006 13:35:28 -0800 Received: from [10.123.4.231] (ind-1.agami.com [10.123.4.231]) (authenticated bits=0) by agami.com (8.12.11/8.12.11) with ESMTP id kAGLZMP6020572; Thu, 16 Nov 2006 13:35:22 -0800 Message-ID: <455CD6C8.5030907@agami.com> Date: Thu, 16 Nov 2006 13:23:20 -0800 From: Shailendra Tripathi User-Agent: 
Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: Jesper Juhl CC: linux-kernel@vger.kernel.org, xfs@oss.sgi.com, xfs-masters@oss.sgi.com, nathans@sgi.com, Andrew Morton X-ASG-Orig-Subj: Re: [PATCH][RFC][resend] potential NULL pointer deref in XFS on failed mount Subject: Re: [PATCH][RFC][resend] potential NULL pointer deref in XFS on failed mount References: <200611162218.26945.jesper.juhl@gmail.com> In-Reply-To: <200611162218.26945.jesper.juhl@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.36 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26205 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9663 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stripathi@agami.com Precedence: bulk X-list: xfs Content-Length: 2253 Lines: 74 Hey Jesper, Rather, it can be done as below. Nothing to say that your code wouldn't work. Just that catch it early, so that potential function call overhead to call xfs_free_buftarg can be avoided. void xfs_unmountfs_close(xfs_mount_t *mp, struct cred *cr) { if (mp->m_logdev_targp && (mp->m_logdev_targp != mp->m_ddev_targp)) xfs_free_buftarg(mp->m_logdev_targp, 1); if (mp->m_rtdev_targp) xfs_free_buftarg(mp->m_rtdev_targp, 1); xfs_free_buftarg(mp->m_ddev_targp, 0); } Jesper Juhl wrote: > (got no reply on this when I originally send it on 20061031, so resending > now that a bit of time has passed. The patch still applies cleanly to > Linus' git tree as of today.) > > > The Coverity checker spotted a potential problem in XFS. > > The problem is that if, in xfs_mount(), this code triggers: > > ... 
> if (!mp->m_logdev_targp) > goto error0; > ... > > Then we'll end up calling xfs_unmountfs_close() with a NULL > 'mp->m_logdev_targp'. > This in turn will result in a call to xfs_free_buftarg() with its 'btp' > argument == NULL. xfs_free_buftarg() dereferences 'btp' leading to > a NULL pointer dereference and crash. > > I think this can happen, since the fatal call to xfs_free_buftarg() > happens when 'm_logdev_targp != m_ddev_targp' and due to a check of > 'm_ddev_targp' against NULL in xfs_mount() (and subsequent return if it is > NULL) the two will never both be NULL when we hit the error0 label from > the two lines cited above. > > Comments welcome (please keep me on Cc: on replies). > > Here's a proposed patch to fix this by testing 'btp' against NULL in > xfs_free_buftarg(). > > > Signed-off-by: Jesper Juhl > --- > > fs/xfs/linux-2.6/xfs_buf.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/fs/xfs/linux-2.6/xfs_buf.c b/fs/xfs/linux-2.6/xfs_buf.c > index db5f5a3..6ef1860 100644 > --- a/fs/xfs/linux-2.6/xfs_buf.c > +++ b/fs/xfs/linux-2.6/xfs_buf.c > @@ -1450,6 +1450,9 @@ xfs_free_buftarg( > xfs_buftarg_t *btp, > int external) > { > + if (unlikely(!btp)) > + return; > + > xfs_flush_buftarg(btp, 1); > if (external) > xfs_blkdev_put(btp->bt_bdev); > > > > From owner-xfs@oss.sgi.com Thu Nov 16 14:01:08 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 14:01:12 -0800 (PST) Received: from cuda.sgi.com (cuda0.sgi.com [192.48.168.32]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAGM17aG018832 for ; Thu, 16 Nov 2006 14:01:08 -0800 X-ASG-Debug-ID: 1163703632-7946-205-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id EA912DB68AC4 for ; Thu, 16 Nov 2006 11:00:32 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com 
(8.12.11.20060308/8.12.11) with ESMTP id kAGJ0Wjk007231 for ; Thu, 16 Nov 2006 14:00:32 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAGJ0VGl022127 for ; Thu, 16 Nov 2006 14:00:31 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAGJ0VbI004433 for ; Thu, 16 Nov 2006 14:00:31 -0500 Message-ID: <455CB54F.8080901@sandeen.net> Date: Thu, 16 Nov 2006 13:00:31 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: xfs@oss.sgi.com X-ASG-Orig-Subj: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Subject: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.22986 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9664 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 2422 Lines: 57 see also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212201 Bugzilla Bug 212201: Cannot build sysem with XFS file system. I turned on attr2 in FC6 at nathan's suggestion, for selinux goodness with more efficient xattr space usage. But, many reports that this was totally broken in fc6, on x86_64. Install went ok, but on reboot the filesystem was found to be corrupt. The filesystem was also found to be marked w/ attr1, not attr2.... 
If you do a fresh mkfs.xfs on x86_64, with -i attr=2, and dump out the superblock (or look at it with xfs_db) you will find that although the versionnum says that there is a morebits bit, the features2 flag is 0.

If you dd/hexdump the superblock, you will find the attr2 flag, but at the wrong offset.

This is because the xfs_sb_t struct is padded out to 64 bits on 64-bit arches, and the xfs_xlatesb() routine and xfs_sb_info[] array take this padding to mean that the last item is 4 bytes bigger than it is, and treat sb_features2 as 8 bytes, not four. This then gets endian-flipped out...

I can't quite figure out how this winds up causing problems if you stay on the x86_64 arch, as I'd expect that if the offset is wrong, it should at least be consistently wrong. And in fact if you do mkfs, mount, xfs_info, it will tell you that you do have attr2. But somewhere along the line things go wrong, and post-install, post-reboot, the filesystem thinks it is attr1, and is therefore corrupt.

I think that maybe some accesses are post-xfs_xlatesb, while others may access the un-flipped sb directly? Or maybe this is sb logging code that has messed things up? Not sure... needs more investigation.

In any case, dd does not lie, and this patch for the kernel, and a corresponding one for userspace, at least make "mkfs.xfs -i attr=2" put the features2 flag in the right place, as shown by inspection via dd.

Signed-off-by: Eric Sandeen

Index: linux-2.6.18/fs/xfs/xfs_sb.h
===================================================================
--- linux-2.6.18.orig/fs/xfs/xfs_sb.h
+++ linux-2.6.18/fs/xfs/xfs_sb.h
@@ -149,7 +149,7 @@ typedef struct xfs_sb
 	__uint16_t	sb_logsectsize;	/* sector size for the log, bytes */
 	__uint32_t	sb_logsunit;	/* stripe unit size for the log */
 	__uint32_t	sb_features2;	/* additional feature bits */
-} xfs_sb_t;
+} __attribute__ ((packed)) xfs_sb_t;
 
 /*
  * Sequence number values for the fields.

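The tail-padding effect described above can be reproduced in isolation. What follows is only a minimal sketch, not the real xfs_sb_t or xfs_sb_info[]: the structs `demo_sb` and `demo_sb_packed`, the offset tables, and `inferred_size()` are hypothetical names made up for illustration. It assumes that a field's size is inferred as the gap to the next entry of an offset table that ends in sizeof(), which is how the mail describes the last item being taken as 4 bytes bigger than it is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical two-field stand-in; NOT the real xfs_sb_t. */
struct demo_sb {
	uint64_t dblocks;   /* 8-byte member; raises struct alignment to 8 on typical 64-bit ABIs */
	uint32_t features2; /* 4-byte last field, possibly followed by compiler tail padding */
};

/* Same fields with the GCC/Clang packed attribute: no tail padding. */
struct demo_sb_packed {
	uint64_t dblocks;
	uint32_t features2;
} __attribute__((packed));

/* Offset tables terminated by sizeof(), in the style of an offset-driven translator. */
static const size_t demo_offsets[] = {
	offsetof(struct demo_sb, dblocks),
	offsetof(struct demo_sb, features2),
	sizeof(struct demo_sb),
};

static const size_t demo_offsets_packed[] = {
	offsetof(struct demo_sb_packed, dblocks),
	offsetof(struct demo_sb_packed, features2),
	sizeof(struct demo_sb_packed),
};

/* Size of field i, inferred as the distance to the next table entry. */
size_t inferred_size(const size_t *offsets, int i)
{
	return offsets[i + 1] - offsets[i];
}

/* Print what the table-driven inference yields for the last field. */
void demo_print(void)
{
	printf("unpacked: features2 inferred as %zu bytes\n",
	       inferred_size(demo_offsets, 1));
	printf("packed:   features2 inferred as %zu bytes\n",
	       inferred_size(demo_offsets_packed, 1));
}
```

Calling demo_print() from a main() on a typical x86_64 build should report a larger inferred size for the unpacked struct than for the packed one, since the tail padding is counted as part of the last field. On ABIs that align 64-bit integers to 4 bytes inside structs, the two may agree.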
From owner-xfs@oss.sgi.com Thu Nov 16 14:04:00 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 14:04:07 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAGM3xaG019803 for ; Thu, 16 Nov 2006 14:04:00 -0800 X-ASG-Debug-ID: 1163714585-19703-206-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from ext.agami.com (64.221.212.177.ptr.us.xo.net [64.221.212.177]) by cuda.sgi.com (Spam Firewall) with ESMTP id 86B7CD1E17E8; Thu, 16 Nov 2006 14:03:05 -0800 (PST) Received: from agami.com ([192.168.168.135]) by ext.agami.com (8.12.5/8.12.5) with ESMTP id kAGLuI7M028065 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 16 Nov 2006 13:56:24 -0800 Received: from [10.123.4.231] (ind-1.agami.com [10.123.4.231]) (authenticated bits=0) by agami.com (8.12.11/8.12.11) with ESMTP id kAGLu8tO021468; Thu, 16 Nov 2006 13:56:08 -0800 Message-ID: <455CDBA5.5070809@agami.com> Date: Thu, 16 Nov 2006 13:44:05 -0800 From: Shailendra Tripathi User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: Jesper Juhl CC: linux-kernel@vger.kernel.org, xfs@oss.sgi.com, xfs-masters@oss.sgi.com, nathans@sgi.com, Andrew Morton X-ASG-Orig-Subj: Re: [PATCH][RFC][resend] potential NULL pointer deref in XFS on failed mount Subject: Re: [PATCH][RFC][resend] potential NULL pointer deref in XFS on failed mount References: <200611162218.26945.jesper.juhl@gmail.com> <455CD6C8.5030907@agami.com> <9a8748490611161343x44e759acs9b70247c84452ba5@mail.gmail.com> In-Reply-To: <9a8748490611161343x44e759acs9b70247c84452ba5@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.36 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 
3.0.26205 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9665 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stripathi@agami.com Precedence: bulk X-list: xfs Content-Length: 1178 Lines: 32 Jesper Juhl wrote: > The reason I want to fix it in the freeing function is that many other > functions in the kernel that free resources are safe to call with NULL > pointers and this would make xfs_free_buftarg() follow that > convention. This would perhaps also allow for some cleanups in other > places that call the function since then there's no longer a need for > explicit NULL checks any more (haven't checked if there's anything to > gain there though). > I don't think the function call overhead matters much since this is in > a case of a failed mount, so it should happen very rarely. > I agree with you. However, cleanup functions should(/must?) check for NULL etc and in this case it is already doing so for other cases. So, perhaps not required. Just a different viewpoint. Your choice. 
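The two positions in this exchange (guard inside the freeing function versus guard at the call site) can be put side by side. This is a rough sketch under stated assumptions, not the kernel code: `demo_target`, `demo_mount`, `demo_free_target()`, `demo_close()` and the `demo_release_calls` counter are all hypothetical stand-ins, and plain free() takes the place of the real buffer-target teardown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a buffer target and a mount structure. */
struct demo_target {
	int external;
};

struct demo_mount {
	struct demo_target *ddev;   /* data device target, always set once mounted */
	struct demo_target *logdev; /* log device target; may be NULL or equal to ddev */
};

/* Counts real releases so the effect of the guards can be observed. */
static int demo_release_calls;

/* Callee-side guard: tolerates NULL, in the same spirit as free(NULL). */
void demo_free_target(struct demo_target *t)
{
	if (!t)
		return;
	free(t);
	demo_release_calls++;
}

/*
 * Caller-side guard: skip the log target when it was never set up or
 * when it is shared with the data device (which is freed once below).
 */
void demo_close(struct demo_mount *mp)
{
	if (mp->logdev && mp->logdev != mp->ddev)
		demo_free_target(mp->logdev);
	demo_free_target(mp->ddev);
	mp->logdev = NULL;
	mp->ddev = NULL;
}
```

With both guards present, the caller-side test mostly documents intent and saves a call, while the callee-side test is what keeps a stray NULL from being dereferenced if some other caller forgets to check.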
>> void >> xfs_unmountfs_close(xfs_mount_t *mp, struct cred *cr) >> { >> if (mp->m_logdev_targp && (mp->m_logdev_targp != >> mp->m_ddev_targp)) >> xfs_free_buftarg(mp->m_logdev_targp, 1); >> if (mp->m_rtdev_targp) >> xfs_free_buftarg(mp->m_rtdev_targp, 1); >> xfs_free_buftarg(mp->m_ddev_targp, 0); >> } >> > > From owner-xfs@oss.sgi.com Thu Nov 16 14:11:36 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 14:11:43 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAGMBZaG020895 for ; Thu, 16 Nov 2006 14:11:36 -0800 X-ASG-Debug-ID: 1163715045-32763-24-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id 59D97D1CF292 for ; Thu, 16 Nov 2006 14:10:47 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kAGMAjKd006187; Thu, 16 Nov 2006 17:10:45 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAGMAjR9026654; Thu, 16 Nov 2006 17:10:45 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAGMAiqa023819; Thu, 16 Nov 2006 17:10:44 -0500 Message-ID: <455CE1E3.7020703@sandeen.net> Date: Thu, 16 Nov 2006 16:10:43 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: Eric Sandeen CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> In-Reply-To: <455CB54F.8080901@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 
X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26205 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9666 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 478 Lines: 15 Eric Sandeen wrote: > see also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212201 > > Bugzilla Bug 212201: Cannot build sysem with XFS file system. > > I turned on attr2 in FC6 at nathan's suggestion, for selinux goodness > with more efficient xattr space usage. > > But, many reports that this was totally broken in fc6, on x86_64. ugh. it's broken on x86 too, so it's not just the alignment/padding, although that should be fixed for cross-arch mounts. 
-Eric From owner-xfs@oss.sgi.com Thu Nov 16 14:46:30 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 14:46:36 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAGMkRaG025432 for ; Thu, 16 Nov 2006 14:46:29 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA00089; Fri, 17 Nov 2006 09:45:29 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAGMjS7Y37119596; Fri, 17 Nov 2006 09:45:28 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAGMjRnH37137542; Fri, 17 Nov 2006 09:45:27 +1100 (AEDT) Date: Fri, 17 Nov 2006 09:45:27 +1100 From: David Chinner To: Eric Sandeen Cc: xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: <20061116224527.GF11034@melbourne.sgi.com> References: <455CB54F.8080901@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <455CB54F.8080901@sandeen.net> User-Agent: Mutt/1.4.2.1i X-archive-position: 9667 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 4062 Lines: 101 On Thu, Nov 16, 2006 at 01:00:31PM -0600, Eric Sandeen wrote: > see also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212201 > > Bugzilla Bug 212201: Cannot build sysem with XFS file system. ..... > The filesystem was also found to be marked w/ attr1, not attr2.... ..... > if you dd/hexdump the superblock, you will find the attr2 flag, but at > the wrong offset. 
> > This is because the xfs_sb_t struct is padded out to 64 bits on 64-bit > arches, and the xfs_xlatesb() routine and xfs_sb_info[] array take this > padding to mean that the last item is 4 bytes bigger than it is, and > treats sb_features2 as 8 bytes not four. This then gets endian-flipped out... Ok. > I can't quite figure out how this winds up causing problems if you stay > on the x86_64 arch, as I'd expect that if the offset is wrong, it should > at least be consistently wrong. And in fact if you do mkfs,mount,xfs_info, > it will tell you that you do have attr2. But somewhere along the line thing > go wrong, and post-install, post-reboot, the filesystem thinks it is attr1, > and is therefore corrupt. Nor would I expect an i386 to have a problem either. > I think that maybe some accesses are post-xfs_xlatesb, while others > may access the un-flipped sb directly? Or maybe this is sb logging > code that has messed things up? Not sure... needs more investigation. More investigation - we shouldn't be operating on an untranslated superblock - the first thing we do is read and translate it.... > Signed-off-by: Eric Sandeen > > Index: linux-2.6.18/fs/xfs/xfs_sb.h > =================================================================== > --- linux-2.6.18.orig/fs/xfs/xfs_sb.h > +++ linux-2.6.18/fs/xfs/xfs_sb.h > @@ -149,7 +149,7 @@ typedef struct xfs_sb > __uint16_t sb_logsectsize; /* sector size for the log, bytes */ > __uint32_t sb_logsunit; /* stripe unit size for the log */ > __uint32_t sb_features2; /* additional feature bits */ > -} xfs_sb_t; > +} __attribute__ ((packed)) xfs_sb_t; I'd prefer not to pack the structure. 
Over time, here's how this changed:

typedef struct xfs_sb {
@@ -135,9 +136,12 @@
 	__uint8_t	sb_shared_vn;	/* shared version number */
 	xfs_extlen_t	sb_inoalignmt;	/* inode chunk alignment, fsblocks */
 	__uint32_t	sb_unit;	/* stripe or raid unit */
-	__uint32_t	sb_width;	/* stripe or raid width */
+	__uint32_t	sb_width;	/* stripe or raid width */
 	__uint8_t	sb_dirblklog;	/* log2 of dir block size (fsbs) */
-	__uint8_t	sb_dummy[7];	/* padding */
+	__uint8_t	sb_logsectlog;	/* log2 of the log sector size */
+	__uint16_t	sb_logsectsize;	/* sector size for the log, bytes */
+	__uint32_t	sb_logsunit;	/* stripe unit size for the log */
+	__uint32_t	sb_features2;	/* additional feature bits */
 } xfs_sb_t;

So before the sector size > 512 bytes code, there was padding to push the superblock out to 64-bit alignment so that sb_dirblklog was correctly aligned. The xfs_sb_info structure:

 	{ offsetof(xfs_sb_t, sb_unit), 0 },
 	{ offsetof(xfs_sb_t, sb_width), 0 },
 	{ offsetof(xfs_sb_t, sb_dirblklog), 0 },
-	{ offsetof(xfs_sb_t, sb_dummy), 1 },
+	{ offsetof(xfs_sb_t, sb_logsectlog), 0 },
+	{ offsetof(xfs_sb_t, sb_logsectsize),0 },
 	{ offsetof(xfs_sb_t, sb_logsunit), 0 },
+	{ offsetof(xfs_sb_t, sb_features2), 0 },
 	{ sizeof(xfs_sb_t), 0 }
 };

had the sb_dummy field as "no translate" so it effectively ignored it but it ensured that sb_dirblklog was sized correctly.

The real bug here was whoever removed the dummy field and did not replace that with a comment to say that the xfs_sb structure needs to be padded to 64 bits to ensure translation worked properly on 64-bit systems.

I'd prefer explicit padding (with warning comments) over packing the structure. Thoughts?

Cheers,

Dave.
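As a rough illustration of the explicit-padding alternative suggested above, here is a minimal sketch. It is not a proposed patch: `demo_sb_padded`, its `pad` member and `demo_features2_size()` are hypothetical, and the compile-time checks use C11 `_Static_assert`, which is an assumption about the toolchain rather than something the kernel of that era provided.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; NOT the real xfs_sb_t. */
struct demo_sb_padded {
	uint64_t dblocks;
	uint32_t features2;
	/*
	 * Explicit padding: keeps sizeof() a multiple of 8 without relying
	 * on compiler tail padding. A translation table would need to list
	 * this member as "no translate", as sb_dummy used to be.
	 */
	uint8_t  pad[4];
};

/* Compile-time checks that the layout is what the padding comment claims. */
_Static_assert(sizeof(struct demo_sb_padded) % 8 == 0,
	       "struct must stay padded to 64 bits");
_Static_assert(offsetof(struct demo_sb_padded, pad) ==
	       offsetof(struct demo_sb_padded, features2) + sizeof(uint32_t),
	       "no hidden gap after features2");

/* Size of features2 as an offset table would infer it: gap to the next entry. */
size_t demo_features2_size(void)
{
	return offsetof(struct demo_sb_padded, pad) -
	       offsetof(struct demo_sb_padded, features2);
}
```

Because the padding is a named member, the inferred size of the last real field no longer depends on how a given ABI rounds up the struct, and the assertions fail the build if someone later removes or resizes the pad.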
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 16 14:56:29 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 14:56:35 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAGMuSaG026715 for ; Thu, 16 Nov 2006 14:56:29 -0800 X-ASG-Debug-ID: 1163717740-13168-70-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by cuda.sgi.com (Spam Firewall) with ESMTP id 043ABD1D5AFB; Thu, 16 Nov 2006 14:55:40 -0800 (PST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kAGMtdxE028179; Thu, 16 Nov 2006 17:55:39 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAGMtdNv009139; Thu, 16 Nov 2006 17:55:39 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAGMtbMb028210; Thu, 16 Nov 2006 17:55:38 -0500 Message-ID: <455CEC68.9000401@sandeen.net> Date: Thu, 16 Nov 2006 16:55:36 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: David Chinner CC: xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <20061116224527.GF11034@melbourne.sgi.com> In-Reply-To: <20061116224527.GF11034@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26209 Rule breakdown below pts 
rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9668 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 627 Lines: 19 David Chinner wrote: > The real bug here was whoever removed the dummy field and did not > replace that with a comment ot say that the xfs_sb strucutre needs > to be padded to 64 bits to ensure translation worked properly > on 64 bit systems. > > I'd prefer explicit padding (with warning comments) over packing > the structure. Thoughts? yes, I agree that explicit padding, and a comment about why it's there, would be better. I was thinking about this over lunch and meant to follow up, but then figured out that the actual bug isn't because of this (it's broken as well on x86), and kept trying to chase that :) -Eric From owner-xfs@oss.sgi.com Thu Nov 16 15:45:14 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 15:45:20 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAGNjAaG031867 for ; Thu, 16 Nov 2006 15:45:12 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA02225; Fri, 17 Nov 2006 10:44:03 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAGNi17Y37136378; Fri, 17 Nov 2006 10:44:01 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAGNhwtF36732260; Fri, 17 Nov 2006 10:43:58 +1100 (AEDT) Date: Fri, 17 Nov 2006 10:43:58 +1100 From: David Chinner To: linux-kernel@ckeith.clara.net Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com Subject: Re: GPF oops on 2.6.18-1.2200.fc5 and repeated 
DWARF2 unwinder XFS errors under 2.6.18-1.2239.fc5 Message-ID: <20061116234358.GJ11034@melbourne.sgi.com> References: <20061115150616.GL26200@dot.oreally.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061115150616.GL26200@dot.oreally.co.uk> User-Agent: Mutt/1.4.2.1i X-archive-position: 9670 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 8378 Lines: 197 On Wed, Nov 15, 2006 at 03:06:16PM +0000, linux-kernel@ckeith.clara.net wrote: > > Hi, > > I just started up a new box yesterday with Fedora Core 5. Its running with > 2 dual core AMD Opteron 2220 SE's and 24Gb of memory and an Adaptec SCSI > card and I've had a number of errors which I can't seem to find solutions > for. I'd had no end of problems with spinlock issues in the aacraid driver > in the 2.6.17 series on another dual opteron box, but on hitting > 2.6.18-1.2200 these went away, so I started the new box off with > 2.6.18-1.2200 as well. As I understand it, this is 2.6.18.1 as compiled > by Redhat/Fedora and includes various DWARD2 unwinder fixes. 
> > Well this caused a GPF and the following trace: > > ----------- > > general protection fault: 0000 [1] SMP > last sysfs file: /class/net/sit0/address > CPU 1 > Modules linked in: nls_utf8 ipv6 ip_conntrack_ftp ip_conntrack_netbios_ns ipt_owner ipt_LOG xt_limit ipt_REJECT xt_tcpudp xt_state ip_conntrack nfnetlink iptable_filter ip_tables x_tables xfs dm_mod video sbs i2c_ec button battery asus_acpi ac lp parport_pc parport ide_cd cdrom sg ehci_hcd ohci_hcd i2c_nforce2 i2c_core forcedeth serio_raw k8_edac edac_mc shpchp pcspkr ext3 jbd sata_nv libata aacraid sd_mod scsi_mod > Pid: 1093, comm: gawk Not tainted 2.6.18-1.2200.fc5 #1 > RIP: 0010:[] [] > :xfs:xfs_bmap_search_extents+0x1c/0xcb > RSP: 0018:ffff8105fd653b40 EFLAGS: 00010202 > RAX: ffffffff806785a0 RBX: ffff8105fd653d28 RCX: ffff8105fd653d70 > RDX: 0000000000000000 RSI: 00000000000033ce RDI: ffff8102fe801080 > RBP: ffff8105fd653b40 R08: ffff8105fd653d6c R09: ffff8105fd653d28 > R10: ffff8105fd653d70 R11: ffff8102f4655250 R12: ffff8105fd653d6c > R13: ffff8105ff04d800 R14: 0007ffffffffcc32 R15: ffff8105fd653de8 > FS: 00002aaaab093e00(0000) GS:ffff8102ffc3b1c0(0000) > knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00002aaaaae4a020 CR3: 0000000000201000 CR4: 00000000000006e0 > Process gawk (pid: 1093, threadinfo ffff8105fd652000, task > ffff8105fd4f4810) > Stack: ffff8102fe801080 0000000000000005 0000000000000000 ffff8105ff04d800 > ffffffff8826b972 ffff8105fd653d08 0000000000000007 0000000000000048 > 0000000000000000 000000000000029b 0000000000100000 ffff8105fd653c18 > Call Trace: > [] :xfs:xfs_bmapi+0x2d2/0x1b66 > [] :xfs:xfs_inactive_free_eofblocks+0xa3/0x1ec > [] :xfs:xfs_release+0x97/0xc8 > [] :xfs:xfs_file_release+0x1a/0x1e > [] __fput+0xbf/0x1aa > [] remove_vma+0x4e/0x75 > [] exit_mmap+0xcf/0xf3 > [] mmput+0x41/0x96 > [] do_exit+0x28c/0x8c3 > [] cpuset_exit+0x0/0x6c > [<00002aaaab089888>] > > > Code: 18 4c 8b 4c 24 40 65 8b 0c 25 2c 00 00 00 48 63 c9 48 8b 0c > RIP [] 
:xfs:xfs_bmap_search_extents+0x1c/0xcb > RSP > <1>Fixing recursive fault but reboot is needed! > > ----------- > > At the time the box was sitting there doing nothing but running openssh. > (This gawk process seems to be from anacron kicking in 'makewhatis'). > The machine didn't die but didn't seem happy. I searching I discovered a > number of people with the same message "general protection fault: 0000 [1] > SMP" on lots of different processes so I assumed that it wasn't related > to the XFS drivers directly, but to a problem somewhere else which is > being triggered by the dual-core opterons (could heat be a factor as its > just sitting on a desk in the office not in a machine room?). > > Anyway since this had happened I decided to upgrade to the next Fedora > kernel 2.6.18-1.2239.fc5 which appears to be 2.6.18.2 + some redhat/fedora > patches (mostly for Xen, which I'm not running). This sit there for a few > hours and hadn't thrown an error so I decided to upload some data to it > overnight ready for the morning. As soon as I did I started getting > traces for: > > > ----------- > Filesystem "sda5": XFS internal error xfs_btree_check_sblock at line 334 of > file fs/xfs/xfs_btree.c. 
Caller 0xffffffff8825e203 > > Call Trace: > [] show_trace+0x34/0x47 > [] dump_stack+0x12/0x17 > [] :xfs:xfs_btree_check_sblock+0xbc/0xcc > [] :xfs:xfs_alloc_lookup+0x14f/0x39a > [] :xfs:xfs_alloc_ag_vextent+0x74/0xf61 > [] :xfs:xfs_alloc_fix_freelist+0x356/0x410 > [] :xfs:xfs_alloc_vextent+0x2ae/0x400 > [] :xfs:xfs_bmapi+0xed6/0x1b66 > [] :xfs:xfs_iomap_write_allocate+0x257/0x3fc > [] :xfs:xfs_iomap+0x31a/0x521 > [] :xfs:xfs_map_blocks+0x2f/0x5f > [] :xfs:xfs_page_state_convert+0x2b7/0xb63 > [] :xfs:xfs_vm_writepage+0xa7/0xde > [] mpage_writepages+0x1d0/0x395 > [] do_writepages+0x23/0x32 > [] __filemap_fdatawrite_range+0x54/0x5e > [] :xfs:fs_flush_pages+0x4b/0x64 > [] :xfs:xfs_file_close+0x2a/0x2e > [] filp_close+0x36/0x64 > [] sys_close+0x8f/0xaa > [] tracesys+0xd1/0xdc > DWARF2 unwinder stuck at tracesys+0xd1/0xdc > Leftover inexact backtrace: > ----------- You've got a corrupt freelist btree block. how were you uploading files to the machine? Can you cc bug reports involving XFS to the xfs@oss.sgi.com list in future? (added to this reply) > I first booted into 2.6.18-1.2239.fc5 in single user mode and forced a > check of the disk with xfs_repair and I'm using xfs-progs-2.8.11 as > I discovered on my other system that the 2.6.17 XFS kernel driver bugs > were breaking the FS in a way that the xfs-progs-2.7.x code didn't fix. > > These XFS bugs seem to be the same problems that were cropping up in the > 2.6.17 series which were resolved in 2.6.18.1 (2.6.18-1.2200.fc5). > > Any suggestions are greatly appreciated. Also please let me know if more > details are required. The 2.6.17 problems can leave on disk corruption that is not tripped over until some time later on - even after a kernel upgrade. Running the latest repair over all your XFS filesystems that were in use on 2.6.17.x (x <= 6) really needs to be done regardless of whether you've tripped over corruption or not. 
However, this could be a result of the problems you've been having with the aacraid driver, and not an XFS problem at all.... Cheers, Dave. > Should I just simply go back to ext3? I'd prefer not to because of the > fsck'ing time on a 1Tb array, but if it means that the kernel doesn't throw > a hissy fit then I'll be more than happy to do that. > > Regards, > Colin. > > thor# uname -a > Linux thor 2.6.18-1.2239.fc5 #1 SMP Fri Nov 10 12:51:06 > EST 2006 x86_64 x86_64 x86_64 GNU/Linux > > thor# cat /proc/cmdline > ro root=LABEL=/ > > Adaptec aacraid driver (1.1-5[2409]-mh2) > > > processor : 0 > vendor_id : AuthenticAMD > cpu family : 15 > model : 65 > model name : Dual-Core AMD Opteron(tm) Processor 2220 SE > stepping : 2 > cpu MHz : 2800.000 > cache size : 1024 KB > physical id : 0 > siblings : 2 > core id : 0 > cpu cores : 2 > fpu : yes > fpu_exception : yes > cpuid level : 1 > wp : yes > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca > cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt > rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8_legacy > bogomips : 5639.77 > TLB size : 1024 4K pages > clflush size : 64 > cache_alignment : 64 > address sizes : 40 bits physical, 48 bits virtual > power management: ts fid vid ttp tm stc > > > > -- > "Developers are like artists; they produce their best work if they > have the freedom to do so" - Werner Vogels, CTO Amazon.com > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 16 17:07:39 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 17:07:46 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com 
(8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAH17aaG012051 for ; Thu, 16 Nov 2006 17:07:38 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA04884; Fri, 17 Nov 2006 12:06:42 +1100 Date: Fri, 17 Nov 2006 11:08:40 +1000 From: Timothy Shimmin To: Eric Sandeen cc: xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: In-Reply-To: <455CB54F.8080901@sandeen.net> References: <455CB54F.8080901@sandeen.net> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9671 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1999 Lines: 54 Hi Eric, --On 16 November 2006 1:00:31 PM -0600 Eric Sandeen wrote: > see also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212201 > > Bugzilla Bug 212201: Cannot build sysem with XFS file system. > > I turned on attr2 in FC6 at nathan's suggestion, for selinux goodness > with more efficient xattr space usage. > > But, many reports that this was totally broken in fc6, on x86_64. > > Install went ok, but on reboot the filesystem was found to be corrupt. > > The filesystem was also found to be marked w/ attr1, not attr2.... > > If you do a fresh mkfs.xfs on x86_64, with -i attr=2, and dump out the > superblock (or look at it with xfs_db) you will find that although the > versionnum says that there is a morebits bit, the features2 flag is 0. > > if you dd/hexdump the superblock, you will find the attr2 flag, but at > the wrong offset. > > This is because the xfs_sb_t struct is padded out to 64 bits on 64-bit > arches, This actually came up when I wrote xfstests/122. 
It looks at sizes of various ondisk structures and for some of them it print's out the offsets of the fields. I noticed that the xfs_sb_t was a different size on 32bit and 64 bit and so printed out all the field offsets. They are all the same (on different word sizes) and so the only difference is that the last field will be padded out on a 64 bit platform as you noticed. I couldn't really see a problem with that. And discussed it with Nathan at the time. > and the xfs_xlatesb() routine and xfs_sb_info[] array take this > padding to mean that the last item is 4 bytes bigger than it is, and > treats sb_features2 as 8 bytes not four. This then gets endian-flipped out... > Well there is a bug in the sb endian translation code then or its setup. All the field accesses should be correct, no? I can't see why it needs to be packed or padded if it's just implicit extra padding after the end of the last field. Am I missing something? Let me look into this a bit more :) --Tim From owner-xfs@oss.sgi.com Thu Nov 16 18:40:52 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 18:41:00 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAH2emaG020525 for ; Thu, 16 Nov 2006 18:40:50 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA06996; Fri, 17 Nov 2006 13:39:51 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAH2do7Y37724469; Fri, 17 Nov 2006 13:39:50 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAH2dkrh37743890; Fri, 17 Nov 2006 13:39:46 +1100 (AEDT) Date: Fri, 17 Nov 2006 13:39:46 +1100 From: David Chinner To: Timothy Shimmin Cc: Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack 
xfs_sb_t for 64-bit arches Message-ID: <20061117023946.GN11034@melbourne.sgi.com> References: <455CB54F.8080901@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-archive-position: 9672 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 801 Lines: 27 On Fri, Nov 17, 2006 at 11:08:40AM +1000, Timothy Shimmin wrote: > Hi Eric, > > > Well there is a bug in the sb endian translation code then or its setup. > All the field accesses should be correct, no? The problem is that the size of the variable translated in xfs_xlatesb() is the offset of the next field minus the offset of the current field. With the last field of the structure, it ends up being the size of the structure minus the offset of the field. On a 32 bit machine, the structure is 4 bytes larger than the offset of the features2 field. On a 64 bit machine, the structure is 8 bytes larger than the offset of the features2 field, so it translates it as though it was a 64 bit field, not a 32 bit field..... Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 16 20:10:28 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 20:10:36 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAH4ANaG029398 for ; Thu, 16 Nov 2006 20:10:26 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA08707; Fri, 17 Nov 2006 15:09:13 +1100 Date: Fri, 17 Nov 2006 14:11:12 +1000 From: Timothy Shimmin To: David Chinner cc: Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: In-Reply-To: <20061117023946.GN11034@melbourne.sgi.com> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9673 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1667 Lines: 53 Hi Dave, --On 17 November 2006 1:39:46 PM +1100 David Chinner wrote: > On Fri, Nov 17, 2006 at 11:08:40AM +1000, Timothy Shimmin wrote: >> Hi Eric, >> >> >> Well there is a bug in the sb endian translation code then or its setup. >> All the field accesses should be correct, no? > > The problem is the size of the variable translated in xfs_xlatesb() > is the offset of the next field minus the offset of the current > field. > Yep. > With the last filed of the structure, it ends up being the > size of the structure minus the offset of the field. On a 32 bit > machine, the structure is 4 bytes larger that the offset of > the feature2 field. 
On 64 bit machine, the strucutre is > 8 bytes larger than the offset of the features2 field, Yep. > so it translates is as though it was a 64 bit field, > not a 32 bit field..... > So why not change xfs_sb_info to give the real offset of where the next field should go (if there was one), instead of giving the sizeof the structure which is not where say a 32 bit field would go and is wrong IMHO. i.e. =========================================================================== Index: fs/xfs/xfs_mount.c =========================================================================== --- a/fs/xfs/xfs_mount.c 2006-11-17 15:02:21.000000000 +1100 +++ b/fs/xfs/xfs_mount.c 2006-11-17 14:48:43.261937705 +1100 @@ -121,7 +121,7 @@ static const struct { { offsetof(xfs_sb_t, sb_logsectsize),0 }, { offsetof(xfs_sb_t, sb_logsunit), 0 }, { offsetof(xfs_sb_t, sb_features2), 0 }, - { sizeof(xfs_sb_t), 0 } + { offsetof(xfs_sb_t, sb_features2) + sizeof(__uint32_t), 0 } }; /* --Tim From owner-xfs@oss.sgi.com Thu Nov 16 21:56:17 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 21:56:26 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAH5uDaG013358 for ; Thu, 16 Nov 2006 21:56:16 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA11166; Fri, 17 Nov 2006 16:55:23 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAH5tM7Y37895399; Fri, 17 Nov 2006 16:55:22 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAH5tL7h37899307; Fri, 17 Nov 2006 16:55:21 +1100 (AEDT) Date: Fri, 17 Nov 2006 16:55:21 +1100 From: David Chinner To: Timothy Shimmin Cc: David Chinner , Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - 
pack xfs_sb_t for 64-bit arches Message-ID: <20061117055521.GS11034@melbourne.sgi.com> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-archive-position: 9674 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1686 Lines: 48 On Fri, Nov 17, 2006 at 02:11:12PM +1000, Timothy Shimmin wrote: > >so it translates is as though it was a 64 bit field, > >not a 32 bit field..... > > > > So why not change xfs_sb_info to give the real offset of where > the next field should go (if there was one), instead of giving the sizeof > the > structure which is not where say a 32 bit field would go and > is wrong IMHO. > > i.e. > > =========================================================================== > Index: fs/xfs/xfs_mount.c > =========================================================================== > > --- a/fs/xfs/xfs_mount.c 2006-11-17 15:02:21.000000000 +1100 > +++ b/fs/xfs/xfs_mount.c 2006-11-17 14:48:43.261937705 +1100 > @@ -121,7 +121,7 @@ static const struct { > { offsetof(xfs_sb_t, sb_logsectsize),0 }, > { offsetof(xfs_sb_t, sb_logsunit), 0 }, > { offsetof(xfs_sb_t, sb_features2), 0 }, > - { sizeof(xfs_sb_t), 0 } > + { offsetof(xfs_sb_t, sb_features2) + sizeof(__uint32_t), 0 } > }; Whenever you add to the table, you now need to modify both the new entry and the terminator to get it right. Nor (IMO) is it obvious that it is a terminator or why it is different to all the other entries in the structure. A field such as sb_dummy or sb_pad before the terminator is fairly obvious, and it means that you don't need to modify the table terminator every time the superblock gets extended. 
That way the code stays more consistent over time, diffs are smaller and neater, and you can see at a simple diff just how the features have been added over time (like I did this morning)..... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 16 22:35:51 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 22:36:00 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAH6ZpaG017339 for ; Thu, 16 Nov 2006 22:35:51 -0800 X-ASG-Debug-ID: 1163745303-6596-478-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id 697D85B8F46 for ; Thu, 16 Nov 2006 22:35:03 -0800 (PST) Received: by sandeen.net (Postfix, from userid 48) id 0B1F918E21526; Fri, 17 Nov 2006 00:34:46 -0600 (CST) Received: from 10.0.0.2 (SquirrelMail authenticated user sandeen) by sandeen.net with HTTP; Fri, 17 Nov 2006 00:34:45 -0600 (CST) Message-ID: <52841.10.0.0.2.1163745285.squirrel@sandeen.net> In-Reply-To: <20061117055521.GS11034@melbourne.sgi.com> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> <20061117055521.GS11034@melbourne.sgi.com> Date: Fri, 17 Nov 2006 00:34:45 -0600 (CST) X-ASG-Orig-Subj: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches From: sandeen@sandeen.net To: "David Chinner" Cc: "Timothy Shimmin" , "David Chinner" , "Eric Sandeen" , xfs@oss.sgi.com User-Agent: SquirrelMail/1.4.8-2.el4.centos4 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal X-Barracuda-Spam-Score: 0.55 X-Barracuda-Spam-Status: No, SCORE=0.55 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=NO_REAL_NAME 
X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26239 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.55 NO_REAL_NAME From: does not include a real name X-archive-position: 9675 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1070 Lines: 26 > On Fri, Nov 17, 2006 at 02:11:12PM +1000, Timothy Shimmin wrote: > Whenever you add to the table, you now need to modify both the new > entry and the terminator to get it right. > > Nor (IMO) is it obvious that it is a terminator or why it is > different to all the other entries in the structure. A field such as > sb_dummy or sb_pad before the terminator is fairly obvious, and it > means that you don't need to modify the table terminator every time > the superblock gets extended. > > That way the code stays more consistent over time, diffs are smaller > and neater, and you can see at a simple diff just how the features > have been added over time (like I did this morning)..... nothing in the code is terribly obvious.. please add comments however you decide to fix it :) and really, now that this is out in the wild, maybe sb_features3 instead of padding is appropriate, and check both for the attr2 bit...? :( i'm trying to figure out what the kernel upgrade path is for fc6 users who have an extra-padded-flipped features2/attr2 filesystem. 
:( -Eric From owner-xfs@oss.sgi.com Thu Nov 16 22:59:05 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 22:59:12 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAH6x1aG019794 for ; Thu, 16 Nov 2006 22:59:03 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA12483; Fri, 17 Nov 2006 17:58:12 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAH6wB7Y37945668; Fri, 17 Nov 2006 17:58:11 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAH6wAWo37952896; Fri, 17 Nov 2006 17:58:10 +1100 (AEDT) Date: Fri, 17 Nov 2006 17:58:10 +1100 From: David Chinner To: sandeen@sandeen.net Cc: David Chinner , Timothy Shimmin , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: <20061117065810.GU11034@melbourne.sgi.com> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> <20061117055521.GS11034@melbourne.sgi.com> <52841.10.0.0.2.1163745285.squirrel@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <52841.10.0.0.2.1163745285.squirrel@sandeen.net> User-Agent: Mutt/1.4.2.1i X-archive-position: 9676 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1841 Lines: 47 On Fri, Nov 17, 2006 at 12:34:45AM -0600, sandeen@sandeen.net wrote: > > On Fri, Nov 17, 2006 at 02:11:12PM +1000, Timothy Shimmin wrote: > > > Whenever you add to the table, you now need to modify both the new > > entry and the terminator to get it right. 
> > > > Nor (IMO) is it obvious that it is a terminator or why it is > > different to all the other entries in the structure. A field such as > > sb_dummy or sb_pad before the terminator is fairly obvious, and it > > means that you don't need to modify the table terminator every time > > the superblock gets extended. > > > > That way the code stays more consistent over time, diffs are smaller > > and neater, and you can see at a simple diff just how the features > > have been added over time (like I did this morning)..... > > nothing in the code is terribly obvious.. please add comments however you > decide to fix it :) *nod* > and really, now that this is out in the wild, maybe sb_features3 instead > of padding is appropriate, and check both for the attr2 bit...? :( I'm not sure that this is a good idea, especially as past history of introducing new feature bits is anything to go by (I think this makes bug #6 that the features2 field has been responsible for). I'd much prefer to fix the bug, blacklist the bad 4 bytes in the superblock, and then either: - modify xfs_admin/repair to detect a busted superblock and have them fix it; or - put code in the mount path that detects this and corrects it automatically (which we do for some other superblock fields). > i'm trying to figure out what the kernel upgrade path is for fc6 users who > have an extra-padded-flipped features2/attr2 filesystem. :( Depends on what we do to fix it, right? Do you have any preferences? Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 16 23:46:58 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 16 Nov 2006 23:47:06 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAH7kuaG024458 for ; Thu, 16 Nov 2006 23:46:58 -0800 X-ASG-Debug-ID: 1163749567-28043-543-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by cuda.sgi.com (Spam Firewall) with ESMTP id CB118D1D702C for ; Thu, 16 Nov 2006 23:46:07 -0800 (PST) Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id B5D4C5341F4; Fri, 17 Nov 2006 17:52:11 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 05598-01-50; Fri, 17 Nov 2006 17:52:10 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 980A85341F0; Fri, 17 Nov 2006 17:52:09 +1100 (EST) X-ASG-Orig-Subj: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches From: Nathan Scott Reply-To: nscott@aconex.com To: sandeen@sandeen.net Cc: David Chinner , Timothy Shimmin , xfs@oss.sgi.com In-Reply-To: <52841.10.0.0.2.1163745285.squirrel@sandeen.net> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> <20061117055521.GS11034@melbourne.sgi.com> <52841.10.0.0.2.1163745285.squirrel@sandeen.net> Content-Type: text/plain Organization: Aconex Date: Fri, 17 Nov 2006 17:52:23 +1100 Message-Id: <1163746343.4695.152.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 
QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26233 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9677 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 520 Lines: 17 On Fri, 2006-11-17 at 00:34 -0600, sandeen@sandeen.net wrote: > and really, now that this is out in the wild, maybe sb_features3 > instead of padding is appropriate, and check both for the attr2 > bit...? :( Thats not going to work, theres three or four other feature2 bits preceding attr2 as well. The "take a 32 bit systems fs to a 64 bit system" is relatively uncommon, so I suppose its just something we live with (as we did with the log recovery issues in that situation for several years). cheers. -- Nathan From owner-xfs@oss.sgi.com Fri Nov 17 07:21:40 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 17 Nov 2006 07:21:50 -0800 (PST) Received: from cuda.sgi.com (cuda1.sgi.com [192.48.168.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAHFLeaG020715 for ; Fri, 17 Nov 2006 07:21:40 -0800 X-ASG-Debug-ID: 1163776851-7870-68-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from sandeen.net (sandeen.net [209.173.210.139]) by cuda.sgi.com (Spam Firewall) with ESMTP id 3A422604C8F for ; Fri, 17 Nov 2006 07:20:52 -0800 (PST) Received: by sandeen.net (Postfix, from userid 48) id 76ADB18E21526; Fri, 17 Nov 2006 09:20:50 -0600 (CST) Received: from 10.0.0.2 (SquirrelMail authenticated user sandeen) by sandeen.net with HTTP; Fri, 17 Nov 2006 09:20:50 -0600 (CST) Message-ID: <48064.10.0.0.2.1163776850.squirrel@sandeen.net> In-Reply-To: <1163746343.4695.152.camel@edge> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> 
<20061117055521.GS11034@melbourne.sgi.com> <52841.10.0.0.2.1163745285.squirrel@sandeen.net> <1163746343.4695.152.camel@edge> Date: Fri, 17 Nov 2006 09:20:50 -0600 (CST) X-ASG-Orig-Subj: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches From: sandeen@sandeen.net To: nscott@aconex.com Cc: sandeen@sandeen.net, "David Chinner" , "Timothy Shimmin" , xfs@oss.sgi.com User-Agent: SquirrelMail/1.4.8-2.el4.centos4 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal X-Barracuda-Spam-Score: 0.55 X-Barracuda-Spam-Status: No, SCORE=0.55 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=NO_REAL_NAME X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26275 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.55 NO_REAL_NAME From: does not include a real name X-archive-position: 9679 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 722 Lines: 18 > On Fri, 2006-11-17 at 00:34 -0600, sandeen@sandeen.net wrote: >> and really, now that this is out in the wild, maybe sb_features3 >> instead of padding is appropriate, and check both for the attr2 >> bit...? :( > > Thats not going to work, theres three or four other feature2 bits > preceding attr2 as well. > > The "take a 32 bit systems fs to a 64 bit system" is relatively > uncommon, so I suppose its just something we live with (as we did > with the log recovery issues in that situation for several years). So you think this should not be fixed, then? Because if it -is- fixed then it's not an fs transfer problem; suddenly 64-bit attr2 filesystems will think they have attr1 if proper padding is added. 
-Eric From owner-xfs@oss.sgi.com Fri Nov 17 07:54:44 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 17 Nov 2006 07:54:52 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAHFsgaG024766 for ; Fri, 17 Nov 2006 07:54:44 -0800 X-ASG-Debug-ID: 1163778831-14612-55-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from slurp.thebarn.com (cattelan-host202.dsl.visi.com [208.42.117.202]) by cuda.sgi.com (Spam Firewall) with ESMTP id 953A9D1D7043 for ; Fri, 17 Nov 2006 07:53:51 -0800 (PST) Received: from [10.0.0.12] (ease.thebarn.com [10.0.0.12]) (authenticated bits=0) by slurp.thebarn.com (8.13.8/8.13.8) with ESMTP id kAHFrnnN028968; Fri, 17 Nov 2006 09:53:50 -0600 (CST) (envelope-from cattelan@thebarn.com) Message-ID: <455DDB0D.7000005@thebarn.com> Date: Fri, 17 Nov 2006 09:53:49 -0600 From: Russell Cattelan User-Agent: Mozilla Thunderbird 1.0.7 (Macintosh/20050923) X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Chinner CC: Eric Sandeen , xfs@oss.sgi.com X-ASG-Orig-Subj: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <20061116224527.GF11034@melbourne.sgi.com> In-Reply-To: <20061116224527.GF11034@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26277 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-archive-position: 9680 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cattelan@thebarn.com Precedence: 
bulk X-list: xfs Content-Length: 2066 Lines: 54 David Chinner wrote: > >@@ -135,9 +136,12 @@ > __uint8_t sb_shared_vn; /* shared version number */ > xfs_extlen_t sb_inoalignmt; /* inode chunk alignment, fsblocks */ > __uint32_t sb_unit; /* stripe or raid unit */ >- __uint32_t sb_width; /* stripe or raid width */ >+ __uint32_t sb_width; /* stripe or raid width */ > __uint8_t sb_dirblklog; /* log2 of dir block size (fsbs) */ >- __uint8_t sb_dummy[7]; /* padding */ >+ __uint8_t sb_logsectlog; /* log2 of the log sector size */ >+ __uint16_t sb_logsectsize; /* sector size for the log, bytes */ >+ __uint32_t sb_logsunit; /* stripe unit size for the log */ >+ __uint32_t sb_features2; /* additional feature bits */ > } xfs_sb_t; > >So before the sector size > 512 bytes code, there was padding to push the >superblock out to 64bit alignement so that sb_dirblklog was correctly aligned. >The xfs_sb_info structure: > > { offsetof(xfs_sb_t, sb_unit), 0 }, > { offsetof(xfs_sb_t, sb_width), 0 }, > { offsetof(xfs_sb_t, sb_dirblklog), 0 }, >- { offsetof(xfs_sb_t, sb_dummy), 1 }, >+ { offsetof(xfs_sb_t, sb_logsectlog), 0 }, >+ { offsetof(xfs_sb_t, sb_logsectsize),0 }, > { offsetof(xfs_sb_t, sb_logsunit), 0 }, >+ { offsetof(xfs_sb_t, sb_features2), 0 }, > { sizeof(xfs_sb_t), 0 } > }; > >had the sb_dummy field as "no translate" so it effectively ignored it >but it ensured that sb_dirblklog was sized correctly. > >The real bug here was whoever removed the dummy field and did not >replace that with a comment ot say that the xfs_sb strucutre needs >to be padded to 64 bits to ensure translation worked properly >on 64 bit systems. > >I'd prefer explicit padding (with warning comments) over packing >the structure. Thoughts? > > That seems safer to me and the comments will make the next person to muck with the structure think about padding. >Cheers, > >Dave. 
> > From owner-xfs@oss.sgi.com Fri Nov 17 08:25:17 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 17 Nov 2006 08:25:24 -0800 (PST) Received: from cuda.sgi.com (cuda2.sgi.com [192.48.168.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAHGPEaG011136 for ; Fri, 17 Nov 2006 08:25:17 -0800 X-ASG-Debug-ID: 1163779590-15393-172-0 X-Barracuda-URL: http://cuda.sgi.com:80/cgi-bin/mark.cgi Received: from mut.autodesk.com (mut.autodesk.com [198.102.112.26]) by cuda.sgi.com (Spam Firewall) with ESMTP id 8A6F8D1D8817; Fri, 17 Nov 2006 08:06:30 -0800 (PST) Received: from msgusawfe01.ads.autodesk.com ([144.111.33.210]) by mut.autodesk.com (8.13.0/8.12.6) with ESMTP id kAHG4twO010650; Fri, 17 Nov 2006 08:04:55 -0800 (PST) Received: from msgusawpf02.ads.autodesk.com ([144.111.33.213]) by msgusawfe01.ads.autodesk.com with Microsoft SMTPSVC(5.0.2195.6713); Fri, 17 Nov 2006 08:04:55 -0800 Received: from msgusaemb01.ads.autodesk.com ([144.111.72.50]) by msgusawpf02.ads.autodesk.com with Microsoft SMTPSVC(5.0.2195.6713); Fri, 17 Nov 2006 08:04:54 -0800 Received: from msgusaemb02.ads.autodesk.com ([144.111.72.54]) by msgusaemb01.ads.autodesk.com with Microsoft SMTPSVC(5.0.2195.6713); Fri, 17 Nov 2006 11:04:53 -0500 x-mimeole: Produced By Microsoft Exchange V6.0.6603.0 content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: multipart/related; boundary="----_=_NextPart_001_01C70A62.1E1EC3CD"; type="multipart/alternative" X-ASG-Orig-Subj: mkfs.xfs kernel panic with mdadm on RH4U3 kernel Subject: mkfs.xfs kernel panic with mdadm on RH4U3 kernel Date: Fri, 17 Nov 2006 11:04:53 -0500 Message-ID: X-MS-Has-Attach: yes X-MS-TNEF-Correlator: Thread-Topic: mkfs.xfs kernel panic with mdadm on RH4U3 kernel Thread-Index: AccKYhsboIljfovOTvuXDzEcqC13sg== From: "Billy Russell" To: Cc: "Jean Blouin" , "Dominique Bocquet" X-OriginalArrivalTime: 17 Nov 2006 16:04:53.0968 (UTC) FILETIME=[1E97B500:01C70A62] X-Barracuda-Spam-Score: 0.50 X-Barracuda-Spam-Status: No, 
SCORE=0.50 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=BSF_RULE7568M X-Barracuda-Spam-Report: Code version 3.02, rules version 3.0.26277 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- 0.50 BSF_RULE7568M BODY: Custom Rule 7568M X-archive-position: 9681 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: billy.russell@autodesk.com Precedence: bulk X-list: xfs Content-Length: 30613 Lines: 816 This is a multi-part message in MIME format. ------_=_NextPart_001_01C70A62.1E1EC3CD Content-Type: multipart/alternative; boundary="----_=_NextPart_002_01C70A62.1E1EC3CD" ------_=_NextPart_002_01C70A62.1E1EC3CD Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: quoted-printable

Hi all

I have encountered a problem on RH4 U2 and U3 with mdadm as the volume
manager and XFS as the filesystem. The issue seems to be a max capacity
problem that did not exist on a 64bit RH3 kernel. We were previously
running on the ia32e kernel supplied with CXFS 3.4.6 for Linux. With
this kernel we were able to use mdadm and a 13TB filesystem with great
performance and no capacity issues.

The configuration has been tested on 2 different platforms and both
QLogic and Atto FC cards as well as RH4U2 and RH4U3. Below is the
scenario (what fails and what doesn't) as well as the netdump log file.

External 4 Gig FC RAID storage.
8 LUNs of 726 gigs
Mdadm 1.6.0-2
Open src XFS
This failed

External 4 Gig FC RAID storage.
8 LUNs of 726 gigs
Mdadm 2.5.2-1
Open src XFS
This failed

External 4 Gig FC RAID storage.
8 LUNs of 726 gigs
Mdadm 2.5.2-1
CXFS 4.0 version of XFS
This failed

External 4 Gig FC RAID storage.
6 LUNs of 726 gigs
This passes with any version of XFS and MDADM

If I use XVM and XFS I can configure the full storage configuration,
make a filesystem and run fine. There is some kind of link between mdadm
and XFS that is failing.

Any help would be appreciated.

LOG FILE FROM NETDUMP

Unable to handle kernel paging request at 000001015a441340 RIP:
<ffffffffa0b9fab4>{:raid0:raid0_make_request+444}
PML4 8063 PGD 0
Oops: 0000 [1] SMP
CPU 2
Modules linked in: netconsole netdump raid0 celerityfc(U) wacom(U) sg mvfs(U) vnode(U) nfsd exportfs md5 ipv6 parport_pc lp parport autofs4 nfs lockd nfs_acl sunrpc ics_sdp(U) ics_offload(U) ipoib(U) ics_dsc(U) mt25218vpd(U) ibt(U) mst(U) ide_scsi xfs(U) dmapi(U) dm_mirror dm_mod ohci1394 ieee1394 ohci_hcd0000010071d3db40 <ffffffff8024b0ea>{generic_make_request+355} <ffffffff80134e7e>{autoremove_wake_function+0}
       <ffffffff8024b1f6>{submit_bio+247} <ffffffff8017c6b8>{bio_alloc+288}
       <ffffffff8017a5e6>{submit_bh+255} <ffffffff8017c115>{block_read_full_page+584}
       <ffffffff8017e847>{blkdev_get_block+0} <ffffffff80159814>{do_generic_mapping_read+567}
       <ffffffff801599d4>{file_read_actor+0} <ffffffff8015af96>{__generic_file_aio_write_nolock+731}
       <ffffffff8015b59c>{__generic_file_aio_read+385} <ffffffff8015b737>{generic_file_read+187}
       <ffffffff801e8b09>{__up_read+16} <ffffffff80134e7e>{autoremove_wake_function+0}
       <ffffffff80134e7e>{autoremove_wake_function+0} <ffffffff80191a38>{dnotify_parent+34}
       <ffffffff80177f43>{vfs_read+207} <ffffffff80178287>{sys_pread64+86}
       <ffffffff80110236>{system_call+126}

Code: 48 8b 14 d0 48 8b 42 28 48 89 43 10 48 03 4a 40 b8 01 00 00
RIP <ffffffffa0b9fab4>{:raid0:raid0_make_request+444} RSP <0000010071d3dac8>
CR2: 000001015a441340

Billy Russell
Storage & Networking Technical Lead
Autodesk - Media and Entertainment Division
10 rue Duke
Montreal, QC. H3C 2L7
514-954-7377 (office)
billy.russell@autodesk.com

------_=_NextPart_002_01C70A62.1E1EC3CD--
[Attachments removed: image003.jpg (included twice), image001.png]
------_=_NextPart_001_01C70A62.1E1EC3CD--

From owner-xfs@oss.sgi.com Fri Nov 17 16:18:20 2006 Received: with ECARTIS (v1.0.0; list xfs); Fri, 17 Nov 2006 16:18:26 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAI0IIaG014039 for ; Fri, 17 Nov 2006 16:18:20 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kAHNnJxY021386; Fri, 17 Nov 2006 18:49:19 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com
(8.13.1/8.13.1) with ESMTP id kAHNnEL4021678; Fri, 17 Nov 2006 18:49:14 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAHNnDqk014737; Fri, 17 Nov 2006 18:49:14 -0500 Message-ID: <455E4A79.5090103@sandeen.net> Date: Fri, 17 Nov 2006 17:49:13 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: Eric Sandeen CC: xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> In-Reply-To: <455CB54F.8080901@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 9683 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 643 Lines: 18 Eric Sandeen wrote: > see also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212201 ok so the padding was a red herring w.r.t. the corruption. What I have found is that the bmap btree block -is- in the inode, and the attribute extents -are- in the inode, but they are not properly placed. If you reset the forkoff to 13, you'll get proper answers for the btree block. If you set it to 15, you get proper answers for the attribute extents. But not both at the same time. Something set the forkoff, or the location of the btree block, or the location of the attr extents, at the wrong place in the inode. Getting closer... 
-Eric

From owner-xfs@oss.sgi.com Sun Nov 19 11:39:53 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 11:40:01 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAJJdqaG016791 for ; Sun, 19 Nov 2006 11:39:53 -0800 Received: from [10.0.0.2] (newserver.sandeen.net [10.0.0.2]) by sandeen.net (Postfix) with ESMTP id 1EC5118E2152B; Sun, 19 Nov 2006 13:07:52 -0600 (CST) Message-ID: <4560AB84.9060200@sandeen.net> Date: Sun, 19 Nov 2006 13:07:48 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5 (X11/20060313) MIME-Version: 1.0 To: xfs@oss.sgi.com CC: centos-devel@centos.org Subject: New CentOS4/RHEL4-compatible xfs module rpms Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9692 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1883 Lines: 51

http://sandeen.net/rhel4_xfs/kernel-module-xfs-2.6.9-42.0.2.EL-0.2-1.src.rpm

rpmbuild --rebuild --target i686 kernel-module-xfs-2.6.9-42.0.2.EL-0.2-1.src.rpm

will build against the currently running kernel, or

rpmbuild --rebuild --target i686 --define "kernel_topdir /lib/modules/2.6.9-42.0.2.EL/build" kernel-module-xfs-2.6.9-42.0.2.EL-0.2-1.src.rpm

will build against what is defined in kernel_topdir

you need matching kernel & kernel-devel rpms installed to build.

Changelog: mostly pulling in fixes sgi sent for sles9, plus specfile cleanups.

[root@sandeen rhel4_xfs]# rpm -qp --changelog kernel-module-xfs-2.6.9-42.0.2.EL-0.2-1.src.rpm
* Mon Nov 13 2006 - sandeen-centos@sandeen.net
- removed xfs_direct_io_locking.patch, RHEL4U4 has the fix now.
- Update to xfs codebase from SLES9_SP3_BRANCH_20061107171129
- xfs-kern-26347a-fix-race-on-link: fix race on link (191713, SGI:PV953287).
- xfs-kern-26040a-do-not-dirty-inode-being-freed: Don't dirty the inode if it being freed in xfs_iunpin (179117, SGI:PV952967).
- xfs-kern-25687a-sles9sp3-iunpin-reclaim-fix: Fix an inode use-after-free during an unpin (SGI:PV946321, 142533).
- xfs-kern-930841-default_acl_enospc_fix: Default acl ENOSPC fix (133990, SGI:PV930841).
- xfs-kern-25238a-quota-trans-diag: quota trans diag (131262, SGI:PV931456).
- xfs-ftruncate-stale-data: XFS ftruncate() bug could expose stale data (151055).
- xfs-log-diag: log_runout_diagnostics (131262, SGI:PV947110).
- xfs-kern-202363a-fix-xfs_finish_reclaim_all-umount-deadlock: fix xfs_finish_reclaim_all umount deadlock (132358, SGI:PV943821).
- clean up specfile:
    default to building against running kernel
    try to match fedora conventions slightly better
    re-enable debuginfo pkg building, module stripping
    fix recognition of root on xfs, (grep -i XFS)
    remove unused -source subpackage

-Eric

From owner-xfs@oss.sgi.com Sun Nov 19 14:05:16 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 14:05:23 -0800 (PST) Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.188]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAJM5EaG012669 for ; Sun, 19 Nov 2006 14:05:16 -0800 Received: by nf-out-0910.google.com with SMTP id x30so1993838nfb for ; Sun, 19 Nov 2006 14:04:23 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=sr6Vp+cZ7ZPlmYM0oaqW3mKeRy+u/NtXxmzoeyk0S7BGxkOTk7PGp/zeafCfuP2W1j69zRSbQzSb2aheb7ijYSszHX1iHXQLhxmKe+BmVrbP6JWMcYadE+4Y9ucHPqJTvM3YW/wP9I77kCYJ5965VS/Rh3fpRzR8e/EUGDtgLlo= Received: by 10.49.70.16 with SMTP id x16mr5239616nfk.1163972289473; Sun, 19 Nov 2006 13:38:09 -0800 (PST) Received: by 10.49.22.17 with HTTP; Sun, 19 Nov 2006 13:38:09 -0800 (PST) Message-ID: <1bc1cb170611191338l44425f5k37320aebcb79688e@mail.gmail.com> Date: Sun, 19
Nov 2006 23:38:09 +0200 From: "=?ISO-8859-1?Q?Rami_V=E4stil=E4?=" To: xfs@oss.sgi.com Subject: xfs_growfs failed when crossing 1TB boundary MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9693 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bulpper@gmail.com Precedence: bulk X-list: xfs Content-Length: 4229 Lines: 126

Hi,

I'm trying to grow my filesystem over 1TB on top of LVM, but
unfortunately the whole filesystem just shuts down and becomes
unusable.

xfs_growfs causes these error messages to be dumped to the console when
running it for a filesystem bigger than 1TB (actually the lvextend and
'xfs_growfs' commands were used to do the things...)

>xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Input/output error

>attempt to access beyond end of device
>fe:00: rw=1, want=2137593089, limit=1086410752
>xfs_force_shutdown(device-mapper(254,0),0x1) called from line 353 of file xfs_rw.c Return address = 0xf8a679eb
>Filesystem "device-mapper(254,0)": I/O Error detected. Shutting down filesystem: device-mapper(254,0)
>Please umount the filesystem, and rectify the problem(s)
>xfs_force_shutdown(device-mapper(254,0),0x1) called from line 353 of file xfs_rw.c Return address = 0xf8a679eb
>XFS mounting filesystems device-mapper(254,0)

...after climbing between walls and jumping on my head, I was able to
shrink it back under the 1 TB barrier - 1023 GB in fact - and everything
works just fine....;-)

---

Here's some system info:

# uname -a
Linux sweetums 2.4.27-3-686-smp #1 SMP Thu Sep 14 07:44:00 UTC 2006 i686 GNU/Linux

~# pvdisplay
  --- Physical volume ---
  PV Name               /dev/hdc1
  VG Name               lvmdisk
  PV Size               298.09 GB / not usable 0
  Allocatable           yes (but full)
  PE Size (KByte)       4096
  Total PE              76310
  Free PE               0
  Allocated PE          76310
  PV UUID               pxjwHG-Z2Ct-SSAh-Sm2v-z3gL-YHVX-eO3uXL

  --- Physical volume ---
  PV Name               /dev/hdi1
  VG Name               lvmdisk
  PV Size               279.39 GB / not usable 0
  Allocatable           yes (but full)
  PE Size (KByte)       4096
  Total PE              71525
  Free PE               0
  Allocated PE          71525
  PV UUID               KPYMTo-6tXk-ini3-xBzl-wnCS-VJji-Lq6CMV

  --- Physical volume ---
  PV Name               /dev/hdj1
  VG Name               lvmdisk
  PV Size               279.39 GB / not usable 0
  Allocatable           yes (but full)
  PE Size (KByte)       4096
  Total PE              71525
  Free PE               0
  Allocated PE          71525
  PV UUID               AFqf2U-1o7u-DjNG-Q3Bk-MmdD-wZzt-oBQWGc

  --- Physical volume ---
  PV Name               /dev/hdk1
  VG Name               lvmdisk
  PV Size               279.39 GB / not usable 0
  Allocatable           yes
  PE Size (KByte)       4096
  Total PE              71525
  Free PE               28986
  Allocated PE          42539
  PV UUID               LQZBHZ-QtRb-pKmJ-1Tc2-z2Hz-SU7S-UnWC8I

# vgdisplay
  --- Volume group ---
  VG Name               lvmdisk
  System ID
  Format                lvm2
  Metadata Areas        4
  Metadata Sequence No  22
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                4
  Act PV                4
  VG Size               1.11 TB
  PE Size               4.00 MB
  Total PE              290885
  Alloc PE / Size       261899 / 1023.04 GB
  Free  PE / Size       28986 / 113.23 GB
  VG UUID               8B6OW6-YOBj-Weit-LOUL-Rhzb-prO1-OQtEA0

# xfs_info /mnt/storage/
meta-data=/mnt/storage           isize=256    agcount=55, agsize=4883776 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=268184576, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0

#df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hde1             74224508   1589896  68864160   3% /
tmpfs                   512572         0    512572   0% /dev/shm
/dev/mapper/lvmdisk-storage
                    1072607232 805661384 266945848  76% /mnt/storage

...and the version of XFS is 2.6.20

I'd appreciate any help and tips for getting rid of this problem -
thanks in advance...

-R

From owner-xfs@oss.sgi.com Sun Nov 19 15:40:14 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 15:40:20 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAJNeCaG026919 for ; Sun, 19 Nov 2006 15:40:14 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 24DD7534312; Mon, 20 Nov 2006 10:11:12 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 10643-01-86; Mon, 20 Nov 2006 10:11:11 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 2B1C6534294; Mon, 20 Nov 2006 10:11:11 +1100 (EST) Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches From: Nathan Scott Reply-To: nscott@aconex.com To: sandeen@sandeen.net Cc: David Chinner , Timothy Shimmin , xfs@oss.sgi.com In-Reply-To: <48064.10.0.0.2.1163776850.squirrel@sandeen.net> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> <20061117055521.GS11034@melbourne.sgi.com> <52841.10.0.0.2.1163745285.squirrel@sandeen.net> <1163746343.4695.152.camel@edge> <48064.10.0.0.2.1163776850.squirrel@sandeen.net> Content-Type: text/plain Organization: Aconex Date: Mon, 20 Nov 2006 10:11:46 +1100 Message-Id: <1163977907.4695.157.camel@edge>
Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 9694 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 1212 Lines: 31

On Fri, 2006-11-17 at 09:20 -0600, sandeen@sandeen.net wrote:
> > On Fri, 2006-11-17 at 00:34 -0600, sandeen@sandeen.net wrote:
> >> and really, now that this is out in the wild, maybe sb_features3
> >> instead of padding is appropriate, and check both for the attr2
> >> bit...? :(
> >
> > That's not going to work, there's three or four other feature2 bits
> > preceding attr2 as well.
> >
> > The "take a 32 bit system's fs to a 64 bit system" is relatively
> > uncommon, so I suppose it's just something we live with (as we did
> > with the log recovery issues in that situation for several years).
>
> So you think this should not be fixed, then? Because if it -is- fixed

I didn't say that. It should be fixed. No one will notice though, as
it's not actually biting anyone... (the attr2 problem will not be
related to this, it's gonna be something else).

> then it's not an fs transfer problem; suddenly 64-bit attr2 filesystems
> will think they have attr1 if proper padding is added.

Now to really fry your noodle, attr2 is actually ondisk compatible with
attr1. :) (the SB bit was taken to prevent a repair buglet from
accidentally trashing all inodes using a non-fixed forkoff).

cheers.
--
Nathan

From owner-xfs@oss.sgi.com Sun Nov 19 16:37:22 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 16:37:29 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAK0bIaG007242 for ; Sun, 19 Nov 2006 16:37:20 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA04170; Mon, 20 Nov 2006 11:36:26 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAK0aP7Y39423612; Mon, 20 Nov 2006 11:36:25 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAK0aMUC40141700; Mon, 20 Nov 2006 11:36:22 +1100 (AEDT) Date: Mon, 20 Nov 2006 11:36:22 +1100 From: David Chinner To: Rami =?iso-8859-1?Q?V=E4stil=E4?= Cc: xfs@oss.sgi.com Subject: Re: xfs_growfs failed when crossing 1TB boundary Message-ID: <20061120003622.GY11034@melbourne.sgi.com> References: <1bc1cb170611191338l44425f5k37320aebcb79688e@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1bc1cb170611191338l44425f5k37320aebcb79688e@mail.gmail.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 9695 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1938 Lines: 69

On Sun, Nov 19, 2006 at 11:38:09PM +0200, Rami Västilä wrote:
> Hi,
>
> I'm trying to grow my filesystem over 1TB on top of LVM, but
> unfortunately the whole filesystem just shuts down and becomes
> unusable.
>
> xfs_growfs causes these error messages to be dumped to the console when
> running it for a filesystem bigger than 1TB (actually the lvextend and
> 'xfs_growfs' commands were used to do the things...)
>
> >xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Input/output error
>
> >attempt to access beyond end of device
> >fe:00: rw=1, want=2137593089, limit=1086410752

You tried to grow past the end of the volume.

......

> ...after climbing between walls and jumping on my head, I was able to
> shrink it back under the 1 TB barrier - 1023 GB in fact - and everything
> works just fine....;-)
>
> ---
>
> Here's some system info:
>
> # uname -a
> Linux sweetums 2.4.27-3-686-smp #1 SMP Thu Sep 14 07:44:00 UTC 2006
> i686 GNU/Linux

.....

> # vgdisplay
>   --- Volume group ---
>   VG Name               lvmdisk
>   System ID
>   Format                lvm2
>   Metadata Areas        4
>   Metadata Sequence No  22
>   VG Access             read/write
>   VG Status             resizable
>   MAX LV                0
>   Cur LV                1
>   Open LV               1
>   Max PV                0
>   Cur PV                4
>   Act PV                4
>   VG Size               1.11 TB
>   PE Size               4.00 MB
>   Total PE              290885
>   Alloc PE / Size       261899 / 1023.04 GB
>   Free  PE / Size       28986 / 113.23 GB
>   VG UUID               8B6OW6-YOBj-Weit-LOUL-Rhzb-prO1-OQtEA0

So you have allocated 1023.04GB of your volume group. I assume that's
a single 1023GB logical volume that you have XFS on.

Hence if you want your XFS filesystem to be larger than 1023GB, then
you need to grow your logical volume to larger than 1023GB first.....

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Nov 19 17:40:41 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 17:40:50 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK1edaG017515 for ; Sun, 19 Nov 2006 17:40:41 -0800 Received: from [10.0.0.2] (newserver.sandeen.net [10.0.0.2]) by sandeen.net (Postfix) with ESMTP id CEC5D18E2152B; Sun, 19 Nov 2006 19:39:50 -0600 (CST) Message-ID: <45610761.50009@sandeen.net> Date: Sun, 19 Nov 2006 19:39:45 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5 (X11/20060313) MIME-Version: 1.0 To: nscott@aconex.com CC: David Chinner , Timothy Shimmin , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> <20061117055521.GS11034@melbourne.sgi.com> <52841.10.0.0.2.1163745285.squirrel@sandeen.net> <1163746343.4695.152.camel@edge> <48064.10.0.0.2.1163776850.squirrel@sandeen.net> <1163977907.4695.157.camel@edge> In-Reply-To: <1163977907.4695.157.camel@edge> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9696 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1053 Lines: 24 Nathan Scott wrote: > On Fri, 2006-11-17 at 09:20 -0600, sandeen@sandeen.net wrote: >>> On Fri, 2006-11-17 at 00:34 -0600, sandeen@sandeen.net wrote: >>>> and really, now that this is out in the wild, maybe sb_features3 >>>> instead of padding is appropriate, and check both for the attr2 >>>> bit...? :( >>> Thats not going to work, theres three or four other feature2 bits >>> preceding attr2 as well. 
>>> >>> The "take a 32 bit systems fs to a 64 bit system" is relatively >>> uncommon, so I suppose its just something we live with (as we did >>> with the log recovery issues in that situation for several years). >> So you think this should not be fixed, then? Because if it -is- fixed > > I didn't say that. It should be fixed. Noone will notice though, > as its not actually biting anyone... (the attr2 problem will not > be related to this, its gonna be something else). but it can't just be properly padded in the kernel and leave it at that, can it? If so won't attr2 filesystems on x86_64 suddenly start appearing to be attr2? -Eric From owner-xfs@oss.sgi.com Sun Nov 19 18:14:33 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 18:14:40 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAK2ETaG021709 for ; Sun, 19 Nov 2006 18:14:31 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA06816; Mon, 20 Nov 2006 13:13:31 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAK2DU7Y40137085; Mon, 20 Nov 2006 13:13:30 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAK2DSRv40188730; Mon, 20 Nov 2006 13:13:28 +1100 (AEDT) Date: Mon, 20 Nov 2006 13:13:28 +1100 From: David Chinner To: Shailendra Tripathi Cc: xfs-dev@sgi.com, xfs@oss.sgi.com Subject: Re: [RFC 0/3] Convert XFS inode hashes to radix trees Message-ID: <20061120021328.GZ11034@melbourne.sgi.com> References: <20061003060610.GV3024@melbourne.sgi.com> <455A68AF.8030309@agami.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <455A68AF.8030309@agami.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 9697 
X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 4036 Lines: 94 On Tue, Nov 14, 2006 at 05:09:03PM -0800, Shailendra Tripathi wrote: > Hi David, > I regret for making comments and questions on this quite > late (somehow I missed to email). > It does appear to me that using this approach can potentially help in > cluster hash list related manipulations. > However, this appears (to me) to be at the cost of regular inode lookup. Yes, there is less parallelism in the radix tree approach, as I stated in the original description. > As of now, each of the hash buckets have their own lock. This helps in > not making the xfs_iget > operations hot. I have not seen of xfs_iget anywhere on the top in my > profiling of Linux for SPECFS. > With this code, the number of hash buckets can be appropriately sized > (based upon memory availability). Sure, but tuning for specsfs is not the problem we are trying to solve here. The problem we are solving is scaling to tens of millions of cached inodes in core -without needing to tune- the filesystem and the inode hashes are the number one problem there. > However, it appears to be that radix tree (even with 15) can become a > bottleneck. Lets assume that there are > 600K inodes on a reasonably big end system and assuming fare Only 600k cached inodes? That's not a "big end" system - we're seeing problems with single filesystem inode caches almost two _orders of magnitude_ larger than this on production machines. > distribution, each of the radix tree will > have 600K/15 ~ 40K inodes per hash tree. Insertion and deletion to the > list have to take writer_lock and > given their frequency, both readers (lookups) and writers will be affected. 
Right, but we've been hacking at this code time and time again because of scalability problems due to hash sizing, inefficient list traversal, non MRU ordering of the hash lists, etc. Hash tables are simply too inflexible when it comes to scaling to really, really large numbers of cached inodes.

The advantage of radix trees is logarithmic scaling, so the length of time the lock is held (either shared or exclusive) is reduced substantially when cache misses (i.e. when you need to do an insert) occur. Hence the reduction in the number of locks is somewhat negated by the reduced time we need to hold the lock for.

So, I've traded off massively overblown parallelism for a structure that scales far better and, by my measurements, provides the same throughput.

And, FWIW, I'm not really concerned about cache hit parallelism in the face of insert and delete exclusive locking because this patch in the -mm tree from Nick Piggin:

radix-tree-rcu-lockless-readside.patch

is the right way to solve this problem and will be far better than even the existing hash is in terms of lookup parallelism.

> Have you done any performance testing with these patches. I am
> quite curious to know the results. If not, may be I can
> try do some perf. testing with these changes albeit on a old kernel tree.

Yes, I have done some performance testing on them (but not specsfs). IIRC (I can't find the results right now), a single radix tree performed the same as a default hash up to ~8 parallel threads all doing creates or removes and the tests ran up to about 5 million inodes in core on the one filesystem. With a hash of 7 radix trees (4p machine) the radix tree implementation at 8 threads had about 10% improvement in throughput and this increased to about 15% by 128 threads. Also, there was a reduction in CPU usage of about 10% when the thread count increased past about 16....
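
To put rough numbers on the scaling argument above, here is a back-of-the-envelope sketch in C. It is not taken from the patches: the function names are hypothetical, the even distribution of inodes across buckets is an assumption, and the 6 bits per tree level is an assumed fanout of 64 slots per node. It only compares how far a lookup has to walk in a fixed-size hash against the height of a radix tree as the inode count grows.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: average number of inodes chained in one bucket
 * of a fixed-size hash table, assuming an even distribution.
 * This grows linearly with the number of cached inodes.
 */
uint64_t hash_chain_length(uint64_t inodes, uint64_t buckets)
{
	assert(buckets > 0);
	return (inodes + buckets - 1) / buckets;	/* round up */
}

/*
 * Hypothetical helper: number of levels a radix tree needs to index
 * keys up to max_index when each node resolves bits_per_level bits.
 * This grows with the logarithm of the largest key.
 */
unsigned radix_tree_height(uint64_t max_index, unsigned bits_per_level)
{
	unsigned height = 1;

	assert(bits_per_level >= 1 && bits_per_level < 64);
	/* Add a level while key bits remain above the levels counted so far. */
	while (bits_per_level * height < 64 &&
	       (max_index >> (bits_per_level * height)) != 0)
		height++;
	return height;
}
```

Under these assumptions, 600K inodes over 15 buckets gives the ~40K entries per bucket quoted above, while a tree indexing 13 million keys at 6 bits per level is only 4 levels deep.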
The other big difference is the improvement in inode reclaim speed - unmount of a filesystem with ~13 million inodes in core dropped from about 20 minutes to under 2 minutes, i.e. ~18 minutes reclaiming inodes (i.e. removing them from the hashes) down to ~30s during unmount.

> Am I missing something here ? Please let me know.

The potential that lockless radix tree lookups imply, I think ;)

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group
From owner-xfs@oss.sgi.com Sun Nov 19 19:25:51 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 19:25:59 -0800 (PST) Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK3PnaG030587 for ; Sun, 19 Nov 2006 19:25:51 -0800 Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id 23D2828B9E; Mon, 20 Nov 2006 13:59:52 +1100 (EST) Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 100ED53403A; Mon, 20 Nov 2006 13:59:52 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 28421-01-89; Mon, 20 Nov 2006 13:59:51 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 422CB534039; Mon, 20 Nov 2006 13:59:51 +1100 (EST) Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches From: Nathan Scott Reply-To: nscott@aconex.com To: Eric Sandeen Cc: David Chinner , Timothy Shimmin , xfs@oss.sgi.com In-Reply-To: <45610761.50009@sandeen.net> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> <20061117055521.GS11034@melbourne.sgi.com> <52841.10.0.0.2.1163745285.squirrel@sandeen.net> <1163746343.4695.152.camel@edge> <48064.10.0.0.2.1163776850.squirrel@sandeen.net> <1163977907.4695.157.camel@edge>
<45610761.50009@sandeen.net> Content-Type: text/plain Organization: Aconex Date: Mon, 20 Nov 2006 14:00:28 +1100 Message-Id: <1163991628.4695.169.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 9698 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 324 Lines: 17 On Sun, 2006-11-19 at 19:39 -0600, Eric Sandeen wrote: > ... > but it can't just be properly padded in the kernel and leave it at that, > can it? I think it can. > If so won't attr2 filesystems on x86_64 suddenly start > appearing to be attr2? What problem do you see resulting from that though? cheers. -- Nathan From owner-xfs@oss.sgi.com Sun Nov 19 19:33:35 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 19:33:42 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK3XYaG031836 for ; Sun, 19 Nov 2006 19:33:35 -0800 Received: from [10.0.0.2] (newserver.sandeen.net [10.0.0.2]) by sandeen.net (Postfix) with ESMTP id C6C0A187B8C1E; Sun, 19 Nov 2006 21:32:46 -0600 (CST) Message-ID: <456121D9.9010507@sandeen.net> Date: Sun, 19 Nov 2006 21:32:41 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5 (X11/20060313) MIME-Version: 1.0 To: nscott@aconex.com CC: David Chinner , Timothy Shimmin , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> <20061117055521.GS11034@melbourne.sgi.com> <52841.10.0.0.2.1163745285.squirrel@sandeen.net> <1163746343.4695.152.camel@edge> <48064.10.0.0.2.1163776850.squirrel@sandeen.net> <1163977907.4695.157.camel@edge> <45610761.50009@sandeen.net> <1163991628.4695.169.camel@edge> In-Reply-To: <1163991628.4695.169.camel@edge> Content-Type: text/plain; charset=ISO-8859-1; 
format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9699 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 427 Lines: 19 Nathan Scott wrote: > On Sun, 2006-11-19 at 19:39 -0600, Eric Sandeen wrote: >> ... >> but it can't just be properly padded in the kernel and leave it at that, >> can it? > > I think it can. > >> If so won't attr2 filesystems on x86_64 suddenly start >> appearing to be attr2? ugh typo... "as attr1" I meant... > What problem do you see resulting from that though? is an attr2 filesystem mounted as attr1 safe? -Eric From owner-xfs@oss.sgi.com Sun Nov 19 19:37:21 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 19:37:27 -0800 (PST) Received: from prod.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK3bKaG000353 for ; Sun, 19 Nov 2006 19:37:21 -0800 Received: from page.mel.office.aconex.com (unknown [192.168.0.210]) by prod.aconex.com (Postfix) with ESMTP id 6E64028C6D; Mon, 20 Nov 2006 14:36:33 +1100 (EST) Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 02EB15340F7; Mon, 20 Nov 2006 14:36:33 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 06302-01-47; Mon, 20 Nov 2006 14:36:30 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 00827534104; Mon, 20 Nov 2006 14:36:27 +1100 (EST) Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches From: Nathan Scott Reply-To: nscott@aconex.com To: Eric Sandeen Cc: David Chinner , Timothy Shimmin , xfs@oss.sgi.com In-Reply-To: <456121D9.9010507@sandeen.net> References: <455CB54F.8080901@sandeen.net> <20061117023946.GN11034@melbourne.sgi.com> 
<20061117055521.GS11034@melbourne.sgi.com> <52841.10.0.0.2.1163745285.squirrel@sandeen.net> <1163746343.4695.152.camel@edge> <48064.10.0.0.2.1163776850.squirrel@sandeen.net> <1163977907.4695.157.camel@edge> <45610761.50009@sandeen.net> <1163991628.4695.169.camel@edge> <456121D9.9010507@sandeen.net> Content-Type: text/plain Organization: Aconex Date: Mon, 20 Nov 2006 14:37:02 +1100 Message-Id: <1163993823.4695.175.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 9700 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 528 Lines: 24 On Sun, 2006-11-19 at 21:32 -0600, Eric Sandeen wrote: > Nathan Scott wrote: > > On Sun, 2006-11-19 at 19:39 -0600, Eric Sandeen wrote: > >> ... > >> but it can't just be properly padded in the kernel and leave it at that, > >> can it? > > > > I think it can. > > > >> If so won't attr2 filesystems on x86_64 suddenly start > >> appearing to be attr2? > > ugh typo... "as attr1" I meant... > > > What problem do you see resulting from that though? > > is an attr2 filesystem mounted as attr1 safe? > yes. 
-- Nathan From owner-xfs@oss.sgi.com Sun Nov 19 19:51:46 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 19 Nov 2006 19:51:52 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAK3pjaG002431 for ; Sun, 19 Nov 2006 19:51:46 -0800 Received: from [10.0.0.2] (newserver.sandeen.net [10.0.0.2]) by sandeen.net (Postfix) with ESMTP id 4541D18E2152B; Sun, 19 Nov 2006 21:50:58 -0600 (CST) Message-ID: <45612621.5010404@sandeen.net> Date: Sun, 19 Nov 2006 21:50:57 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5 (X11/20060313) MIME-Version: 1.0 To: Eric Sandeen CC: xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> In-Reply-To: <455CE1E3.7020703@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9701 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1431 Lines: 53 Eric Sandeen wrote: > Eric Sandeen wrote: >> see also https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212201 >> >> Bugzilla Bug 212201: Cannot build sysem with XFS file system. >> >> I turned on attr2 in FC6 at nathan's suggestion, for selinux goodness >> with more efficient xattr space usage. >> >> But, many reports that this was totally broken in fc6, on x86_64. > > ugh. it's broken on x86 too, so it's not just the alignment/padding, > although that should be fixed for cross-arch mounts. > > -Eric > > here's a testcase to corrupt it FWIW. Russell has a slightly different one derived from this. 
#!/bin/sh

remount() {
	umount mnt
	xfs_db -r fsfile2 -c "inode 131" -c "p core.forkoff" -c "p u" -c "p a"
	mount -o loop fsfile2 mnt/
}

umount mnt/
rm -f fsfile2
mkfs.xfs -dfile,name=fsfile2,size=100m -iattr=2
mount -o loop fsfile2 mnt/
mkdir mnt/dir
setfattr -n user.rity.selinux -v user_foo:blah_foo:mnt_what:0 mnt/dir/
for I in `seq 10 20`; do touch mnt/file$I; done
for I in `seq 100 700`; do touch mnt/dir/file$I; done
remount
for I in `seq 1000 1400`; do touch mnt/dir/file$I; done
#remount # works if we do remount here
setfattr -n user.rity.selinux -v user_foo:blah_foo:mnt_what:0 mnt/dir/
echo "unmounting"
umount mnt/
xfs_db -r fsfile2 -c "inode 131" -c "p core.forkoff" -c "p u" -c "p a" -c "type text" -c "p"

Run it and you'll get the bmap root block from the wrong offset; it'll be "0"

-Eric
From owner-xfs@oss.sgi.com Mon Nov 20 06:57:53 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 20 Nov 2006 06:58:01 -0800 (PST) Received: from nz-out-0102.google.com (nz-out-0102.google.com [64.233.162.202]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAKEvqaG024848 for ; Mon, 20 Nov 2006 06:57:53 -0800 Received: by nz-out-0102.google.com with SMTP id z6so826696nzd for ; Mon, 20 Nov 2006 06:57:03 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=d5/5yCeVOeNmez32frwc8RlPd44Oux/RgieZrM1PekL3cgIES9r4p8yNL7c6IZE+FyVD3Vfn3RBN4aXHmQ7lCpqu9/RZk6Scw6Xn37IsoJx/Ct3ngxpXKkbS5JFfQVLxjW2PLr/TbMB1Lh5+hDIXGucccTQ4YcOW2xtoa6qz3vw= Received: by 10.65.219.15 with SMTP id w15mr8031845qbq.1164033163776; Mon, 20 Nov 2006 06:32:43 -0800 (PST) Received: by 10.65.216.6 with HTTP; Mon, 20 Nov 2006 06:32:43 -0800 (PST) Message-ID: <9a8748490611200632n3b545698h295631460a212b9b@mail.gmail.com> Date: Mon, 20 Nov 2006 15:32:43 +0100 From: "Jesper Juhl" To: "David Chinner" Subject: Re: [PATCH][RFC][resend] potential NULL pointer deref in XFS on failed mount
Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com, xfs-masters@oss.sgi.com, "Andrew Morton" In-Reply-To: <9a8748490611161418l4d5a773k76cf7061d73c8a51@mail.gmail.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_2046_28541952.1164033163623" References: <200611162218.26945.jesper.juhl@gmail.com> <20061116220958.GE11034@melbourne.sgi.com> <9a8748490611161418l4d5a773k76cf7061d73c8a51@mail.gmail.com> X-archive-position: 9705 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jesper.juhl@gmail.com Precedence: bulk X-list: xfs Content-Length: 6667 Lines: 170 ------=_Part_2046_28541952.1164033163623 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline On 16/11/06, Jesper Juhl wrote: > On 16/11/06, David Chinner wrote: > > On Thu, Nov 16, 2006 at 10:18:26PM +0100, Jesper Juhl wrote: > > > (got no reply on this when I originally send it on 20061031, so resending > > > now that a bit of time has passed. The patch still applies cleanly to > > > Linus' git tree as of today.) > > > > > > > > > The Coverity checker spotted a potential problem in XFS. > > > > > > The problem is that if, in xfs_mount(), this code triggers: > > > > > > ... > > > if (!mp->m_logdev_targp) > > > goto error0; > > > ... > > > > > > Then we'll end up calling xfs_unmountfs_close() with a NULL > > > 'mp->m_logdev_targp'. > > > This in turn will result in a call to xfs_free_buftarg() with its 'btp' > > > argument == NULL. xfs_free_buftarg() dereferences 'btp' leading to > > > a NULL pointer dereference and crash. > > > > Interesting that coverity found that, but failed to find the other > > leaks in that function from exactly the same code and error > > case..... 
> > > > > I think this can happen, since the fatal call to xfs_free_buftarg() > > > happens when 'm_logdev_targp != m_ddev_targp' and due to a check of > > > 'm_ddev_targp' against NULL in xfs_mount() (and subsequent return if it is > > > NULL) the two will never both be NULL when we hit the error0 label from > > > the two lines cited above. > > > > > > Comments welcome (please keep me on Cc: on replies). > > > > > > Here's a proposed patch to fix this by testing 'btp' against NULL in > > > xfs_free_buftarg(). > > > > Not the right fix - we should only be trying to free valid > > buftargs, which means xfs_unmountfs_close() is the correct > > place to fix this.... > > > > Ok. > > > e.g: > > > > - if (mp->m_logdev_targp != mp->m_ddev_targp) > > + if (mp->m_logdev_targp && (mp->m_logdev_targp != mp->m_ddev_targp)) > > > > As to the afore-mentioned leaks, if we fail to allocate a realtime > > buftarg, then we will leak a reference to both the rtdev and logdev, > > and if we fail to allocate an external log buftarg we'll leak a > > reference to the logdev. i.e., we fail to do one or both of: > > > > xfs_blkdev_put(logdev); > > xfs_blkdev_put(rtdev); > > > > To remove the bdev references we may have gained earlier. Normally, > > these references are released by xfs_free_buftarg(), but because we > > failed to allocate the buftarg, we can't drop the references via > > that method.... > > > How about something like the attached patch ? 
(sorry about the attachment, but I can't inline the patch from my current location - a whitespace damaged version is below though for easy review) Signed-off-by: Jesper Juhl --- fs/xfs/xfs_mount.c | 2 +- fs/xfs/xfs_vfsops.c | 10 ++++++++-- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 9dfae18..3497128 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -1135,7 +1135,7 @@ #endif void xfs_unmountfs_close(xfs_mount_t *mp, struct cred *cr) { - if (mp->m_logdev_targp != mp->m_ddev_targp) + if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp) xfs_free_buftarg(mp->m_logdev_targp, 1); if (mp->m_rtdev_targp) xfs_free_buftarg(mp->m_rtdev_targp, 1); diff --git a/fs/xfs/xfs_vfsops.c b/fs/xfs/xfs_vfsops.c index 62336a4..6d7e8f1 100644 --- a/fs/xfs/xfs_vfsops.c +++ b/fs/xfs/xfs_vfsops.c @@ -473,13 +473,19 @@ xfs_mount( } if (rtdev) { mp->m_rtdev_targp = xfs_alloc_buftarg(rtdev, 1); - if (!mp->m_rtdev_targp) + if (!mp->m_rtdev_targp) { + xfs_blkdev_put(logdev); + xfs_blkdev_put(rtdev); goto error0; + } } mp->m_logdev_targp = (logdev && logdev != ddev) ? 
xfs_alloc_buftarg(logdev, 1) : mp->m_ddev_targp; - if (!mp->m_logdev_targp) + if (!mp->m_logdev_targp) { + xfs_blkdev_put(logdev); + xfs_blkdev_put(rtdev); goto error0; + } /* * Setup flags based on mount(2) options and then the superblock -- Jesper Juhl Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html ------=_Part_2046_28541952.1164033163623 Content-Type: text/plain; name="patch.txt" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="patch.txt" X-Attachment-Id: f_euqza2rt ClNpZ25lZC1vZmYtYnk6IEplc3BlciBKdWhsIDxqZXNwZXIuanVobEBnbWFp bC5jb20+Ci0tLQoKIGZzL3hmcy94ZnNfbW91bnQuYyAgfCAgICAyICstCiBm cy94ZnMveGZzX3Zmc29wcy5jIHwgICAxMCArKysrKysrKy0tCiAyIGZpbGVz IGNoYW5nZWQsIDkgaW5zZXJ0aW9ucygrKSwgMyBkZWxldGlvbnMoLSkKCmRp ZmYgLS1naXQgYS9mcy94ZnMveGZzX21vdW50LmMgYi9mcy94ZnMveGZzX21v dW50LmMKaW5kZXggOWRmYWUxOC4uMzQ5NzEyOCAxMDA2NDQKLS0tIGEvZnMv eGZzL3hmc19tb3VudC5jCisrKyBiL2ZzL3hmcy94ZnNfbW91bnQuYwpAQCAt MTEzNSw3ICsxMTM1LDcgQEAgI2VuZGlmCiB2b2lkCiB4ZnNfdW5tb3VudGZz X2Nsb3NlKHhmc19tb3VudF90ICptcCwgc3RydWN0IGNyZWQgKmNyKQogewot CWlmIChtcC0+bV9sb2dkZXZfdGFyZ3AgIT0gbXAtPm1fZGRldl90YXJncCkK KwlpZiAobXAtPm1fbG9nZGV2X3RhcmdwICYmIG1wLT5tX2xvZ2Rldl90YXJn cCAhPSBtcC0+bV9kZGV2X3RhcmdwKQogCQl4ZnNfZnJlZV9idWZ0YXJnKG1w LT5tX2xvZ2Rldl90YXJncCwgMSk7CiAJaWYgKG1wLT5tX3J0ZGV2X3Rhcmdw KQogCQl4ZnNfZnJlZV9idWZ0YXJnKG1wLT5tX3J0ZGV2X3RhcmdwLCAxKTsK ZGlmZiAtLWdpdCBhL2ZzL3hmcy94ZnNfdmZzb3BzLmMgYi9mcy94ZnMveGZz X3Zmc29wcy5jCmluZGV4IDYyMzM2YTQuLjZkN2U4ZjEgMTAwNjQ0Ci0tLSBh L2ZzL3hmcy94ZnNfdmZzb3BzLmMKKysrIGIvZnMveGZzL3hmc192ZnNvcHMu YwpAQCAtNDczLDEzICs0NzMsMTkgQEAgeGZzX21vdW50KAogCX0KIAlpZiAo cnRkZXYpIHsKIAkJbXAtPm1fcnRkZXZfdGFyZ3AgPSB4ZnNfYWxsb2NfYnVm dGFyZyhydGRldiwgMSk7Ci0JCWlmICghbXAtPm1fcnRkZXZfdGFyZ3ApCisJ CWlmICghbXAtPm1fcnRkZXZfdGFyZ3ApIHsKKwkJCXhmc19ibGtkZXZfcHV0 KGxvZ2Rldik7CisJCQl4ZnNfYmxrZGV2X3B1dChydGRldik7CiAJCQlnb3Rv IGVycm9yMDsKKwkJfQogCX0KIAltcC0+bV9sb2dkZXZfdGFyZ3AgPSAobG9n 
ZGV2ICYmIGxvZ2RldiAhPSBkZGV2KSA/CiAJCQkJeGZzX2FsbG9jX2J1ZnRh cmcobG9nZGV2LCAxKSA6IG1wLT5tX2RkZXZfdGFyZ3A7Ci0JaWYgKCFtcC0+ bV9sb2dkZXZfdGFyZ3ApCisJaWYgKCFtcC0+bV9sb2dkZXZfdGFyZ3ApIHsK KwkJeGZzX2Jsa2Rldl9wdXQobG9nZGV2KTsKKwkJeGZzX2Jsa2Rldl9wdXQo cnRkZXYpOwogCQlnb3RvIGVycm9yMDsKKwl9CiAKIAkvKgogCSAqIFNldHVw IGZsYWdzIGJhc2VkIG9uIG1vdW50KDIpIG9wdGlvbnMgYW5kIHRoZW4gdGhl IHN1cGVyYmxvY2sK ------=_Part_2046_28541952.1164033163623-- From owner-xfs@oss.sgi.com Mon Nov 20 12:28:50 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 20 Nov 2006 12:28:57 -0800 (PST) Received: from lucidpixels.com (lucidpixels.com [66.45.37.187]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAKKSmaG031867 for ; Mon, 20 Nov 2006 12:28:50 -0800 Received: by lucidpixels.com (Postfix, from userid 1001) id A855861012A0; Mon, 20 Nov 2006 15:00:20 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id A52AB16172EB3 for ; Mon, 20 Nov 2006 15:00:20 -0500 (EST) Date: Mon, 20 Nov 2006 15:00:20 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: xfs@oss.sgi.com Subject: XFS CORRUPTION 2.6.17.13? Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 9707 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs Content-Length: 2270 Lines: 64 Anyone know what could cause this? Is the last good kernel to use 2.6.17.6 w/XFS bugfix patch? Was a new bug introduced from 2.6.16.6 -> 2.6.17.13? Nov 20 13:16:58 box [4299533.469000] 0x0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Nov 20 13:16:58 box [4299533.469000] Filesystem "hda2": XFS internal error xfs_da_do_buf(2) at line 2212 of file fs/xfs/xfs_da_btree.c. 
Caller 0xc01ffcad Nov 20 13:16:58 box [4299533.469000] Nov 20 13:16:58 box xfs_corruption_error+0xf2/0x11a Nov 20 13:16:58 box Nov 20 13:16:58 box Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f Nov 20 13:16:58 box Nov 20 13:16:58 box [4299533.469000] Nov 20 13:16:58 box kmem_zone_alloc+0x60/0xdd Nov 20 13:16:58 box Nov 20 13:16:58 box Nov 20 13:16:58 box xfs_da_buf_make+0xf7/0x14c Nov 20 13:16:58 box Nov 20 13:16:58 box [4299533.469000] Nov 20 13:16:58 box xfs_da_do_buf+0x935/0x98d Nov 20 13:16:58 box Nov 20 13:16:58 box Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f Nov 20 13:16:58 box Nov 20 13:16:58 box [4299533.469000] Nov 20 13:16:58 box __alloc_pages+0x53/0x2d6 Nov 20 13:16:58 box Nov 20 13:16:58 box Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f Nov 20 13:16:58 box Nov 20 13:16:58 box [4299533.469000] Nov 20 13:16:58 box xfs_da_node_lookup_int+0xd0/0x399 Nov 20 13:16:58 box Nov 20 13:16:58 box Nov 20 13:16:58 box xfs_da_node_lookup_int+0xd0/0x399 Nov 20 13:16:58 box Nov 20 13:16:58 box [4299533.469000] Nov 20 13:16:58 box xfs_dir2_node_lookup+0x3f/0xb9 Nov 20 13:16:58 box Nov 20 13:16:58 box Nov 20 13:16:58 box xfs_dir2_lookup+0x137/0x139 Nov 20 13:16:58 box Nov 20 13:16:58 box [4299533.470000] Nov 20 13:16:58 box __alloc_pages+0x53/0x2d6 Nov 20 13:16:58 box Nov 20 13:16:58 box Nov 20 13:16:58 box xfs_dir_lookup_int+0x40/0x125 Nov 20 13:16:58 box Nov 20 13:16:58 box [4299533.470000] Nov 20 13:16:58 box xfs_lookup+0x5f/0x88 Nov 20 13:16:58 box Nov 20 13:16:58 box Nov 20 13:16:58 box xfs_vn_lookup+0x4f/0x93 Nov 20 13:16:58 box Nov 20 13:16:58 box [4299533.470000] Nov 20 13:16:58 box do_lookup+0x12d/0x15f Nov 20 13:16:58 box Nov 20 13:16:58 box From owner-xfs@oss.sgi.com Mon Nov 20 12:50:54 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 20 Nov 2006 12:51:01 -0800 (PST) Received: from lucidpixels.com (lucidpixels.com [66.45.37.187]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAKKoraG003241 for ; Mon, 20 Nov 2006 12:50:54 -0800 Received: by 
lucidpixels.com (Postfix, from userid 1001) id F3ABF61012A0; Mon, 20 Nov 2006 15:50:02 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id F040416172EB3 for ; Mon, 20 Nov 2006 15:50:02 -0500 (EST) Date: Mon, 20 Nov 2006 15:50:02 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: xfs@oss.sgi.com Subject: Re: XFS CORRUPTION 2.6.17.13? In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 9708 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs Content-Length: 2484 Lines: 70 I meant 2.6.17.6 w/patch -> 2.6.17.13 On Mon, 20 Nov 2006, Justin Piszcz wrote: > Anyone know what could cause this? > Is the last good kernel to use 2.6.17.6 w/XFS bugfix patch? > > Was a new bug introduced from 2.6.16.6 -> 2.6.17.13? > > Nov 20 13:16:58 box [4299533.469000] 0x0: 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > 00 00 00 > Nov 20 13:16:58 box [4299533.469000] Filesystem "hda2": XFS internal error > xfs_da_do_buf(2) at line 2212 of file fs/xfs/xfs_da_btree.c. 
Caller > 0xc01ffcad > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box xfs_corruption_error+0xf2/0x11a > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box kmem_zone_alloc+0x60/0xdd > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_buf_make+0xf7/0x14c > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box xfs_da_do_buf+0x935/0x98d > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box __alloc_pages+0x53/0x2d6 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box xfs_da_node_lookup_int+0xd0/0x399 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_node_lookup_int+0xd0/0x399 > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box xfs_dir2_node_lookup+0x3f/0xb9 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_dir2_lookup+0x137/0x139 > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.470000] > Nov 20 13:16:58 box __alloc_pages+0x53/0x2d6 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_dir_lookup_int+0x40/0x125 > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.470000] > Nov 20 13:16:58 box xfs_lookup+0x5f/0x88 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_vn_lookup+0x4f/0x93 > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.470000] > Nov 20 13:16:58 box do_lookup+0x12d/0x15f > Nov 20 13:16:58 box > Nov 20 13:16:58 box > > From owner-xfs@oss.sgi.com Mon Nov 20 20:03:10 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 20 Nov 2006 20:03:17 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com 
(8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAL439aG007613 for ; Mon, 20 Nov 2006 20:03:10 -0800 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) by sandeen.net (Postfix) with ESMTP id 8944E18E21536; Mon, 20 Nov 2006 22:02:21 -0600 (CST) Message-ID: <45627A4D.3020502@sandeen.net> Date: Mon, 20 Nov 2006 22:02:21 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Eric Sandeen CC: xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> In-Reply-To: <45612621.5010404@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9710 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1879 Lines: 65 Eric Sandeen wrote: > Eric Sandeen wrote: > >> ugh. it's broken on x86 too, so it's not just the alignment/padding, >> >> although that should be fixed for cross-arch mounts. >> >> -Eric >> >> > here's a testcase to corrupt it FWIW. > > Ok, with expert collaboration from Russell, Barry, Tim, Nathan, David, et al, how about this: For btree dirs, we need a different calculation for the space used in di_u, to set the minimum threshold for the fork offset... This fixes my testcase, but as Tim points out -now- we need to compact the btree ptrs, if we return (and use) an offset < current forkoff... whee.... 
-Eric

Index: linux-2.6.18/fs/xfs.orig/xfs_attr_leaf.c
===================================================================
--- linux-2.6.18.orig/fs/xfs.orig/xfs_attr_leaf.c
+++ linux-2.6.18/fs/xfs.orig/xfs_attr_leaf.c
@@ -116,6 +116,7 @@ xfs_attr_shortform_bytesfit(xfs_inode_t
 	int minforkoff;	/* lower limit on valid forkoff locations */
 	int maxforkoff;	/* upper limit on valid forkoff locations */
 	xfs_mount_t *mp = dp->i_mount;
+	int dsize = 0;

 	offset = (XFS_LITINO(mp) - bytes) >> 3; /* rounded down */

@@ -134,8 +135,21 @@ xfs_attr_shortform_bytesfit(xfs_inode_t
 		return 0;
 	}

+	switch (dp->i_d.di_format) {
+	case XFS_DINODE_FMT_LOCAL:
+	case XFS_DINODE_FMT_EXTENTS:
+		dsize = dp->i_df.if_bytes;
+		break;
+	case XFS_DINODE_FMT_BTREE:
+		dsize = XFS_BMDR_SPACE_CALC(
+			XFS_BMAP_BROOT_NUMRECS(dp->i_df.if_broot));
+		break;
+	default:
+		/* should bail, unknown format, .... */
+	}
+
 	/* data fork btree root can have at least this many key/ptr pairs */
-	minforkoff = MAX(dp->i_df.if_bytes, XFS_BMDR_SPACE_CALC(MINDBTPTRS));
+	minforkoff = MAX(dsize, XFS_BMDR_SPACE_CALC(MINDBTPTRS));
 	minforkoff = roundup(minforkoff, 8) >> 3;

 	/* attr fork btree root can have at least this many key/ptr pairs */

From owner-xfs@oss.sgi.com Mon Nov 20 22:19:32 2006
Received: with ECARTIS (v1.0.0; list xfs); Mon, 20 Nov 2006 22:19:39 -0800 (PST)
Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAL6JTaG030564 for ; Mon, 20 Nov 2006 22:19:31 -0800
Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA25229; Tue, 21 Nov 2006 17:18:38 +1100
Message-ID: <45629AD8.8000800@sgi.com>
Date: Tue, 21 Nov 2006 17:21:12 +1100
From: Vlad Apostolov
User-Agent: Thunderbird 1.5.0.8 (X11/20061025)
MIME-Version: 1.0
To: sgi.bugs.xfs@engr.sgi.com
CC: linux-xfs@oss.sgi.com
Subject: TAKE 956783 - xfs_dm_getall_dmattr() doesn't check if the
user buffer is at valid address Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9713 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 613 Lines: 18 No EFAULT error when dm_getall_dmattr() called with an invalid user buffer address. Date: Tue Nov 21 17:14:34 AEDT 2006 Workarea: soarer.melbourne.sgi.com:/home/vapo/isms/linux-xfs-dmapi Inspected by: donaldd Author: vapo The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:27510a fs/xfs/dmapi/xfs_dm.c - 1.28 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/dmapi/xfs_dm.c.diff?r1=text&tr1=1.28&r2=text&tr2=1.27&f=h - pv 956783, rv donaldd - Check user buffer address in dm_getall_dmattr() for EFAULT error From owner-xfs@oss.sgi.com Tue Nov 21 00:40:31 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 00:40:40 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAL8eSaG024965 for ; Tue, 21 Nov 2006 00:40:30 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id TAA28085; Tue, 21 Nov 2006 19:39:34 +1100 Date: Tue, 21 Nov 2006 19:41:48 +1100 From: Timothy Shimmin To: torvalds@osdl.org cc: akpm@osdl.org, xfs@oss.sgi.com Subject: XFS Update for 2.6.19 Message-ID: <72BEC048EB6E8F7B2893FAAF@timothy-shimmins-power-mac-g5.local> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9714 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: 
tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1101 Lines: 46 Hi Linus, A couple of small XFS fixes for some apparent bugs. Thanks. Please pull from: git://oss.sgi.com:8090/xfs/xfs-2.6 This will update the following files: fs/xfs/xfs_bmap.c | 2 ++ fs/xfs/xfs_inode.c | 2 +- 2 files changed, 3 insertions(+), 1 deletions(-) through these commits: commit d2133717d5f994cca970b5aeb9d4664feeb92ff4 Author: Lachlan McIlroy Date: Tue Nov 21 18:55:16 2006 +1100 [XFS] Fix uninitialized br_state and br_startoff in xfs_bmap_add_extent_delay_real() SGI-PV: 957008 SGI-Modid: xfs-linux-melb:xfs-kern:27457a Signed-off-by: Lachlan McIlroy Signed-off-by: Shailendra Tripathi Signed-off-by: Tim Shimmin commit e5ffd2bb62c3f2c0d9f34e0d16fab6e2c8b056fb Author: David Chinner Date: Tue Nov 21 18:55:33 2006 +1100 [XFS] Stale the correct inode when freeing clusters. SGI-PV: 958376 SGI-Modid: xfs-linux-melb:xfs-kern:27503a Signed-off-by: David Chinner Signed-off-by: Tim Shimmin --Tim From owner-xfs@oss.sgi.com Tue Nov 21 01:54:42 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 01:54:50 -0800 (PST) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.170]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAL9seaG003973 for ; Tue, 21 Nov 2006 01:54:41 -0800 Received: by ug-out-1314.google.com with SMTP id q2so1249402uge for ; Tue, 21 Nov 2006 01:53:52 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:from:to:subject:date:user-agent:mime-version:content-disposition:message-id:cc:content-type:content-transfer-encoding; b=mP350jguUwomDw1A76rqWEqga4ptWURuB3wufbRPdFGy2C+p20kb/cZDL4GBmeel6lDAL+2sm+IhZ+Jmi7iSUwcsXew67T1AcuOEnfUCClfbjqsV8BzrTENz4DozvIf5RAd556Ta5ORrsQNMLbK/XK1jR9tsFUvDD8i0G190jag= Received: by 10.67.121.15 with SMTP id y15mr7965084ugm.1164101272646; Tue, 21 Nov 2006 01:27:52 -0800 (PST) Received: from homer.cohaesio.com ( [212.97.128.136]) by mx.google.com with ESMTP id 
o1sm8990897uge.2006.11.21.01.27.50; Tue, 21 Nov 2006 01:27:50 -0800 (PST)
From: Jesper Juhl
To: LKML
Subject: 2.6.19-rc6 : Spontaneous reboots, stack overflows - seems to implicate xfs, scsi, networking, SMP
Date: Tue, 21 Nov 2006 10:27:41 +0100
User-Agent: KMail/1.9.5
MIME-Version: 1.0
Content-Disposition: inline
Message-Id: <200611211027.41971.jesper.juhl@gmail.com>
Cc: xfs@oss.sgi.com, xfs-masters@oss.sgi.com, netdev@vger.kernel.org, linux-scsi@vger.kernel.org, Jesper Juhl
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-archive-position: 9715
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: jesper.juhl@gmail.com
Precedence: bulk
X-list: xfs
Content-Length: 52515
Lines: 1229

Hi,

I have a server that has long suffered from spontaneous reboots and random crashes. The problems seem to be partly SMP related, since the machine is rock solid with a UP version of 2.6.11.11, but the same kernel compiled for SMP has issues.

The server initially had 1 Intel Xeon CPU with HT and was very recently upgraded with an additional one (in the blind hope that the issues would be fixed). The kernel *seems* to die faster with 2 CPUs and an SMP kernel than it previously did with just one (HT) CPU and an SMP kernel.

I've been trying newer kernels, such as 2.6.17.x, 2.6.18.x, 2.6.19-rc* (all SMP), hoping that the problem(s) would be fixed, but that does not seem to be the case.

Recently I've been using netconsole and have lots of debug options enabled in the hope that I could capture some relevant info. Unfortunately nothing ever really made it to the remote log, except one little incomplete bit I got the other day (with 2.6.19-rc6):

do_IRQ: stack overflow: 492

That is all that made it to the log, but it does indicate that the problem might be stack-usage related.
Since the kernel was compiled with 4K stacks, perhaps if it was changed to use 8K stacks it would stay up long enough for a complete dump to reach the logs. But if 8K stacks really did help, it would be nice if the dumps still happened at the same point where they would have with 4K stacks. So I changed STACK_WARN in include/asm/thread_info.h from (THREAD_SIZE/8) to (4608). This way I should get stack traces at the point where the kernel would be in trouble with a 4K stack, but since it's actually using an 8K stack it should survive and let me capture the trace.

I got more than I could ever have hoped for. I still got spontaneous reboots, but this time my remote log server captured tons of stack dumps. I've got far too many to send here (more than 2G) and most of them are identical anyway, so I'll just submit a few representative samples initially.

Most of the traces include XFS functions and some also involve scsi and/or networking. This is the reason I'm submitting this to the XFS & netdev lists in addition to LKML.

All of these traces were collected with 2.6.19-rc6 with the modification mentioned above. Hardware details and software environment info are at the end of the email.
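As a rough illustration of the arithmetic behind the 4608 threshold (a sketch, not kernel code): 4608 is 4096 + 512, that is, the extra 4K that an 8K stack provides plus the THREAD_SIZE/8 warning margin of a 4K stack. The sketch assumes that the number printed after "do_IRQ: stack overflow:" is the count of bytes still free on the kernel stack; the helper names are made up for this example.

```c
#include <stdio.h>

/* Sizes taken from the message: the kernel was rebuilt with 8K stacks and
 * STACK_WARN was raised from THREAD_SIZE/8 to 4608. */
#define STACK_4K       4096L
#define STACK_8K       8192L
#define STACK_WARN_8K  4608L   /* = (STACK_8K - STACK_4K) + STACK_4K / 8 */

/* ASSUMPTION: free_8k is the value printed after "do_IRQ: stack overflow:",
 * interpreted as bytes still free on the 8K stack. */

/* Would the modified 8K kernel print a warning at this level of free space? */
int warns_on_8k(long free_8k)
{
    return free_8k < STACK_WARN_8K;
}

/* Free space the same call chain would have left on a 4K stack.
 * A negative result means the 4K stack would already have been overrun. */
long headroom_on_4k(long free_8k)
{
    return free_8k - (STACK_8K - STACK_4K);
}

/* Print a one-line summary for a reported value (hypothetical helper). */
void report(long free_8k)
{
    long h = headroom_on_4k(free_8k);

    printf("%ld bytes free on 8K -> %ld on 4K (%s)\n",
           free_8k, h, h < 0 ? "would overflow" : "would survive");
}
```

Under that assumption, a report of 4416 would correspond to 320 bytes of headroom on a 4K stack, while a report of 3376 would correspond to an overrun of 720 bytes.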
This is the most often captured trace : do_IRQ: stack overflow: 4416 [] dump_trace+0x1e7/0x1fd [] show_trace_log_lvl+0x1c/0x33 [] show_trace+0x12/0x16 [] dump_stack+0x19/0x1d [] do_IRQ+0xaf/0xd6 [] common_interrupt+0x1a/0x20 [] __do_softirq+0x59/0xd0 [] do_softirq+0x37/0x39 [] irq_exit+0x39/0x3b [] do_IRQ+0x74/0xd6 [] common_interrupt+0x1a/0x20 [] make_request+0x320/0x426 [] generic_make_request+0x14f/0x1b7 [] __map_bio+0x4c/0x93 [] __clone_and_map+0xdb/0x30a [] __split_bio+0xa4/0xc7 [] dm_request+0xa0/0xbf [] generic_make_request+0x14f/0x1b7 [] submit_bio+0x68/0x109 [] _xfs_buf_ioapply+0x1cf/0x28d [] xfs_buf_iorequest+0x29/0x6e [] xlog_bdstrat_cb+0x19/0x41 [] xlog_sync+0x24e/0x457 [] xlog_state_release_iclog+0x75/0xd0 [] xlog_state_sync+0x175/0x269 [] _xfs_log_force+0x7f/0x88 [] xfs_alloc_search_busy+0xdf/0xe1 [] xfs_alloc_get_freelist+0xe7/0xf5 [] xfs_alloc_newroot+0x21/0x34f [] xfs_alloc_insrec+0x3b0/0x3ce [] xfs_alloc_insert+0x5a/0xc3 [] xfs_free_ag_extent+0x57f/0x5f2 [] xfs_alloc_fix_freelist+0x220/0x45c [] xfs_alloc_vextent+0x24e/0x47a [] xfs_bmap_btalloc+0x31f/0x966 [] xfs_bmap_alloc+0x1e/0x29 [] xfs_bmapi+0x1134/0x1545 [] xfs_iomap_write_allocate+0x2bb/0x509 [] xfs_iomap+0x357/0x459 [] xfs_bmap+0x2e/0x35 [] xfs_map_blocks+0x3c/0x70 [] xfs_page_state_convert+0x3cc/0x629 [] xfs_vm_writepage+0x5c/0xd3 [] generic_writepages+0x1b9/0x2d5 [] xfs_vm_writepages+0x24/0x4a [] do_writepages+0x2a/0x46 [] __sync_single_inode+0x5c/0x1de [] __writeback_single_inode+0x85/0x18f [] sync_sb_inodes+0x1b3/0x2b2 [] writeback_inodes+0xb2/0xbe [] background_writeout+0x66/0x9a [] __pdflush+0xcf/0x184 [] pdflush+0x32/0x36 [] kthread+0xa9/0xae [] kernel_thread_helper+0x7/0x10 another very common one is this one : do_IRQ: stack overflow: 4532 [] dump_trace+0x1e7/0x1fd [] show_trace_log_lvl+0x1c/0x33 [] show_trace+0x12/0x16 [] dump_stack+0x19/0x1d [] do_IRQ+0xaf/0xd6 [] common_interrupt+0x1a/0x20 [] xfs_buf_bio_end_io+0xd9/0x11f [] bio_endio+0x55/0x7a [] dec_pending+0x3d/0x6b [] 
clone_endio+0x85/0xb1 [] bio_endio+0x55/0x7a [] __end_that_request_first+0x1df/0x271 [] end_that_request_chunk+0x8/0xa [] scsi_end_request+0x25/0xcb [] scsi_io_completion+0x82/0x301 [] sd_rw_intr+0x76/0x20f [] scsi_finish_command+0x43/0x5e [] scsi_softirq_done+0x70/0xd5 [] blk_done_softirq+0x62/0x6b [] __do_softirq+0xbb/0xd0 [] do_softirq+0x37/0x39 [] irq_exit+0x39/0x3b [] do_IRQ+0x74/0xd6 [] common_interrupt+0x1a/0x20 [] mempool_alloc+0x21/0xce [] bio_alloc_bioset+0x79/0x13f [] clone_bio+0x36/0x7d [] __clone_and_map+0xce/0x30a [] __split_bio+0xa4/0xc7 [] dm_request+0xa0/0xbf [] generic_make_request+0x14f/0x1b7 [] submit_bio+0x68/0x109 [] _xfs_buf_ioapply+0x1cf/0x28d [] xfs_buf_iorequest+0x29/0x6e [] xfs_buf_iostart+0x6d/0x97 [] xfs_buf_read_flags+0x8a/0x8c [] xfs_trans_read_buf+0x153/0x2fc [] xfs_btree_read_bufs+0x6e/0x84 [] xfs_alloc_lookup+0x10a/0x39e [] xfs_alloc_lookup_ge+0x17/0x1a [] xfs_alloc_ag_vextent_near+0x5f/0x957 [] xfs_alloc_ag_vextent+0x104/0x106 [] xfs_alloc_vextent+0x372/0x47a [] xfs_bmap_btalloc+0x31f/0x966 [] xfs_bmap_alloc+0x1e/0x29 [] xfs_bmapi+0x1134/0x1545 [] xfs_iomap_write_allocate+0x2bb/0x509 [] xfs_iomap+0x357/0x459 [] xfs_bmap+0x2e/0x35 [] xfs_map_blocks+0x3c/0x70 [] xfs_page_state_convert+0x3cc/0x629 [] xfs_vm_writepage+0x5c/0xd3 [] generic_writepages+0x1b9/0x2d5 [] xfs_vm_writepages+0x24/0x4a [] do_writepages+0x2a/0x46 [] __sync_single_inode+0x5c/0x1de [] __writeback_single_inode+0x85/0x18f [] sync_sb_inodes+0x1b3/0x2b2 [] writeback_inodes+0xb2/0xbe [] wb_kupdate+0x80/0xe9 [] __pdflush+0xcf/0x184 [] pdflush+0x32/0x36 [] kthread+0xa9/0xae [] kernel_thread_helper+0x7/0x10 This one seems to involve scsi : do_IRQ: stack overflow: 4568 [] dump_trace+0x1e7/0x1fd [] show_trace_log_lvl+0x1c/0x33 [] show_trace+0x12/0x16 [] dump_stack+0x19/0x1d [] do_IRQ+0xaf/0xd6 [] common_interrupt+0x1a/0x20 [] _spin_unlock_irq+0xa/0xb [] blk_run_queue+0x42/0x77 [] scsi_run_queue+0xc9/0xf1 [] scsi_next_command+0x33/0x49 [] scsi_end_request+0xb0/0xcb [] 
scsi_io_completion+0x82/0x301 [] sd_rw_intr+0x76/0x20f [] scsi_finish_command+0x43/0x5e [] scsi_softirq_done+0x70/0xd5 [] blk_done_softirq+0x62/0x6b [] __do_softirq+0xbb/0xd0 [] do_softirq+0x37/0x39 [] irq_exit+0x39/0x3b [] do_IRQ+0x74/0xd6 [] common_interrupt+0x1a/0x20 [] _spin_unlock_irq+0xa/0xb [] generic_make_request+0x14f/0x1b7 [] __map_bio+0x4c/0x93 [] __clone_and_map+0xdb/0x30a [] __split_bio+0xa4/0xc7 [] dm_request+0xa0/0xbf [] generic_make_request+0x14f/0x1b7 [] submit_bio+0x68/0x109 [] _xfs_buf_ioapply+0x1cf/0x28d [] xfs_buf_iorequest+0x29/0x6e [] xfs_buf_iostart+0x6d/0x97 [] xfs_buf_read_flags+0x8a/0x8c [] xfs_trans_read_buf+0x153/0x2fc [] xfs_btree_read_bufs+0x6e/0x84 [] xfs_alloc_lookup+0x10a/0x39e [] xfs_alloc_lookup_eq+0x14/0x17 [] xfs_alloc_fixup_trees+0x252/0x2a9 [] xfs_alloc_ag_vextent_size+0x318/0x405 [] xfs_alloc_ag_vextent+0xe2/0x106 [] xfs_alloc_vextent+0x372/0x47a [] xfs_bmap_btalloc+0x31f/0x966 [] xfs_bmap_alloc+0x1e/0x29 [] xfs_bmapi+0x1134/0x1545 [] xfs_iomap_write_allocate+0x2bb/0x509 [] xfs_iomap+0x357/0x459 [] xfs_bmap+0x2e/0x35 [] xfs_map_blocks+0x3c/0x70 [] xfs_page_state_convert+0x3cc/0x629 [] xfs_vm_writepage+0x5c/0xd3 [] generic_writepages+0x1b9/0x2d5 [] xfs_vm_writepages+0x24/0x4a [] do_writepages+0x2a/0x46 [] __sync_single_inode+0x5c/0x1de [] __writeback_single_inode+0x85/0x18f [] sync_sb_inodes+0x1b3/0x2b2 [] writeback_inodes+0xb2/0xbe [] background_writeout+0x66/0x9a [] __pdflush+0xcf/0x184 [] pdflush+0x32/0x36 [] kthread+0xa9/0xae [] kernel_thread_helper+0x7/0x10 And then there are some where stack space is really low, which would certainly have killed us if running with 4K stacks : First this : do_IRQ: stack overflow: 3376 [] dump_trace+0x1e7/0x1fd [] show_trace_log_lvl+0x1c/0x33 [] show_trace+0x12/0x16 [] dump_stack+0x19/0x1d [] do_IRQ+0xaf/0xd6 [] common_interrupt+0x1a/0x20 [] _spin_unlock_irqrestore+0xd/0x10 [] e1000_xmit_frame+0x269/0x3b1 [] dev_hard_start_xmit+0x5a/0xd3 [] __qdisc_run+0x95/0x1d7 [] 
dev_queue_xmit+0x220/0x285 [] vlan_dev_hwaccel_hard_start_xmit+0x8a/0x92 [] dev_hard_start_xmit+0x5a/0xd3 [] dev_queue_xmit+0x15f/0x285 [] neigh_connected_output+0x93/0xba [] ip_output+0x170/0x250 [] ip_queue_xmit+0x3d8/0x4e1 [] tcp_transmit_skb+0x29e/0x45d [] tcp_send_ack+0xb3/0xf4 [] tcp_send_dupack+0x28/0x7f [] tcp_rcv_established+0x141/0x6c8 [] tcp_v4_do_rcv+0xcb/0xcd [] tcp_v4_rcv+0x673/0x7e3 [] ip_local_deliver+0xf8/0x22d [] ip_rcv+0x244/0x4e4 [] netif_receive_skb+0x1f9/0x26a [] e1000_clean_rx_irq+0x17f/0x4b9 [] e1000_clean+0x66/0xfb [] net_rx_action+0x96/0x174 [] __do_softirq+0xbb/0xd0 [] do_softirq+0x37/0x39 [] irq_exit+0x39/0x3b [] do_IRQ+0x74/0xd6 [] common_interrupt+0x1a/0x20 [] max_io_len+0x15/0x88 [] __clone_and_map+0x44/0x30a [] __split_bio+0xa4/0xc7 [] dm_request+0xa0/0xbf [] generic_make_request+0x14f/0x1b7 [] submit_bio+0x68/0x109 [] _xfs_buf_ioapply+0x1cf/0x28d [] xfs_buf_iorequest+0x29/0x6e [] xfs_buf_iostart+0x6d/0x97 [] xfs_buf_read_flags+0x8a/0x8c [] xfs_trans_read_buf+0x153/0x2fc [] xfs_btree_read_bufs+0x6e/0x84 [] xfs_alloc_lookup+0x10a/0x39e [] xfs_alloc_lookup_ge+0x17/0x1a [] xfs_alloc_ag_vextent_near+0x5f/0x957 [] xfs_alloc_ag_vextent+0x104/0x106 [] xfs_alloc_vextent+0x372/0x47a [] xfs_bmap_btalloc+0x31f/0x966 [] xfs_bmap_alloc+0x1e/0x29 [] xfs_bmapi+0x1134/0x1545 [] xfs_iomap_write_allocate+0x2bb/0x509 [] xfs_iomap+0x357/0x459 [] xfs_bmap+0x2e/0x35 [] xfs_map_blocks+0x3c/0x70 [] xfs_page_state_convert+0x3cc/0x629 [] xfs_vm_writepage+0x5c/0xd3 [] generic_writepages+0x1b9/0x2d5 [] xfs_vm_writepages+0x24/0x4a [] do_writepages+0x2a/0x46 [] __sync_single_inode+0x5c/0x1de [] __writeback_single_inode+0x85/0x18f [] sync_sb_inodes+0x1b3/0x2b2 [] writeback_inodes+0xb2/0xbe [] balance_dirty_pages+0xa6/0x15c [] balance_dirty_pages_ratelimited_nr+0x59/0x5b [] generic_file_buffered_write+0x2ef/0x61f [] xfs_write+0x96f/0xb1c [] xfs_file_aio_write+0x78/0x8a [] do_sync_write+0xc1/0x100 [] vfs_write+0x91/0x137 [] sys_write+0x41/0x6b [] 
syscall_call+0x7/0xb [] 0xb7f6b95e and this : e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang Tx Queue <0> TDH TDT next_to_use next_to_clean buffer_info[next_to_clean] time_stamp next_to_watch jiffies next_to_watch.status <1> do_IRQ: stack overflow: 3836 [] dump_trace+0x1e7/0x1fd [] show_trace_log_lvl+0x1c/0x33 [] show_trace+0x12/0x16 [] dump_stack+0x19/0x1d [] do_IRQ+0xaf/0xd6 [] common_interrupt+0x1a/0x20 [] csum_partial+0xb8/0x120 DWARF2 unwinder stuck at csum_partial+0xb8/0x120 Leftover inexact backtrace: [] __skb_checksum_complete+0x20/0x67 [] nf_ip_checksum+0xe0/0x125 [] udp_error+0x105/0x184 [] ip_conntrack_in+0x7d/0x294 [] nf_iterate+0x62/0x7c [] nf_hook_slow+0x58/0xbf [] ip_rcv+0x40c/0x4e4 [] netif_receive_skb+0x1f9/0x26a [] e1000_clean_rx_irq+0x17f/0x4b9 [] e1000_clean+0x66/0xfb [] net_rx_action+0x96/0x174 [] __do_softirq+0xbb/0xd0 [] do_softirq+0x37/0x39 [] irq_exit+0x39/0x3b [] do_IRQ+0x74/0xd6 [] common_interrupt+0x1a/0x20 [] __clone_and_map+0x44/0x30a [] __split_bio+0xa4/0xc7 [] dm_request+0xa0/0xbf [] generic_make_request+0x14f/0x1b7 [] submit_bio+0x68/0x109 [] _xfs_buf_ioapply+0x1cf/0x28d [] xfs_buf_iorequest+0x29/0x6e [] xfs_buf_iostart+0x6d/0x97 [] xfs_buf_read_flags+0x8a/0x8c [] xfs_trans_read_buf+0x153/0x2fc [] xfs_btree_read_bufs+0x6e/0x84 [] xfs_alloc_lookup+0x10a/0x39e [] xfs_alloc_lookup_ge+0x17/0x1a [] xfs_alloc_ag_vextent_near+0x5f/0x957 [] xfs_alloc_ag_vextent+0x104/0x106 [] xfs_alloc_vextent+0x372/0x47a [] xfs_bmap_btalloc+0x31f/0x966 [] xfs_bmap_alloc+0x1e/0x29 [] xfs_bmapi+0x1134/0x1545 [] xfs_iomap_write_allocate+0x2bb/0x509 [] xfs_iomap+0x357/0x459 [] xfs_bmap+0x2e/0x35 [] xfs_map_blocks+0x3c/0x70 [] xfs_page_state_convert+0x3cc/0x629 [] xfs_vm_writepage+0x5c/0xd3 [] generic_writepages+0x1b9/0x2d5 [] xfs_vm_writepages+0x24/0x4a [] do_writepages+0x2a/0x46 [] __sync_single_inode+0x5c/0x1de [] __writeback_single_inode+0x85/0x18f [] sync_sb_inodes+0x1b3/0x2b2 [] writeback_inodes+0xb2/0xbe [] balance_dirty_pages+0xa6/0x15c [] 
balance_dirty_pages_ratelimited_nr+0x59/0x5b [] generic_file_buffered_write+0x2ef/0x61f [] xfs_write+0x96f/0xb1c [] xfs_file_aio_write+0x78/0x8a [] do_sync_write+0xc1/0x100 [] vfs_write+0x91/0x137 [] sys_write+0x41/0x6b [] syscall_call+0x7/0xb and finally this one : do_IRQ: stack overflow: 3916 [] dump_trace+0x1e7/0x1fd [] show_trace_log_lvl+0x1c/0x33 [] show_trace+0x12/0x16 [] dump_stack+0x19/0x1d [] do_IRQ+0xaf/0xd6 [] common_interrupt+0x1a/0x20 [] tcp_init_tso_segs+0x17/0x4c [] tcp_write_xmit+0x5d/0x266 [] __tcp_push_pending_frames+0x29/0x81 [] tcp_rcv_established+0x208/0x6c8 [] tcp_v4_do_rcv+0xcb/0xcd [] tcp_v4_rcv+0x673/0x7e3 [] ip_local_deliver+0xf8/0x22d [] ip_rcv+0x244/0x4e4 [] netif_receive_skb+0x1f9/0x26a [] e1000_clean_rx_irq+0x17f/0x4b9 [] e1000_clean+0x66/0xfb [] net_rx_action+0x96/0x174 [] __do_softirq+0xbb/0xd0 [] do_softirq+0x37/0x39 [] irq_exit+0x39/0x3b [] do_IRQ+0x74/0xd6 [] common_interrupt+0x1a/0x20 [] max_io_len+0x15/0x88 [] __clone_and_map+0x44/0x30a [] __split_bio+0xa4/0xc7 [] dm_request+0xa0/0xbf [] generic_make_request+0x14f/0x1b7 [] submit_bio+0x68/0x109 [] _xfs_buf_ioapply+0x1cf/0x28d [] xfs_buf_iorequest+0x29/0x6e [] xfs_buf_iostart+0x6d/0x97 [] xfs_buf_read_flags+0x8a/0x8c [] xfs_trans_read_buf+0x153/0x2fc [] xfs_btree_read_bufs+0x6e/0x84 [] xfs_alloc_lookup+0x10a/0x39e [] xfs_alloc_lookup_ge+0x17/0x1a [] xfs_alloc_ag_vextent_near+0x5f/0x957 [] xfs_alloc_ag_vextent+0x104/0x106 [] xfs_alloc_vextent+0x372/0x47a [] xfs_bmap_btalloc+0x31f/0x966 [] xfs_bmap_alloc+0x1e/0x29 [] xfs_bmapi+0x1134/0x1545 [] xfs_iomap_write_allocate+0x2bb/0x509 [] xfs_iomap+0x357/0x459 [] xfs_bmap+0x2e/0x35 [] xfs_map_blocks+0x3c/0x70 [] xfs_page_state_convert+0x3cc/0x629 [] xfs_vm_writepage+0x5c/0xd3 [] generic_writepages+0x1b9/0x2d5 [] xfs_vm_writepages+0x24/0x4a [] do_writepages+0x2a/0x46 [] __sync_single_inode+0x5c/0x1de [] __writeback_single_inode+0x85/0x18f [] sync_sb_inodes+0x1b3/0x2b2 [] writeback_inodes+0xb2/0xbe [] balance_dirty_pages+0xa6/0x15c [] 
balance_dirty_pages_ratelimited_nr+0x59/0x5b [] generic_file_buffered_write+0x2ef/0x61f [] xfs_write+0x96f/0xb1c [] xfs_file_aio_write+0x78/0x8a [] do_sync_write+0xc1/0x100 [] vfs_write+0x91/0x137 [] sys_write+0x41/0x6b [] syscall_call+0x7/0xb [] 0xb7f6b95e And there are lots of other ones as well that differ slightly from the ones above. Some hardware/software details : # scripts/ver_linux If some fields are empty or look unusual you may have an old version. Compare to the current minimal requirements in Documentation/Changes. Linux server.mydomain.net 2.6.19-rc6 #1 SMP Mon Nov 20 14:33:26 CET 2006 i686 GNU/Linux Gnu C 3.3.5 Gnu make 3.80 binutils 2.15 util-linux 2.12p mount 2.12p module-init-tools 3.2-pre1 e2fsprogs 1.37 xfsprogs 2.6.20 nfs-utils 1.0.6 Linux C Library 2.3.2 Dynamic linker (ldd) 2.3.2 Procps 3.2.1 Net-tools 1.60 Console-tools 0.2.3 Sh-utils 5.2.1 udev 056 Modules Loaded sky2 piix ide_core eeprom # lspci -vvx 0000:00:00.0 Host bridge: Intel Corp. Server Memory Controller Hub (rev 0c) Subsystem: Intel Corp.: Unknown device 3439 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- Reset- FastB2B- Capabilities: [50] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable- Address: fee00000 Data: 0000 Capabilities: [64] #10 [0041] 00: 86 80 95 35 47 01 10 00 0c 00 04 06 10 00 01 00 10: 00 00 00 00 00 00 00 00 00 01 03 00 b0 c0 00 00 20: c0 fc e0 fc 01 fa 71 fb 00 00 00 00 00 00 00 00 30: 00 00 00 00 50 00 00 00 00 00 00 00 0a 01 06 00 0000:00:04.0 PCI bridge: Intel Corp. 
Memory Controller Hub PCI Express Port B0 (rev 0c) (prog-if 00 [Normal decode]) Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B- Capabilities: [50] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable- Address: fee00000 Data: 0000 Capabilities: [64] #10 [0041] 00: 86 80 97 35 44 01 10 00 0c 00 04 06 10 00 01 00 10: 00 00 00 00 00 00 00 00 00 04 04 00 f0 00 00 20 20: f0 ff 00 00 f1 ff 01 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 50 00 00 00 00 00 00 00 0a 01 06 00 0000:00:05.0 PCI bridge: Intel Corp. Memory Controller Hub PCI Express Port B1 (rev 0c) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B- Capabilities: [50] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable- Address: fee00000 Data: 0000 Capabilities: [64] #10 [0041] 00: 86 80 98 35 47 01 18 00 0c 00 04 06 10 00 01 00 10: 00 00 00 00 00 00 00 00 00 05 05 00 d0 d0 00 00 20: f0 fc f0 fc f1 ff 01 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 50 00 00 00 00 00 00 00 0a 01 07 00 0000:00:06.0 PCI bridge: Intel Corp. 
Memory Controller Hub PCI Express Port C0 (rev 0c) (prog-if 00 [Normal decode]) Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- Reset- FastB2B- Capabilities: [50] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1 Enable- Address: fee00000 Data: 0000 Capabilities: [64] #10 [0041] 00: 86 80 99 35 44 01 10 00 0c 00 04 06 10 00 01 00 10: 00 00 00 00 00 00 00 00 00 06 06 00 f0 00 00 20 20: f0 ff 00 00 f1 ff 01 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 50 00 00 00 00 00 00 00 0a 01 06 00 0000:00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #1 (rev 02) (prog-if 00 [UHCI]) Subsystem: Intel Corp.: Unknown device 3439 Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- Reset- FastB2B- 00: 86 80 4e 24 47 01 80 00 c2 00 04 06 00 00 01 00 10: 00 00 00 00 00 00 00 00 00 07 07 20 e0 e0 80 02 20: 00 fd b0 fe 80 fb f0 fb 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0b 00 0000:00:1f.0 ISA bridge: Intel Corp. 
82801EB/ER (ICH5/ICH5R) LPC Bridge (rev 02) Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- Region 1: I/O ports at Region 2: I/O ports at Region 3: I/O ports at Region 4: I/O ports at fc00 [size=16] Region 5: Memory at 88000000 (32-bit, non-prefetchable) [size=1K] 00: 86 80 db 24 07 00 88 02 02 8a 01 01 00 00 00 00 10: 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 20: 01 fc 00 00 00 00 00 88 00 00 00 00 86 80 39 34 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 0000:00:1f.2 IDE interface: Intel Corp. 82801EB (ICH5) Serial ATA 150 Storage Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO]) Subsystem: Intel Corp.: Unknown device 3460 Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- Status: Cap- 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR- TAbort- SERR- TAbort- SERR- Reset- FastB2B- Capabilities: [44] #10 [0071] Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Address: 0000000000000000 Data: 0000 Capabilities: [6c] Power Management version 2 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [d8] 00: 86 80 29 03 47 01 10 00 09 00 04 06 10 00 81 00 10: 00 00 00 00 00 00 00 00 01 02 02 30 b0 b0 a0 02 20: d0 fc d0 fc 01 fa f1 fa 00 00 00 00 00 00 00 00 30: 00 00 00 00 44 00 00 00 00 00 00 00 00 00 07 00 0000:01:00.1 PIC: Intel Corp. 
PCI Bridge Hub I/OxAPIC Interrupt Controller A (rev 09) (prog-if 20 [IO(X)-APIC]) Subsystem: Intel Corp.: Unknown device 3439 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- Reset- FastB2B- Capabilities: [44] #10 [0071] Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Address: 0000000000000000 Data: 0000 Capabilities: [6c] Power Management version 2 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=0 PME- Capabilities: [d8] 00: 86 80 2a 03 47 01 10 00 09 00 04 06 10 00 81 00 10: 00 00 00 00 00 00 00 00 01 03 03 30 c0 c0 a0 02 20: e0 fc e0 fc 01 fb 71 fb 00 00 00 00 00 00 00 00 30: 00 00 00 00 44 00 00 00 00 00 00 00 00 00 07 00 0000:01:00.3 PIC: Intel Corp. PCI Bridge Hub I/OxAPIC Interrupt Controller B (rev 09) (prog-if 20 [IO(X)-APIC]) Subsystem: Intel Corp.: Unknown device 3439 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- TAbort- SERR- PS. Please keep me on Cc: when replying. 
From owner-xfs@oss.sgi.com Tue Nov 21 11:42:47 2006
Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 11:42:59 -0800 (PST)
Received: from mail.g-house.de (ns2.g-housing.de [81.169.133.75]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kALJgkaG005295 for ; Tue, 21 Nov 2006 11:42:47 -0800
Received: from [82.41.152.154] (helo=82-41-152-154.cable.ubr01.linl.blueyonder.co.uk) by mail.g-house.de with esmtpsa (TLS-1.0:DHE_RSA_AES_256_CBC_SHA:32) (Exim 4.50) id 1Gmb09-0000L9-5A; Tue, 21 Nov 2006 20:09:49 +0100
Date: Tue, 21 Nov 2006 19:09:41 +0000 (GMT)
From: Christian Kujau
X-X-Sender: evil@sheep.housecafe.de
To: ext3-users@redhat.com
cc: jfs-discussion@lists.sourceforge.net, linux-crypto@nl.linux.org, reiserfs-list@namesys.com, xfs@oss.sgi.com
Subject: 2.6.19-rc5-git4 benchmarks
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-archive-position: 9718
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: lists@nerdbynature.de
Precedence: bulk
X-list: xfs
Content-Length: 625
Lines: 21

Apologies for the wide alias, but as it may interest several fs groups, here it is:

In the everlasting search for the best fs for my shiny new disks, I was interested in some numbers. Here are the results:

http://nerdbynature.de/bench/amd64/2.6.19-rc5-git4/test-3/dm-crypt-3.html

details: http://nerdbynature.de/wp/?cat=4

(In short: ext3 is pretty fast in all operations. Then again, the numbers suggest that sometimes a crypto-fs is faster than one without crypto, e.g. 'ext3_no-cipher' vs. 'ext3_aes-cbc-essiv:md5'... that's strange, no?)

Thanks,
Christian.
-- BOFH excuse #11: magnetic interference from money/credit cards From owner-xfs@oss.sgi.com Tue Nov 21 13:26:42 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 13:26:49 -0800 (PST) Received: from mail.wrs.com (mail.windriver.com [147.11.1.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kALLQaaG028711 for ; Tue, 21 Nov 2006 13:26:36 -0800 Received: from ALA-MAIL03.corp.ad.wrs.com (ala-mail03 [147.11.57.144]) by mail.wrs.com (8.13.6/8.13.3) with ESMTP id kALHGL6M025750 for ; Tue, 21 Nov 2006 09:16:21 -0800 (PST) Received: from ala-mail02.corp.ad.wrs.com ([147.11.57.56]) by ALA-MAIL03.corp.ad.wrs.com with Microsoft SMTPSVC(6.0.3790.1830); Tue, 21 Nov 2006 09:16:20 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 MIME-Version: 1.0 Subject: XFS sync hang Date: Tue, 21 Nov 2006 09:16:19 -0800 Message-ID: <37B62E0F71C9E14B9859FADB1FC3E3E1077D84@ala-mail02.corp.ad.wrs.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: XFS sync hang Thread-Index: AccNkMKB823/FuHoQgaBfDKAx5kxHg== From: "Kottaridis, Chris" To: X-OriginalArrivalTime: 21 Nov 2006 17:16:20.0599 (UTC) FILETIME=[C346B070:01C70D90] Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit X-archive-position: 9722 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: chris.kottaridis@windriver.com Precedence: bulk X-list: xfs Content-Length: 3623 Lines: 90 I have a system based off of 2.6.10 with XFS on top of LVM2 on top of RAID 1 of SCSI disks. The sync commands hang it looks like waiting for a lock. One of them seems to be trying to lock xfs_buf at least two others are waiting on a super lock. 
I see this in the log files: One of the sync commands is waiting for the xfs_buf lock: Nov 13 19:13:34 typhoon-base-unit0 kernel: sync D C3370BF0 0 8602 1 15685 (NOTLB) Nov 13 19:13:34 typhoon-base-unit0 kernel: db40de44 00000046 e94e2c70 c3370bf0 c3245ee0 f7eb9680 00000000 f7ebe000 Nov 13 19:13:34 typhoon-base-unit0 kernel: f8d44c77 c02c57f1 00000000 00000000 00000000 c3245060 00000002 00000732 Nov 13 19:13:34 typhoon-base-unit0 kernel: ee94e205 00000152 c3370bf0 e94e2c70 e94e2de4 0000219a c3370bf0 00000002 Nov 13 19:13:34 typhoon-base-unit0 kernel: Call Trace: Nov 13 19:13:34 typhoon-base-unit0 kernel: [] __down+0x76/0xde Nov 13 19:13:34 typhoon-base-unit0 kernel: [] __down_failed+0xa/0x10 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] .text.lock.xfs_buf+0x4b/0x51 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] xfs_bwrite+0x9a/0xe7 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] xfs_syncsub+0x148/0x34f Nov 13 19:13:34 typhoon-base-unit0 kernel: [] xfs_sync+0x2a/0x2c Nov 13 19:13:34 typhoon-base-unit0 kernel: [] linvfs_sync_super+0x41/0xf5 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] sync_filesystems+0xe2/0xef Nov 13 19:13:34 typhoon-base-unit0 kernel: [] do_sync+0x4f/0x83 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] sys_sync+0x12/0x16 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] no_dpa_vsyscall_enter+0x8/0x1b There are two other sync commands waiting on the super lock, which I assume the above sync has: Nov 13 19:13:34 typhoon-base-unit0 kernel: sync D C04CAB60 0 15093 24660 (NOTLB) Nov 13 19:13:34 typhoon-base-unit0 kernel: da407f38 00000046 f024f250 c04cab60 c3235ee0 0004037f da407f54 c0145da1 Nov 13 19:13:34 typhoon-base-unit0 kernel: f8d2cbb7 f29e2e10 da407f04 00000000 00000000 c3235060 00000000 00000c17 Nov 13 19:13:34 typhoon-base-unit0 kernel: c1b08a89 0000015b c04cab60 f024f250 f024f3c4 00003af5 c04cab60 00000002 Nov 13 19:13:34 typhoon-base-unit0 kernel: Call Trace: Nov 13 19:13:34 typhoon-base-unit0 kernel: [] __down+0x76/0xde Nov 13 19:13:34 
typhoon-base-unit0 kernel: [] __down_failed+0xa/0x10 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] .text.lock.super+0xad/0x192 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] do_sync+0x47/0x83 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] sys_sync+0x12/0x16 Nov 13 19:13:34 typhoon-base-unit0 kernel: [] no_dpa_vsyscall_enter+0x8/0x1b I couldn't actually find a .text.lock.xfs_buf routine anywhere until I compiled with -save-temps and found it in an assembly file generated at compile time. I assume this is in the pagebuf_lock routine. I assume some other process has the lock on the pagebuf and it either died without unlocking or is hung up in some way that isn't obvious from the logs. I see a PAGEBUF_LOCK_TRACKING option that will add a field to the pb struct to try to track who has the lock. It gets enabled by setting CONFIG_XFS_DEBUG, but I've heard that enabling CONFIG_XFS_DEBUG has its own problems. So, I thought I'd just try setting the PAGEBUF_LOCK_TRACKING macro and see if I can determine the process that is not freeing the lock. Any advice or comments are appreciated. 
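The lock-owner tracking idea described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: it is not the kernel's PAGEBUF_LOCK_TRACKING code, the names tracked_lock, tracked_trylock and tracked_unlock are hypothetical, and the model is single-threaded, so it shows the bookkeeping (who took the lock, and where) rather than real locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a lock that remembers its holder. */
struct tracked_lock {
    int held;           /* 1 while the lock is taken */
    int owner;          /* id (e.g. pid) of the holder, -1 when free */
    const char *where;  /* call site that took the lock, NULL when free */
};

/*
 * Try to take the lock. Returns 1 on success and records the caller;
 * returns 0 if the lock is already held, leaving owner/where intact so
 * the waiter can report who is blocking it.
 */
int tracked_trylock(struct tracked_lock *l, int who, const char *where)
{
    if (l->held)
        return 0;
    l->held = 1;
    l->owner = who;
    l->where = where;
    return 1;
}

/* Release the lock; only the recorded owner may do so. */
void tracked_unlock(struct tracked_lock *l, int who)
{
    assert(l->held && l->owner == who);
    l->held = 0;
    l->owner = -1;
    l->where = NULL;
}
```

With bookkeeping of this kind, a task that fails to get the lock can print the recorded owner and call site, which is the information missing from the traces above.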
Thanks Chris Kottaridis Senior Engineer Wind River Systems 719-522-9786 [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Tue Nov 21 16:43:16 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 16:43:24 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAM0hDaG027796 for ; Tue, 21 Nov 2006 16:43:15 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA22789; Wed, 22 Nov 2006 11:42:22 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAM0gK7Y42604385; Wed, 22 Nov 2006 11:42:21 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAM0gHGW42651545; Wed, 22 Nov 2006 11:42:17 +1100 (AEDT) Date: Wed, 22 Nov 2006 11:42:17 +1100 From: David Chinner To: Russell Cattelan Cc: David Chinner , Tim Shimmin , Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH 1/2] Make stuff static Message-ID: <20061122004216.GT11034@melbourne.sgi.com> References: <20060929032856.8DA9C18001A5E@sandeen.net> <23F15D6AE8566A54B81188AC@timothy-shimmins-power-mac-g5.local> <45338DDE.8020903@sandeen.net> <4533FAEA.2080500@sandeen.net> <20061016232250.GM11034@melbourne.sgi.com> <1161042943.5723.117.camel@xenon.msp.redhat.com> <20061017005038.GN11034@melbourne.sgi.com> <20061017215706.GI8394166@melbourne.sgi.com> <1161125131.5723.158.camel@xenon.msp.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1161125131.5723.158.camel@xenon.msp.redhat.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 9726 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 2514 
Lines: 83 On Tue, Oct 17, 2006 at 05:45:31PM -0500, Russell Cattelan wrote: > On Wed, 2006-10-18 at 07:57 +1000, David Chinner wrote: > > On Tue, Oct 17, 2006 at 05:13:01PM +1000, Tim Shimmin wrote: > > > I thought that for debug, we could stop them from being inline > > > for easier debugging. We could have a STATIC_INLINE :-) > > > > We could, but I don't think it gains us anything. > I agree with Tim on this. > when I see STATIC in the code it's generally assumed to > be a way to toggle of static on/off. Adding static inline > to the #define STATIC starts to overload the the macro > and creates an obfuscation that isn't immediately obvious. > STATIC_INLINE should be fairly obvious. Ok, so I've had time to look at this again. Here's the definitions of STATIC and STATIC_INLINE for debug and nondebug from the patch (whitespace damaged): Index: 2.6.x-xfs-new/fs/xfs/support/debug.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/support/debug.h 2006-11-22 10:54:37.089984780 +1100 +++ 2.6.x-xfs-new/fs/xfs/support/debug.h 2006-11-22 11:30:20.433326839 +1100 @@ -38,13 +38,37 @@ extern void assfail(char *expr, char *f, #ifndef DEBUG # define ASSERT(expr) ((void)0) -#else + +#ifndef STATIC +# define STATIC static noinline +#endif + +#ifndef STATIC_INLINE +# define STATIC_INLINE static inline +#endif + +#else /* DEBUG */ + # define ASSERT(expr) ASSERT_ALWAYS(expr) extern unsigned long random(void); -#endif #ifndef STATIC -# define STATIC static +# define STATIC noinline +#endif + +/* + * We stop inlining of inline functions in debug mode. + * Unfortunately, this means static inline in header files + * get multiple definitions, so they need to remain static. + * This then gives tonnes of warnings about unused but defined + * functions, so we need to add the unused attribute to prevent + * these spurious warnings. 
+ */ +#ifndef STATIC_INLINE +# define STATIC_INLINE static __attribute__ ((unused)) noinline #endif +#endif /* DEBUG */ + + #endif /* __XFS_SUPPORT_DEBUG_H__ */ ------ Is this acceptable to everyone? FWIW, there is one other thing that this conversion causes problems with, and that's variable definitions. i.e. we can't use STATIC on them any more because of the "noinline" attribute it has. Do we care about this and if so, any suggestions on how to keep this functionality (a different STATIC_xxx define for structures)? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Nov 21 17:23:25 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 17:23:31 -0800 (PST) Received: from slurp.thebarn.com (cattelan-host202.dsl.visi.com [208.42.117.202]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAM1NNaG001397 for ; Tue, 21 Nov 2006 17:23:25 -0800 Received: from [127.0.0.1] (lupo.thebarn.com [10.0.0.10]) (authenticated bits=0) by slurp.thebarn.com (8.13.8/8.13.8) with ESMTP id kAM12GWC005194; Tue, 21 Nov 2006 19:02:17 -0600 (CST) (envelope-from cattelan@thebarn.com) Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches From: Russell Cattelan To: Eric Sandeen Cc: xfs@oss.sgi.com In-Reply-To: <45627A4D.3020502@sandeen.net> References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-16+a/A7x3nE0fpFlpUP4" Date: Tue, 21 Nov 2006 19:02:16 -0600 Message-Id: <1164157336.19915.43.camel@xenon.msp.redhat.com> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1.1-1mdv2007.1 X-archive-position: 9727 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cattelan@thebarn.com Precedence: bulk X-list: xfs Content-Length: 11662 Lines: 244 --=-16+a/A7x3nE0fpFlpUP4 
Content-Type: multipart/mixed; boundary="=-q0sPS2kuB9bq3XAmCC+M" --=-q0sPS2kuB9bq3XAmCC+M Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Mon, 2006-11-20 at 22:02 -0600, Eric Sandeen wrote: > Eric Sandeen wrote: > > Eric Sandeen wrote: > >> ugh. it's broken on x86 too, so it's not just the alignment/padding, > >> although that should be fixed for cross-arch mounts. > >> -Eric > > here's a testcase to corrupt it FWIW. > Ok, with expert collaboration from Russell, Barry, Tim, > Nathan, David, et al, how about this: > > For btree dirs, we need a different calculation for the space > used in di_u, to set the minimum threshold for the fork offset... > > This fixes my testcase, but as Tim points out -now- we need to compact > the btree ptrs, if we return (and use) an offset < current forkoff... > > whee.... > > -Eric It turns out this only fixes one of the problems; it is still quite easy to corrupt inodes with attr2. The following patch is a short-term fix that addresses the problem of forkoff moving without re-factoring the inode's btree root block. Once the inode has been flipped to BTREE for the data space, the forkoff is fixed at that size. Currently, because of the way attr1 worked (fixed-size forkoff), the code does not handle resizing the root btree node when the attr portion of the inode changes size. The optimal solution is to adjust the data portion of the inode root btree block down if space exists. One easy fix, for a problem that resulted in every attr add being pushed out of line, is to add the header size to the initial split of the inode, so at least the first attr add should go inline now. That should be a win for the big attr user right now, SELinux. Including the 2 test scripts that have been used. 
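The clamping of the fork offset described above can be sketched as a small user-space model. This is a simplified illustration, not the attached patch: struct inode_model, model_forkoff and all field names are hypothetical, the sizes used with it are made-up example values, and for a btree-format data fork data_bytes simply stands for the space already reserved for the data fork. Fork offsets are expressed in 8-byte units.

```c
#include <assert.h>

/* Hypothetical, simplified view of the inode literal area (sizes in bytes). */
struct inode_model {
    int literal_area;   /* bytes shared by the data and attr forks */
    int data_bytes;     /* bytes used by (or reserved for) the data fork */
    int min_data_root;  /* smallest allowed data-fork btree root */
    int min_attr_root;  /* smallest allowed attr-fork btree root */
    int data_is_btree;  /* non-zero once the data fork is in btree format */
};

static int roundup8(int n)
{
    return (n + 7) & ~7;
}

/*
 * Return the fork offset (in 8-byte units) at which an attr fork of
 * attr_bytes could start, or 0 if the attribute does not fit inline.
 */
int model_forkoff(const struct inode_model *ip, int attr_bytes)
{
    int offset, dsize, minforkoff, maxforkoff, result = 0;

    assert(attr_bytes >= 0 && attr_bytes <= ip->literal_area);

    /* Where the attr fork would start if it got exactly attr_bytes. */
    offset = (ip->literal_area - attr_bytes) >> 3;   /* rounded down */

    /* The data fork must keep at least this much space. */
    dsize = ip->data_bytes > ip->min_data_root ? ip->data_bytes
                                               : ip->min_data_root;
    minforkoff = roundup8(dsize) >> 3;

    /* The attr fork must keep room for a minimal btree root. */
    maxforkoff = (ip->literal_area - ip->min_attr_root) >> 3;

    if (offset >= minforkoff && offset < maxforkoff)
        result = offset;
    else if (offset >= maxforkoff)
        result = maxforkoff;

    /* Short-term fix idea: never move the offset once the data fork is a btree. */
    if (ip->data_is_btree && result)
        result = ip->data_bytes >> 3;

    return result;
}
```

The last step is the point of the short-term fix: for a btree data fork the offset stays pinned at the space already reserved, so the btree root never has to be resized.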
--=20 Russell Cattelan --=-q0sPS2kuB9bq3XAmCC+M Content-Disposition: attachment; filename=attr2_patch Content-Type: text/x-patch; name=attr2_patch; charset=utf-8 Content-Transfer-Encoding: base64 SW5kZXg6IHdvcmtfZ2ZzL2ZzL3hmcy94ZnNfYXR0ci5jDQo9PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09DQotLS0gd29ya19nZnMub3JpZy9mcy94ZnMveGZzX2F0 dHIuYwkyMDA2LTExLTIxIDE4OjM4OjI3LjU3Mjk0OTMwMyAtMDYwMA0KKysr IHdvcmtfZ2ZzL2ZzL3hmcy94ZnNfYXR0ci5jCTIwMDYtMTEtMjEgMTg6NDQ6 NTEuNjY2MDMzNDIyIC0wNjAwDQpAQCAtMjEwLDggKzIxMCwyMCBAQCB4ZnNf YXR0cl9zZXRfaW50KHhmc19pbm9kZV90ICpkcCwgY29uc3QgDQogCSAqIChp bm9kZSBtdXN0IG5vdCBiZSBsb2NrZWQgd2hlbiB3ZSBjYWxsIHRoaXMgcm91 dGluZSkNCiAJICovDQogCWlmIChYRlNfSUZPUktfUShkcCkgPT0gMCkgew0K LQkJaWYgKChlcnJvciA9IHhmc19ibWFwX2FkZF9hdHRyZm9yayhkcCwgc2l6 ZSwgcnN2ZCkpKQ0KLQkJCXJldHVybihlcnJvcik7DQorCQlpZiAoKGRwLT5p X2QuZGlfYWZvcm1hdCA9PSBYRlNfRElOT0RFX0ZNVF9MT0NBTCkgfHwNCisJ CSAgICAoKGRwLT5pX2QuZGlfYWZvcm1hdCA9PSBYRlNfRElOT0RFX0ZNVF9F WFRFTlRTKSAmJg0KKwkJICAgICAoZHAtPmlfZC5kaV9hbmV4dGVudHMgPT0g MCkpKSB7DQorCQkJLyogeGZzX2JtYXBfYWRkX2F0dHJmb3JrIHdpbGwgc2V0 IHRoZSBmb3Jrb2Zmc2V0IGJhc2VkIG9uDQorCQkJICogdGhlIHNpemUgbmVl ZGVkLCB0aGUgbG9jYWwgYXR0ciBjYXNlIG5lZWRzIHRoZSBzaXplDQorCQkJ ICogYXR0ciBwbHVzIHRoZSBzaXplIG9mIHRoZSBoZHIsIGlmIHRoZSBzaXpl IG9mDQorCQkJICogaGVhZGVyIGlzIG5vdCBhY2NvdW50ZWQgZm9yIGluaXRp YWxseSB0aGUgZm9ya29mZnNldA0KKwkJCSAqIHdvbid0IGFsbG93IGVub3Vn aCBzcGFjZSwgdGhlIGFjdHVhbGx5IGF0dHIgYWRkIHdpbGwNCisJCQkgKiB0 aGVuIGJlIGZvcmNlZCBvdXQgb3V0IGxpbmUgdG8gZXh0ZW50cw0KKwkJCSAq Lw0KKwkJCXNpemUgKz0gc2l6ZW9mKHhmc19hdHRyX3NmX2hkcl90KTsNCisJ CQlpZiAoKGVycm9yID0geGZzX2JtYXBfYWRkX2F0dHJmb3JrKGRwLCBzaXpl LCByc3ZkKSkpDQorCQkJCXJldHVybihlcnJvcik7DQorCQl9DQogCX0NCiAN CiAJLyoNCkluZGV4OiB3b3JrX2dmcy9mcy94ZnMveGZzX2F0dHJfbGVhZi5j DQo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09DQotLS0gd29ya19nZnMub3JpZy9m cy94ZnMveGZzX2F0dHJfbGVhZi5jCTIwMDYtMTEtMjEgMTg6Mzg6MjcuNTcy 
OTQ5MzAzIC0wNjAwDQorKysgd29ya19nZnMvZnMveGZzL3hmc19hdHRyX2xl YWYuYwkyMDA2LTExLTIxIDE4OjQ1OjE2LjA5NDM2MDkxOSAtMDYwMA0KQEAg LTE0OCw0MCArMTQ4LDc2IEBAIGludA0KIHhmc19hdHRyX3Nob3J0Zm9ybV9i eXRlc2ZpdCh4ZnNfaW5vZGVfdCAqZHAsIGludCBieXRlcykNCiB7DQogCWlu dCBvZmZzZXQ7DQotCWludCBtaW5mb3Jrb2ZmOwkvKiBsb3dlciBsaW1pdCBv biB2YWxpZCBmb3Jrb2ZmIGxvY2F0aW9ucyAqLw0KLQlpbnQgbWF4Zm9ya29m ZjsJLyogdXBwZXIgbGltaXQgb24gdmFsaWQgZm9ya29mZiBsb2NhdGlvbnMg Ki8NCisJaW50IG1pbmZvcmtvZmYgPSAwOwkvKiBsb3dlciBsaW1pdCBvbiB2 YWxpZCBmb3Jrb2ZmIGxvY2F0aW9ucyAqLw0KKwlpbnQgbWF4Zm9ya29mZiA9 IDA7CS8qIHVwcGVyIGxpbWl0IG9uIHZhbGlkIGZvcmtvZmYgbG9jYXRpb25z ICovDQogCXhmc19tb3VudF90ICptcCA9IGRwLT5pX21vdW50Ow0KKwlpbnQg cmVzdWx0ID0gMDsNCisJaW50IGRzaXplID0gMDsNCiANCiAJb2Zmc2V0ID0g KFhGU19MSVRJTk8obXApIC0gYnl0ZXMpID4+IDM7IC8qIHJvdW5kZWQgZG93 biAqLw0KIA0KIAlzd2l0Y2ggKGRwLT5pX2QuZGlfZm9ybWF0KSB7DQogCWNh c2UgWEZTX0RJTk9ERV9GTVRfREVWOg0KIAkJbWluZm9ya29mZiA9IHJvdW5k dXAoc2l6ZW9mKHhmc19kZXZfdCksIDgpID4+IDM7DQotCQlyZXR1cm4gKG9m ZnNldCA+PSBtaW5mb3Jrb2ZmKSA/IG1pbmZvcmtvZmYgOiAwOw0KKwkJcmVz dWx0ID0gKG9mZnNldCA+PSBtaW5mb3Jrb2ZmKSA/IG1pbmZvcmtvZmYgOiAw Ow0KKwkJZ290byByZXN1bHQ7DQogCWNhc2UgWEZTX0RJTk9ERV9GTVRfVVVJ RDoNCiAJCW1pbmZvcmtvZmYgPSByb3VuZHVwKHNpemVvZih1dWlkX3QpLCA4 KSA+PiAzOw0KLQkJcmV0dXJuIChvZmZzZXQgPj0gbWluZm9ya29mZikgPyBt aW5mb3Jrb2ZmIDogMDsNCisJCXJlc3VsdCA9IChvZmZzZXQgPj0gbWluZm9y a29mZikgPyBtaW5mb3Jrb2ZmIDogMDsNCisJCWdvdG8gcmVzdWx0Ow0KIAl9 DQogDQogCWlmICghKG1wLT5tX2ZsYWdzICYgWEZTX01PVU5UX0FUVFIyKSkg ew0KLQkJaWYgKGJ5dGVzIDw9IFhGU19JRk9SS19BU0laRShkcCkpDQotCQkJ cmV0dXJuIG1wLT5tX2F0dHJvZmZzZXQgPj4gMzsNCi0JCXJldHVybiAwOw0K KwkJaWYgKGJ5dGVzIDw9IFhGU19JRk9SS19BU0laRShkcCkpIHsNCisJCQly ZXN1bHQgPSBtcC0+bV9hdHRyb2Zmc2V0ID4+IDM7DQorCQkJZ290byByZXN1 bHQ7DQorCQl9DQorCQlyZXN1bHQgPSAwOw0KKwkJZ290byByZXN1bHQ7DQog CX0NCiANCiAJLyogZGF0YSBmb3JrIGJ0cmVlIHJvb3QgY2FuIGhhdmUgYXQg bGVhc3QgdGhpcyBtYW55IGtleS9wdHIgcGFpcnMgKi8NCi0JbWluZm9ya29m ZiA9IE1BWChkcC0+aV9kZi5pZl9ieXRlcywgWEZTX0JNRFJfU1BBQ0VfQ0FM 
QyhNSU5EQlRQVFJTKSk7DQorCXN3aXRjaCAoZHAtPmlfZC5kaV9mb3JtYXQp IHsNCisJY2FzZSBYRlNfRElOT0RFX0ZNVF9MT0NBTDoNCisJY2FzZSBYRlNf RElOT0RFX0ZNVF9FWFRFTlRTOg0KKwkJZHNpemUgPSBkcC0+aV9kZi5pZl9i eXRlczsNCisJCWJyZWFrOw0KKwljYXNlIFhGU19ESU5PREVfRk1UX0JUUkVF Og0KKwkJaWYgKGRwLT5pX2QuZGlfZm9ya29mZikNCisJCQlkc2l6ZSA9IGRw LT5pX2QuZGlfZm9ya29mZiA8PCAzOw0KKwkJZWxzZQ0KKwkJCWRzaXplID0g WEZTX0JNRFJfU1BBQ0VfQ0FMQygNCisJCQkgIFhGU19CTUFQX0JST09UX05V TVJFQ1MoZHAtPmlfZGYuaWZfYnJvb3QpKTsNCisJCWJyZWFrOw0KKwlkZWZh dWx0Og0KKwkJYnJlYWs7DQorCX0NCisNCisJbWluZm9ya29mZiA9IE1BWChk c2l6ZSwgWEZTX0JNRFJfU1BBQ0VfQ0FMQyhNSU5EQlRQVFJTKSk7DQogCW1p bmZvcmtvZmYgPSByb3VuZHVwKG1pbmZvcmtvZmYsIDgpID4+IDM7DQogDQog CS8qIGF0dHIgZm9yayBidHJlZSByb290IGNhbiBoYXZlIGF0IGxlYXN0IHRo aXMgbWFueSBrZXkvcHRyIHBhaXJzICovDQogCW1heGZvcmtvZmYgPSBYRlNf TElUSU5PKG1wKSAtIFhGU19CTURSX1NQQUNFX0NBTEMoTUlOQUJUUFRSUyk7 DQogCW1heGZvcmtvZmYgPSBtYXhmb3Jrb2ZmID4+IDM7CS8qIHJvdW5kZWQg ZG93biAqLw0KIA0KLQlpZiAob2Zmc2V0ID49IG1pbmZvcmtvZmYgJiYgb2Zm c2V0IDwgbWF4Zm9ya29mZikNCi0JCXJldHVybiBvZmZzZXQ7DQotCWlmIChv ZmZzZXQgPj0gbWF4Zm9ya29mZikNCi0JCXJldHVybiBtYXhmb3Jrb2ZmOw0K LQlyZXR1cm4gMDsNCisJaWYgKG9mZnNldCA+PSBtaW5mb3Jrb2ZmICYmIG9m ZnNldCA8IG1heGZvcmtvZmYpIHsNCisJCXJlc3VsdCA9IG9mZnNldDsNCisJ fQ0KKw0KKwlpZiAob2Zmc2V0ID49IG1heGZvcmtvZmYpIHsNCisJCXJlc3Vs dCA9IG1heGZvcmtvZmY7DQorCX0NCisNCisJLyogdGhlIGNhc2Ugb2YgYnRy ZWUgd2UgZG9uJ3Qgd2FudCB0byBtb3ZlIHRoZSBmb3Jrb2ZmDQorCSAqIHNp bmNlIHRoYXQgd291bGQgcmVxdWlyZSBhIHJlYmFsYW5jZSBvZiB0aGUgYnRy ZWUNCisJICogd2hpY2ggaXMgY3VycmVudGx5IG5vdCBpbXBsZW1lbnRlZCBm b3IgYXR0cjINCisJICovDQorDQorCWlmIChkcC0+aV9kLmRpX2Zvcm1hdCA9 PSBYRlNfRElOT0RFX0ZNVF9CVFJFRSAmJiByZXN1bHQpIHsNCisJCXJlc3Vs dCA9IGRzaXplID4+IDM7DQorCX0NCityZXN1bHQ6DQorCXJldHVybiByZXN1 bHQ7DQogfQ0KIA0KIC8qDQpJbmRleDogd29ya19nZnMvZnMveGZzL3hmc19i bWFwLmMNCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0NCi0tLSB3b3JrX2dmcy5v cmlnL2ZzL3hmcy94ZnNfYm1hcC5jCTIwMDYtMTEtMjEgMTg6Mzg6MjcuNTcy 
OTQ5MzAzIC0wNjAwDQorKysgd29ya19nZnMvZnMveGZzL3hmc19ibWFwLmMJ MjAwNi0xMS0yMSAxODo0NTozOC40NDY2NjA0NDUgLTA2MDANCkBAIC0zNjMy LDcgKzM2MzIsMjYgQEAgeGZzX2JtYXBfbG9jYWxfdG9fZXh0ZW50cygNCiAJ CWZsYWdzIHw9IFhGU19JTE9HX0ZFWFQod2hpY2hmb3JrKTsNCiAJfSBlbHNl IHsNCiAJCUFTU0VSVChYRlNfSUZPUktfTkVYVEVOVFMoaXAsIHdoaWNoZm9y aykgPT0gMCk7DQotCQl4ZnNfYm1hcF9mb3Jrb2ZmX3Jlc2V0KGlwLT5pX21v dW50LCBpcCwgd2hpY2hmb3JrKTsNCisNCisJCS8qIENoYW5naW5nIHRoZSBm b3Jrb2ZmIHJlcXVpcmVzIGEgcmViYWxhbmNlDQorCQkgICBvZiB0aGUgZGF0 YSBidHJlZS4NCisJCSAgIE9uY2UgZm9ya29mZiBpcyBzZXQgbGVhdmUgaXQg Zml4ZWQgd2hpbGUNCisJCSAgIGRhdGEgZm9ybWF0IGlzIHN0b3JlZCBpbiBh IGJ0cmVlLg0KKw0KKwkJICAgR3JhbnRlZCB0aGUgaW5vZGUgYnRyZWUgYmxv Y2sgaW4gaW5pdGlhbGx5DQorCQkgICBzZXR1cCB3aXRoIHRoZSBtYXggYXZh bGlibGUgRFNJWkUgc3BhY2Ugc28NCisJCSAgIG1pZ2h0IGJlIGEgc2xpZ2h0 IHdhc3RlIG9mIHNwYWNlIHRvIG5vdA0KKwkJICAgcmViYWxhbmNlLg0KKwkJ ICAgQnV0IHNpbmNlIHRoZSBhdHRyMSBjb2RlIGRpZCBub3QgaGF2ZSBhDQor CQkgICB2YXJpYWJsZSBmb3Jrb2ZmIGFzIGF0dHIyIG5vdyBoYXMgdGhlDQor CQkgICBkZXRlY3Rpb24gcG9pbnRzIGFuZCB0aGUgcmVibGFuYWNlIGNvZGUN CisJCSAgIGl0IG5vdCBpbiBwbGFjZS4NCisJCSAgIFRha2UgdGhlIGVhc3kg d2F5IG91dCBmb3Igbm93DQorCQkqLw0KKw0KKwkJaWYgKGlwLT5pX2QuZGlf Zm9ybWF0ICE9IFhGU19ESU5PREVfRk1UX0JUUkVFKQ0KKwkJCXhmc19ibWFw X2ZvcmtvZmZfcmVzZXQoaXAtPmlfbW91bnQsIGlwLCB3aGljaGZvcmspOw0K Kw0KIAl9DQogCWlmcC0+aWZfZmxhZ3MgJj0gflhGU19JRklOTElORTsNCiAJ aWZwLT5pZl9mbGFncyB8PSBYRlNfSUZFWFRFTlRTOw0K --=-q0sPS2kuB9bq3XAmCC+M Content-Disposition: attachment; filename=r2.sh Content-Type: application/x-shellscript; name=r2.sh Content-Transfer-Encoding: base64 IyEvYmluL3NoCgpmaWxlPWZzZmlsZTMKCnJlbW91bnQoKSB7CiAgICB1bW91 bnQgbW50CiAgICB4ZnNfZGIgLXIgJGZpbGUgLWMgImlub2RlIDEzMSIgLWMg InAgY29yZS5mb3Jrb2ZmIiAtYyAicCB1IiAtYyAicCBhIgogICAgbW91bnQg LW8gbG9vcCAkZmlsZSBtbnQvCn0KCnVtb3VudCBtbnQvCnJtIC1mICRmaWxl Cm1rZnMueGZzIC1kZmlsZSxuYW1lPSRmaWxlLHNpemU9MTAwbSAtaWF0dHI9 Mgptb3VudCAtbyBsb29wICRmaWxlIG1udC8KbWtkaXIgbW50L2RpcgoKc2V0 ZmF0dHIgLW4gdXNlci5yaXR5LnNlbGludXggLXYgdXNlcl9mb286YmxhaF9m 
b286bW50X3doYXQ6MCBtbnQvZGlyLwpmb3IgSSBpbiBgc2VxIDEwMCAxNDAw YDsgZG8gdG91Y2ggbW50L2Rpci9maWxlJEk7IGRvbmUKCnJlbW91bnQKc2V0 ZmF0dHIgLW4gdXNlci5GT08uQkFSIC12IHVzZXJfZm9vOmJsYWhfZm9vOm1u dF93aGF0OjAgbW50L2Rpci8KCmVjaG8gInVubW91bnRpbmciCnVtb3VudCBt bnQvCnhmc19kYiAtciAkZmlsZSAtYyAiaW5vZGUgMTMxIiAtYyAicCBjb3Jl LmZvcmtvZmYiIC1jICJwIHUiIC1jICJwIGEiIAo= --=-q0sPS2kuB9bq3XAmCC+M Content-Disposition: attachment; filename=e2 Content-Type: application/x-shellscript; name=e2 Content-Transfer-Encoding: base64 IyEvYmluL3NoCgpyZW1vdW50KCkgewogICAgdW1vdW50IG1udAogICAgeGZz X2RiIC1yIGZzZmlsZTIgLWMgImlub2RlIDEzMSIgLWMgInAgY29yZS5mb3Jr b2ZmIiAtYyAicCB1IiAtYyAicCBhIgogICAgbW91bnQgLW8gbG9vcCBmc2Zp bGUyIG1udC8KfQoKdW1vdW50IG1udC8Kcm0gLWYgZnNmaWxlMgpta2ZzLnhm cyAtZGZpbGUsbmFtZT1mc2ZpbGUyLHNpemU9MTAwbSAtaWF0dHI9Mgptb3Vu dCAtbyBsb29wIGZzZmlsZTIgbW50Lwpta2RpciBtbnQvZGlyCnNldGZhdHRy IC1uIHVzZXIucml0eS5zZWxpbnV4IC12IHVzZXJfZm9vOmJsYWhfZm9vOm1u dF93aGF0OjAgbW50L2Rpci8KZm9yIEkgaW4gYHNlcSAxMCAyMGA7IGRvIHRv dWNoIG1udC9maWxlJEk7IGRvbmUKZm9yIEkgaW4gYHNlcSAxMDAgNzAwYDsg ZG8gdG91Y2ggbW50L2Rpci9maWxlJEk7IGRvbmUKcmVtb3VudApmb3IgSSBp biBgc2VxIDEwMDAgMTQwMGA7IGRvIHRvdWNoIG1udC9kaXIvZmlsZSRJOyBk b25lCgojcmVtb3VudCAjIHdvcmtzIGlmIHdlIGRvIHJlbW91bnQgaGVyZQpz ZXRmYXR0ciAtbiB1c2VyLnJpdHkuc2VsaW51eCAtdiB1c2VyX2ZvbzpibGFo X2ZvbzptbnRfd2hhdDowIG1udC9kaXIvCgplY2hvICJ1bm1vdW50aW5nIgp1 bW91bnQgbW50Lwp4ZnNfZGIgLXIgZnNmaWxlMiAtYyAiaW5vZGUgMTMxIiAt YyAicCBjb3JlLmZvcmtvZmYiIC1jICJwIHUiIC1jICJwIGEiIAo= --=-q0sPS2kuB9bq3XAmCC+M-- --=-16+a/A7x3nE0fpFlpUP4 Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQBFY6GXNRmM+OaGhBgRAoguAJ9aEGwNw4p6JFNA0MiiHc+FbqivMgCeMEg1 eaCDHFbCVkLirAJIEvJLy7c= =cD7Z -----END PGP SIGNATURE----- --=-16+a/A7x3nE0fpFlpUP4-- From owner-xfs@oss.sgi.com Tue Nov 21 17:23:26 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 17:23:32 -0800 (PST) Received: 
from slurp.thebarn.com (cattelan-host202.dsl.visi.com [208.42.117.202]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAM1NNaI001397 for ; Tue, 21 Nov 2006 17:23:25 -0800 Received: from [127.0.0.1] (lupo.thebarn.com [10.0.0.10]) (authenticated bits=0) by slurp.thebarn.com (8.13.8/8.13.8) with ESMTP id kAM19hdo005392; Tue, 21 Nov 2006 19:09:44 -0600 (CST) (envelope-from cattelan@thebarn.com) Subject: Re: [PATCH 1/2] Make stuff static From: Russell Cattelan To: David Chinner Cc: Tim Shimmin , Eric Sandeen , xfs@oss.sgi.com In-Reply-To: <20061122004216.GT11034@melbourne.sgi.com> References: <20060929032856.8DA9C18001A5E@sandeen.net> <23F15D6AE8566A54B81188AC@timothy-shimmins-power-mac-g5.local> <45338DDE.8020903@sandeen.net> <4533FAEA.2080500@sandeen.net> <20061016232250.GM11034@melbourne.sgi.com> <1161042943.5723.117.camel@xenon.msp.redhat.com> <20061017005038.GN11034@melbourne.sgi.com> <20061017215706.GI8394166@melbourne.sgi.com> <1161125131.5723.158.camel@xenon.msp.redhat.com> <20061122004216.GT11034@melbourne.sgi.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-6xPZTyS3NQLbgyeu3PBL" Date: Tue, 21 Nov 2006 19:09:43 -0600 Message-Id: <1164157783.19915.46.camel@xenon.msp.redhat.com> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1.1-1mdv2007.1 X-archive-position: 9728 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cattelan@thebarn.com Precedence: bulk X-list: xfs Content-Length: 3593 Lines: 115 --=-6xPZTyS3NQLbgyeu3PBL Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Wed, 2006-11-22 at 11:42 +1100, David Chinner wrote: > On Tue, Oct 17, 2006 at 05:45:31PM -0500, Russell Cattelan wrote: > > On Wed, 2006-10-18 at 07:57 +1000, David Chinner wrote: > > > On Tue, Oct 17, 2006 at 05:13:01PM +1000, Tim Shimmin wrote: > > > > I thought that for debug, we could stop them from being inline > > > > for easier debugging. 
We could have a STATIC_INLINE :-) > > > > > > We could, but I don't think it gains us anything. > > I agree with Tim on this. > > when I see STATIC in the code it's generally assumed to > > be a way to toggle of static on/off. Adding static inline > > to the #define STATIC starts to overload the the macro > > and creates an obfuscation that isn't immediately obvious. > > STATIC_INLINE should be fairly obvious. > > Ok, so I've had time to look at this again. Here's the definitions > of STATIC and STATIC_INLINE for debug and nondebug from the > patch (whitespace damaged): > > Index: 2.6.x-xfs-new/fs/xfs/support/debug.h > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/support/debug.h 2006-11-22 10:54:37.089984780 +1100 > +++ 2.6.x-xfs-new/fs/xfs/support/debug.h 2006-11-22 11:30:20.433326839 +1100 > @@ -38,13 +38,37 @@ extern void assfail(char *expr, char *f, > > #ifndef DEBUG > # define ASSERT(expr) ((void)0) > -#else > + > +#ifndef STATIC > +# define STATIC static noinline > +#endif > + > +#ifndef STATIC_INLINE > +# define STATIC_INLINE static inline > +#endif > + > +#else /* DEBUG */ > + > # define ASSERT(expr) ASSERT_ALWAYS(expr) > extern unsigned long random(void); > -#endif > > #ifndef STATIC > -# define STATIC static > +# define STATIC noinline > +#endif > + > +/* > + * We stop inlining of inline functions in debug mode. > + * Unfortunately, this means static inline in header files > + * get multiple definitions, so they need to remain static. > + * This then gives tonnes of warnings about unused but defined > + * functions, so we need to add the unused attribute to prevent > + * these spurious warnings. 
> + */ > +#ifndef STATIC_INLINE > +# define STATIC_INLINE static __attribute__ ((unused)) noinline > #endif > > +#endif /* DEBUG */ > + > + > #endif /* __XFS_SUPPORT_DEBUG_H__ */ > > ------ > > Is this acceptible to everyone? Yup. > > FWIW, there is one other thing that this conversion causes > problems with, and that's variable definitions. i.e. we can't > use STATIC on them any more because of the "noinline" attribute > it has. Do we care about this and if so, any suggestions on > how to keep this functionality (a different STATIC_xxx define > for structures)? So I know things like systemtap, kgdb and oprofile all work better when functions are not static, but what about variables/structures? Do things really get that confused? Maybe we shouldn't worry about conditioning them and just make them static. > > Cheers, > > Dave. -- Russell Cattelan --=-6xPZTyS3NQLbgyeu3PBL Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQBFY6NXNRmM+OaGhBgRAhAOAJ9TZfVNTZJCP+bBnvSGtGGEpl1GaACfauSg +V4kBh6aZQUOh2WK++10dg4= =/5gO -----END PGP SIGNATURE----- --=-6xPZTyS3NQLbgyeu3PBL-- From owner-xfs@oss.sgi.com Tue Nov 21 18:17:27 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 18:17:34 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAM2HMaG007921 for ; Tue, 21 Nov 2006 18:17:24 -0800 Received: from [134.14.55.18] (dhcp18.melbourne.sgi.com [134.14.55.18]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA25628; Wed, 22 Nov 2006 13:16:21 +1100 Message-ID: <4563B2F1.6040603@melbourne.sgi.com> Date: Wed, 22 Nov 2006 13:16:17 +1100 From: David Chatterton Reply-To: chatz@melbourne.sgi.com Organization: SGI User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: 
Russell Cattelan CC: David Chinner , Tim Shimmin , Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH 1/2] Make stuff static References: <20060929032856.8DA9C18001A5E@sandeen.net> <23F15D6AE8566A54B81188AC@timothy-shimmins-power-mac-g5.local> <45338DDE.8020903@sandeen.net> <4533FAEA.2080500@sandeen.net> <20061016232250.GM11034@melbourne.sgi.com> <1161042943.5723.117.camel@xenon.msp.redhat.com> <20061017005038.GN11034@melbourne.sgi.com> <20061017215706.GI8394166@melbourne.sgi.com> <1161125131.5723.158.camel@xenon.msp.redhat.com> <20061122004216.GT11034@melbourne.sgi.com> <1164157783.19915.46.camel@xenon.msp.redhat.com> In-Reply-To: <1164157783.19915.46.camel@xenon.msp.redhat.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 9729 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: chatz@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 3303 Lines: 96 Russell Cattelan wrote: > On Wed, 2006-11-22 at 11:42 +1100, David Chinner wrote: >> On Tue, Oct 17, 2006 at 05:45:31PM -0500, Russell Cattelan wrote: >>> On Wed, 2006-10-18 at 07:57 +1000, David Chinner wrote: >>>> On Tue, Oct 17, 2006 at 05:13:01PM +1000, Tim Shimmin wrote: >>>>> I thought that for debug, we could stop them from being inline >>>>> for easier debugging. We could have a STATIC_INLINE :-) >>>> We could, but I don't think it gains us anything. >>> I agree with Tim on this. >>> when I see STATIC in the code it's generally assumed to >>> be a way to toggle of static on/off. Adding static inline >>> to the #define STATIC starts to overload the the macro >>> and creates an obfuscation that isn't immediately obvious. >>> STATIC_INLINE should be fairly obvious. >> Ok, so I've had time to look at this again. 
Here's the definitions >> of STATIC and STATIC_INLINE for debug and nondebug from the >> patch (whitespace damaged): >> >> Index: 2.6.x-xfs-new/fs/xfs/support/debug.h >> =================================================================== >> --- 2.6.x-xfs-new.orig/fs/xfs/support/debug.h 2006-11-22 10:54:37.089984780 +1100 >> +++ 2.6.x-xfs-new/fs/xfs/support/debug.h 2006-11-22 11:30:20.433326839 +1100 >> @@ -38,13 +38,37 @@ extern void assfail(char *expr, char *f, >> >> #ifndef DEBUG >> # define ASSERT(expr) ((void)0) >> -#else >> + >> +#ifndef STATIC >> +# define STATIC static noinline >> +#endif >> + >> +#ifndef STATIC_INLINE >> +# define STATIC_INLINE static inline >> +#endif >> + >> +#else /* DEBUG */ >> + >> # define ASSERT(expr) ASSERT_ALWAYS(expr) >> extern unsigned long random(void); >> -#endif >> >> #ifndef STATIC >> -# define STATIC static >> +# define STATIC noinline >> +#endif >> + >> +/* >> + * We stop inlining of inline functions in debug mode. >> + * Unfortunately, this means static inline in header files >> + * get multiple definitions, so they need to remain static. >> + * This then gives tonnes of warnings about unused but defined >> + * functions, so we need to add the unused attribute to prevent >> + * these spurious warnings. >> + */ >> +#ifndef STATIC_INLINE >> +# define STATIC_INLINE static __attribute__ ((unused)) noinline >> #endif >> >> +#endif /* DEBUG */ >> + >> + >> #endif /* __XFS_SUPPORT_DEBUG_H__ */ >> >> ------ >> >> Is this acceptible to everyone? > Yup. > >> FWIW, there is one other thing that this conversion causes >> problems with, and that's variable definitions. i.e. we can't >> use STATIC on them any more because of the "noinline" attribute >> it has. Do we care about this and if so, any suggestions on >> how to keep this functionality (a different STATIC_xxx define >> for structures)? > So I know things like systemtap kgdb oprofile all work better when > functions are not static, but what about variables/structures? 
> do things really get that confused? > Maybe we shouldn't worry about conditioning them and just make them > static > I agree with Russell. Is there a case for not defining a structure static? I can't think of one, unless kdb/lcrash is going to work better if they are not static in a debug build. Otherwise, we should just use "static" and not "STATIC". Same for static file variables. David -- David Chatterton XFS Engineering Manager SGI Australia From owner-xfs@oss.sgi.com Tue Nov 21 20:25:48 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 20:25:56 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAM4PiaG028806 for ; Tue, 21 Nov 2006 20:25:45 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA28918; Wed, 22 Nov 2006 15:24:50 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAM4Om7Y42765159; Wed, 22 Nov 2006 15:24:49 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAM4OjIN42777709; Wed, 22 Nov 2006 15:24:45 +1100 (AEDT) Date: Wed, 22 Nov 2006 15:24:45 +1100 From: David Chinner To: Russell Cattelan Cc: David Chinner , Tim Shimmin , Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH 1/2] Make stuff static Message-ID: <20061122042445.GR37654165@melbourne.sgi.com> References: <45338DDE.8020903@sandeen.net> <4533FAEA.2080500@sandeen.net> <20061016232250.GM11034@melbourne.sgi.com> <1161042943.5723.117.camel@xenon.msp.redhat.com> <20061017005038.GN11034@melbourne.sgi.com> <20061017215706.GI8394166@melbourne.sgi.com> <1161125131.5723.158.camel@xenon.msp.redhat.com> <20061122004216.GT11034@melbourne.sgi.com> <1164157783.19915.46.camel@xenon.msp.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1164157783.19915.46.camel@xenon.msp.redhat.com>
User-Agent: Mutt/1.4.2.1i
X-archive-position: 9730
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: dgc@sgi.com
Precedence: bulk
X-list: xfs
Content-Length: 33471
Lines: 967

On Tue, Nov 21, 2006 at 07:09:43PM -0600, Russell Cattelan wrote:
> On Wed, 2006-11-22 at 11:42 +1100, David Chinner wrote:
> > Ok, so I've had time to look at this again. Here's the definitions
> > of STATIC and STATIC_INLINE for debug and nondebug from the
> > patch (whitespace damaged):

.....

> > Is this acceptable to everyone?
> Yup.
>
> >
> > FWIW, there is one other thing that this conversion causes
> > problems with, and that's variable definitions. i.e. we can't
> > use STATIC on them any more because of the "noinline" attribute
> > it has. Do we care about this and if so, any suggestions on
> > how to keep this functionality (a different STATIC_xxx define
> > for structures)?
> So I know things like systemtap kgdb oprofile all work better when
> functions are not static, but what about variables/structures?
> do things really get that confused?
> Maybe we shouldn't worry about conditioning them and just make them
> static

Ok, so I'd already converted them to static where necessary. Attached
is the complete patch.

On ia64, the size of the xfs.ko and xfs_quota.ko modules decreases
with this patch:

Orig:

-rw-rw-r-- 1 dgc ptg  1662416 2006-11-22 14:41 fs/xfs/quota/xfs_quota.ko
-rw-rw-r-- 1 dgc ptg   856748 2006-11-22 14:41 fs/xfs/xfsidbg.ko
-rw-rw-r-- 1 dgc ptg 13614719 2006-11-22 14:41 fs/xfs/xfs.ko

With patch:

-rw-rw-r-- 1 dgc ptg  1657814 2006-11-22 14:42 fs/xfs/quota/xfs_quota.ko
-rw-rw-r-- 1 dgc ptg   856748 2006-11-22 14:42 fs/xfs/xfsidbg.ko
-rw-rw-r-- 1 dgc ptg 13557579 2006-11-22 14:42 fs/xfs/xfs.ko

The original top 10 stack users:

0x000e10c6 xfs_vn_mknod [xfs]:                      576
0x000ddfe6 xfs_ioctl [xfs]:                         368
0x000e1f46 xfs_vn_symlink [xfs]:                    368
0x000345a6 xfs_bmapi [xfs]:                         320
0x000b1146 _xfs_trans_commit [xfs]:                 272
0x000c59c6 xfs_change_file_space [xfs]:             272
0x0003a6a6 xfs_bunmapi [xfs]:                       240
0x000afa06 xfs_trans_unreserve_and_mod_sb [xfs]:    224
0x00040626 xfs_bmbt_insert [xfs]:                   192
0x0008be26 xfs_iomap_write_delay [xfs]:             192

[64 functions with stack usage larger than 100 bytes]

With patch:

0x000b7c46 _xfs_trans_commit [xfs]:                 272
0x000b5426 xfs_trans_unreserve_and_mod_sb [xfs]:    224
0x000e4106 xfs_find_handle [xfs]:                   224
0x000396c6 xfs_bmapi [xfs]:                         208
0x00090066 xfs_iomap_write_delay [xfs]:             208
0x000e9046 xfs_cleanup_inode [xfs]:                 208
0x000058c6 xfs_acl_setmode [xfs]:                   160
0x00005f46 xfs_acl_allow_set [xfs]:                 160
0x000067c6 xfs_acl_vtoacl [xfs]:                    160
0x00007366 xfs_acl_vget [xfs]:                      160

[69 functions with stack usage larger than 100 bytes]

Performance appears to be slightly faster with the noinline patch, but
the variation is within the error margins of my measurements so I'd
say it's neutral.

Comments?

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/dmapi/xfs_dm.c | 6 +++--- fs/xfs/linux-2.4/mrlock.c | 2 +- fs/xfs/linux-2.4/xfs_buf.c | 30 +++++++++++++++--------------- fs/xfs/linux-2.4/xfs_file.c | 8 ++++---- fs/xfs/linux-2.4/xfs_super.c | 14 +++++++------- fs/xfs/linux-2.4/xfs_vnode.h | 4 ++-- fs/xfs/linux-2.6/xfs_aops.c | 2 +- fs/xfs/linux-2.6/xfs_buf.c | 24 ++++++++++++------------ fs/xfs/linux-2.6/xfs_export.c | 2 +- fs/xfs/linux-2.6/xfs_file.c | 4 ++-- fs/xfs/linux-2.6/xfs_iops.c | 4 ++-- fs/xfs/linux-2.6/xfs_super.c | 16 ++++++++-------- fs/xfs/linux-2.6/xfs_sysctl.c | 6 +++--- fs/xfs/linux-2.6/xfs_vfs.c | 5 +++-- fs/xfs/linux-2.6/xfs_vnode.c | 2 +- fs/xfs/linux-2.6/xfs_vnode.h | 4 ++-- fs/xfs/quota/xfs_dquot_item.c | 6 +++--- fs/xfs/quota/xfs_qm.c | 6 +++--- fs/xfs/quota/xfs_qm_bhv.c | 2 +- fs/xfs/support/debug.h | 30 +++++++++++++++++++++++++++--- fs/xfs/xfs_attr.c | 12 ++++++------ fs/xfs/xfs_attr_leaf.c | 6 +++--- fs/xfs/xfs_bit.c | 2 +- fs/xfs/xfs_bmap_btree.c | 2 +- fs/xfs/xfs_buf_item.c | 2 +- fs/xfs/xfs_extfree_item.c | 4 ++-- fs/xfs/xfs_ialloc.c | 2 +- fs/xfs/xfs_inode.c | 2 +- fs/xfs/xfs_inode_item.c | 2 +- fs/xfs/xfs_mount.c | 8 ++++---- fs/xfs/xfs_refcache.c | 10 +++++----- 31 files changed, 127 insertions(+), 102 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_file.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/xfs_file.c 2006-11-22 14:41:06.076805511 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_file.c 2006-11-22 14:42:21.558973682 +1100 @@ -55,7 +55,7 @@ static struct vm_operations_struct xfs_d #define do_up_read(x) #endif -STATIC inline ssize_t +STATIC_INLINE ssize_t __xfs_file_read( struct file *file, char *buf, @@ -99,7 +99,7 @@ xfs_file_read_invis( } -STATIC inline ssize_t +STATIC_INLINE ssize_t __xfs_file_write( struct file *file, const char *buf, @@ -146,7 +146,7 @@ __xfs_file_write( return rval; } -STATIC inline ssize_t +STATIC_INLINE 
ssize_t xfs_file_write( struct file *file, const char *buf, @@ -156,7 +156,7 @@ xfs_file_write( return __xfs_file_write(file, buf, 0, count, ppos); } -STATIC inline ssize_t +STATIC_INLINE ssize_t xfs_file_write_invis( struct file *file, const char *buf, Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_aops.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_aops.c 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_aops.c 2006-11-22 14:42:21.558973682 +1100 @@ -246,7 +246,7 @@ xfs_map_blocks( return -error; } -STATIC inline int +STATIC_INLINE int xfs_iomap_valid( xfs_iomap_t *iomapp, loff_t offset) Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_buf.c 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c 2006-11-22 14:42:21.562973161 +1100 @@ -34,13 +34,13 @@ #include #include "xfs_linux.h" -STATIC kmem_zone_t *xfs_buf_zone; -STATIC kmem_shaker_t xfs_buf_shake; +static kmem_zone_t *xfs_buf_zone; +static kmem_shaker_t xfs_buf_shake; STATIC int xfsbufd(void *); STATIC int xfsbufd_wakeup(int, gfp_t); STATIC void xfs_buf_delwri_queue(xfs_buf_t *, int); -STATIC struct workqueue_struct *xfslogd_workqueue; +static struct workqueue_struct *xfslogd_workqueue; struct workqueue_struct *xfsdatad_workqueue; #ifdef XFS_BUF_TRACE @@ -139,7 +139,7 @@ page_region_mask( return mask; } -STATIC inline void +STATIC_INLINE void set_page_region( struct page *page, size_t offset, @@ -151,7 +151,7 @@ set_page_region( SetPageUptodate(page); } -STATIC inline int +STATIC_INLINE int test_page_region( struct page *page, size_t offset, @@ -171,9 +171,9 @@ typedef struct a_list { struct a_list *next; } a_list_t; -STATIC a_list_t *as_free_head; -STATIC int as_list_len; -STATIC DEFINE_SPINLOCK(as_lock); +static a_list_t *as_free_head; +static int as_list_len; +static 
DEFINE_SPINLOCK(as_lock); /* * Try to batch vunmaps because they are costly. @@ -1084,7 +1084,7 @@ xfs_buf_iostart( return status; } -STATIC __inline__ int +STATIC_INLINE int _xfs_buf_iolocked( xfs_buf_t *bp) { @@ -1094,7 +1094,7 @@ _xfs_buf_iolocked( return 0; } -STATIC __inline__ void +STATIC_INLINE void _xfs_buf_ioend( xfs_buf_t *bp, int schedule) @@ -1425,8 +1425,8 @@ xfs_free_bufhash( /* * buftarg list for delwrite queue processing */ -STATIC LIST_HEAD(xfs_buftarg_list); -STATIC DEFINE_SPINLOCK(xfs_buftarg_lock); +LIST_HEAD(xfs_buftarg_list); +static DEFINE_SPINLOCK(xfs_buftarg_lock); STATIC void xfs_register_buftarg( Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_file.c 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_file.c 2006-11-22 14:42:21.562973161 +1100 @@ -46,7 +46,7 @@ static struct vm_operations_struct xfs_f static struct vm_operations_struct xfs_dmapi_file_vm_ops; #endif -STATIC inline ssize_t +STATIC_INLINE ssize_t __xfs_file_read( struct kiocb *iocb, const struct iovec *iov, @@ -84,7 +84,7 @@ xfs_file_aio_read_invis( return __xfs_file_read(iocb, iov, nr_segs, IO_ISAIO|IO_INVIS, pos); } -STATIC inline ssize_t +STATIC_INLINE ssize_t __xfs_file_write( struct kiocb *iocb, const struct iovec *iov, Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_iops.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_iops.c 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_iops.c 2006-11-22 14:42:21.562973161 +1100 @@ -250,13 +250,13 @@ xfs_init_security( * * XXX(hch): nfsd is broken, better fix it instead. 
*/ -STATIC inline int +STATIC_INLINE int xfs_has_fs_struct(struct task_struct *task) { return (task->fs != init_task.fs); } -STATIC inline void +STATIC void xfs_cleanup_inode( bhv_vnode_t *dvp, bhv_vnode_t *vp, Index: 2.6.x-xfs-new/fs/xfs/support/debug.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/support/debug.h 2006-11-22 14:41:06.144796655 +1100 +++ 2.6.x-xfs-new/fs/xfs/support/debug.h 2006-11-22 14:42:21.566972640 +1100 @@ -38,13 +38,37 @@ extern void assfail(char *expr, char *f, #ifndef DEBUG # define ASSERT(expr) ((void)0) -#else + +#ifndef STATIC +# define STATIC static noinline +#endif + +#ifndef STATIC_INLINE +# define STATIC_INLINE static inline +#endif + +#else /* DEBUG */ + # define ASSERT(expr) ASSERT_ALWAYS(expr) extern unsigned long random(void); -#endif #ifndef STATIC -# define STATIC static +# define STATIC noinline +#endif + +/* + * We stop inlining of inline functions in debug mode. + * Unfortunately, this means static inline in header files + * get multiple definitions, so they need to remain static. + * This then gives tonnes of warnings about unused but defined + * functions, so we need to add the unused attribute to prevent + * these spurious warnings. + */ +#ifndef STATIC_INLINE +# define STATIC_INLINE static __attribute__ ((unused)) noinline #endif +#endif /* DEBUG */ + + #endif /* __XFS_SUPPORT_DEBUG_H__ */ Index: 2.6.x-xfs-new/fs/xfs/xfs_attr_leaf.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_attr_leaf.c 2006-11-22 14:41:06.180791966 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_attr_leaf.c 2006-11-22 14:42:21.566972640 +1100 @@ -94,7 +94,7 @@ STATIC int xfs_attr_leaf_entsize(xfs_att * Namespace helper routines *========================================================================*/ -STATIC inline attrnames_t * +STATIC_INLINE attrnames_t * xfs_attr_flags_namesp(int flags) { return ((flags & XFS_ATTR_SECURE) ? 
&attr_secure: @@ -105,7 +105,7 @@ xfs_attr_flags_namesp(int flags) * If namespace bits don't match return 0. * If all match then return 1. */ -STATIC inline int +STATIC_INLINE int xfs_attr_namesp_match(int arg_flags, int ondisk_flags) { return XFS_ATTR_NSP_ONDISK(ondisk_flags) == XFS_ATTR_NSP_ARGS_TO_ONDISK(arg_flags); @@ -116,7 +116,7 @@ xfs_attr_namesp_match(int arg_flags, int * then return 0. * If all match or are overridable then return 1. */ -STATIC inline int +STATIC_INLINE int xfs_attr_namesp_match_overrides(int arg_flags, int ondisk_flags) { if (((arg_flags & ATTR_SECURE) == 0) != Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.c 2006-11-22 14:41:06.180791966 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.c 2006-11-22 14:42:21.566972640 +1100 @@ -1796,7 +1796,7 @@ xfs_icsb_destroy_counters( } } -STATIC inline void +STATIC_INLINE void xfs_icsb_lock_cntr( xfs_icsb_cnts_t *icsbp) { @@ -1805,7 +1805,7 @@ xfs_icsb_lock_cntr( } } -STATIC inline void +STATIC_INLINE void xfs_icsb_unlock_cntr( xfs_icsb_cnts_t *icsbp) { @@ -1813,7 +1813,7 @@ xfs_icsb_unlock_cntr( } -STATIC inline void +STATIC_INLINE void xfs_icsb_lock_all_counters( xfs_mount_t *mp) { @@ -1826,7 +1826,7 @@ xfs_icsb_lock_all_counters( } } -STATIC inline void +STATIC_INLINE void xfs_icsb_unlock_all_counters( xfs_mount_t *mp) { Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_buf.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/xfs_buf.c 2006-11-22 14:41:06.076805511 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_buf.c 2006-11-22 14:42:21.578971077 +1100 @@ -67,17 +67,17 @@ #define VM_MAP VM_ALLOC #endif -STATIC kmem_zone_t *xfs_buf_zone; -STATIC kmem_shaker_t xfs_buf_shake; +static kmem_zone_t *xfs_buf_zone; +static kmem_shaker_t xfs_buf_shake; #define MAX_IO_DAEMONS NR_CPUS #define CPU_TO_DAEMON(cpu) (cpu) -STATIC int xb_logio_daemons[MAX_IO_DAEMONS]; -STATIC 
struct list_head xfs_buf_logiodone_tq[MAX_IO_DAEMONS]; -STATIC wait_queue_head_t xfs_buf_logiodone_wait[MAX_IO_DAEMONS]; -STATIC int xb_dataio_daemons[MAX_IO_DAEMONS]; -STATIC struct list_head xfs_buf_dataiodone_tq[MAX_IO_DAEMONS]; -STATIC wait_queue_head_t xfs_buf_dataiodone_wait[MAX_IO_DAEMONS]; +static int xb_logio_daemons[MAX_IO_DAEMONS]; +static struct list_head xfs_buf_logiodone_tq[MAX_IO_DAEMONS]; +static wait_queue_head_t xfs_buf_logiodone_wait[MAX_IO_DAEMONS]; +static int xb_dataio_daemons[MAX_IO_DAEMONS]; +static struct list_head xfs_buf_dataiodone_tq[MAX_IO_DAEMONS]; +static wait_queue_head_t xfs_buf_dataiodone_wait[MAX_IO_DAEMONS]; /* * For pre-allocated buffer head pool @@ -154,9 +154,9 @@ typedef struct a_list { struct a_list *next; } a_list_t; -STATIC a_list_t *as_free_head; -STATIC int as_list_len; -STATIC DEFINE_SPINLOCK(as_lock); +static a_list_t *as_free_head; +static int as_list_len; +static DEFINE_SPINLOCK(as_lock); /* * Try to batch vunmaps because they are costly. @@ -515,7 +515,7 @@ _xfs_buf_get_prealloc_bh(void) * Otherwise, put it back in the pool, and wake up anybody * waiting for one. 
*/ -STATIC inline void +STATIC_INLINE void _xfs_buf_free_bh( struct buffer_head *bh) { @@ -1204,7 +1204,7 @@ xfs_buf_iostart( return status; } -STATIC __inline__ int +STATIC_INLINE int _xfs_buf_iolocked( xfs_buf_t *bp) { @@ -1366,8 +1366,8 @@ xfs_free_bufhash( /* * buftarg list for delwrite queue processing */ -STATIC LIST_HEAD(xfs_buftarg_list); -STATIC DEFINE_SPINLOCK(xfs_buftarg_lock); +static LIST_HEAD(xfs_buftarg_list); +static DEFINE_SPINLOCK(xfs_buftarg_lock); STATIC void xfs_register_buftarg( Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_sysctl.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_sysctl.c 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_sysctl.c 2006-11-22 14:42:21.594968993 +1100 @@ -54,7 +54,7 @@ xfs_stats_clear_proc_handler( } #endif /* CONFIG_PROC_FS */ -STATIC ctl_table xfs_table[] = { +static ctl_table xfs_table[] = { {XFS_RESTRICT_CHOWN, "restrict_chown", &xfs_params.restrict_chown.val, sizeof(int), 0644, NULL, &proc_dointvec_minmax, &sysctl_intvec, NULL, @@ -151,12 +151,12 @@ STATIC ctl_table xfs_table[] = { {0} }; -STATIC ctl_table xfs_dir_table[] = { +static ctl_table xfs_dir_table[] = { {FS_XFS, "xfs", NULL, 0, 0555, xfs_table}, {0} }; -STATIC ctl_table xfs_root_table[] = { +static ctl_table xfs_root_table[] = { {CTL_FS, "fs", NULL, 0, 0555, xfs_dir_table}, {0} }; Index: 2.6.x-xfs-new/fs/xfs/xfs_attr.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_attr.c 2006-11-22 14:41:06.188790924 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_attr.c 2006-11-22 14:42:21.598968473 +1100 @@ -57,9 +57,9 @@ */ #define ATTR_SYSCOUNT 2 -STATIC struct attrnames posix_acl_access; -STATIC struct attrnames posix_acl_default; -STATIC struct attrnames *attr_system_names[ATTR_SYSCOUNT]; +static struct attrnames posix_acl_access; +static struct attrnames posix_acl_default; +static struct attrnames *attr_system_names[ATTR_SYSCOUNT]; 
/*======================================================================== * Function prototypes for the kernel. @@ -2477,7 +2477,7 @@ posix_acl_default_exists( return xfs_acl_vhasacl_default(vp); } -STATIC struct attrnames posix_acl_access = { +static struct attrnames posix_acl_access = { .attr_name = "posix_acl_access", .attr_namelen = sizeof("posix_acl_access") - 1, .attr_get = posix_acl_access_get, @@ -2486,7 +2486,7 @@ STATIC struct attrnames posix_acl_access .attr_exists = posix_acl_access_exists, }; -STATIC struct attrnames posix_acl_default = { +static struct attrnames posix_acl_default = { .attr_name = "posix_acl_default", .attr_namelen = sizeof("posix_acl_default") - 1, .attr_get = posix_acl_default_get, @@ -2495,7 +2495,7 @@ STATIC struct attrnames posix_acl_defaul .attr_exists = posix_acl_default_exists, }; -STATIC struct attrnames *attr_system_names[] = +static struct attrnames *attr_system_names[] = { &posix_acl_access, &posix_acl_default }; Index: 2.6.x-xfs-new/fs/xfs/xfs_bit.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_bit.c 2006-11-22 14:41:06.196789882 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_bit.c 2006-11-22 14:42:21.598968473 +1100 @@ -29,7 +29,7 @@ /* * Index of high bit number in byte, -1 for none set, 0..7 otherwise. */ -STATIC const char xfs_highbit[256] = { +static const char xfs_highbit[256] = { -1, 0, 1, 1, 2, 2, 2, 2, /* 00 .. 07 */ 3, 3, 3, 3, 3, 3, 3, 3, /* 08 .. 0f */ 4, 4, 4, 4, 4, 4, 4, 4, /* 10 .. 
17 */ Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/mrlock.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/mrlock.c 2006-11-22 14:41:06.076805511 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.4/mrlock.c 2006-11-22 14:42:21.598968473 +1100 @@ -195,7 +195,7 @@ mrtryupdate(mrlock_t *mrp) return 1; } -static __inline__ void mrwake(mrlock_t *mrp) +STATIC_INLINE void mrwake(mrlock_t *mrp) { /* * First, if the count is now 0, we need to wake-up anyone waiting. Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_super.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/xfs_super.c 2006-11-22 14:41:06.076805511 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_super.c 2006-11-22 14:42:21.598968473 +1100 @@ -54,9 +54,9 @@ #include -STATIC struct quotactl_ops xfs_quotactl_operations; -STATIC struct super_operations xfs_super_operations; -STATIC kmem_zone_t *xfs_vnode_zone; +static struct quotactl_ops xfs_quotactl_operations; +static struct super_operations xfs_super_operations; +static kmem_zone_t *xfs_vnode_zone; STATIC struct xfs_mount_args * xfs_args_allocate( @@ -113,7 +113,7 @@ xfs_max_file_offset( return (((__uint64_t)pagefactor) << bitshift) - 1; } -STATIC __inline__ void +STATIC_INLINE void xfs_set_inodeops( struct inode *inode) { @@ -140,7 +140,7 @@ xfs_set_inodeops( } } -STATIC __inline__ void +STATIC_INLINE void xfs_revalidate_inode( xfs_mount_t *mp, bhv_vnode_t *vp, @@ -974,7 +974,7 @@ fail_vfsop: } -STATIC struct super_operations xfs_super_operations = { +static struct super_operations xfs_super_operations = { .alloc_inode = xfs_fs_alloc_inode, .destroy_inode = xfs_fs_destroy_inode, .write_inode = xfs_fs_write_inode, @@ -991,7 +991,7 @@ STATIC struct super_operations xfs_super .show_options = xfs_fs_show_options, }; -STATIC struct quotactl_ops xfs_quotactl_operations = { +static struct quotactl_ops xfs_quotactl_operations = { .quota_sync = xfs_fs_quotasync, .get_xstate = 
xfs_fs_getxstate, .set_xstate = xfs_fs_setxstate, Index: 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_vnode.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.4/xfs_vnode.h 2006-11-22 14:41:06.092803428 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.4/xfs_vnode.h 2006-11-22 14:42:21.598968473 +1100 @@ -468,14 +468,14 @@ static inline struct bhv_vnode *vn_grab( #define VN_LOCK(vp) mutex_spinlock(&(vp)->v_lock) #define VN_UNLOCK(vp, s) mutex_spinunlock(&(vp)->v_lock, s) -static __inline__ void vn_flagset(struct bhv_vnode *vp, uint flag) +STATIC_INLINE void vn_flagset(struct bhv_vnode *vp, uint flag) { spin_lock(&vp->v_lock); vp->v_flag |= flag; spin_unlock(&vp->v_lock); } -static __inline__ uint vn_flagclr(struct bhv_vnode *vp, uint flag) +STATIC_INLINE uint vn_flagclr(struct bhv_vnode *vp, uint flag) { uint cleared; Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c 2006-11-22 14:42:21.602967952 +1100 @@ -57,10 +57,10 @@ #include #include -STATIC struct quotactl_ops xfs_quotactl_operations; -STATIC struct super_operations xfs_super_operations; -STATIC kmem_zone_t *xfs_vnode_zone; -STATIC kmem_zone_t *xfs_ioend_zone; +static struct quotactl_ops xfs_quotactl_operations; +static struct super_operations xfs_super_operations; +static kmem_zone_t *xfs_vnode_zone; +static kmem_zone_t *xfs_ioend_zone; mempool_t *xfs_ioend_pool; STATIC struct xfs_mount_args * @@ -120,7 +120,7 @@ xfs_max_file_offset( return (((__uint64_t)pagefactor) << bitshift) - 1; } -STATIC __inline__ void +STATIC_INLINE void xfs_set_inodeops( struct inode *inode) { @@ -146,7 +146,7 @@ xfs_set_inodeops( } } -STATIC __inline__ void +STATIC_INLINE void xfs_revalidate_inode( xfs_mount_t *mp, bhv_vnode_t *vp, @@ -881,7 +881,7 @@ xfs_fs_get_sb( mnt); } -STATIC struct 
super_operations xfs_super_operations = { +static struct super_operations xfs_super_operations = { .alloc_inode = xfs_fs_alloc_inode, .destroy_inode = xfs_fs_destroy_inode, .write_inode = xfs_fs_write_inode, @@ -895,7 +895,7 @@ STATIC struct super_operations xfs_super .show_options = xfs_fs_show_options, }; -STATIC struct quotactl_ops xfs_quotactl_operations = { +static struct quotactl_ops xfs_quotactl_operations = { .quota_sync = xfs_fs_quotasync, .get_xstate = xfs_fs_getxstate, .set_xstate = xfs_fs_setxstate, Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vnode.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_vnode.h 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vnode.h 2006-11-22 14:42:21.602967952 +1100 @@ -492,14 +492,14 @@ static inline struct bhv_vnode *vn_grab( #define VN_LOCK(vp) mutex_spinlock(&(vp)->v_lock) #define VN_UNLOCK(vp, s) mutex_spinunlock(&(vp)->v_lock, s) -static __inline__ void vn_flagset(struct bhv_vnode *vp, uint flag) +STATIC_INLINE void vn_flagset(struct bhv_vnode *vp, uint flag) { spin_lock(&vp->v_lock); vp->v_flag |= flag; spin_unlock(&vp->v_lock); } -static __inline__ uint vn_flagclr(struct bhv_vnode *vp, uint flag) +STATIC_INLINE uint vn_flagclr(struct bhv_vnode *vp, uint flag) { uint cleared; Index: 2.6.x-xfs-new/fs/xfs/xfs_bmap_btree.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_bmap_btree.c 2006-11-22 14:41:06.204788840 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_bmap_btree.c 2006-11-22 14:42:21.602967952 +1100 @@ -1862,7 +1862,7 @@ xfs_bmbt_delete( * xfs_bmbt_get_startblock, xfs_bmbt_get_blockcount and xfs_bmbt_get_state. 
*/ -STATIC __inline__ void +STATIC_INLINE void __xfs_bmbt_get_all( __uint64_t l0, __uint64_t l1, Index: 2.6.x-xfs-new/fs/xfs/xfs_extfree_item.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_extfree_item.c 2006-11-22 14:41:06.204788840 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_extfree_item.c 2006-11-22 14:42:21.606967431 +1100 @@ -227,7 +227,7 @@ xfs_efi_item_committing(xfs_efi_log_item /* * This is the ops vector shared by all efi log items. */ -STATIC struct xfs_item_ops xfs_efi_item_ops = { +static struct xfs_item_ops xfs_efi_item_ops = { .iop_size = (uint(*)(xfs_log_item_t*))xfs_efi_item_size, .iop_format = (void(*)(xfs_log_item_t*, xfs_log_iovec_t*)) xfs_efi_item_format, @@ -525,7 +525,7 @@ xfs_efd_item_committing(xfs_efd_log_item /* * This is the ops vector shared by all efd log items. */ -STATIC struct xfs_item_ops xfs_efd_item_ops = { +static struct xfs_item_ops xfs_efd_item_ops = { .iop_size = (uint(*)(xfs_log_item_t*))xfs_efd_item_size, .iop_format = (void(*)(xfs_log_item_t*, xfs_log_iovec_t*)) xfs_efd_item_format, Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.c 2006-11-22 14:41:06.204788840 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.c 2006-11-22 14:42:21.614966389 +1100 @@ -2125,7 +2125,7 @@ xfs_iunlink_remove( return 0; } -static __inline__ int xfs_inode_clean(xfs_inode_t *ip) +STATIC_INLINE int xfs_inode_clean(xfs_inode_t *ip) { return (((ip->i_itemp == NULL) || !(ip->i_itemp->ili_format.ilf_fields & XFS_ILOG_ALL)) && Index: 2.6.x-xfs-new/fs/xfs/xfs_buf_item.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_buf_item.c 2006-11-22 14:41:06.204788840 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_buf_item.c 2006-11-22 14:42:21.618965868 +1100 @@ -660,7 +660,7 @@ xfs_buf_item_committing(xfs_buf_log_item /* * This is the ops vector shared by all buf log items. 
*/ -STATIC struct xfs_item_ops xfs_buf_item_ops = { +static struct xfs_item_ops xfs_buf_item_ops = { .iop_size = (uint(*)(xfs_log_item_t*))xfs_buf_item_size, .iop_format = (void(*)(xfs_log_item_t*, xfs_log_iovec_t*)) xfs_buf_item_format, Index: 2.6.x-xfs-new/fs/xfs/xfs_ialloc.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_ialloc.c 2006-11-22 14:41:06.204788840 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_ialloc.c 2006-11-22 14:42:21.618965868 +1100 @@ -342,7 +342,7 @@ xfs_ialloc_ag_alloc( return 0; } -STATIC __inline xfs_agnumber_t +STATIC_INLINE xfs_agnumber_t xfs_ialloc_next_ag( xfs_mount_t *mp) { Index: 2.6.x-xfs-new/fs/xfs/xfs_inode_item.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode_item.c 2006-11-22 14:41:06.204788840 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode_item.c 2006-11-22 14:42:21.618965868 +1100 @@ -887,7 +887,7 @@ xfs_inode_item_committing( /* * This is the ops vector shared by all buf log items. */ -STATIC struct xfs_item_ops xfs_inode_item_ops = { +static struct xfs_item_ops xfs_inode_item_ops = { .iop_size = (uint(*)(xfs_log_item_t*))xfs_inode_item_size, .iop_format = (void(*)(xfs_log_item_t*, xfs_log_iovec_t*)) xfs_inode_item_format, Index: 2.6.x-xfs-new/fs/xfs/dmapi/xfs_dm.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/dmapi/xfs_dm.c 2006-11-22 14:41:06.268780504 +1100 +++ 2.6.x-xfs-new/fs/xfs/dmapi/xfs_dm.c 2006-11-22 14:42:21.622965347 +1100 @@ -133,9 +133,9 @@ typedef struct { changed! */ -STATIC const char dmattr_prefix[DMATTR_PREFIXLEN + 1] = DMATTR_PREFIXSTRING; +static const char dmattr_prefix[DMATTR_PREFIXLEN + 1] = DMATTR_PREFIXSTRING; -STATIC dm_size_t dm_min_dio_xfer = 0; /* direct I/O disabled for now */ +static dm_size_t dm_min_dio_xfer = 0; /* direct I/O disabled for now */ /* See xfs_dm_get_dmattr() for a description of why this is needed. 
*/ @@ -3124,7 +3124,7 @@ xfs_dm_obj_ref_hold( } -STATIC fsys_function_vector_t xfs_fsys_vector[DM_FSYS_MAX]; +static fsys_function_vector_t xfs_fsys_vector[DM_FSYS_MAX]; int Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_export.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_export.c 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_export.c 2006-11-22 14:42:21.622965347 +1100 @@ -24,7 +24,7 @@ #include "xfs_mount.h" #include "xfs_export.h" -STATIC struct dentry dotdot = { .d_name.name = "..", .d_name.len = 2, }; +static struct dentry dotdot = { .d_name.name = "..", .d_name.len = 2, }; /* * XFS encodes and decodes the fileid portion of NFS filehandles Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vfs.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_vfs.c 2006-11-22 14:41:06.112800823 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vfs.c 2006-11-22 14:42:21.622965347 +1100 @@ -295,8 +295,9 @@ typedef struct bhv_module_list { const char * bm_name; void * bm_ops; } bhv_module_list_t; -STATIC DEFINE_SPINLOCK(bhv_lock); -STATIC struct list_head bhv_list = LIST_HEAD_INIT(bhv_list); + +static DEFINE_SPINLOCK(bhv_lock); +static struct list_head bhv_list = LIST_HEAD_INIT(bhv_list); void bhv_module_init( Index: 2.6.x-xfs-new/fs/xfs/quota/xfs_dquot_item.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/quota/xfs_dquot_item.c 2006-11-22 14:41:06.300776336 +1100 +++ 2.6.x-xfs-new/fs/xfs/quota/xfs_dquot_item.c 2006-11-22 14:42:21.626964826 +1100 @@ -399,7 +399,7 @@ xfs_qm_dquot_logitem_committing( /* * This is the ops vector for dquots */ -STATIC struct xfs_item_ops xfs_dquot_item_ops = { +static struct xfs_item_ops xfs_dquot_item_ops = { .iop_size = (uint(*)(xfs_log_item_t*))xfs_qm_dquot_logitem_size, .iop_format = (void(*)(xfs_log_item_t*, xfs_log_iovec_t*)) xfs_qm_dquot_logitem_format, 
@@ -606,7 +606,7 @@ xfs_qm_qoffend_logitem_committing(xfs_qo return; } -STATIC struct xfs_item_ops xfs_qm_qoffend_logitem_ops = { +static struct xfs_item_ops xfs_qm_qoffend_logitem_ops = { .iop_size = (uint(*)(xfs_log_item_t*))xfs_qm_qoff_logitem_size, .iop_format = (void(*)(xfs_log_item_t*, xfs_log_iovec_t*)) xfs_qm_qoff_logitem_format, @@ -628,7 +628,7 @@ STATIC struct xfs_item_ops xfs_qm_qoffen /* * This is the ops vector shared by all quotaoff-start log items. */ -STATIC struct xfs_item_ops xfs_qm_qoff_logitem_ops = { +static struct xfs_item_ops xfs_qm_qoff_logitem_ops = { .iop_size = (uint(*)(xfs_log_item_t*))xfs_qm_qoff_logitem_size, .iop_format = (void(*)(xfs_log_item_t*, xfs_log_iovec_t*)) xfs_qm_qoff_logitem_format, Index: 2.6.x-xfs-new/fs/xfs/quota/xfs_qm.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/quota/xfs_qm.c 2006-11-22 14:41:06.316774252 +1100 +++ 2.6.x-xfs-new/fs/xfs/quota/xfs_qm.c 2006-11-22 14:42:21.626964826 +1100 @@ -64,10 +64,10 @@ uint ndquot; kmem_zone_t *qm_dqzone; kmem_zone_t *qm_dqtrxzone; -STATIC kmem_shaker_t xfs_qm_shaker; +static kmem_shaker_t xfs_qm_shaker; -STATIC cred_t xfs_zerocr; -STATIC xfs_inode_t xfs_zeroino; +static cred_t xfs_zerocr; +static xfs_inode_t xfs_zeroino; STATIC void xfs_qm_list_init(xfs_dqlist_t *, char *, int); STATIC void xfs_qm_list_destroy(xfs_dqlist_t *); Index: 2.6.x-xfs-new/fs/xfs/quota/xfs_qm_bhv.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/quota/xfs_qm_bhv.c 2006-11-22 14:41:06.316774252 +1100 +++ 2.6.x-xfs-new/fs/xfs/quota/xfs_qm_bhv.c 2006-11-22 14:42:21.630964305 +1100 @@ -384,7 +384,7 @@ xfs_qm_dqrele_null( } -STATIC struct xfs_qmops xfs_qmcore_xfs = { +static struct xfs_qmops xfs_qmcore_xfs = { .xfs_qminit = xfs_qm_newmount, .xfs_qmdone = xfs_qm_unmount_quotadestroy, .xfs_qmmount = xfs_qm_endmount, Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vnode.c 
=================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_vnode.c 2006-11-22 14:41:06.120799781 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vnode.c 2006-11-22 14:42:21.630964305 +1100 @@ -26,7 +26,7 @@ DEFINE_SPINLOCK(vnumber_lock); */ #define NVSYNC 37 #define vptosync(v) (&vsync[((unsigned long)v) % NVSYNC]) -STATIC wait_queue_head_t vsync[NVSYNC]; +static wait_queue_head_t vsync[NVSYNC]; void vn_init(void) Index: 2.6.x-xfs-new/fs/xfs/xfs_refcache.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_refcache.c 2006-11-22 14:41:06.324773210 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_refcache.c 2006-11-22 14:42:21.630964305 +1100 @@ -45,11 +45,11 @@ #include "xfs_buf_item.h" #include "xfs_refcache.h" -STATIC spinlock_t xfs_refcache_lock = SPIN_LOCK_UNLOCKED; -STATIC xfs_inode_t **xfs_refcache; -STATIC int xfs_refcache_index; -STATIC int xfs_refcache_busy; -STATIC int xfs_refcache_count; +static spinlock_t xfs_refcache_lock = SPIN_LOCK_UNLOCKED; +static xfs_inode_t **xfs_refcache; +static int xfs_refcache_index; +static int xfs_refcache_busy; +static int xfs_refcache_count; /* * Insert the given inode into the reference cache. 
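
As a rough illustration of the debug.h hunk in this patch, the
following userspace sketch shows how the two macros are meant to be
used: STATIC for ordinary file-local functions, STATIC_INLINE for
small helpers, and plain "static" for file-scope data. The fallback
definition of noinline (the kernel normally supplies it) and the names
add_one(), sum_to(), static_demo(), static_demo_calls() and call_count
are assumptions made for this example; none of them come from the XFS
sources.

```c
#include <assert.h>

/* Assumption: outside the kernel, noinline has to be defined by hand. */
#ifndef noinline
# define noinline __attribute__((noinline))
#endif

#ifndef DEBUG
/* Non-debug build: file-local linkage, small helpers may be inlined. */
# define STATIC		static noinline
# define STATIC_INLINE	static inline
#else
/* Debug build: nothing is inlined and STATIC functions stay visible. */
# define STATIC		noinline
# define STATIC_INLINE	static __attribute__((unused)) noinline
#endif

int static_demo(int n);
int static_demo_calls(void);

/*
 * File-scope data uses plain "static": noinline is a function
 * attribute, so STATIC no longer fits a variable definition.
 */
static int call_count;

/* Hypothetical small helper, a candidate for STATIC_INLINE. */
STATIC_INLINE int add_one(int x)
{
	return x + 1;
}

/* Hypothetical larger function, declared with STATIC. */
STATIC int sum_to(int n)
{
	int i, total = 0;

	call_count++;
	for (i = 0; i < n; i = add_one(i))
		total += add_one(i);	/* adds 1, 2, ..., n */
	return total;
}

/* Externally visible entry points so the sketch can be exercised. */
int static_demo(int n)
{
	assert(n >= 0);
	return sum_to(n);
}

int static_demo_calls(void)
{
	return call_count;
}
```

Building the same file with and without -DDEBUG and comparing the
symbol tables (for instance with nm) is one way to see the difference:
in the debug variant sum_to is expected to appear as a global symbol.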
From owner-xfs@oss.sgi.com Tue Nov 21 20:47:52 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 20:48:00 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAM4lnaG031816 for ; Tue, 21 Nov 2006 20:47:51 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA29658; Wed, 22 Nov 2006 15:46:58 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAM4kv7Y42662921; Wed, 22 Nov 2006 15:46:58 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAM4kudL42646968; Wed, 22 Nov 2006 15:46:56 +1100 (AEDT) Date: Wed, 22 Nov 2006 15:46:56 +1100 From: David Chinner To: xfs-dev@sgi.com Cc: xfs@oss.sgi.com Subject: Review: Fix inverted quiet mount logic Message-ID: <20061122044656.GS37654165@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-archive-position: 9731 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1087 Lines: 32 Simple problem Noticed by Eric Sandeen at about the same time I did - we are not getting error messages in dmesg when certain checks fail during mount. Turns out the XFS_MFSI_QUIET flag usage is inverted, so we suppress messages when we should be noisy and shout when we should be silent.... Cheers, Dave. 
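A small user-space sketch of the inversion described above, not the kernel code itself. DEMO_MFSI_QUIET, demo_warn() and the demo_* helpers are hypothetical stand-ins for XFS_MFSI_QUIET and cmn_err(), and the flag value is made up for illustration:

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_MFSI_QUIET 0x40	/* hypothetical flag value, illustration only */

static int demo_warnings;	/* number of messages actually emitted */

/* Stand-in for cmn_err(CE_WARN, ...): print and count the message. */
static void demo_warn(const char *msg)
{
	fprintf(stderr, "XFS: %s\n", msg);
	demo_warnings++;
}

/* Old logic: the message is emitted only when QUIET is set (inverted). */
#define demo_mount_err_old(f, msg) \
	(((f) & DEMO_MFSI_QUIET) ? demo_warn(msg) : (void)0)

/* Fixed logic: the message is suppressed only when QUIET is set. */
#define demo_mount_err_new(f, msg) \
	(((f) & DEMO_MFSI_QUIET) ? (void)0 : demo_warn(msg))

/* Return how many messages one failed check emits under each variant. */
static int demo_count_old(int flags)
{
	demo_warnings = 0;
	demo_mount_err_old(flags, "bad magic number");
	return demo_warnings;
}

static int demo_count_new(int flags)
{
	demo_warnings = 0;
	demo_mount_err_new(flags, "bad magic number");
	return demo_warnings;
}

/* Self-check: the two variants always disagree about whether to print. */
static void demo_self_check(void)
{
	assert(demo_count_old(0) != demo_count_new(0));
	assert(demo_count_old(DEMO_MFSI_QUIET) !=
	       demo_count_new(DEMO_MFSI_QUIET));
}
```

With the old variant a normal (non-quiet) mount stays silent on failure, which matches the missing dmesg output reported in this message; swapping the two arms of the conditional is the whole fix.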
-- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/xfs_error.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_error.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_error.h 2006-10-17 12:17:25.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_error.h 2006-11-16 09:45:03.444451972 +1100 @@ -180,6 +180,6 @@ extern void xfs_fs_cmn_err(int level, st xfs_fs_cmn_err(level, mp, fmt " Unmount and run xfs_repair.", ## args) #define xfs_fs_mount_cmn_err(f, fmt, args...) \ - ((f & XFS_MFSI_QUIET)? cmn_err(CE_WARN, "XFS: " fmt, ## args) : (void)0) + ((f & XFS_MFSI_QUIET)? (void)0 : cmn_err(CE_WARN, "XFS: " fmt, ## args)) #endif /* __XFS_ERROR_H__ */ From owner-xfs@oss.sgi.com Tue Nov 21 20:50:38 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 20:50:45 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAM4oYaG032562 for ; Tue, 21 Nov 2006 20:50:37 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAB29789; Wed, 22 Nov 2006 15:49:41 +1100 Date: Wed, 22 Nov 2006 15:51:59 +1100 From: Timothy Shimmin To: David Chinner , xfs-dev@sgi.com cc: xfs@oss.sgi.com Subject: Re: Review: Fix inverted quiet mount logic Message-ID: <7013720DE121BA0C2CA1A09D@timothy-shimmins-power-mac-g5.local> In-Reply-To: <20061122044656.GS37654165@melbourne.sgi.com> References: <20061122044656.GS37654165@melbourne.sgi.com> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9732 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: 
xfs Content-Length: 1255 Lines: 40 Add me to the reviewer list :) --Tim --On 22 November 2006 3:46:56 PM +1100 David Chinner wrote: > > Simple problem Noticed by Eric Sandeen at about the same time I did > - we are not getting error messages in dmesg when certain checks > fail during mount. Turns out the XFS_MFSI_QUIET flag usage is > inverted, so we suppress messages when we should be noisy and > shout when we should be silent.... > > Cheers, > > Dave. > -- > Dave Chinner > Principal Engineer > SGI Australian Software Group > > --- > fs/xfs/xfs_error.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Index: 2.6.x-xfs-new/fs/xfs/xfs_error.h > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_error.h 2006-10-17 12:17:25.000000000 +1000 > +++ 2.6.x-xfs-new/fs/xfs/xfs_error.h 2006-11-16 09:45:03.444451972 +1100 > @@ -180,6 +180,6 @@ extern void xfs_fs_cmn_err(int level, st > xfs_fs_cmn_err(level, mp, fmt " Unmount and run xfs_repair.", ## args) > > #define xfs_fs_mount_cmn_err(f, fmt, args...) \ > - ((f & XFS_MFSI_QUIET)? cmn_err(CE_WARN, "XFS: " fmt, ## args) : (void)0) > + ((f & XFS_MFSI_QUIET)? 
(void)0 : cmn_err(CE_WARN, "XFS: " fmt, ## args)) > > #endif /* __XFS_ERROR_H__ */ From owner-xfs@oss.sgi.com Tue Nov 21 20:54:59 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 20:55:07 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAM4ssaG001087 for ; Tue, 21 Nov 2006 20:54:57 -0800 Received: from [134.14.55.18] (dhcp18.melbourne.sgi.com [134.14.55.18]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA29946; Wed, 22 Nov 2006 15:53:53 +1100 Message-ID: <4563D7DD.1060907@melbourne.sgi.com> Date: Wed, 22 Nov 2006 15:53:49 +1100 From: David Chatterton Reply-To: chatz@melbourne.sgi.com Organization: SGI User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: David Chinner CC: Russell Cattelan , Tim Shimmin , Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH 1/2] Make stuff static References: <45338DDE.8020903@sandeen.net> <4533FAEA.2080500@sandeen.net> <20061016232250.GM11034@melbourne.sgi.com> <1161042943.5723.117.camel@xenon.msp.redhat.com> <20061017005038.GN11034@melbourne.sgi.com> <20061017215706.GI8394166@melbourne.sgi.com> <1161125131.5723.158.camel@xenon.msp.redhat.com> <20061122004216.GT11034@melbourne.sgi.com> <1164157783.19915.46.camel@xenon.msp.redhat.com> <20061122042445.GR37654165@melbourne.sgi.com> In-Reply-To: <20061122042445.GR37654165@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 9733 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: chatz@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 3515 Lines: 91 David Chinner wrote: > On Tue, Nov 21, 2006 at 07:09:43PM -0600, Russell Cattelan wrote: >> On Wed, 2006-11-22 at 11:42 +1100, David Chinner wrote: >>> Ok, so I've had time to look at this again. 
Here's the definitions
>>> of STATIC and STATIC_INLINE for debug and nondebug from the
>>> patch (whitespace damaged):
> .....
>>> Is this acceptable to everyone?
>> Yup.
>>
>>> FWIW, there is one other thing that this conversion causes
>>> problems with, and that's variable definitions. i.e. we can't
>>> use STATIC on them any more because of the "noinline" attribute
>>> it has. Do we care about this and if so, any suggestions on
>>> how to keep this functionality (a different STATIC_xxx define
>>> for structures)?
>> So I know things like systemtap, kgdb and oprofile all work better when
>> functions are not static, but what about variables/structures?
>> Do things really get that confused?
>> Maybe we shouldn't worry about conditioning them and just make them
>> static
>
> Ok, so I'd already converted them to static where necessary.
> Attached is the complete patch. On ia64, the size of the xfs.ko
> and xfs_quota.ko modules decreases with this patch:
>
> Orig:
>
> -rw-rw-r-- 1 dgc ptg 1662416 2006-11-22 14:41 fs/xfs/quota/xfs_quota.ko
> -rw-rw-r-- 1 dgc ptg 856748 2006-11-22 14:41 fs/xfs/xfsidbg.ko
> -rw-rw-r-- 1 dgc ptg 13614719 2006-11-22 14:41 fs/xfs/xfs.ko
>
> With patch:
>
> -rw-rw-r-- 1 dgc ptg 1657814 2006-11-22 14:42 fs/xfs/quota/xfs_quota.ko
> -rw-rw-r-- 1 dgc ptg 856748 2006-11-22 14:42 fs/xfs/xfsidbg.ko
> -rw-rw-r-- 1 dgc ptg 13557579 2006-11-22 14:42 fs/xfs/xfs.ko
>
> The original top 10 stack users:
>
> 0x000e10c6 xfs_vn_mknod [xfs]: 576
> 0x000ddfe6 xfs_ioctl [xfs]: 368
> 0x000e1f46 xfs_vn_symlink [xfs]: 368
> 0x000345a6 xfs_bmapi [xfs]: 320
> 0x000b1146 _xfs_trans_commit [xfs]: 272
> 0x000c59c6 xfs_change_file_space [xfs]: 272
> 0x0003a6a6 xfs_bunmapi [xfs]: 240
> 0x000afa06 xfs_trans_unreserve_and_mod_sb [xfs]: 224
> 0x00040626 xfs_bmbt_insert [xfs]: 192
> 0x0008be26 xfs_iomap_write_delay [xfs]: 192
>
> [64 functions with stack usage larger than 100 bytes]
>
> With patch:
>
> 0x000b7c46 _xfs_trans_commit [xfs]: 272
> 0x000b5426
xfs_trans_unreserve_and_mod_sb [xfs]: 224 > 0x000e4106 xfs_find_handle [xfs]: 224 > 0x000396c6 xfs_bmapi [xfs]: 208 > 0x00090066 xfs_iomap_write_delay [xfs]: 208 > 0x000e9046 xfs_cleanup_inode [xfs]: 208 > 0x000058c6 xfs_acl_setmode [xfs]: 160 > 0x00005f46 xfs_acl_allow_set [xfs]: 160 > 0x000067c6 xfs_acl_vtoacl [xfs]: 160 > 0x00007366 xfs_acl_vget [xfs]: 160 > > [69 functions with stack usage larger than 100 bytes] > > Performance appears to be slight faster with the noinline > patch, but the variation is within the error margins of > my measurements so I'd say it's neutral. > > Comments? > Just reducing xfs_bmapi by 118 bytes makes this worthwhile doesn't it? Out of interest, what estimated improvement does this have on one of Jesper's stacks? Should we be concerned that there are now more functions with 100 or more bytes? David -- David Chatterton XFS Engineering Manager SGI Australia From owner-xfs@oss.sgi.com Tue Nov 21 21:03:06 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 21:03:14 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAM532aG003886 for ; Tue, 21 Nov 2006 21:03:04 -0800 Received: from [134.14.55.18] (dhcp18.melbourne.sgi.com [134.14.55.18]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA00345; Wed, 22 Nov 2006 16:02:08 +1100 Message-ID: <4563D9CC.4030006@melbourne.sgi.com> Date: Wed, 22 Nov 2006 16:02:04 +1100 From: David Chatterton Reply-To: chatz@melbourne.sgi.com Organization: SGI User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Timothy Shimmin CC: David Chinner , xfs-dev@sgi.com, xfs@oss.sgi.com Subject: Re: Review: Fix inverted quiet mount logic References: <20061122044656.GS37654165@melbourne.sgi.com> <7013720DE121BA0C2CA1A09D@timothy-shimmins-power-mac-g5.local> In-Reply-To: <7013720DE121BA0C2CA1A09D@timothy-shimmins-power-mac-g5.local> Content-Type: text/plain; 
charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 9734 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: chatz@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 1507 Lines: 54 One of the bugs we found on the training course that I had not raised yet...thanks! Timothy Shimmin wrote: > Add me to the reviewer list :) > --Tim > > --On 22 November 2006 3:46:56 PM +1100 David Chinner wrote: > >> >> Simple problem Noticed by Eric Sandeen at about the same time I did >> - we are not getting error messages in dmesg when certain checks >> fail during mount. Turns out the XFS_MFSI_QUIET flag usage is >> inverted, so we suppress messages when we should be noisy and >> shout when we should be silent.... >> >> Cheers, >> >> Dave. >> -- >> Dave Chinner >> Principal Engineer >> SGI Australian Software Group >> >> --- >> fs/xfs/xfs_error.h | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> Index: 2.6.x-xfs-new/fs/xfs/xfs_error.h >> =================================================================== >> --- 2.6.x-xfs-new.orig/fs/xfs/xfs_error.h 2006-10-17 >> 12:17:25.000000000 +1000 >> +++ 2.6.x-xfs-new/fs/xfs/xfs_error.h 2006-11-16 09:45:03.444451972 >> +1100 >> @@ -180,6 +180,6 @@ extern void xfs_fs_cmn_err(int level, st >> xfs_fs_cmn_err(level, mp, fmt " Unmount and run xfs_repair.", ## >> args) >> >> #define xfs_fs_mount_cmn_err(f, fmt, args...) \ >> - ((f & XFS_MFSI_QUIET)? cmn_err(CE_WARN, "XFS: " fmt, ## args) : >> (void)0) >> + ((f & XFS_MFSI_QUIET)? 
(void)0 : cmn_err(CE_WARN, "XFS: " fmt, ## >> args)) >> >> #endif /* __XFS_ERROR_H__ */ > > > > -- David Chatterton XFS Engineering Manager SGI Australia From owner-xfs@oss.sgi.com Tue Nov 21 21:16:16 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 21 Nov 2006 21:16:25 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAM5GFaG006202 for ; Tue, 21 Nov 2006 21:16:16 -0800 Received: from [192.168.1.4] (c-68-55-210-74.hsd1.dc.comcast.net [68.55.210.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 6E1FF18E21536; Tue, 21 Nov 2006 23:15:27 -0600 (CST) Message-ID: <4563DCE8.6050701@sandeen.net> Date: Tue, 21 Nov 2006 23:15:20 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: David Chinner CC: xfs-dev@sgi.com, xfs@oss.sgi.com Subject: Re: Review: Fix inverted quiet mount logic References: <20061122044656.GS37654165@melbourne.sgi.com> In-Reply-To: <20061122044656.GS37654165@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9735 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 362 Lines: 15 David Chinner wrote: > Simple problem Noticed by Eric Sandeen at about the same time I did > - we are not getting error messages in dmesg when certain checks > fail during mount. Turns out the XFS_MFSI_QUIET flag usage is > inverted, so we suppress messages when we should be noisy and > shout when we should be silent.... > > Cheers, > > Dave. ACK. 
-Eric From owner-xfs@oss.sgi.com Wed Nov 22 00:58:06 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 00:58:14 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAM8w2aG005540 for ; Wed, 22 Nov 2006 00:58:05 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id TAA06932; Wed, 22 Nov 2006 19:57:04 +1100 Date: Wed, 22 Nov 2006 19:59:22 +1100 From: Timothy Shimmin To: Russell Cattelan cc: Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> In-Reply-To: <1164157336.19915.43.camel@xenon.msp.redhat.com> References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9736 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 2132 Lines: 74 Thanks, Russell. I've been going thru the irc and just started looking at the patch. I'll get back to you about it tomorrow. I agree it would be good to have the fixed forkoff for data btree roots as the first fix. And look into redoing the btree root for a later change. Cheers, Tim. --On 21 November 2006 7:02:16 PM -0600 Russell Cattelan wrote: > On Mon, 2006-11-20 at 22:02 -0600, Eric Sandeen wrote: >> Eric Sandeen wrote: >> >> > Eric Sandeen wrote: >> > >> >> ugh. 
it's broken on x86 too, so it's not just the alignment/padding,
>> >>
>> >> although that should be fixed for cross-arch mounts.
>> >>
>> >> -Eric
>> >>
>> >>
>> > here's a testcase to corrupt it FWIW.
>> >
>> >
>> Ok, with expert collaboration from Russell, Barry, Tim,
>> Nathan, David, et al, how about this:
>>
>> For btree dirs, we need a different calculation for the space
>> used in di_u, to set the minimum threshold for the fork offset...
>>
>> This fixes my testcase, but as Tim points out -now- we need to compact
>> the btree ptrs, if we return (and use) an offset < current forkoff...
>>
>> whee....
>>
>> -Eric
>>
> It turns out this only fixes one of the problems; it is still quite easy
> to corrupt inodes with attr2.
>
> The following patch is a short-term fix that addresses the problem of
> forkoff moving without re-factoring the inode's btree root block.
>
> Once the inode has been flipped to BTREE for the data space, the forkoff
> is fixed to that size. Currently, due to the way attr1 worked (fixed size
> forkoff), the code does not handle changes to the size of the root btree
> node caused by size changes in the attr portion of the inode.
>
> The optimal solution is to adjust the data portion of the inode root
> btree block down if space exists.
>
> One easy fix, for a problem that resulted in every attr add being pushed
> out of line, is to add the header size to the initial split of the inode;
> at least the first attr add should go inline now. That should be a win
> for the big attr user right now, SELinux.
>
> Including the two test scripts that have been used.
> > > > -- > Russell Cattelan From owner-xfs@oss.sgi.com Wed Nov 22 07:44:51 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 07:45:00 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAMFioaG006557 for ; Wed, 22 Nov 2006 07:44:51 -0800 Received: from [192.168.1.4] (c-68-55-210-74.hsd1.dc.comcast.net [68.55.210.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 55D2918E2156D; Wed, 22 Nov 2006 09:44:02 -0600 (CST) Message-ID: <45647042.2040604@sandeen.net> Date: Wed, 22 Nov 2006 09:44:02 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Timothy Shimmin CC: Russell Cattelan , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> In-Reply-To: <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9742 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 389 Lines: 13 Timothy Shimmin wrote: > Thanks, Russell. > > I've been going thru the irc and just started looking at the patch. > I'll get back to you about it tomorrow. > > I agree it would be good to have the fixed forkoff for data btree roots > as the first fix. And look into redoing the btree root for a later change. My only question is, how much does this defeat the purpose of attr2? 
-Eric From owner-xfs@oss.sgi.com Wed Nov 22 08:14:05 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 08:14:13 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAMGE3aG010972 for ; Wed, 22 Nov 2006 08:14:05 -0800 Received: from [192.168.1.4] (c-68-55-210-74.hsd1.dc.comcast.net [68.55.210.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id CCB4A18E21536; Wed, 22 Nov 2006 10:13:15 -0600 (CST) Message-ID: <4564771B.4040004@sandeen.net> Date: Wed, 22 Nov 2006 10:13:15 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: chatz@melbourne.sgi.com CC: David Chinner , Russell Cattelan , Tim Shimmin , xfs@oss.sgi.com Subject: Re: [PATCH 1/2] Make stuff static References: <45338DDE.8020903@sandeen.net> <4533FAEA.2080500@sandeen.net> <20061016232250.GM11034@melbourne.sgi.com> <1161042943.5723.117.camel@xenon.msp.redhat.com> <20061017005038.GN11034@melbourne.sgi.com> <20061017215706.GI8394166@melbourne.sgi.com> <1161125131.5723.158.camel@xenon.msp.redhat.com> <20061122004216.GT11034@melbourne.sgi.com> <1164157783.19915.46.camel@xenon.msp.redhat.com> <20061122042445.GR37654165@melbourne.sgi.com> <4563D7DD.1060907@melbourne.sgi.com> In-Reply-To: <4563D7DD.1060907@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9743 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 838 Lines: 28 David Chatterton wrote: > > David Chinner wrote: >> Comments? >> > > Just reducing xfs_bmapi by 118 bytes makes this worthwhile doesn't it? > > Out of interest, what estimated improvement does this have on one of Jesper's > stacks? 
> > Should we be concerned that there are now more functions with 100 or more bytes? I don't think we need to worry about that, it is probably in the noise. There will almost certainly be fallout from this change w.r.t. 4k stacks. It should probably at least be tested on 4k stacks over a fairly complex volume setup to see. Also with respect to stack usage, is there extra stack space used, in addition to the explicit %esp adjustments, to set up each function call? IOW is the total more than the sum of the parts? :) I'm glad to hear that there's no apparent performance penalty.... -eric From owner-xfs@oss.sgi.com Wed Nov 22 08:26:19 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 08:26:25 -0800 (PST) Received: from slurp.thebarn.com (cattelan-host202.dsl.visi.com [208.42.117.202]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAMGQHaG017799 for ; Wed, 22 Nov 2006 08:26:18 -0800 Received: from [127.0.0.1] (lupo.thebarn.com [10.0.0.10]) (authenticated bits=0) by slurp.thebarn.com (8.13.8/8.13.8) with ESMTP id kAMGOtn5032284; Wed, 22 Nov 2006 10:25:29 -0600 (CST) (envelope-from cattelan@thebarn.com) Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches From: Russell Cattelan To: Eric Sandeen Cc: Timothy Shimmin , xfs@oss.sgi.com In-Reply-To: <45647042.2040604@sandeen.net> References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> <45647042.2040604@sandeen.net> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-OteeeT/tPu66SAzVEb3c" Date: Wed, 22 Nov 2006 10:24:55 -0600 Message-Id: <1164212695.19915.65.camel@xenon.msp.redhat.com> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1.1-1mdv2007.1 X-archive-position: 9744 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com
X-original-sender: cattelan@thebarn.com
Precedence: bulk
X-list: xfs
Content-Length: 1869
Lines: 57

--=-OteeeT/tPu66SAzVEb3c
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

On Wed, 2006-11-22 at 09:44 -0600, Eric Sandeen wrote:
> Timothy Shimmin wrote:
> > Thanks, Russell.
> >
> > I've been going thru the irc and just started looking at the patch.
> > I'll get back to you about it tomorrow.
> >
> > I agree it would be good to have the fixed forkoff for data btree roots
> > as the first fix. And look into redoing the btree root for a later change.
>
> My only question is, how much does this defeat the purpose of attr2?
Well, from the standpoint that attr2 currently corrupts inodes, anything
to prevent that is good, since currently attr2 can't be used at all.
When the di_u is extent based the attr2 code works as expected, giving
space to whichever segment gets there first. The attr2 code should still
be a big win for most file/dir inodes since they are probably able to do
their block mapping with local or extent mode.
The number of inodes that get pushed to btree mode should be a small % of the
total number of inodes, especially on a root file system. So while attr2 is
not as efficient as it could be for that segment of the inodes, the rest of
the inodes do benefit from attr2.

By fixing the initial size calculation, at least things like SELinux,
which adds one attr, won't cause the attr segment to flip to extents
immediately.
The second attr will cause the flip but not the first one.
>=20 > -Eric >=20 --=20 Russell Cattelan --=-OteeeT/tPu66SAzVEb3c Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (GNU/Linux) iD8DBQBFZHnXNRmM+OaGhBgRAiUmAJ9EcLBfckkOk8ceY+ZQkapTN3mvAACfXO4x d898C9V4nXS0QlWhM9wWzU4= =Wxjo -----END PGP SIGNATURE----- --=-OteeeT/tPu66SAzVEb3c-- From owner-xfs@oss.sgi.com Wed Nov 22 08:39:06 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 08:39:13 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAMGd4aG020168 for ; Wed, 22 Nov 2006 08:39:05 -0800 Received: from [192.168.1.4] (c-68-55-210-74.hsd1.dc.comcast.net [68.55.210.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id C5E7018E2156D; Wed, 22 Nov 2006 10:38:16 -0600 (CST) Message-ID: <45647CF8.8020104@sandeen.net> Date: Wed, 22 Nov 2006 10:38:16 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Russell Cattelan CC: Timothy Shimmin , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> <45647042.2040604@sandeen.net> <1164212695.19915.65.camel@xenon.msp.redhat.com> In-Reply-To: <1164212695.19915.65.camel@xenon.msp.redhat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9745 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 2008 Lines: 51 Russell 
Cattelan wrote:
> On Wed, 2006-11-22 at 09:44 -0600, Eric Sandeen wrote:
>> Timothy Shimmin wrote:
>>> Thanks, Russell.
>>>
>>> I've been going thru the irc and just started looking at the patch.
>>> I'll get back to you about it tomorrow.
>>>
>>> I agree it would be good to have the fixed forkoff for data btree roots
>>> as the first fix. And look into redoing the btree root for a later change.
>> My only question is, how much does this defeat the purpose of attr2?
> Well from the standpoint that attr2 currently corrupts inodes anything
> to prevent that is good, since currently attr2 can't be used at all.
> When the di_u is extent based the attr2 code works as expected, giving
> space to whichever segment gets there first. The attr2 code should still
> be a big win for most file/dir inodes since they are probably able to do
> their block mapping with local or extent mode.

Yeah, that's probably true.

> The number of inodes that get pushed to btree mode should be a small %
> of the total number of inodes, especially on a root file system. So while
> attr2 is not as efficient as it could be for that segment of the inodes
> the rest of inodes do benefit from attr2
>
> By fixing the initial size calculation at least things like SElinux
> which is adding one attr won't cause the attr segment to flip to extents
> immediately.
> The second attr will cause the flip but not the first one.

I'd say this part (fixing up proper space for the initial attr fork
setup) should probably go in soon if it gets good reviews (with the
removal of the extra tests, as we discussed on irc last night). I think
this proper change stands on its own just fine.

The rest of the patch... I'd rather not confuse the functional changes
with your rearrangement of return locations (the new gotos etc) but
that's just me.
I think the bytesfit() fixup is probably good too, with your short-term
addition of "if forkoff exists with btree data then it cannot move"

-Eric

>
>> -Eric
>>

From owner-xfs@oss.sgi.com Wed Nov 22 12:40:36 2006
Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 12:40:44 -0800 (PST)
Received: from mail.groll.co.za (mail.groll.co.za [67.18.176.185]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAMKeYaG018814 for ; Wed, 22 Nov 2006 12:40:36 -0800
Received: by mail.groll.co.za (Postfix, from userid 1004) id 4853A258F8; Wed, 22 Nov 2006 22:08:11 +0200 (SAST)
Date: Wed, 22 Nov 2006 22:08:11 +0200
From: Jonathan Groll
To: xfs@oss.sgi.com
Subject: Unexpected inode type 0160000 causes abort of xfs_repair
Message-ID: <20061122200811.GB2493@groll.co.za>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.9i (Linux mail 2.4.29-linode39-1um i686)
X-archive-position: 9749
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: lists@groll.co.za
Precedence: bulk
X-list: xfs
Content-Length: 531
Lines: 20

I'm unsuccessfully trying to repair an XFS filesystem

xfs_repair /dev/md1 and the same with -L both end with

bad (negative) size -2150115811482766770 on inode 1539849750
cleared inode 1539849750
bad magic number 0x859b on inode 1539849751, resetting magic number
bad version number 0xffffff88 on inode 1539849751, resetting version number
Unexpected inode type 0160000 inode 1539849751
Aborted

OS: debian sarge (stable)
Kernel: 2.6.15.7
xfsprogs 2.8.11-1

Is there anything I can possibly do?
Many thanks, Jonathan Groll From owner-xfs@oss.sgi.com Wed Nov 22 13:43:05 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 13:43:11 -0800 (PST) Received: from slurp.thebarn.com (cattelan-host202.dsl.visi.com [208.42.117.202]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAMLh3aG025337 for ; Wed, 22 Nov 2006 13:43:05 -0800 Received: from [127.0.0.1] (lupo.thebarn.com [10.0.0.10]) (authenticated bits=0) by slurp.thebarn.com (8.13.8/8.13.8) with ESMTP id kAMLfuD4041449; Wed, 22 Nov 2006 15:42:05 -0600 (CST) (envelope-from cattelan@thebarn.com) Subject: Re: XFS CORRUPTION 2.6.17.13? From: Russell Cattelan To: Justin Piszcz Cc: xfs@oss.sgi.com In-Reply-To: References: Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-3cgue0QeGJhKbvmyNhDp" Date: Wed, 22 Nov 2006 15:41:56 -0600 Message-Id: <1164231716.19915.68.camel@xenon.msp.redhat.com> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1.1-1mdv2007.1 X-archive-position: 9750 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cattelan@thebarn.com Precedence: bulk X-list: xfs Content-Length: 3189 Lines: 95 --=-3cgue0QeGJhKbvmyNhDp Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Mon, 2006-11-20 at 15:00 -0500, Justin Piszcz wrote: > Anyone know what could cause this? > Is the last good kernel to use 2.6.17.6 w/XFS bugfix patch? > > Was a new bug introduced from 2.6.16.6 -> 2.6.17.13? you must have missed the "new bugs introduced in this release" :-) There is no info here to go on. run xfs_repair -n if it looks like an existing bug add to it if not open a new bug and attach repair output to it. > Nov 20 13:16:58 box [4299533.469000] 0x0: 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > 00 00 00 > Nov 20 13:16:58 box [4299533.469000] Filesystem "hda2": XFS internal error > xfs_da_do_buf(2) at line 2212 of file fs/xfs/xfs_da_btree.c. 
Caller > 0xc01ffcad > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box xfs_corruption_error+0xf2/0x11a > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box kmem_zone_alloc+0x60/0xdd > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_buf_make+0xf7/0x14c > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box xfs_da_do_buf+0x935/0x98d > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box __alloc_pages+0x53/0x2d6 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box xfs_da_node_lookup_int+0xd0/0x399 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_da_node_lookup_int+0xd0/0x399 > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.469000] > Nov 20 13:16:58 box xfs_dir2_node_lookup+0x3f/0xb9 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_dir2_lookup+0x137/0x139 > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.470000] > Nov 20 13:16:58 box __alloc_pages+0x53/0x2d6 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_dir_lookup_int+0x40/0x125 > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.470000] > Nov 20 13:16:58 box xfs_lookup+0x5f/0x88 > Nov 20 13:16:58 box > Nov 20 13:16:58 box > Nov 20 13:16:58 box xfs_vn_lookup+0x4f/0x93 > Nov 20 13:16:58 box > Nov 20 13:16:58 box [4299533.470000] > Nov 20 13:16:58 box do_lookup+0x12d/0x15f > Nov 20 13:16:58 box > Nov 20 13:16:58 box > -- Russell Cattelan --=-3cgue0QeGJhKbvmyNhDp Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: 
GnuPG v1.4.5 (GNU/Linux) iD8DBQBFZMQkNRmM+OaGhBgRAgWQAJ4ljO7Kbak+LBup0wq3v0CYJPQIgACeIF53 9RsS/dHd/gUbWCH9TE+ubE8= =H/TL -----END PGP SIGNATURE----- --=-3cgue0QeGJhKbvmyNhDp-- From owner-xfs@oss.sgi.com Wed Nov 22 20:42:23 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 20:42:30 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAN4gIaG014008 for ; Wed, 22 Nov 2006 20:42:20 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA14495; Thu, 23 Nov 2006 15:41:24 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAN4fN7Y43605714; Thu, 23 Nov 2006 15:41:23 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAN4fMAu43454170; Thu, 23 Nov 2006 15:41:22 +1100 (AEDT) Date: Thu, 23 Nov 2006 15:41:22 +1100 From: David Chinner To: xfs-dev@sgi.com Cc: xfs@oss.sgi.com Subject: Review: Reduce in-core superblock lock contention near ENOSPC Message-ID: <20061123044122.GU11034@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-archive-position: 9753 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 16073 Lines: 482 The existing per-cpu superblock counter code still uses the existing superblock spin lock when we approach ENOSPC for global synchronisation. On larger machines than this code was originally tested on we've found that we can still get catastrophic spinlock contention as we near ENOSPC due to the frequency of rebalances increasing. 
The following patch prevents the case of tens of CPUs spinning on the incore superblock lock as rebalances fly around. This is done mainly by introducing a sleeping lock that is used to serialise balances and modifications near ENOSPC. While this does not prevent contention and serialisation, it prevents us from wasting the CPU time of potentially hundreds of CPUs. This patch also reduces the number of balances occurring by separating the "need rebalance" case from the "slow allocate" case. Previously, when a per-cpu counter ran dry, we would lock the superblock, disable the counters, rebalance and retry the modification. If we failed a second time we'd disable the per-cpu counter and use the global slowpath. However, while we had the counters disabled other threads could run and would then sit waiting in the slow path on the global superblock lock, waiting to do a rebalance. IOWs, near ENOSPC we can end up with lots of CPUs waiting to do a rebalance and then executing a rebalance even though it is not necessary. Now, a counter running dry will trigger a rebalance during which counters are disabled. Any thread that sees a disabled counter enters a different path where it waits on the new mutex. When it gets the new mutex, it checks if the counter is disabled. If the counter is disabled, then we _know_ that we have to use the global counter and lock and it is safe to do so immediately. Otherwise, we drop the mutex and go back to trying the per-cpu counters which we know were re-enabled. IOWs, we only do a single rebalance for each counter that runs dry and we don't get a stampeding herd of rebalances on large CPU count machines and the subsequent problems that spinlock contention will cause. This patch also fixes a rebalance loop that can occur when we try to reserve more than any per-cpu counter holds while the aggregate free space is sufficient for a rebalance to always redistribute the space across the per-cpu counters. 
It does so by ensuring that the minimum amount on each counter as a result of a rebalance is sufficient to satisfy the request, or it falls back to the global slow path earlier than it otherwise would. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/xfs_mount.c | 234 +++++++++++++++++++++++++++++++---------------------- fs/xfs/xfs_mount.h | 1 2 files changed, 142 insertions(+), 93 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.c 2006-10-19 10:29:35.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.c 2006-10-19 10:32:16.626827226 +1000 @@ -52,21 +52,19 @@ STATIC void xfs_unmountfs_wait(xfs_mount #ifdef HAVE_PERCPU_SB STATIC void xfs_icsb_destroy_counters(xfs_mount_t *); -STATIC void xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t, int); +STATIC void xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t, int, +int); STATIC void xfs_icsb_sync_counters(xfs_mount_t *); STATIC int xfs_icsb_modify_counters(xfs_mount_t *, xfs_sb_field_t, int64_t, int); -STATIC int xfs_icsb_modify_counters_locked(xfs_mount_t *, xfs_sb_field_t, - int64_t, int); STATIC int xfs_icsb_disable_counter(xfs_mount_t *, xfs_sb_field_t); #else #define xfs_icsb_destroy_counters(mp) do { } while (0) -#define xfs_icsb_balance_counter(mp, a, b) do { } while (0) +#define xfs_icsb_balance_counter(mp, a, b, c) do { } while (0) #define xfs_icsb_sync_counters(mp) do { } while (0) #define xfs_icsb_modify_counters(mp, a, b, c) do { } while (0) -#define xfs_icsb_modify_counters_locked(mp, a, b, c) do { } while (0) #endif @@ -540,9 +538,11 @@ xfs_readsb(xfs_mount_t *mp, int flags) ASSERT(XFS_BUF_VALUSEMA(bp) <= 0); } - xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0); - xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0); - xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0); + mutex_lock(&mp->m_icsb_mutex); + xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0, 
0); + xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0, 0); + mutex_unlock(&mp->m_icsb_mutex); mp->m_sb_bp = bp; xfs_buf_relse(bp); @@ -1479,9 +1479,11 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp, case XFS_SBS_IFREE: case XFS_SBS_FDBLOCKS: if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) { - status = xfs_icsb_modify_counters_locked(mp, + XFS_SB_UNLOCK(mp, s); + status = xfs_icsb_modify_counters(mp, msbp->msb_field, msbp->msb_delta, rsvd); + s = XFS_SB_LOCK(mp); break; } /* FALLTHROUGH */ @@ -1515,11 +1517,12 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp, case XFS_SBS_IFREE: case XFS_SBS_FDBLOCKS: if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) { - status = - xfs_icsb_modify_counters_locked(mp, + XFS_SB_UNLOCK(mp, s); + status = xfs_icsb_modify_counters(mp, msbp->msb_field, -(msbp->msb_delta), rsvd); + s = XFS_SB_LOCK(mp); break; } /* FALLTHROUGH */ @@ -1727,14 +1730,17 @@ xfs_icsb_cpu_notify( memset(cntp, 0, sizeof(xfs_icsb_cnts_t)); break; case CPU_ONLINE: - xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0); - xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0); - xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0); + mutex_lock(&mp->m_icsb_mutex); + xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0, 0); + mutex_unlock(&mp->m_icsb_mutex); break; case CPU_DEAD: /* Disable all the counters, then fold the dead cpu's * count into the total on the global superblock and * re-enable the counters. 
*/ + mutex_lock(&mp->m_icsb_mutex); s = XFS_SB_LOCK(mp); xfs_icsb_disable_counter(mp, XFS_SBS_ICOUNT); xfs_icsb_disable_counter(mp, XFS_SBS_IFREE); @@ -1746,10 +1752,14 @@ xfs_icsb_cpu_notify( memset(cntp, 0, sizeof(xfs_icsb_cnts_t)); - xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, XFS_ICSB_SB_LOCKED); - xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, XFS_ICSB_SB_LOCKED); - xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, XFS_ICSB_SB_LOCKED); + xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, + XFS_ICSB_SB_LOCKED, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, + XFS_ICSB_SB_LOCKED, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, + XFS_ICSB_SB_LOCKED, 0); XFS_SB_UNLOCK(mp, s); + mutex_unlock(&mp->m_icsb_mutex); break; } @@ -1778,6 +1788,9 @@ xfs_icsb_init_counters( cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i); memset(cntp, 0, sizeof(xfs_icsb_cnts_t)); } + + mutex_init(&mp->m_icsb_mutex); + /* * start with all counters disabled so that the * initial balance kicks us off correctly @@ -1882,6 +1895,17 @@ xfs_icsb_disable_counter( ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS)); + /* + * If we are already disabled, then there is nothing to do + * here. We check before locking all the counters to avoid + * the expensive lock operation when being called in the + * slow path and the counter is already disabled. This is + * safe because the only time we set or clear this state is under + * the m_icsb_mutex. + */ + if (xfs_icsb_counter_disabled(mp, field)) + return 0; + xfs_icsb_lock_all_counters(mp); if (!test_and_set_bit(field, &mp->m_icsb_counters)) { /* drain back to superblock */ @@ -1991,24 +2015,33 @@ xfs_icsb_sync_counters_lazy( /* * Balance and enable/disable counters as necessary. * - * Thresholds for re-enabling counters are somewhat magic. 
- * inode counts are chosen to be the same number as single - * on disk allocation chunk per CPU, and free blocks is - * something far enough zero that we aren't going thrash - * when we get near ENOSPC. + * Thresholds for re-enabling counters are somewhat magic. inode counts are + * chosen to be the same number as single on disk allocation chunk per CPU, and + * free blocks is something far enough zero that we aren't going thrash when we + * get near ENOSPC. We also need to supply a minimum we require per cpu to + * prevent looping endlessly when xfs_alloc_space asks for more than will + * be distributed to a single CPU but each CPU has enough blocks to be + * reenabled. + * + * Note that we can be called when counters are already disabled. + * xfs_icsb_disable_counter() optimises the counter locking in this case to + * prevent locking every per-cpu counter needlessly. */ -#define XFS_ICSB_INO_CNTR_REENABLE 64 + +#define XFS_ICSB_INO_CNTR_REENABLE (uint64_t)64 #define XFS_ICSB_FDBLK_CNTR_REENABLE(mp) \ - (512 + XFS_ALLOC_SET_ASIDE(mp)) + (uint64_t)(512 + XFS_ALLOC_SET_ASIDE(mp)) STATIC void xfs_icsb_balance_counter( xfs_mount_t *mp, xfs_sb_field_t field, - int flags) + int flags, + int min_per_cpu) { uint64_t count, resid; int weight = num_online_cpus(); int s; + uint64_t min = (uint64_t)min_per_cpu; if (!(flags & XFS_ICSB_SB_LOCKED)) s = XFS_SB_LOCK(mp); @@ -2021,19 +2054,19 @@ xfs_icsb_balance_counter( case XFS_SBS_ICOUNT: count = mp->m_sb.sb_icount; resid = do_div(count, weight); - if (count < XFS_ICSB_INO_CNTR_REENABLE) + if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE)) goto out; break; case XFS_SBS_IFREE: count = mp->m_sb.sb_ifree; resid = do_div(count, weight); - if (count < XFS_ICSB_INO_CNTR_REENABLE) + if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE)) goto out; break; case XFS_SBS_FDBLOCKS: count = mp->m_sb.sb_fdblocks; resid = do_div(count, weight); - if (count < XFS_ICSB_FDBLK_CNTR_REENABLE(mp)) + if (count < max(min, XFS_ICSB_FDBLK_CNTR_REENABLE(mp))) 
goto out; break; default: @@ -2048,32 +2081,39 @@ out: XFS_SB_UNLOCK(mp, s); } -STATIC int -xfs_icsb_modify_counters_int( +int +xfs_icsb_modify_counters( xfs_mount_t *mp, xfs_sb_field_t field, int64_t delta, - int rsvd, - int flags) + int rsvd) { xfs_icsb_cnts_t *icsbp; long long lcounter; /* long counter for 64 bit fields */ - int cpu, s, locked = 0; - int ret = 0, balance_done = 0; + int cpu, ret = 0, s; + might_sleep(); again: cpu = get_cpu(); - icsbp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, cpu), - xfs_icsb_lock_cntr(icsbp); + icsbp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, cpu); + + /* + * if the counter is disabled, go to slow path + */ if (unlikely(xfs_icsb_counter_disabled(mp, field))) goto slow_path; + xfs_icsb_lock_cntr(icsbp); + if (unlikely(xfs_icsb_counter_disabled(mp, field))) { + xfs_icsb_unlock_cntr(icsbp); + goto slow_path; + } switch (field) { case XFS_SBS_ICOUNT: lcounter = icsbp->icsb_icount; lcounter += delta; if (unlikely(lcounter < 0)) - goto slow_path; + goto balance_counter; icsbp->icsb_icount = lcounter; break; @@ -2081,7 +2121,7 @@ again: lcounter = icsbp->icsb_ifree; lcounter += delta; if (unlikely(lcounter < 0)) - goto slow_path; + goto balance_counter; icsbp->icsb_ifree = lcounter; break; @@ -2091,7 +2131,7 @@ again: lcounter = icsbp->icsb_fdblocks - XFS_ALLOC_SET_ASIDE(mp); lcounter += delta; if (unlikely(lcounter < 0)) - goto slow_path; + goto balance_counter; icsbp->icsb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp); break; default: @@ -2100,72 +2140,80 @@ again: } xfs_icsb_unlock_cntr(icsbp); put_cpu(); - if (locked) - XFS_SB_UNLOCK(mp, s); return 0; - /* - * The slow path needs to be run with the SBLOCK - * held so that we prevent other threads from - * attempting to run this path at the same time. - * this provides exclusion for the balancing code, - * and exclusive fallback if the balance does not - * provide enough resources to continue in an unlocked - * manner. 
- */ slow_path: - xfs_icsb_unlock_cntr(icsbp); put_cpu(); - /* need to hold superblock incase we need - * to disable a counter */ - if (!(flags & XFS_ICSB_SB_LOCKED)) { - s = XFS_SB_LOCK(mp); - locked = 1; - flags |= XFS_ICSB_SB_LOCKED; - } - if (!balance_done) { - xfs_icsb_balance_counter(mp, field, flags); - balance_done = 1; + /* + * serialise with a mutex so we don't burn lots of cpu on + * the superblock lock. We still need to hold the superblock + * lock, however, when we modify the global structures. + */ + mutex_lock(&mp->m_icsb_mutex); + + /* + * Now running atomically. + * + * If the counter is enabled, someone has beaten us to rebalancing. + * Drop the lock and try again in the fast path.... + */ + if (!(xfs_icsb_counter_disabled(mp, field))) { + mutex_unlock(&mp->m_icsb_mutex); goto again; - } else { - /* - * we might not have enough on this local - * cpu to allocate for a bulk request. - * We need to drain this field from all CPUs - * and disable the counter fastpath - */ - xfs_icsb_disable_counter(mp, field); } + /* + * The counter is currently disabled. Because we are + * running atomically here, we know a rebalance cannot + * be in progress. Hence we can go straight to operating + * on the global superblock. We do not call xfs_mod_incore_sb() + * here even though we need to get the SB_LOCK. Doing so + * will cause us to re-enter this function and deadlock. + * Hence we get the SB_LOCK ourselves and then call + * xfs_mod_incore_sb_unlocked() as the unlocked path operates + * directly on the global counters. + */ + s = XFS_SB_LOCK(mp); ret = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd); + XFS_SB_UNLOCK(mp, s); - if (locked) - XFS_SB_UNLOCK(mp, s); + /* + * Now that we've modified the global superblock, we + * may be able to re-enable the distributed counters + * (e.g. lots of space just got freed). After that + * we are done. 
+ */ + if (ret != ENOSPC) + xfs_icsb_balance_counter(mp, field, 0, 0); + mutex_unlock(&mp->m_icsb_mutex); return ret; -} -STATIC int -xfs_icsb_modify_counters( - xfs_mount_t *mp, - xfs_sb_field_t field, - int64_t delta, - int rsvd) -{ - return xfs_icsb_modify_counters_int(mp, field, delta, rsvd, 0); -} +balance_counter: + xfs_icsb_unlock_cntr(icsbp); + put_cpu(); -/* - * Called when superblock is already locked - */ -STATIC int -xfs_icsb_modify_counters_locked( - xfs_mount_t *mp, - xfs_sb_field_t field, - int64_t delta, - int rsvd) -{ - return xfs_icsb_modify_counters_int(mp, field, delta, - rsvd, XFS_ICSB_SB_LOCKED); + /* + * We may have multiple threads here if multiple per-cpu + * counters run dry at the same time. This will mean we can + * do more balances than strictly necessary but it is not + * the common slowpath case. + */ + mutex_lock(&mp->m_icsb_mutex); + + /* + * running atomically. + * + * This will leave the counter in the correct state for future + * accesses. After the rebalance, we simply try again but with the + * global superblock lock held. This ensures that the counter state as + * a result of the balance does not change and our retry will either + * succeed through the fast path or slow path without another balance + * operation being required. 
+ */ + xfs_icsb_balance_counter(mp, field, 0, delta); + mutex_unlock(&mp->m_icsb_mutex); + goto again; } + #endif Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.h 2006-10-19 10:25:12.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.h 2006-10-19 10:32:16.626827226 +1000 @@ -419,6 +419,7 @@ typedef struct xfs_mount { xfs_icsb_cnts_t *m_sb_cnts; /* per-cpu superblock counters */ unsigned long m_icsb_counters; /* disabled per-cpu counters */ struct notifier_block m_icsb_notifier; /* hotplug cpu notifier */ + struct mutex m_icsb_mutex; /* balancer sync lock */ #endif } xfs_mount_t; From owner-xfs@oss.sgi.com Wed Nov 22 22:06:45 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 22:06:54 -0800 (PST) Received: from smtp113.sbc.mail.mud.yahoo.com (smtp113.sbc.mail.mud.yahoo.com [68.142.198.212]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAN66iaG023528 for ; Wed, 22 Nov 2006 22:06:45 -0800 Received: (qmail 10348 invoked from network); 23 Nov 2006 05:39:16 -0000 Received: from unknown (HELO stupidest.org) (cwedgwood@sbcglobal.net@24.5.75.45 with login) by smtp113.sbc.mail.mud.yahoo.com with SMTP; 23 Nov 2006 05:39:16 -0000 X-YMail-OSG: Y5WNtXMVM1lUlU_w5WquGfv6bzTpH9pqwGW6g3ebx0HmT0cxqtwoxxgNMIW5c8Ionqtx1WQa8cFuU5P_7f7v Received: by tuatara.stupidest.org (Postfix, from userid 10000) id 1EDB51827280; Wed, 22 Nov 2006 21:39:15 -0800 (PST) Date: Wed, 22 Nov 2006 21:39:14 -0800 From: Chris Wedgwood To: Jonathan Groll Cc: xfs@oss.sgi.com Subject: Re: Unexpected inode type 0160000 causes abort of xfs_repair Message-ID: <20061123053914.GA22521@tuatara.stupidest.org> References: <20061122201131.GD2493@groll.co.za> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061122201131.GD2493@groll.co.za> X-archive-position: 9754 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com X-original-sender: cw@f00f.org Precedence: bulk X-list: xfs Content-Length: 209 Lines: 7 On Wed, Nov 22, 2006 at 10:11:31PM +0200, Jonathan Groll wrote: > Is there anything I can possibly do? smash the inode using xfs_db (set mode to 0 should work) and run repair, it will prune it away for you From owner-xfs@oss.sgi.com Wed Nov 22 23:08:00 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 22 Nov 2006 23:08:08 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAN77waG032340 for ; Wed, 22 Nov 2006 23:07:59 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA17345; Thu, 23 Nov 2006 18:07:00 +1100 Date: Thu, 23 Nov 2006 18:09:22 +1100 From: Timothy Shimmin To: Eric Sandeen , Russell Cattelan cc: xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: <26F2AE58A7D40E5170649BC2@timothy-shimmins-power-mac-g5.local> In-Reply-To: <45647CF8.8020104@sandeen.net> References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> <45647042.2040604@sandeen.net> <1164212695.19915.65.camel@xenon.msp.redhat.com> <45647CF8.8020104@sandeen.net> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9756 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 3752 Lines: 81 Hi Guys, So just looking at the first part, which as Eric suggested can be considered on its 
own. Index: work_gfs/fs/xfs/xfs_attr.c =================================================================== --- work_gfs.orig/fs/xfs/xfs_attr.c 2006-11-21 18:38:27.572949303 -0600 +++ work_gfs/fs/xfs/xfs_attr.c 2006-11-21 18:44:51.666033422 -0600 @@ -210,8 +210,20 @@ xfs_attr_set_int(xfs_inode_t *dp, const * (inode must not be locked when we call this routine) */ if (XFS_IFORK_Q(dp) == 0) { - if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) - return(error); + if ((dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) || + ((dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS) && + (dp->i_d.di_anextents == 0))) { + /* xfs_bmap_add_attrfork will set the forkoffset based on + * the size needed, the local attr case needs the size + * attr plus the size of the hdr, if the size of + * header is not accounted for initially the forkoffset + * won't allow enough space, the actually attr add will + * then be forced out out line to extents + */ + size += sizeof(xfs_attr_sf_hdr_t); + if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) + return(error); + } } --On 22 November 2006 10:38:16 AM -0600 Eric Sandeen wrote: >> By fixing the initial size calculation at least things like SElinux >> which is adding one attr won't cause the attr segment to flip to extents >> immediately. >> The second attr will cause the flip but not the first one. > > I'd say this part (fixing up proper space for the initial attr fork setup) should probably go in > soon if it gets good reviews (with the removal of the extra tests, as we discussed on irc last > night). I think this proper change stands on its own just fine. > So yeah, as you said in IRC, the brace is in the wrong spot and the di_aformat tests don't make any sense here. Basically, we know that fork offset is zero and therefore that the di_aformat should be set at XFS_DINODE_FMT_EXTENTS and di_anextents will be zero, as this is the state before we add in an attribute fork. 
Why we have this initial state as extents, I'm not too sure and wondered in the past. Maybe because this state is one which doesn't occupy any space in the literal area. A shortform EA has a header at least. My next concern is that the size that is calculated is presumably trying to accommodate the shortform EA. However, the calculation is for the sf header and the space for an xfs_attr_leaf_name_local with given namelen and valuelen. It would be better to base it on an xfs_attr_sf_entry type. So I think we need to rework this calculation. Which leads me on to the next issue. We don't know what EA form we are going to take, so we can't really assume that it will be shortform. If the EA name or value is big then the EA will go into extents and could occupy very little room in the inode. With the current & proposed test this could make the bytesfit function return 0 (the offset calculated in bytesfit could also go negative) and then we would set the forkoff back at the old attr1 default. So we might have 1 EA extent in the inode taking little space and yet setting the forkoff in the middle. Of course the setting of the forkoff is a bit of a guessing game since we can't predict the future usage but I think the plan is to set it to the minimum to fit on a first come first served basis. So I'm thinking that we should set it based on the size of shortform if that is how it will be stored or to the size taken up by the EA extents - I was initially thinking that this would be 1 extent but with a remote value block of up to 64K this could in theory be an extent for each fsb of the value I guess. Have to think about this some more. 
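The sizing rule suggested above (use the shortform size if the EA will be stored shortform, otherwise the space taken by its extent records) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the XFS implementation: the constants and the helper names sf_bytes_needed(), extents_bytes_needed(), attr_fork_bytes() and round_up8() are hypothetical, the byte counts are modelled on a 4-byte shortform header, a 3-byte fixed entry part and 16-byte extent records, and the number of extents is simply passed in by the caller because, as noted above, it depends on how the remote value gets mapped.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for this sketch (hypothetical names, not XFS macros). */
enum {
    SF_HDR_BYTES         = 4,   /* shortform attr header */
    SF_ENTRY_FIXED_BYTES = 3,   /* namelen + valuelen + flags bytes */
    SF_FIELD_MAX         = 255, /* namelen/valuelen are one byte each */
    EXTENT_REC_BYTES     = 16   /* one packed extent record */
};

/* Bytes one attr needs in the inode if it is stored shortform. */
static size_t sf_bytes_needed(size_t namelen, size_t valuelen)
{
    return SF_HDR_BYTES + SF_ENTRY_FIXED_BYTES + namelen + valuelen;
}

/* Bytes the attr fork needs in the inode if it holds extent records. */
static size_t extents_bytes_needed(size_t nextents)
{
    assert(nextents >= 1);      /* an out-of-line attr maps at least one extent */
    return nextents * EXTENT_REC_BYTES;
}

/*
 * Pick the initial attr fork size: the shortform size when the attr can be
 * stored shortform within sf_max bytes, otherwise the size of the extent
 * records that would describe it.
 */
static size_t attr_fork_bytes(size_t namelen, size_t valuelen,
                              size_t sf_max, size_t nextents)
{
    size_t sf = sf_bytes_needed(namelen, valuelen);

    if (namelen <= SF_FIELD_MAX && valuelen <= SF_FIELD_MAX && sf <= sf_max)
        return sf;
    return extents_bytes_needed(nextents);
}

/* Round up to a multiple of 8, since the fork offset counts 8-byte units. */
static size_t round_up8(size_t bytes)
{
    return (bytes + 7) & ~(size_t)7;
}
```

With these assumptions, a 16-byte name with a 32-byte value needs 55 bytes shortform (56 after rounding), while the same name with a 4096-byte value stored in a single extent needs only 16 bytes of fork space, which is the "little room in the inode" case described above.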
--Tim From owner-xfs@oss.sgi.com Thu Nov 23 08:23:59 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 08:24:06 -0800 (PST) Received: from mail.groll.co.za (mail.groll.co.za [67.18.176.185]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kANGNwaG009586 for ; Thu, 23 Nov 2006 08:23:59 -0800 Received: by mail.groll.co.za (Postfix, from userid 1004) id A2B6E258F8; Thu, 23 Nov 2006 15:14:05 +0200 (SAST) Date: Thu, 23 Nov 2006 15:14:05 +0200 From: Jonathan Groll To: Barry Naujok , xfs@oss.sgi.com Subject: Re: Unexpected inode type 0160000 causes abort of xfs_repair Message-ID: <20061123131405.GA27453@groll.co.za> References: <20061122201131.GD2493@groll.co.za> <200611230047.LAA09023@larry.melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200611230047.LAA09023@larry.melbourne.sgi.com> User-Agent: Mutt/1.5.9i (Linux mail 2.4.29-linode39-1um i686) X-archive-position: 9758 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lists@groll.co.za Precedence: bulk X-list: xfs Content-Length: 369 Lines: 12 On Thu, Nov 23, 2006 at 11:51:11AM +1100, Barry Naujok wrote: > Can you try the attached patch and see how xfs_repair goes? Many thanks, the patch worked like a charm! Is it going to be incorporated into the package in future? 
Luckily I didn't have to blast the inode away, but I suspect that is exactly what the effect of the patch was ;-) Thanks again, Jonathan From owner-xfs@oss.sgi.com Thu Nov 23 08:41:34 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 08:41:40 -0800 (PST) Received: from lucidpixels.com (lucidpixels.com [66.45.37.187]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kANGfQaG011769 for ; Thu, 23 Nov 2006 08:41:28 -0800 Received: by lucidpixels.com (Postfix, from userid 1001) id 6A2FB61012A0; Thu, 23 Nov 2006 11:40:38 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id 6441316172EB3; Thu, 23 Nov 2006 11:40:38 -0500 (EST) Date: Thu, 23 Nov 2006 11:40:38 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: Russell Cattelan cc: xfs@oss.sgi.com Subject: Re: XFS CORRUPTION 2.6.17.13? In-Reply-To: <1164231716.19915.68.camel@xenon.msp.redhat.com> Message-ID: References: <1164231716.19915.68.camel@xenon.msp.redhat.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 9759 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs Content-Length: 83667 Lines: 1805 Here is the info: Script started on Thu Nov 23 09:55:38 2006 root@1[~]# xfs_repair -n /dev/hda2 Phase 1 - find and verify superblock... Phase 2 - using internal log - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan (but don't clear) agi unlinked lists... - process known inodes and perform inode discovery... 
- agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 data fork in regular inode 939526080 claims used block 114661 bad data fork in inode 939526080 would have cleared inode 939526080 data fork in regular inode 939526081 claims used block 114662 bad data fork in inode 939526081 would have cleared inode 939526081 data fork in regular inode 939526082 claims used block 114663 bad data fork in inode 939526082 would have cleared inode 939526082 data fork in regular inode 939526083 claims used block 114664 bad data fork in inode 939526083 would have cleared inode 939526083 data fork in regular inode 939526084 claims used block 114665 bad data fork in inode 939526084 would have cleared inode 939526084 data fork in regular inode 939526085 claims used block 114666 bad data fork in inode 939526085 would have cleared inode 939526085 data fork in regular inode 939526086 claims used block 114667 bad data fork in inode 939526086 would have cleared inode 939526086 data fork in regular inode 939526087 claims used block 114668 bad data fork in inode 939526087 would have cleared inode 939526087 data fork in regular inode 939526088 claims used block 114669 bad data fork in inode 939526088 would have cleared inode 939526088 data fork in regular inode 939526089 claims used block 114670 bad data fork in inode 939526089 would have cleared inode 939526089 data fork in regular inode 939526090 claims used block 114671 bad data fork in inode 939526090 would have cleared inode 939526090 data fork in regular inode 939526091 claims used block 114672 bad data fork in inode 939526091 would have cleared inode 939526091 data fork in regular inode 939526092 claims used block 114673 bad data fork in inode 939526092 would have cleared inode 939526092 data fork in regular inode 939526093 claims used block 114674 bad data fork in inode 939526093 would have cleared inode 939526093 data fork in regular inode 939526094 claims used block 114675 bad data fork in inode 
939526094 would have cleared inode 939526094 data fork in regular inode 939526095 claims used block 114676 bad data fork in inode 939526095 would have cleared inode 939526095 data fork in regular inode 939526096 claims used block 114677 bad data fork in inode 939526096 would have cleared inode 939526096 data fork in regular inode 939526097 claims used block 114678 bad data fork in inode 939526097 would have cleared inode 939526097 data fork in regular inode 939526098 claims used block 114679 bad data fork in inode 939526098 would have cleared inode 939526098 data fork in regular inode 939526099 claims used block 114680 bad data fork in inode 939526099 would have cleared inode 939526099 data fork in regular inode 939526100 claims used block 114681 bad data fork in inode 939526100 would have cleared inode 939526100 data fork in regular inode 939526101 claims used block 114682 bad data fork in inode 939526101 would have cleared inode 939526101 data fork in regular inode 939526102 claims used block 114683 bad data fork in inode 939526102 would have cleared inode 939526102 data fork in regular inode 939526103 claims used block 114684 bad data fork in inode 939526103 would have cleared inode 939526103 data fork in regular inode 939526104 claims used block 114685 bad data fork in inode 939526104 would have cleared inode 939526104 data fork in regular inode 939526105 claims used block 114686 bad data fork in inode 939526105 would have cleared inode 939526105 data fork in regular inode 939526106 claims used block 114687 bad data fork in inode 939526106 would have cleared inode 939526106 data fork in regular inode 939526107 claims used block 114688 bad data fork in inode 939526107 would have cleared inode 939526107 data fork in regular inode 939526108 claims used block 114689 bad data fork in inode 939526108 would have cleared inode 939526108 data fork in regular inode 939526109 claims used block 114690 bad data fork in inode 939526109 would have cleared inode 939526109 data 
fork in regular inode 939526110 claims used block 114691 bad data fork in inode 939526110 would have cleared inode 939526110 data fork in regular inode 939526111 claims used block 114692 bad data fork in inode 939526111 would have cleared inode 939526111 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 0 data fork in ino 1814911 claims dup extent, off - 0, start - 114111, cnt 4161 bad data fork in inode 1814911 would have cleared inode 1814911 - agno = 1 entry "parts" at block 0 offset 96 in directory inode 134219629 references free inode 939526093 would clear inode number in entry at offset 96... entry "violations.ignore.d" in shortform directory 134219645 references free inode 939526109 would have junked entry "violations.ignore.d" in directory inode 134219645 - agno = 2 - agno = 3 entry "dirmngr" at block 0 offset 456 in directory inode 402655041 references free inode 939526087 would clear inode number in entry at offset 456... entry "alsa" at block 0 offset 520 in directory inode 402655041 references free inode 939526088 would clear inode number in entry at offset 520... entry "ipv6-down.d" at block 0 offset 112 in directory inode 402655063 references free inode 939526103 would clear inode number in entry at offset 112... 
- agno = 4 entry "cookie" in shortform directory 536872873 references free inode 939526089 would have junked entry "cookie" in directory inode 536872873 entry "4.2" in shortform directory 536872894 references free inode 939526110 would have junked entry "4.2" in directory inode 536872894 - agno = 5 entry "menus" in shortform directory 671090530 references free inode 939526082 would have junked entry "menus" in directory inode 671090530 entry "defaults" in shortform directory 671090531 references free inode 939526083 would have junked entry "defaults" in directory inode 671090531 - agno = 6 entry "cat2" at block 0 offset 232 in directory inode 805308319 references free inode 939526080 would clear inode number in entry at offset 232... entry "python2.4" in shortform directory 805308321 references free inode 939526081 would have junked entry "python2.4" in directory inode 805308321 entry "components" in shortform directory 805308326 references free inode 939526086 would have junked entry "components" in directory inode 805308326 entry "cache" at block 0 offset 48 in directory inode 805308330 references free inode 939526090 would clear inode number in entry at offset 48... 
entry "aspell" in shortform directory 805308332 references free inode 939526092 would have junked entry "aspell" in directory inode 805308332 entry "update-libc.d" in shortform directory 805308335 references free inode 939526095 would have junked entry "update-libc.d" in directory inode 805308335 entry "private" in shortform directory 805308336 references free inode 939526096 would have junked entry "private" in directory inode 805308336 entry "en_US" in shortform directory 805308339 references free inode 939526099 would have junked entry "en_US" in directory inode 805308339 entry "1" in shortform directory 805308340 references free inode 939526100 would have junked entry "1" in directory inode 805308340 entry "kde-applications-merged" at block 0 offset 48 in directory inode 805308341 references free inode 939526101 would clear inode number in entry at offset 48... entry "events" in shortform directory 805308351 references free inode 939526111 would have junked entry "events" in directory inode 805308351 - agno = 7 imap claims in-use inode 939526080 is free, would correct imap imap claims in-use inode 939526081 is free, would correct imap imap claims in-use inode 939526082 is free, would correct imap imap claims in-use inode 939526083 is free, would correct imap imap claims in-use inode 939526084 is free, would correct imap imap claims in-use inode 939526085 is free, would correct imap imap claims in-use inode 939526086 is free, would correct imap imap claims in-use inode 939526087 is free, would correct imap imap claims in-use inode 939526088 is free, would correct imap imap claims in-use inode 939526089 is free, would correct imap imap claims in-use inode 939526090 is free, would correct imap imap claims in-use inode 939526091 is free, would correct imap imap claims in-use inode 939526092 is free, would correct imap imap claims in-use inode 939526093 is free, would correct imap imap claims in-use inode 939526094 is free, would correct imap imap claims in-use 
inode 939526095 is free, would correct imap imap claims in-use inode 939526096 is free, would correct imap imap claims in-use inode 939526097 is free, would correct imap imap claims in-use inode 939526098 is free, would correct imap imap claims in-use inode 939526099 is free, would correct imap imap claims in-use inode 939526100 is free, would correct imap imap claims in-use inode 939526101 is free, would correct imap imap claims in-use inode 939526102 is free, would correct imap imap claims in-use inode 939526103 is free, would correct imap imap claims in-use inode 939526104 is free, would correct imap imap claims in-use inode 939526105 is free, would correct imap imap claims in-use inode 939526106 is free, would correct imap imap claims in-use inode 939526107 is free, would correct imap imap claims in-use inode 939526108 is free, would correct imap imap claims in-use inode 939526109 is free, would correct imap imap claims in-use inode 939526110 is free, would correct imap imap claims in-use inode 939526111 is free, would correct imap - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 No modify flag set, skipping phase 5 Phase 6 - check inode connectivity... - traversing filesystem starting at / ... entry "6C1095804A88D" in shortform directory inode 18848 points to free inode 1814911 would junk entry "6C1095804A88D" - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... 
disconnected inode 939525281, would move to lost+found disconnected inode 939525296, would move to lost+found disconnected inode 939525297, would move to lost+found disconnected inode 939525298, would move to lost+found disconnected inode 939525299, would move to lost+found disconnected inode 939525300, would move to lost+found disconnected inode 939525301, would move to lost+found disconnected inode 939525302, would move to lost+found disconnected inode 939525425, would move to lost+found disconnected inode 939525426, would move to lost+found disconnected inode 939526091, would move to lost+found disconnected inode 939526164, would move to lost+found disconnected inode 939526200, would move to lost+found disconnected inode 939526263, would move to lost+found disconnected inode 939526266, would move to lost+found disconnected inode 939526268, would move to lost+found disconnected inode 939526270, would move to lost+found disconnected inode 939526271, would move to lost+found disconnected inode 939526280, would move to lost+found disconnected inode 939526281, would move to lost+found disconnected inode 939526282, would move to lost+found disconnected inode 939526283, would move to lost+found disconnected inode 939526284, would move to lost+found disconnected inode 939526286, would move to lost+found disconnected inode 939526287, would move to lost+found disconnected inode 939526288, would move to lost+found disconnected inode 939526300, would move to lost+found disconnected inode 939526301, would move to lost+found disconnected inode 939526302, would move to lost+found disconnected inode 939526303, would move to lost+found disconnected inode 939526304, would move to lost+found disconnected inode 939526305, would move to lost+found disconnected inode 939526307, would move to lost+found disconnected inode 939526308, would move to lost+found disconnected inode 939526309, would move to lost+found disconnected inode 939526314, would move to lost+found disconnected inode 
939526315, would move to lost+found disconnected inode 939526316, would move to lost+found disconnected inode 939526317, would move to lost+found disconnected inode 939526318, would move to lost+found disconnected inode 939526328, would move to lost+found disconnected inode 939526329, would move to lost+found disconnected inode 939526330, would move to lost+found disconnected inode 939526331, would move to lost+found disconnected inode 939526332, would move to lost+found disconnected inode 939526333, would move to lost+found disconnected inode 939526334, would move to lost+found disconnected inode 939526335, would move to lost+found disconnected inode 939526934, would move to lost+found disconnected inode 939526937, would move to lost+found disconnected inode 939626830, would move to lost+found disconnected inode 939626837, would move to lost+found disconnected inode 939629683, would move to lost+found disconnected inode 939629684, would move to lost+found disconnected inode 939629685, would move to lost+found disconnected inode 939629686, would move to lost+found disconnected inode 939629687, would move to lost+found disconnected inode 939629690, would move to lost+found disconnected inode 939629691, would move to lost+found disconnected inode 939633842, would move to lost+found disconnected inode 939633843, would move to lost+found disconnected inode 939633881, would move to lost+found disconnected inode 939633897, would move to lost+found disconnected inode 939633898, would move to lost+found disconnected inode 939633899, would move to lost+found disconnected inode 939633900, would move to lost+found disconnected inode 939633901, would move to lost+found disconnected inode 939633902, would move to lost+found disconnected inode 939633903, would move to lost+found disconnected inode 939633909, would move to lost+found disconnected inode 939633910, would move to lost+found disconnected inode 939634022, would move to lost+found disconnected inode 939634023, would 
move to lost+found disconnected inode 939634024, would move to lost+found disconnected inode 939634025, would move to lost+found disconnected inode 939707628, would move to lost+found disconnected inode 939707629, would move to lost+found disconnected inode 939707630, would move to lost+found disconnected inode 939707633, would move to lost+found disconnected inode 939707636, would move to lost+found disconnected inode 939707639, would move to lost+found disconnected inode 939707641, would move to lost+found disconnected inode 939707643, would move to lost+found disconnected inode 939707645, would move to lost+found disconnected inode 939707712, would move to lost+found disconnected inode 939734343, would move to lost+found disconnected inode 939783610, would move to lost+found disconnected inode 939783612, would move to lost+found disconnected inode 939783613, would move to lost+found disconnected inode 939783614, would move to lost+found disconnected inode 939783616, would move to lost+found disconnected inode 939783623, would move to lost+found disconnected inode 939783635, would move to lost+found disconnected inode 939783663, would move to lost+found disconnected inode 939783672, would move to lost+found disconnected inode 939804193, would move to lost+found disconnected inode 939804194, would move to lost+found disconnected inode 939804195, would move to lost+found disconnected inode 939804196, would move to lost+found disconnected inode 939804197, would move to lost+found disconnected inode 939804198, would move to lost+found disconnected inode 939804199, would move to lost+found disconnected inode 939804200, would move to lost+found disconnected inode 939804201, would move to lost+found disconnected inode 939804203, would move to lost+found disconnected inode 939804205, would move to lost+found disconnected inode 939804206, would move to lost+found disconnected inode 939804242, would move to lost+found disconnected inode 939808074, would move to lost+found 
disconnected inode 939808075, would move to lost+found disconnected inode 939808076, would move to lost+found disconnected inode 939808077, would move to lost+found disconnected inode 939819891, would move to lost+found disconnected inode 939820963, would move to lost+found disconnected inode 939820964, would move to lost+found disconnected inode 939820965, would move to lost+found disconnected inode 939820966, would move to lost+found disconnected inode 939820968, would move to lost+found disconnected inode 939820969, would move to lost+found disconnected inode 939820970, would move to lost+found disconnected inode 939820971, would move to lost+found disconnected inode 939820972, would move to lost+found disconnected inode 939820973, would move to lost+found disconnected inode 939820974, would move to lost+found disconnected inode 939841107, would move to lost+found disconnected inode 939843487, would move to lost+found disconnected inode 939872184, would move to lost+found disconnected inode 939872208, would move to lost+found disconnected inode 940744021, would move to lost+found disconnected inode 940744022, would move to lost+found disconnected inode 940923246, would move to lost+found disconnected inode 940923247, would move to lost+found disconnected inode 940936619, would move to lost+found disconnected inode 941261016, would move to lost+found disconnected inode 941261017, would move to lost+found disconnected inode 941261018, would move to lost+found disconnected inode 941266732, would move to lost+found disconnected inode 941266764, would move to lost+found disconnected inode 941266765, would move to lost+found disconnected inode 941274207, would move to lost+found disconnected inode 941446125, would move to lost+found disconnected inode 941458243, would move to lost+found disconnected inode 941460430, would move to lost+found disconnected inode 941460434, would move to lost+found disconnected inode 941460443, would move to lost+found disconnected inode 
941460445, would move to lost+found disconnected inode 941460452, would move to lost+found disconnected inode 941460454, would move to lost+found disconnected inode 941460455, would move to lost+found disconnected inode 941460457, would move to lost+found disconnected inode 941510370, would move to lost+found disconnected inode 941510371, would move to lost+found disconnected inode 941514067, would move to lost+found disconnected inode 941514074, would move to lost+found disconnected inode 941514114, would move to lost+found disconnected inode 941514115, would move to lost+found disconnected inode 941907501, would move to lost+found disconnected inode 941907502, would move to lost+found disconnected inode 941933094, would move to lost+found disconnected inode 941933223, would move to lost+found disconnected inode 941933225, would move to lost+found disconnected inode 941933226, would move to lost+found disconnected inode 941942463, would move to lost+found disconnected inode 942080921, would move to lost+found disconnected inode 942090075, would move to lost+found disconnected inode 942090451, would move to lost+found disconnected inode 942092049, would move to lost+found disconnected inode 942092304, would move to lost+found disconnected inode 942092305, would move to lost+found disconnected inode 942092306, would move to lost+found disconnected inode 942260373, would move to lost+found disconnected dir inode 1074268001, would move to lost+found disconnected dir inode 1074268007, would move to lost+found disconnected dir inode 1074268009, would move to lost+found disconnected dir inode 1074268014, would move to lost+found disconnected dir inode 1074268017, would move to lost+found disconnected dir inode 1074268019, would move to lost+found disconnected dir inode 1074268022, would move to lost+found disconnected dir inode 1074268024, would move to lost+found disconnected dir inode 1207961425, would move to lost+found disconnected dir inode 1342179089, would move to 
lost+found Phase 7 - verify link counts... would have reset inode 134219629 nlinks from 7 to 6 would have reset inode 134219645 nlinks from 8 to 7 would have reset inode 402655041 nlinks from 50 to 48 would have reset inode 402655063 nlinks from 8 to 7 would have reset inode 536872873 nlinks from 7 to 6 would have reset inode 536872894 nlinks from 8 to 7 would have reset inode 671090530 nlinks from 6 to 5 would have reset inode 671090531 nlinks from 4 to 3 would have reset inode 805308319 nlinks from 16 to 15 would have reset inode 805308321 nlinks from 4 to 3 would have reset inode 805308326 nlinks from 5 to 4 would have reset inode 805308330 nlinks from 3 to 2 would have reset inode 805308332 nlinks from 5 to 4 would have reset inode 805308335 nlinks from 3 to 2 would have reset inode 805308336 nlinks from 4 to 3 would have reset inode 805308339 nlinks from 4 to 3 would have reset inode 805308340 nlinks from 7 to 6 would have reset inode 805308341 nlinks from 4 to 3 would have reset inode 805308351 nlinks from 3 to 2 would have reset inode 1342179075 nlinks from 43 to 41 would have reset inode 1610614679 nlinks from 21 to 20 would have reset inode 1744832408 nlinks from 8 to 7 would have reset inode 1744832409 nlinks from 10 to 9 would have reset inode 1879050093 nlinks from 132 to 127 would have reset inode 2013268025 nlinks from 12 to 11 No modify flag set, skipping filesystem flush and exiting. root@1[~]# Kxfs_repair 1@1@-1@v2da2 Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... zero_log: head block 23951 tail block 23951 - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... 
- agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 data fork in regular inode 939526080 claims used block 114661 bad data fork in inode 939526080 cleared inode 939526080 data fork in regular inode 939526081 claims used block 114662 bad data fork in inode 939526081 cleared inode 939526081 data fork in regular inode 939526082 claims used block 114663 bad data fork in inode 939526082 cleared inode 939526082 data fork in regular inode 939526083 claims used block 114664 bad data fork in inode 939526083 cleared inode 939526083 data fork in regular inode 939526084 claims used block 114665 bad data fork in inode 939526084 cleared inode 939526084 data fork in regular inode 939526085 claims used block 114666 bad data fork in inode 939526085 cleared inode 939526085 data fork in regular inode 939526086 claims used block 114667 bad data fork in inode 939526086 cleared inode 939526086 data fork in regular inode 939526087 claims used block 114668 bad data fork in inode 939526087 cleared inode 939526087 data fork in regular inode 939526088 claims used block 114669 bad data fork in inode 939526088 cleared inode 939526088 data fork in regular inode 939526089 claims used block 114670 bad data fork in inode 939526089 cleared inode 939526089 data fork in regular inode 939526090 claims used block 114671 bad data fork in inode 939526090 cleared inode 939526090 data fork in regular inode 939526091 claims used block 114672 bad data fork in inode 939526091 cleared inode 939526091 data fork in regular inode 939526092 claims used block 114673 bad data fork in inode 939526092 cleared inode 939526092 data fork in regular inode 939526093 claims used block 114674 bad data fork in inode 939526093 cleared inode 939526093 data fork in regular inode 939526094 claims used block 114675 bad data fork in inode 939526094 cleared inode 939526094 data fork in regular inode 939526095 claims used block 114676 bad data fork in inode 939526095 cleared inode 939526095 data 
fork in regular inode 939526096 claims used block 114677 bad data fork in inode 939526096 cleared inode 939526096 data fork in regular inode 939526097 claims used block 114678 bad data fork in inode 939526097 cleared inode 939526097 data fork in regular inode 939526098 claims used block 114679 bad data fork in inode 939526098 cleared inode 939526098 data fork in regular inode 939526099 claims used block 114680 bad data fork in inode 939526099 cleared inode 939526099 data fork in regular inode 939526100 claims used block 114681 bad data fork in inode 939526100 cleared inode 939526100 data fork in regular inode 939526101 claims used block 114682 bad data fork in inode 939526101 cleared inode 939526101 data fork in regular inode 939526102 claims used block 114683 bad data fork in inode 939526102 cleared inode 939526102 data fork in regular inode 939526103 claims used block 114684 bad data fork in inode 939526103 cleared inode 939526103 data fork in regular inode 939526104 claims used block 114685 bad data fork in inode 939526104 cleared inode 939526104 data fork in regular inode 939526105 claims used block 114686 bad data fork in inode 939526105 cleared inode 939526105 data fork in regular inode 939526106 claims used block 114687 bad data fork in inode 939526106 cleared inode 939526106 data fork in regular inode 939526107 claims used block 114688 bad data fork in inode 939526107 cleared inode 939526107 data fork in regular inode 939526108 claims used block 114689 bad data fork in inode 939526108 cleared inode 939526108 data fork in regular inode 939526109 claims used block 114690 bad data fork in inode 939526109 cleared inode 939526109 data fork in regular inode 939526110 claims used block 114691 bad data fork in inode 939526110 cleared inode 939526110 data fork in regular inode 939526111 claims used block 114692 bad data fork in inode 939526111 cleared inode 939526111 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - 
process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... - clearing existing "lost+found" inode - marking entry "lost+found" to be deleted - check for inodes claiming duplicate blocks... - agno = 0 data fork in ino 1814911 claims dup extent, off - 0, start - 114111, cnt 4161 bad data fork in inode 1814911 cleared inode 1814911 - agno = 1 entry "parts" at block 0 offset 96 in directory inode 134219629 references free inode 939526093 clearing inode number in entry at offset 96... entry "violations.ignore.d" in shortform directory 134219645 references free inode 939526109 junking entry "violations.ignore.d" in directory inode 134219645 - agno = 2 - agno = 3 entry "dirmngr" at block 0 offset 456 in directory inode 402655041 references free inode 939526087 clearing inode number in entry at offset 456... entry "alsa" at block 0 offset 520 in directory inode 402655041 references free inode 939526088 clearing inode number in entry at offset 520... entry "ipv6-down.d" at block 0 offset 112 in directory inode 402655063 references free inode 939526103 clearing inode number in entry at offset 112... - agno = 4 entry "cookie" in shortform directory 536872873 references free inode 939526089 junking entry "cookie" in directory inode 536872873 entry "4.2" in shortform directory 536872894 references free inode 939526110 junking entry "4.2" in directory inode 536872894 - agno = 5 entry "menus" in shortform directory 671090530 references free inode 939526082 junking entry "menus" in directory inode 671090530 entry "defaults" in shortform directory 671090531 references free inode 939526083 junking entry "defaults" in directory inode 671090531 - agno = 6 entry "cat2" at block 0 offset 232 in directory inode 805308319 references free inode 939526080 clearing inode number in entry at offset 232... 
entry "python2.4" in shortform directory 805308321 references free inode 939526081 junking entry "python2.4" in directory inode 805308321 entry "components" in shortform directory 805308326 references free inode 939526086 junking entry "components" in directory inode 805308326 entry "cache" at block 0 offset 48 in directory inode 805308330 references free inode 939526090 clearing inode number in entry at offset 48... entry "aspell" in shortform directory 805308332 references free inode 939526092 junking entry "aspell" in directory inode 805308332 entry "update-libc.d" in shortform directory 805308335 references free inode 939526095 junking entry "update-libc.d" in directory inode 805308335 entry "private" in shortform directory 805308336 references free inode 939526096 junking entry "private" in directory inode 805308336 entry "en_US" in shortform directory 805308339 references free inode 939526099 junking entry "en_US" in directory inode 805308339 entry "1" in shortform directory 805308340 references free inode 939526100 junking entry "1" in directory inode 805308340 entry "kde-applications-merged" at block 0 offset 48 in directory inode 805308341 references free inode 939526101 clearing inode number in entry at offset 48... entry "events" in shortform directory 805308351 references free inode 939526111 junking entry "events" in directory inode 805308351 - agno = 7 - agno = 8 - agno = 9 - agno = 10 entry "vi" at block 0 offset 240 in directory inode 1342179075 references free inode 939526084 clearing inode number in entry at offset 240... entry "it" at block 0 offset 504 in directory inode 1342179075 references free inode 939526085 clearing inode number in entry at offset 504... - agno = 11 - agno = 12 entry "ro" at block 0 offset 144 in directory inode 1610614679 references free inode 939526104 clearing inode number in entry at offset 144... 
- agno = 13 entry "misc" in shortform directory 1744832408 references free inode 939526105 junking entry "misc" in directory inode 1744832408 entry "compat" at block 0 offset 200 in directory inode 1744832409 references free inode 939526107 clearing inode number in entry at offset 200... - agno = 14 entry "dirmngr" at block 0 offset 200 in directory inode 1879050093 references free inode 939526094 clearing inode number in entry at offset 200... entry "pure-ftpd" at block 0 offset 768 in directory inode 1879050093 references free inode 939526097 clearing inode number in entry at offset 768... entry "ld.so.conf.d" at block 0 offset 992 in directory inode 1879050093 references free inode 939526098 clearing inode number in entry at offset 992... entry "hotplug.d" at block 0 offset 1536 in directory inode 1879050093 references free inode 939526102 clearing inode number in entry at offset 1536... entry "console-tools" at block 0 offset 1840 in directory inode 1879050093 references free inode 939526108 clearing inode number in entry at offset 1840... - agno = 15 entry "hp" at block 0 offset 176 in directory inode 2013268025 references free inode 939526106 clearing inode number in entry at offset 176... Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... 
rebuilding directory inode 128 rebuilding directory inode 1879050093 rebuilding directory inode 1610614679 rebuilding directory inode 1744832409 rebuilding directory inode 2013268025 rebuilding directory inode 402655063 rebuilding directory inode 805308341 rebuilding directory inode 402655041 rebuilding directory inode 134219629 rebuilding directory inode 805308330 rebuilding directory inode 1342179075 rebuilding directory inode 805308319 entry "6C1095804A88D" in shortform directory inode 18848 points to free inode 1814911 junking entry "6C1095804A88D" in directory inode 1814911 - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... disconnected inode 939525281, moving to lost+found disconnected inode 939525296, moving to lost+found disconnected inode 939525297, moving to lost+found disconnected inode 939525298, moving to lost+found disconnected inode 939525299, moving to lost+found disconnected inode 939525300, moving to lost+found disconnected inode 939525301, moving to lost+found disconnected inode 939525302, moving to lost+found disconnected inode 939525425, moving to lost+found disconnected inode 939525426, moving to lost+found disconnected inode 939526164, moving to lost+found disconnected inode 939526200, moving to lost+found disconnected inode 939526263, moving to lost+found disconnected inode 939526266, moving to lost+found disconnected inode 939526268, moving to lost+found disconnected inode 939526270, moving to lost+found disconnected inode 939526271, moving to lost+found disconnected inode 939526280, moving to lost+found disconnected inode 939526281, moving to lost+found disconnected inode 939526282, moving to lost+found disconnected inode 939526283, moving to lost+found disconnected inode 939526284, moving to lost+found disconnected inode 939526286, moving to lost+found disconnected inode 939526287, moving to lost+found disconnected inode 939526288, moving to lost+found 
disconnected inode 939526300, moving to lost+found disconnected inode 939526301, moving to lost+found disconnected inode 939526302, moving to lost+found disconnected inode 939526303, moving to lost+found disconnected inode 939526304, moving to lost+found disconnected inode 939526305, moving to lost+found disconnected inode 939526307, moving to lost+found disconnected inode 939526308, moving to lost+found disconnected inode 939526309, moving to lost+found disconnected inode 939526314, moving to lost+found disconnected inode 939526315, moving to lost+found disconnected inode 939526316, moving to lost+found disconnected inode 939526317, moving to lost+found disconnected inode 939526318, moving to lost+found disconnected inode 939526328, moving to lost+found disconnected inode 939526329, moving to lost+found disconnected inode 939526330, moving to lost+found disconnected inode 939526331, moving to lost+found disconnected inode 939526332, moving to lost+found disconnected inode 939526333, moving to lost+found disconnected inode 939526334, moving to lost+found disconnected inode 939526335, moving to lost+found disconnected inode 939526934, moving to lost+found disconnected inode 939526937, moving to lost+found disconnected inode 939626830, moving to lost+found disconnected inode 939626837, moving to lost+found disconnected inode 939629683, moving to lost+found disconnected inode 939629684, moving to lost+found disconnected inode 939629685, moving to lost+found disconnected inode 939629686, moving to lost+found disconnected inode 939629687, moving to lost+found disconnected inode 939629690, moving to lost+found disconnected inode 939629691, moving to lost+found disconnected inode 939633842, moving to lost+found disconnected inode 939633843, moving to lost+found disconnected inode 939633881, moving to lost+found disconnected inode 939633897, moving to lost+found disconnected inode 939633898, moving to lost+found disconnected inode 939633899, moving to lost+found 
disconnected inode 939633900, moving to lost+found disconnected inode 939633901, moving to lost+found disconnected inode 939633902, moving to lost+found disconnected inode 939633903, moving to lost+found disconnected inode 939633909, moving to lost+found disconnected inode 939633910, moving to lost+found disconnected inode 939634022, moving to lost+found disconnected inode 939634023, moving to lost+found disconnected inode 939634024, moving to lost+found disconnected inode 939634025, moving to lost+found disconnected inode 939707628, moving to lost+found disconnected inode 939707629, moving to lost+found disconnected inode 939707630, moving to lost+found disconnected inode 939707633, moving to lost+found disconnected inode 939707636, moving to lost+found disconnected inode 939707639, moving to lost+found disconnected inode 939707641, moving to lost+found disconnected inode 939707643, moving to lost+found disconnected inode 939707645, moving to lost+found disconnected inode 939707712, moving to lost+found disconnected inode 939734343, moving to lost+found disconnected inode 939783610, moving to lost+found disconnected inode 939783612, moving to lost+found disconnected inode 939783613, moving to lost+found disconnected inode 939783614, moving to lost+found disconnected inode 939783616, moving to lost+found disconnected inode 939783623, moving to lost+found disconnected inode 939783635, moving to lost+found disconnected inode 939783663, moving to lost+found disconnected inode 939783672, moving to lost+found disconnected inode 939804193, moving to lost+found disconnected inode 939804194, moving to lost+found disconnected inode 939804195, moving to lost+found disconnected inode 939804196, moving to lost+found disconnected inode 939804197, moving to lost+found disconnected inode 939804198, moving to lost+found disconnected inode 939804199, moving to lost+found disconnected inode 939804200, moving to lost+found disconnected inode 939804201, moving to lost+found 
disconnected inode 939804203, moving to lost+found disconnected inode 939804205, moving to lost+found disconnected inode 939804206, moving to lost+found disconnected inode 939804242, moving to lost+found disconnected inode 939808074, moving to lost+found disconnected inode 939808075, moving to lost+found disconnected inode 939808076, moving to lost+found disconnected inode 939808077, moving to lost+found disconnected inode 939819891, moving to lost+found disconnected inode 939820963, moving to lost+found disconnected inode 939820964, moving to lost+found disconnected inode 939820965, moving to lost+found disconnected inode 939820966, moving to lost+found disconnected inode 939820968, moving to lost+found disconnected inode 939820969, moving to lost+found disconnected inode 939820970, moving to lost+found disconnected inode 939820971, moving to lost+found disconnected inode 939820972, moving to lost+found disconnected inode 939820973, moving to lost+found disconnected inode 939820974, moving to lost+found disconnected inode 939841107, moving to lost+found disconnected inode 939843487, moving to lost+found disconnected inode 939872184, moving to lost+found disconnected inode 939872208, moving to lost+found disconnected inode 940744021, moving to lost+found disconnected inode 940744022, moving to lost+found disconnected inode 940923246, moving to lost+found disconnected inode 940923247, moving to lost+found disconnected inode 940936619, moving to lost+found disconnected inode 941261016, moving to lost+found disconnected inode 941261017, moving to lost+found disconnected inode 941261018, moving to lost+found disconnected inode 941266732, moving to lost+found disconnected inode 941266764, moving to lost+found disconnected inode 941266765, moving to lost+found disconnected inode 941274207, moving to lost+found disconnected inode 941446125, moving to lost+found disconnected inode 941458243, moving to lost+found disconnected inode 941460430, moving to lost+found 
disconnected inode 941460434, moving to lost+found disconnected inode 941460443, moving to lost+found disconnected inode 941460445, moving to lost+found disconnected inode 941460452, moving to lost+found disconnected inode 941460454, moving to lost+found disconnected inode 941460455, moving to lost+found disconnected inode 941460457, moving to lost+found disconnected inode 941510370, moving to lost+found disconnected inode 941510371, moving to lost+found disconnected inode 941514067, moving to lost+found disconnected inode 941514074, moving to lost+found disconnected inode 941514114, moving to lost+found disconnected inode 941514115, moving to lost+found disconnected inode 941907501, moving to lost+found disconnected inode 941907502, moving to lost+found disconnected inode 941933094, moving to lost+found disconnected inode 941933223, moving to lost+found disconnected inode 941933225, moving to lost+found disconnected inode 941933226, moving to lost+found disconnected inode 941942463, moving to lost+found disconnected inode 942080921, moving to lost+found disconnected inode 942090075, moving to lost+found disconnected inode 942090451, moving to lost+found disconnected inode 942092049, moving to lost+found disconnected inode 942092304, moving to lost+found disconnected inode 942092305, moving to lost+found disconnected inode 942092306, moving to lost+found disconnected inode 942260373, moving to lost+found disconnected dir inode 1074268001, moving to lost+found disconnected dir inode 1074268007, moving to lost+found disconnected dir inode 1074268009, moving to lost+found disconnected dir inode 1074268014, moving to lost+found disconnected dir inode 1074268017, moving to lost+found disconnected dir inode 1074268019, moving to lost+found disconnected dir inode 1074268022, moving to lost+found disconnected dir inode 1074268024, moving to lost+found disconnected dir inode 1207961425, moving to lost+found disconnected dir inode 1342179089, moving to lost+found Phase 7 - 
verify and correct link counts... resetting inode 134219629 nlinks from 7 to 6 resetting inode 134219645 nlinks from 8 to 7 resetting inode 402655041 nlinks from 50 to 48 resetting inode 402655063 nlinks from 8 to 7 resetting inode 536872873 nlinks from 7 to 6 resetting inode 536872894 nlinks from 8 to 7 resetting inode 671090530 nlinks from 6 to 5 resetting inode 671090531 nlinks from 4 to 3 resetting inode 805308319 nlinks from 16 to 15 resetting inode 805308321 nlinks from 4 to 3 resetting inode 805308326 nlinks from 5 to 4 resetting inode 805308330 nlinks from 3 to 2 resetting inode 805308332 nlinks from 5 to 4 resetting inode 805308335 nlinks from 3 to 2 resetting inode 805308336 nlinks from 4 to 3 resetting inode 805308339 nlinks from 4 to 3 resetting inode 805308340 nlinks from 7 to 6 resetting inode 805308341 nlinks from 4 to 3 resetting inode 805308351 nlinks from 3 to 2 resetting inode 1342179075 nlinks from 43 to 41 resetting inode 1610614679 nlinks from 21 to 20 resetting inode 1744832408 nlinks from 8 to 7 resetting inode 1744832409 nlinks from 10 to 9 resetting inode 1879050093 nlinks from 132 to 127 resetting inode 2013268025 nlinks from 12 to 11 done
root@1[~]# xfs_repair -v /dev/hda2
Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... zero_log: head block 2 tail block 2 - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ...
- clearing existing "lost+found" inode - marking entry "lost+found" to be deleted - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... rebuilding directory inode 128 - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... disconnected inode 939525281, moving to lost+found disconnected inode 939525296, moving to lost+found disconnected inode 939525297, moving to lost+found disconnected inode 939525298, moving to lost+found disconnected inode 939525299, moving to lost+found disconnected inode 939525300, moving to lost+found disconnected inode 939525301, moving to lost+found disconnected inode 939525302, moving to lost+found disconnected inode 939525425, moving to lost+found disconnected inode 939525426, moving to lost+found disconnected inode 939526164, moving to lost+found disconnected inode 939526200, moving to lost+found disconnected inode 939526263, moving to lost+found disconnected inode 939526266, moving to lost+found disconnected inode 939526268, moving to lost+found disconnected inode 939526270, moving to lost+found disconnected inode 939526271, moving to lost+found disconnected inode 939526280, moving to lost+found disconnected inode 939526281, moving to lost+found disconnected inode 939526282, moving to lost+found disconnected inode 939526283, moving to lost+found disconnected inode 939526284, moving to lost+found disconnected inode 939526286, moving to lost+found disconnected inode 939526287, moving to lost+found disconnected inode 939526288, moving to 
lost+found disconnected inode 939526300, moving to lost+found disconnected inode 939526301, moving to lost+found disconnected inode 939526302, moving to lost+found disconnected inode 939526303, moving to lost+found disconnected inode 939526304, moving to lost+found disconnected inode 939526305, moving to lost+found disconnected inode 939526307, moving to lost+found disconnected inode 939526308, moving to lost+found disconnected inode 939526309, moving to lost+found disconnected inode 939526314, moving to lost+found disconnected inode 939526315, moving to lost+found disconnected inode 939526316, moving to lost+found disconnected inode 939526317, moving to lost+found disconnected inode 939526318, moving to lost+found disconnected inode 939526328, moving to lost+found disconnected inode 939526329, moving to lost+found disconnected inode 939526330, moving to lost+found disconnected inode 939526331, moving to lost+found disconnected inode 939526332, moving to lost+found disconnected inode 939526333, moving to lost+found disconnected inode 939526334, moving to lost+found disconnected inode 939526335, moving to lost+found disconnected inode 939526934, moving to lost+found disconnected inode 939526937, moving to lost+found disconnected inode 939626830, moving to lost+found disconnected inode 939626837, moving to lost+found disconnected inode 939629683, moving to lost+found disconnected inode 939629684, moving to lost+found disconnected inode 939629685, moving to lost+found disconnected inode 939629686, moving to lost+found disconnected inode 939629687, moving to lost+found disconnected inode 939629690, moving to lost+found disconnected inode 939629691, moving to lost+found disconnected inode 939633842, moving to lost+found disconnected inode 939633843, moving to lost+found disconnected inode 939633881, moving to lost+found disconnected inode 939633897, moving to lost+found disconnected inode 939633898, moving to lost+found disconnected inode 939633899, moving to lost+found 
disconnected inode 939633900, moving to lost+found disconnected inode 939633901, moving to lost+found disconnected inode 939633902, moving to lost+found disconnected inode 939633903, moving to lost+found disconnected inode 939633909, moving to lost+found disconnected inode 939633910, moving to lost+found disconnected inode 939634022, moving to lost+found disconnected inode 939634023, moving to lost+found disconnected inode 939634024, moving to lost+found disconnected inode 939634025, moving to lost+found disconnected inode 939707628, moving to lost+found disconnected inode 939707629, moving to lost+found disconnected inode 939707630, moving to lost+found disconnected inode 939707633, moving to lost+found disconnected inode 939707636, moving to lost+found disconnected inode 939707639, moving to lost+found disconnected inode 939707641, moving to lost+found disconnected inode 939707643, moving to lost+found disconnected inode 939707645, moving to lost+found disconnected inode 939707712, moving to lost+found disconnected inode 939734343, moving to lost+found disconnected inode 939783610, moving to lost+found disconnected inode 939783612, moving to lost+found disconnected inode 939783613, moving to lost+found disconnected inode 939783614, moving to lost+found disconnected inode 939783616, moving to lost+found disconnected inode 939783623, moving to lost+found disconnected inode 939783635, moving to lost+found disconnected inode 939783663, moving to lost+found disconnected inode 939783672, moving to lost+found disconnected inode 939804193, moving to lost+found disconnected inode 939804194, moving to lost+found disconnected inode 939804195, moving to lost+found disconnected inode 939804196, moving to lost+found disconnected inode 939804197, moving to lost+found disconnected inode 939804198, moving to lost+found disconnected inode 939804199, moving to lost+found disconnected inode 939804200, moving to lost+found disconnected inode 939804201, moving to lost+found 
disconnected inode 939804203, moving to lost+found disconnected inode 939804205, moving to lost+found disconnected inode 939804206, moving to lost+found disconnected inode 939804242, moving to lost+found disconnected inode 939808074, moving to lost+found disconnected inode 939808075, moving to lost+found disconnected inode 939808076, moving to lost+found disconnected inode 939808077, moving to lost+found disconnected inode 939819891, moving to lost+found disconnected inode 939820963, moving to lost+found disconnected inode 939820964, moving to lost+found disconnected inode 939820965, moving to lost+found disconnected inode 939820966, moving to lost+found disconnected inode 939820968, moving to lost+found disconnected inode 939820969, moving to lost+found disconnected inode 939820970, moving to lost+found disconnected inode 939820971, moving to lost+found disconnected inode 939820972, moving to lost+found disconnected inode 939820973, moving to lost+found disconnected inode 939820974, moving to lost+found disconnected inode 939841107, moving to lost+found disconnected inode 939843487, moving to lost+found disconnected inode 939872184, moving to lost+found disconnected inode 939872208, moving to lost+found disconnected inode 940744021, moving to lost+found disconnected inode 940744022, moving to lost+found disconnected inode 940923246, moving to lost+found disconnected inode 940923247, moving to lost+found disconnected inode 940936619, moving to lost+found disconnected inode 941261016, moving to lost+found disconnected inode 941261017, moving to lost+found disconnected inode 941261018, moving to lost+found disconnected inode 941266732, moving to lost+found disconnected inode 941266764, moving to lost+found disconnected inode 941266765, moving to lost+found disconnected inode 941274207, moving to lost+found disconnected inode 941446125, moving to lost+found disconnected inode 941458243, moving to lost+found disconnected inode 941460430, moving to lost+found 
disconnected inode 941460434, moving to lost+found disconnected inode 941460443, moving to lost+found disconnected inode 941460445, moving to lost+found disconnected inode 941460452, moving to lost+found disconnected inode 941460454, moving to lost+found disconnected inode 941460455, moving to lost+found disconnected inode 941460457, moving to lost+found disconnected inode 941510370, moving to lost+found disconnected inode 941510371, moving to lost+found disconnected inode 941514067, moving to lost+found disconnected inode 941514074, moving to lost+found disconnected inode 941514114, moving to lost+found disconnected inode 941514115, moving to lost+found disconnected inode 941907501, moving to lost+found disconnected inode 941907502, moving to lost+found disconnected inode 941933094, moving to lost+found disconnected inode 941933223, moving to lost+found disconnected inode 941933225, moving to lost+found disconnected inode 941933226, moving to lost+found disconnected inode 941942463, moving to lost+found disconnected inode 942080921, moving to lost+found disconnected inode 942090075, moving to lost+found disconnected inode 942090451, moving to lost+found disconnected inode 942092049, moving to lost+found disconnected inode 942092304, moving to lost+found disconnected inode 942092305, moving to lost+found disconnected inode 942092306, moving to lost+found disconnected inode 942260373, moving to lost+found disconnected dir inode 1074268001, moving to lost+found disconnected dir inode 1074268007, moving to lost+found disconnected dir inode 1074268009, moving to lost+found disconnected dir inode 1074268014, moving to lost+found disconnected dir inode 1074268017, moving to lost+found disconnected dir inode 1074268019, moving to lost+found disconnected dir inode 1074268022, moving to lost+found disconnected dir inode 1074268024, moving to lost+found disconnected dir inode 1207961425, moving to lost+found disconnected dir inode 1342179089, moving to lost+found Phase 7 - 
verify and correct link counts... done
root@1[~]# xfs_repair -v /dev/hdd1
Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... zero_log: head block 111878 tail block 111878 - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify and correct link counts... done
root@1[~]# xfs_repair -v /dev/hdg1
Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... zero_log: head block 73 tail block 73 - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes...
Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify and correct link counts... done
root@1[~]# mkdir /tmp/hda1
root@1[~]# mkdir /tmp/hda2
root@1[~]# mount /dev/hda2 /tmp/hda2
root@1[~]# cd /tmp/hda2
root@1[hda2]# ls
a bin d2 home lib mnt sbin usr vmlinuz app boot dev initrd lost+found proc sys vapp x appc d1 etc initrd.img media root tmp var
root@1[hda2]# cd lost*
root@1[lost+found]# ls
1074268001 939526271 939526331 939633909 939804193 939820972 941460454 1074268007 939526280 939526332 939633910 939804194 939820973 941460455 1074268009 939526281 939526333 939634022 939804195 939820974 941460457 1074268014 939526282 939526334 939634023 939804196 939841107 941510370 1074268017 939526283 939526335 939634024 939804197 939843487 941510371 1074268019 939526284 939526934 939634025 939804198 939872184
941514067 1074268022 939526286 939526937 939707628 939804199 939872208 941514074 1074268024 939526287 939626830 939707629 939804200 940744021 941514114 1207961425 939526288 939626837 939707630 939804201 940744022 941514115 1342179089 939526300 939629683 939707633 939804203 940923246 941907501 939525281 939526301 939629684 939707636 939804205 940923247 941907502 939525296 939526302 939629685 939707639 939804206 940936619 941933094 939525297 939526303 939629686 939707641 939804242 941261016 941933223 939525298 939526304 939629687 939707643 939808074 941261017 941933225 939525299 939526305 939629690 939707645 939808075 941261018 941933226 939525300 939526307 939629691 939707712 939808076 941266732 941942463 939525301 939526308 939633842 939734343 939808077 941266764 942080921 939525302 939526309 939633843 939783610 939819891 941266765 942090075 939525425 939526314 939633881 939783612 939820963 941274207 942090451 939525426 939526315 939633897 939783613 939820964 941446125 942092049 939526164 939526316 939633898 939783614 939820965 941458243 942092304 939526200 939526317 939633899 939783616 939820966 941460430 942092305 939526263 939526318 939633900 939783623 939820968 941460434 942092306 939526266 939526328 939633901 939783635 939820969 941460443 942260373 939526268 939526329 939633902 939783663 939820970 941460445 939526270 939526330 939633903 939783672 939820971 941460452
root@1[lost+found]# ls -al
total 15540 drwxr-xr-x 12 root root 8192 Nov 23 10:01 . drwxr-xr-x 23 root root 4096 Oct 28 06:32 .. drwxr-xr-x 3 root root 118 Oct 7 07:33 1074268001 drwxr-xr-x 2 root root 6 Apr 19 2005 1074268007 drwxr-xr-x 2 root root 25 Jun 18 10:00 1074268009 drwxr-xr-x 2 root root 6 Apr 19 2005 1074268014 drwxr-xr-x 2 root root 31 Oct 21 06:22 1074268017 drwxr-xr-x 3 root root 23 Jan 28 2005 1074268019 drwxr-xr-x 2 root root 28 Sep 2 07:13 1074268022 drwxr-xr-x 2 root root 19 Sep 2 07:15 1074268024 drwxr-xr-x 2 root root 6 Sep 28 2004 1207961425 drwxr-xr-x 2 root root 112 Oct 21 06:28 1342179089 -rw-r--r-- 1 root root 137463 Nov 18 06:36 939525281 -rw-r--r-- 1 root root 5030 Nov 18 06:36 939525296 -rw-r--r-- 1 root root 214150 Nov 18 06:36 939525297 -rw-r--r-- 1 root root 45628 Nov 18 06:35 939525298 -rw-r--r-- 1 root root 201991 Nov 18 06:36 939525299 -rw-r--r-- 1 root root 256393 Nov 18 06:36 939525300 -rw-r--r-- 1 root root 277659 Nov 18 06:36 939525301 -rw-r--r-- 1 root root 40770 Nov 18 06:16 939525302 -rw-r--r-- 1 root root 7925 Nov 18 06:16 939525425 -rw-r--r-- 1 root root 154710 Nov 18 06:36 939525426 -rw-r--r-- 1 root root 24027 Nov 18 06:35 939526164 -rw-r--r-- 1 root root 261048 Nov 18 06:36 939526200 -rw-r--r-- 1 root root 86527 Nov 18 06:36 939526263 -rw-r--r-- 1 root root 198932 Nov 18 06:36 939526266 -rw-r--r-- 1 root root 222726 Nov 18 06:36 939526268 -rw-r--r-- 1 root root 61110 Nov 18 06:36 939526270 -rw-r--r-- 1 root root 137263 Nov 18 06:36 939526271 -rw-r--r-- 1 root root 173481 Nov 18 06:35 939526280 -rw-r--r-- 1 root root 110993 Nov 18 06:36 939526281 -rw-r--r-- 1 root root 220421 Nov 18 06:36 939526282 -rw-r--r-- 1 root root 104893 Nov 18 06:16 939526283 -rw-r--r-- 1 root root 235904 Nov 18 06:36 939526284 -rw-r--r-- 1 root root
18070 Nov 18 06:16 939526286 -rw-r--r-- 1 root root 5543 Nov 18 06:16 939526287 -rw-r--r-- 1 root root 238969 Nov 18 06:36 939526288 -rw-r--r-- 1 root root 77329 Nov 18 06:36 939526300 -rw-r--r-- 1 root root 159049 Nov 18 06:36 939526301 -rw-r--r-- 1 root root 224866 Nov 18 06:36 939526302 -rw-r--r-- 1 root root 228463 Nov 18 06:36 939526303 -rw-r--r-- 1 root root 155675 Nov 18 06:36 939526304 -rw-r--r-- 1 root root 176774 Nov 18 06:36 939526305 -rw-r--r-- 1 root root 227041 Nov 18 06:36 939526307 -rw-r--r-- 1 root root 274659 Nov 18 06:36 939526308 -rw-r--r-- 1 root root 261357 Nov 18 06:36 939526309 -rw-r--r-- 1 root root 3415 Nov 18 06:16 939526314 -rw-r--r-- 1 root root 102324 Nov 18 06:36 939526315 -rw-r--r-- 1 root root 192552 Nov 18 06:35 939526316 -rw-r--r-- 1 root root 242376 Nov 18 06:36 939526317 -rw-r--r-- 1 root root 147585 Nov 18 06:36 939526318 -rw-r--r-- 1 root root 168016 Nov 18 06:35 939526328 -rw-r--r-- 1 root root 229258 Nov 18 06:36 939526329 -rw-r--r-- 1 root root 227503 Nov 18 06:36 939526330 -rw-r--r-- 1 root root 240842 Nov 18 06:36 939526331 -rw-r--r-- 1 root root 207746 Nov 18 06:36 939526332 -rw-r--r-- 1 root root 190586 Nov 18 06:36 939526333 -rw-r--r-- 1 root root 169669 Nov 18 06:36 939526334 -rw-r--r-- 1 root root 79349 Nov 18 06:16 939526335 -rw-r--r-- 1 root root 233336 Nov 18 06:36 939526934 -rw-r--r-- 1 root root 254633 Nov 18 06:36 939526937 -rw-r--r-- 1 root root 151098 Nov 18 06:36 939626830 -rw-r--r-- 1 root root 56929 Nov 18 06:36 939626837 -rw-r--r-- 1 root root 44455 Nov 18 06:35 939629683 -rw-r--r-- 1 root root 253932 Nov 18 06:36 939629684 -rw-r--r-- 1 root root 266353 Nov 18 06:36 939629685 -rw-r--r-- 1 root root 272961 Nov 18 06:36 939629686 -rw-r--r-- 1 root root 42935 Nov 18 06:35 939629687 -rw-r--r-- 1 root root 250118 Nov 18 06:36 939629690 -rw-r--r-- 1 root root 25575 Nov 18
06:36 939629691 -rw-r--r-- 1 root root 138874 Nov 18 06:36 939633842 -rw-r--r-- 1 root root 164779 Nov 18 06:36 939633843 -rw-r--r-- 1 root root 6270 Sep 30 17:15 939633881 -rw-r--r-- 1 root root 138223 Nov 18 06:36 939633897 -rw-r--r-- 1 root root 264691 Nov 18 06:36 939633898 -rw-r--r-- 1 root root 237897 Nov 18 06:36 939633899 -rw-r--r-- 1 root root 258196 Nov 18 06:36 939633900 -rw-r--r-- 1 root root 210051 Nov 18 06:36 939633901 -rw-r--r-- 1 root root 261781 Nov 18 06:36 939633902 -rw-r--r-- 1 root root 32682 Nov 18 06:16 939633903 -rw-r--r-- 1 root root 197097 Nov 18 06:36 939633909 -rw-r--r-- 1 root root 5035 Nov 18 06:16 939633910 -rw-r--r-- 1 root root 174043 Nov 18 06:36 939634022 -rw-r--r-- 1 root root 243884 Nov 18 06:36 939634023 -rw-r--r-- 1 root root 244551 Nov 18 06:36 939634024 -rw-r--r-- 1 root root 187625 Nov 18 06:36 939634025 -rw-r--r-- 1 root root 146903 Nov 18 06:36 939707628 -rw-r--r-- 1 root root 217365 Nov 18 06:36 939707629 -rw-r--r-- 1 root root 172851 Nov 18 06:36 939707630 -rw-r--r-- 1 root root 252126 Nov 18 06:36 939707633 -rw-r--r-- 1 root root 258958 Nov 18 06:36 939707636 -rw-r--r-- 1 root root 5908 Nov 18 06:36 939707639 -rw-r--r-- 1 root root 189787 Nov 18 06:36 939707641 -rw-r--r-- 1 root root 119942 Nov 18 06:36 939707643 -rw-r--r-- 1 root root 1206126 Nov 18 06:36 939707645 -rw-r--r-- 1 root root 2931 Sep 24 04:09 939707712 -rw-r----- 1 root bacula 887 May 29 10:46 939734343 -rw-r--r-- 1 root root 1626 Aug 31 2005 939783610 -rw-r--r-- 1 root root 675 Aug 31 2005 939783612 -rw-r--r-- 1 root root 1877 Aug 31 2005 939783613 -rw-r--r-- 1 root root 436 Aug 31 2005 939783614 -rw-r--r-- 1 root root 1379 Aug 31 2005 939783616 -rw-r--r-- 1 root root 1194 Jan 14 2006 939783623 -rw-r----- 1 root root 82 Oct 23 2005 939783635 -rw-r----- 1 root bacula 312 Apr 30 2006 939783663 -rw-r----- 1 root
bacula 668 Feb 19 2006 939783672 -rw-r--r-- 1 root root 1184 Dec 15 2004 939804193 -rw-r--r-- 1 root root 237 Dec 15 2004 939804194 -rw-r--r-- 1 root root 255 Dec 15 2004 939804195 -rw-r--r-- 1 root root 672 Dec 15 2004 939804196 -rw-r--r-- 1 root root 1482 Dec 15 2004 939804197 -rw-r--r-- 1 root root 4104 Dec 15 2004 939804198 -rw-r--r-- 1 root root 257 Dec 15 2004 939804199 -rw-r--r-- 1 root root 396 Dec 15 2004 939804200 -rw-r--r-- 1 root root 1530 Dec 15 2004 939804201 -rw-r----- 1 root root 126 Jul 18 01:29 939804203 -rw-r----- 1 root root 725 Jul 18 01:29 939804205 -rw-r--r-- 1 root root 1720 Aug 31 2005 939804206 -rw-r----- 1 root bacula 273 Jul 18 01:29 939804242 -rw-r--r-- 1 root root 503 Jan 14 2006 939808074 -rw-r--r-- 1 root root 501 Jan 14 2006 939808075 -rw-r--r-- 1 root root 520 Jan 14 2006 939808076 -rw-r--r-- 1 root root 2558 Jan 14 2006 939808077 -rw-r--r-- 1 root root 195404 Nov 18 06:36 939819891 -rw-r----- 1 root bacula 84 Aug 22 2005 939820963 -rw-r----- 1 root bacula 248 Aug 22 2005 939820964 -rw-r----- 1 root bacula 184 Aug 22 2005 939820965 -rw-r----- 1 root bacula 1545 Aug 22 2005 939820966 -rw-r----- 1 root bacula 831 Aug 22 2005 939820968 -rw-r----- 1 root bacula 555 Aug 22 2005 939820969 -rw-r----- 1 root bacula 101 Aug 22 2005 939820970 -rw-r----- 1 root bacula 614 Aug 22 2005 939820971 -rw-r----- 1 root bacula 251 Aug 22 2005 939820972 -rw-r----- 1 root bacula 93 Aug 22 2005 939820973 -rw-r----- 1 root bacula 138 Aug 22 2005 939820974 -rw-r--r-- 1 root root 840 Apr 7 2005 939841107 -rw-r--r-- 1 root root 3393 Dec 15 2004 939843487 -rw-r--r-- 1 root root 9232 Nov 20 18:08 939872184 -rw-r--r-- 1 root root 0 Jan 28 2005 939872208 -rwxr-xr-x 1 root root 211 Feb 28 2005 940744021 -rwxr-xr-x 1 root root 162 Dec 19 2004 940744022 -rw-r--r-- 1 root root 22291 Oct 19 05:33 940923246
-rw-r--r-- 1 root root 14612 Oct 19 05:33 940923247 -rw-r--r-- 1 root root 402 Sep 30 14:31 940936619 lrwxrwxrwx 1 root root 57 Oct 9 18:37 941261016 -> /usr/share/python-support/python-feedparser/feedparser.py -rw-r--r-- 1 root root 84 Oct 28 07:54 941261017 -rw-r--r-- 1 root root 94122 Sep 2 07:11 941261018 lrwxrwxrwx 1 root root 47 Oct 9 18:37 941266732 -> /usr/share/python-support/python-gtk2/pygtk.pth lrwxrwxrwx 1 root root 46 Oct 9 18:37 941266764 -> /usr/share/python-support/python-gtk2/pygtk.py -rw-r--r-- 1 root root 1604 Oct 7 07:33 941266765 -rw-r----- 1 root bacula 7235 Nov 2 17:58 941274207 -rw-r--r-- 1 root root 813 May 23 2005 941446125 -rw-r--r-- 1 root root 408 May 23 2005 941458243 -rw-r----- 1 root bacula 273 Oct 21 04:41 941460430 -rw-r----- 1 root root 141 Oct 21 04:41 941460434 -rw-r----- 1 root root 79 Oct 21 04:41 941460443 -rw-r----- 1 root bacula 118 Oct 21 04:41 941460445 -rw-r----- 1 root root 377 Oct 21 04:41 941460452 -rw-r----- 1 root root 1300 Oct 21 04:41 941460454 -rw-r----- 1 root bacula 747 Oct 21 04:41 941460455 -rw-r----- 1 root bacula 1394 Oct 21 04:41 941460457 -rw-r--r-- 1 root root 30 Apr 19 2005 941510370 -rw-r--r-- 1 root root 42 Apr 19 2005 941510371 -rw-r--r-- 1 root root 27091 Nov 11 06:28 941514067 -rw-r--r-- 1 root root 28928 Nov 11 06:28 941514074 -rw-r--r-- 1 root root 27044 Sep 1 03:50 941514114 -rw-r--r-- 1 root root 27044 Sep 1 03:50 941514115 -rw-r--r-- 1 root root 95494 Jun 3 14:05 941907501 -rw-r--r-- 1 root root 144186 Jun 3 14:05 941907502 -rwxr-xr-x 1 root root 64 Aug 24 16:20 941933094 -rw-r--r-- 1 root root 3055 Jan 27 2005 941933223 -rw-r--r-- 1 root root 170 May 23 2004 941933225 -rw-r--r-- 1 root root 2855 Feb 5 2005 941933226 -rw-r--r-- 1 root root 230 Sep 28 2004 941942463 -rw-r--r-- 1 root root 1564 Sep 2 07:12 942080921 -rw-r--r-- 1 root
root 82729 Oct 14 10:01 0m9420900750m -rw-r--r-- 1 root root 1565 Jul 17 2002 0m9420904510m -rw-r--r-- 1 root root 321 Jul 1 11:53 0m9420920490m -rw-r--r-- 1 root root 145 Jul 1 10:57 0m9420923040m -rw-r--r-- 1 root root 850 Jul 1 10:57 0m9420923050m -rw-r--r-- 1 root root 546 Sep 2 07:14 0m9420923060m -rw------- 1 root root 24399 Nov 4 06:21 0m9422603730m m1;36mroot@1[lost+found]#0;39m file * 1074268001: directory 1074268007: directory 1074268009: directory 1074268014: directory 1074268017: directory 1074268019: directory 1074268022: directory 1074268024: directory 1207961425: directory 1342179089: directory 939525281: XML document text 939525296: XML document text 939525297: XML document text 939525298: XML document text 939525299: XML document text 939525300: XML document text 939525301: XML document text 939525302: XML document text 939525425: XML document text 939525426: XML document text 939526164: XML document text 939526200: XML document text 939526263: XML document text 939526266: XML document text 939526268: XML document text 939526270: XML document text 939526271: XML document text 939526280: XML document text 939526281: XML document text 939526282: XML document text 939526283: XML document text 939526284: XML document text 939526286: XML document text 939526287: XML document text 939526288: XML document text 939526300: XML document text 939526301: XML document text 939526302: XML document text 939526303: XML document text 939526304: XML document text 939526305: XML document text 939526307: XML document text 939526308: XML document text 939526309: XML document text 939526314: XML document text 939526315: XML document text 939526316: XML document text 939526317: XML document text 939526318: XML document text 939526328: XML document text 939526329: XML document text 939526330: XML document text 939526331: XML document text 939526332: XML document text 939526333: XML document text 939526334: XML document text 939526335: XML document text 939526934: XML 
document text 939526937: XML document text 939626830: XML document text 939626837: XML document text 939629683: XML document text 939629684: XML document text 939629685: XML document text 939629686: XML document text 939629687: XML document text 939629690: XML document text 939629691: XML document text 939633842: XML document text 939633843: XML document text 939633881: Assembler source 939633897: XML document text 939633898: XML document text 939633899: XML document text 939633900: XML document text 939633901: XML document text 939633902: XML document text 939633903: XML document text 939633909: XML document text 939633910: XML document text 939634022: XML document text 939634023: XML document text 939634024: XML document text 939634025: XML document text 939707628: XML document text 939707629: XML document text 939707630: XML document text 939707633: XML document text 939707636: XML document text 939707639: XML document text 939707641: XML document text 939707643: XML document text 939707645: XML document text 939707712: UTF-8 Unicode text 939734343: ASCII text 939783610: ASCII C++ program text 939783612: ASCII C++ program text 939783613: ASCII C++ program text 939783614: ASCII C++ program text 939783616: ASCII C++ program text 939783623: ASCII C++ program text 939783635: ASCII text 939783663: ASCII text 939783672: ASCII text 939804193: ASCII C++ program text 939804194: ASCII C++ program text 939804195: ASCII C++ program text 939804196: ASCII C++ program text 939804197: ASCII C++ program text 939804198: ASCII C++ program text 939804199: ASCII C++ program text 939804200: ASCII C++ program text 939804201: ASCII C++ program text 939804203: ASCII text 939804205: ASCII text 939804206: ASCII English text 939804242: ASCII text 939808074: ASCII C++ program text 939808075: ASCII C++ program text 939808076: ASCII C++ program text 939808077: ASCII C++ program text 939819891: XML document text 939820963: ASCII text 939820964: ASCII text 939820965: ASCII text 939820966: ASCII 
text, with very long lines 939820968: ASCII text 939820969: ASCII text 939820970: ASCII text 939820971: ASCII text, with very long lines 939820972: ASCII text 939820973: ASCII text 939820974: ASCII text 939841107: ASCII English text 939843487: ASCII C++ program text 939872184: ASCII text, with very long lines 939872208: empty 940744021: Bourne shell script text executable 940744022: Bourne shell script text executable 940923246: UTF-8 Unicode text 940923247: UTF-8 Unicode text 940936619: ASCII text 941261016: broken symbolic link to `/usr/share/python-support/python-feedparser/feedparser.py' 941261017: ASCII text 941261018: data 941266732: broken symbolic link to `/usr/share/python-support/python-gtk2/pygtk.pth' 941266764: broken symbolic link to `/usr/share/python-support/python-gtk2/pygtk.py' 941266765: data 941274207: ASCII English text, with very long lines 941446125: ASCII English text 941458243: ASCII text 941460430: ASCII text 941460434: ASCII text 941460443: ASCII text 941460445: ASCII text 941460452: ASCII English text 941460454: ASCII English text 941460455: ASCII text 941460457: ASCII English text 941510370: ASCII text 941510371: ASCII text 941514067: XML document text 941514074: XML document text 941514114: XML document text 941514115: XML document text 941907501: ASCII text 941907502: ASCII text, with very long lines 941933094: ASCII text 941933223: ASCII English text 941933225: ASCII English text 941933226: ASCII English text 941942463: ASCII text 942080921: ASCII English text 942090075: ASCII English text 942090451: exported SGML document text 942092049: ASCII English text 942092304: ASCII text 942092305: ASCII English text 942092306: ASCII English text 942260373: ASCII English text 1;36mroot@1[lost+found]#0;39m more * *** 1074268001: directory *** *** 1074268007: directory *** *** 1074268009: directory *** *** 1074268014: directory *** *** 1074268017: directory *** *** 1074268019: directory *** *** 1074268022: directory *** *** 1074268024: directory 
***
*** 1207961425: directory ***
*** 1342179089: directory ***
root@1[/]# !1
xfs_repair -n /dev/hda2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem starting at / ...
        - traversal finished ...
        - traversing all unattached subtrees ...
        - traversals finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
root@1[/]#
Script done on Thu Nov 23 10:06:36 2006

On Wed, 22 Nov 2006, Russell Cattelan wrote:

> On Mon, 2006-11-20 at 15:00 -0500, Justin Piszcz wrote:
> > Anyone know what could cause this?
> > Is the last good kernel to use 2.6.17.6 w/XFS bugfix patch?
> >
> > Was a new bug introduced from 2.6.16.6 -> 2.6.17.13?
>
> you must have missed the "new bugs introduced in this release" :-)
>
> There is no info here to go on.
>
> run xfs_repair -n
> if it looks like an existing bug add to it
> if not open a new bug and attach repair output to it.
> > > Nov 20 13:16:58 box [4299533.469000] 0x0: 00 00 00 00 00 00 00 00 00 00 00 > > 00 00 > > 00 00 00 > > Nov 20 13:16:58 box [4299533.469000] Filesystem "hda2": XFS internal error > > xfs_da_do_buf(2) at line 2212 of file fs/xfs/xfs_da_btree.c. Caller > > 0xc01ffcad > > Nov 20 13:16:58 box [4299533.469000] > > Nov 20 13:16:58 box xfs_corruption_error+0xf2/0x11a > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box [4299533.469000] > > Nov 20 13:16:58 box kmem_zone_alloc+0x60/0xdd > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box xfs_da_buf_make+0xf7/0x14c > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box [4299533.469000] > > Nov 20 13:16:58 box xfs_da_do_buf+0x935/0x98d > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box [4299533.469000] > > Nov 20 13:16:58 box __alloc_pages+0x53/0x2d6 > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box xfs_da_read_buf+0x3b/0x3f > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box [4299533.469000] > > Nov 20 13:16:58 box xfs_da_node_lookup_int+0xd0/0x399 > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box xfs_da_node_lookup_int+0xd0/0x399 > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box [4299533.469000] > > Nov 20 13:16:58 box xfs_dir2_node_lookup+0x3f/0xb9 > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box xfs_dir2_lookup+0x137/0x139 > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box [4299533.470000] > > Nov 20 13:16:58 box __alloc_pages+0x53/0x2d6 > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box xfs_dir_lookup_int+0x40/0x125 > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box [4299533.470000] > > Nov 20 13:16:58 box xfs_lookup+0x5f/0x88 > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box xfs_vn_lookup+0x4f/0x93 > > Nov 20 13:16:58 box 
> > Nov 20 13:16:58 box [4299533.470000] > > Nov 20 13:16:58 box do_lookup+0x12d/0x15f > > Nov 20 13:16:58 box > > Nov 20 13:16:58 box > > > -- > Russell Cattelan > From owner-xfs@oss.sgi.com Thu Nov 23 09:38:43 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 09:38:50 -0800 (PST) Received: from slurp.thebarn.com (cattelan-host202.dsl.visi.com [208.42.117.202]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kANHcfaG019084 for ; Thu, 23 Nov 2006 09:38:43 -0800 Received: from [10.0.0.12] (ease.thebarn.com [10.0.0.12]) (authenticated bits=0) by slurp.thebarn.com (8.13.8/8.13.8) with ESMTP id kANHbqiD071807; Thu, 23 Nov 2006 11:37:53 -0600 (CST) (envelope-from cattelan@thebarn.com) Message-ID: <4565DC6A.9080602@thebarn.com> Date: Thu, 23 Nov 2006 11:37:46 -0600 From: Russell Cattelan User-Agent: Mozilla Thunderbird 1.0.7 (Macintosh/20050923) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Timothy Shimmin CC: Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> <45647042.2040604@sandeen.net> <1164212695.19915.65.camel@xenon.msp.redhat.com> <45647CF8.8020104@sandeen.net> <26F2AE58A7D40E5170649BC2@timothy-shimmins-power-mac-g5.local> In-Reply-To: <26F2AE58A7D40E5170649BC2@timothy-shimmins-power-mac-g5.local> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9760 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cattelan@thebarn.com Precedence: bulk X-list: xfs Content-Length: 5806 Lines: 145 Timothy Shimmin wrote: > Hi Guys, > > So just looking at the first part, which as Eric suggested can be > considered > on its 
own. > > Index: work_gfs/fs/xfs/xfs_attr.c > =================================================================== > --- work_gfs.orig/fs/xfs/xfs_attr.c 2006-11-21 18:38:27.572949303 > -0600 > +++ work_gfs/fs/xfs/xfs_attr.c 2006-11-21 18:44:51.666033422 -0600 > @@ -210,8 +210,20 @@ xfs_attr_set_int(xfs_inode_t *dp, const > * (inode must not be locked when we call this routine) > */ > if (XFS_IFORK_Q(dp) == 0) { > - if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) > - return(error); > + if ((dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) || > + ((dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS) && > + (dp->i_d.di_anextents == 0))) { > + /* xfs_bmap_add_attrfork will set the forkoffset based on > + * the size needed, the local attr case needs the size > + * attr plus the size of the hdr, if the size of > + * header is not accounted for initially the forkoffset > + * won't allow enough space, the actually attr add will > + * then be forced out out line to extents > + */ > + size += sizeof(xfs_attr_sf_hdr_t); > + if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) > + return(error); > + } > } > > --On 22 November 2006 10:38:16 AM -0600 Eric Sandeen > wrote: > >>> By fixing the initial size calculation at least things like SElinux >>> which is adding one attr won't cause the attr segment to flip to >>> extents >>> immediately. >>> The second attr will cause the flip but not the first one. >> >> >> I'd say this part (fixing up proper space for the initial attr fork >> setup) should probably go in >> soon if it gets good reviews (with the removal of the extra tests, as >> we discussed on irc last >> night). I think this proper change stands on its own just fine. >> > > So yeah, as you said in IRC, the brace is in the wrong spot and > the di_aformat tests don't make any sense here. > Basically, we know that fork offset is zero and therefore that the > di_aformat should be > set at XFS_DINODE_FMT_EXTENTS and di_anetents will be zero. 
> As this is the state before we add in an attribute fork.
> Why we have this initial state as extents, I'm not too sure and wondered in the
> past. Maybe because this state is one which doesn't occupy any space in the
> literal area.
> A shortform EA has a header at least.
>
> My next concern is that the size that is calculated is presumably trying to
> accommodate the shortform EA. However, the calculation is for the sf header and
> the space for an xfs_attr_leaf_name_local with given namelen and valuelen.
> It would be better to base it on an xfs_attr_sf_entry type.
> So I think we need to rework this calculation.
>
> Which leads me on to the next issue.
> We don't know what EA form we are going to take,
> so we can't really assume that it will be shortform.
> If the EA name or value is big then the EA will go into extents and could
> occupy very little room in the inode.
> With the current & proposed test this could make the bytesfit function return 0
> (the offset calculated in bytesfit could also go negative) and
> then we would set the forkoff back at the old attr1 default.
> So we might have 1 EA extent in the inode taking little space and yet setting
> the forkoff in the middle.

Yes, I agree: the worst-case scenario is that the inode has reverted to an
attr1 split and that space is being wasted in the attr portion. By the time an
inode has flipped to btree mode for di_u, how much of a performance hit is
really going to be noticed? Mapping the blocks for that inode is going to take
multiple reads. Attr2 seems most effective at space optimization for the local
and extent versions of di_u and probably not so much for btree.

At least by fixing the size calculation, the shortforms that do fit into di_a
will now be added inline. What is happening now is that the btree is being
refactored, which is probably expensive, and the attr is then added as extents
anyway, since the original size used for the btree refactoring wasn't enough.
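
The effect of leaving the shortform header out of the reservation can be sketched with a little arithmetic. This is only a simplified model, not the kernel code: it assumes a 256-byte inode with a 156-byte literal area, a 4-byte shortform header, 3 bytes of per-entry overhead, and a fork offset stored in 8-byte units. The function names and the 7-byte name / 26-byte value are hypothetical, chosen for illustration.

```python
# Simplified model of reserving attr-fork space for one shortform attr.
# All constants below are assumptions for illustration, not values taken
# from the thread.
LITINO = 156           # assumed literal area of a 256-byte inode (256 - 100)
SF_HDR_SIZE = 4        # assumed sizeof(xfs_attr_sf_hdr_t)
SF_ENTRY_OVERHEAD = 3  # assumed per-entry bytes: namelen, valuelen, flags


def sf_entry_size(namelen, valuelen):
    """Bytes one shortform entry occupies, excluding the shared header."""
    return SF_ENTRY_OVERHEAD + namelen + valuelen


def attr_space_reserved(size):
    """Attr-fork bytes left after placing the fork offset for `size` bytes.

    The offset is assumed to be kept in 8-byte units and rounded down,
    so the attr fork may end up with slightly more than `size` bytes.
    """
    forkoff = (LITINO - size) >> 3
    return LITINO - (forkoff << 3)


def fits_inline(namelen, valuelen, reserve_hdr):
    """Does the first attr fit in shortform given how space was reserved?"""
    entry = sf_entry_size(namelen, valuelen)
    requested = entry + (SF_HDR_SIZE if reserve_hdr else 0)
    needed = entry + SF_HDR_SIZE  # shortform always carries the header
    return attr_space_reserved(requested) >= needed


# Hypothetical attr: 7-byte name, 26-byte value.
print(fits_inline(7, 26, reserve_hdr=False))  # False
print(fits_inline(7, 26, reserve_hdr=True))   # True
```

Under these assumptions the first attr is pushed out of line when the header is not counted, and stays inline once it is. Some sizes fit either way because of the rounding to 8-byte units.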

So the change to add in the header size will at least make the single-attr
case more efficient, since it will now be inline. If the attr does not fit
inline then, worst case, the forkoff flips to the attr1 default, or a
half-and-half split. Given the cost of refactoring a btree, might it be better
to have attr1 behavior? Since di_a will have extra space, additional attr adds
won't cause forkoff to move and thus won't cause a rebalance of di_u.

So, thinking about this more, does it make sense to actually not try to
optimize the space needed for di_u when it is a btree? Maybe the first time an
attr is added, simply split the inode space, doing the rebalance once? That
would allow more attrs to be added without rebalancing the data btree. The
other scheme, space optimization when the root btree node is sparse, would
give more space to di_a, but at the expense of a rebalance.

So yes, it's a bit of a guessing game.

>
> Of course the setting of the forkoff is a bit of a guessing game since we can't
> predict the future usage but I think the plan is to set it to the minimum to fit
> on a first come first served basis.
>
> So I'm thinking that we should set it based on the size of shortform if that
> is how it will be stored or to the size taken up by the EA extents -
> I was initially thinking that this would be 1 extent but with a remote value
> block of up to 64K this could in theory be an extent for each fsb of the value
> I guess.
> Have to think about this some more.
> > --Tim > From owner-xfs@oss.sgi.com Thu Nov 23 14:08:40 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 14:08:48 -0800 (PST) Received: from page.mel.office.aconex.com (mail.aconex.com [150.101.159.26]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kANM8caG022645 for ; Thu, 23 Nov 2006 14:08:40 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id 8FB44534260; Fri, 24 Nov 2006 09:07:45 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 22427-01-18; Fri, 24 Nov 2006 09:07:44 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 27D3A534244; Fri, 24 Nov 2006 09:07:44 +1100 (EST) Subject: Re: [xfs-masters] Re: 2.6.19-rc6 : Spontaneous reboots, stack overflows - seems to implicate xfs, scsi, networking, SMP From: Nathan Scott Reply-To: nscott@aconex.com To: xfs-masters@oss.sgi.com Cc: Al Viro , David Miller , dgc@sgi.com, jesper.juhl@gmail.com, chatz@melbourne.sgi.com, linux-kernel@vger.kernel.org, xfs@oss.sgi.com, netdev@vger.kernel.org, linux-scsi@vger.kernel.org In-Reply-To: <1164269545.31358.771.camel@laptopd505.fenrus.org> References: <9a8748490611211551v2ebe88fel2bcf25af004c338a@mail.gmail.com> <9a8748490611220458w4d94d953v21f7a29a9f1bdb72@mail.gmail.com> <20061123011809.GY37654165@melbourne.sgi.com> <20061122.201013.112290046.davem@davemloft.net> <20061123043543.GI3078@ftp.linux.org.uk> <1164269545.31358.771.camel@laptopd505.fenrus.org> Content-Type: text/plain Organization: Aconex Date: Fri, 24 Nov 2006 09:08:45 +1100 Message-Id: <1164319725.4695.367.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 9762 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: 
bulk X-list: xfs Content-Length: 764 Lines: 24 On Thu, 2006-11-23 at 09:12 +0100, Arjan van de Ven wrote: > On Thu, 2006-11-23 at 04:35 +0000, Al Viro wrote: > > > I would even say 10 function calls deep to allocate file blocks > > > is overkill, but 22 it just astronomically bad. > > > > Especially since a large part is due to cxfs... > > - > > it's a bit sad to see XFS this crippled in linux due to an external, > proprietary module ;( Heh, never let reality get in the way of a good conspiracy theory. The stack depth in XFS is more a factor of the complexity of the XFS space allocation algorithms, and is unrelated to CXFS. I'm sure if people would point to specific stack issues they would (continue to) get addressed. Its just so much easier to speculate randomly though... cheers. -- Nathan From owner-xfs@oss.sgi.com Thu Nov 23 14:51:01 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 14:51:09 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kANMouaG027386 for ; Thu, 23 Nov 2006 14:51:00 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA07497; Fri, 24 Nov 2006 09:49:58 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kANMnu7Y44113373; Fri, 24 Nov 2006 09:49:57 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kANMnqvF44340509; Fri, 24 Nov 2006 09:49:52 +1100 (AEDT) Date: Fri, 24 Nov 2006 09:49:52 +1100 From: David Chinner To: Russell Cattelan Cc: Eric Sandeen , Timothy Shimmin , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: <20061123224952.GZ11034@melbourne.sgi.com> References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> 
<45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> <45647042.2040604@sandeen.net> <1164212695.19915.65.camel@xenon.msp.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1164212695.19915.65.camel@xenon.msp.redhat.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 9763 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1306 Lines: 32 On Wed, Nov 22, 2006 at 10:24:55AM -0600, Russell Cattelan wrote: > On Wed, 2006-11-22 at 09:44 -0600, Eric Sandeen wrote: > > Timothy Shimmin wrote: > > > Thanks, Russell. > > > > > > I've been going thru the irc and just started looking at the patch. > > > I'll get back to you about it tomorrow. > > > > > > I agree it would be good to have the fixed forkoff for data btree roots > > > as the first fix. And look into redoing the btree root for a later change. > > > > My only question is, how much does this defeat the purpose of attr2? > Well from the standpoint that attr2 currently corrupts inodes anything > to prevent that is good, since currently attr2 can't be used at all. > When the di_u is extent based the attr2 code works as expected, giving > space to which ever segment gets there first.The attr2 code should still > be a big win for most file/dir inodes since they are probably able to do > their block mapping with local or extent mode. I suggest that dbench -x might be the best way to determine the perfomrance impact - attr1 on a single disk would get ~5MB/s; attr2 on the same disk for the same test gave about 50MB/s (IIRC). I'd hope that this fix retains that kind of advantage for attr2.... Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 23 15:09:00 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 15:09:08 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kANN8uaG029512 for ; Thu, 23 Nov 2006 15:08:58 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA07998; Fri, 24 Nov 2006 10:07:51 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kANN7m7Y44300694; Fri, 24 Nov 2006 10:07:48 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kANN7i3F44341771; Fri, 24 Nov 2006 10:07:44 +1100 (AEDT) Date: Fri, 24 Nov 2006 10:07:44 +1100 From: David Chinner To: Justin Piszcz Cc: Russell Cattelan , xfs@oss.sgi.com Subject: Re: XFS CORRUPTION 2.6.17.13? Message-ID: <20061123230744.GA11034@melbourne.sgi.com> References: <1164231716.19915.68.camel@xenon.msp.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-archive-position: 9764 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1874 Lines: 54 On Thu, Nov 23, 2006 at 11:40:38AM -0500, Justin Piszcz wrote: > Here is the info: > > Script started on Thu Nov 23 09:55:38 2006 > 1;36mroot@1[~]#0;39m xfs_repair -n /dev/hda2 > Phase 1 - find and verify superblock... > Phase 2 - using internal log > - scan filesystem freespace and inode maps... > - found root inode chunk > Phase 3 - for each AG... > - scan (but don't clear) agi unlinked lists... > - process known inodes and perform inode discovery... 
>         - agno = 0
>         - agno = 1
>         - agno = 2
>         - agno = 3
>         - agno = 4
>         - agno = 5
>         - agno = 6
>         - agno = 7
> data fork in regular inode 939526080 claims used block 114661
> bad data fork in inode 939526080
> would have cleared inode 939526080
......
> data fork in regular inode 939526111 claims used block 114692
> bad data fork in inode 939526111
> would have cleared inode 939526111

Looks like half an inode cluster has been trashed in some way (32
consecutive inodes are bad). All the following errors appear to be a
direct result of these inodes being trashed. Are you using 256 byte
inodes? If so, that means that the 32 inodes would have been
written in a single buffer, and so that buffer write would be
suspect.

FWIW, Irix XFS actually validates inode buffers before they get
written out, so if it was a bad write that might have been caught on
Irix. Unfortunately, we don't do those checks in Linux (most of the
hooks are there, just not used) so it is possible that some kind of
memory corruption has led to this damaged state on disk.

Seeing as you've repaired the filesystem, we can't really get a dump
of the raw inode data to find out exactly how they were corrupted.
Unless you have a copy of the fs around somewhere?

Cheers,

Dave.
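
The arithmetic behind the "32 consecutive inodes in a single buffer" observation can be checked quickly. This is only a sketch: the inode numbers come from the xfs_repair output quoted above, while the 256-byte inode size, the 8192-byte cluster buffer and the 64-inode chunk size are assumptions, and the helper name is hypothetical.

```python
# Worked arithmetic for the damaged inode range reported by xfs_repair.
FIRST_BAD = 939526080    # first bad inode in the quoted repair output
LAST_BAD = 939526111     # last bad inode in the quoted repair output
INODE_SIZE = 256         # bytes per inode (assumed mkfs.xfs default)
CLUSTER_BYTES = 8192     # assumed size of one inode cluster buffer
INODES_PER_CHUNK = 64    # assumed inode allocation chunk size


def damaged_span(first, last, inode_size):
    """Return (number of inodes, bytes covered) for an inclusive range."""
    count = last - first + 1
    return count, count * inode_size


count, nbytes = damaged_span(FIRST_BAD, LAST_BAD, INODE_SIZE)
print(count)                          # 32
print(nbytes == CLUSTER_BYTES)        # True
print(FIRST_BAD % INODES_PER_CHUNK)   # 0
```

With those assumptions the bad range covers exactly one 8192-byte buffer and starts on a 64-inode boundary, which is consistent with a single bad buffer write covering the first half of a chunk.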
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 23 15:18:49 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 15:18:54 -0800 (PST) Received: from lucidpixels.com (lucidpixels.com [66.45.37.187]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kANNIlaG031240 for ; Thu, 23 Nov 2006 15:18:48 -0800 Received: by lucidpixels.com (Postfix, from userid 1001) id 4315E61012A0; Thu, 23 Nov 2006 18:18:00 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id 3A86D16172EB3; Thu, 23 Nov 2006 18:18:00 -0500 (EST) Date: Thu, 23 Nov 2006 18:18:00 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: David Chinner cc: Russell Cattelan , xfs@oss.sgi.com Subject: Re: XFS CORRUPTION 2.6.17.13? In-Reply-To: <20061123230744.GA11034@melbourne.sgi.com> Message-ID: References: <1164231716.19915.68.camel@xenon.msp.redhat.com> <20061123230744.GA11034@melbourne.sgi.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 9765 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs Content-Length: 2368 Lines: 70 Ah, I did not make a copy unfortunatley. I used the default mkfs.xfs settings from Knoppix 4.0.2 I believe, whatever version of xfsprogs it has for a 400GB drive. Any idea how it could have been 'trashed' in one way? It appeared to occur shortly after boot-up, which then a backup occurs (heavy I/O) to a remote box via NFS. Justin. On Fri, 24 Nov 2006, David Chinner wrote: > On Thu, Nov 23, 2006 at 11:40:38AM -0500, Justin Piszcz wrote: > > Here is the info: > > > > Script started on Thu Nov 23 09:55:38 2006 > > 1;36mroot@1[~]#0;39m xfs_repair -n /dev/hda2 > > Phase 1 - find and verify superblock... > > Phase 2 - using internal log > > - scan filesystem freespace and inode maps... 
> > - found root inode chunk > > Phase 3 - for each AG... > > - scan (but don't clear) agi unlinked lists... > > - process known inodes and perform inode discovery... > > - agno = 0 > > - agno = 1 > > - agno = 2 > > - agno = 3 > > - agno = 4 > > - agno = 5 > > - agno = 6 > > - agno = 7 > > data fork in regular inode 939526080 claims used block 114661 > > bad data fork in inode 939526080 > > would have cleared inode 939526080 > ...... > > data fork in regular inode 939526111 claims used block 114692 > > bad data fork in inode 939526111 > > would have cleared inode 939526111 > > Looks like half an inode cluster has been trashed in some way (32 > consecutive inodes are bad). All the following errors appear to be a > direct result of these inodes being trashed. Are you using 256 byte > inodes? if it is, that means that the 32 inodes would have been > written in a single buffer, and so that buffer write would be > suspect. > > FWIW, Irix XFS actually validates inode buffers before they get > written out, so if it was a bad write that might have been caught on > irix. Unfortunately, we don't do those checks in Linux (most of the > hooks are there, just not used) so it is possible that some kind of > memory corruption has lead to this damaged state on disk. > > Seeing as you've repair the filesystem, we can't really get a dump > of the raw inode data to find out exactly how they were corrupted. > Unless you have a copy of the fs around somewhere? > > Cheers, > > Dave. 
> > -- > Dave Chinner > Principal Engineer > SGI Australian Software Group > From owner-xfs@oss.sgi.com Thu Nov 23 15:56:24 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 15:56:30 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kANNuLaG003275 for ; Thu, 23 Nov 2006 15:56:22 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA09667; Fri, 24 Nov 2006 10:55:30 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kANNtR7Y40283492; Fri, 24 Nov 2006 10:55:28 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kANNtP4Q43546242; Fri, 24 Nov 2006 10:55:25 +1100 (AEDT) Date: Fri, 24 Nov 2006 10:55:25 +1100 From: David Chinner To: Justin Piszcz Cc: David Chinner , Russell Cattelan , xfs@oss.sgi.com Subject: Re: XFS CORRUPTION 2.6.17.13? Message-ID: <20061123235525.GD11034@melbourne.sgi.com> References: <1164231716.19915.68.camel@xenon.msp.redhat.com> <20061123230744.GA11034@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-archive-position: 9766 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 707 Lines: 28 On Thu, Nov 23, 2006 at 06:18:00PM -0500, Justin Piszcz wrote: > Ah, > > I did not make a copy unfortunatley. It was too much to hope for :/ > I used the default mkfs.xfs settings from Knoppix 4.0.2 I believe, > whatever version of xfsprogs it has for a 400GB drive. Ok, so 256 byte inodes then. A single bad buffer write is possible then. > Any idea how it could have been 'trashed' in one way? 
It appeared to > occur shortly after boot-up, which then a backup occurs (heavy I/O) to a > remote box via NFS. No idea - I was hoping to get a clue by looking at the raw corrupted data if you still had it around..... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 23 20:46:18 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 20:46:26 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAO4kFaG012394 for ; Thu, 23 Nov 2006 20:46:16 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA15973; Fri, 24 Nov 2006 15:45:15 +1100 Date: Fri, 24 Nov 2006 15:47:41 +1100 From: Timothy Shimmin To: Russell Cattelan cc: Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: <85F7BBFB66573AAC402078C9@timothy-shimmins-power-mac-g5.local> In-Reply-To: <4565DC6A.9080602@thebarn.com> References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> <45647042.2040604@sandeen.net> <1164212695.19915.65.camel@xenon.msp.redhat.com> <45647CF8.8020104@sandeen.net> <26F2AE58A7D40E5170649BC2@timothy-shimmins-power-mac-g5.local> <4565DC6A.9080602@thebarn.com> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9767 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 6844 Lines: 156 --On 23 November 2006 
11:37:46 AM -0600 Russell Cattelan wrote: > Timothy Shimmin wrote: > >> Hi Guys, >> >> So just looking at the first part, which as Eric suggested can be >> considered >> on its own. >> >> Index: work_gfs/fs/xfs/xfs_attr.c >> =================================================================== >> --- work_gfs.orig/fs/xfs/xfs_attr.c 2006-11-21 18:38:27.572949303 >> -0600 >> +++ work_gfs/fs/xfs/xfs_attr.c 2006-11-21 18:44:51.666033422 -0600 >> @@ -210,8 +210,20 @@ xfs_attr_set_int(xfs_inode_t *dp, const >> * (inode must not be locked when we call this routine) >> */ >> if (XFS_IFORK_Q(dp) == 0) { >> - if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) >> - return(error); >> + if ((dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) || >> + ((dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS) && >> + (dp->i_d.di_anextents == 0))) { >> + /* xfs_bmap_add_attrfork will set the forkoffset based on >> + * the size needed, the local attr case needs the size >> + * attr plus the size of the hdr, if the size of >> + * header is not accounted for initially the forkoffset >> + * won't allow enough space, the actually attr add will >> + * then be forced out out line to extents >> + */ >> + size += sizeof(xfs_attr_sf_hdr_t); >> + if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) >> + return(error); >> + } >> } >> >> --On 22 November 2006 10:38:16 AM -0600 Eric Sandeen >> wrote: >> >>>> By fixing the initial size calculation at least things like SElinux >>>> which is adding one attr won't cause the attr segment to flip to >>>> extents >>>> immediately. >>>> The second attr will cause the flip but not the first one. >>> >>> >>> I'd say this part (fixing up proper space for the initial attr fork >>> setup) should probably go in >>> soon if it gets good reviews (with the removal of the extra tests, as >>> we discussed on irc last >>> night). I think this proper change stands on its own just fine. 
>>>
>>
>> So yeah, as you said in IRC, the brace is in the wrong spot and
>> the di_aformat tests don't make any sense here.
>> Basically, we know that fork offset is zero and therefore that the
>> di_aformat should be
>> set at XFS_DINODE_FMT_EXTENTS and di_anextents will be zero.
>> As this is the state before we add in an attribute fork.
>> Why we have this initial state as extents, I'm not too sure and
>> wondered in the
>> past. Maybe because this state is one which doesn't occupy any space
>> in the literal area.
>> A shortform EA has a header at least.
>>
>> My next concern is that the size that is calculated is presumably
>> trying to accommodate
>> the shortform EA. However, the calculation is for the sf header and
>> the space for
>> a xfs_attr_leaf_name_local with given namelen and valuelen.
>> It would be better to base it on an xfs_attr_sf_entry type.
>> So I think we need to rework this calculation.
>>
>> Which leads me on to the next issue.
>> We don't know what EA form we are going to take,
>> so we can't really assume that it will be shortform.
>> If the EA name or value is big then the EA will go into extents and
>> could occupy very
>> little room in the inode.
>> With the current & proposed test this could make the bytesfit function
>> return 0
>> (the offset calculated in bytesfit could also go negative) and
>> then we would set the forkoff back at the old attr1 default.
>> So we might have 1 EA extent in the inode taking little space and yet
>> setting the forkoff
>> in the middle.
>
> Yes I agree worst case scenario is that the inode has reverted to an attr1 split and
> that space is being wasted in the attr portion. By the time an inode has flipped to
> btree mode for di_u how much of a performance hit is really going to be noticed?
> mapping the blocks for that inode is going to take multiple reads.
> Attr2 seems most effective at space optimization for the local and extent
> versions of di_u and probably not so much for btree.
>
> At least by fixing the size calculation the shortforms that do fit into di_a will now
> be added inline. What is happening now is that the btree is being refactored
> which is probably expensive and the attr is being added as extents since the
> original size used for the btree refactoring wasn't enough.
>
> So the change to add in the header size will at least make single case
> attrs more efficient since they will now be inline.

No argument there, I just want to make sure the SF size calculation is correct.
I think we need a new macro here.
Currently we just accumulate the size using the totsize field in the header and so on
an EA add, just add in the space for 1 entry: i.e.

	newsize = XFS_ATTR_SF_TOTSIZE(args->dp);
	newsize += XFS_ATTR_SF_ENTSIZE_BYNAME(args->namelen, args->valuelen);

We need a macro, at least, for the total size for 1 given EA.
xfs_attr_leaf_newentsize is not necessarily the space for a shortform (SF) EA - the structs do have different field sizes.

> If the attr does not fit inline then worst case the forkoff flips to attr1
> default or a half and half split.
>
> Given the cost of refactoring a btree it might be better to have attr1 behavior?
> Since di_a will have extra space additional attr adds won't cause forkoff to
> move and thus won't cause a rebalance of di_u.
>

If the SF EA can't fit in the literal space, and goes to extents then adding further EAs probably isn't going to move the forkoff at all either, since they'll probably just go into the EA leaf block.

> So in thinking about this more does it make sense to actually not
> try to optimize the space needed for di_u when it is a btree?
> Maybe the first time an attr is added simply split the inode
> space doing the rebalance once?
> That would allow for more attrs to be added without rebalancing
> the data btree.
> The other scheme of space optimization if the root btree node
> is sparse would say sure give more space to di_a but at
> the expense of a rebalance.
>
> So ya it's a bit of a guessing game.
>

Yeah I'm inclined not to worry about compressing the btree data root but just working out where it really finishes in the literal area and not pushing it.
(All this would be a lot easier to discuss in person with a whiteboard, as I think Eric was kind of suggesting in irc :)

Back to the 1st part. I'm thinking that if I can't fit the SF EA in the available space, then we might just assume that we should leave EA space for the MAX(2 extents, min-root-size) i.e. try for 2 extents (bytesfit will make sure we at least have the min-root-size). The forkoff is just a good guess after all - it's a tradeoff.

Okay, need to do some coding...

--Tim

From owner-xfs@oss.sgi.com Thu Nov 23 20:49:26 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 23 Nov 2006 20:49:33 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAO4nPaG012798 for ; Thu, 23 Nov 2006 20:49:26 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kAO4VmBb009394 for ; Thu, 23 Nov 2006 23:31:48 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAO4VmFZ004471 for ; Thu, 23 Nov 2006 23:31:48 -0500 Received: from [127.0.0.1] (sebastian-int.corp.redhat.com [172.16.52.221]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAO4VmHX026697 for ; Thu, 23 Nov 2006 23:31:48 -0500 Message-ID: <456675B4.5050302@redhat.com> Date: Thu, 23 Nov 2006 22:31:48 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: xfs problems with xen blktap driver Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9768 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to:
xfs-bounce@oss.sgi.com X-original-sender: sandeen@redhat.com Precedence: bulk X-list: xfs Content-Length: 370 Lines: 14 https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217098 Bugzilla Bug 217098: corruption of xen image if stored on XFS partition apparently using xfs as backing store for xen guest block devices, using the blktap driver, is busted. Ring any bells? I need to read up on what blktap does (apparently large parts in userspace, O_DIRECT + aio....) Thanks, -Eric From owner-xfs@oss.sgi.com Sat Nov 25 20:26:49 2006 Received: with ECARTIS (v1.0.0; list xfs); Sat, 25 Nov 2006 20:26:56 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAQ4QmaG010878 for ; Sat, 25 Nov 2006 20:26:49 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kAQ4Pw5Q020163; Sat, 25 Nov 2006 23:25:58 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAQ4PwTt030354; Sat, 25 Nov 2006 23:25:58 -0500 Received: from [127.0.0.1] (sebastian-int.corp.redhat.com [172.16.52.221]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAQ4PvD2006265; Sat, 25 Nov 2006 23:25:57 -0500 Message-ID: <45691753.60500@redhat.com> Date: Sat, 25 Nov 2006 22:25:55 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.org, xfs@oss.sgi.com CC: rtc@gmx.de Subject: [PATCH/RFC] pass dio_complete proper offset from finished_one_bio Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9778 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@redhat.com Precedence: bulk X-list: xfs Content-Length: 1924 Lines: 60 We saw 
problems w/ xfs doing AIO+DIO into a sparse file.

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217098

It seemed that xfs was doing "extent conversion" at the wrong offsets, so written regions came up as unwritten (zeros) and stale data was exposed in the region after the write. Thanks to Peter Backes for the very nice testcase. This also broke xen with blktap over xfs.

Here's what I found:

finished_one_bio calls dio_complete calls xfs_end_io_direct with an offset, but:

	offset = dio->iocb->ki_pos;

so this is the -current- post-IO position, not the IO start point that dio_complete expects. So, xfs converts the wrong region. Ouch!

XFS seems to be the only filesystem that uses the offset passed to the end_io function, so only it is affected by this as near as I can tell. However, the "short read" case is probably also wrong, as it is checking:

	if ((dio->rw == READ) && ((offset + transferred) > dio->i_size))
		transferred = dio->i_size - offset;

but offset is the ending IO position in this case too.

This patch seems to fix it up. Comments?
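The offset arithmetic described above can be tried outside the kernel. What follows is only a sketch under assumptions: struct dio_sketch, dio_start_offset() and dio_clamp_short_read() are made-up names for illustration and are not the code in fs/direct-io.c; only the ki_pos / result / i_size relationship and the read check are taken from the description above.

```c
#include <assert.h>

/* Hypothetical stand-in for the few struct dio fields discussed here. */
struct dio_sketch {
	long long ki_pos;   /* file position after the I/O completed */
	long long result;   /* bytes transferred by this request */
	long long i_size;   /* file size seen by the request */
	int is_read;        /* nonzero for a read */
};

/* Start of the I/O: the end position minus the bytes moved. */
static long long dio_start_offset(const struct dio_sketch *dio)
{
	assert(dio->result >= 0 && dio->result <= dio->ki_pos);
	return dio->ki_pos - dio->result;
}

/*
 * Bytes to report: a read whose end position is past EOF is trimmed
 * to EOF (this assumes the read started at or before EOF).
 */
static long long dio_clamp_short_read(const struct dio_sketch *dio)
{
	long long offset = dio_start_offset(dio);
	long long transferred = dio->result;

	if (dio->is_read && dio->ki_pos > dio->i_size)
		transferred = dio->i_size - offset;
	return transferred;
}
```

With ki_pos = 8192 and result = 4096 the I/O started at 4096; treating ki_pos itself as the start would describe the range beginning at 8192 instead, which is the region after the write.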
Thanks,

-Eric

Signed-off-by: Eric Sandeen

Index: linux-2.6.18/fs/direct-io.c
===================================================================
--- linux-2.6.18.orig/fs/direct-io.c
+++ linux-2.6.18/fs/direct-io.c
@@ -244,12 +244,12 @@ static void finished_one_bio(struct dio
 		 */
 		spin_unlock_irqrestore(&dio->bio_lock, flags);
 
-		/* Check for short read case */
 		transferred = dio->result;
-		offset = dio->iocb->ki_pos;
+		offset = dio->iocb->ki_pos - transferred;
 
+		/* Check for short read case */
 		if ((dio->rw == READ) &&
-		    ((offset + transferred) > dio->i_size))
+		    (dio->iocb->ki_pos > dio->i_size))
 			transferred = dio->i_size - offset;
 
 		/* check for error in completion path */

From owner-xfs@oss.sgi.com Sun Nov 26 06:06:45 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 26 Nov 2006 06:06:53 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAQE6iaG015090 for ; Sun, 26 Nov 2006 06:06:45 -0800 Received: from [192.168.1.5] (c-68-55-210-74.hsd1.dc.comcast.net [68.55.210.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 5327618E21425; Sun, 26 Nov 2006 08:05:55 -0600 (CST) Message-ID: <45699F40.9090700@sandeen.net> Date: Sun, 26 Nov 2006 08:05:52 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: David Chinner CC: Russell Cattelan , Tim Shimmin , xfs@oss.sgi.com Subject: Re: [PATCH 1/2] Make stuff static References: <45338DDE.8020903@sandeen.net> <4533FAEA.2080500@sandeen.net> <20061016232250.GM11034@melbourne.sgi.com> <1161042943.5723.117.camel@xenon.msp.redhat.com> <20061017005038.GN11034@melbourne.sgi.com> <20061017215706.GI8394166@melbourne.sgi.com> <1161125131.5723.158.camel@xenon.msp.redhat.com> <20061122004216.GT11034@melbourne.sgi.com> <1164157783.19915.46.camel@xenon.msp.redhat.com> <20061122042445.GR37654165@melbourne.sgi.com> In-Reply-To:
<20061122042445.GR37654165@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9779 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 307 Lines: 13 David Chinner wrote: > Performance appears to be slight faster with the noinline > patch, but the variation is within the error margins of > my measurements so I'd say it's neutral. > > Comments? with fewer inlines & more function calls, what about stack frames adding up? can we measure that? -Eric From owner-xfs@oss.sgi.com Sun Nov 26 06:32:23 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 26 Nov 2006 06:32:31 -0800 (PST) Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAQEWLaG017943 for ; Sun, 26 Nov 2006 06:32:22 -0800 Received: from [192.168.1.5] (c-68-55-210-74.hsd1.dc.comcast.net [68.55.210.74]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 6C90218E20C7D; Sun, 26 Nov 2006 08:31:32 -0600 (CST) Message-ID: <4569A541.3030600@sandeen.net> Date: Sun, 26 Nov 2006 08:31:29 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Arjan van de Ven CC: Al Viro , David Miller , dgc@sgi.com, jesper.juhl@gmail.com, chatz@melbourne.sgi.com, linux-kernel@vger.kernel.org, xfs@oss.sgi.com, xfs-masters@oss.sgi.com, netdev@vger.kernel.org, linux-scsi@vger.kernel.org Subject: Re: 2.6.19-rc6 : Spontaneous reboots, stack overflows - seems to implicate xfs, scsi, networking, SMP References: <9a8748490611211551v2ebe88fel2bcf25af004c338a@mail.gmail.com> <9a8748490611220458w4d94d953v21f7a29a9f1bdb72@mail.gmail.com> <20061123011809.GY37654165@melbourne.sgi.com> <20061122.201013.112290046.davem@davemloft.net> 
<20061123043543.GI3078@ftp.linux.org.uk> <1164269545.31358.771.camel@laptopd505.fenrus.org> In-Reply-To: <1164269545.31358.771.camel@laptopd505.fenrus.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9780 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 608 Lines: 18 Arjan van de Ven wrote: > On Thu, 2006-11-23 at 04:35 +0000, Al Viro wrote: >>> I would even say 10 function calls deep to allocate file blocks >>> is overkill, but 22 it just astronomically bad. >> Especially since a large part is due to cxfs... >> - > > it's a bit sad to see XFS this crippled in linux due to an external, > proprietary module ;( > I understand that cxfs is a bit of a whipping-boy, but the stacks in question in this thread really don't have much if anything to do with the filesystem layering in xfs. They are deep callchains & large functions in core xfs code, it seems. 
-Eric From owner-xfs@oss.sgi.com Sun Nov 26 15:00:04 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 26 Nov 2006 15:00:11 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAQN01aG016847 for ; Sun, 26 Nov 2006 15:00:03 -0800 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA10337; Mon, 27 Nov 2006 09:59:08 +1100 Message-ID: <456A1CD4.8000301@sgi.com> Date: Mon, 27 Nov 2006 10:01:40 +1100 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: sgi.bugs.xfs@engr.sgi.com CC: linux-xfs@oss.sgi.com Subject: TAKE 958534 - [clone 958472] dmf does not work with mangrove 1.0 enhanced dmapi module Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9781 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 600 Lines: 18 DMF doesn't work with mangrove 1.0 because of uninitialized dt_dev in xfs_ip_to_stat() Date: Mon Nov 27 09:56:20 AEDT 2006 Workarea: soarer.melbourne.sgi.com:/home/vapo/isms/linux-xfs Inspected by: chatz Author: vapo The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:27551a fs/xfs/dmapi/xfs_dm.c - 1.29 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/dmapi/xfs_dm.c.diff?r1=text&tr1=1.29&r2=text&tr2=1.28&f=h - pv 958534, rv chatz - add forgotten initialization of dt_dev in xfs_ip_to_stat() From owner-xfs@oss.sgi.com Sun Nov 26 22:21:55 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 26 Nov 2006 22:22:02 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP 
id kAR6LpaG031812 for ; Sun, 26 Nov 2006 22:21:54 -0800 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA25831; Mon, 27 Nov 2006 17:20:43 +1100 Message-ID: <456A8452.9060602@sgi.com> Date: Mon, 27 Nov 2006 17:23:14 +1100 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: Christoph Hellwig CC: sgi.bugs.xfs@engr.sgi.com, linux-xfs@oss.sgi.com Subject: Re: TAKE 958534 - [clone 958472] dmf does not work with mangrove 1.0 enhanced dmapi module References: <456A1CD4.8000301@sgi.com> <20061127055657.GB1374@infradead.org> In-Reply-To: <20061127055657.GB1374@infradead.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9782 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 335 Lines: 12 Christoph Hellwig wrote: > On Mon, Nov 27, 2006 at 10:01:40AM +1100, Vlad Apostolov wrote: > >> DMF doesn't work with mangrove 1.0 because of uninitialized dt_dev in >> xfs_ip_to_stat() >> > > What's mangrove? > The mangrove tree is an internal tree we are using to track top-of-xfs stuff backported to the SLES10 kernel. 
From owner-xfs@oss.sgi.com Sun Nov 26 22:29:03 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 26 Nov 2006 22:29:11 -0800 (PST) Received: from mailgate.mysql.com (mailgate-out2.mysql.com [213.136.52.68]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAR6T1aG000561 for ; Sun, 26 Nov 2006 22:29:03 -0800 Received: from localhost (localhost.localdomain [127.0.0.1]) by mailgate.mysql.com (8.13.4/8.13.4) with ESMTP id kAR5t0EZ025237; Mon, 27 Nov 2006 06:55:00 +0100 Received: from mail.mysql.com ([10.222.1.99]) by localhost (mailgate.mysql.com [10.222.1.98]) (amavisd-new, port 10026) with LMTP id 10928-10; Mon, 27 Nov 2006 06:55:00 +0100 (CET) Received: from [192.168.1.100] (ppp163-199.static.internode.on.net [150.101.163.199]) (authenticated bits=0) by mail.mysql.com (8.13.3/8.13.3) with ESMTP id kAR5sqoJ028104 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Mon, 27 Nov 2006 06:54:56 +0100 Subject: Re: XFS_IOC_RESVSP64 versus XFS_IOC_ALLOCSP64 with multiple threads From: Stewart Smith To: Sam Vaughan Cc: xfs@oss.sgi.com In-Reply-To: <950D2C3E-11AE-4805-9286-65ECD880272D@sgi.com> References: <1163381602.11914.10.camel@localhost.localdomain> <965ECEF2-971D-46A1-B3F2-C6C1860C9ED8@sgi.com> <1163390942.14517.12.camel@localhost.localdomain> <12275452-56ED-4921-899F-EFF1C05B251A@sgi.com> <1163395250.14517.38.camel@localhost.localdomain> <950D2C3E-11AE-4805-9286-65ECD880272D@sgi.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-+0Xx/HOYEKeIYPkemlLi" Organization: MySQL AB Date: Mon, 27 Nov 2006 05:55:01 +0000 Message-Id: <1164606901.26726.18.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1 X-archive-position: 9783 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stewart@mysql.com Precedence: bulk X-list: xfs Content-Length: 4187 Lines: 115 --=-+0Xx/HOYEKeIYPkemlLi Content-Type: text/plain 
Content-Transfer-Encoding: quoted-printable

On Tue, 2006-11-14 at 11:04 +1100, Sam Vaughan wrote:
> Those extents are curiously uniform, all 32kB in size. The fact that
> both files' extents are in AG 8 suggests that the two directories
> ndb_1_fs and ndb_2_fs filled their original AGs and spilled out into
> other ones, which is when the interference would have started.
> Looking at the directory hierarchy in your last email, you might be
> better off if you could add another directory for the datafiles and
> undofiles to live in, so they don't end up sharing their AG with
> other stuff in their parent directory.

I think this is typically what the QA guys do (to help keep their sanity if anything). Perhaps we should have this in our "best practice" documentation as well...

> > for the data and undo files, we're just not changing their size except
> > at creation time, so that's okay.
>
> I'd assumed that these files were being continually grown. If all
> this is happening at creation time then it shouldn't be too hard to
> make sure the files are cleanly allocated with just one extent. Does
> the following not work on your file system?
>
> $ touch a b
> $ for file in a b; do
> > xfs_io -c 'allocsp 1G 0' $file &
> > done; wait
> [1] 12312
> [2] 12313
> [1]- Done xfs_io -c 'allocsp 1G 0' $file
> [2]+ Done xfs_io -c 'allocsp 1G 0' $file
> $ xfs_bmap -v a b
> a:
> EXT: FILE-OFFSET   BLOCK-RANGE           AG AG-OFFSET            TOTAL
>   0: [0..2097151]: 231732008..233829159   6 (11968856..14066007) 2097152
> b:
> EXT: FILE-OFFSET   BLOCK-RANGE           AG AG-OFFSET            TOTAL
>   0: [0..2097151]: 233829160..235926311   6 (14066008..16163159) 2097152
> $

That works fine on my file systems (or, on my rather full and well used /home, as well as it can).
We're opening the files with O_DIRECT (or, if not available or fails, O_SYNC).

> >> Now in your case you're using different directories, so your files
> >> are probably OK at the start of day. Once the AGs they start in fill
> >> up though, the files for both processes will start getting allocated
> >> from the next available AG. At that point, allocations that started
> >> out looking like the first test above will end up looking like the
> >> second.
> >>
> >> The filestreams allocator will stop this from happening for
> >> applications that write data regularly like video ingest servers, but
> >> I wouldn't expect it to be a cure-all for your database app because
> >> your writes could have large delays between them. Instead, I'd look
> >> into ways to break up your data into AG-sized chunks, starting a new
> >> directory every time you go over that magic size.
> >
> > I'll have to check our writing behaviour for the files that change
> > sizes...
> > but they're not too much of an issue (they're hardly ever read
> > back, so
> > as long as writing them out is okay and reading isn't totally abysmal,
> > we don't have to worry).
>
> That's handy. All in all it sounds like your requirements are very
> file system friendly in terms of getting optimum allocation. I'm not
> sure what could be causing all those 32kB extents.

Perhaps being flushed out due to VM pressure? But with O_DIRECT/O_SYNC that shouldn't be the case, right?

Or perhaps *because* of O_DIRECT/O_SYNC?
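For reference, the open fallback mentioned above (O_DIRECT, else O_SYNC) might look roughly like the following. This is a sketch only, not the actual NDB code: open_direct_or_sync() is a hypothetical name, and the exact conditions under which the real code falls back are an assumption.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes O_DIRECT on Linux/glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: try O_DIRECT first; if the flag is not defined
 * on this platform or the open fails, retry with O_SYNC instead.
 * Returns a file descriptor, or -1 if both attempts fail.
 */
static int open_direct_or_sync(const char *path, int flags, mode_t mode)
{
	int fd = -1;

	assert(path != NULL);
#ifdef O_DIRECT
	fd = open(path, flags | O_DIRECT, mode);
#endif
	if (fd < 0)
		fd = open(path, flags | O_SYNC, mode);
	return fd;
}
```

The caller cannot tell from the descriptor alone which mode was used; if that matters for diagnosing the 32kB extents, fcntl(fd, F_GETFL) can be checked afterwards.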
--
Stewart Smith, Software Engineer
MySQL AB, www.mysql.com
Office: +14082136540 Ext: 6616
VoIP: 6616@sip.us.mysql.com
Mobile: +61 4 3 8844 332

Jumpstart your cluster: http://www.mysql.com/consulting/packaged/cluster.html

--=-+0Xx/HOYEKeIYPkemlLi Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) iD8DBQBFan21KglWCUL+FDoRAuShAKCFCYR9f8UxYdccOPP02RC5dOYUQQCgugkN WrWw90M0e7g//UX2WCHeVjI= =H03E -----END PGP SIGNATURE----- --=-+0Xx/HOYEKeIYPkemlLi-- From owner-xfs@oss.sgi.com Sun Nov 26 22:35:48 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 26 Nov 2006 22:35:55 -0800 (PST) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAR6ZkaG001666 for ; Sun, 26 Nov 2006 22:35:48 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1GoZW7-0000Zt-7t; Mon, 27 Nov 2006 05:58:59 +0000 Date: Mon, 27 Nov 2006 05:58:59 +0000 From: Christoph Hellwig To: Vlad Apostolov Cc: sgi.bugs.xfs@engr.sgi.com, linux-xfs@oss.sgi.com Subject: Re: TAKE 956783 - xfs_dm_getall_dmattr() doesn't check if the user buffer is at valid address Message-ID: <20061127055859.GC1374@infradead.org> References: <45629AD8.8000800@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45629AD8.8000800@sgi.com> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 9784 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 511 Lines: 11 On Tue, Nov 21, 2006 at 05:21:12PM +1100, Vlad Apostolov wrote: > No EFAULT error when dm_getall_dmattr() called with an
invalid user
> buffer address.

This fix is broken. access_ok is not enough to verify the buffer; it only does a few static checks (basically the address space limit). You need to use copy_{from,to}_user to access user pointers.

I had an untested patch to fix this back in my good old SGI days, but Dean wanted to review and test it a lot more. I'll try to dig up that patch if you care.

From owner-xfs@oss.sgi.com Sun Nov 26 22:35:57 2006 Received: with ECARTIS (v1.0.0; list xfs); Sun, 26 Nov 2006 22:36:03 -0800 (PST) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAR6ZuaG001692 for ; Sun, 26 Nov 2006 22:35:57 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1GoZU9-0000ZO-9S; Mon, 27 Nov 2006 05:56:57 +0000 Date: Mon, 27 Nov 2006 05:56:57 +0000 From: Christoph Hellwig To: Vlad Apostolov Cc: sgi.bugs.xfs@engr.sgi.com, linux-xfs@oss.sgi.com Subject: Re: TAKE 958534 - [clone 958472] dmf does not work with mangrove 1.0 enhanced dmapi module Message-ID: <20061127055657.GB1374@infradead.org> References: <456A1CD4.8000301@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <456A1CD4.8000301@sgi.com> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 9785 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 175 Lines: 6

On Mon, Nov 27, 2006 at 10:01:40AM +1100, Vlad Apostolov wrote:
> DMF doesn't work with mangrove 1.0 because of uninitialized dt_dev in
> xfs_ip_to_stat()

What's mangrove?
From owner-xfs@oss.sgi.com Mon Nov 27 00:16:03 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 27 Nov 2006 00:16:11 -0800 (PST) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAR8FxaG014319 for ; Mon, 27 Nov 2006 00:16:02 -0800 Received: from root by ciao.gmane.org with local (Exim 4.43) id 1GoaTC-00074K-2q for linux-xfs@oss.sgi.com; Mon, 27 Nov 2006 08:00:03 +0100 Received: from halleracaffe-14.broadband.pl ([85.232.226.14]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 27 Nov 2006 08:00:02 +0100 Received: from mszpak by halleracaffe-14.broadband.pl with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 27 Nov 2006 08:00:02 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: linux-xfs@oss.sgi.com From: =?ISO-8859-2?Q?Marcin_Zaj=B1czkowski?= Subject: Errors on XFS partition - ask for diagnose Date: Sun, 26 Nov 2006 22:02:49 +0100 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: halleracaffe-14.broadband.pl User-Agent: Thunderbird 1.5.0.7 (X11/20060913) X-archive-position: 9786 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mszpak@wp.pl Precedence: bulk X-list: xfs Content-Length: 3359 Lines: 76 Hi, Recently some of my executable files (on XFS partition) have become "invisible" for whereis, locate, find, bash, mc, nautilus and others. They are not reported by autofill in bash, but I can run it by typing full file name. "ls" with full name returns info about file, "ls" with regexp no. I run xfs_check and it returned me many errors (see below). 
Because I have no experience with errors on an XFS partition, I would like to ask: do you think the file system would still be usable after running xfs_repair (for now only those files are invisible, but they are still accessible)?

Btw, I didn't have any power failure, nor any problems with hardware (at least on this partition, based on the SMART report). The trigger was filling the partition to 100%. Could that have caused these errors on my partition?

Regards
Marcin

[xfs_check-output]
bad free block nused 1 should be 14 for dir ino 48 block 16777216
bad free block nused 0 should be 5 for dir ino 335698 block 16777216
bad free block nused 0 should be 5 for dir ino 1355949 block 16777216
agi unlinked bucket 4 is 44164 in ag 2 (inode=1092740)
agi unlinked bucket 43 is 117355 in ag 2 (inode=1165931)
agi unlinked bucket 59 is 43387 in ag 2 (inode=1091963)
dir 2102297 block 0 entry 454ba2b3v bad inode number 1183729173579457891
dir 2102297 block 0 entry 3 bad inode number 3619544128639099186
dir 2102297 block 0 bad free entry at 296
dir 2102297 block 0 bad bestfree data
dir 2102297 block 0 entry/unused tag mismatch
dir 2102297 block 8388609 extra leaf entry 4d39b335 196
dir 2102297 block 8388609 extra leaf entry 4d5c32e1 60
(...
159 lines removed) dir 2102297 block 8388612 extra leaf entry 4c6b7599 17c dir 2102297 block 8388612 extra leaf entry 4c7a6849 18 bad free block ent 0 is 0 should be 65320 for dir ino 2102297 block 16777216 bad free block ent 1 is 0 should be 8 for dir ino 2102297 block 16777216 bad free block ent 2 is 0 should be 8 for dir ino 2102297 block 16777216 bad free block ent 3 is 0 should be 8 for dir ino 2102297 block 16777216 bad free block ent 4 is 0 should be 8 for dir ino 2102297 block 16777216 bad free block ent 5 is 0 should be 8 for dir ino 2102297 block 16777216 bad free block ent 6 is 0 should be 8 for dir ino 2102297 block 16777216 bad free block ent 7 is 0 should be 8 for dir ino 2102297 block 16777216 bad free block ent 8 is 0 should be 8 for dir ino 2102297 block 16777216 bad free block ent 10 is 0 should be 40 for dir ino 2102297 block 16777216 bad free block nused 1 should be 12 for dir ino 2102297 block 16777216 dir ino 2102297 missing leaf entry for ebbb4d51/a dir ino 2102297 missing leaf entry for e118f281/21 dir ino 2102297 missing leaf entry for e34e82df/13 bad free block nused 0 should be 4 for dir ino 3426262 block 16777216 allocated inode 1091963 has 0 link count allocated inode 1092740 has 0 link count allocated inode 1165931 has 0 link count disconnected inode 2146977, nlink 1 disconnected inode 2130509, nlink 1 (... 97 lines removed) disconnected inode 2159424, nlink 1 disconnected inode 2272314, nlink 1 link count mismatch for inode 2136334 (name ?), nlink 8, counted 9 disconnected inode 2182819, nlink 1 disconnected inode 2146376, nlink 1 (... 
62 lines removed) disconnected inode 2256376, nlink 1 disconnected inode 2160161, nlink 1 [/xfs_check-output] From owner-xfs@oss.sgi.com Mon Nov 27 04:51:42 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 27 Nov 2006 04:51:52 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kARCpcaG025830 for ; Mon, 27 Nov 2006 04:51:40 -0800 Received: from [134.14.52.201] (pmmelb201.melbourne.sgi.com [134.14.52.201]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id XAA09106; Mon, 27 Nov 2006 23:50:31 +1100 Message-ID: <456ADF08.4080002@sgi.com> Date: Mon, 27 Nov 2006 23:50:16 +1100 From: Tim Shimmin Reply-To: tes@sgi.com Organization: SGI User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Russell Cattelan , Eric Sandeen CC: xfs@oss.sgi.com Subject: Re: [PATCH] (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> <45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> <45647042.2040604@sandeen.net> <1164212695.19915.65.camel@xenon.msp.redhat.com> <45647CF8.8020104@sandeen.net> <26F2AE58A7D40E5170649BC2@timothy-shimmins-power-mac-g5.local> <4565DC6A.9080602@thebarn.com> In-Reply-To: <4565DC6A.9080602@thebarn.com> Content-Type: multipart/mixed; boundary="------------090006010607070904010804" X-archive-position: 9789 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 10945 Lines: 291 This is a multi-part message in MIME format. 
--------------090006010607070904010804 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi Russell & Eric, Basing on Russell's patch, I was thinking of something like the attached patch. However, I'm wondering if the xfs_attr_set_int() change to use a req_size for non-fitting shortform EA's is worth it - as it is a bit of a prediction (trying to codify what I was thinking). Russell, perhaps I should just send in sf_size like you initially intended. In fact, the more I think about it, I'm more inclined to just pass in sf_size. Haven't tested the patch out yet. Just wanted to discuss it a bit. Cheers, Tim. Russell Cattelan wrote: > Timothy Shimmin wrote: > >> Hi Guys, >> >> So just looking at the first part, which as Eric suggested can be >> considered >> on its own. >> >> Index: work_gfs/fs/xfs/xfs_attr.c >> =================================================================== >> --- work_gfs.orig/fs/xfs/xfs_attr.c 2006-11-21 18:38:27.572949303 >> -0600 >> +++ work_gfs/fs/xfs/xfs_attr.c 2006-11-21 18:44:51.666033422 -0600 >> @@ -210,8 +210,20 @@ xfs_attr_set_int(xfs_inode_t *dp, const >> * (inode must not be locked when we call this routine) >> */ >> if (XFS_IFORK_Q(dp) == 0) { >> - if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) >> - return(error); >> + if ((dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) || >> + ((dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS) && >> + (dp->i_d.di_anextents == 0))) { >> + /* xfs_bmap_add_attrfork will set the forkoffset based on >> + * the size needed, the local attr case needs the size >> + * attr plus the size of the hdr, if the size of >> + * header is not accounted for initially the forkoffset >> + * won't allow enough space, the actually attr add will >> + * then be forced out out line to extents >> + */ >> + size += sizeof(xfs_attr_sf_hdr_t); >> + if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) >> + return(error); >> + } >> } >> >> --On 22 November 2006 10:38:16 AM -0600 Eric 
Sandeen >> wrote: >> >>>> By fixing the initial size calculation at least things like SElinux >>>> which is adding one attr won't cause the attr segment to flip to >>>> extents >>>> immediately. >>>> The second attr will cause the flip but not the first one. >>> >>> >>> I'd say this part (fixing up proper space for the initial attr fork >>> setup) should probably go in >>> soon if it gets good reviews (with the removal of the extra tests, as >>> we discussed on irc last >>> night). I think this proper change stands on its own just fine. >>> >> >> So yeah, as you said in IRC, the brace is in the wrong spot and >> the di_aformat tests don't make any sense here. >> Basically, we know that fork offset is zero and therefore that the >> di_aformat should be >> set at XFS_DINODE_FMT_EXTENTS and di_anetents will be zero. >> As this is the state before we add in an attribute fork. >> Why we have this initial state as extents, I'm not too sure and >> wondered in the >> past. Maybe because this state is one which doesn't occupy any space >> in the literal area. >> A shortform EA has a header at least. >> >> My next concern is that the size that is calculated is presumably >> trying to accomodate >> the shortform EA. However, the calculation is for the sf header and >> the space for a >> a xfs_attr_leaf_name_local with given namelen and valuelen. >> It would be better to base it on an xfs_attr_sf_entry type. >> So I think we need to rework this calculation. >> >> Which leads me on to the next issue. >> We don't know what EA form we are going to take, >> so we can't really assume that it will be shortform. >> If the EA name or value is big then the EA will go into extents and >> could occupy very >> little room in the inode. >> With the current & proposed test this could make the bytesfit function >> return 0 >> (the offset calculated in bytesfit could also go negative) and >> then we would set the forkoff back at the old attr1 default. 
>> So we might have 1 EA extent in the inode taking little space and yet >> setting the forkoff >> in the middle. > > Yes I agree worst cast scenario is that the inode has reverted to an > attr1 split and > that space is being wasted in the attr portion. By the time an inode has > flipped to > btree mode for di_u how much of a performance hit is really going to be > noticed? > mapping the blocks for that inode is going to take multiple reads. > Attr2 seems most effective at space optimization for the local and extent > versions of di_u and probably not so much for btree. > > At least by fixing the size calculation the shortforms that do fit into > di_a will now > be added inline. What is happening now the btree is being re factored > which is probably expensive and the attr is being added as extents since > the > original size used for the btree refactoring wasn't enough. > > So the change to add in the header size will at least make single case > attrs more efficient since they will now be inline. > If the attr does not fit inline then worst case the forkoff flips to attr1 > default or a half and half split. > > Given the cost of refactoring a btree it might be better to have attr1 > behavior? > Since di_a will have extra space additional attr adds won't cause > forkoff to > move and thus won't cause a rebalance of di_u. > > So in thinking about this more does it make sense to actually not > try to optimize the space needed for di_u when it is a btree? > Maybe the first time an attr is added simply split the inode > space doing the rebalance once? > That would allow for more attrs to be added without rebalancing > the data btree. > The other scheme of space optimzation if the root btree node > is sparse would say sure give more space to di_a but at > the expense of a reblanace. > > So ya it's a bit of a guessing game. 
> >> >> Of course the setting of the forkoff is a bit of a guessing game since >> we can't >> predict the future usage but I think the plan is to set it to the >> minimum to fit >> on a first come first served basis. >> >> So I'm thinking that we should set it based on the size of shortform >> if that >> is how it will be stored or to the size taken up by the EA extents - >> I was initially thinking that this would be 1 extent but with a remote >> value >> block of up to 64K this could in theory be an extent for each fsb of >> the value >> I guess. >> Have to think about this some more. >> >> --Tim >> --------------090006010607070904010804 Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0"; name="attr2.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="attr2.patch" --- .pc/attr2_tes/fs/xfs/xfs_attr.c 2006-10-26 17:45:01.000000000 +1000 +++ fs/xfs/xfs_attr.c 2006-11-27 22:58:49.753629073 +1100 @@ -210,7 +210,19 @@ xfs_attr_set_int(xfs_inode_t *dp, const * (inode must not be locked when we call this routine) */ if (XFS_IFORK_Q(dp) == 0) { - if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) + int req_size; + int sf_size = sizeof(xfs_attr_sf_hdr_t) + XFS_ATTR_SF_ENTSIZE_BYNAME(namelen, valuelen); + + if (local && (sf_size <= (XFS_LITINO(mp) - xfs_ifork_dsize_used(dp)))) + req_size = sf_size; + else + /* + * We can't fit our SF EA inline, so leave space for 2 EA extents + * which should cover most initial EAs and most EAs in general + */ + req_size = 2 * sizeof(xfs_bmbt_rec_t); + + if ((error = xfs_bmap_add_attrfork(dp, req_size, rsvd))) return(error); } --- .pc/attr2_tes/fs/xfs/xfs_attr_leaf.c 2006-10-26 17:45:01.000000000 +1000 +++ fs/xfs/xfs_attr_leaf.c 2006-11-27 22:59:07.295306537 +1100 @@ -170,18 +170,25 @@ xfs_attr_shortform_bytesfit(xfs_inode_t } /* data fork btree root can have at least this many key/ptr pairs */ - minforkoff = MAX(dp->i_df.if_bytes, XFS_BMDR_SPACE_CALC(MINDBTPTRS)); + minforkoff = 
MAX(xfs_ifork_dsize_used(dp), XFS_BMDR_SPACE_CALC(MINDBTPTRS)); minforkoff = roundup(minforkoff, 8) >> 3; /* attr fork btree root can have at least this many key/ptr pairs */ maxforkoff = XFS_LITINO(mp) - XFS_BMDR_SPACE_CALC(MINABTPTRS); maxforkoff = maxforkoff >> 3; /* rounded down */ - if (offset >= minforkoff && offset < maxforkoff) - return offset; + /* we can't fit inline */ + if (offset < minforkoff) + return 0; + + /* don't move the forkoff for data btree */ + if (dp->i_d.di_format == XFS_DINODE_FMT_BTREE && dp->i_d.di_forkoff) + return dp->i_d.di_forkoff << 3; + if (offset >= maxforkoff) return maxforkoff; - return 0; + else + return offset; } /* --- .pc/attr2_tes/fs/xfs/xfs_bmap.c 2006-11-17 14:35:46.000000000 +1100 +++ fs/xfs/xfs_bmap.c 2006-11-27 15:54:33.166590715 +1100 @@ -3543,6 +3543,7 @@ xfs_bmap_forkoff_reset( if (whichfork == XFS_ATTR_FORK && (ip->i_d.di_format != XFS_DINODE_FMT_DEV) && (ip->i_d.di_format != XFS_DINODE_FMT_UUID) && + (ip->i_d.di_format != XFS_DINODE_FMT_BTREE) && ((mp->m_attroffset >> 3) > ip->i_d.di_forkoff)) { ip->i_d.di_forkoff = mp->m_attroffset >> 3; ip->i_df.if_ext_max = XFS_IFORK_DSIZE(ip) / --- .pc/attr2_tes/fs/xfs/xfs_inode.c 2006-11-27 23:20:19.000000000 +1100 +++ fs/xfs/xfs_inode.c 2006-11-27 23:21:29.604958540 +1100 @@ -4747,3 +4747,34 @@ xfs_iext_irec_update_extoffs( ifp->if_u1.if_ext_irec[i].er_extoff += ext_diff; } } + +/* + * return how much space is used by the inode's data fork + */ +int +xfs_ifork_dsize_used(xfs_inode_t *ip) +{ + switch (ip->i_d.di_format) { + case XFS_DINODE_FMT_DEV: + return sizeof(xfs_dev_t); + case XFS_DINODE_FMT_UUID: + return sizeof(uuid_t); + case XFS_DINODE_FMT_LOCAL: + case XFS_DINODE_FMT_EXTENTS: + return ip->i_df.if_bytes; + case XFS_DINODE_FMT_BTREE: + if (ip->i_d.di_forkoff) + return ip->i_d.di_forkoff << 3; + else + /* + * For new attr fork, data btree takes all the space, + * so no room for any attrs with the current layout + * but we can know how much space it really needs + * 
i.e. the ptrs are half way along but we could compress to + * preserve the num of records. + */ + return XFS_BMDR_SPACE_CALC(XFS_BMAP_BROOT_NUMRECS(ip->i_df.if_broot)); + default: + return 0; + } +} --- .pc/attr2_tes/fs/xfs/xfs_inode.h 2006-11-17 14:35:46.000000000 +1100 +++ fs/xfs/xfs_inode.h 2006-11-27 23:24:09.376554289 +1100 @@ -535,6 +535,7 @@ void xfs_iext_irec_compact(xfs_ifork_t void xfs_iext_irec_compact_pages(xfs_ifork_t *); void xfs_iext_irec_compact_full(xfs_ifork_t *); void xfs_iext_irec_update_extoffs(xfs_ifork_t *, int, int); +int xfs_ifork_dsize_used(xfs_inode_t *); #define xfs_ipincount(ip) ((unsigned int) atomic_read(&ip->i_pincount)) --------------090006010607070904010804-- From owner-xfs@oss.sgi.com Mon Nov 27 06:15:47 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 27 Nov 2006 06:15:54 -0800 (PST) Received: from smtp104.sbc.mail.mud.yahoo.com (smtp104.sbc.mail.mud.yahoo.com [68.142.198.203]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAREFjaG007366 for ; Mon, 27 Nov 2006 06:15:47 -0800 Received: (qmail 13735 invoked from network); 27 Nov 2006 13:46:56 -0000 Received: from unknown (HELO stupidest.org) (cwedgwood@sbcglobal.net@24.5.75.45 with login) by smtp104.sbc.mail.mud.yahoo.com with SMTP; 27 Nov 2006 13:46:56 -0000 X-YMail-OSG: aUebnkYVM1kLbDrtESWCN58N7IbQ8YWb3Xp67ICWNqO3C2Y7.CQc.1nmbwAX8aaHUK0_3IsK9rEXbRNgBkFYrQlpxTbKGzvLjdhsW6.8PIlRewJz1bqSiQ-- Received: by tuatara.stupidest.org (Postfix, from userid 10000) id 959861827282; Mon, 27 Nov 2006 05:46:55 -0800 (PST) Date: Mon, 27 Nov 2006 05:46:55 -0800 From: Chris Wedgwood To: Marcin Zaj?czkowski Cc: linux-xfs@oss.sgi.com Subject: Re: Errors on XFS partition - ask for diagnose Message-ID: <20061127134655.GA11018@tuatara.stupidest.org> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-archive-position: 9791 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com 
X-original-sender: cw@f00f.org Precedence: bulk X-list: xfs Content-Length: 180 Lines: 6 On Sun, Nov 26, 2006 at 10:02:49PM +0100, Marcin Zaj?czkowski wrote: > I run xfs_check and it returned me many errors (see below). http://oss.sgi.com/projects/xfs/faq.html#dir2 From owner-xfs@oss.sgi.com Mon Nov 27 07:34:57 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 27 Nov 2006 07:35:03 -0800 (PST) Received: from herkules.vie.weberhofer.at (85-124-132-100.work.xdsl-line.inode.at [85.124.132.100]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kARFYtaG019306 for ; Mon, 27 Nov 2006 07:34:57 -0800 Received: (qmail 8611 invoked by uid 89); 27 Nov 2006 15:07:26 -0000 Received: from unknown (HELO ?192.168.22.24?) (192.168.22.24) by herkules with SMTP; 27 Nov 2006 15:07:26 -0000 Message-ID: <456AFF30.2060904@weberhofer.at> Date: Mon, 27 Nov 2006 16:07:28 +0100 From: "Johannes Weberhofer, Weberhofer GmbH" User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: Problem after removing SATA II disc without unmounting Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9792 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: office@weberhofer.at Precedence: bulk X-list: xfs Content-Length: 3281 Lines: 107 Hello! I have a problem with a hard disk that was removed from a SATA II slot without being unmounted. After reinserting the disk, a mount attempt results in: server:~ # mount /backup/ mount: /dev/sdb1: can't read superblock **************************************** xfs_check /dev/sdb1 does not show any error **************************************** xfs_repair shows: server:~ # xfs_repair /dev/sdb1 Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG...
- scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... - clearing existing "lost+found" inode - deleting existing "lost+found" entry - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify and correct link counts... done **************************************** in /var/log/message I can see: Nov 27 10:00:01 server kernel: attempt to access beyond end of device Nov 27 10:00:01 server kernel: sdb1: rw=0, want=781417600, limit=781401537 Nov 27 10:00:01 server kernel: I/O error in filesystem ("sdb1") meta-data dev sdb1 block 0x2e937c7f ("xfs_read_buf") error 5 buf count 512 Nov 27 10:00:01 server kernel: XFS: size check 2 failed **************************************** server:~ # cat /proc/partitions major minor #blocks name 8 16 390711384 sdb 8 17 390700768 sdb1 **************************************** I have Opensuse with kernel kernel-default-2.6.16.21-0.25 running. Do you have any ideas/suggestions? 
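The numbers in the kernel log can be cross-checked against /proc/partitions with a little arithmetic. This is only a sketch, assuming 512-byte sectors and 1 KiB units in /proc/partitions; the variable names are made up for illustration, and it shows the size of the mismatch, not its cause:

```shell
#!/bin/sh
# Values copied from the kernel log and /proc/partitions quoted above.
want=781417600                # "want=" from the kernel log, in sectors
limit=781401537               # "limit=" from the kernel log, in sectors
last_block=$((0x2e937c7f))    # block number from the xfs_read_buf error
part_kib=390700768            # sdb1 size from /proc/partitions (assumed 1 KiB units)

# Assumption: one sector is 512 bytes, so 1 KiB is two sectors.
part_sectors=$((part_kib * 2))
shortfall=$((want - limit))

echo "block in read error: $last_block (want - 1 = $((want - 1)))"
echo "partition sectors:   $part_sectors (kernel limit: $limit)"
echo "shortfall:           $shortfall sectors = $((shortfall * 512)) bytes"
```

If those assumptions hold, the block named in the xfs_read_buf error is the last sector the filesystem expects to exist, and it lies past the end of sdb1 as the kernel currently sees it.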
Best regards, Johannes Weberhofer -- |--------------------------------- | weberhofer GmbH | Johannes Weberhofer | information technologies, Austria | | phone : +43 (0)1 5454421 0 | email: office@weberhofer.at | fax : +43 (0)1 5454421 19 | web : http://weberhofer.at | mobile: +43 (0)699 11998315 |----------------------------------------------------------->> From owner-xfs@oss.sgi.com Mon Nov 27 15:59:07 2006 Received: with ECARTIS (v1.0.0; list xfs); Mon, 27 Nov 2006 15:59:15 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kARNx2aG022699 for ; Mon, 27 Nov 2006 15:59:04 -0800 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA04538; Tue, 28 Nov 2006 10:57:55 +1100 Message-ID: <456B7C1A.90209@sgi.com> Date: Tue, 28 Nov 2006 11:00:26 +1100 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: Christoph Hellwig CC: sgi.bugs.xfs@engr.sgi.com, linux-xfs@oss.sgi.com Subject: Re: TAKE 956783 - xfs_dm_getall_dmattr() doesn't check if the user buffer is at valid address References: <45629AD8.8000800@sgi.com> <20061127055859.GC1374@infradead.org> In-Reply-To: <20061127055859.GC1374@infradead.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9794 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 894 Lines: 22 Christoph Hellwig wrote: > On Tue, Nov 21, 2006 at 05:21:12PM +1100, Vlad Apostolov wrote: > >> No EFAULT error when dm_getall_dmattr() called with an invalid user >> buffer address. >> > > This fix is broken. 
access_ok is not enough to verify the buffer, > it just does very few static check (basically the address space limit) > > You need to use copy_{from,to}_user to access user pointers. I had > an untested patch to fix this at my good old SGI time, but Dean wanted > to review and test it a lot more. I'll try to dig up that patch if you care. > The fix is actually fine as it gives an early indication (even not complete) that the user pointer is bad. There is another problem you are pointing at and it is the userspace pointer dereference later on without using copy_to_user(). If you have any patch fixing this problem it would be great. Thanks and regards, Vlad From owner-xfs@oss.sgi.com Tue Nov 28 06:10:47 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 06:10:54 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kASEAiaG023931 for ; Tue, 28 Nov 2006 06:10:46 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kASE9rYH006913; Tue, 28 Nov 2006 09:09:53 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kASE9m0v024849; Tue, 28 Nov 2006 09:09:48 -0500 Received: from [127.0.0.1] (sebastian-int.corp.redhat.com [172.16.52.221]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kASE9lxT013071; Tue, 28 Nov 2006 09:09:47 -0500 Message-ID: <456C432E.2050601@redhat.com> Date: Tue, 28 Nov 2006 08:09:50 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025) MIME-Version: 1.0 To: Eric Sandeen CC: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, xfs@oss.sgi.com, rtc@gmx.de Subject: Re: [PATCH/RFC] pass dio_complete proper offset from finished_one_bio References: <45691753.60500@redhat.com> In-Reply-To: <45691753.60500@redhat.com> Content-Type: text/plain; 
charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9796 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@redhat.com Precedence: bulk X-list: xfs Content-Length: 655 Lines: 18 Eric Sandeen wrote: > We saw problems w/ xfs doing AIO+DIO into a sparse file. > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=217098 > > It seemed that xfs was doing "extent conversion" at the wrong offsets, so > written regions came up as unwritten (zeros) and stale data was exposed > in the region after the write. Thanks to Peter Backes for the very > nice testcase. > > This also broke xen with blktap over xfs. Hrmph. Zach's changes in -mm magically made this go away... I was about to submit a proper patch against -mm but it seems not to be needed. So now I'm digging around to see why that is, and what exactly "fixed" things. -Eric From owner-xfs@oss.sgi.com Tue Nov 28 07:33:03 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 07:33:10 -0800 (PST) Received: from EX01.ad.tulane.edu (ex01.ad.tulane.edu [129.81.114.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kASFX2aG002527 for ; Tue, 28 Nov 2006 07:33:02 -0800 Received: from EX08.ad.tulane.edu ([129.81.114.38]) by EX01.ad.tulane.edu with Microsoft SMTPSVC(6.0.3790.1830); Tue, 28 Nov 2006 09:22:49 -0600 Received: from 129.81.86.224 ([129.81.86.224]) by EX08.ad.tulane.edu ([129.81.114.38]) via Exchange Front-End Server ent.tulane.edu ([129.81.114.4]) with Microsoft Exchange Server HTTP-DAV ; Tue, 28 Nov 2006 15:22:49 +0000 User-Agent: Microsoft-Entourage/11.2.5.060620 Date: Tue, 28 Nov 2006 09:20:03 -0600 Subject: get xfs_quota info as regular user From: Rene Salmon To: CC: Rene Salmon Message-ID: Thread-Topic: get xfs_quota info as regular user Thread-Index: AccTAK0x687OBH7zEdu/5gAKlXZa5g== Mime-version: 1.0 Content-type: text/plain; charset="US-ASCII" Content-transfer-encoding: 7bit
X-OriginalArrivalTime: 28 Nov 2006 15:22:49.0335 (UTC) FILETIME=[10568870:01C71301] X-archive-position: 9797 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: rsalmon@tulane.edu Precedence: bulk X-list: xfs Content-Length: 1213 Lines: 46 Hi, I searched the list archives but could not find any useful info on this. Is there a way for a regular user to get info about his or her quota usage? I tried both of these as a regular user and got nothing: 120> xfs_quota -c "quota userid" 121> xfs_quota -x -c "quota userid" But if I log in as root I can get the info I need: # xfs_quota -c "quota userid" Disk quotas for User userid (17080) Filesystem Blocks Quota Limit Warn/Time Mounted on /dev/vg_u00/lv_u00 1691516 10240000 10291200 00 [--------] /u00 /dev/vg_u01/lv_u01 4272876 10240000 10291200 00 [--------] /u01 Disk quotas for User userid (17080) Filesystem Blocks Quota Limit Warn/Time Mounted on /dev/vg_u00/lv_u00 1691516 10240000 10291200 00 [--------] /u00 /dev/vg_u01/lv_u01 4272876 10240000 10291200 00 [--------] /u01 Is there any way to allow users to check their quotas without being root?
Thanks Rene -- Rene Salmon Tulane University Center for Computational Science http://www.ccs.tulane.edu rsalmon@tulane.edu Tel 504-862-8393 Fax 504-862-8392 From owner-xfs@oss.sgi.com Tue Nov 28 08:15:36 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 08:15:47 -0800 (PST) Received: from wx-out-0506.google.com (wx-out-0506.google.com [66.249.82.226]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kASGFZaG008410 for ; Tue, 28 Nov 2006 08:15:36 -0800 Received: by wx-out-0506.google.com with SMTP id t4so1830693wxc for ; Tue, 28 Nov 2006 08:14:46 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:mime-version:content-type:content-transfer-encoding:content-disposition; b=qAl8HA+hO7vaOABJUVT3MyHsSYaRDjV1UnuuZ8Uaz3DUBO8rndKs90L4uwqJw/i2uE2aeOTFcoE3w7ztcsfoK2e4Ekbl+VFhG+GFNCIdtX3klOqpvmbk6oYt56OGz1Gf6h8GCXjMqC2skSdv3MZfT+Kwdhlxl1pjTdo/ou37quY= Received: by 10.90.49.19 with SMTP id w19mr857383agw.1164728941034; Tue, 28 Nov 2006 07:49:01 -0800 (PST) Received: by 10.90.106.11 with HTTP; Tue, 28 Nov 2006 07:49:00 -0800 (PST) Message-ID: <9a8748490611280749k5c97d21bx2e499d2209d27dfe@mail.gmail.com> Date: Tue, 28 Nov 2006 16:49:00 +0100 From: "Jesper Juhl" To: "Linux Kernel Mailing List" , xfs@oss.sgi.com, xfs-masters@oss.sgi.com Subject: XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c (kernel 2.6.18.1) Cc: "Keith Owens" , "Jesper Juhl" MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9798 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jesper.juhl@gmail.com Precedence: bulk X-list: xfs Content-Length: 2355 Lines: 59 Hi, One of my NFS servers just gave me a nasty surprise that I think it is relevant to tell you about: Filesystem "dm-1": XFS internal error 
xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c. Caller 0xffffffff8034b47e Call Trace: [] show_trace+0xb2/0x380 [] dump_stack+0x15/0x20 [] xfs_error_report+0x3c/0x50 [] xfs_trans_cancel+0x6e/0x130 [] xfs_create+0x5ee/0x6a0 [] xfs_vn_mknod+0x156/0x2e0 [] xfs_vn_create+0xb/0x10 [] vfs_create+0x8c/0xd0 [] nfsd_create_v3+0x31a/0x560 [] nfsd3_proc_create+0x148/0x170 [] nfsd_dispatch+0xf9/0x1e0 [] svc_process+0x437/0x6e0 [] nfsd+0x1cd/0x360 [] child_rip+0xa/0x12 xfs_force_shutdown(dm-1,0x8) called from line 1139 of file fs/xfs/xfs_trans.c. Return address = 0xffffffff80359daa Filesystem "dm-1": Corruption of in-memory data detected. Shutting down filesystem: dm-1 Please umount the filesystem, and rectify the problem(s) nfsd: non-standard errno: 5 nfsd: non-standard errno: 5 nfsd: non-standard errno: 5 nfsd: non-standard errno: 5 nfsd: non-standard errno: 5 (the above message repeats 1670 times, then the following) xfs_force_shutdown(dm-1,0x1) called from line 424 of file fs/xfs/xfs_rw.c. Return address = 0xffffffff80359daa I unmounted the filesystem and ran xfs_repair, which told me to try to mount it first to replay the log. So I did, unmounted it again, ran xfs_repair (which didn't find any problems), and finally mounted it. Everything is good; the filesystem seems intact.
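The sequence described above, written out as a dry run that only prints the commands. DEV and MNT are hypothetical placeholders (this report only names the filesystem as dm-1); LOGDEV is the external log device named in the kernel messages of this report, passed via the logdev= mount option and xfs_repair -l. The first xfs_repair attempt, which only asked for the log to be replayed, is left out:

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# DEV and MNT are hypothetical placeholders, not taken from the report.
DEV=/dev/mapper/example-data
MNT=/mnt/example
LOGDEV=/dev/Log1/ws22_log     # external log device from the kernel messages

recovery_steps() {
    echo "umount $MNT"
    echo "mount -t xfs -o logdev=$LOGDEV $DEV $MNT"   # mounting replays the log
    echo "umount $MNT"
    echo "xfs_repair -l $LOGDEV $DEV"                 # check once the log is clean
    echo "mount -t xfs -o logdev=$LOGDEV $DEV $MNT"
}

recovery_steps
```

Printing the commands first makes it easy to review device names before anything touches the filesystem; whether a given shutdown can be handled this way depends on what xfs_repair reports.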
Filesystem "dm-1": Disabling barriers, not supported with external log device XFS mounting filesystem dm-1 Starting XFS recovery on filesystem: dm-1 (logdev: /dev/Log1/ws22_log) Ending XFS recovery on filesystem: dm-1 (logdev: /dev/Log1/ws22_log) Filesystem "dm-1": Disabling barriers, not supported with external log device XFS mounting filesystem dm-1 Ending clean XFS mount for filesystem: dm-1 The server in question is running kernel 2.6.18.1 -- Jesper Juhl Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html From owner-xfs@oss.sgi.com Tue Nov 28 11:02:47 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 11:02:53 -0800 (PST) Received: from megapolis.pl (mail.megapolis.pl [193.218.115.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kASJ2jaG032447 for ; Tue, 28 Nov 2006 11:02:46 -0800 Received: from [85.232.226.14] (account szpak HELO [192.168.0.61]) by megapolis.pl (CommuniGate Pro SMTP 4.1.6) with ESMTP-TLS id 66712643; Tue, 28 Nov 2006 19:38:19 +0100 Message-ID: <456C7927.6070501@wp.pl> Date: Tue, 28 Nov 2006 19:00:07 +0100 From: =?ISO-8859-2?Q?Marcin_Zaj=B1czkowski?= User-Agent: Thunderbird 1.5.0.7 (X11/20060913) MIME-Version: 1.0 To: chatz@melbourne.sgi.com CC: linux-xfs@oss.sgi.com Subject: Re: Errors on XFS partition - ask for diagnose References: <456AA1D4.6020303@melbourne.sgi.com> In-Reply-To: <456AA1D4.6020303@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 9799 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mszpak@wp.pl Precedence: bulk X-list: xfs Content-Length: 1151 Lines: 36 On 2006-11-27 9:29:08 +0100, David Chatterton wrote: > See: > http://oss.sgi.com/projects/xfs/faq.html#dir2 Thanks, I had 2.6.17 kernel those days. 
I'll try to get xfs_repair >= 2.8.10 for SystemRescueCD, which I use, and try to repair the file system. Regards Marcin > Marcin Zajączkowski wrote: >> Hi, >> >> >> Recently some of my executable files (on XFS partition) have become >> "invisible" for whereis, locate, find, bash, mc, nautilus and others. >> They are not reported by autofill in bash, but I can run it by typing >> full file name. "ls" with full name returns info about file, "ls" with >> regexp no. >> >> I run xfs_check and it returned me many errors (see below). >> >> Because I don't have experience with errors on XFS partition I would >> like to ask, do you think that after ran of xfs_repair file system would >> be still usable (now only those files are invisible, but still accessible)? >> >> >> Btw, I didn't have any power failure, nor problems with hardware (at >> least on this partition - based on smart report). The trigger for that >> was to fill in partition in 100%. Could it cause those error on my >> partition? (...) From owner-xfs@oss.sgi.com Tue Nov 28 16:53:56 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 16:54:01 -0800 (PST) Received: from tyo200.gate.nec.co.jp (TYO200.gate.nec.co.jp [210.143.35.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAT0roaG016794 for ; Tue, 28 Nov 2006 16:53:56 -0800 Received: from tyo201.gate.nec.co.jp ([10.7.69.201]) by tyo200.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id kAT0r1nU004318 for ; Wed, 29 Nov 2006 09:53:02 +0900 (JST) Received: from mailgate3.nec.co.jp (mailgate54.nec.co.jp [10.7.69.195]) by tyo201.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id kAT0kPX8012205 for ; Wed, 29 Nov 2006 09:46:25 +0900 (JST) Received: (from root@localhost) by mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id kAT0kPN07315 for xfs@oss.sgi.com; Wed, 29 Nov 2006 09:46:25 +0900 (JST) Received: from secsv3.tnes.nec.co.jp (tnesvc2.tnes.nec.co.jp [10.1.101.15]) by mailsv4.nec.co.jp (8.11.7/3.7W-MAILSV4-NEC) with ESMTP id kAT0kOu16052 for ; Wed, 29 Nov
2006 09:46:24 +0900 (JST) Received: from tnesvc2.tnes.nec.co.jp ([10.1.101.15]) by secsv3.tnes.nec.co.jp (ExpressMail 5.10) with SMTP id 20061129.095146.06200516 for ; Wed, 29 Nov 2006 09:51:46 +0900 Received: FROM tnessv1.tnes.nec.co.jp BY tnesvc2.tnes.nec.co.jp ; Wed Nov 29 09:51:45 2006 +0900 Received: from rifu.bsd.tnes.nec.co.jp (rifu.bsd.tnes.nec.co.jp [10.1.104.1]) by tnessv1.tnes.nec.co.jp (Postfix) with ESMTP id C9D5EAE4B0 for ; Wed, 29 Nov 2006 09:45:34 +0900 (JST) Received: from TNESG9305.tnes.nec.co.jp (TNESG9305.bsd.tnes.nec.co.jp [10.1.104.199]) by rifu.bsd.tnes.nec.co.jp (8.12.11/3.7W/BSD-TNES-MX01) with SMTP id kAT0kOOo029522; Wed, 29 Nov 2006 09:46:24 +0900 Message-Id: <200611290046.AA04743@TNESG9305.tnes.nec.co.jp> Date: Wed, 29 Nov 2006 09:46:19 +0900 To: xfs@oss.sgi.com Subject: [PATCH] infinite loop in xfs_db From: Utako Kusaka MIME-Version: 1.0 X-Mailer: AL-Mail32 Version 1.13 Content-Type: text/plain; charset=us-ascii X-archive-position: 9802 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: utako@tnes.nec.co.jp Precedence: bulk X-list: xfs Content-Length: 1923 Lines: 63 Hi, I found two issues in xfs_db. 1)bmap, ablock and dblock command hangs up when the target file has either data fork or attr fork which is set XFS_DINODE_FMT_LOCAL. Because bmap() in db/bmap.c performs XFS_DINODE_FMT_BTREE if-block and goes into an infinite loop. 2)bmap command does not show attribute area correctly when no option is specified. Because an offset for attr fork is changed by following code: co = be.startoff + be.blockcount; This patch fixes them. Signed-off-by: Utako Kusaka --- --- xfsprogs-2.8.15-orgn/db/bmap.c 2006-05-30 23:35:06.000000000 +0900 +++ xfsprogs-2.8.15/db/bmap.c 2006-11-22 14:41:35.004978096 +0900 @@ -77,7 +77,8 @@ bmap( fmt = (xfs_dinode_fmt_t)XFS_DFORK_FORMAT(dip, whichfork); typ = whichfork == XFS_DATA_FORK ? 
TYP_BMAPBTD : TYP_BMAPBTA; ASSERT(typtab[typ].typnm == typ); - ASSERT(fmt == XFS_DINODE_FMT_EXTENTS || fmt == XFS_DINODE_FMT_BTREE); + ASSERT(fmt == XFS_DINODE_FMT_LOCAL || fmt == XFS_DINODE_FMT_EXTENTS || + fmt == XFS_DINODE_FMT_BTREE); if (fmt == XFS_DINODE_FMT_EXTENTS) { nextents = XFS_DFORK_NEXTENTS(dip, whichfork); xp = (xfs_bmbt_rec_64_t *)XFS_DFORK_PTR(dip, whichfork); @@ -85,7 +86,7 @@ bmap( if (!bmap_one_extent(ep, &curoffset, eoffset, &n, bep)) break; } - } else { + } else if (fmt == XFS_DINODE_FMT_BTREE) { push_cur(); bno = NULLFSBLOCK; rblock = (xfs_bmdr_block_t *)XFS_DFORK_PTR(dip, whichfork); @@ -147,7 +148,7 @@ bmap_f( int afork = 0; bmap_ext_t be; int c; - xfs_dfiloff_t co; + xfs_dfiloff_t co, cosave; int dfork = 0; xfs_dinode_t *dip; xfs_dfiloff_t eo; @@ -205,6 +206,7 @@ bmap_f( co = 0; eo = -1; } + cosave = co; for (whichfork = XFS_DATA_FORK; whichfork <= XFS_ATTR_FORK; whichfork++) { @@ -226,6 +228,7 @@ bmap_f( be.blockcount, be.flag); co = be.startoff + be.blockcount; } + co = cosave; } return 0; } From owner-xfs@oss.sgi.com Tue Nov 28 17:26:13 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 17:26:20 -0800 (PST) Received: from tyo200.gate.nec.co.jp (TYO200.gate.nec.co.jp [210.143.35.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAT1QBaG021590 for ; Tue, 28 Nov 2006 17:26:13 -0800 Received: from tyo202.gate.nec.co.jp ([10.7.69.202]) by tyo200.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id kAT0S06T026813 for ; Wed, 29 Nov 2006 09:28:00 +0900 (JST) Received: from mailgate3.nec.co.jp (mailgate54.nec.co.jp [10.7.69.197]) by tyo202.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id kAT0RmEt016920 for ; Wed, 29 Nov 2006 09:27:48 +0900 (JST) Received: (from root@localhost) by mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id kAT0RmN12385 for xfs@oss.sgi.com; Wed, 29 Nov 2006 09:27:48 +0900 (JST) Received: from secsv3.tnes.nec.co.jp (tnesvc2.tnes.nec.co.jp [10.1.101.15]) by mailsv5.nec.co.jp (8.11.7/3.7W-MAILSV4-NEC) with 
ESMTP id kAT0Rla24782 for ; Wed, 29 Nov 2006 09:27:47 +0900 (JST) Received: from tnesvc2.tnes.nec.co.jp ([10.1.101.15]) by secsv3.tnes.nec.co.jp (ExpressMail 5.10) with SMTP id 20061129.093309.14002448 for ; Wed, 29 Nov 2006 09:33:09 +0900 Received: FROM tnessv1.tnes.nec.co.jp BY tnesvc2.tnes.nec.co.jp ; Wed Nov 29 09:33:08 2006 +0900 Received: from rifu.bsd.tnes.nec.co.jp (rifu.bsd.tnes.nec.co.jp [10.1.104.1]) by tnessv1.tnes.nec.co.jp (Postfix) with ESMTP id 44A8EAE4B0 for ; Wed, 29 Nov 2006 09:27:17 +0900 (JST) Received: from TNESG9305.tnes.nec.co.jp (TNESG9305.bsd.tnes.nec.co.jp [10.1.104.199]) by rifu.bsd.tnes.nec.co.jp (8.12.11/3.7W/BSD-TNES-MX01) with SMTP id kAT0Rl0L028670; Wed, 29 Nov 2006 09:27:47 +0900 Message-Id: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> Date: Wed, 29 Nov 2006 09:27:42 +0900 To: xfs@oss.sgi.com Subject: [PATCH 2/2]xfs_io man page From: Utako Kusaka MIME-Version: 1.0 X-Mailer: AL-Mail32 Version 1.13 Content-Type: text/plain; charset=us-ascii X-archive-position: 9804 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: utako@tnes.nec.co.jp Precedence: bulk X-list: xfs Content-Length: 2098 Lines: 50 Hi, This patch adds offset and length parameter to the man page for pread, pwrite, mread and mwrite command in xfs_io(8). I guess this is useful for testers. What do you think of it? Signed-off-by: Utako Kusaka --- --- xfsprogs-2.8.15-orgn/man/man8/xfs_io.8 2006-07-04 20:57:05.000000000 +0900 +++ xfsprogs-2.8.15/man/man8/xfs_io.8 2006-11-16 15:55:34.566928372 +0900 @@ -99,7 +99,7 @@ Closes the current open file, marking th \f3c\f1 See the \f3close\f1 command. .TP -\f3pread\f1 [ \f2\-b bsize\f1 ] [ \f2\-v\f1 ] +\f3pread\f1 [ \f2\-b bsize\f1 ] [ \f2\-v\f1 ] \f2offset\f1 \f2length\f1 Reads a range of bytes in a specified blocksize from the given offset. 
.br The \f3\-b\f1 option can be used to set the blocksize into which the @@ -112,7 +112,7 @@ by default only the count of bytes actua \f3r\f1 See the \f3pread\f1 command. .TP -\f3pwrite\f1 [ \f2\-i file\f1 ] [ \f2\-d\f1 ] [ \f2\-s skip\f1 ] [ \f2\-b size\f1 ] [ \f2\-S seed\f1 ] +\f3pwrite\f1 [ \f2\-i file\f1 ] [ \f2\-d\f1 ] [ \f2\-s skip\f1 ] [ \f2\-b size\f1 ] [ \f2\-S seed\f1 ] \f2offset \f1 \f2length\f1 Writes a range of bytes in a specified blocksize from the given offset. The bytes written can be either a set pattern or read in from another file before writing. @@ -211,7 +211,7 @@ Unmaps the current memory mapping. \f3mu\f1 See the \f3munmap\f1 command. .TP -\f3mread\f1 [ \-\f2frv\f1 ] +\f3mread\f1 [ \-\f2frv\f1 ] [ \f2offset\f1 \f2length\f1 ] Accesses a segment of the current memory mapping, optionally dumping it to the standard output stream (with \f2-v\f1 or \f2-f\f1 option) for inspection. The accesses are performed sequentially from the start offset by default, @@ -224,7 +224,7 @@ offsets relative to the start of the map \f3mr\f1 See the \f3mread\f1 command. .TP -\f3mwrite\f1 [ \f2-r\f1 ] [ \f2-S seed\f1 ] +\f3mwrite\f1 [ \f2-r\f1 ] [ \f2-S seed\f1 ] [ \f2offset\f1 \f2length\f1 ] Stores a byte into memory for a range within a mapping. The default stored value is 'X', repeated to fill the range specified, but this can be changed using the \f2-S\f1 option. 
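
The offset, length and blocksize parameters documented above can be illustrated with a minimal C sketch. This is not xfs_io's implementation: `demo_read_range` is a hypothetical helper, and the only real interface it relies on is POSIX pread(2). It reads `length` bytes starting at `offset`, issuing reads of at most `bsize` bytes each.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of xfsprogs): read `length` bytes starting
 * at `offset` from `fd` into `dst`, using at most `bsize` bytes per pread(2)
 * call. Returns the number of bytes actually read (which is short if EOF is
 * reached first), or -1 on error.
 */
static ssize_t
demo_read_range(int fd, char *dst, off_t offset, size_t length, size_t bsize)
{
	size_t total = 0;

	if (bsize == 0)
		return -1;

	while (total < length) {
		size_t want = length - total;
		ssize_t got;

		if (want > bsize)
			want = bsize;

		got = pread(fd, dst + total, want, offset + (off_t)total);
		if (got < 0)
			return -1;
		if (got == 0)
			break;		/* end of file */
		total += (size_t)got;
	}
	return (ssize_t)total;
}
```

For a file containing "0123456789", calling `demo_read_range(fd, buf, 2, 5, 2)` fills `buf` with "23456" using three pread(2) calls of 2, 2 and 1 bytes.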
From owner-xfs@oss.sgi.com Tue Nov 28 17:31:16 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 17:31:22 -0800 (PST) Received: from tyo200.gate.nec.co.jp (TYO200.gate.nec.co.jp [210.143.35.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAT1VEaG023528 for ; Tue, 28 Nov 2006 17:31:15 -0800 Received: from tyo202.gate.nec.co.jp ([10.7.69.202]) by tyo200.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id kAT0S06X026813 for ; Wed, 29 Nov 2006 09:28:01 +0900 (JST) Received: from mailgate3.nec.co.jp (mailgate53.nec.co.jp [10.7.69.162]) by tyo202.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id kAT0Qiba015707 for ; Wed, 29 Nov 2006 09:26:44 +0900 (JST) Received: (from root@localhost) by mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id kAT0Qi909738 for xfs@oss.sgi.com; Wed, 29 Nov 2006 09:26:44 +0900 (JST) Received: from secsv3.tnes.nec.co.jp (tnesvc2.tnes.nec.co.jp [10.1.101.15]) by mailsv.nec.co.jp (8.11.7/3.7W-MAILSV-NEC) with ESMTP id kAT0Qhg19040 for ; Wed, 29 Nov 2006 09:26:44 +0900 (JST) Received: from tnesvc2.tnes.nec.co.jp ([10.1.101.15]) by secsv3.tnes.nec.co.jp (ExpressMail 5.10) with SMTP id 20061129.093205.29602448 for ; Wed, 29 Nov 2006 09:32:05 +0900 Received: FROM tnessv1.tnes.nec.co.jp BY tnesvc2.tnes.nec.co.jp ; Wed Nov 29 09:32:04 2006 +0900 Received: from rifu.bsd.tnes.nec.co.jp (rifu.bsd.tnes.nec.co.jp [10.1.104.1]) by tnessv1.tnes.nec.co.jp (Postfix) with ESMTP id DE91CAE4B0 for ; Wed, 29 Nov 2006 09:26:13 +0900 (JST) Received: from TNESG9305.tnes.nec.co.jp (TNESG9305.bsd.tnes.nec.co.jp [10.1.104.199]) by rifu.bsd.tnes.nec.co.jp (8.12.11/3.7W/BSD-TNES-MX01) with SMTP id kAT0Qh9u028626; Wed, 29 Nov 2006 09:26:43 +0900 Message-Id: <200611290026.AA04738@TNESG9305.tnes.nec.co.jp> Date: Wed, 29 Nov 2006 09:26:38 +0900 To: xfs@oss.sgi.com Subject: [PATCH 1/2]segmentation fault in xfs_io mread/mwrite command From: Utako Kusaka MIME-Version: 1.0 X-Mailer: AL-Mail32 Version 1.13 Content-Type: text/plain; charset=us-ascii 
X-archive-position: 9805 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: utako@tnes.nec.co.jp Precedence: bulk X-list: xfs Content-Length: 3181 Lines: 132 Hi, I found the following issues in xfs_io. mread command: a) Causes a segmentation fault. Because "length"+1 bytes data is copied to buffer in read_mapping(), but buffer size is "length". b) Reads from wrong offset. c) The first byte of dump data is incorrect when length > page size. mwrite command: d) Data placement is incorrect when -r option is specified because of wrong for-loop counter. This patch fixes them. Signed-off-by: Utako Kusaka --- --- xfsprogs-2.8.11-orgn/io/mmap.c 2006-06-26 14:01:15.000000000 +0900 +++ xfsprogs-2.8.11/io/mmap.c 2006-11-14 16:46:18.651458839 +0900 @@ -323,26 +323,6 @@ msync_f( return 0; } -static int -read_mapping( - char *dest, - off64_t offset, - int dump, - off64_t dumpoffset, - size_t dumplength) -{ - *dest = *(((char *)mapping->addr) + offset); - - if (offset % pagesize == 0) { - if (dump == 2) - dumpoffset += mapping->offset; - if (dump) - dump_buffer(dumpoffset, dumplength); - return 1; - } - return 0; -} - static void mread_help(void) { @@ -373,9 +353,9 @@ mread_f( int argc, char **argv) { - off64_t offset, tmp; + off64_t offset, tmp, dumpoffset, printoffset; ssize_t length; - size_t dumplen; + size_t dumplen, cnt = 0; char *bp; void *start; int dump = 0, rflag = 0, c; @@ -422,6 +402,11 @@ mread_f( start = check_mapping_range(mapping, offset, length, 0); if (!start) return 0; + dumpoffset = offset - mapping->offset; + if (dump == 2) + printoffset = offset; + else + printoffset = dumpoffset; if (alloc_buffer(pagesize, 0, 0) < 0) return 0; @@ -432,28 +417,35 @@ mread_f( dumplen = pagesize; if (rflag) { - for (tmp = length, c = 0; tmp > 0; tmp--, bp++, c = 1) - if (read_mapping(bp, tmp, c? 
dump:0, offset, dumplen)) { + for (tmp = length - 1, c = 0; tmp >= 0; tmp--, c = 1) { + *bp = *(((char *)mapping->addr) + dumpoffset + tmp); + cnt++; + if (c && cnt == dumplen) { + if (dump) { + dump_buffer(printoffset, dumplen); + printoffset += dumplen; + } bp = (char *)buffer; dumplen = pagesize; + cnt = 0; + } else { + bp++; } + } } else { - for (tmp = 0, c = 0; tmp < length; tmp++, bp++, c = 1) - if (read_mapping(bp, tmp, c? dump:0, offset, dumplen)) { + for (tmp = 0, c = 0; tmp < length; tmp++, c = 1) { + *bp = *(((char *)mapping->addr) + dumpoffset + tmp); + cnt++; + if (c && cnt == dumplen) { + if (dump) + dump_buffer(printoffset + tmp - + (dumplen - 1), dumplen); bp = (char *)buffer; dumplen = pagesize; + cnt = 0; + } else { + bp++; } - } - /* dump the remaining (partial page) part of the read buffer */ - if (dump) { - if (rflag) - dumplen = length % pagesize; - else - dumplen = tmp % pagesize; - if (dumplen) { - if (dump == 2) - tmp += mapping->offset; - dump_buffer(tmp, dumplen); } } return 0; @@ -571,7 +563,7 @@ mwrite_f( return 0; if (rflag) { - for (tmp = offset + length; tmp > offset; tmp--) + for (tmp = offset + length -1; tmp >= offset; tmp--) ((char *)mapping->addr)[tmp] = seed; } else { for (tmp = offset; tmp < offset + length; tmp++) From owner-xfs@oss.sgi.com Tue Nov 28 19:56:13 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 19:56:20 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAT3u9aG012229 for ; Tue, 28 Nov 2006 19:56:11 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA24216; Wed, 29 Nov 2006 14:33:48 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 1FBE458F5C06; Wed, 29 Nov 2006 14:33:48 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 958639 - xfs 
buftarg flags are almost broken Message-Id: <20061129033348.1FBE458F5C06@chook.melbourne.sgi.com> Date: Wed, 29 Nov 2006 14:33:48 +1100 (EST) From: dgc@sgi.com (David Chinner) X-archive-position: 9806 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1205 Lines: 31 Current usage of buftarg flags is incorrect. The {test,set,clear}_bit() operations take a bit index for the bit to operate on. The XBT_* flags are defined as bit fields which is incorrect, not to mention the way the bit fields are enumerated is broken too. This was only working by chance. Fix the definitions of the flags and make the code using them use the {test,set,clear}_bit() operations correctly. Date: Wed Nov 29 14:32:56 AEDT 2006 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: tes The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:27565a fs/xfs/linux-2.6/xfs_buf.h - 1.118 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_buf.h.diff?r1=text&tr1=1.118&r2=text&tr2=1.117&f=h - Fix enumeration of xfs_buftarg_flags_t to declare bit indexes rather than bit fields as required by test_bit() and friends. fs/xfs/linux-2.6/xfs_buf.c - 1.231 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_buf.c.diff?r1=text&tr1=1.231&r2=text&tr2=1.230&f=h - Set bt_flags appropriately for delwri list flushing rather than abusing flag indexes. 
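
The difference between a bit index and a bit field (mask) described above can be shown with a minimal userspace sketch. It assumes simplified, non-atomic stand-ins for the kernel's set_bit()/test_bit(); the `demo_*` helpers and `DEMO_*` enums are hypothetical and are not the actual XBT_* definitions.

```c
/*
 * Non-atomic userspace stand-ins for the kernel bit operations.
 * As in the kernel API, `nr` is a bit index, not a mask.
 */
static void
demo_set_bit(unsigned int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static int
demo_test_bit(unsigned int nr, const unsigned long *addr)
{
	return (int)((*addr >> nr) & 1UL);
}

/* Flags declared as bit fields (masks): values 1, 2, 4. */
enum demo_flags_mask {
	DEMO_MASK_A = 1 << 0,
	DEMO_MASK_B = 1 << 1,
	DEMO_MASK_C = 1 << 2
};

/* Flags declared as bit indexes: values 0, 1, 2. */
enum demo_flags_index {
	DEMO_IDX_A = 0,
	DEMO_IDX_B = 1,
	DEMO_IDX_C = 2
};

/*
 * Passing a mask where an index is expected sets the wrong bit:
 * demo_set_bit(DEMO_MASK_C, &w) sets bit 4 (0x10), not bit 2 (0x04).
 * This only appears to work while every user goes through the same bit
 * helpers; any code testing `w & DEMO_MASK_C` directly sees the flag as
 * clear.
 */
static unsigned long
demo_flag_word(unsigned int nr)
{
	unsigned long w = 0;

	demo_set_bit(nr, &w);
	return w;
}
```

Here `demo_flag_word(DEMO_MASK_C)` yields 0x10 while `demo_flag_word(DEMO_IDX_C)` yields 0x04, which is why declaring the flags as indexes is the form that matches the bit-operation helpers.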
From owner-xfs@oss.sgi.com Tue Nov 28 23:32:35 2006 Received: with ECARTIS (v1.0.0; list xfs); Tue, 28 Nov 2006 23:32:42 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAT7WWaG008564 for ; Tue, 28 Nov 2006 23:32:34 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA00498; Wed, 29 Nov 2006 18:31:39 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAT7Vb7Y48404751; Wed, 29 Nov 2006 18:31:37 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAT7VY2B44346457; Wed, 29 Nov 2006 18:31:34 +1100 (AEDT) Date: Wed, 29 Nov 2006 18:31:34 +1100 From: David Chinner To: David Chatterton Cc: David Chinner , Russell Cattelan , Tim Shimmin , Eric Sandeen , xfs@oss.sgi.com Subject: Re: [PATCH 1/2] Make stuff static Message-ID: <20061129073134.GD33919298@melbourne.sgi.com> References: <20061016232250.GM11034@melbourne.sgi.com> <1161042943.5723.117.camel@xenon.msp.redhat.com> <20061017005038.GN11034@melbourne.sgi.com> <20061017215706.GI8394166@melbourne.sgi.com> <1161125131.5723.158.camel@xenon.msp.redhat.com> <20061122004216.GT11034@melbourne.sgi.com> <1164157783.19915.46.camel@xenon.msp.redhat.com> <20061122042445.GR37654165@melbourne.sgi.com> <4563D7DD.1060907@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4563D7DD.1060907@melbourne.sgi.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 9807 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 2466 Lines: 69 On Wed, Nov 22, 2006 at 03:53:49PM +1100, David Chatterton wrote: > David Chinner 
wrote:
> > Attached is the complete patch. On ia64, the size of the xfs.ko
> > and xfs_quota.ko modules decreases with this patch:
....
> > Performance appears to be slightly faster with the noinline
> > patch, but the variation is within the error margins of
> > my measurements so I'd say it's neutral.
> >
> > Comments?
> >
>
> Just reducing xfs_bmapi by 118 bytes makes this worthwhile doesn't it?

Well, this is ia64, so stack usage really isn't the issue there. A performance degradation is what I really care about with this change on ia64. And the increase in xfs_bmap_btalloc() offsets this saving as well....

> Out of interest, what estimated improvement does this have on one of Jesper's
> stacks?

Depends on the compiler. For gcc 3.3.5, it makes no difference at all because it doesn't automatically inline static functions.

The problem we're trying to address here is the aggressive inlining that gcc 4.x does of static functions that increases the stack usage of critical functions. e.g. we've got in the code:

	xfs_bmapi()
	  xfs_bmap_alloc()
	    xfs_bmap_btalloc()

xfs_bmap_{bt}alloc() are static, single use functions, and so gcc 4.x inlines them and the stack usage of all three functions is brought into xfs_bmapi().

Now in some cases this is a slight win in terms of stack usage if the code always passes through that path, but if the inlined functions are leaf functions, then we increase stack usage for no gain.

The clearest example of this on i386 is xfs_page_state_convert(), which goes from 368 bytes of stack usage to 160 bytes of stack usage once noinline is used on i386. There's over 200 bytes of extra stack used by inlining functions that are not in the critical path.

Basically, we can factor the code all we like, but while gcc is undoing that factoring to "go fast" we are fighting a losing battle. Hence the noinline change is needed first to make the code stack how we want it to, not how the compiler thinks it should.
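
The effect of noinline on a static, single-use function can be sketched as follows. This is an illustrative example only, not code from the patch under discussion: the `demo_*` names and `DEMO_NOINLINE` macro are hypothetical, and the sketch assumes GCC's `__attribute__((noinline))`.

```c
#include <string.h>

/* Assumes a GCC-compatible compiler; the macro name is hypothetical. */
#define DEMO_NOINLINE	__attribute__((noinline))

/*
 * Rarely taken path with a large local buffer. Without noinline, a
 * compiler that inlines static single-use functions may fold this frame
 * (including the 256-byte buffer) into the caller's frame, so the caller
 * pays for the buffer even when this path is never taken.
 */
static DEMO_NOINLINE int
demo_rare_path(int seed)
{
	unsigned char scratch[256];

	memset(scratch, seed & 0xff, sizeof(scratch));
	return scratch[0] + scratch[255];
}

/* Common path: only needs the big buffer when take_rare_path is set. */
static int
demo_hot_path(int seed, int take_rare_path)
{
	if (take_rare_path)
		return demo_rare_path(seed);
	return seed;
}
```

With noinline in place, the 256-byte buffer is only allocated on the stack while demo_rare_path() is actually running; the per-function frame sizes can be compared with and without the attribute using the compiler's stack usage reporting, if available.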
> Should we be concerned that there are now more functions with 100 or more bytes? Not really - the work that Jesper and Keith have done shows us that we've got a certain set of functions that we really need to concentrate on and once we've dealt with them we can start looking at other leaf functions that are stack hogs.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Nov 29 01:18:14 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 01:18:22 -0800 (PST) Received: from wx-out-0506.google.com (wx-out-0506.google.com [66.249.82.236]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAT9IDaG004035 for ; Wed, 29 Nov 2006 01:18:14 -0800 Received: by wx-out-0506.google.com with SMTP id t4so2033233wxc for ; Wed, 29 Nov 2006 01:17:25 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=KKShFXk0eUtPY3N6rzS4FAe/PQHM0WcYvrIEM1NCk9HlHQ53Dbr672XRFnzuvMvoBQ+8SRsO21wIBzq7kQjuZIZ3dI0eXwDKwIPBFsOoRMDI8iPAzcW+uVS3fHi9yavR3CzEMMC1PFAf/kZqqNqdju2UQ1gxwceYNSuN5dA2PZQ= Received: by 10.90.103.2 with SMTP id a2mr1956339agc.1164791845429; Wed, 29 Nov 2006 01:17:25 -0800 (PST) Received: by 10.90.106.11 with HTTP; Wed, 29 Nov 2006 01:17:24 -0800 (PST) Message-ID: <9a8748490611290117oc0ba880v1a6407bc4f41088f@mail.gmail.com> Date: Wed, 29 Nov 2006 10:17:25 +0100 From: "Jesper Juhl" To: "David Chinner" Subject: Re: XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c (kernel 2.6.18.1) Cc: "Linux Kernel Mailing List" , xfs@oss.sgi.com, xfs-masters@oss.sgi.com, "Keith Owens" In-Reply-To: <20061129013214.GH44411608@melbourne.sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline References: 
<9a8748490611280749k5c97d21bx2e499d2209d27dfe@mail.gmail.com> <20061129013214.GH44411608@melbourne.sgi.com> X-archive-position: 9808 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jesper.juhl@gmail.com Precedence: bulk X-list: xfs Content-Length: 2992 Lines: 82

On 29/11/06, David Chinner wrote:
> On Tue, Nov 28, 2006 at 04:49:00PM +0100, Jesper Juhl wrote:
> > Hi,
> >
> > One of my NFS servers just gave me a nasty surprise that I think is
> > relevant to tell you about:
>
> Thanks, Jesper.
>
> > Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1138 of
> > file fs/xfs/xfs_trans.c. Caller 0xffffffff8034b47e
> >
> > Call Trace:
> > [] show_trace+0xb2/0x380
> > [] dump_stack+0x15/0x20
> > [] xfs_error_report+0x3c/0x50
> > [] xfs_trans_cancel+0x6e/0x130
> > [] xfs_create+0x5ee/0x6a0
> > [] xfs_vn_mknod+0x156/0x2e0
> > [] xfs_vn_create+0xb/0x10
> > [] vfs_create+0x8c/0xd0
> > [] nfsd_create_v3+0x31a/0x560
> > [] nfsd3_proc_create+0x148/0x170
> > [] nfsd_dispatch+0xf9/0x1e0
> > [] svc_process+0x437/0x6e0
> > [] nfsd+0x1cd/0x360
> > [] child_rip+0xa/0x12
> > xfs_force_shutdown(dm-1,0x8) called from line 1139 of file
> > fs/xfs/xfs_trans.c. Return address = 0xffffffff80359daa
>
> We shut down the filesystem because we cancelled a dirty transaction.
> Once we start to dirty the incore objects, we can't roll back to
> an unchanged state if a subsequent fatal error occurs during the
> transaction and we have to abort it.
>
So you are saying that there's nothing I can do to prevent this from happening in the future?

> If I understand historic occurrences of this correctly, there is
> a possibility that it can be triggered in ENOMEM situations. Was your
> machine running out of memory when this occurred?
>
Not really. I just checked my monitoring software and, at the time this happened, the box had ~5.9G RAM free (of 8G total) and no swap used (but 11G available).
> > Filesystem "dm-1": Corruption of in-memory data detected. Shutting > > down filesystem: dm-1 > > Please umount the filesystem, and rectify the problem(s) > > nfsd: non-standard errno: 5 > > EIO gets returned in certain locations once the filesystem has > been shutdown. > Makes sense. > > I unmounted the filesystem, ran xfs_repair which told me to try an > > mount it first to replay the log, so I did, unmounted it again, ran > > xfs_repair (which didn't find any problems) and finally mounted it and > > everything is good - the filesystem seems intact. > > Yeah, the above error report typically is due to an in-memory > problem, not an on disk issue. > Good to know. > > The server in question is running kernel 2.6.18.1 > > Can happen to XFS on any kernel version - got a report of this from > someone running a 2.4 kernel a couple of weeks ago.... > Ok. Thank you for your reply David. -- Jesper Juhl Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html From owner-xfs@oss.sgi.com Wed Nov 29 01:54:46 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 01:54:53 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAT9sgaG010400 for ; Wed, 29 Nov 2006 01:54:44 -0800 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id UAA03320; Wed, 29 Nov 2006 20:53:44 +1100 Date: Wed, 29 Nov 2006 20:56:28 +1100 From: Timothy Shimmin To: Russell Cattelan , Eric Sandeen cc: xfs@oss.sgi.com Subject: Re: [PATCH] attr2 patch for data btrees & attr 2 was: (and bad attr2 bug) - pack xfs_sb_t for 64-bit arches Message-ID: <199451295660174A6ADE13B3@timothy-shimmins-power-mac-g5.local> In-Reply-To: <456ADF08.4080002@sgi.com> References: <455CB54F.8080901@sandeen.net> <455CE1E3.7020703@sandeen.net> 
<45612621.5010404@sandeen.net> <45627A4D.3020502@sandeen.net> <1164157336.19915.43.camel@xenon.msp.redhat.com> <5A1AC29043EE33BEB778198A@timothy-shimmins-power-mac-g5.local> <45647042.2040604@sandeen.net> <1164212695.19915.65.camel@xenon.msp.redhat.com> <45647CF8.8020104@sandeen.net> <26F2AE58A7D40E5170649BC2@timothy-shimmins-power-mac-g5.local> <4565DC6A.9080602@thebarn.com> <456ADF08.4080002@sgi.com> X-Mailer: Mulberry/4.0.6 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 9810 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 4721 Lines: 146 Hi, FYI I've done more testing and still have more testing to do. Found a bug in my previous patch. Below is my latest one. --Tim --- .pc/attr2_tes/fs/xfs/xfs_attr.c 2006-10-26 17:45:01.000000000 +1000 +++ fs/xfs/xfs_attr.c 2006-11-28 14:45:09.191482676 +1100 @@ -199,18 +199,14 @@ xfs_attr_set_int(xfs_inode_t *dp, const return (error); /* - * Determine space new attribute will use, and if it would be - * "local" or "remote" (note: local != inline). - */ - size = xfs_attr_leaf_newentsize(namelen, valuelen, - mp->m_sb.sb_blocksize, &local); - - /* * If the inode doesn't have an attribute fork, add one. * (inode must not be locked when we call this routine) */ if (XFS_IFORK_Q(dp) == 0) { - if ((error = xfs_bmap_add_attrfork(dp, size, rsvd))) + int sf_size = sizeof(xfs_attr_sf_hdr_t) + + XFS_ATTR_SF_ENTSIZE_BYNAME(namelen, valuelen); + + if ((error = xfs_bmap_add_attrfork(dp, sf_size, rsvd))) return(error); } @@ -231,6 +227,13 @@ xfs_attr_set_int(xfs_inode_t *dp, const args.addname = 1; args.oknoent = 1; + /* + * Determine space new attribute will use, and if it would be + * "local" or "remote" (note: local != inline). 
+ */ + size = xfs_attr_leaf_newentsize(namelen, valuelen, + mp->m_sb.sb_blocksize, &local); + nblks = XFS_DAENTER_SPACE_RES(mp, XFS_ATTR_FORK); if (local) { if (size > (mp->m_sb.sb_blocksize >> 1)) { --- .pc/attr2_tes/fs/xfs/xfs_attr_leaf.c 2006-10-26 17:45:01.000000000 +1000 +++ fs/xfs/xfs_attr_leaf.c 2006-11-29 20:25:17.273599367 +1100 @@ -170,18 +170,33 @@ xfs_attr_shortform_bytesfit(xfs_inode_t } /* data fork btree root can have at least this many key/ptr pairs */ - minforkoff = MAX(dp->i_df.if_bytes, XFS_BMDR_SPACE_CALC(MINDBTPTRS)); + minforkoff = MAX(xfs_ifork_dsize_used(dp), XFS_BMDR_SPACE_CALC(MINDBTPTRS)); minforkoff = roundup(minforkoff, 8) >> 3; /* attr fork btree root can have at least this many key/ptr pairs */ maxforkoff = XFS_LITINO(mp) - XFS_BMDR_SPACE_CALC(MINABTPTRS); maxforkoff = maxforkoff >> 3; /* rounded down */ - if (offset >= minforkoff && offset < maxforkoff) - return offset; + /* we can't fit inline */ + if (offset < minforkoff) + return 0; + + /* + * If have data btree then keep forkoff if we have one, + * otherwise we are adding a new attr, so then we set forkoff to where + * the btree root can finish so we have plenty of room for attrs + */ + if (dp->i_d.di_format == XFS_DINODE_FMT_BTREE) { + if (dp->i_d.di_forkoff) + return dp->i_d.di_forkoff; + else + return minforkoff; + } + if (offset >= maxforkoff) return maxforkoff; - return 0; + else + return offset; } /* --- .pc/attr2_tes/fs/xfs/xfs_bmap.c 2006-11-17 14:35:46.000000000 +1100 +++ fs/xfs/xfs_bmap.c 2006-11-27 15:54:33.166590715 +1100 @@ -3543,6 +3543,7 @@ xfs_bmap_forkoff_reset( if (whichfork == XFS_ATTR_FORK && (ip->i_d.di_format != XFS_DINODE_FMT_DEV) && (ip->i_d.di_format != XFS_DINODE_FMT_UUID) && + (ip->i_d.di_format != XFS_DINODE_FMT_BTREE) && ((mp->m_attroffset >> 3) > ip->i_d.di_forkoff)) { ip->i_d.di_forkoff = mp->m_attroffset >> 3; ip->i_df.if_ext_max = XFS_IFORK_DSIZE(ip) / --- .pc/attr2_tes/fs/xfs/xfs_inode.c 2006-11-27 23:20:19.000000000 +1100 +++ fs/xfs/xfs_inode.c 
2006-11-29 20:29:16.994035217 +1100 @@ -4747,3 +4747,34 @@ xfs_iext_irec_update_extoffs( ifp->if_u1.if_ext_irec[i].er_extoff += ext_diff; } } + +/* + * return how much space is used by the inode's data fork + */ +int +xfs_ifork_dsize_used(xfs_inode_t *ip) +{ + switch (ip->i_d.di_format) { + case XFS_DINODE_FMT_DEV: + return sizeof(xfs_dev_t); + case XFS_DINODE_FMT_UUID: + return sizeof(uuid_t); + case XFS_DINODE_FMT_LOCAL: + case XFS_DINODE_FMT_EXTENTS: + return ip->i_df.if_bytes; + case XFS_DINODE_FMT_BTREE: + if (ip->i_d.di_forkoff) + return ip->i_d.di_forkoff << 3; + else + /* + * For new attr fork, data btree takes all the space, + * so no room for any attrs with the current layout + * but we can know how much space it really needs + * i.e. the ptrs are half way along but we could compress to + * preserve the num of records. + */ + return XFS_BMDR_SPACE_CALC(XFS_BMAP_BROOT_NUMRECS(ip->i_df.if_broot)); + default: + return 0; + } +} --- .pc/attr2_tes/fs/xfs/xfs_inode.h 2006-11-17 14:35:46.000000000 +1100 +++ fs/xfs/xfs_inode.h 2006-11-27 23:24:09.376554289 +1100 @@ -535,6 +535,7 @@ void xfs_iext_irec_compact(xfs_ifork_t void xfs_iext_irec_compact_pages(xfs_ifork_t *); void xfs_iext_irec_compact_full(xfs_ifork_t *); void xfs_iext_irec_update_extoffs(xfs_ifork_t *, int, int); +int xfs_ifork_dsize_used(xfs_inode_t *); #define xfs_ipincount(ip) ((unsigned int) atomic_read(&ip->i_pincount)) From owner-xfs@oss.sgi.com Wed Nov 29 06:04:28 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 06:04:35 -0800 (PST) Received: from EX01.ad.tulane.edu (ex01.ad.tulane.edu [129.81.114.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kATE4QaG019672 for ; Wed, 29 Nov 2006 06:04:27 -0800 Received: from EX08.ad.tulane.edu ([129.81.114.38]) by EX01.ad.tulane.edu with Microsoft SMTPSVC(6.0.3790.1830); Wed, 29 Nov 2006 08:06:22 -0600 Received: from 129.81.86.224 ([129.81.86.224]) by EX08.ad.tulane.edu ([129.81.114.38]) via Exchange Front-End Server 
ent.tulane.edu ([129.81.114.4]) with Microsoft Exchange Server HTTP-DAV ; Wed, 29 Nov 2006 14:06:22 +0000 User-Agent: Microsoft-Entourage/11.2.5.060620 Date: Wed, 29 Nov 2006 08:03:34 -0600 Subject: Re: get xfs_quota info as regular user From: Rene Salmon To: Donald Douwsma CC: Message-ID: Thread-Topic: get xfs_quota info as regular user Thread-Index: AccTvyhZZxNpgH+yEduvEgAKlXZa5g== In-Reply-To: <456CCC77.7000001@sgi.com> Mime-version: 1.0 Content-type: text/plain; charset="US-ASCII" Content-transfer-encoding: 7bit X-OriginalArrivalTime: 29 Nov 2006 14:06:22.0392 (UTC) FILETIME=[8CB85380:01C713BF] X-archive-position: 9811 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: rsalmon@tulane.edu Precedence: bulk X-list: xfs Content-Length: 1142 Lines: 49 Great that is what I needed! Thank you Rene On 11/28/06 5:55 PM, "Donald Douwsma" wrote: > Rene Salmon wrote: >> Hi, >> >> >> Did some searches on the list archives but could not find any useful info on >> this. >> >> Is there a way for a regular user to get info about his or her quota usage? >> >> I tried both of these as a regular user and get nothing: >> >> 120> xfs_quota -c "quota userid" >> 121> xfs_quota -x -c "quota userid" > > By default the quota command does not display anything unless the user is > overquota. > To display the limits set for a user you need to specify the -v option. > > xfs_quota -c 'quota -v' > > Note there is currently a bug in xfs-cmds that causes xfs_quota to display > results multiple times > (once for each xfs filesystem). One work around for this is to specify the > specific filesystem on > the commandline. 
> > xfs_quota -c 'quota -v' /home > > Donald > > -- Rene Salmon Tulane University Center for Computational Science http://www.ccs.tulane.edu rsalmon@tulane.edu Tel 504-862-8393 Fax 504-862-8392 From owner-xfs@oss.sgi.com Wed Nov 29 07:37:17 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 07:37:23 -0800 (PST) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kATFbGaG001344 for ; Wed, 29 Nov 2006 07:37:17 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1GpR6A-0001RB-PK; Wed, 29 Nov 2006 15:11:46 +0000 Date: Wed, 29 Nov 2006 15:11:46 +0000 From: Christoph Hellwig To: Vlad Apostolov Cc: Christoph Hellwig , sgi.bugs.xfs@engr.sgi.com, linux-xfs@oss.sgi.com Subject: Re: TAKE 956783 - xfs_dm_getall_dmattr() doesn't check if the user buffer is at valid address Message-ID: <20061129151146.GA4746@infradead.org> References: <45629AD8.8000800@sgi.com> <20061127055859.GC1374@infradead.org> <456B7C1A.90209@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <456B7C1A.90209@sgi.com> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 9818 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 513 Lines: 9 On Tue, Nov 28, 2006 at 11:00:26AM +1100, Vlad Apostolov wrote: > The fix is actually fine as it gives an early indication (even not complete) > that the user pointer is bad. There is another problem you are pointing at > and it is the userspace pointer dereference later on without using > copy_to_user(). If you have any patch fixing this problem it would be great. Unfortunately I haven't found my patch, I'm sorry. 
I have on the other hand found various old trivial XFS patches of mine that I'll submit. From owner-xfs@oss.sgi.com Wed Nov 29 08:01:54 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 08:02:05 -0800 (PST) Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kATG1oaK005193 for ; Wed, 29 Nov 2006 08:01:54 -0800 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id kATFifWv006555 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Wed, 29 Nov 2006 16:44:42 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id kATFiffi006553 for xfs@oss.sgi.com; Wed, 29 Nov 2006 16:44:41 +0100 Date: Wed, 29 Nov 2006 16:44:41 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com Subject: [PATCH] fix sparse warning in xfs_da_btree.c Message-ID: <20061129154441.GA6400@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 9821 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 836 Lines: 23 The first use in xfs_da_node_lookup_int would have to be __be32. But we can just remove this temporary variable use completely and make sparse happy. The variable is used later in the function for a native endian variable so we'll have to keep it. 
Signed-off-by: Christoph Hellwig

diff --git a/fs/xfs/xfs_da_btree.c b/fs/xfs/xfs_da_btree.c
index a68bc1f..cccf69e 100644
--- a/fs/xfs/xfs_da_btree.c
+++ b/fs/xfs/xfs_da_btree.c
@@ -1090,8 +1090,7 @@ xfs_da_node_lookup_int(xfs_da_state_t *s
 		if (blk->magic == XFS_DA_NODE_MAGIC) {
 			node = blk->bp->data;
 			max = be16_to_cpu(node->hdr.count);
-			btreehashval = node->btree[max-1].hashval;
-			blk->hashval = be32_to_cpu(btreehashval);
+			blk->hashval = be32_to_cpu(node->btree[max-1].hashval);
 
 			/*
 			 * Binary search. (note: small blocks will skip loop)

From owner-xfs@oss.sgi.com Wed Nov 29 08:01:53 2006
Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 08:02:00 -0800 (PST)
Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kATG1oaI005193 for ; Wed, 29 Nov 2006 08:01:53 -0800
Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id kATFk7Wv006744 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Wed, 29 Nov 2006 16:46:08 +0100
Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id kATFk7Y9006742 for xfs@oss.sgi.com; Wed, 29 Nov 2006 16:46:07 +0100
Date: Wed, 29 Nov 2006 16:46:07 +0100
From: Christoph Hellwig
To: xfs@oss.sgi.com
Subject: [PATCH] use struct kvec in struct uio
Message-ID: <20061129154607.GB6400@lst.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.28i
X-Scanned-By: MIMEDefang 2.39
X-archive-position: 9820
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: hch@lst.de
Precedence: bulk
X-list: xfs
Content-Length: 1551
Lines: 44

All but one usage of struct uio are for kernel pointers, so let's use
struct kvec instead of struct iovec.
Because readlink by handle still uses it with a user pointer
we still have two sparse warnings, but the noise level is reduced quite
a bit by this.

Signed-off-by: Christoph Hellwig

Index: linux-2.6/fs/xfs/support/move.h
===================================================================
--- linux-2.6.orig/fs/xfs/support/move.h 2006-11-29 16:27:25.000000000 +0100
+++ linux-2.6/fs/xfs/support/move.h 2006-11-29 16:30:18.000000000 +0100
@@ -55,7 +55,7 @@
 };
 
 struct uio {
-	struct iovec	*uio_iov;	/* pointer to array of iovecs */
+	struct kvec	*uio_iov;	/* pointer to array of iovecs */
 	int		uio_iovcnt;	/* number of iovecs in array */
 	xfs_off_t	uio_offset;	/* offset in file this uio corresponds to */
 	int		uio_resid;	/* residual i/o count */
@@ -63,7 +63,7 @@
 };
 
 typedef struct uio uio_t;
-typedef struct iovec iovec_t;
+typedef struct kvec iovec_t;
 
 extern int	xfs_uio_read (caddr_t, size_t, uio_t *);
 
Index: linux-2.6/fs/xfs/linux-2.6/xfs_ioctl.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_ioctl.c 2006-11-29 16:33:37.000000000 +0100
+++ linux-2.6/fs/xfs/linux-2.6/xfs_ioctl.c 2006-11-29 16:34:43.000000000 +0100
@@ -388,7 +388,7 @@
 
 	aiov.iov_len	= olen;
 	aiov.iov_base	= hreq.ohandle;
-	auio.uio_iov	= &aiov;
+	auio.uio_iov	= (struct kvec *)&aiov;
 	auio.uio_iovcnt	= 1;
 	auio.uio_offset	= 0;
 	auio.uio_segflg	= UIO_USERSPACE;

From owner-xfs@oss.sgi.com Wed Nov 29 08:01:52 2006
Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 08:01:58 -0800 (PST)
Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kATG1oaG005193 for ; Wed, 29 Nov 2006 08:01:52 -0800
Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id kATFlUWv006874 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO) for ; Wed, 29 Nov 2006 16:47:30 +0100
Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6)
id kATFlTaU006866 for xfs@oss.sgi.com; Wed, 29 Nov 2006 16:47:29 +0100
Date: Wed, 29 Nov 2006 16:47:29 +0100
From: Christoph Hellwig
To: xfs@oss.sgi.com
Subject: [PATCH] remove v_number
Message-ID: <20061129154729.GC6400@lst.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.3.28i
X-Scanned-By: MIMEDefang 2.39
X-archive-position: 9819
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: hch@lst.de
Precedence: bulk
X-list: xfs
Content-Length: 2399
Lines: 62

v_number is unused except for naming some locks (which is a
functionality totally unused by Linux), so remove it and assorted
crap. Besides saving two words in struct vnode this also gets rid
of a spinlock per inode allocation.

Signed-off-by: Christoph Hellwig

Index: linux-2.6/fs/xfs/linux-2.6/xfs_vnode.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_vnode.c 2006-11-29 16:37:23.000000000 +0100
+++ linux-2.6/fs/xfs/linux-2.6/xfs_vnode.c 2006-11-29 16:38:09.000000000 +0100
@@ -17,8 +17,6 @@
  */
 #include "xfs.h"
 
-uint64_t vn_generation;		/* vnode generation number */
-DEFINE_SPINLOCK(vnumber_lock);
 
 /*
  * Dedicated vnode inactive/reclaim sync semaphores.
@@ -82,12 +80,6 @@
 	vp->v_flag = VMODIFIED;
 	spinlock_init(&vp->v_lock, "v_lock");
 
-	spin_lock(&vnumber_lock);
-	if (!++vn_generation)	/* v_number shouldn't be zero */
-		vn_generation++;
-	vp->v_number = vn_generation;
-	spin_unlock(&vnumber_lock);
-
 	ASSERT(VN_CACHED(vp) == 0);
 
 	/* Initialize the first behavior and the behavior chain head.
*/
Index: linux-2.6/fs/xfs/linux-2.6/xfs_vnode.h
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_vnode.h 2006-11-29 16:38:13.000000000 +0100
+++ linux-2.6/fs/xfs/linux-2.6/xfs_vnode.h 2006-11-29 16:39:08.000000000 +0100
@@ -41,7 +41,6 @@
 typedef struct bhv_vnode {
 	bhv_vflags_t	v_flag;		/* vnode flags (see above) */
 	bhv_vfs_t	*v_vfsp;	/* ptr to containing VFS */
-	bhv_vnumber_t	v_number;	/* in-core vnode number */
 	bhv_head_t	v_bh;		/* behavior head */
 	spinlock_t	v_lock;		/* VN_LOCK/VN_UNLOCK */
 	atomic_t	v_iocount;	/* outstanding I/O count */
Index: linux-2.6/fs/xfs/xfs_iget.c
===================================================================
--- linux-2.6.orig/fs/xfs/xfs_iget.c 2006-11-29 16:36:19.000000000 +0100
+++ linux-2.6/fs/xfs/xfs_iget.c 2006-11-29 16:37:14.000000000 +0100
@@ -570,8 +570,8 @@
 	bhv_vnode_t	*vp)
 {
 	mrlock_init(&ip->i_lock, MRLOCK_ALLOW_EQUAL_PRI|MRLOCK_BARRIER,
-		     "xfsino", (long)vp->v_number);
-	mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", vp->v_number);
+		     "xfsino", ip->i_ino);
+	mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
 	init_waitqueue_head(&ip->i_ipin_wait);
 	atomic_set(&ip->i_pincount, 0);
 	initnsema(&ip->i_flock, 1, "xfsfino");

From owner-xfs@oss.sgi.com Wed Nov 29 09:16:17 2006
Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 09:16:22 -0800 (PST)
Received: from mail.pacifica.ch (HSI-KBW-085-216-115-221.hsi.kabelbw.de [85.216.115.221]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kATHGDaG019575 for ; Wed, 29 Nov 2006 09:16:17 -0800
Received: from uranus.atlantica ([fec0::1:2a0:24ff:fe57:9888] helo=localhost ident=jasmin) by mail.pacifica.ch with esmtps (TLSv1:AES256-SHA:256) (Exim 4.63) (envelope-from ) id 1GpSZF-000806-Um for xfs@oss.sgi.com; Wed, 29 Nov 2006 17:45:54 +0100
Date: Wed, 29 Nov 2006 17:45:53 +0100
From: Jasmin Buchert
To: xfs@oss.sgi.com
Subject: mkfs.xfs questions
Message-Id:
<20061129174553.e0ef3465.jasmin@pacifica.ch>
X-Mailer: Sylpheed version 2.2.10 (GTK+ 2.10.6; x86_64-pc-linux-gnu)
X-Face: "iEb1;b$bLA_O;|q(xx^*zcA|peZ||UGf#pbQgI$SR2iIk@g2'V"`0~}OX7N'gFS(-TA{u.@"fDI0Bv$wx)5l?|vX6z&vBSAbN#SVWB`/Dpe6u7"0`E)@tx+a>]09(4{1T]W?h-J6&(80_9Bm(Aa4D`Ch3g.K^28/b?72';xt+CEcI)4&:R,]JP?bpPw.{("
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
DomainKey-Status: no signature
X-archive-position: 9822
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: jasmin@pacifica.ch
Precedence: bulk
X-list: xfs
Content-Length: 367
Lines: 14

Hi,

I'm planning to use XFS but have some questions.

Is there any real advantage of making the log size 32-64 MB and what is the difference between log version 1 and 2 with regard to efficiency/performance?
Is it true that a small agcount is better for most systems (Gentoo and some other sources recommend this)?

It's a desktop machine.

Greetings,
Jasmin Buchert

From owner-xfs@oss.sgi.com Wed Nov 29 14:24:43 2006
Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 14:24:51 -0800 (PST)
Received: from page.mel.office.aconex.com (eth2333.vic.adsl.internode.on.net [150.101.159.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kATMOgaG025573 for ; Wed, 29 Nov 2006 14:24:43 -0800
Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id DC9EA534260; Thu, 30 Nov 2006 09:23:49 +1100 (EST)
Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 02892-01-15; Thu, 30 Nov 2006 09:23:48 +1100 (EST)
Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id E1573534249; Thu, 30 Nov 2006 09:23:48 +1100 (EST)
Subject: Re: [PATCH 1/2]segmentation fault in xfs_io mread/mwrite command
From: Nathan Scott
Reply-To: nscott@aconex.com
To: Utako
Kusaka Cc: xfs@oss.sgi.com In-Reply-To: <200611290026.AA04738@TNESG9305.tnes.nec.co.jp> References: <200611290026.AA04738@TNESG9305.tnes.nec.co.jp> Content-Type: text/plain Organization: Aconex Date: Thu, 30 Nov 2006 09:22:41 +1100 Message-Id: <1164838961.4992.29.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 9823 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 922 Lines: 30 On Wed, 2006-11-29 at 09:26 +0900, Utako Kusaka wrote: > Hi, > > I found the following issues in xfs_io. > mread command: > a) Causes a segmentation fault. > Because "length"+1 bytes data is copied to buffer in read_mapping(), > but buffer size is "length". > b) Reads from wrong offset. > c) The first byte of dump data is incorrect when length > page size. > mwrite command: > d) Data placement is incorrect when -r option is specified > because of wrong for-loop counter. > > This patch fixes them. > Looks OK - could you send explicit test cases that demonstrate each problem please? (i.e. actual xfs_io invocations). Particularly the segfault should be easy to show, something like: xfs_io -f -c 'mmap ...' -c 'mread ...' /tmp/foo) That way they can be added to the regression test suite to ensure these things don't spontaneously break themselves in the future. thanks! 
-- Nathan From owner-xfs@oss.sgi.com Wed Nov 29 14:25:07 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 14:25:14 -0800 (PST) Received: from page.mel.office.aconex.com (eth2333.vic.adsl.internode.on.net [150.101.159.28]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kATMP5aG025613 for ; Wed, 29 Nov 2006 14:25:06 -0800 Received: from localhost (page.mel.aconex.com [127.0.0.1]) by page.mel.office.aconex.com (Postfix) with ESMTP id C8272534277; Thu, 30 Nov 2006 09:24:13 +1100 (EST) Received: from page.mel.office.aconex.com ([127.0.0.1]) by localhost (mail.aconex.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 00794-01-53; Thu, 30 Nov 2006 09:24:13 +1100 (EST) Received: from edge (unknown [192.168.0.246]) by page.mel.office.aconex.com (Postfix) with ESMTP id 14413534276; Thu, 30 Nov 2006 09:24:13 +1100 (EST) Subject: Re: [PATCH 2/2]xfs_io man page From: Nathan Scott Reply-To: nscott@aconex.com To: Utako Kusaka Cc: xfs@oss.sgi.com In-Reply-To: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> References: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> Content-Type: text/plain Organization: Aconex Date: Thu, 30 Nov 2006 09:23:04 +1100 Message-Id: <1164838985.4992.30.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 9824 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 296 Lines: 16 On Wed, 2006-11-29 at 09:27 +0900, Utako Kusaka wrote: > Hi, > > This patch adds offset and length parameter to the man page for > pread, pwrite, mread and mwrite command in xfs_io(8). > I guess this is useful for testers. > What do you think of it? > Looks good to me. cheers. 
-- Nathan From owner-xfs@oss.sgi.com Wed Nov 29 16:11:02 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 16:11:09 -0800 (PST) Received: from hapkido.dreamhost.com (hapkido.dreamhost.com [66.33.216.122]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU0B0aG000689 for ; Wed, 29 Nov 2006 16:11:02 -0800 Received: from randymail-a9.dreamhost.com (sd-green-bigip-207.dreamhost.com [208.97.132.207]) by hapkido.dreamhost.com (Postfix) with ESMTP id BE4B71845EC for ; Wed, 29 Nov 2006 15:43:09 -0800 (PST) Received: from [10.2.255.104] (unknown [208.51.196.2]) by randymail-a9.dreamhost.com (Postfix) with ESMTP id DF695EEDBE for ; Wed, 29 Nov 2006 15:43:05 -0800 (PST) Message-ID: <456E1B08.7090802@delusion.com> Date: Wed, 29 Nov 2006 15:43:04 -0800 From: Deanan User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: inode64 workaround References: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> <1164838985.4992.30.camel@edge> In-Reply-To: <1164838985.4992.30.camel@edge> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9825 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: delusion@delusion.com Precedence: bulk X-list: xfs Content-Length: 356 Lines: 17 Hi, I've got some systems that I can't change the kernel on (external vendor) that are 32bit but I'm running into the performance problem that is fixed by using inode64. Is there any known way of working around the problem on a 32bit kernel? In our case, the problem occurs as soon as you start to delete files and write new ones. 
Cheers, Deanan From owner-xfs@oss.sgi.com Wed Nov 29 16:31:56 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 16:32:04 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAU0VraG008284 for ; Wed, 29 Nov 2006 16:31:55 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA26268; Thu, 30 Nov 2006 11:30:54 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAU0Uq7Y49164141; Thu, 30 Nov 2006 11:30:52 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAU0UoQu49141501; Thu, 30 Nov 2006 11:30:50 +1100 (AEDT) Date: Thu, 30 Nov 2006 11:30:50 +1100 From: David Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com Subject: Re: [PATCH] remove v_number Message-ID: <20061130003050.GG33919298@melbourne.sgi.com> References: <20061129154729.GC6400@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061129154729.GC6400@lst.de> User-Agent: Mutt/1.4.2.1i X-archive-position: 9826 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1065 Lines: 28 On Wed, Nov 29, 2006 at 04:47:29PM +0100, Christoph Hellwig wrote: > v_number is unused except for the naming some locks (which is a > functionality totally unused by Linux), so remove it and assorted > crap. Besides saving two words in struct vnode this also gets rid > of a spinlock per inode allocation. Hmm - given that I've just used the v_number in post-mortem analysis of a nasty bug to correlate the sequence of events during a series of mkdir operations (i.e. 
transactions in the incore log buffers, the resulting xfs_inodes and some screwed up dentries) that lead to a BUG_ON being tripped in d_instantiate. So, while it appears to be unused, it is _very_ useful for determining the SOE that has occurred in certain types of problems. FWIW, while analysing this crash dump a couple of days ago I was wishing that dentries had an equivalent sequence number because there is no way to tell what dentry was supposed to be related to what inode after it got screwed up... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Nov 29 16:48:44 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 16:48:53 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAU0mdaG010803 for ; Wed, 29 Nov 2006 16:48:42 -0800 Received: from [134.14.55.18] (dhcp18.melbourne.sgi.com [134.14.55.18]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA26682; Thu, 30 Nov 2006 11:47:44 +1100 Message-ID: <456E2A30.4010101@melbourne.sgi.com> Date: Thu, 30 Nov 2006 11:47:44 +1100 From: David Chatterton Reply-To: chatz@melbourne.sgi.com Organization: SGI User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Deanan CC: xfs@oss.sgi.com Subject: Re: inode64 workaround References: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> <1164838985.4992.30.camel@edge> <456E1B08.7090802@delusion.com> In-Reply-To: <456E1B08.7090802@delusion.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 9827 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: chatz@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 936 Lines: 43 Deanan, Would something like the inode rotor help? 
fs.xfs.rotorstep (Min: 1 Default: 1 Max: 256) In "inode32" allocation mode, this option determines how many files the allocator attempts to allocate in the same allocation group before moving to the next allocation group. The intent is to control the rate at which the allocator moves between allocation groups when allocating extents for new files. David Deanan wrote: > > Hi, > > I've got some systems that I can't change the kernel on (external > vendor) that > are 32bit but I'm running into the performance problem that is fixed by > using > inode64. Is there any known way of working around the problem on a 32bit > kernel? > > In our case, the problem occurs as soon as you start to delete files and > write new ones. > > Cheers, > > Deanan > -- David Chatterton XFS Engineering Manager SGI Australia From owner-xfs@oss.sgi.com Wed Nov 29 17:00:53 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 17:01:02 -0800 (PST) Received: from randymail-a2.dreamhost.com (sd-green-bigip-74.dreamhost.com [208.97.132.74]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU10paG012909 for ; Wed, 29 Nov 2006 17:00:53 -0800 Received: from [10.2.255.104] (unknown [208.51.196.2]) by randymail-a2.dreamhost.com (Postfix) with ESMTP id 0EDA5EEA45; Wed, 29 Nov 2006 17:00:02 -0800 (PST) Message-ID: <456E2D0E.2000007@delusion.com> Date: Wed, 29 Nov 2006 16:59:58 -0800 From: Deanan User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: chatz@melbourne.sgi.com Cc: xfs@oss.sgi.com Subject: Re: inode64 workaround References: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> <1164838985.4992.30.camel@edge> <456E1B08.7090802@delusion.com> <456E2A30.4010101@melbourne.sgi.com> In-Reply-To: <456E2A30.4010101@melbourne.sgi.com> Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit X-archive-position: 9828 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: 
delusion@delusion.com
Precedence: bulk
X-list: xfs
Content-Length: 1115
Lines: 56

Hi David,

I'm not sure if it will help but I'd like to try.
Where do you set the rotor?
BTW, this particular box is 2.6.9.

Thanks,

Deanan

> Deanan,
>
> Would something like the inode rotor help?
>
> fs.xfs.rotorstep (Min: 1 Default: 1 Max: 256)
>
> In "inode32" allocation mode, this option determines how many
>
> files the allocator attempts to allocate in the same allocation
>
> group before moving to the next allocation group. The intent
>
> is to control the rate at which the allocator moves between
>
> allocation groups when allocating extents for new files.
>
> David
>
>
> Deanan wrote:
>
>> Hi,
>>
>> I've got some systems that I can't change the kernel on (external
>> vendor) that
>> are 32bit but I'm running into the performance problem that is fixed by
>> using
>> inode64. Is there any known way of working around the problem on a 32bit
>> kernel?
>>
>> In our case, the problem occurs as soon as you start to delete files and
>> write new ones.
>> >> Cheers, >> >> Deanan >> >> > > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Wed Nov 29 17:16:14 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 17:16:22 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAU1GAaG015225 for ; Wed, 29 Nov 2006 17:16:13 -0800 Received: from [134.14.55.18] (dhcp18.melbourne.sgi.com [134.14.55.18]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA27369; Thu, 30 Nov 2006 12:15:17 +1100 Message-ID: <456E30A5.6080109@melbourne.sgi.com> Date: Thu, 30 Nov 2006 12:15:17 +1100 From: David Chatterton Reply-To: chatz@melbourne.sgi.com Organization: SGI User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Deanan CC: xfs@oss.sgi.com Subject: Re: inode64 workaround References: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> <1164838985.4992.30.camel@edge> <456E1B08.7090802@delusion.com> <456E2A30.4010101@melbourne.sgi.com> <456E2D0E.2000007@delusion.com> In-Reply-To: <456E2D0E.2000007@delusion.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 9829 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: chatz@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 1438 Lines: 69 This is a sysctl, see sysctl(8). It was introduced to XFS in October 2004, I'm not sure if it made 2.6.9. If this doesn't help a little then I'm unsure why you think that inode64 is going to solve your problem? David Deanan wrote: > Hi David, > > I'm not sure if it will help but I'd like to try. > Where do you set the rotor? > BTW< tis particular box is 2.6.9. > > Thanks, > > Deanan > > >> Deanan, >> >> Would something like the inode rotor help? 
>> >> fs.xfs.rotorstep (Min: 1 Default: 1 Max: 256) >> >> In "inode32" allocation mode, this option determines how many >> >> files the allocator attempts to allocate in the same allocation >> >> group before moving to the next allocation group. The intent >> >> is to control the rate at which the allocator moves between >> >> allocation groups when allocating extents for new files. >> >> David >> >> >> Deanan wrote: >> >>> Hi, >>> >>> I've got some systems that I can't change the kernel on (external >>> vendor) that >>> are 32bit but I'm running into the performance problem that is fixed by >>> using >>> inode64. Is there any known way of working around the problem on a 32bit >>> kernel? >>> >>> In our case, the problem occurs as soon as you start to delete files and >>> write new ones. >>> >>> Cheers, >>> >>> Deanan >>> >>> >> >> > -- David Chatterton XFS Engineering Manager SGI Australia From owner-xfs@oss.sgi.com Wed Nov 29 17:38:10 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 17:38:17 -0800 (PST) Received: from randymail-a6.dreamhost.com (sd-green-bigip-119.dreamhost.com [208.97.132.119]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU1c8aG018434 for ; Wed, 29 Nov 2006 17:38:09 -0800 Received: from [10.2.255.104] (unknown [208.51.196.2]) by randymail-a6.dreamhost.com (Postfix) with ESMTP id 90498175944; Wed, 29 Nov 2006 17:37:16 -0800 (PST) Message-ID: <456E35CA.2000601@delusion.com> Date: Wed, 29 Nov 2006 17:37:14 -0800 From: Deanan User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: chatz@melbourne.sgi.com Cc: xfs@oss.sgi.com Subject: Re: inode64 workaround References: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> <1164838985.4992.30.camel@edge> <456E1B08.7090802@delusion.com> <456E2A30.4010101@melbourne.sgi.com> <456E2D0E.2000007@delusion.com> <456E30A5.6080109@melbourne.sgi.com> In-Reply-To: <456E30A5.6080109@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed 
Content-Transfer-Encoding: 7bit X-archive-position: 9831 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: delusion@delusion.com Precedence: bulk X-list: xfs Content-Length: 288 Lines: 12 Thanks. Unfortunately 2.6.9 doesn't have it. :( > This is a sysctl, see sysctl(8). > > It was introduced to XFS in October 2004, I'm not sure if it made 2.6.9. > > If this doesn't help a little then I'm unsure why you think that inode64 is > going to solve your problem? > > David > From owner-xfs@oss.sgi.com Wed Nov 29 19:17:13 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 19:17:24 -0800 (PST) Received: from eric-weiss.de (eric-weiss.de [212.42.235.197]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU3HCaG029655 for ; Wed, 29 Nov 2006 19:17:13 -0800 Received: by eric-weiss.de (Postfix, from userid 12005) id 3D7474BB4D; Thu, 30 Nov 2006 03:57:19 +0100 (CET) Received: from ZenIV.linux.org.uk (zeniv.linux.org.uk [195.92.253.2]) by eric-weiss.de (Postfix) with ESMTP id 26B764BB48 for ; Thu, 30 Nov 2006 03:57:13 +0100 (CET) Received: from [2002:d993:5cf9:1:201:3dff:fe00:156] (helo=lists.arm.linux.org.uk) by ZenIV.linux.org.uk with esmtpsa (Exim 4.52 #1 (Red Hat Linux)) id 1Gpc5r-0005Ob-Dl; Thu, 30 Nov 2006 02:56:11 +0000 Received: from localhost ([127.0.0.1] helo=lists.arm.linux.org.uk) by lists.arm.linux.org.uk with esmtp (Exim 4.50) id 1Gpc5J-0000hJ-AW; Thu, 30 Nov 2006 02:55:37 +0000 Received: from xi.wantstofly.org ([2002:53a0:b870::1]) by lists.arm.linux.org.uk with esmtp (Exim 4.50) id 1Gpc4k-0000hA-Qd for linux-arm@lists.arm.linux.org.uk; Thu, 30 Nov 2006 02:55:06 +0000 Received: by xi.wantstofly.org (Postfix, from userid 500) id AE9007FE1E; Thu, 30 Nov 2006 03:54:59 +0100 (CET) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=1148133259; d=wantstofly.org; h=date:from:to:cc:subject:message-id:mime-version:content-type: content-disposition:user-agent; 
b=IJcqT5klsiP17wMItSsyNsEHkpQ1q4dq233yThwcKjsUPAVeLkYnpko/5ms2F AMUPsQUjxW8B+3oRBR+P/gkzw== Date: Thu, 30 Nov 2006 03:54:59 +0100 From: Lennert Buytenhek To: agruen@suse.de, xfs@oss.sgi.com Message-ID: <20061130025459.GA23869@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i Cc: linux-arm@lists.arm.linux.org.uk Subject: [PATCH] libattr 2.4.32 arm eabi system call calling convention X-BeenThere: linux-arm@lists.arm.linux.org.uk X-Mailman-Version: 2.1.5 Precedence: list X-archive-position: 9832 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: xfs Content-Length: 1097 Lines: 27 When building for EABI, a different system call calling convention is used where system calls are numbered starting from zero, not 0x900000 as in the old ABI. This was causing 'ls -al' with an ls binary that was built with xattr support to SIGILL. 
--- attr-2.4.32/libattr/syscalls.c.orig 2006-11-30 03:34:25.000000000 +0100
+++ attr-2.4.32/libattr/syscalls.c 2006-11-30 03:35:12.000000000 +0100
@@ -110,7 +110,11 @@
 # define __NR_fremovexattr 235
 #elif defined (__arm__)
 # define HAVE_XATTR_SYSCALLS 1
-# define __NR_SYSCALL_BASE 0x900000
+# if defined(__ARM_EABI__) || defined(__thumb__)
+# define __NR_SYSCALL_BASE 0
+# else
+# define __NR_SYSCALL_BASE 0x900000
+# endif
 # define __NR_setxattr (__NR_SYSCALL_BASE+226)
 # define __NR_lsetxattr (__NR_SYSCALL_BASE+227)
 # define __NR_fsetxattr (__NR_SYSCALL_BASE+228)

-------------------------------------------------------------------
List admin: http://lists.arm.linux.org.uk/mailman/listinfo/linux-arm
FAQ: http://www.arm.linux.org.uk/mailinglists/faq.php
Etiquette: http://www.arm.linux.org.uk/mailinglists/etiquette.php

From owner-xfs@oss.sgi.com Wed Nov 29 21:11:53 2006
Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 21:12:07 -0800 (PST)
Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU5BqaG013292 for ; Wed, 29 Nov 2006 21:11:53 -0800
Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id C5A9018E21425; Wed, 29 Nov 2006 23:11:01 -0600 (CST)
Message-ID: <456E67EB.2030008@sandeen.net>
Date: Wed, 29 Nov 2006 23:11:07 -0600
From: Eric Sandeen
User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025)
MIME-Version: 1.0
To: Deanan
CC: chatz@melbourne.sgi.com, xfs@oss.sgi.com
Subject: Re: inode64 workaround
References: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> <1164838985.4992.30.camel@edge> <456E1B08.7090802@delusion.com> <456E2A30.4010101@melbourne.sgi.com> <456E2D0E.2000007@delusion.com> <456E30A5.6080109@melbourne.sgi.com> <456E35CA.2000601@delusion.com>
In-Reply-To: <456E35CA.2000601@delusion.com>
Content-Type: text/plain; charset=ISO-8859-1;
format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9833 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 466 Lines: 23 Deanan wrote: > Thanks. Unfortunately 2.6.9 doesn't have it. :( Is this rhel4? You could probably pretty easily add the inode rotor code into the xfs modules that you're using, if that's the case. -Eric >> This is a sysctl, see sysctl(8). >> >> It was introduced to XFS in October 2004, I'm not sure if it made 2.6.9. >> >> If this doesn't help a little then I'm unsure why you think that >> inode64 is >> going to solve your problem? >> >> David >> > > From owner-xfs@oss.sgi.com Wed Nov 29 21:52:33 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 21:52:41 -0800 (PST) Received: from wx-out-0506.google.com (wx-out-0506.google.com [66.249.82.239]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU5qWaG018770 for ; Wed, 29 Nov 2006 21:52:33 -0800 Received: by wx-out-0506.google.com with SMTP id t4so2288224wxc for ; Wed, 29 Nov 2006 21:51:43 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=Lhti7EuyVXqTgq+UefpF0sG4opl615T1mhuK9ints20Ud+Rh5PWEqUQkzo5/xr2QnXkduaWsN/jpJ31TiYHVNjh0FE5X/lW5fLZOmgZZdHsxZNW5gP9XiyJ1Zo5fCQkYvlp/t7602wn6L5Bur2+q5vgfBLKR7dqCrvX2FnB132Q= Received: by 10.90.103.2 with SMTP id a2mr3216976agc.1164865903641; Wed, 29 Nov 2006 21:51:43 -0800 (PST) Received: by 10.90.106.11 with HTTP; Wed, 29 Nov 2006 21:51:43 -0800 (PST) Message-ID: <9a8748490611292151m57cdbf4kacebb4dd20b95147@mail.gmail.com> Date: Thu, 30 Nov 2006 06:51:43 +0100 From: "Jesper Juhl" To: "David Chinner" Subject: Re: XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c (kernel 2.6.18.1) Cc: "Linux 
Kernel Mailing List" , xfs@oss.sgi.com, xfs-masters@oss.sgi.com, "Keith Owens" In-Reply-To: <20061130020734.GB37654165@melbourne.sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <9a8748490611280749k5c97d21bx2e499d2209d27dfe@mail.gmail.com> <20061129013214.GH44411608@melbourne.sgi.com> <9a8748490611290117oc0ba880v1a6407bc4f41088f@mail.gmail.com> <20061130020734.GB37654165@melbourne.sgi.com> X-archive-position: 9834 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jesper.juhl@gmail.com Precedence: bulk X-list: xfs Content-Length: 2819 Lines: 67 On 30/11/06, David Chinner wrote: > On Wed, Nov 29, 2006 at 10:17:25AM +0100, Jesper Juhl wrote: > > On 29/11/06, David Chinner wrote: > > >On Tue, Nov 28, 2006 at 04:49:00PM +0100, Jesper Juhl wrote: > > >> Filesystem "dm-1": XFS internal error xfs_trans_cancel at line 1138 of > > >> file fs/xfs/xfs_trans.c. Caller 0xffffffff8034b47e > > >> > > >> Call Trace: > > >> [] show_trace+0xb2/0x380 > > >> [] dump_stack+0x15/0x20 > > >> [] xfs_error_report+0x3c/0x50 > > >> [] xfs_trans_cancel+0x6e/0x130 > > >> [] xfs_create+0x5ee/0x6a0 > > >> [] xfs_vn_mknod+0x156/0x2e0 > > >> [] xfs_vn_create+0xb/0x10 > > >> [] vfs_create+0x8c/0xd0 > > >> [] nfsd_create_v3+0x31a/0x560 > > >> [] nfsd3_proc_create+0x148/0x170 > > >> [] nfsd_dispatch+0xf9/0x1e0 > > >> [] svc_process+0x437/0x6e0 > > >> [] nfsd+0x1cd/0x360 > > >> [] child_rip+0xa/0x12 > > >> xfs_force_shutdown(dm-1,0x8) called from line 1139 of file > > >> fs/xfs/xfs_trans.c. Return address = 0xffffffff80359daa > > > > > >We shut down the filesystem because we cancelled a dirty transaction. > > >Once we start to dirty the incore objects, we can't roll back to > > >an unchanged state if a subsequent fatal error occurs during the > > >transaction and we have to abort it. 
> > >
> > So you are saying that there's nothing I can do to prevent this from
> > happening in the future?
>
> Pretty much - we need to work out what is going wrong and
> we can't from the shutdown message above - the error has
> occurred in a path that doesn't have error report traps
> in it.
>
> Is this reproducible?
>
Not on demand, no. It has happened only this once as far as I know and
for unknown reasons.

> > >If I understand historic occurrences of this correctly, there is
> > >a possibility that it can be triggered in ENOMEM situations. Was your
> > >machine running out of memory when this occurred?
> > >
> > Not really. I just checked my monitoring software and, at the time
> > this happened, the box had ~5.9G RAM free (of 8G total) and no swap
> > used (but 11G available).
>
> Ok. Sounds like we need more error reporting points inserted
> into that code so we dump an error earlier and hence have some
> hope of working out what went wrong next time.....
>
> OOC, there weren't any I/O errors reported before this shutdown?
>
No. I looked but found none.

Let me know if there's anything I can do to help.
-- Jesper Juhl Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html Plain text mails only, please http://www.expita.com/nomime.html From owner-xfs@oss.sgi.com Wed Nov 29 22:54:24 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 22:54:32 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAU6sLaG026450 for ; Wed, 29 Nov 2006 22:54:23 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA04566; Thu, 30 Nov 2006 17:53:18 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAU6rH7Y48891094; Thu, 30 Nov 2006 17:53:17 +1100 (AEDT) Received: (from bnaujok@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAU6rFhN49198148; Thu, 30 Nov 2006 17:53:15 +1100 (AEDT) Date: Thu, 30 Nov 2006 17:53:15 +1100 (AEDT) From: Barry Naujok Message-Id: <200611300653.kAU6rFhN49198148@snort.melbourne.sgi.com> To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: PARTIAL TAKE 954550 - xfs_db problems X-archive-position: 9835 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@snort.melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 708 Lines: 19 Fix libxfs SEGV when attempting to mount a non-XFS filesystem. 
Date: Thu Nov 30 17:52:33 AEDT 2006 Workarea: snort.melbourne.sgi.com:/home/bnaujok/isms/repair Inspected by: utako@tnes.nec.co.jp The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:27588a xfsprogs/doc/CHANGES - 1.225 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/doc/CHANGES.diff?r1=text&tr1=1.225&r2=text&tr2=1.224&f=h xfsprogs/libxfs/init.c - 1.52 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/libxfs/init.c.diff?r1=text&tr1=1.52&r2=text&tr2=1.51&f=h - Fix libxfs SEGV when attempting to mount a non-XFS filesystem. From owner-xfs@oss.sgi.com Wed Nov 29 23:10:20 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 23:10:27 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAU7AIaG028626 for ; Wed, 29 Nov 2006 23:10:19 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA04917; Thu, 30 Nov 2006 18:09:28 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAU79S7Y49201397; Thu, 30 Nov 2006 18:09:28 +1100 (AEDT) Received: (from bnaujok@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAU79Q0H49207473; Thu, 30 Nov 2006 18:09:26 +1100 (AEDT) Date: Thu, 30 Nov 2006 18:09:26 +1100 (AEDT) From: Barry Naujok Message-Id: <200611300709.kAU79Q0H49207473@snort.melbourne.sgi.com> To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 958692 - xfs_repair aborts if it encounters an inode with an invalid type. 
X-archive-position: 9836 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@snort.melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 642 Lines: 17 Fix up xfs_repair aborting if it finds an inode with an invalid inode type. Date: Thu Nov 30 18:08:37 AEDT 2006 Workarea: snort.melbourne.sgi.com:/home/bnaujok/isms/repair Inspected by: NONE The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:27589a xfsprogs/doc/CHANGES - 1.226 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/doc/CHANGES.diff?r1=text&tr1=1.226&r2=text&tr2=1.225&f=h xfsprogs/repair/dinode.c - 1.26 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/repair/dinode.c.diff?r1=text&tr1=1.26&r2=text&tr2=1.25&f=h From owner-xfs@oss.sgi.com Wed Nov 29 23:43:51 2006 Received: with ECARTIS (v1.0.0; list xfs); Wed, 29 Nov 2006 23:43:58 -0800 (PST) Received: from tyo200.gate.nec.co.jp (TYO200.gate.nec.co.jp [210.143.35.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU7hoaG032381 for ; Wed, 29 Nov 2006 23:43:51 -0800 Received: from tyo201.gate.nec.co.jp ([10.7.69.201]) by tyo200.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id kAU7h0BK025892 for ; Thu, 30 Nov 2006 16:43:00 +0900 (JST) Received: from mailgate3.nec.co.jp (mailgate54.nec.co.jp [10.7.69.195]) by tyo201.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id kAU7bgG3018743 for ; Thu, 30 Nov 2006 16:37:42 +0900 (JST) Received: (from root@localhost) by mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id kAU7bgZ18148 for xfs@oss.sgi.com; Thu, 30 Nov 2006 16:37:42 +0900 (JST) Received: from secsv3.tnes.nec.co.jp (tnesvc2.tnes.nec.co.jp [10.1.101.15]) by mailsv4.nec.co.jp (8.11.7/3.7W-MAILSV4-NEC) with ESMTP id kAU7bgr02578 for ; Thu, 30 Nov 2006 16:37:42 +0900 (JST) Received: from tnesvc2.tnes.nec.co.jp ([10.1.101.15]) by secsv3.tnes.nec.co.jp (ExpressMail 5.10) with 
SMTP id 20061130.164305.42102236 for ; Thu, 30 Nov 2006 16:43:05 +0900 Received: FROM tnessv1.tnes.nec.co.jp BY tnesvc2.tnes.nec.co.jp ; Thu Nov 30 16:43:04 2006 +0900 Received: from rifu.bsd.tnes.nec.co.jp (rifu.bsd.tnes.nec.co.jp [10.1.104.1]) by tnessv1.tnes.nec.co.jp (Postfix) with ESMTP id 3946CAE4B0 for ; Thu, 30 Nov 2006 16:31:45 +0900 (JST) Received: from TNESG9305.tnes.nec.co.jp (TNESG9305.bsd.tnes.nec.co.jp [10.1.104.199]) by rifu.bsd.tnes.nec.co.jp (8.12.11/3.7W/BSD-TNES-MX01) with SMTP id kAU7bfEe010053 for ; Thu, 30 Nov 2006 16:37:41 +0900 Message-Id: <200611300737.AA04758@TNESG9305.tnes.nec.co.jp> Date: Thu, 30 Nov 2006 16:37:39 +0900 To: xfs@oss.sgi.com Subject: Re: [PATCH 1/2]segmentation fault in xfs_io mread/mwrite command From: Utako Kusaka In-Reply-To: <1164838961.4992.29.camel@edge> References: <1164838961.4992.29.camel@edge> MIME-Version: 1.0 X-Mailer: AL-Mail32 Version 1.13 Content-Type: text/plain; charset=iso-2022-jp X-archive-position: 9837 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: utako@tnes.nec.co.jp Precedence: bulk X-list: xfs Content-Length: 1317 Lines: 43 Thanks for your response. This is my test case. $ ./xfs_io -f mmap -c "pwrite 0 16384" -c "mmap 4096 4096" -c "mread -r" wrote 16384/16384 bytes at offset 0 16 KiB, 4 ops; 0.0000 sec (625 MiB/sec and 160000.0000 ops/sec) Segmentation fault $ cat /etc/SuSE-release SUSE LINUX 10.0 (X86-64) VERSION = 10.0 Thu, 30 Nov 2006 09:22:41 +1100 Nathan Scott wrote: >On Wed, 2006-11-29 at 09:26 +0900, Utako Kusaka wrote: >> Hi, >> >> I found the following issues in xfs_io. >> mread command: >> a) Causes a segmentation fault. >> Because "length"+1 bytes data is copied to buffer in read_mapping(), >> but buffer size is "length". >> b) Reads from wrong offset. >> c) The first byte of dump data is incorrect when length > page size. 
>> mwrite command: >> d) Data placement is incorrect when -r option is specified >> because of wrong for-loop counter. >> >> This patch fixes them. >> > >Looks OK - could you send explicit test cases that demonstrate each >problem please? (i.e. actual xfs_io invocations). Particularly the >segfault should be easy to show, something like: >xfs_io -f -c 'mmap ...' -c 'mread ...' /tmp/foo) > >That way they can be added to the regression test suite to ensure these >things don't spontaneously break themselves in the future. > >thanks! > >-- >Nathan From owner-xfs@oss.sgi.com Thu Nov 30 00:04:58 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 00:05:04 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAU84taG004977 for ; Thu, 30 Nov 2006 00:04:57 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id TAA05970; Thu, 30 Nov 2006 19:04:00 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id E3F7C58FF58B; Thu, 30 Nov 2006 19:03:59 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 957159 - Reduce XFS 4k stack kernel issues Message-Id: <20061130080359.E3F7C58FF58B@chook.melbourne.sgi.com> Date: Thu, 30 Nov 2006 19:03:59 +1100 (EST) From: dgc@sgi.com (David Chinner) X-archive-position: 9838 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 7121 Lines: 147 Keep stack usage down for 4k stacks by using noinline. gcc-4.1 and more recent aggressively inline static functions which increases XFS stack usage by ~15% in critical paths. Prevent this from occurring by adding noinline to the STATIC definition. 
Also uninline some functions that are too large to be inlined and were causing problems with CONFIG_FORCED_INLINING=y. Finally, clean up all the different users of inline, __inline and __inline__ and put them under one STATIC_INLINE macro. For debug kernels the STATIC_INLINE macro uninlines those functions. Date: Thu Nov 30 19:02:40 AEDT 2006 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: tes,chatz The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:27585a fs/xfs/xfs_ialloc.c - 1.193 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_ialloc.c.diff?r1=text&tr1=1.193&r2=text&tr2=1.192&f=h - noinline static function declaration cleanup. fs/xfs/xfs_extfree_item.c - 1.66 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_extfree_item.c.diff?r1=text&tr1=1.66&r2=text&tr2=1.65&f=h - noinline static function declaration cleanup. fs/xfs/xfs_buf_item.c - 1.161 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_buf_item.c.diff?r1=text&tr1=1.161&r2=text&tr2=1.160&f=h - noinline static function declaration cleanup. fs/xfs/xfs_bit.c - 1.30 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bit.c.diff?r1=text&tr1=1.30&r2=text&tr2=1.29&f=h - noinline static function declaration cleanup. fs/xfs/xfs_inode_item.c - 1.130 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode_item.c.diff?r1=text&tr1=1.130&r2=text&tr2=1.129&f=h - noinline static function declaration cleanup. fs/xfs/xfs_bmap_btree.c - 1.158 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bmap_btree.c.diff?r1=text&tr1=1.158&r2=text&tr2=1.157&f=h - noinline static function declaration cleanup. fs/xfs/xfs_mount.c - 1.387 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.387&r2=text&tr2=1.386&f=h - noinline static function declaration cleanup. 
fs/xfs/xfs_inode.c - 1.456 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.456&r2=text&tr2=1.455&f=h - noinline static function declaration cleanup. fs/xfs/xfs_attr_leaf.c - 1.104 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_attr_leaf.c.diff?r1=text&tr1=1.104&r2=text&tr2=1.103&f=h - noinline static function declaration cleanup. fs/xfs/xfs_attr.c - 1.140 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_attr.c.diff?r1=text&tr1=1.140&r2=text&tr2=1.139&f=h - noinline static function declaration cleanup. fs/xfs/support/debug.h - 1.16 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/support/debug.h.diff?r1=text&tr1=1.16&r2=text&tr2=1.15&f=h - noinline static function declaration cleanup. fs/xfs/quota/xfs_qm_bhv.c - 1.24 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm_bhv.c.diff?r1=text&tr1=1.24&r2=text&tr2=1.23&f=h - noinline static function declaration cleanup. fs/xfs/quota/xfs_dquot_item.c - 1.16 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_dquot_item.c.diff?r1=text&tr1=1.16&r2=text&tr2=1.15&f=h - noinline static function declaration cleanup. fs/xfs/quota/xfs_qm.c - 1.45 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm.c.diff?r1=text&tr1=1.45&r2=text&tr2=1.44&f=h - noinline static function declaration cleanup. fs/xfs/xfs_refcache.c - 1.7 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_refcache.c.diff?r1=text&tr1=1.7&r2=text&tr2=1.6&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.6/xfs_vfs.c - 1.74 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_vfs.c.diff?r1=text&tr1=1.74&r2=text&tr2=1.73&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.6/xfs_file.c - 1.146 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_file.c.diff?r1=text&tr1=1.146&r2=text&tr2=1.145&f=h - noinline static function declaration cleanup. 
fs/xfs/linux-2.6/xfs_vnode.c - 1.143 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_vnode.c.diff?r1=text&tr1=1.143&r2=text&tr2=1.142&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.6/xfs_vnode.h - 1.126 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_vnode.h.diff?r1=text&tr1=1.126&r2=text&tr2=1.125&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.6/xfs_super.c - 1.374 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_super.c.diff?r1=text&tr1=1.374&r2=text&tr2=1.373&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.6/xfs_iops.c - 1.256 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.256&r2=text&tr2=1.255&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.6/xfs_aops.c - 1.136 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_aops.c.diff?r1=text&tr1=1.136&r2=text&tr2=1.135&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.6/xfs_sysctl.c - 1.39 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_sysctl.c.diff?r1=text&tr1=1.39&r2=text&tr2=1.38&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.4/xfs_file.c - 1.129 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.4/xfs_file.c.diff?r1=text&tr1=1.129&r2=text&tr2=1.128&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.4/xfs_vnode.h - 1.114 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.4/xfs_vnode.h.diff?r1=text&tr1=1.114&r2=text&tr2=1.113&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.4/xfs_super.c - 1.333 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.4/xfs_super.c.diff?r1=text&tr1=1.333&r2=text&tr2=1.332&f=h - noinline static function declaration cleanup. 
fs/xfs/linux-2.6/xfs_buf.c - 1.232 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_buf.c.diff?r1=text&tr1=1.232&r2=text&tr2=1.231&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.4/xfs_buf.c - 1.218 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.4/xfs_buf.c.diff?r1=text&tr1=1.218&r2=text&tr2=1.217&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.4/mrlock.c - 1.22 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.4/mrlock.c.diff?r1=text&tr1=1.22&r2=text&tr2=1.21&f=h - noinline static function declaration cleanup. fs/xfs/linux-2.6/xfs_export.c - 1.12 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_export.c.diff?r1=text&tr1=1.12&r2=text&tr2=1.11&f=h - noinline static function declaration cleanup. fs/xfs/dmapi/xfs_dm.c - 1.30 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/dmapi/xfs_dm.c.diff?r1=text&tr1=1.30&r2=text&tr2=1.29&f=h - noinline static function declaration cleanup. 
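
For readers unfamiliar with the technique described in this commit message, here is a minimal sketch of how a noinline STATIC and a debug-sensitive STATIC_INLINE might be defined. This is an assumption-laden illustration, not the actual XFS header contents: SKETCH_NOINLINE and the sketch_* functions are hypothetical names, and the real macro definitions may differ.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-ins for the macros described in the commit message.
 * The real definitions live in the XFS headers and may differ.
 */
#if defined(__GNUC__)
# define SKETCH_NOINLINE __attribute__((noinline))
#else
# define SKETCH_NOINLINE
#endif

/* STATIC: file-local, and never merged into the caller's stack frame. */
#define STATIC static SKETCH_NOINLINE

/* STATIC_INLINE: inlined normally, uninlined when building with DEBUG. */
#ifdef DEBUG
# define STATIC_INLINE static SKETCH_NOINLINE
#else
# define STATIC_INLINE static inline
#endif

/* Rarely taken path that needs a large on-stack buffer. */
STATIC int
sketch_format_error(int code)
{
	char scratch[256];
	int len = snprintf(scratch, sizeof(scratch), "error %d", code);

	assert(len > 0 && (size_t)len < sizeof(scratch));
	return len;
}

/* Tiny predicate that is cheap to inline. */
STATIC_INLINE int
sketch_is_error(int code)
{
	return code != 0;
}

/*
 * Hot path: because sketch_format_error() is not inlined, its 256-byte
 * buffer is only on the stack while that call is actually running.
 */
int
sketch_check_status(int code)
{
	if (sketch_is_error(code))
		return sketch_format_error(code);
	return 0;
}
```

If the compiler were free to inline sketch_format_error(), the caller's frame could grow by the size of the scratch buffer on every call, including the common case where no error occurs. That is the kind of stack growth the commit says it wants to avoid on 4k stacks.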
From owner-xfs@oss.sgi.com Thu Nov 30 01:39:34 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 01:39:41 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [83.160.184.112]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU9dWaG022944 for ; Thu, 30 Nov 2006 01:39:34 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 897E37FE36; Thu, 30 Nov 2006 10:38:43 +0100 (CET) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=1148133259; d=wantstofly.org; h=date:from:to:cc:subject:message-id:mime-version:content-type: content-disposition:in-reply-to:user-agent; b=nDdJsXN9qs4fuWTOZUtx7Ypltqyui2H7X+k0yi1wh2aF4FXOf9VO4tnt8bvqH B8R9VRwcV3reeskX5aFfxa5SA== Date: Thu, 30 Nov 2006 10:38:42 +0100 From: Lennert Buytenhek To: Christoph Hellwig Cc: agruen@suse.de, xfs@oss.sgi.com, linux-arm@lists.arm.linux.org.uk Subject: Re: [PATCH] libattr 2.4.32 arm eabi system call calling convention Message-ID: <20061130093842.GB24108@xi.wantstofly.org> References: <20061130025459.GA23869@xi.wantstofly.org> <20061130092853.GB1534@infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061130092853.GB1534@infradead.org> User-Agent: Mutt/1.4.1i X-archive-position: 9839 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: xfs Content-Length: 581 Lines: 14 On Thu, Nov 30, 2006 at 09:28:53AM +0000, Christoph Hellwig wrote: > > When building for EABI, a different system call calling convention is > > used where system calls are numbered starting from zero, not 0x900000 > > as in the old ABI. This was causing 'ls -al' with an ls binary that > > was built with xattr support to SIGILL. > > Please just rip out the direct syscalls. These days glibc provides all > the xattr syscalls in sys/xattr.h, and libattr should just forward to > those. 
Sounds like the better option to me as well. (Would have saved me a bunch of work, too.) From owner-xfs@oss.sgi.com Thu Nov 30 01:55:26 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 01:55:33 -0800 (PST) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAU9tOaG025002 for ; Thu, 30 Nov 2006 01:55:26 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1GpiDt-00012u-EQ; Thu, 30 Nov 2006 09:28:53 +0000 Date: Thu, 30 Nov 2006 09:28:53 +0000 From: Christoph Hellwig To: Lennert Buytenhek Cc: agruen@suse.de, xfs@oss.sgi.com, linux-arm@lists.arm.linux.org.uk Subject: Re: [PATCH] libattr 2.4.32 arm eabi system call calling convention Message-ID: <20061130092853.GB1534@infradead.org> References: <20061130025459.GA23869@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20061130025459.GA23869@xi.wantstofly.org> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 9840 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 473 Lines: 10 On Thu, Nov 30, 2006 at 03:54:59AM +0100, Lennert Buytenhek wrote: > When building for EABI, a different system call calling convention is > used where system calls are numbered starting from zero, not 0x900000 > as in the old ABI. This was causing 'ls -al' with an ls binary that > was built with xattr support to SIGILL. Please just rip out the direct syscalls. These days glibc provides all the xattr syscalls in sys/xattr.h, and libattr should just forward to those. 
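
As a rough illustration of the suggestion above, forwarding to the glibc wrappers could look something like the sketch below. It assumes Linux with glibc's sys/xattr.h; the sketch_attr_* names are hypothetical and are not libattr's actual entry points. Only setxattr() and getxattr() are shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Hypothetical wrappers: instead of issuing the system call by number
 * (which depends on the ABI's syscall base), forward to the C library.
 */
int
sketch_attr_set(const char *path, const char *name,
		const void *value, size_t size)
{
	assert(path != NULL && name != NULL);
	/* flags == 0: create the attribute or replace an existing one */
	return setxattr(path, name, value, size, 0);
}

ssize_t
sketch_attr_get(const char *path, const char *name,
		void *value, size_t size)
{
	assert(path != NULL && name != NULL);
	/* returns the attribute size, or -1 with errno set on failure */
	return getxattr(path, name, value, size);
}
```

With this approach the per-architecture __NR_* tables are no longer needed in the library itself, since the C library already knows the right calling convention for the ABI it was built for.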
From owner-xfs@oss.sgi.com Thu Nov 30 10:04:31 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 10:04:38 -0800 (PST) Received: from internal-mail-relay1.corp.sgi.com (internal-mail-relay1.corp.sgi.com [198.149.32.52]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAUI4VaG017106 for ; Thu, 30 Nov 2006 10:04:31 -0800 Received: from [134.15.160.2] (vpn-emea-sw-emea-160-2.emea.sgi.com [134.15.160.2]) by internal-mail-relay1.corp.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id kAUI3ebj61711469; Thu, 30 Nov 2006 10:03:41 -0800 (PST) Message-ID: <456F1CFC.2060705@sgi.com> Date: Thu, 30 Nov 2006 18:03:40 +0000 From: Lachlan McIlroy Reply-To: lachlan@sgi.com Organization: SGI User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.7.12) Gecko/20050920 X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Chinner CC: xfs-dev@sgi.com, xfs@oss.sgi.com Subject: Re: Review: Reduce in-core superblock lock contention near ENOSPC References: <20061123044122.GU11034@melbourne.sgi.com> In-Reply-To: <20061123044122.GU11034@melbourne.sgi.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9842 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Content-Length: 2064 Lines: 75 Dave, Could you have changed the SB_LOCK from a spinlock to a blocking mutex and have achieved a similar effect? Has this change had much testing on a large machine? These changes wouldn't apply cleanly to tot (3 hunks failed in xfs_mount.c) but I couldn't see why. The changes look fine to me, couple of comments below. 
Lachlan

@@ -1479,9 +1479,11 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp,
 		case XFS_SBS_IFREE:
 		case XFS_SBS_FDBLOCKS:
 			if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) {
-				status = xfs_icsb_modify_counters_locked(mp,
+				XFS_SB_UNLOCK(mp, s);
+				status = xfs_icsb_modify_counters(mp,
 							msbp->msb_field,
 							msbp->msb_delta, rsvd);
+				s = XFS_SB_LOCK(mp);
 				break;
 			}
 			/* FALLTHROUGH */

Is it safe to be releasing the SB_LOCK?  Is it assumed that the
superblock won't change while we process the list of xfs_mod_sb
structures?

@@ -1515,11 +1517,12 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp,
 			case XFS_SBS_IFREE:
 			case XFS_SBS_FDBLOCKS:
 				if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) {
-					status =
-					    xfs_icsb_modify_counters_locked(mp,
+					XFS_SB_UNLOCK(mp, s);
+					status = xfs_icsb_modify_counters(mp,
 							msbp->msb_field,
 							-(msbp->msb_delta),
 							rsvd);
+					s = XFS_SB_LOCK(mp);
 					break;
 				}
 				/* FALLTHROUGH */

Same as above.

@@ -1882,6 +1895,17 @@ xfs_icsb_disable_counter(
 
 	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
 
+	/*
+	 * If we are already disabled, then there is nothing to do
+	 * here. We check before locking all the counters to avoid
+	 * the expensive lock operation when being called in the
+	 * slow path and the counter is already disabled. This is
+	 * safe because the only time we set or clear this state is under
+	 * the m_icsb_mutex.
+	 */
+	if (xfs_icsb_counter_disabled(mp, field))
+		return 0;
+
 	xfs_icsb_lock_all_counters(mp);
 	if (!test_and_set_bit(field, &mp->m_icsb_counters)) {
 		/* drain back to superblock */

Nice one, that will avoid a lot of unnecessary work.
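
The early-exit check in the last hunk can be modelled in isolation. The following is only a single-threaded sketch under stated assumptions: struct sketch_counter and the sketch_* functions are hypothetical, the "expensive lock" is reduced to a call counter, and the serialisation that the real code gets from m_icsb_mutex is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_NCPUS = 4 };

/* Hypothetical, single-threaded model of one per-CPU counter. */
struct sketch_counter {
	int  disabled;               /* nonzero once drained to 'global' */
	long percpu[SKETCH_NCPUS];   /* per-CPU shares of the count */
	long global;                 /* value held in the "superblock" */
	int  lock_all_calls;         /* times the expensive lock was taken */
};

/* Stand-in for locking every per-CPU counter: here we only count calls. */
static void
sketch_lock_all(struct sketch_counter *c)
{
	c->lock_all_calls++;
}

int
sketch_disable_counter(struct sketch_counter *c)
{
	int i;

	assert(c != NULL);

	/* Cheap early exit: nothing to drain if already disabled. */
	if (c->disabled)
		return 0;

	sketch_lock_all(c);
	if (!c->disabled) {
		c->disabled = 1;
		/* drain the per-CPU shares back into the global value */
		for (i = 0; i < SKETCH_NCPUS; i++) {
			c->global += c->percpu[i];
			c->percpu[i] = 0;
		}
	}
	return 0;
}
```

Calling sketch_disable_counter() twice on the same counter takes the "expensive" path only once; the second call returns before touching any lock, which is the saving being praised above.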
From owner-xfs@oss.sgi.com Thu Nov 30 12:34:35 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 12:34:42 -0800 (PST) Received: from randymail-a9.dreamhost.com (sd-green-bigip-119.dreamhost.com [208.97.132.119]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAUKYWaG006800 for ; Thu, 30 Nov 2006 12:34:35 -0800 Received: from [10.2.255.104] (unknown [208.51.196.2]) by randymail-a9.dreamhost.com (Postfix) with ESMTP id 9EDECEEDCA; Thu, 30 Nov 2006 12:33:37 -0800 (PST) Message-ID: <456F401F.3030902@delusion.com> Date: Thu, 30 Nov 2006 12:33:35 -0800 From: Deanan User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Eric Sandeen Cc: chatz@melbourne.sgi.com, xfs@oss.sgi.com Subject: Re: inode64 workaround References: <200611290027.AA04740@TNESG9305.tnes.nec.co.jp> <1164838985.4992.30.camel@edge> <456E1B08.7090802@delusion.com> <456E2A30.4010101@melbourne.sgi.com> <456E2D0E.2000007@delusion.com> <456E30A5.6080109@melbourne.sgi.com> <456E35CA.2000601@delusion.com> <456E67EB.2030008@sandeen.net> In-Reply-To: <456E67EB.2030008@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 9844 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: delusion@delusion.com Precedence: bulk X-list: xfs Content-Length: 1032 Lines: 43 It's a generic 2.6.9 kernel (AFAIK). I tried the same setting on a different box with SLES 9 SP3 (2.6.5) which does have rotorstep. With inode64 I can sustain 235+MB/s on the same array test after test. When I do not mount with inode64, I get the same results as the 32bit machine (~100-130MB/s). Without inode64 but with rotorstep (set to 255), the performance improves to about 140-160MB/s. Generally the first test is fast and then drops over the next few tests (even writing as few as 100 16MB files per test). Thanks, Deanan > Deanan wrote: >> Thanks. 
Unfortunately 2.6.9 doesn't have it. :( > > Is this rhel4? > > You could probably pretty easily add the inode rotor code into the xfs > modules that you're using, if that's the case. > > -Eric > >>> This is a sysctl, see sysctl(8). >>> >>> It was introduced to XFS in October 2004, I'm not sure if it made >>> 2.6.9. >>> >>> If this doesn't help a little then I'm unsure why you think that >>> inode64 is >>> going to solve your problem? >>> >>> David >>> >> >> > > From owner-xfs@oss.sgi.com Thu Nov 30 13:17:14 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 13:17:20 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kAULHCaG013040 for ; Thu, 30 Nov 2006 13:17:14 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id kAULGNRw011433; Thu, 30 Nov 2006 16:16:23 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAULGNws000560; Thu, 30 Nov 2006 16:16:23 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id kAULGMcl012595; Thu, 30 Nov 2006 16:16:22 -0500 Message-ID: <456F4A25.80709@sandeen.net> Date: Thu, 30 Nov 2006 15:16:21 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.8 (X11/20061107) MIME-Version: 1.0 To: David Chinner CC: xfs@oss.sgi.com Subject: Re: TAKE 957159 - Reduce XFS 4k stack kernel issues References: <20061130080359.E3F7C58FF58B@chook.melbourne.sgi.com> In-Reply-To: <20061130080359.E3F7C58FF58B@chook.melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 9845 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 
163 Lines: 7 David, did this add any new static declarations, or just get the existing ones into shape? IOW, I think I should rework my make-more-things-static patch? -Eric From owner-xfs@oss.sgi.com Thu Nov 30 14:39:07 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 14:39:13 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAUMd3aG023758 for ; Thu, 30 Nov 2006 14:39:05 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA28056; Fri, 1 Dec 2006 09:38:12 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAUMcB7Y49850277; Fri, 1 Dec 2006 09:38:12 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAUMcBVE49063621; Fri, 1 Dec 2006 09:38:11 +1100 (AEDT) Date: Fri, 1 Dec 2006 09:38:11 +1100 From: David Chinner To: Lachlan McIlroy Cc: David Chinner , xfs-dev@sgi.com, xfs@oss.sgi.com Subject: Re: Review: Reduce in-core superblock lock contention near ENOSPC Message-ID: <20061130223810.GO37654165@melbourne.sgi.com> References: <20061123044122.GU11034@melbourne.sgi.com> <456F1CFC.2060705@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <456F1CFC.2060705@sgi.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 9846 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 3519 Lines: 114 On Thu, Nov 30, 2006 at 06:03:40PM +0000, Lachlan McIlroy wrote: > Dave, > > Could you have changed the SB_LOCK from a spinlock to a blocking > mutex and have achieved a similar effect? 
Sort of - it would still be inefficient and wouldn't help solve the
underlying causes of contention. Also, everything else that uses the
SB_LOCK would now have a sleep point where there wasn't one previously.
If we are nesting the SB_LOCK somewhere else inside another spinlock
(not sure if we are) then we can't sleep. I'd prefer not to change the
semantics of such a lock if I can avoid it.

I think the slow path code is somewhat clearer with a separate mutex -
it clearly documents the serialisation barrier that the slow path uses
and allows us to do slow path checks on the per-cpu counters without
needing the SB_LOCK. It also means that in future, we can slowly remove
the need for holding the SB_LOCK across the entire rebalance operation
and only use it when referencing the global superblock fields during
the rebalance.

If the need arises, it also means we can move to a mutex per counter so
we can independently rebalance different types of counters at the same
time (which we can't do right now).

> Has this change had much testing on a large machine?

8p is the largest I've run it on (junkbond) and it's been ENOSPC tested
on a 2.7GB/s filesystem (junkbond once again) as well as on single, slow
disks.

I've tried and tried to get the people that reported the problem to test
this fix but no luck so far (this bug has been open for months and most
of that time has been me waiting for someone to run a test). I've
basically got sick of waiting and I just want to move this on. It's
already too late for sles10sp1 because of the lack of response.

> These changes wouldn't apply cleanly to tot (3 hunks failed in
> xfs_mount.c) but I couldn't see why.

Whitespace issue? Try setting:

$ export QUILT_PATCH_OPTS="--ignore-whitespace"

I'll apply the patch to a separate tree and see if I hit the same
problem....

> The changes look fine to me, couple of comments below.
>
> Lachlan
>
>
> @@ -1479,9 +1479,11 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp,
>          case XFS_SBS_IFREE:
>          case XFS_SBS_FDBLOCKS:
>              if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) {
> -                status = xfs_icsb_modify_counters_locked(mp,
> +                XFS_SB_UNLOCK(mp, s);
> +                status = xfs_icsb_modify_counters(mp,
>                          msbp->msb_field,
>                          msbp->msb_delta,
>                          rsvd);
> +                s = XFS_SB_LOCK(mp);
>                  break;
>              }
>              /* FALLTHROUGH */
>
> Is it safe to be releasing the SB_LOCK?

Yes.

> Is it assumed that the
> superblock won't change while we process the list of xfs_mod_sb
> structures?

No. We are applying deltas - it doesn't matter if other deltas are
applied at the same time by other callers because in the end all the
deltas get applied and it adds up to the same thing.

> @@ -1515,11 +1517,12 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp,
>          case XFS_SBS_IFREE:
>          case XFS_SBS_FDBLOCKS:
>              if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) {
> -                status =
> -                    xfs_icsb_modify_counters_locked(mp,
> +                XFS_SB_UNLOCK(mp, s);
> +                status = xfs_icsb_modify_counters(mp,
>                          msbp->msb_field,
>                          -(msbp->msb_delta),
>                          rsvd);
> +                s = XFS_SB_LOCK(mp);
>                  break;
>              }
>              /* FALLTHROUGH */
>
> Same as above.

Ditto ;)

Thanks for looking at this, Lachlan.

Cheers,

Dave.
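
The point about deltas can be sketched in plain userspace C. This is
only an illustration, not the XFS code: `struct sb_counter`,
`apply_delta()` and `apply_all()` are made-up names, and the sketch
deliberately ignores the ENOSPC check that the real counters perform
when a decrement would take a value below zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one global superblock counter. */
struct sb_counter {
	int64_t value;
};

/* Apply a single signed delta to the counter. */
static void apply_delta(struct sb_counter *c, int64_t delta)
{
	c->value += delta;
}

/*
 * Apply n deltas in array order, starting from 'start', and return
 * the final counter value. Addition is commutative, so any
 * permutation of the same deltas yields the same result.
 */
static int64_t apply_all(int64_t start, const int64_t *deltas, size_t n)
{
	struct sb_counter c = { .value = start };
	size_t i;

	for (i = 0; i < n; i++)
		apply_delta(&c, deltas[i]);
	return c.value;
}
```

Calling apply_all() with the deltas {5, -3, 7, -2} and again with
{-2, 7, 5, -3} from the same starting value gives the same total, which
is why interleaving with other callers between lock drops does not
change the end result.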
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 30 15:42:51 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 15:42:58 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kAUNgmaG031346 for ; Thu, 30 Nov 2006 15:42:50 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAB29697; Fri, 1 Dec 2006 10:41:49 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kAUNfm7Y49945520; Fri, 1 Dec 2006 10:41:48 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kAUNflY649919013; Fri, 1 Dec 2006 10:41:47 +1100 (AEDT) Date: Fri, 1 Dec 2006 10:41:46 +1100 From: David Chinner To: Eric Sandeen Cc: David Chinner , xfs@oss.sgi.com Subject: Re: TAKE 957159 - Reduce XFS 4k stack kernel issues Message-ID: <20061130234146.GR37654165@melbourne.sgi.com> References: <20061130080359.E3F7C58FF58B@chook.melbourne.sgi.com> <456F4A25.80709@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <456F4A25.80709@sandeen.net> User-Agent: Mutt/1.4.2.1i X-archive-position: 9847 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 479 Lines: 20 On Thu, Nov 30, 2006 at 03:16:21PM -0600, Eric Sandeen wrote: > David, did this add any new static declarations, or just get the > existing ones into shape? It just got the existing stuff into shape and fixed the xfs_clear_inode inlining problems. > IOW, I think I should rework my make-more-things-static patch? 
Sure - it would be good to get it in before the .20-rc1 merge window closes.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 30 16:27:58 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 16:28:05 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kB10RraG008351 for ; Thu, 30 Nov 2006 16:27:56 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA00996; Fri, 1 Dec 2006 11:26:59 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kB10Qw7Y49912274; Fri, 1 Dec 2006 11:26:58 +1100 (AEDT) Received: (from bnaujok@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kB10QuG449074185; Fri, 1 Dec 2006 11:26:56 +1100 (AEDT) Date: Fri, 1 Dec 2006 11:26:56 +1100 (AEDT) From: Barry Naujok Message-Id: <200612010026.kB10QuG449074185@snort.melbourne.sgi.com> To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 958517 - mkfs.xfs can create a corrupt filesystem X-archive-position: 9849 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@snort.melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 966 Lines: 24 Fix up mkfs.xfs which can create a corrupt filesystem with large block sizes. 
Date: Fri Dec 1 11:26:25 AEDT 2006 Workarea: snort.melbourne.sgi.com:/home/bnaujok/isms/repair Inspected by: dgc@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:27594a xfsprogs/doc/CHANGES - 1.227 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/doc/CHANGES.diff?r1=text&tr1=1.227&r2=text&tr2=1.226&f=h xfsprogs/mkfs/xfs_mkfs.c - 1.79 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/mkfs/xfs_mkfs.c.diff?r1=text&tr1=1.79&r2=text&tr2=1.78&f=h - Fix up determination of realtime extent size so mkfs can't create a corrupt filesystem xfsprogs/include/xfs_rtalloc.h - 1.13 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/include/xfs_rtalloc.h.diff?r1=text&tr1=1.13&r2=text&tr2=1.12&f=h - Remove default realtime extent size definition. From owner-xfs@oss.sgi.com Thu Nov 30 16:42:11 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 16:42:18 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kB10g5aG009889 for ; Thu, 30 Nov 2006 16:42:08 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA01386; Fri, 1 Dec 2006 11:41:15 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id kB10fE7Y49945739; Fri, 1 Dec 2006 11:41:14 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id kB10fCrv49880094; Fri, 1 Dec 2006 11:41:12 +1100 (AEDT) Date: Fri, 1 Dec 2006 11:41:12 +1100 From: David Chinner To: David Chinner Cc: Lachlan McIlroy , xfs-dev@sgi.com, xfs@oss.sgi.com Subject: Re: Review: Reduce in-core superblock lock contention near ENOSPC Message-ID: <20061201004112.GW37654165@melbourne.sgi.com> References: 
<20061123044122.GU11034@melbourne.sgi.com> <456F1CFC.2060705@sgi.com> <20061130223810.GO37654165@melbourne.sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20061130223810.GO37654165@melbourne.sgi.com>
User-Agent: Mutt/1.4.2.1i
X-archive-position: 9850
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: dgc@sgi.com
Precedence: bulk
X-list: xfs
Content-Length: 14280
Lines: 453

On Fri, Dec 01, 2006 at 09:38:11AM +1100, David Chinner wrote:
> On Thu, Nov 30, 2006 at 06:03:40PM +0000, Lachlan McIlroy wrote:
>
> > These changes wouldn't apply cleanly to tot (3 hunks failed in
> > xfs_mount.c) but I couldn't see why.
>
> Whitespace issue? Try setting:
>
> $ export QUILT_PATCH_OPTS="--ignore-whitespace"
>
> I'll apply the patch to a separate tree and see if I hit the same
> problem....

I see the problem - it comes from the next patch I am going to send out
for review, which is earlier in my series. That growfs fix changes the
delta parameter to xfs_icsb_modify_counters() from int to int64_t, and
that is why the hunks don't apply.

The attached patch should apply (with a 6 line offset to most hunks).

Cheers,

Dave.
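
For illustration only, and not part of the attached patch: one plausible
reason for widening a block-count delta from int to int64_t (the reason
is not stated in this thread) is that a large grow can exceed what a
32-bit int holds. The sketch below assumes a 4 KiB block size and a
10 TiB grow; `blocks_for_bytes()` and `fits_in_int32()` are hypothetical
helper names.

```c
#include <stdint.h>

/* Number of whole filesystem blocks contained in 'bytes' bytes. */
static int64_t blocks_for_bytes(int64_t bytes, int64_t blocksize)
{
	return bytes / blocksize;
}

/* Returns 1 if 'delta' survives being passed as a 32-bit int. */
static int fits_in_int32(int64_t delta)
{
	return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

Under those assumptions, blocks_for_bytes(10LL << 40, 4096) is
2684354560 blocks, which is larger than INT32_MAX (2147483647), so
fits_in_int32() returns 0 for it. A 1 TiB grow (268435456 blocks) still
fits.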
-- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/xfs_mount.c | 234 +++++++++++++++++++++++++++++++---------------------- fs/xfs/xfs_mount.h | 1 2 files changed, 142 insertions(+), 93 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.c 2006-10-19 10:29:35.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.c 2006-10-19 10:32:16.626827226 +1000 @@ -52,21 +52,19 @@ STATIC void xfs_unmountfs_wait(xfs_mount #ifdef HAVE_PERCPU_SB STATIC void xfs_icsb_destroy_counters(xfs_mount_t *); -STATIC void xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t, int); +STATIC void xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t, int, +int); STATIC void xfs_icsb_sync_counters(xfs_mount_t *); STATIC int xfs_icsb_modify_counters(xfs_mount_t *, xfs_sb_field_t, int, int); -STATIC int xfs_icsb_modify_counters_locked(xfs_mount_t *, xfs_sb_field_t, - int, int); STATIC int xfs_icsb_disable_counter(xfs_mount_t *, xfs_sb_field_t); #else #define xfs_icsb_destroy_counters(mp) do { } while (0) -#define xfs_icsb_balance_counter(mp, a, b) do { } while (0) +#define xfs_icsb_balance_counter(mp, a, b, c) do { } while (0) #define xfs_icsb_sync_counters(mp) do { } while (0) #define xfs_icsb_modify_counters(mp, a, b, c) do { } while (0) -#define xfs_icsb_modify_counters_locked(mp, a, b, c) do { } while (0) #endif @@ -540,9 +538,11 @@ xfs_readsb(xfs_mount_t *mp, int flags) ASSERT(XFS_BUF_VALUSEMA(bp) <= 0); } - xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0); - xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0); - xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0); + mutex_lock(&mp->m_icsb_mutex); + xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0, 0); + mutex_unlock(&mp->m_icsb_mutex); mp->m_sb_bp = bp; xfs_buf_relse(bp); @@ -1479,9 +1479,11 @@ 
xfs_mod_incore_sb_batch(xfs_mount_t *mp, case XFS_SBS_IFREE: case XFS_SBS_FDBLOCKS: if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) { - status = xfs_icsb_modify_counters_locked(mp, + XFS_SB_UNLOCK(mp, s); + status = xfs_icsb_modify_counters(mp, msbp->msb_field, msbp->msb_delta, rsvd); + s = XFS_SB_LOCK(mp); break; } /* FALLTHROUGH */ @@ -1515,11 +1517,12 @@ xfs_mod_incore_sb_batch(xfs_mount_t *mp, case XFS_SBS_IFREE: case XFS_SBS_FDBLOCKS: if (!(mp->m_flags & XFS_MOUNT_NO_PERCPU_SB)) { - status = - xfs_icsb_modify_counters_locked(mp, + XFS_SB_UNLOCK(mp, s); + status = xfs_icsb_modify_counters(mp, msbp->msb_field, -(msbp->msb_delta), rsvd); + s = XFS_SB_LOCK(mp); break; } /* FALLTHROUGH */ @@ -1727,14 +1730,17 @@ xfs_icsb_cpu_notify( memset(cntp, 0, sizeof(xfs_icsb_cnts_t)); break; case CPU_ONLINE: - xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0); - xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0); - xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0); + mutex_lock(&mp->m_icsb_mutex); + xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0, 0); + mutex_unlock(&mp->m_icsb_mutex); break; case CPU_DEAD: /* Disable all the counters, then fold the dead cpu's * count into the total on the global superblock and * re-enable the counters. 
*/ + mutex_lock(&mp->m_icsb_mutex); s = XFS_SB_LOCK(mp); xfs_icsb_disable_counter(mp, XFS_SBS_ICOUNT); xfs_icsb_disable_counter(mp, XFS_SBS_IFREE); @@ -1746,10 +1752,14 @@ xfs_icsb_cpu_notify( memset(cntp, 0, sizeof(xfs_icsb_cnts_t)); - xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, XFS_ICSB_SB_LOCKED); - xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, XFS_ICSB_SB_LOCKED); - xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, XFS_ICSB_SB_LOCKED); + xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, + XFS_ICSB_SB_LOCKED, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, + XFS_ICSB_SB_LOCKED, 0); + xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, + XFS_ICSB_SB_LOCKED, 0); XFS_SB_UNLOCK(mp, s); + mutex_unlock(&mp->m_icsb_mutex); break; } @@ -1778,6 +1788,9 @@ xfs_icsb_init_counters( cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i); memset(cntp, 0, sizeof(xfs_icsb_cnts_t)); } + + mutex_init(&mp->m_icsb_mutex); + /* * start with all counters disabled so that the * initial balance kicks us off correctly @@ -1882,6 +1895,17 @@ xfs_icsb_disable_counter( ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS)); + /* + * If we are already disabled, then there is nothing to do + * here. We check before locking all the counters to avoid + * the expensive lock operation when being called in the + * slow path and the counter is already disabled. This is + * safe because the only time we set or clear this state is under + * the m_icsb_mutex. + */ + if (xfs_icsb_counter_disabled(mp, field)) + return 0; + xfs_icsb_lock_all_counters(mp); if (!test_and_set_bit(field, &mp->m_icsb_counters)) { /* drain back to superblock */ @@ -1991,24 +2015,33 @@ xfs_icsb_sync_counters_lazy( /* * Balance and enable/disable counters as necessary. * - * Thresholds for re-enabling counters are somewhat magic. 
- * inode counts are chosen to be the same number as single - * on disk allocation chunk per CPU, and free blocks is - * something far enough zero that we aren't going thrash - * when we get near ENOSPC. + * Thresholds for re-enabling counters are somewhat magic. inode counts are + * chosen to be the same number as single on disk allocation chunk per CPU, and + * free blocks is something far enough zero that we aren't going thrash when we + * get near ENOSPC. We also need to supply a minimum we require per cpu to + * prevent looping endlessly when xfs_alloc_space asks for more than will + * be distributed to a single CPU but each CPU has enough blocks to be + * reenabled. + * + * Note that we can be called when counters are already disabled. + * xfs_icsb_disable_counter() optimises the counter locking in this case to + * prevent locking every per-cpu counter needlessly. */ -#define XFS_ICSB_INO_CNTR_REENABLE 64 + +#define XFS_ICSB_INO_CNTR_REENABLE (uint64_t)64 #define XFS_ICSB_FDBLK_CNTR_REENABLE(mp) \ - (512 + XFS_ALLOC_SET_ASIDE(mp)) + (uint64_t)(512 + XFS_ALLOC_SET_ASIDE(mp)) STATIC void xfs_icsb_balance_counter( xfs_mount_t *mp, xfs_sb_field_t field, - int flags) + int flags, + int min_per_cpu) { uint64_t count, resid; int weight = num_online_cpus(); int s; + uint64_t min = (uint64_t)min_per_cpu; if (!(flags & XFS_ICSB_SB_LOCKED)) s = XFS_SB_LOCK(mp); @@ -2021,19 +2054,19 @@ xfs_icsb_balance_counter( case XFS_SBS_ICOUNT: count = mp->m_sb.sb_icount; resid = do_div(count, weight); - if (count < XFS_ICSB_INO_CNTR_REENABLE) + if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE)) goto out; break; case XFS_SBS_IFREE: count = mp->m_sb.sb_ifree; resid = do_div(count, weight); - if (count < XFS_ICSB_INO_CNTR_REENABLE) + if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE)) goto out; break; case XFS_SBS_FDBLOCKS: count = mp->m_sb.sb_fdblocks; resid = do_div(count, weight); - if (count < XFS_ICSB_FDBLK_CNTR_REENABLE(mp)) + if (count < max(min, XFS_ICSB_FDBLK_CNTR_REENABLE(mp))) 
goto out; break; default: @@ -2048,32 +2081,39 @@ out: XFS_SB_UNLOCK(mp, s); } -STATIC int -xfs_icsb_modify_counters_int( +int +xfs_icsb_modify_counters( xfs_mount_t *mp, xfs_sb_field_t field, int delta, - int rsvd, - int flags) + int rsvd) { xfs_icsb_cnts_t *icsbp; long long lcounter; /* long counter for 64 bit fields */ - int cpu, s, locked = 0; - int ret = 0, balance_done = 0; + int cpu, ret = 0, s; + might_sleep(); again: cpu = get_cpu(); - icsbp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, cpu), - xfs_icsb_lock_cntr(icsbp); + icsbp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, cpu); + + /* + * if the counter is disabled, go to slow path + */ if (unlikely(xfs_icsb_counter_disabled(mp, field))) goto slow_path; + xfs_icsb_lock_cntr(icsbp); + if (unlikely(xfs_icsb_counter_disabled(mp, field))) { + xfs_icsb_unlock_cntr(icsbp); + goto slow_path; + } switch (field) { case XFS_SBS_ICOUNT: lcounter = icsbp->icsb_icount; lcounter += delta; if (unlikely(lcounter < 0)) - goto slow_path; + goto balance_counter; icsbp->icsb_icount = lcounter; break; @@ -2081,7 +2121,7 @@ again: lcounter = icsbp->icsb_ifree; lcounter += delta; if (unlikely(lcounter < 0)) - goto slow_path; + goto balance_counter; icsbp->icsb_ifree = lcounter; break; @@ -2091,7 +2131,7 @@ again: lcounter = icsbp->icsb_fdblocks - XFS_ALLOC_SET_ASIDE(mp); lcounter += delta; if (unlikely(lcounter < 0)) - goto slow_path; + goto balance_counter; icsbp->icsb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp); break; default: @@ -2100,72 +2140,80 @@ again: } xfs_icsb_unlock_cntr(icsbp); put_cpu(); - if (locked) - XFS_SB_UNLOCK(mp, s); return 0; - /* - * The slow path needs to be run with the SBLOCK - * held so that we prevent other threads from - * attempting to run this path at the same time. - * this provides exclusion for the balancing code, - * and exclusive fallback if the balance does not - * provide enough resources to continue in an unlocked - * manner. 
- */ slow_path: - xfs_icsb_unlock_cntr(icsbp); put_cpu(); - /* need to hold superblock incase we need - * to disable a counter */ - if (!(flags & XFS_ICSB_SB_LOCKED)) { - s = XFS_SB_LOCK(mp); - locked = 1; - flags |= XFS_ICSB_SB_LOCKED; - } - if (!balance_done) { - xfs_icsb_balance_counter(mp, field, flags); - balance_done = 1; + /* + * serialise with a mutex so we don't burn lots of cpu on + * the superblock lock. We still need to hold the superblock + * lock, however, when we modify the global structures. + */ + mutex_lock(&mp->m_icsb_mutex); + + /* + * Now running atomically. + * + * If the counter is enabled, someone has beaten us to rebalancing. + * Drop the lock and try again in the fast path.... + */ + if (!(xfs_icsb_counter_disabled(mp, field))) { + mutex_unlock(&mp->m_icsb_mutex); goto again; - } else { - /* - * we might not have enough on this local - * cpu to allocate for a bulk request. - * We need to drain this field from all CPUs - * and disable the counter fastpath - */ - xfs_icsb_disable_counter(mp, field); } + /* + * The counter is currently disabled. Because we are + * running atomically here, we know a rebalance cannot + * be in progress. Hence we can go straight to operating + * on the global superblock. We do not call xfs_mod_incore_sb() + * here even though we need to get the SB_LOCK. Doing so + * will cause us to re-enter this function and deadlock. + * Hence we get the SB_LOCK ourselves and then call + * xfs_mod_incore_sb_unlocked() as the unlocked path operates + * directly on the global counters. + */ + s = XFS_SB_LOCK(mp); ret = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd); + XFS_SB_UNLOCK(mp, s); - if (locked) - XFS_SB_UNLOCK(mp, s); + /* + * Now that we've modified the global superblock, we + * may be able to re-enable the distributed counters + * (e.g. lots of space just got freed). After that + * we are done. 
+ */ + if (ret != ENOSPC) + xfs_icsb_balance_counter(mp, field, 0, 0); + mutex_unlock(&mp->m_icsb_mutex); return ret; -} -STATIC int -xfs_icsb_modify_counters( - xfs_mount_t *mp, - xfs_sb_field_t field, - int delta, - int rsvd) -{ - return xfs_icsb_modify_counters_int(mp, field, delta, rsvd, 0); -} +balance_counter: + xfs_icsb_unlock_cntr(icsbp); + put_cpu(); -/* - * Called when superblock is already locked - */ -STATIC int -xfs_icsb_modify_counters_locked( - xfs_mount_t *mp, - xfs_sb_field_t field, - int delta, - int rsvd) -{ - return xfs_icsb_modify_counters_int(mp, field, delta, - rsvd, XFS_ICSB_SB_LOCKED); + /* + * We may have multiple threads here if multiple per-cpu + * counters run dry at the same time. This will mean we can + * do more balances than strictly necessary but it is not + * the common slowpath case. + */ + mutex_lock(&mp->m_icsb_mutex); + + /* + * running atomically. + * + * This will leave the counter in the correct state for future + * accesses. After the rebalance, we simply try again but with the + * global superblock lock held. This ensures that the counter state as + * a result of the balance does not change and our retry will either + * succeed through the fast path or slow path without another balance + * operation being required. 
+ */ + xfs_icsb_balance_counter(mp, field, 0, delta); + mutex_unlock(&mp->m_icsb_mutex); + goto again; } + #endif Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.h 2006-10-19 10:25:12.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.h 2006-10-19 10:32:16.626827226 +1000 @@ -419,6 +419,7 @@ typedef struct xfs_mount { xfs_icsb_cnts_t *m_sb_cnts; /* per-cpu superblock counters */ unsigned long m_icsb_counters; /* disabled per-cpu counters */ struct notifier_block m_icsb_notifier; /* hotplug cpu notifier */ + struct mutex m_icsb_mutex; /* balancer sync lock */ #endif } xfs_mount_t; From owner-xfs@oss.sgi.com Thu Nov 30 19:34:19 2006 Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 19:34:26 -0800 (PST) Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kB13YGaG029130 for ; Thu, 30 Nov 2006 19:34:17 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA06015; Fri, 1 Dec 2006 14:33:21 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 1116) id 4F8B658FF58C; Fri, 1 Dec 2006 14:33:21 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com Subject: TAKE 958736 - cleanup old 5.3/6.1 log items - from Eric Message-Id: <20061201033321.4F8B658FF58C@chook.melbourne.sgi.com> Date: Fri, 1 Dec 2006 14:33:21 +1100 (EST) From: tes@sgi.com (Tim Shimmin) X-archive-position: 9852 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1256 Lines: 32 Get rid of old 5.3/6.1 v1 log items. Cleanup patch sent in by Eric Sandeen. 
Signed-off-by: Eric Sandeen Date: Fri Dec 1 14:31:05 AEDT 2006 Workarea: chook.melbourne.sgi.com:/build/tes/2.6.x-xfs Inspected by: tes@sgi.com,sandeen@sandeen.net The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:27596a fs/xfs/xfs_buf_item.h - 1.44 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_buf_item.h.diff?r1=text&tr1=1.44&r2=text&tr2=1.43&f=h - Get rid of old 5.3/6.1 v1 log items. Cleanup patch sent in by Eric Sandeen. Signed-off-by: Eric Sandeen fs/xfs/xfs_log_recover.c - 1.314 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_recover.c.diff?r1=text&tr1=1.314&r2=text&tr2=1.313&f=h - Get rid of old 5.3/6.1 v1 log items. Cleanup patch sent in by Eric Sandeen. Signed-off-by: Eric Sandeen fs/xfs/xfs_trans.h - 1.142 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_trans.h.diff?r1=text&tr1=1.142&r2=text&tr2=1.141&f=h - Get rid of old 5.3/6.1 v1 log items. Cleanup patch sent in by Eric Sandeen. 
	  Signed-off-by: Eric Sandeen

From owner-xfs@oss.sgi.com Thu Nov 30 20:24:25 2006
Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 20:24:33 -0800 (PST)
Received: from mail.g-house.de (ns2.g-housing.de [81.169.133.75]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id kB14OOaG007564 for ; Thu, 30 Nov 2006 20:24:25 -0800
Received: from [82.41.152.154] (helo=82-41-152-154.cable.ubr01.linl.blueyonder.co.uk) by mail.g-house.de with esmtpsa (TLS-1.0:DHE_RSA_AES_256_CBC_SHA:32) (Exim 4.50) id 1Gpzw0-0006ZW-79; Fri, 01 Dec 2006 05:23:36 +0100
Date: Fri, 1 Dec 2006 04:23:41 +0000 (GMT)
From: Christian Kujau
X-X-Sender: evil@sheep.housecafe.de
To: Jasmin Buchert
cc: xfs@oss.sgi.com
Subject: Re: mkfs.xfs questions
In-Reply-To: <20061129174553.e0ef3465.jasmin@pacifica.ch>
Message-ID:
References: <20061129174553.e0ef3465.jasmin@pacifica.ch>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-archive-position: 9853
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: lists@nerdbynature.de
Precedence: bulk
X-list: xfs
Content-Length: 1640
Lines: 46

On Wed, 29 Nov 2006, Jasmin Buchert wrote:
> Is there any real advantage of making the log size 32-64 MB and

From 'man mkfs.xfs':

    If the log is contained within the data section and size isn't
    specified, mkfs.xfs will try to select a suitable log size
    depending on the size of the filesystem. The actual logsize
    depends on the filesystem block size and the directory block
    size. Otherwise, the size suboption is only needed if the log
    section of the filesystem should occupy less space than the size
    of the special file.

So, if you're not limited by very special space restrictions, you won't
need the "size" option.

> what is the difference between log version 1 and 2 regarding
> efficiency/performance?
The "version" option should have no effect on performance, from
'man mkfs.xfs' again:

    Using the version suboption to specify a version 2 log enables
    the sunit suboption, and allows the logbsize to be increased
    beyond 32K.

The "sunit" option can be tweaked to provide better performance in
RAID5 environments, and the same goes for the "agcount" option. Both
are for special needs only, but I'm not aware of any benchmarks for
different sunit/agcount values.

> Is it true that a small agcount is better for most systems
> (Gentoo and some other sources recommend this)? It's a desktop machine.

Hm, some people are indeed suggesting this [0]. You might ask the author
of the doc why he's doing this, or test it yourself.

Christian.

[0] http://www.rootforum.de/wiki/howto/gentoo/basesystem#formatieren_der_partitionen
--
BOFH excuse #244:

Your cat tried to eat the mouse.

From owner-xfs@oss.sgi.com Thu Nov 30 21:17:18 2006
Received: with ECARTIS (v1.0.0; list xfs); Thu, 30 Nov 2006 21:17:26 -0800 (PST)
Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id kB15HEaG013072 for ; Thu, 30 Nov 2006 21:17:16 -0800
Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA08596; Fri, 1 Dec 2006 16:16:19 +1100
Received: by chook.melbourne.sgi.com (Postfix, from userid 1116) id 7864858FF58C; Fri, 1 Dec 2006 16:16:19 +1100 (EST)
To: xfs@oss.sgi.com, sgi.bugs.xfs@engr.sgi.com
Subject: TAKE 958736 - fix up xfsidbg.c for removal of old items
Message-Id: <20061201051619.7864858FF58C@chook.melbourne.sgi.com>
Date: Fri, 1 Dec 2006 16:16:19 +1100 (EST)
From: tes@sgi.com (Tim Shimmin)
X-archive-position: 9854
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: tes@sgi.com
Precedence: bulk
X-list: xfs
Content-Length: 703
Lines: 20

Oops, I lost my kdb in my .config
and so didn't build xfsidbg.c.
So now I need to fix up the corresponding log item changes for xfsidbg.c.

Date:  Fri Dec 1 16:15:37 AEDT 2006
Workarea:  chook.melbourne.sgi.com:/build/tes/2.6.x-xfs
Inspected by:  vapo@sgi.com

The following file(s) were checked into:
  longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb

Modid:  xfs-linux-melb:xfs-kern:27602a
fs/xfs/xfsidbg.c - 1.309 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfsidbg.c.diff?r1=text&tr1=1.309&r2=text&tr2=1.308&f=h
	- Fix up old uses of log items for pv#958736.
	  Also abstract out the item string lookup table.
	  And add in a missing item string (quotaoff) while we are there.
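
As a rough illustration of what "abstracting out the item string lookup
table" can look like, here is a minimal sketch. It is not the xfsidbg.c
code: the `LI_*` type codes, `struct li_name`, `li_names[]` and
`li_type_name()` are hypothetical names and values chosen for the
example, not the real XFS log item definitions.

```c
#include <stddef.h>

/* Hypothetical item type codes; NOT the real XFS_LI_* values. */
enum li_type {
	LI_EFI = 1,
	LI_EFD,
	LI_INODE,
	LI_BUF,
	LI_DQUOT,
	LI_QUOTAOFF,
};

/* One row of the lookup table: a type code and its printable name. */
struct li_name {
	enum li_type	type;
	const char	*name;
};

/* Single shared table, so every caller prints the same strings. */
static const struct li_name li_names[] = {
	{ LI_EFI,	"efi" },
	{ LI_EFD,	"efd" },
	{ LI_INODE,	"inode" },
	{ LI_BUF,	"buf" },
	{ LI_DQUOT,	"dquot" },
	{ LI_QUOTAOFF,	"quotaoff" },
};

/* Map a type code to its name; unknown codes get a fixed fallback. */
static const char *li_type_name(int type)
{
	size_t i;

	for (i = 0; i < sizeof(li_names) / sizeof(li_names[0]); i++)
		if ((int)li_names[i].type == type)
			return li_names[i].name;
	return "unknown";
}
```

With a single table, adding a missing entry such as the quotaoff string
is a one-line change that every user of the lookup picks up.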