From owner-xfs@oss.sgi.com Thu Mar 1 08:16:46 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Mar 2007 08:16:53 -0800 (PST) X-Spam-oss-Status: No, score=0.8 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_45,SUBJECT_FUZZY_TION autolearn=no version=3.2.0-pre1-r499012 Received: from nz-out-0506.google.com (nz-out-0506.google.com [64.233.162.225]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l21GGj6p017391 for ; Thu, 1 Mar 2007 08:16:46 -0800 Received: by nz-out-0506.google.com with SMTP id m22so431561nzf for ; Thu, 01 Mar 2007 08:16:45 -0800 (PST) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=MGFqqYjrGGgO5GbhGLwJU/JeflSFm7x7xpSr0VYQU51WDwQON+I4plRiKrnVZsKSzR7DCFACEKrqzOuiodEIoUCBWMxTq0QhZxJa9wHySw67GJDQpeukuyH6PtSB5skhqrpl135kLsxwsa6FaZUuqQ/NYEqsjRvat9r+j8HtPzE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=dgemHtZbXWQ7Qp+dK5xzzFXvwsRj29/H2s2wEYZQGKshZMywO6gODmGiuI2Ze7OcuZMS6L4JYGwDYWT6XdcVwmF/lNTlm8uNBdu8U3TSa13Mu6sdO4Ln2L9xpJi5X3k/P3PVTN/m7kei6zmvsrwdYtzAwUHLMjO9tk0vUk5ByMI= Received: by 10.65.53.3 with SMTP id f3mr2933459qbk.1172764121946; Thu, 01 Mar 2007 07:48:41 -0800 (PST) Received: by 10.64.148.7 with HTTP; Thu, 1 Mar 2007 07:48:41 -0800 (PST) Message-ID: <29a617af0703010748g51ef81b1x9c0467602d09793@mail.gmail.com> Date: Thu, 1 Mar 2007 16:48:41 +0100 From: "M Kili" To: xfs@oss.sgi.com Subject: can't mount xfs partition MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10738 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: m0kili@gmail.com Precedence: bulk X-list: 
xfs Content-Length: 3485 Lines: 104 Background: Windows XP installation overwrote the MBR with a broken partition layout. I used gpart to figure out the partition table. Primary partitions were detected correctly and can be mounted; unfortunately, all important data are in logical partitions, most of which are XFS. The beginnings of the partitions were found, but the ends might be a little off, I don't know. If I try to mount any of the XFS partitions, I get this message: mount: /dev/hda5: can't read superblock I dd'ed this partition to a file and ran xfs_repair with this result: # xfs_repair -f hda5.img Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify and correct link counts...
done ------------------------------------------------ xfs_check doesn't give any message, and still, the image can't be mounted, with the same error: # mount -o loop hda5.img /mnt/ mount: /dev/loop0: can't read superblock Output of fdisk print partition table (5, 6, 7, and 9 are XFS): Disk /dev/hda: 320.0 GB, 320072933376 bytes 86 heads, 15 sectors/track, 484606 cylinders, total 625142448 sectors Units = sectors of 1 * 512 = 512 bytes Device Boot Start End Blocks Id System /dev/hda1 63 37752742 18876340 83 Linux /dev/hda2 37752750 37833073 40162 83 Linux /dev/hda3 37833075 48323504 5245215 83 Linux /dev/hda4 48323505 625142447 288409471+ 5 Extended /dev/hda5 48323521 87389120 19532800 83 Linux /dev/hda6 87409666 560228673 236409504 83 Linux /dev/hda7 560459656 581660935 10600640 83 Linux /dev/hda8 581681521 582741795 530137+ 82 Linux swap / Solaris /dev/hda9 582741811 625116634 21187412 83 Linux Is there any help? Another question: I'm running now on one older hd, however, it has some read errors. 
Is it possible to use the already discovered Thanks, Marek From owner-xfs@oss.sgi.com Thu Mar 1 08:42:25 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Mar 2007 08:42:30 -0800 (PST) X-Spam-oss-Status: No, score=-1.2 required=5.0 tests=AWL,BAYES_00, SPF_HELO_PASS,SUBJECT_FUZZY_TION autolearn=no version=3.2.0-pre1-r499012 Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l21GgN6p027191 for ; Thu, 1 Mar 2007 08:42:24 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.1/8.13.1) with ESMTP id l21GgGNJ032120; Thu, 1 Mar 2007 11:42:16 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l21GgGVZ031645; Thu, 1 Mar 2007 11:42:16 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l21GgDYi027082; Thu, 1 Mar 2007 11:42:14 -0500 Message-ID: <45E701A2.1020304@sandeen.net> Date: Thu, 01 Mar 2007 10:38:58 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.9 (X11/20070212) MIME-Version: 1.0 To: M Kili CC: xfs@oss.sgi.com Subject: Re: can't mount xfs partition References: <29a617af0703010748g51ef81b1x9c0467602d09793@mail.gmail.com> In-Reply-To: <29a617af0703010748g51ef81b1x9c0467602d09793@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 10740 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 260 Lines: 12 M Kili wrote: > xfs_check doesn't give any message, and still, the image can't be > mounted, with the same error: > # mount -o loop hda5.img /mnt/ > mount: /dev/loop0: can't read superblock look at dmesg output, see what it says. Does "-t xfs" help? 
-Eric From owner-xfs@oss.sgi.com Thu Mar 1 11:04:05 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Mar 2007 11:04:09 -0800 (PST) X-Spam-oss-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from lab41.emea.sgi.com (lab41.emea.sgi.com [144.253.75.41]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l21J446p020323 for ; Thu, 1 Mar 2007 11:04:05 -0800 Received: by lab41.emea.sgi.com (Postfix, from userid 1000) id E9CE752AC0; Thu, 1 Mar 2007 18:56:55 +0000 (GMT) To: xfs@oss.sgi.com Subject: TAKE 961693 - The last argument "lsn" of xfs_trans_commit() is always called with NULL. Message-Id: <20070301185655.E9CE752AC0@lab41.emea.sgi.com> Date: Thu, 1 Mar 2007 18:56:55 +0000 (GMT) From: lachlan@lab41.emea.sgi.com (Lachlan McIlroy) X-archive-position: 10746 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@lab41.emea.sgi.com Precedence: bulk X-list: xfs Content-Length: 3771 Lines: 67 The last argument "lsn" of xfs_trans_commit() is always called with NULL. Patch provided by Eric Sandeen. 
Signed-off-by: Eric Sandeen Date: Fri Mar 2 05:02:20 AEDT 2007 Workarea: vpn-emea-sw-emea-160-20.emea.sgi.com:/home/lachlan/isms/2.6.x-xfs Inspected by: tes sandeen@sandeen.net Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:28199a fs/xfs/xfs_rw.c - 1.398 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rw.c.diff?r1=text&tr1=1.398&r2=text&tr2=1.397&f=h fs/xfs/xfs_vnodeops.c - 1.690 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.690&r2=text&tr2=1.689&f=h fs/xfs/xfs_rtalloc.c - 1.106 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rtalloc.c.diff?r1=text&tr1=1.106&r2=text&tr2=1.105&f=h fs/xfs/xfs_log_recover.c - 1.316 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_recover.c.diff?r1=text&tr1=1.316&r2=text&tr2=1.315&f=h fs/xfs/xfs_vfsops.c - 1.516 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.516&r2=text&tr2=1.515&f=h fs/xfs/xfs_dfrag.c - 1.58 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dfrag.c.diff?r1=text&tr1=1.58&r2=text&tr2=1.57&f=h fs/xfs/xfs_mount.c - 1.393 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.393&r2=text&tr2=1.392&f=h fs/xfs/xfs_inode.c - 1.461 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.461&r2=text&tr2=1.460&f=h fs/xfs/xfs_qmops.c - 1.15 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_qmops.c.diff?r1=text&tr1=1.15&r2=text&tr2=1.14&f=h fs/xfs/xfs_attr_leaf.c - 1.107 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_attr_leaf.c.diff?r1=text&tr1=1.107&r2=text&tr2=1.106&f=h fs/xfs/xfs_trans.c - 1.178 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_trans.c.diff?r1=text&tr1=1.178&r2=text&tr2=1.177&f=h fs/xfs/xfs_trans.h - 1.144 - changed 
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_trans.h.diff?r1=text&tr1=1.144&r2=text&tr2=1.143&f=h fs/xfs/xfs_utils.c - 1.73 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_utils.c.diff?r1=text&tr1=1.73&r2=text&tr2=1.72&f=h fs/xfs/xfs_fsops.c - 1.122 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_fsops.c.diff?r1=text&tr1=1.122&r2=text&tr2=1.121&f=h fs/xfs/xfs_bmap.c - 1.365 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bmap.c.diff?r1=text&tr1=1.365&r2=text&tr2=1.364&f=h fs/xfs/xfs_rename.c - 1.70 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rename.c.diff?r1=text&tr1=1.70&r2=text&tr2=1.69&f=h fs/xfs/xfs_attr.c - 1.143 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_attr.c.diff?r1=text&tr1=1.143&r2=text&tr2=1.142&f=h fs/xfs/quota/xfs_qm_syscalls.c - 1.31 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm_syscalls.c.diff?r1=text&tr1=1.31&r2=text&tr2=1.30&f=h fs/xfs/quota/xfs_dquot.c - 1.29 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_dquot.c.diff?r1=text&tr1=1.29&r2=text&tr2=1.28&f=h fs/xfs/quota/xfs_qm.c - 1.47 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm.c.diff?r1=text&tr1=1.47&r2=text&tr2=1.46&f=h fs/xfs/xfs_iomap.c - 1.51 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_iomap.c.diff?r1=text&tr1=1.51&r2=text&tr2=1.50&f=h fs/xfs/dmapi/xfs_dm.c - 1.33 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/dmapi/xfs_dm.c.diff?r1=text&tr1=1.33&r2=text&tr2=1.32&f=h - The last argument "lsn" of xfs_trans_commit() is always called with NULL. 
Signed-off-by: Eric Sandeen From owner-xfs@oss.sgi.com Thu Mar 1 15:27:02 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Mar 2007 15:27:07 -0800 (PST) X-Spam-oss-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_05 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l21NQx6p014519 for ; Thu, 1 Mar 2007 15:27:00 -0800 Received: from [134.14.55.84] (shark.melbourne.sgi.com [134.14.55.84]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA20336; Fri, 2 Mar 2007 10:26:51 +1100 Message-ID: <45E76138.2020202@sgi.com> Date: Fri, 02 Mar 2007 10:26:48 +1100 From: Donald Douwsma User-Agent: Thunderbird 1.5.0.9 (X11/20070103) MIME-Version: 1.0 To: Utako Kusaka CC: xfs@oss.sgi.com Subject: Re: [PATCH] repquota does't report correct space usage References: <200702280733.AA05017@TNESG9305.tnes.nec.co.jp> In-Reply-To: <200702280733.AA05017@TNESG9305.tnes.nec.co.jp> X-Enigmail-Version: 0.94.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 10750 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Content-Length: 1513 Lines: 38 Utako Kusaka wrote: > Hi, > > repquota may report incorrect space usage when the filesystem is mounted > repeatedly with different quota options. > The cause of the problem is that xfs_qm_quotacheck() is not called because > the `CHKD' flag in mp->m_qflags is not cleared until it is mounted with > no quota option. This patch fixes it. Good find, I've heard of some problems with quota 'corruption' that may actually be caused by this. 
> --- linux-2.6.20-orgn/fs/xfs/quota/xfs_qm.c.orgn 2007-02-22 17:30:07.000000000 +0900 > +++ linux-2.6.20/fs/xfs/xfs_qm.c 2007-02-22 17:30:58.000000000 +0900 > @@ -1175,8 +1175,6 @@ xfs_qm_init_quotainfo( > qinf->qi_dqperchunk = BBTOB(qinf->qi_dqchunklen); > do_div(qinf->qi_dqperchunk, sizeof(xfs_dqblk_t)); > > - mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_CHKD); > - > /* > * We try to get the limits from the superuser's limits fields. > * This is quite hacky, but it is standard quota practice. This disables the optimization that skips the quota check for the normal case where mount options have not been changed. I don't have any quota check performance figures handy but I don't think we can lose this optimization for people with large filesystems/machines. I think instead we need to clear the individual quota bit when a filesystem is mounted without a particular quota type. This will force a quota check but only when the filesystem is again mounted with that quota type. Donald From owner-xfs@oss.sgi.com Thu Mar 1 16:13:18 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Mar 2007 16:13:25 -0800 (PST) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=AWL,BAYES_50,SPF_HELO_PASS, SUBJECT_FUZZY_TION autolearn=no version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l220DH6p022444 for ; Thu, 1 Mar 2007 16:13:18 -0800 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 5B20218030AC3; Thu, 1 Mar 2007 18:13:16 -0600 (CST) Message-ID: <45E76C1C.4040903@sandeen.net> Date: Thu, 01 Mar 2007 18:13:16 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.9 (Macintosh/20061207) MIME-Version: 1.0 To: M Kili , xfs@oss.sgi.com Subject: Re: can't mount xfs partition References: 
<29a617af0703010748g51ef81b1x9c0467602d09793@mail.gmail.com> <45E701A2.1020304@sandeen.net> <29a617af0703011611p4ed51bb4g63b1ec0da42143bc@mail.gmail.com> In-Reply-To: <29a617af0703011611p4ed51bb4g63b1ec0da42143bc@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10751 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 970 Lines: 36 M Kili wrote: > On 3/1/07, Eric Sandeen wrote: >> M Kili wrote: >> > xfs_check doesn't give any message, and still, the image can't be >> > mounted, with the same error: >> > # mount -o loop hda5.img /mnt/ >> > mount: /dev/loop0: can't read superblock >> >> look at dmesg output, see what it says. > > [17206988.072000] SGI XFS with ACLs, security attributes, realtime, > large block numbers, no debug enabled > [17206988.072000] SGI XFS Quota Management subsystem > [17206988.080000] attempt to access beyond end of device > [17206988.080000] hda5: rw=0, want=39086080, limit=39065600 > [17206988.080000] I/O error in filesystem ("hda5") meta-data dev hda5 > block 0x25467ff ("xfs_read_buf") error 5 buf count 512 > [17206988.080000] XFS: size check 2 failed > > I guess I must repartition, am I right? Yep, looks like your partition endpoint is just slightly too small. -Eric >> >> Does "-t xfs" help? 
> > no > >> >> -Eric >> >> > From owner-xfs@oss.sgi.com Thu Mar 1 19:03:07 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Mar 2007 19:03:12 -0800 (PST) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from tyo200.gate.nec.co.jp (TYO200.gate.nec.co.jp [210.143.35.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l223366p023490 for ; Thu, 1 Mar 2007 19:03:07 -0800 Received: from tyo201.gate.nec.co.jp ([10.7.69.201]) by tyo200.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id l22330QB029427 for ; Fri, 2 Mar 2007 12:03:04 +0900 (JST) Received: from mailgate3.nec.co.jp (mailgate53.nec.co.jp [10.7.69.162]) by tyo201.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id l222xpsK002231 for ; Fri, 2 Mar 2007 11:59:51 +0900 (JST) Received: (from root@localhost) by mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id l222xpx05901 for xfs@oss.sgi.com; Fri, 2 Mar 2007 11:59:51 +0900 (JST) Received: from secsv3.tnes.nec.co.jp (tnesvc2.tnes.nec.co.jp [10.1.101.15]) by mailsv4.nec.co.jp (8.11.7/3.7W-MAILSV4-NEC) with ESMTP id l222xpg16589 for ; Fri, 2 Mar 2007 11:59:51 +0900 (JST) Received: from tnesvc2.tnes.nec.co.jp ([10.1.101.15]) by secsv3.tnes.nec.co.jp (ExpressMail 5.10) with SMTP id 20070302.105937.51502548 for ; Fri, 2 Mar 2007 10:59:37 +0900 Received: FROM tnessv1.tnes.nec.co.jp BY tnesvc2.tnes.nec.co.jp ; Fri Mar 02 10:59:37 2007 +0900 Received: from rifu.bsd.tnes.nec.co.jp (rifu.bsd.tnes.nec.co.jp [10.1.104.1]) by tnessv1.tnes.nec.co.jp (Postfix) with ESMTP id CDD6AAE4B3; Fri, 2 Mar 2007 11:59:45 +0900 (JST) Received: from TNESG9305.tnes.nec.co.jp (TNESG9305.bsd.tnes.nec.co.jp [10.1.104.199]) by rifu.bsd.tnes.nec.co.jp (8.12.11/3.7W/BSD-TNES-MX01) with SMTP id l222xoJP030334; Fri, 2 Mar 2007 11:59:50 +0900 Message-Id: <200703020259.AA05023@TNESG9305.tnes.nec.co.jp> From: Utako Kusaka Date: Fri, 02 Mar 2007 11:59:44 +0900 To: Donald Douwsma Cc: xfs@oss.sgi.com Subject: Re: [PATCH] repquota 
does't report correct space usage In-Reply-To: <45E76138.2020202@sgi.com> References: <45E76138.2020202@sgi.com> MIME-Version: 1.0 X-Mailer: AL-Mail32 Version 1.13 Content-Type: text/plain; charset=iso-2022-jp X-archive-position: 10752 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: utako@tnes.nec.co.jp Precedence: bulk X-list: xfs Content-Length: 1697 Lines: 45 Hi, Donald I understand your explanation. I'll think about a new patch for this. Thanks. Fri, 02 Mar 2007 10:26:48 +1100 Donald Douwsma wrote: >Utako Kusaka wrote: >> Hi, >> >> repquota may report incorrect space usage when the filesystem is mounted >> repeatedly with different quota options. >> The cause of the problem is that xfs_qm_quotacheck() is not called because >> the `CHKD' flag in mp->m_qflags is not cleared until it is mounted with >> no quota option. This patch fixes it. > >Good find, I've heard of some problems with quota 'corruption' that may >actually be caused by this. > >> --- linux-2.6.20-orgn/fs/xfs/quota/xfs_qm.c.orgn 2007-02-22 17:30:07.000000000 +0900 >> +++ linux-2.6.20/fs/xfs/xfs_qm.c 2007-02-22 17:30:58.000000000 +0900 >> @@ -1175,8 +1175,6 @@ xfs_qm_init_quotainfo( >> qinf->qi_dqperchunk = BBTOB(qinf->qi_dqchunklen); >> do_div(qinf->qi_dqperchunk, sizeof(xfs_dqblk_t)); >> >> - mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_CHKD); >> - >> /* >> * We try to get the limits from the superuser's limits fields. >> * This is quite hacky, but it is standard quota practice. > >This disables the optimization that skips the quota check for the normal case where mount >options have not been changed. > >I don't have any quota check performance figures handy but I don't think we can lose this >optimization for people with large filesystems/machines. > >I think instead we need to clear the individual quota bit when a filesystem is mounted >without a particular quota type. 
This will force a quota check but only when the filesystem >is again mounted with that quota type. > >Donald > > From owner-xfs@oss.sgi.com Thu Mar 1 22:34:50 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Mar 2007 22:34:56 -0800 (PST) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from tyo201.gate.nec.co.jp (TYO201.gate.nec.co.jp [202.32.8.193]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l226Yl6p028373 for ; Thu, 1 Mar 2007 22:34:50 -0800 Received: from mailgate3.nec.co.jp (mailgate53.nec.co.jp [10.7.69.192]) by tyo201.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id l226YlD1008695 for ; Fri, 2 Mar 2007 15:34:47 +0900 (JST) Received: (from root@localhost) by mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id l226Yk015713 for xfs@oss.sgi.com; Fri, 2 Mar 2007 15:34:46 +0900 (JST) Received: from secsv3.tnes.nec.co.jp (tnesvc2.tnes.nec.co.jp [10.1.101.15]) by mailsv3.nec.co.jp (8.11.7/3.7W-MAILSV4-NEC) with ESMTP id l226YkL00734 for ; Fri, 2 Mar 2007 15:34:46 +0900 (JST) Received: from tnesvc2.tnes.nec.co.jp ([10.1.101.15]) by secsv3.tnes.nec.co.jp (ExpressMail 5.10) with SMTP id 20070302.143432.87504704 for ; Fri, 2 Mar 2007 14:34:32 +0900 Received: FROM tnessv1.tnes.nec.co.jp BY tnesvc2.tnes.nec.co.jp ; Fri Mar 02 14:34:32 2007 +0900 Received: from rifu.bsd.tnes.nec.co.jp (rifu.bsd.tnes.nec.co.jp [10.1.104.1]) by tnessv1.tnes.nec.co.jp (Postfix) with ESMTP id 071E3AE4B3; Fri, 2 Mar 2007 15:34:42 +0900 (JST) Received: from TNESG9305.tnes.nec.co.jp (TNESG9305.bsd.tnes.nec.co.jp [10.1.104.199]) by rifu.bsd.tnes.nec.co.jp (8.12.11/3.7W/BSD-TNES-MX01) with SMTP id l226YiA0015886; Fri, 2 Mar 2007 15:34:44 +0900 Message-Id: <200703020634.AA05027@TNESG9305.tnes.nec.co.jp> Date: Fri, 02 Mar 2007 15:34:34 +0900 To: donaldd@sgi.com, xfs@oss.sgi.com Subject: [PATCH] repquota doesn't report correct space usage #2 From: Utako Kusaka In-Reply-To: <45E76138.2020202@sgi.com> References: 
<45E76138.2020202@sgi.com> MIME-Version: 1.0 X-Mailer: AL-Mail32 Version 1.13 Content-Type: text/plain; charset=iso-2022-jp X-archive-position: 10753 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: utako@tnes.nec.co.jp Precedence: bulk X-list: xfs Content-Length: 2573 Lines: 64 Hi, This new patch skips the quota check when the filesystem is mounted with the same quota option. Signed-off-by: Utako Kusaka --- --- fs/xfs/quota/xfs_qm.c.orgn 2007-02-22 17:30:07.000000000 +0900 +++ fs/xfs/quota/xfs_qm.c 2007-03-02 15:01:44.000000000 +0900 @@ -1175,7 +1175,12 @@ xfs_qm_init_quotainfo( qinf->qi_dqperchunk = BBTOB(qinf->qi_dqchunklen); do_div(qinf->qi_dqperchunk, sizeof(xfs_dqblk_t)); - mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_CHKD); + if (XFS_IS_UQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_UQUOTA_ACCT)) + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_UQUOTA_CHKD); + if (XFS_IS_GQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_GQUOTA_ACCT)) + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_OQUOTA_CHKD); + if (XFS_IS_PQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_PQUOTA_ACCT)) + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_OQUOTA_CHKD); /* * We try to get the limits from the superuser's limits fields. Fri, 02 Mar 2007 10:26:48 +1100 Donald Douwsma wrote: >Utako Kusaka wrote: >> Hi, >> >> repquota may report incorrect space usage when the filesystem is mounted >> repeatedly with different quota options. >> The cause of the problem is that xfs_qm_quotacheck() is not called because >> the `CHKD' flag in mp->m_qflags is not cleared until it is mounted with >> no quota option. This patch fixes it. > >Good find, I've heard of some problems with quota 'corruption' that may >actually be caused by this. 
> >> --- linux-2.6.20-orgn/fs/xfs/quota/xfs_qm.c.orgn 2007-02-22 17:30:07.000000000 +0900 >> +++ linux-2.6.20/fs/xfs/xfs_qm.c 2007-02-22 17:30:58.000000000 +0900 >> @@ -1175,8 +1175,6 @@ xfs_qm_init_quotainfo( >> qinf->qi_dqperchunk = BBTOB(qinf->qi_dqchunklen); >> do_div(qinf->qi_dqperchunk, sizeof(xfs_dqblk_t)); >> >> - mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_CHKD); >> - >> /* >> * We try to get the limits from the superuser's limits fields. >> * This is quite hacky, but it is standard quota practice. > >This disables the optimization that skips the quota check for the normal case where mount >options have not been changed. > >I don't have any quota check performance figures handy but I don't think we can loose this >optimization for people with large filesystems/machines. > >I think instead we need to clear the individual quota bit when a filesystem is mounted >without a particular quota type. This will force a quota check but only when the filesystem >is again mounted with that quota type. > >Donald > > From owner-xfs@oss.sgi.com Fri Mar 2 09:42:55 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 02 Mar 2007 09:43:02 -0800 (PST) X-Spam-oss-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from lab41.emea.sgi.com (lab41.emea.sgi.com [144.253.75.41]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l22Hgr6p016941 for ; Fri, 2 Mar 2007 09:42:54 -0800 Received: by lab41.emea.sgi.com (Postfix, from userid 1000) id A48CF52AD5; Fri, 2 Mar 2007 18:01:46 +0000 (GMT) To: xfs@oss.sgi.com Subject: TAKE 961694 - the "aendp" arg to xfs_dir2_data_freescan is always NULL, remove it. 
Message-Id: <20070302180146.A48CF52AD5@lab41.emea.sgi.com> Date: Fri, 2 Mar 2007 18:01:46 +0000 (GMT) From: lachlan@lab41.emea.sgi.com (Lachlan McIlroy) X-archive-position: 10754 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@lab41.emea.sgi.com Precedence: bulk X-list: xfs Content-Length: 1333 Lines: 34 the "aendp" arg to xfs_dir2_data_freescan is always NULL, remove it. Patch provided by Eric Sandeen. Signed-off-by: Eric Sandeen Date: Sat Mar 3 04:40:38 AEDT 2007 Workarea: vpn-emea-sw-emea-160-1.emea.sgi.com:/home/lachlan/isms/2.6.x-xfs Inspected by: lachlan sandeen@sandeen.net Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:28204a fs/xfs/xfs_dir2_block.c - 1.54 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dir2_block.c.diff?r1=text&tr1=1.54&r2=text&tr2=1.53&f=h fs/xfs/xfs_dir2_data.c - 1.37 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dir2_data.c.diff?r1=text&tr1=1.37&r2=text&tr2=1.36&f=h fs/xfs/xfs_dir2_data.h - 1.20 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dir2_data.h.diff?r1=text&tr1=1.20&r2=text&tr2=1.19&f=h fs/xfs/xfs_dir2_leaf.c - 1.57 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dir2_leaf.c.diff?r1=text&tr1=1.57&r2=text&tr2=1.56&f=h fs/xfs/xfs_dir2_node.c - 1.58 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dir2_node.c.diff?r1=text&tr1=1.58&r2=text&tr2=1.57&f=h - the "aendp" arg to xfs_dir2_data_freescan is always NULL, remove it. 
Signed-off-by: Eric Sandeen From owner-xfs@oss.sgi.com Fri Mar 2 10:10:14 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 02 Mar 2007 10:10:17 -0800 (PST) X-Spam-oss-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from lab41.emea.sgi.com (lab41.emea.sgi.com [144.253.75.41]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l22IAC6p021194 for ; Fri, 2 Mar 2007 10:10:14 -0800 Received: by lab41.emea.sgi.com (Postfix, from userid 1000) id 43F8D52AEA; Fri, 2 Mar 2007 18:29:06 +0000 (GMT) To: xfs@oss.sgi.com Subject: TAKE 961695 - remove more misc. unused args Message-Id: <20070302182906.43F8D52AEA@lab41.emea.sgi.com> Date: Fri, 2 Mar 2007 18:29:06 +0000 (GMT) From: lachlan@lab41.emea.sgi.com (Lachlan McIlroy) X-archive-position: 10755 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@lab41.emea.sgi.com Precedence: bulk X-list: xfs Content-Length: 961 Lines: 31 remove more misc. unused args Patch provided by Eric Sandeen. Signed-off-by: Eric Sandeen Date: Sat Mar 3 04:58:03 AEDT 2007 Workarea: vpn-emea-sw-emea-160-1.emea.sgi.com:/home/lachlan/isms/2.6.x-xfs Inspected by: lachlan sandeen@sandeen.net Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:28205a fs/xfs/xfs_vnodeops.c - 1.691 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.691&r2=text&tr2=1.690&f=h fs/xfs/xfs_log_recover.c - 1.317 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_recover.c.diff?r1=text&tr1=1.317&r2=text&tr2=1.316&f=h fs/xfs/xfs_bmap.c - 1.366 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bmap.c.diff?r1=text&tr1=1.366&r2=text&tr2=1.365&f=h - remove more misc. 
unused args Signed-off-by: Eric Sandeen From owner-xfs@oss.sgi.com Sun Mar 4 22:18:33 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Mar 2007 22:18:37 -0800 (PST) X-Spam-oss-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l256IT6p022537 for ; Sun, 4 Mar 2007 22:18:31 -0800 Received: from [134.14.55.84] (shark.melbourne.sgi.com [134.14.55.84]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA06027; Mon, 5 Mar 2007 17:18:18 +1100 Message-ID: <45EBB613.5040803@sgi.com> Date: Mon, 05 Mar 2007 17:17:55 +1100 From: Donald Douwsma User-Agent: Thunderbird 1.5.0.9 (X11/20070103) MIME-Version: 1.0 To: Utako Kusaka CC: xfs@oss.sgi.com Subject: Re: [PATCH] repquota doesn't report correct space usage #2 References: <45E76138.2020202@sgi.com> <200703020634.AA05027@TNESG9305.tnes.nec.co.jp> In-Reply-To: <200703020634.AA05027@TNESG9305.tnes.nec.co.jp> X-Enigmail-Version: 0.94.0.0 Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit X-archive-position: 10759 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Content-Length: 2045 Lines: 61 Hi Utako, That's closer to what I was thinking of but I'd prefer to do the manipulation separate to init. Putting it in xfs_qm_mount_quotas() minimizes the number of places changes are made to the superblock. We don't need to worry about group/project differences as a quotacheck is forced by XFS_QM_NEED_QUOTACHECK() if there are incompatibilities. 
Signed-off-by: Donald Douwsma --- a/fs/xfs/quota/xfs_qm.c 2007-03-05 16:50:11.000000000 +1100 +++ b/fs/xfs/quota/xfs_qm.c 2007-03-05 15:36:12.000000000 +1100 @@ -388,6 +388,17 @@ xfs_qm_mount_quotas( return XFS_ERROR(error); } } + /* + * If one type of quotas is off, then it will lose its + * quotachecked status, since we won't be doing accounting for + * that type anymore. + */ + if (!XFS_IS_UQUOTA_ON(mp)) { + mp->m_qflags &= ~XFS_UQUOTA_CHKD; + } + if (!(XFS_IS_GQUOTA_ON(mp) || XFS_IS_PQUOTA_ON(mp))) { + mp->m_qflags &= ~XFS_OQUOTA_CHKD; + } write_changes: /* Utako Kusaka wrote: > Hi, > > This new patch skips the quota check when the filesystem is mounted > with the same quota option. > > Signed-off-by: Utako Kusaka > --- > > --- fs/xfs/quota/xfs_qm.c.orgn 2007-02-22 17:30:07.000000000 +0900 > +++ fs/xfs/quota/xfs_qm.c 2007-03-02 15:01:44.000000000 +0900 > @@ -1175,7 +1175,12 @@ xfs_qm_init_quotainfo( > qinf->qi_dqperchunk = BBTOB(qinf->qi_dqchunklen); > do_div(qinf->qi_dqperchunk, sizeof(xfs_dqblk_t)); > > - mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_CHKD); > + if (XFS_IS_UQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_UQUOTA_ACCT)) > + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_UQUOTA_CHKD); > + if (XFS_IS_GQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_GQUOTA_ACCT)) > + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_OQUOTA_CHKD); > + if (XFS_IS_PQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_PQUOTA_ACCT)) > + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_OQUOTA_CHKD); > > /* > * We try to get the limits from the superuser's limits fields. 
From owner-xfs@oss.sgi.com Mon Mar 5 00:37:05 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Mar 2007 00:37:09 -0800 (PST) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from tyo201.gate.nec.co.jp (TYO201.gate.nec.co.jp [202.32.8.193]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l258b26p028680 for ; Mon, 5 Mar 2007 00:37:05 -0800 Received: from mailgate3.nec.co.jp (mailgate54.nec.co.jp [10.7.69.193]) by tyo201.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id l258b0Nm012708 for ; Mon, 5 Mar 2007 17:37:00 +0900 (JST) Received: (from root@localhost) by mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id l258axE05458 for xfs@oss.sgi.com; Mon, 5 Mar 2007 17:36:59 +0900 (JST) Received: from secsv3.tnes.nec.co.jp (tnesvc2.tnes.nec.co.jp [10.1.101.15]) by mailsv5.nec.co.jp (8.11.7/3.7W-MAILSV4-NEC) with ESMTP id l258ax222741 for ; Mon, 5 Mar 2007 17:36:59 +0900 (JST) Received: from tnesvc2.tnes.nec.co.jp ([10.1.101.15]) by secsv3.tnes.nec.co.jp (ExpressMail 5.10) with SMTP id 20070305.163650.35902140 for ; Mon, 5 Mar 2007 16:36:50 +0900 Received: FROM tnessv1.tnes.nec.co.jp BY tnesvc2.tnes.nec.co.jp ; Mon Mar 05 16:36:50 2007 +0900 Received: from rifu.bsd.tnes.nec.co.jp (rifu.bsd.tnes.nec.co.jp [10.1.104.1]) by tnessv1.tnes.nec.co.jp (Postfix) with ESMTP id 9C692AE4B3; Mon, 5 Mar 2007 17:36:55 +0900 (JST) Received: from TNESG9305.tnes.nec.co.jp (TNESG9305.bsd.tnes.nec.co.jp [10.1.104.199]) by rifu.bsd.tnes.nec.co.jp (8.12.11/3.7W/BSD-TNES-MX01) with SMTP id l258awb4003280; Mon, 5 Mar 2007 17:36:58 +0900 Message-Id: <200703050836.AA05033@TNESG9305.tnes.nec.co.jp> Date: Mon, 05 Mar 2007 17:36:50 +0900 To: donaldd@sgi.com Cc: xfs@oss.sgi.com Subject: Re: [PATCH] repquota doesn't report correct space usage #2 From: Utako Kusaka In-Reply-To: <45EBB613.5040803@sgi.com> References: <45EBB613.5040803@sgi.com> MIME-Version: 1.0 X-Mailer: AL-Mail32 Version 1.13 Content-Type: text/plain; 
charset=iso-2022-jp X-archive-position: 10760 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: utako@tnes.nec.co.jp Precedence: bulk X-list: xfs Content-Length: 2229 Lines: 66 Hi Donald, I see. I tested your patch and it was no problem. Mon, 05 Mar 2007 17:17:55 +1100 Donald Douwsma wrote: >Hi Utako, > >That's closer to what I was thinking of but I'd prefer to do the >manipulation separate to init. Putting it in xfs_qm_mount_quotas() >minimizes the number of places changes are made to the superblock. > >We dont need to worry about group/project differences as a >quotacheck is forced by XFS_QM_NEED_QUOTACHECK() if there are >incompatibilities. > > >Signed-off-by: Donald Douwsma > >--- a/fs/xfs/quota/xfs_qm.c 2007-03-05 16:50:11.000000000 +1100 >+++ b/fs/xfs/quota/xfs_qm.c 2007-03-05 15:36:12.000000000 +1100 >@@ -388,6 +388,17 @@ xfs_qm_mount_quotas( > return XFS_ERROR(error); > } > } >+ /* >+ * If one type of quotas is off, then it will lose its >+ * quotachecked status, since we won't be doing accounting for >+ * that type anymore. >+ */ >+ if (!XFS_IS_UQUOTA_ON(mp)) { >+ mp->m_qflags &= ~XFS_UQUOTA_CHKD; >+ } >+ if (!(XFS_IS_GQUOTA_ON(mp) || XFS_IS_PQUOTA_ON(mp))) { >+ mp->m_qflags &= ~XFS_OQUOTA_CHKD; >+ } > > write_changes: > /* > > >Utako Kusaka wrote: >> Hi, >> >> This new patch skips the quota check when the filesystem is mounted >> with the same quota option. 
>> >> Signed-off-by: Utako Kusaka >> --- >> >> --- fs/xfs/quota/xfs_qm.c.orgn 2007-02-22 17:30:07.000000000 +0900 >> +++ fs/xfs/quota/xfs_qm.c 2007-03-02 15:01:44.000000000 +0900 >> @@ -1175,7 +1175,12 @@ xfs_qm_init_quotainfo( >> qinf->qi_dqperchunk = BBTOB(qinf->qi_dqchunklen); >> do_div(qinf->qi_dqperchunk, sizeof(xfs_dqblk_t)); >> >> - mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_ALL_QUOTA_CHKD); >> + if (XFS_IS_UQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_UQUOTA_ACCT)) >> + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_UQUOTA_CHKD); >> + if (XFS_IS_GQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_GQUOTA_ACCT)) >> + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_OQUOTA_CHKD); >> + if (XFS_IS_PQUOTA_ON(mp) && (mp->m_sb.sb_qflags & XFS_PQUOTA_ACCT)) >> + mp->m_qflags |= (mp->m_sb.sb_qflags & XFS_OQUOTA_CHKD); >> >> /* >> * We try to get the limits from the superuser's limits fields. From owner-xfs@oss.sgi.com Mon Mar 5 06:45:18 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Mar 2007 06:45:26 -0800 (PST) X-Spam-oss-Status: No, score=3.1 required=5.0 tests=AWL,BAYES_99,SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012 Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l25EjG6p009021 for ; Mon, 5 Mar 2007 06:45:18 -0800 Received: from root by ciao.gmane.org with local (Exim 4.43) id 1HOEQw-0005Qh-A2 for linux-xfs@oss.sgi.com; Mon, 05 Mar 2007 15:45:02 +0100 Received: from pool-71-163-240-183.washdc.fios.verizon.net ([71.163.240.183]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 05 Mar 2007 15:45:02 +0100 Received: from chaweber by pool-71-163-240-183.washdc.fios.verizon.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 05 Mar 2007 15:45:02 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: linux-xfs@oss.sgi.com From: Chuck Weber Subject: xfs partial dismount issue Date: Mon, 5 Mar 2007 13:13:28 +0000 (UTC) Message-ID: Mime-Version: 1.0 Content-Type: 
text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: main.gmane.org User-Agent: Loom/3.14 (http://gmane.org/) X-Loom-IP: 71.163.240.183 (Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1) Gecko/20061027 Firefox/2.0) X-archive-position: 10761 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: chaweber@gmail.com Precedence: bulk X-list: xfs Content-Length: 4494 Lines: 110 Hi everyone, I have a long running problem perhaps you can help with. I will include as much detail as I can. I can set up a spare server-disk set for testing if you have any bright ideas. We use XFS for samba and nfs on x86_64 Fedora Proliant DL585/385 servers. Our busiest server has disk partitions go away. The other servers do not show this behavior ever. The partitions show as mounted, but access to the partition just hangs. Open file count, process count and load average rise until the server becomes very unresponsive. Even if we catch it before the high load average, because it cannot unmount the partition, it must be powered off and back on to restart. Upon restart all partitions mount properly and everything is fine for days or months. There is nothing in log files that I have noticed. With sar, I can track the files open and process count rise. I believed this to be a hardware issue and embarked on replacing parts along the partition chain. I recently replaced the actual server and saw the same issue the next week, so I don't think it is hardware. The problem is related to XFS/Samba/acl/load usage I think, as I have 2-8 directories set up as samba shares in a given partition. When the problem occurs, first I cannot access a directory, shortly afterward I cannot access the entire partition. This problem has affected 3 partitions so far. Over the last 3 months this has occurred every week or 2. 
Configuration: Proliant DL585, 8GB ram, 2 proc with 3 smartarray 6404 4 channel U320 raid cards. 6 MSA30 dual channel disk carriers with 14 drives each in raid with 2 parity stripes. We started with 72 GB drives and have updated 1 carrier each with 146 GB and 300 GB drives. Each disk carrier is mounted as a single partition, store1 through store6. Example of last mounting problem partition below: /dev/cciss/c3d0p1 on /share/store3 type xfs (rw,logbufs=8) /dev/cciss/c3d0p1 814G 677G 138G 84% /share/store3 meta-data=/dev/cciss/c3d0p1 isize=2048 agcount=32, agsize=6668186 blks = sectsz=512 data = bsize=4096 blocks=213381952, imaxpct=25 = sunit=0 swidth=0 blks, unwritten=1 naming =version 2 bsize=4096 log =internal bsize=4096 blocks=32768, version=1 = sectsz=512 sunit=0 blks realtime =none extsz=65536 blocks=0, rtextents=0 I have added nobarrier and noatime mount options recently from the list but don't see that they affect the problem. For the 300 Gb disk carrier I am using LVM as it runs into a 6404 2TB limit but I only am using 3-400 GB on it so far. All servers are running x86_64 Fedora so I hope not to have the stack issue. The Dl585/3raid controllers/6 disk chassis without problems runs Fedora Core 2 and acts as an NFS server to some computational computers. Another DL585 with only 1 raid controller acts as windows home directory and mail store server. It runs Fedora Core 4/ samba 3.023a. These servers would show the same xfs_info as above on their raid partitions. Both of these servers have no problems and very long uptimes. Our problem server started as Fedora Core 2 and whatever samba we used then. When it first had problems, I upgraded to FC 4 and then to FC5 with samba 3.0.24. I have applied all current HP firmware throughout this process. I have changed out power, disks, disk carriers, scsi cables, and raid controllers. I finally swapped the DL585 for a DL385 with 4 processors and 16 GB ram. None of this made a difference. 
Fedora core 5 2.6.18 and 19 kernels dumped within 1 day of booting with a spinlock error, so I am now running the latest FC5 2.6.17 kernel, which does include the 17.13 patch. I have run HP diagnostics for hours with no results. I have taken the active server offline and run xfs_repair on the partitions. I have reformatted one of the partitions. I have been formatting the partitions with an inode size of 2k and no other options. Current rpms, but note that I have used different versions on this server from FC2 to present and downloaded/built acl/attr/xfsprogs at times all with no difference in my problem: acl-2.2.34-1.2 attr-2.4.28-1.2 samba-3.0.24-1.fc5 xfsprogs-2.7.3-1.2.1 kernel-2.6.17-1.2187_FC5 I could move to ext3, but in my one recent test it ran into trouble just copying acled files from an XFS partition to it. XFS performance seems quite good, with my limiting factor being AD user/group id times. All I can think of now is some resource/tuning/formatting/kernel change. I would appreciate any suggestions you can come up with. 
Thanks, Chuck From owner-xfs@oss.sgi.com Mon Mar 5 08:01:53 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Mar 2007 08:01:59 -0800 (PST) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_50, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l25G1q6p025206 for ; Mon, 5 Mar 2007 08:01:53 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.1/8.13.1) with ESMTP id l25G1asf005842; Mon, 5 Mar 2007 11:01:37 -0500 Received: from pobox-2.corp.redhat.com (pobox-2.corp.redhat.com [10.11.255.15]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l25G1WHW003104; Mon, 5 Mar 2007 11:01:32 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by pobox-2.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l25G1US4022337; Mon, 5 Mar 2007 11:01:31 -0500 Message-ID: <45EC3DEA.3000105@sandeen.net> Date: Mon, 05 Mar 2007 09:57:30 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.9 (X11/20070212) MIME-Version: 1.0 To: Chuck Weber CC: linux-xfs@oss.sgi.com Subject: Re: xfs partial dismount issue References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 10762 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1196 Lines: 27 Chuck Weber wrote: > Hi everyone, I have a long running problem perhaps you can help with. I > will include as much detail as I can. I can set up a spare server-disk > set for testing if you have any bright ideas. > > We use XFS for samba and nfs on x86_64 Fedora Proliant DL585/385 > servers. Our busiest server has disk partitions go away. What do you mean by this, exactly? 
The partitions themselves go away, or are you talking about the problem described below where processes start hanging? > The other > servers do not show this behavior ever. The partitions show as mounted, > but access to the partition just hangs. Open file count, process count > and load average rise until the server becomes very unresponsive. Even > if we catch it before the high load average, because it cannot unmount > the partition, it must be powered off and back on to restart. Upon > restart all partitions mount properly and everything is fine for days or > months. There is nothing in log files that I have noticed. With sar, I > can track the files open and process count rise. Maybe try sysrq-t, to capture all backtraces when it's in this state, and see where the various threads are at. -Eric From owner-xfs@oss.sgi.com Mon Mar 5 08:28:45 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Mar 2007 08:28:51 -0800 (PST) X-Spam-oss-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from lab41.emea.sgi.com (lab41.emea.sgi.com [144.253.75.41]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l25GSi6p002278 for ; Mon, 5 Mar 2007 08:28:44 -0800 Received: by lab41.emea.sgi.com (Postfix, from userid 1000) id 38B2552AEE; Mon, 5 Mar 2007 16:47:48 +0000 (GMT) To: xfs@oss.sgi.com Subject: TAKE 961696 - reducing the number of random number functions. Message-Id: <20070305164748.38B2552AEE@lab41.emea.sgi.com> Date: Mon, 5 Mar 2007 16:47:48 +0000 (GMT) From: lachlan@lab41.emea.sgi.com (Lachlan McIlroy) X-archive-position: 10763 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@lab41.emea.sgi.com Precedence: bulk X-list: xfs Content-Length: 1271 Lines: 32 reducing the number of random number functions. 
Patch provided by Joe Perches Signed-off-by: Joe Perches Date: Tue Mar 6 03:25:54 AEDT 2007 Workarea: vpn-emea-sw-emea-160-34.emea.sgi.com:/home/lachlan/isms/2.6.x-xfs Inspected by: lachlan joe@perches.com Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:28209a fs/xfs/xfs_error.c - 1.56 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_error.c.diff?r1=text&tr1=1.56&r2=text&tr2=1.55&f=h fs/xfs/xfs_alloc.c - 1.185 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_alloc.c.diff?r1=text&tr1=1.185&r2=text&tr2=1.184&f=h fs/xfs/support/debug.h - 1.17 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/support/debug.h.diff?r1=text&tr1=1.17&r2=text&tr2=1.16&f=h fs/xfs/support/debug.c - 1.37 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/support/debug.c.diff?r1=text&tr1=1.37&r2=text&tr2=1.36&f=h fs/xfs/linux-2.6/xfs_ksyms.c - 1.56 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_ksyms.c.diff?r1=text&tr1=1.56&r2=text&tr2=1.55&f=h - reducing the number of random number functions. 
Signed-off-by: Joe Perches From owner-xfs@oss.sgi.com Mon Mar 5 10:26:20 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Mar 2007 10:26:28 -0800 (PST) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=BAYES_50,SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l25IQI6p022416 for ; Mon, 5 Mar 2007 10:26:19 -0800 Received: from list by ciao.gmane.org with local (Exim 4.43) id 1HOHsf-0007SD-KE for linux-xfs@oss.sgi.com; Mon, 05 Mar 2007 19:25:53 +0100 Received: from fernwood-arbiter-b.net.nih.gov ([128.231.88.7]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 05 Mar 2007 19:25:53 +0100 Received: from chaweber by fernwood-arbiter-b.net.nih.gov with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 05 Mar 2007 19:25:53 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: linux-xfs@oss.sgi.com From: Charles Weber Subject: Re: xfs partial dismount issue Date: Mon, 5 Mar 2007 18:25:03 +0000 (UTC) Message-ID: References: <45EC3DEA.3000105@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: main.gmane.org User-Agent: Loom/3.14 (http://gmane.org/) X-Loom-IP: 128.231.88.7 (Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1) Gecko/20061027 Firefox/2.0) X-archive-position: 10764 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: chaweber@gmail.com Precedence: bulk X-list: xfs Content-Length: 1836 Lines: 44 Eric Sandeen sandeen.net> writes: > > Chuck Weber wrote: > > Hi everyone, I have a long running problem perhaps you can help with. I > > will include as much detail as I can. I can set up a spare server-disk > > set for testing if you have any bright ideas. 
> > > > We use XFS for samba and nfs on x86_64 Fedora Proliant DL585/385 > > servers. Our busiest server has disk partitions go away. > > What do you mean by this, exactly? The partitions themselves go away, > or are you talking about the problem described below where processes > start hanging? > Here is an example partition (1 of 6 or more xfs storage only). /share/store3 with samba shares on /share/store3/lls, lds, lxs and so on. I will get a call saying my groups share (lxs) is no longer accessable. I ssh into server and can ls /share/store3 but ls will hang when I ls /share/store3/lxs. Shortly there after ls will hang for the root or any directory on the partition. Other partitions will be fine and other samba shares will be fine until the queued up process load bogs the server down. > > The other > > servers do not show this behavior ever. The partitions show as mounted, > > but access to the partition just hangs. Open file count, process count > > and load average rise until the server becomes very unresponsive. Even > > if we catch it before the high load average, because it cannot unmount > > the partition, it must be powered off and back on to restart. Upon > > restart all partitions mount properly and everything is fine for days or > > months. There is nothing in log files that I have noticed. With sar, I > > can track the files open and process count rise. > > Maybe try sysrq-t, to capture all backtraces when it's in this state, > and see where the various threads are at. 
> OK I'll look over sysrq > -Eric > > From owner-xfs@oss.sgi.com Mon Mar 5 12:47:03 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Mar 2007 12:47:07 -0800 (PST) X-Spam-oss-Status: No, score=2.5 required=5.0 tests=AWL,BAYES_50,RCVD_IN_PSBL autolearn=no version=3.2.0-pre1-r499012 Received: from smtp2.mundo-r.com (smtp6.mundo-r.com [212.51.32.153]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l25Kl16p019638 for ; Mon, 5 Mar 2007 12:47:02 -0800 Received: from cm44039.red83-165.mundo-r.com (HELO [192.168.1.36]) ([83.165.44.39]) by smtp2.mundo-r.com with ESMTP; 05 Mar 2007 21:36:55 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAADoO7EVTpSwn/2dsb2JhbAAN X-IronPort-AV: i="4.14,251,1170630000"; d="scan'208"; a="62726853:sNHT15148644" Message-ID: <45EC7F67.3050308@mundo-r.com> Date: Mon, 05 Mar 2007 21:36:55 +0100 From: Antonio Trueba User-Agent: IceDove 1.5.0.9 (X11/20061220) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: Spanish and Galician translation Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10765 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: atrueba@mundo-r.com Precedence: bulk X-list: xfs Content-Length: 148 Lines: 8 Hello, I'd like to know if someone is already translating XFS to Spanish (es) or Galician (gl). If not, I'd like to do both myself. Regards. 
-- From owner-xfs@oss.sgi.com Mon Mar 5 13:19:34 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Mar 2007 13:19:41 -0800 (PST) X-Spam-oss-Status: No, score=0.8 required=5.0 tests=AWL,BAYES_50, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from mail.atipa.com (125.14.124.24.cm.sunflower.com [24.124.14.125]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l25LJU6p024539 for ; Mon, 5 Mar 2007 13:19:33 -0800 Received: from [192.168.100.181] ([192.168.100.181]) by mail.atipa.com with Microsoft SMTPSVC(6.0.3790.1830); Mon, 5 Mar 2007 15:09:20 -0600 Message-ID: <45EC868A.4060607@atipa.com> Date: Mon, 05 Mar 2007 15:07:22 -0600 From: Roger Heflin User-Agent: Thunderbird 1.5.0.9 (X11/20070102) MIME-Version: 1.0 To: Charles Weber CC: linux-xfs@oss.sgi.com Subject: Re: xfs partial dismount issue References: <45EC3DEA.3000105@sandeen.net> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 05 Mar 2007 21:09:20.0046 (UTC) FILETIME=[8AA304E0:01C75F6A] X-archive-position: 10766 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: rheflin@atipa.com Precedence: bulk X-list: xfs Content-Length: 2097 Lines: 43 Charles Weber wrote: > Eric Sandeen sandeen.net> writes: > >> Chuck Weber wrote: >>> Hi everyone, I have a long running problem perhaps you can help with. I >>> will include as much detail as I can. I can set up a spare server-disk >>> set for testing if you have any bright ideas. >>> >>> We use XFS for samba and nfs on x86_64 Fedora Proliant DL585/385 >>> servers. Our busiest server has disk partitions go away. >> What do you mean by this, exactly? The partitions themselves go away, >> or are you talking about the problem described below where processes >> start hanging? >> > Here is an example partition (1 of 6 or more xfs storage only). 
> /share/store3 with samba shares on /share/store3/lls, lds, lxs and so on. > I will get a call saying my groups share (lxs) is no longer accessable. I ssh > into server and can ls /share/store3 but ls will hang when I ls > /share/store3/lxs. Shortly there after ls will hang for the root or any > directory on the partition. Other partitions will be fine and other samba shares > will be fine until the queued up process load bogs the server down. > Charles, I have seen what may be a similar issue on SLES9SP2: we had 1 xfs partition, and under certain conditions it would stop responding; all non-xfs partitions were ok, and everything was fine after a reboot. Under sysrq-t it appeared to me that 2 separate processes were calling fsync and were causing each other to deadlock (and locking all others out of changing the xfs partition). I was not able to determine exactly what the underlying bug was, but all of the hung processes were waiting on locks in at least several widely different parts of the xfs and kernel code, and adjusting the application to not fsync has apparently resulted in the deadlock not occurring. In this case there were multiple (2-4) different instances of the application calling fsync, apparently sometimes at close to the same time. With the given application the failure was almost a certainty on one machine (of 100) running the application overnight. 
Roger From owner-xfs@oss.sgi.com Mon Mar 5 18:09:37 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Mar 2007 18:09:43 -0800 (PST) X-Spam-oss-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_50, DATE_IN_FUTURE_06_12 autolearn=no version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2629a6p026087 for ; Mon, 5 Mar 2007 18:09:37 -0800 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 7B413AAC319; Tue, 6 Mar 2007 12:50:24 +1100 (EST) Subject: Re: Spanish and Galician translation From: Nathan Scott Reply-To: nscott@aconex.com To: Antonio Trueba Cc: xfs@oss.sgi.com In-Reply-To: <45EC7F67.3050308@mundo-r.com> References: <45EC7F67.3050308@mundo-r.com> Content-Type: text/plain Organization: Aconex Date: Tue, 06 Mar 2007 23:06:24 +1100 Message-Id: <1173182784.5051.7.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 10767 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 591 Lines: 19 On Mon, 2007-03-05 at 21:36 +0100, Antonio Trueba wrote: > Hello, > > I'd like to know if someone is already translating XFS to Spanish (es) > or Galician (gl). If not, I'd like to do both myself. Go for it, no one else is working on those AFAIK. You should probably rebuild the translation database ("cd xfsprogs/po && make xfsprogs.pot" IIRC) first; it has probably not been updated in a while, and there are new strings in xfs_repair at least. Let me know if there are any issues getting it working - I know a little bit about the build system in this area, so can help. cheers. 
-- Nathan From owner-xfs@oss.sgi.com Tue Mar 6 03:10:46 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Mar 2007 03:10:50 -0800 (PST) X-Spam-oss-Status: No, score=1.0 required=5.0 tests=BAYES_60,HTML_MESSAGE autolearn=ham version=3.2.0-pre1-r499012 Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.173]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l26BAh6p018947 for ; Tue, 6 Mar 2007 03:10:45 -0800 Received: by ug-out-1314.google.com with SMTP id a2so138217ugf for ; Tue, 06 Mar 2007 03:10:43 -0800 (PST) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type; b=T8qN4YwWbuyH9Uzh210YVY25zD5LuHjZlS1YcqS0h0ExD5ydzyClEsCCIfC7OnLrnE3ybZWOrLS+4YV5nzJK2gToEm0LzBs71q63MfBImDDsTUsxpGJ7AWahCsFD3dXgk9EjG9A3YnebfgKzoBGtKiHBe5/y/cAZswPBzNNUUyo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:mime-version:content-type; b=rJmS/SfwhBh8POc7SoA6POkvp4bjA3O4vrvOADAVXgIMKDyNnBHqorKuDQa+Z99BcX5uSxRmWz8/h8zzosPF8hD4LxTY3hcpO05FnZn0wwTtU4NmjNYoQnD0bCuDSnLnCA6tRU3xxZKp7DOj5k3V/LjxEylLWFFg0x+EM0bRCoY= Received: by 10.114.166.1 with SMTP id o1mr1649849wae.1173176087391; Tue, 06 Mar 2007 02:14:47 -0800 (PST) Received: by 10.114.13.6 with HTTP; Tue, 6 Mar 2007 02:14:47 -0800 (PST) Message-ID: <5d96567b0703060214n45838ceh9375545613a1557c@mail.gmail.com> Date: Tue, 6 Mar 2007 12:14:47 +0200 From: "Raz Ben-Jehuda(caro)" To: linux-xfs@oss.sgi.com Subject: xfs_repair and extents size MIME-Version: 1.0 Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit X-archive-position: 10768 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: raziebe@gmail.com Precedence: bulk X-list: xfs Content-Length: 517 Lines: 18 Hello. I am having some problems with extents sizes and xfs_repair. 
Problem is that my files and directories are not real time files and i am setting the extent size to 1MB. I have managed to set the extent size to 1M by setting parent directory extent to 1M and setting the inherit flag on. But if I am running xfs_repair all extents on all directories in the file system are being reset to zero. I am using last xfsprogs 2.8.18 downloaded from sgi web site. thank you -- Raz [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Tue Mar 6 15:59:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Mar 2007 15:59:45 -0800 (PST) X-Spam-oss-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l26Nxc6p004839 for ; Tue, 6 Mar 2007 15:59:39 -0800 Received: from pcbnaujok (pc-bnaujok.melbourne.sgi.com [134.14.55.58]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA13317; Wed, 7 Mar 2007 10:48:00 +1100 Message-Id: <200703062348.KAA13317@larry.melbourne.sgi.com> From: "Barry Naujok" To: "'Raz Ben-Jehuda\(caro\)'" , Subject: RE: xfs_repair and extents size Date: Wed, 7 Mar 2007 10:54:24 +1100 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.6353 In-Reply-To: <5d96567b0703060214n45838ceh9375545613a1557c@mail.gmail.com> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028 Thread-Index: Acdf4FGeYt1jC7C8RiOyziv7RZ2ISQAalszQ X-archive-position: 10770 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 947 Lines: 35 Hi Raz, It's quite possible xfs_repair hasn't been updated to handle extent hints on directories and files. I will investigate this. Regards, Barry. 
> -----Original Message----- > From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] > On Behalf Of Raz Ben-Jehuda(caro) > Sent: Tuesday, 6 March 2007 9:15 PM > To: linux-xfs@oss.sgi.com > Subject: xfs_repair and extents size > > Hello. > I am having some problems with extents sizes and xfs_repair. > Problem is that my files and directories are not real time files > and i am setting the extent size to 1MB. I have managed to > set the extent size to 1M by setting parent directory extent > to 1M and setting the inherit flag on. > But if I am running xfs_repair all extents on all directories in > the file system are being reset to zero. > I am using last xfsprogs 2.8.18 downloaded from sgi web > site. > > thank you > -- > Raz > > > [[HTML alternate version deleted]] > > From owner-xfs@oss.sgi.com Tue Mar 6 19:18:16 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Mar 2007 19:18:23 -0800 (PST) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_45,J_CHICKENPOX_61,J_CHICKENPOX_62,J_CHICKENPOX_63, J_CHICKENPOX_65,SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l273ID6p018016 for ; Tue, 6 Mar 2007 19:18:15 -0800 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 525101807DECF for ; Tue, 6 Mar 2007 21:18:12 -0600 (CST) Message-ID: <45EE2EF3.8090707@sandeen.net> Date: Tue, 06 Mar 2007 21:18:11 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: [PATCH] get rid of fname[] for tracing functions Content-Type: multipart/mixed; boundary="------------040201070004000506070208" X-archive-position: 10771 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com 
X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 50903 Lines: 1385 This is a multi-part message in MIME format. --------------040201070004000506070208 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit this gets rid of the #ifdef XFS_BMAP_TRACE static char fname[] = "xfs_iextents_copy"; #endif ugliness littered in the bmap code for tracing, and instead just uses gcc's __FUNCTION__, which never gets out of sync with the actual function name.... It also makes some of this tracing more consistently use the #define XFS_BMBT_TRACE_ARGBI(c,b,i) \ xfs_bmbt_trace_argbi(__FUNCTION__, c, b, i, __LINE__) type constructs, to automatically pick up the gcc extensions. the vn tracing could probably get a similar treatment, so that every call to vn_trace_foo wouldn't have to include a function name and a __builtin_return_address, but could be done via macros... it's currently a mishmash of __FUNCTION__ and "function" *shrug* what do you think? -Eric --------------040201070004000506070208 Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0"; name="no_fname.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="no_fname.patch" Get rid of fname[] arrays for tracing code, and just use gcc's __FUNCTION__ instead. 
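[Editorial illustration, not part of the original patch.] The idea can be sketched outside the kernel. This is a minimal sketch under stated assumptions: every demo_* and DEMO_* name is hypothetical, the trace "buffer" is a single global record rather than the real XFS trace machinery, and it uses the C99 spelling __func__, which gcc treats the same way as __FUNCTION__.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace record, standing in for a real trace buffer entry. */
struct demo_trace {
    const char *func; /* name of the function that emitted the trace */
    int line;         /* source line of the trace call */
    int arg;          /* one traced integer argument */
};

static struct demo_trace demo_last;

/* The tracer takes const char * because __func__ is a const char array. */
static void demo_trace_argi(const char *func, int arg, int line)
{
    demo_last.func = func;
    demo_last.arg = arg;
    demo_last.line = line;
}

/*
 * The macro supplies the caller's name and line number itself, so no
 * function needs its own "static char fname[]" kept in sync by hand.
 */
#define DEMO_TRACE_ARGI(i) demo_trace_argi(__func__, (i), __LINE__)

static void demo_insrec(int i)
{
    DEMO_TRACE_ARGI(i);
}

/* Small self-check: the recorded name matches the calling function. */
static void demo_self_check(void)
{
    demo_insrec(3);
    assert(strcmp(demo_last.func, "demo_insrec") == 0);
    assert(demo_last.arg == 3);
}
```

Renaming demo_insrec() changes the recorded name automatically, which is the property the patch description relies on when it says the name never gets out of sync.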
xfs_alloc.c | 53 ++-------- xfs_bmap.c | 266 +++++++++++++++++++++++-------------------------------- xfs_bmap.h | 6 - xfs_bmap_btree.c | 88 +++--------------- xfs_inode.c | 8 - 5 files changed, 149 insertions(+), 272 deletions(-) Signed-off-by: Eric Sandeen Index: linux/fs/xfs/xfs_bmap_btree.c =================================================================== --- linux.orig/fs/xfs/xfs_bmap_btree.c +++ linux/fs/xfs/xfs_bmap_btree.c @@ -76,7 +76,7 @@ static char EXIT[] = "exit"; */ STATIC void xfs_bmbt_trace_enter( - char *func, + const char *func, xfs_btree_cur_t *cur, char *s, int type, @@ -117,7 +117,7 @@ xfs_bmbt_trace_enter( */ STATIC void xfs_bmbt_trace_argbi( - char *func, + const char *func, xfs_btree_cur_t *cur, xfs_buf_t *b, int i, @@ -134,7 +134,7 @@ xfs_bmbt_trace_argbi( */ STATIC void xfs_bmbt_trace_argbii( - char *func, + const char *func, xfs_btree_cur_t *cur, xfs_buf_t *b, int i0, @@ -153,7 +153,7 @@ xfs_bmbt_trace_argbii( */ STATIC void xfs_bmbt_trace_argfffi( - char *func, + const char *func, xfs_btree_cur_t *cur, xfs_dfiloff_t o, xfs_dfsbno_t b, @@ -172,7 +172,7 @@ xfs_bmbt_trace_argfffi( */ STATIC void xfs_bmbt_trace_argi( - char *func, + const char *func, xfs_btree_cur_t *cur, int i, int line) @@ -188,7 +188,7 @@ xfs_bmbt_trace_argi( */ STATIC void xfs_bmbt_trace_argifk( - char *func, + const char *func, xfs_btree_cur_t *cur, int i, xfs_fsblock_t f, @@ -206,7 +206,7 @@ xfs_bmbt_trace_argifk( */ STATIC void xfs_bmbt_trace_argifr( - char *func, + const char *func, xfs_btree_cur_t *cur, int i, xfs_fsblock_t f, @@ -235,7 +235,7 @@ xfs_bmbt_trace_argifr( */ STATIC void xfs_bmbt_trace_argik( - char *func, + const char *func, xfs_btree_cur_t *cur, int i, xfs_bmbt_key_t *k, @@ -255,7 +255,7 @@ xfs_bmbt_trace_argik( */ STATIC void xfs_bmbt_trace_cursor( - char *func, + const char *func, xfs_btree_cur_t *cur, char *s, int line) @@ -274,21 +274,21 @@ xfs_bmbt_trace_cursor( } #define XFS_BMBT_TRACE_ARGBI(c,b,i) \ - xfs_bmbt_trace_argbi(fname, c, b, i, 
__LINE__) + xfs_bmbt_trace_argbi(__FUNCTION__, c, b, i, __LINE__) #define XFS_BMBT_TRACE_ARGBII(c,b,i,j) \ - xfs_bmbt_trace_argbii(fname, c, b, i, j, __LINE__) + xfs_bmbt_trace_argbii(__FUNCTION__, c, b, i, j, __LINE__) #define XFS_BMBT_TRACE_ARGFFFI(c,o,b,i,j) \ - xfs_bmbt_trace_argfffi(fname, c, o, b, i, j, __LINE__) + xfs_bmbt_trace_argfffi(__FUNCTION__, c, o, b, i, j, __LINE__) #define XFS_BMBT_TRACE_ARGI(c,i) \ - xfs_bmbt_trace_argi(fname, c, i, __LINE__) + xfs_bmbt_trace_argi(__FUNCTION__, c, i, __LINE__) #define XFS_BMBT_TRACE_ARGIFK(c,i,f,s) \ - xfs_bmbt_trace_argifk(fname, c, i, f, s, __LINE__) + xfs_bmbt_trace_argifk(__FUNCTION__, c, i, f, s, __LINE__) #define XFS_BMBT_TRACE_ARGIFR(c,i,f,r) \ - xfs_bmbt_trace_argifr(fname, c, i, f, r, __LINE__) + xfs_bmbt_trace_argifr(__FUNCTION__, c, i, f, r, __LINE__) #define XFS_BMBT_TRACE_ARGIK(c,i,k) \ - xfs_bmbt_trace_argik(fname, c, i, k, __LINE__) + xfs_bmbt_trace_argik(__FUNCTION__, c, i, k, __LINE__) #define XFS_BMBT_TRACE_CURSOR(c,s) \ - xfs_bmbt_trace_cursor(fname, c, s, __LINE__) + xfs_bmbt_trace_cursor(__FUNCTION__, c, s, __LINE__) #else #define XFS_BMBT_TRACE_ARGBI(c,b,i) #define XFS_BMBT_TRACE_ARGBII(c,b,i,j) @@ -318,9 +318,6 @@ xfs_bmbt_delrec( xfs_fsblock_t bno; /* fs-relative block number */ xfs_buf_t *bp; /* buffer for block */ int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_delrec"; -#endif int i; /* loop counter */ int j; /* temp state */ xfs_bmbt_key_t key; /* bmap btree key */ @@ -694,9 +691,6 @@ xfs_bmbt_insrec( xfs_bmbt_block_t *block; /* bmap btree block */ xfs_buf_t *bp; /* buffer for block */ int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_insrec"; -#endif int i; /* loop index */ xfs_bmbt_key_t key; /* bmap btree key */ xfs_bmbt_key_t *kp=NULL; /* pointer to bmap btree key */ @@ -881,9 +875,6 @@ xfs_bmbt_killroot( #ifdef DEBUG int error; #endif -#ifdef XFS_BMBT_TRACE - static char fname[] = 
"xfs_bmbt_killroot"; -#endif int i; xfs_bmbt_key_t *kp; xfs_inode_t *ip; @@ -973,9 +964,6 @@ xfs_bmbt_log_keys( int kfirst, int klast) { -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_log_keys"; -#endif xfs_trans_t *tp; XFS_BMBT_TRACE_CURSOR(cur, ENTRY); @@ -1012,9 +1000,6 @@ xfs_bmbt_log_ptrs( int pfirst, int plast) { -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_log_ptrs"; -#endif xfs_trans_t *tp; XFS_BMBT_TRACE_CURSOR(cur, ENTRY); @@ -1055,9 +1040,6 @@ xfs_bmbt_lookup( xfs_daddr_t d; xfs_sfiloff_t diff; int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_lookup"; -#endif xfs_fsblock_t fsbno=0; int high; int i; @@ -1195,9 +1177,6 @@ xfs_bmbt_lshift( int *stat) /* success/failure */ { int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_lshift"; -#endif #ifdef DEBUG int i; /* loop counter */ #endif @@ -1331,9 +1310,6 @@ xfs_bmbt_rshift( int *stat) /* success/failure */ { int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_rshift"; -#endif int i; /* loop counter */ xfs_bmbt_key_t key; /* bmap btree key */ xfs_buf_t *lbp; /* left buffer pointer */ @@ -1492,9 +1468,6 @@ xfs_bmbt_split( { xfs_alloc_arg_t args; /* block allocation args */ int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_split"; -#endif int i; /* loop counter */ xfs_fsblock_t lbno; /* left sibling block number */ xfs_buf_t *lbp; /* left buffer pointer */ @@ -1641,9 +1614,6 @@ xfs_bmbt_updkey( #ifdef DEBUG int error; #endif -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_updkey"; -#endif xfs_bmbt_key_t *kp; int ptr; @@ -1712,9 +1682,6 @@ xfs_bmbt_decrement( xfs_bmbt_block_t *block; xfs_buf_t *bp; int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_decrement"; -#endif xfs_fsblock_t fsbno; int lev; xfs_mount_t *mp; @@ -1785,9 +1752,6 @@ xfs_bmbt_delete( int *stat) /* 
success/failure */ { int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_delete"; -#endif int i; int level; @@ -2000,9 +1964,6 @@ xfs_bmbt_increment( xfs_bmbt_block_t *block; xfs_buf_t *bp; int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_increment"; -#endif xfs_fsblock_t fsbno; int lev; xfs_mount_t *mp; @@ -2080,9 +2041,6 @@ xfs_bmbt_insert( int *stat) /* success/failure */ { int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_insert"; -#endif int i; int level; xfs_fsblock_t nbno; @@ -2142,9 +2100,6 @@ xfs_bmbt_log_block( int fields) { int first; -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_log_block"; -#endif int last; xfs_trans_t *tp; static const short offsets[] = { @@ -2181,9 +2136,6 @@ xfs_bmbt_log_recs( { xfs_bmbt_block_t *block; int first; -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_log_recs"; -#endif int last; xfs_bmbt_rec_t *rp; xfs_trans_t *tp; @@ -2245,9 +2197,6 @@ xfs_bmbt_newroot( xfs_bmbt_key_t *ckp; /* child key pointer */ xfs_bmbt_ptr_t *cpp; /* child ptr pointer */ int error; /* error return code */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_newroot"; -#endif #ifdef DEBUG int i; /* loop counter */ #endif @@ -2630,9 +2579,6 @@ xfs_bmbt_update( xfs_bmbt_block_t *block; xfs_buf_t *bp; int error; -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_update"; -#endif xfs_bmbt_key_t key; int ptr; xfs_bmbt_rec_t *rp; Index: linux/fs/xfs/xfs_alloc.c =================================================================== --- linux.orig/fs/xfs/xfs_alloc.c +++ linux/fs/xfs/xfs_alloc.c @@ -55,17 +55,17 @@ xfs_alloc_search_busy(xfs_trans_t *tp, ktrace_t *xfs_alloc_trace_buf; #define TRACE_ALLOC(s,a) \ - xfs_alloc_trace_alloc(fname, s, a, __LINE__) + xfs_alloc_trace_alloc(__FUNCTION__, s, a, __LINE__) #define TRACE_FREE(s,a,b,x,f) \ - xfs_alloc_trace_free(fname, s, mp, a, b, x, f, __LINE__) + 
xfs_alloc_trace_free(__FUNCTION__, s, mp, a, b, x, f, __LINE__) #define TRACE_MODAGF(s,a,f) \ - xfs_alloc_trace_modagf(fname, s, mp, a, f, __LINE__) -#define TRACE_BUSY(fname,s,ag,agb,l,sl,tp) \ - xfs_alloc_trace_busy(fname, s, mp, ag, agb, l, sl, tp, XFS_ALLOC_KTRACE_BUSY, __LINE__) -#define TRACE_UNBUSY(fname,s,ag,sl,tp) \ - xfs_alloc_trace_busy(fname, s, mp, ag, -1, -1, sl, tp, XFS_ALLOC_KTRACE_UNBUSY, __LINE__) -#define TRACE_BUSYSEARCH(fname,s,ag,agb,l,sl,tp) \ - xfs_alloc_trace_busy(fname, s, mp, ag, agb, l, sl, tp, XFS_ALLOC_KTRACE_BUSYSEARCH, __LINE__) + xfs_alloc_trace_modagf(__FUNCTION__, s, mp, a, f, __LINE__) +#define TRACE_BUSY(__FUNCTION__,s,ag,agb,l,sl,tp) \ + xfs_alloc_trace_busy(__FUNCTION__, s, mp, ag, agb, l, sl, tp, XFS_ALLOC_KTRACE_BUSY, __LINE__) +#define TRACE_UNBUSY(__FUNCTION__,s,ag,sl,tp) \ + xfs_alloc_trace_busy(__FUNCTION__, s, mp, ag, -1, -1, sl, tp, XFS_ALLOC_KTRACE_UNBUSY, __LINE__) +#define TRACE_BUSYSEARCH(__FUNCTION__,s,ag,agb,l,sl,tp) \ + xfs_alloc_trace_busy(__FUNCTION__, s, mp, ag, agb, l, sl, tp, XFS_ALLOC_KTRACE_BUSYSEARCH, __LINE__) #else #define TRACE_ALLOC(s,a) #define TRACE_FREE(s,a,b,x,f) @@ -420,7 +420,7 @@ xfs_alloc_read_agfl( */ STATIC void xfs_alloc_trace_alloc( - char *name, /* function tag string */ + const char *name, /* function tag string */ char *str, /* additional string */ xfs_alloc_arg_t *args, /* allocation argument structure */ int line) /* source line number */ @@ -453,7 +453,7 @@ xfs_alloc_trace_alloc( */ STATIC void xfs_alloc_trace_free( - char *name, /* function tag string */ + const char *name, /* function tag string */ char *str, /* additional string */ xfs_mount_t *mp, /* file system mount point */ xfs_agnumber_t agno, /* allocation group number */ @@ -479,7 +479,7 @@ xfs_alloc_trace_free( */ STATIC void xfs_alloc_trace_modagf( - char *name, /* function tag string */ + const char *name, /* function tag string */ char *str, /* additional string */ xfs_mount_t *mp, /* file system mount point */ 
xfs_agf_t *agf, /* new agf value */ @@ -507,7 +507,7 @@ xfs_alloc_trace_modagf( STATIC void xfs_alloc_trace_busy( - char *name, /* function tag string */ + const char *name, /* function tag string */ char *str, /* additional string */ xfs_mount_t *mp, /* file system mount point */ xfs_agnumber_t agno, /* allocation group number */ @@ -549,9 +549,6 @@ xfs_alloc_ag_vextent( xfs_alloc_arg_t *args) /* argument structure for allocation */ { int error=0; -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent"; -#endif ASSERT(args->minlen > 0); ASSERT(args->maxlen > 0); @@ -635,9 +632,6 @@ xfs_alloc_ag_vextent_exact( xfs_agblock_t fbno; /* start block of found extent */ xfs_agblock_t fend; /* end block of found extent */ xfs_extlen_t flen; /* length of found extent */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent_exact"; -#endif int i; /* success/failure of operation */ xfs_agblock_t maxend; /* end of maximal extent */ xfs_agblock_t minend; /* end of minimal extent */ @@ -737,9 +731,6 @@ xfs_alloc_ag_vextent_near( xfs_btree_cur_t *bno_cur_gt; /* cursor for bno btree, right side */ xfs_btree_cur_t *bno_cur_lt; /* cursor for bno btree, left side */ xfs_btree_cur_t *cnt_cur; /* cursor for count btree */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent_near"; -#endif xfs_agblock_t gtbno; /* start bno of right side entry */ xfs_agblock_t gtbnoa; /* aligned ... 
*/ xfs_extlen_t gtdiff; /* difference to right side entry */ @@ -1270,9 +1261,6 @@ xfs_alloc_ag_vextent_size( int error; /* error result */ xfs_agblock_t fbno; /* start of found freespace */ xfs_extlen_t flen; /* length of found freespace */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent_size"; -#endif int i; /* temp status variable */ xfs_agblock_t rbno; /* returned block number */ xfs_extlen_t rlen; /* length of returned extent */ @@ -1427,9 +1415,6 @@ xfs_alloc_ag_vextent_small( int error; xfs_agblock_t fbno; xfs_extlen_t flen; -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent_small"; -#endif int i; if ((error = xfs_alloc_decrement(ccur, 0, &i))) @@ -1515,9 +1500,6 @@ xfs_free_ag_extent( xfs_btree_cur_t *bno_cur; /* cursor for by-block btree */ xfs_btree_cur_t *cnt_cur; /* cursor for by-size btree */ int error; /* error return value */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_free_ag_extent"; -#endif xfs_agblock_t gtbno; /* start of right neighbor block */ xfs_extlen_t gtlen; /* length of right neighbor block */ int haveleft; /* have a left neighbor block */ @@ -1998,9 +1980,6 @@ xfs_alloc_get_freelist( xfs_buf_t *agflbp;/* buffer for a.g. freelist structure */ xfs_agblock_t bno; /* block number returned */ int error; -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_get_freelist"; -#endif xfs_mount_t *mp; /* mount structure */ xfs_perag_t *pag; /* per allocation group data */ @@ -2112,9 +2091,6 @@ xfs_alloc_put_freelist( xfs_agfl_t *agfl; /* a.g. free block array */ __be32 *blockp;/* pointer to array entry */ int error; -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_put_freelist"; -#endif xfs_mount_t *mp; /* mount structure */ xfs_perag_t *pag; /* per allocation group data */ @@ -2235,9 +2211,6 @@ xfs_alloc_vextent( xfs_agblock_t agsize; /* allocation group size */ int error; int flags; /* XFS_ALLOC_FLAG_... 
locking flags */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_vextent"; -#endif xfs_extlen_t minleft;/* minimum left value, temp copy */ xfs_mount_t *mp; /* mount structure pointer */ xfs_agnumber_t sagno; /* starting allocation group number */ Index: linux/fs/xfs/xfs_bmap.c =================================================================== --- linux.orig/fs/xfs/xfs_bmap.c +++ linux/fs/xfs/xfs_bmap.c @@ -277,7 +277,7 @@ xfs_bmap_isaeof( STATIC void xfs_bmap_trace_addentry( int opcode, /* operation */ - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(ies) */ @@ -291,7 +291,7 @@ xfs_bmap_trace_addentry( */ STATIC void xfs_bmap_trace_delete( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(entries) deleted */ @@ -304,7 +304,7 @@ xfs_bmap_trace_delete( */ STATIC void xfs_bmap_trace_insert( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(entries) inserted */ @@ -318,7 +318,7 @@ xfs_bmap_trace_insert( */ STATIC void xfs_bmap_trace_post_update( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry updated */ @@ -329,17 +329,25 @@ xfs_bmap_trace_post_update( */ STATIC void xfs_bmap_trace_pre_update( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry to be updated */ int whichfork); /* data or attr fork */ +#define 
XFS_BMAP_TRACE_DELETE(d,ip,i,c,w) \ + xfs_bmap_trace_delete(__FUNCTION__,d,ip,i,c,w) +#define XFS_BMAP_TRACE_INSERT(d,ip,i,c,r1,r2,w) \ + xfs_bmap_trace_insert(__FUNCTION__,d,ip,i,c,r1,r2,w) +#define XFS_BMAP_TRACE_POST_UPDATE(d,ip,i,w) \ + xfs_bmap_trace_post_update(__FUNCTION__,d,ip,i,w) +#define XFS_BMAP_TRACE_PRE_UPDATE(d,ip,i,w) \ + xfs_bmap_trace_pre_update(__FUNCTION__,d,ip,i,w) #else -#define xfs_bmap_trace_delete(f,d,ip,i,c,w) -#define xfs_bmap_trace_insert(f,d,ip,i,c,r1,r2,w) -#define xfs_bmap_trace_post_update(f,d,ip,i,w) -#define xfs_bmap_trace_pre_update(f,d,ip,i,w) +#define XFS_BMAP_TRACE_DELETE(d,ip,i,c,w) +#define XFS_BMAP_TRACE_INSERT(d,ip,i,c,r1,r2,w) +#define XFS_BMAP_TRACE_POST_UPDATE(d,ip,i,w) +#define XFS_BMAP_TRACE_PRE_UPDATE(d,ip,i,w) #endif /* XFS_BMAP_TRACE */ /* @@ -531,9 +539,6 @@ xfs_bmap_add_extent( xfs_filblks_t da_new; /* new count del alloc blocks used */ xfs_filblks_t da_old; /* old count del alloc blocks used */ int error; /* error return value */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent"; -#endif xfs_ifork_t *ifp; /* inode fork ptr */ int logflags; /* returned value */ xfs_extnum_t nextents; /* number of extents in file now */ @@ -551,8 +556,8 @@ xfs_bmap_add_extent( * already extents in the list. 
*/ if (nextents == 0) { - xfs_bmap_trace_insert(fname, "insert empty", ip, 0, 1, new, - NULL, whichfork); + XFS_BMAP_TRACE_INSERT("insert empty", ip, 0, 1, new, NULL, + whichfork); xfs_iext_insert(ifp, 0, 1, new); ASSERT(cur == NULL); ifp->if_lastex = 0; @@ -710,9 +715,6 @@ xfs_bmap_add_extent_delay_real( int diff; /* temp value */ xfs_bmbt_rec_t *ep; /* extent entry for idx */ int error; /* error return value */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent_delay_real"; -#endif int i; /* temp state */ xfs_ifork_t *ifp; /* inode fork pointer */ xfs_fileoff_t new_endoff; /* end offset of new entry */ @@ -808,15 +810,14 @@ xfs_bmap_add_extent_delay_real( * Filling in all of a previously delayed allocation extent. * The left and right neighbors are both contiguous with new. */ - xfs_bmap_trace_pre_update(fname, "LF|RF|LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|LC|RC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + PREV.br_blockcount + RIGHT.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|LC|RC", ip, idx - 1, - XFS_DATA_FORK); - xfs_bmap_trace_delete(fname, "LF|RF|LC|RC", ip, idx, 2, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|LC|RC", ip, idx - 1, XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|LC|RC", ip, idx, 2, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 2); ip->i_df.if_lastex = idx - 1; ip->i_d.di_nextents--; @@ -855,15 +856,14 @@ xfs_bmap_add_extent_delay_real( * Filling in all of a previously delayed allocation extent. * The left neighbor is contiguous, the right is not. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|RF|LC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + PREV.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|LC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|LC", ip, idx - 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx - 1; - xfs_bmap_trace_delete(fname, "LF|RF|LC", ip, idx, 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|LC", ip, idx, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 1); if (cur == NULL) rval = XFS_ILOG_DEXT; @@ -892,16 +892,13 @@ xfs_bmap_add_extent_delay_real( * Filling in all of a previously delayed allocation extent. * The right neighbor is contiguous, the left is not. */ - xfs_bmap_trace_pre_update(fname, "LF|RF|RC", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|RC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_startblock(ep, new->br_startblock); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount + RIGHT.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|RC", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|RC", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; - xfs_bmap_trace_delete(fname, "LF|RF|RC", ip, idx + 1, 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|RC", ip, idx + 1, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx + 1, 1); if (cur == NULL) rval = XFS_ILOG_DEXT; @@ -931,11 +928,9 @@ xfs_bmap_add_extent_delay_real( * Neither the left nor right neighbors are contiguous with * the new one. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|RF", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_startblock(ep, new->br_startblock); - xfs_bmap_trace_post_update(fname, "LF|RF", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF|RF", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; ip->i_d.di_nextents++; if (cur == NULL) @@ -963,17 +958,14 @@ xfs_bmap_add_extent_delay_real( * Filling in the first part of a previous delayed allocation. * The left neighbor is contiguous. */ - xfs_bmap_trace_pre_update(fname, "LF|LC", ip, idx - 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF|LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + new->br_blockcount); xfs_bmbt_set_startoff(ep, PREV.br_startoff + new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|LC", ip, idx - 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF|LC", ip, idx - 1, XFS_DATA_FORK); temp = PREV.br_blockcount - new->br_blockcount; - xfs_bmap_trace_pre_update(fname, "LF|LC", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF|LC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, temp); ip->i_df.if_lastex = idx - 1; if (cur == NULL) @@ -995,8 +987,7 @@ xfs_bmap_add_extent_delay_real( temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), STARTBLOCKVAL(PREV.br_startblock)); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "LF|LC", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF|LC", ip, idx, XFS_DATA_FORK); *dnew = temp; /* DELTA: The boundary between two in-core extents moved. */ temp = LEFT.br_startoff; @@ -1009,11 +1000,11 @@ xfs_bmap_add_extent_delay_real( * Filling in the first part of a previous delayed allocation. * The left neighbor is not contiguous. 
*/ - xfs_bmap_trace_pre_update(fname, "LF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_startoff(ep, new_endoff); temp = PREV.br_blockcount - new->br_blockcount; xfs_bmbt_set_blockcount(ep, temp); - xfs_bmap_trace_insert(fname, "LF", ip, idx, 1, new, NULL, + XFS_BMAP_TRACE_INSERT("LF", ip, idx, 1, new, NULL, XFS_DATA_FORK); xfs_iext_insert(ifp, idx, 1, new); ip->i_df.if_lastex = idx; @@ -1046,8 +1037,7 @@ xfs_bmap_add_extent_delay_real( (cur ? cur->bc_private.b.allocated : 0)); ep = xfs_iext_get_ext(ifp, idx + 1); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "LF", ip, idx + 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF", ip, idx + 1, XFS_DATA_FORK); *dnew = temp; /* DELTA: One in-core extent is split in two. */ temp = PREV.br_startoff; @@ -1060,17 +1050,14 @@ xfs_bmap_add_extent_delay_real( * The right neighbor is contiguous with the new allocation. */ temp = PREV.br_blockcount - new->br_blockcount; - xfs_bmap_trace_pre_update(fname, "RF|RC", ip, idx, - XFS_DATA_FORK); - xfs_bmap_trace_pre_update(fname, "RF|RC", ip, idx + 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RF|RC", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RF|RC", ip, idx + 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, temp); xfs_bmbt_set_allf(xfs_iext_get_ext(ifp, idx + 1), new->br_startoff, new->br_startblock, new->br_blockcount + RIGHT.br_blockcount, RIGHT.br_state); - xfs_bmap_trace_post_update(fname, "RF|RC", ip, idx + 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("RF|RC", ip, idx + 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx + 1; if (cur == NULL) rval = XFS_ILOG_DEXT; @@ -1091,8 +1078,7 @@ xfs_bmap_add_extent_delay_real( temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), STARTBLOCKVAL(PREV.br_startblock)); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "RF|RC", ip, idx, - XFS_DATA_FORK); + 
XFS_BMAP_TRACE_POST_UPDATE("RF|RC", ip, idx, XFS_DATA_FORK); *dnew = temp; /* DELTA: The boundary between two in-core extents moved. */ temp = PREV.br_startoff; @@ -1106,10 +1092,10 @@ xfs_bmap_add_extent_delay_real( * The right neighbor is not contiguous. */ temp = PREV.br_blockcount - new->br_blockcount; - xfs_bmap_trace_pre_update(fname, "RF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, temp); - xfs_bmap_trace_insert(fname, "RF", ip, idx + 1, 1, - new, NULL, XFS_DATA_FORK); + XFS_BMAP_TRACE_INSERT("RF", ip, idx + 1, 1, new, NULL, + XFS_DATA_FORK); xfs_iext_insert(ifp, idx + 1, 1, new); ip->i_df.if_lastex = idx + 1; ip->i_d.di_nextents++; @@ -1141,7 +1127,7 @@ xfs_bmap_add_extent_delay_real( (cur ? cur->bc_private.b.allocated : 0)); ep = xfs_iext_get_ext(ifp, idx); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "RF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("RF", ip, idx, XFS_DATA_FORK); *dnew = temp; /* DELTA: One in-core extent is split in two. */ temp = PREV.br_startoff; @@ -1155,7 +1141,7 @@ xfs_bmap_add_extent_delay_real( * This case is avoided almost all the time. 
*/ temp = new->br_startoff - PREV.br_startoff; - xfs_bmap_trace_pre_update(fname, "0", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("0", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, temp); r[0] = *new; r[1].br_state = PREV.br_state; @@ -1163,7 +1149,7 @@ xfs_bmap_add_extent_delay_real( r[1].br_startoff = new_endoff; temp2 = PREV.br_startoff + PREV.br_blockcount - new_endoff; r[1].br_blockcount = temp2; - xfs_bmap_trace_insert(fname, "0", ip, idx + 1, 2, &r[0], &r[1], + XFS_BMAP_TRACE_INSERT("0", ip, idx + 1, 2, &r[0], &r[1], XFS_DATA_FORK); xfs_iext_insert(ifp, idx + 1, 2, &r[0]); ip->i_df.if_lastex = idx + 1; @@ -1222,13 +1208,11 @@ xfs_bmap_add_extent_delay_real( } ep = xfs_iext_get_ext(ifp, idx); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "0", ip, idx, XFS_DATA_FORK); - xfs_bmap_trace_pre_update(fname, "0", ip, idx + 2, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("0", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("0", ip, idx + 2, XFS_DATA_FORK); xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, idx + 2), NULLSTARTBLOCK((int)temp2)); - xfs_bmap_trace_post_update(fname, "0", ip, idx + 2, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("0", ip, idx + 2, XFS_DATA_FORK); *dnew = temp + temp2; /* DELTA: One in-core extent is split in three. */ temp = PREV.br_startoff; @@ -1287,9 +1271,6 @@ xfs_bmap_add_extent_unwritten_real( xfs_btree_cur_t *cur; /* btree cursor */ xfs_bmbt_rec_t *ep; /* extent entry for idx */ int error; /* error return value */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent_unwritten_real"; -#endif int i; /* temp state */ xfs_ifork_t *ifp; /* inode fork pointer */ xfs_fileoff_t new_endoff; /* end offset of new entry */ @@ -1390,15 +1371,14 @@ xfs_bmap_add_extent_unwritten_real( * Setting all of a previous oldext extent to newext. * The left and right neighbors are both contiguous with new. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|RF|LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|LC|RC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + PREV.br_blockcount + RIGHT.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|LC|RC", ip, idx - 1, - XFS_DATA_FORK); - xfs_bmap_trace_delete(fname, "LF|RF|LC|RC", ip, idx, 2, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|LC|RC", ip, idx - 1, XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|LC|RC", ip, idx, 2, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 2); ip->i_df.if_lastex = idx - 1; ip->i_d.di_nextents -= 2; @@ -1441,15 +1421,14 @@ xfs_bmap_add_extent_unwritten_real( * Setting all of a previous oldext extent to newext. * The left neighbor is contiguous, the right is not. */ - xfs_bmap_trace_pre_update(fname, "LF|RF|LC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + PREV.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|LC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|LC", ip, idx - 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx - 1; - xfs_bmap_trace_delete(fname, "LF|RF|LC", ip, idx, 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|LC", ip, idx, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 1); ip->i_d.di_nextents--; if (cur == NULL) @@ -1484,16 +1463,15 @@ xfs_bmap_add_extent_unwritten_real( * Setting all of a previous oldext extent to newext. * The right neighbor is contiguous, the left is not. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|RF|RC", ip, idx, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|RC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount + RIGHT.br_blockcount); xfs_bmbt_set_state(ep, newext); - xfs_bmap_trace_post_update(fname, "LF|RF|RC", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|RC", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; - xfs_bmap_trace_delete(fname, "LF|RF|RC", ip, idx + 1, 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|RC", ip, idx + 1, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx + 1, 1); ip->i_d.di_nextents--; if (cur == NULL) @@ -1529,10 +1507,10 @@ xfs_bmap_add_extent_unwritten_real( * Neither the left nor right neighbors are contiguous with * the new one. */ - xfs_bmap_trace_pre_update(fname, "LF|RF", ip, idx, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_state(ep, newext); - xfs_bmap_trace_post_update(fname, "LF|RF", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; if (cur == NULL) @@ -1559,21 +1537,21 @@ xfs_bmap_add_extent_unwritten_real( * Setting the first part of a previous oldext extent to newext. * The left neighbor is contiguous. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|LC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + new->br_blockcount); xfs_bmbt_set_startoff(ep, PREV.br_startoff + new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|LC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LF|LC", ip, idx - 1, XFS_DATA_FORK); - xfs_bmap_trace_pre_update(fname, "LF|LC", ip, idx, + XFS_BMAP_TRACE_PRE_UPDATE("LF|LC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_startblock(ep, new->br_startblock + new->br_blockcount); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount - new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|LC", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("LF|LC", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx - 1; if (cur == NULL) @@ -1610,15 +1588,15 @@ xfs_bmap_add_extent_unwritten_real( * Setting the first part of a previous oldext extent to newext. * The left neighbor is not contiguous. */ - xfs_bmap_trace_pre_update(fname, "LF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF", ip, idx, XFS_DATA_FORK); ASSERT(ep && xfs_bmbt_get_state(ep) == oldext); xfs_bmbt_set_startoff(ep, new_endoff); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount - new->br_blockcount); xfs_bmbt_set_startblock(ep, new->br_startblock + new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LF", ip, idx, XFS_DATA_FORK); - xfs_bmap_trace_insert(fname, "LF", ip, idx, 1, new, NULL, + XFS_BMAP_TRACE_POST_UPDATE("LF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_INSERT("LF", ip, idx, 1, new, NULL, XFS_DATA_FORK); xfs_iext_insert(ifp, idx, 1, new); ip->i_df.if_lastex = idx; @@ -1653,18 +1631,18 @@ xfs_bmap_add_extent_unwritten_real( * Setting the last part of a previous oldext extent to newext. * The right neighbor is contiguous with the new allocation. 
*/ - xfs_bmap_trace_pre_update(fname, "RF|RC", ip, idx, + XFS_BMAP_TRACE_PRE_UPDATE("RF|RC", ip, idx, XFS_DATA_FORK); - xfs_bmap_trace_pre_update(fname, "RF|RC", ip, idx + 1, + XFS_BMAP_TRACE_PRE_UPDATE("RF|RC", ip, idx + 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount - new->br_blockcount); - xfs_bmap_trace_post_update(fname, "RF|RC", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("RF|RC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_allf(xfs_iext_get_ext(ifp, idx + 1), new->br_startoff, new->br_startblock, new->br_blockcount + RIGHT.br_blockcount, newext); - xfs_bmap_trace_post_update(fname, "RF|RC", ip, idx + 1, + XFS_BMAP_TRACE_POST_UPDATE("RF|RC", ip, idx + 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx + 1; if (cur == NULL) @@ -1700,12 +1678,12 @@ xfs_bmap_add_extent_unwritten_real( * Setting the last part of a previous oldext extent to newext. * The right neighbor is not contiguous. */ - xfs_bmap_trace_pre_update(fname, "RF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount - new->br_blockcount); - xfs_bmap_trace_post_update(fname, "RF", ip, idx, XFS_DATA_FORK); - xfs_bmap_trace_insert(fname, "RF", ip, idx + 1, 1, - new, NULL, XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("RF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_INSERT("RF", ip, idx + 1, 1, new, NULL, + XFS_DATA_FORK); xfs_iext_insert(ifp, idx + 1, 1, new); ip->i_df.if_lastex = idx + 1; ip->i_d.di_nextents++; @@ -1744,17 +1722,17 @@ xfs_bmap_add_extent_unwritten_real( * newext. Contiguity is impossible here. * One extent becomes three extents. 
*/ - xfs_bmap_trace_pre_update(fname, "0", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("0", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, new->br_startoff - PREV.br_startoff); - xfs_bmap_trace_post_update(fname, "0", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("0", ip, idx, XFS_DATA_FORK); r[0] = *new; r[1].br_startoff = new_endoff; r[1].br_blockcount = PREV.br_startoff + PREV.br_blockcount - new_endoff; r[1].br_startblock = new->br_startblock + new->br_blockcount; r[1].br_state = oldext; - xfs_bmap_trace_insert(fname, "0", ip, idx + 1, 2, &r[0], &r[1], + XFS_BMAP_TRACE_INSERT("0", ip, idx + 1, 2, &r[0], &r[1], XFS_DATA_FORK); xfs_iext_insert(ifp, idx + 1, 2, &r[0]); ip->i_df.if_lastex = idx + 1; @@ -1845,9 +1823,6 @@ xfs_bmap_add_extent_hole_delay( int rsvd) /* OK to allocate reserved blocks */ { xfs_bmbt_rec_t *ep; /* extent record for idx */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent_hole_delay"; -#endif xfs_ifork_t *ifp; /* inode fork pointer */ xfs_bmbt_irec_t left; /* left neighbor extent entry */ xfs_filblks_t newlen=0; /* new indirect size */ @@ -1919,7 +1894,7 @@ xfs_bmap_add_extent_hole_delay( */ temp = left.br_blockcount + new->br_blockcount + right.br_blockcount; - xfs_bmap_trace_pre_update(fname, "LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LC|RC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), temp); oldlen = STARTBLOCKVAL(left.br_startblock) + @@ -1928,10 +1903,9 @@ xfs_bmap_add_extent_hole_delay( newlen = xfs_bmap_worst_indlen(ip, temp); xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, idx - 1), NULLSTARTBLOCK((int)newlen)); - xfs_bmap_trace_post_update(fname, "LC|RC", ip, idx - 1, - XFS_DATA_FORK); - xfs_bmap_trace_delete(fname, "LC|RC", ip, idx, 1, + XFS_BMAP_TRACE_POST_UPDATE("LC|RC", ip, idx - 1, XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LC|RC", ip, idx, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 1); ip->i_df.if_lastex = idx - 1; /* DELTA: Two in-core 
extents were replaced by one. */ @@ -1946,7 +1920,7 @@ xfs_bmap_add_extent_hole_delay( * Merge the new allocation with the left neighbor. */ temp = left.br_blockcount + new->br_blockcount; - xfs_bmap_trace_pre_update(fname, "LC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), temp); oldlen = STARTBLOCKVAL(left.br_startblock) + @@ -1954,7 +1928,7 @@ xfs_bmap_add_extent_hole_delay( newlen = xfs_bmap_worst_indlen(ip, temp); xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, idx - 1), NULLSTARTBLOCK((int)newlen)); - xfs_bmap_trace_post_update(fname, "LC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LC", ip, idx - 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx - 1; /* DELTA: One in-core extent grew into a hole. */ @@ -1968,14 +1942,14 @@ xfs_bmap_add_extent_hole_delay( * on the right. * Merge the new allocation with the right neighbor. */ - xfs_bmap_trace_pre_update(fname, "RC", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RC", ip, idx, XFS_DATA_FORK); temp = new->br_blockcount + right.br_blockcount; oldlen = STARTBLOCKVAL(new->br_startblock) + STARTBLOCKVAL(right.br_startblock); newlen = xfs_bmap_worst_indlen(ip, temp); xfs_bmbt_set_allf(ep, new->br_startoff, NULLSTARTBLOCK((int)newlen), temp, right.br_state); - xfs_bmap_trace_post_update(fname, "RC", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("RC", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; /* DELTA: One in-core extent grew into a hole. */ temp2 = temp; @@ -1989,7 +1963,7 @@ xfs_bmap_add_extent_hole_delay( * Insert a new entry. */ oldlen = newlen = 0; - xfs_bmap_trace_insert(fname, "0", ip, idx, 1, new, NULL, + XFS_BMAP_TRACE_INSERT("0", ip, idx, 1, new, NULL, XFS_DATA_FORK); xfs_iext_insert(ifp, idx, 1, new); ip->i_df.if_lastex = idx; @@ -2039,9 +2013,6 @@ xfs_bmap_add_extent_hole_real( { xfs_bmbt_rec_t *ep; /* pointer to extent entry ins. 
point */ int error; /* error return value */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent_hole_real"; -#endif int i; /* temp state */ xfs_ifork_t *ifp; /* inode fork pointer */ xfs_bmbt_irec_t left; /* left neighbor extent entry */ @@ -2118,15 +2089,14 @@ xfs_bmap_add_extent_hole_real( * left and on the right. * Merge all three into a single extent record. */ - xfs_bmap_trace_pre_update(fname, "LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LC|RC", ip, idx - 1, whichfork); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), left.br_blockcount + new->br_blockcount + right.br_blockcount); - xfs_bmap_trace_post_update(fname, "LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LC|RC", ip, idx - 1, whichfork); - xfs_bmap_trace_delete(fname, "LC|RC", ip, - idx, 1, whichfork); + XFS_BMAP_TRACE_DELETE("LC|RC", ip, idx, 1, whichfork); xfs_iext_remove(ifp, idx, 1); ifp->if_lastex = idx - 1; XFS_IFORK_NEXT_SET(ip, whichfork, @@ -2168,10 +2138,10 @@ xfs_bmap_add_extent_hole_real( * on the left. * Merge the new allocation with the left neighbor. */ - xfs_bmap_trace_pre_update(fname, "LC", ip, idx - 1, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("LC", ip, idx - 1, whichfork); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), left.br_blockcount + new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LC", ip, idx - 1, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("LC", ip, idx - 1, whichfork); ifp->if_lastex = idx - 1; if (cur == NULL) { rval = XFS_ILOG_FEXT(whichfork); @@ -2202,11 +2172,11 @@ xfs_bmap_add_extent_hole_real( * on the right. * Merge the new allocation with the right neighbor. 
*/ - xfs_bmap_trace_pre_update(fname, "RC", ip, idx, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("RC", ip, idx, whichfork); xfs_bmbt_set_allf(ep, new->br_startoff, new->br_startblock, new->br_blockcount + right.br_blockcount, right.br_state); - xfs_bmap_trace_post_update(fname, "RC", ip, idx, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("RC", ip, idx, whichfork); ifp->if_lastex = idx; if (cur == NULL) { rval = XFS_ILOG_FEXT(whichfork); @@ -2237,8 +2207,7 @@ xfs_bmap_add_extent_hole_real( * real allocation. * Insert a new entry. */ - xfs_bmap_trace_insert(fname, "0", ip, idx, 1, new, NULL, - whichfork); + XFS_BMAP_TRACE_INSERT("0", ip, idx, 1, new, NULL, whichfork); xfs_iext_insert(ifp, idx, 1, new); ifp->if_lastex = idx; XFS_IFORK_NEXT_SET(ip, whichfork, @@ -3051,9 +3020,6 @@ xfs_bmap_del_extent( xfs_bmbt_rec_t *ep; /* current extent entry pointer */ int error; /* error return value */ int flags; /* inode logging flags */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_del_extent"; -#endif xfs_bmbt_irec_t got; /* current extent entry */ xfs_fileoff_t got_endoff; /* first offset past got */ int i; /* temp state */ @@ -3147,7 +3113,7 @@ xfs_bmap_del_extent( /* * Matches the whole extent. Delete the entry. */ - xfs_bmap_trace_delete(fname, "3", ip, idx, 1, whichfork); + XFS_BMAP_TRACE_DELETE("3", ip, idx, 1, whichfork); xfs_iext_remove(ifp, idx, 1); ifp->if_lastex = idx; if (delay) @@ -3168,7 +3134,7 @@ xfs_bmap_del_extent( /* * Deleting the first part of the extent. 
*/ - xfs_bmap_trace_pre_update(fname, "2", ip, idx, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("2", ip, idx, whichfork); xfs_bmbt_set_startoff(ep, del_endoff); temp = got.br_blockcount - del->br_blockcount; xfs_bmbt_set_blockcount(ep, temp); @@ -3177,13 +3143,13 @@ xfs_bmap_del_extent( temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), da_old); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "2", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("2", ip, idx, whichfork); da_new = temp; break; } xfs_bmbt_set_startblock(ep, del_endblock); - xfs_bmap_trace_post_update(fname, "2", ip, idx, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("2", ip, idx, whichfork); if (!cur) { flags |= XFS_ILOG_FEXT(whichfork); break; @@ -3199,19 +3165,19 @@ xfs_bmap_del_extent( * Deleting the last part of the extent. */ temp = got.br_blockcount - del->br_blockcount; - xfs_bmap_trace_pre_update(fname, "1", ip, idx, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("1", ip, idx, whichfork); xfs_bmbt_set_blockcount(ep, temp); ifp->if_lastex = idx; if (delay) { temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), da_old); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "1", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("1", ip, idx, whichfork); da_new = temp; break; } - xfs_bmap_trace_post_update(fname, "1", ip, idx, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("1", ip, idx, whichfork); if (!cur) { flags |= XFS_ILOG_FEXT(whichfork); break; @@ -3228,7 +3194,7 @@ xfs_bmap_del_extent( * Deleting the middle of the extent. 
*/ temp = del->br_startoff - got.br_startoff; - xfs_bmap_trace_pre_update(fname, "0", ip, idx, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("0", ip, idx, whichfork); xfs_bmbt_set_blockcount(ep, temp); new.br_startoff = del_endoff; temp2 = got_endoff - del_endoff; @@ -3315,8 +3281,8 @@ xfs_bmap_del_extent( } } } - xfs_bmap_trace_post_update(fname, "0", ip, idx, whichfork); - xfs_bmap_trace_insert(fname, "0", ip, idx + 1, 1, &new, NULL, + XFS_BMAP_TRACE_POST_UPDATE("0", ip, idx, whichfork); + XFS_BMAP_TRACE_INSERT("0", ip, idx + 1, 1, &new, NULL, whichfork); xfs_iext_insert(ifp, idx + 1, 1, &new); ifp->if_lastex = idx + 1; @@ -3556,9 +3522,6 @@ xfs_bmap_local_to_extents( { int error; /* error return value */ int flags; /* logging flags returned */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_local_to_extents"; -#endif xfs_ifork_t *ifp; /* inode fork pointer */ /* @@ -3613,7 +3576,7 @@ xfs_bmap_local_to_extents( xfs_iext_add(ifp, 0, 1); ep = xfs_iext_get_ext(ifp, 0); xfs_bmbt_set_allf(ep, 0, args.fsbno, 1, XFS_EXT_NORM); - xfs_bmap_trace_post_update(fname, "new", ip, 0, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("new", ip, 0, whichfork); XFS_IFORK_NEXT_SET(ip, whichfork, 1); ip->i_d.di_nblocks = 1; XFS_TRANS_MOD_DQUOT_BYINO(args.mp, tp, ip, @@ -3736,7 +3699,7 @@ ktrace_t *xfs_bmap_trace_buf; STATIC void xfs_bmap_trace_addentry( int opcode, /* operation */ - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(ies) */ @@ -3795,7 +3758,7 @@ xfs_bmap_trace_addentry( */ STATIC void xfs_bmap_trace_delete( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(entries) deleted */ @@ -3817,7 +3780,7 @@ xfs_bmap_trace_delete( */ STATIC void xfs_bmap_trace_insert( - char *fname, /* function name 
*/ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(entries) inserted */ @@ -3846,7 +3809,7 @@ xfs_bmap_trace_insert( */ STATIC void xfs_bmap_trace_post_update( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry updated */ @@ -3864,7 +3827,7 @@ xfs_bmap_trace_post_update( */ STATIC void xfs_bmap_trace_pre_update( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry to be updated */ @@ -4478,9 +4441,6 @@ xfs_bmap_read_extents( xfs_buf_t *bp; /* buffer for "block" */ int error; /* error return value */ xfs_exntfmt_t exntf; /* XFS_EXTFMT_NOSTATE, if checking */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_read_extents"; -#endif xfs_extnum_t i, j; /* index into the extents list */ xfs_ifork_t *ifp; /* fork structure */ int level; /* btree level, for checking */ @@ -4597,7 +4557,7 @@ xfs_bmap_read_extents( } ASSERT(i == (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))); ASSERT(i == XFS_IFORK_NEXTENTS(ip, whichfork)); - xfs_bmap_trace_exlist(fname, ip, i, whichfork); + XFS_BMAP_TRACE_EXLIST(ip, i, whichfork); return 0; error0: xfs_trans_brelse(tp, bp); @@ -4625,7 +4585,7 @@ xfs_bmap_trace_exlist( for (idx = 0; idx < cnt; idx++) { ep = xfs_iext_get_ext(ifp, idx); xfs_bmbt_get_all(ep, &s); - xfs_bmap_trace_insert(fname, "exlist", ip, idx, 1, &s, NULL, + XFS_BMAP_TRACE_INSERT("exlist", ip, idx, 1, &s, NULL, whichfork); } } Index: linux/fs/xfs/xfs_bmap.h =================================================================== --- linux.orig/fs/xfs/xfs_bmap.h +++ linux/fs/xfs/xfs_bmap.h @@ -144,12 +144,14 @@ extern ktrace_t *xfs_bmap_trace_buf; */ void xfs_bmap_trace_exlist( - 
char *fname, /* function name */ + const char *fname, /* function name */ struct xfs_inode *ip, /* incore inode pointer */ xfs_extnum_t cnt, /* count of entries in list */ int whichfork); /* data or attr fork */ +#define XFS_BMAP_TRACE_EXLIST(ip,c,w) \ + xfs_bmap_trace_exlist(__FUNCTION__,ip,c,w) #else -#define xfs_bmap_trace_exlist(f,ip,c,w) +#define XFS_BMAP_TRACE_EXLIST(ip,c,w) #endif /* Index: linux/fs/xfs/xfs_inode.c =================================================================== --- linux.orig/fs/xfs/xfs_inode.c +++ linux/fs/xfs/xfs_inode.c @@ -642,8 +642,7 @@ xfs_iformat_extents( ep->l1 = INT_GET(get_unaligned((__uint64_t*)&dp->l1), ARCH_CONVERT); } - xfs_bmap_trace_exlist("xfs_iformat_extents", ip, nex, - whichfork); + XFS_BMAP_TRACE_EXLIST(ip, nex, whichfork); if (whichfork != XFS_DATA_FORK || XFS_EXTFMT_INODE(ip) == XFS_EXTFMT_NOSTATE) if (unlikely(xfs_check_nostate_extents( @@ -2845,9 +2844,6 @@ xfs_iextents_copy( int copied; xfs_bmbt_rec_t *dest_ep; xfs_bmbt_rec_t *ep; -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_iextents_copy"; -#endif int i; xfs_ifork_t *ifp; int nrecs; @@ -2858,7 +2854,7 @@ xfs_iextents_copy( ASSERT(ifp->if_bytes > 0); nrecs = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t); - xfs_bmap_trace_exlist(fname, ip, nrecs, whichfork); + XFS_BMAP_TRACE_EXLIST(ip, nrecs, whichfork); ASSERT(nrecs > 0); /* --------------040201070004000506070208-- From owner-xfs@oss.sgi.com Wed Mar 7 00:54:07 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 00:54:11 -0800 (PST) X-Spam-oss-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l278s46p017851 for ; Wed, 7 Mar 2007 00:54:06 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1HOrV3-0001YG-Db; Wed, 07 Mar 2007 08:27:53 +0000 Date: Wed, 7 Mar 2007 
08:27:53 +0000 From: Christoph Hellwig To: Eric Sandeen Cc: xfs@oss.sgi.com Subject: Re: [PATCH] get rid of fname[] for tracing functions Message-ID: <20070307082753.GA5469@infradead.org> References: <45EE2EF3.8090707@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45EE2EF3.8090707@sandeen.net> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 10772 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 1279 Lines: 35 On Tue, Mar 06, 2007 at 09:18:11PM -0600, Eric Sandeen wrote: > this gets rid of the > > #ifdef XFS_BMAP_TRACE > static char fname[] = "xfs_iextents_copy"; > #endif > > ugliness littered in the bmap code for tracing, and instead just uses > gcc's __FUNCTION__, which never gets out of sync with the actual > function name.... > > It also makes some of this tracing more consistently use the > > #define XFS_BMBT_TRACE_ARGBI(c,b,i) \ > xfs_bmbt_trace_argbi(__FUNCTION__, c, b, i, __LINE__) > > type constructs, to automatically pick up the gcc extensions. Very nice. I might hear some people scream that we should use the C99 __func__ and not the __FUNCTION__ gccism, but __FUNCTION__ is what the rest of the Linux kernel uses, and can be emulated with a trivial #define __FUNCTION__ __func__ on any non-gcc C99 system. > the vn tracing could probably get a similar treatment, so that every > call to vn_trace_foo wouldn't have to include a function name and a > __builtin_return_address, but could be done via macros... it's currently > a mishmash of __FUNCTION__ and "function" *shrug* what do you think? It should probably use __FUNCTION__ and hide use of both __FUNCTION__ and __builtin_return_address behind a macro. 
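For readers following the thread, the macro pattern discussed above can be sketched in plain userspace C. This is only an illustration under stated assumptions: demo_trace_update, DEMO_TRACE, DEMO_TRACE_UPDATE and demo_add_extent are hypothetical names, not XFS symbols, and the sketch uses the C99 spelling __func__ so that it builds outside gcc as well.

```c
#include <stdio.h>

#define DEMO_TRACE 1	/* hypothetical switch, standing in for XFS_BMAP_TRACE */

/* Remembered only so the effect of the macro can be inspected afterwards. */
static const char *last_fname;
static const char *last_desc;

/* Hypothetical trace sink: takes the caller's name as its first argument. */
static void demo_trace_update(const char *fname, const char *desc, int idx)
{
	last_fname = fname;
	last_desc = desc;
	printf("%s: %s idx=%d\n", fname, desc, idx);
}

#if DEMO_TRACE
/* The macro supplies the function name, so call sites never spell it out. */
#define DEMO_TRACE_UPDATE(desc, idx) \
	demo_trace_update(__func__, (desc), (idx))
#else
/* With tracing compiled out, the call site expands to nothing useful. */
#define DEMO_TRACE_UPDATE(desc, idx) ((void)0)
#endif

/* A caller no longer needs a static fname[] that can drift out of sync. */
static void demo_add_extent(int idx)
{
	DEMO_TRACE_UPDATE("LC", idx);
}
```

Calling demo_add_extent(3) records "demo_add_extent" as the function name without the caller ever naming itself, which is the property the patch relies on.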
From owner-xfs@oss.sgi.com Wed Mar 7 02:13:21 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 02:13:24 -0800 (PST) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_05 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27ADJ6p031368 for ; Wed, 7 Mar 2007 02:13:21 -0800 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l27ADEb2030938 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 7 Mar 2007 11:13:14 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l27ADE9l030936; Wed, 7 Mar 2007 11:13:14 +0100 Date: Wed, 7 Mar 2007 11:13:14 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org Cc: linux-kernel@vger.kernel.org Subject: [PATCH 1/2] xfs: use xfs_get_buf_noaddr for iclogs Message-ID: <20070307101314.GB30587@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 10774 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 2443 Lines: 66 Currently xlog_alloc allocates memory for the iclogs first, then allocates a buffer using xfs_buf_get_empty and finally assigns the memory to the buffer. We don't really want to do this, but rather allocate a buffer with memory attached to it using xfs_buf_get_noaddr. There's a subtle change because xfs_buf_get_empty returns the buffer locked, but xfs_buf_get_noaddr returns it unlocked. From my auditing and testing, nothing in the log I/O code cares about this distinction, but I'd be happy if someone could try to prove this independently. 
Signed-off-by: Christoph Hellwig Index: linux-2.6/fs/xfs/xfs_log.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_log.c 2007-03-06 17:26:40.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_log.c 2007-03-06 17:28:03.000000000 +0100 @@ -1199,11 +1199,16 @@ *iclogp = (xlog_in_core_t *) kmem_zalloc(sizeof(xlog_in_core_t), KM_SLEEP); iclog = *iclogp; - iclog->hic_data = (xlog_in_core_2_t *) - kmem_zalloc(iclogsize, KM_SLEEP | KM_LARGE); - iclog->ic_prev = prev_iclog; prev_iclog = iclog; + + bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp); + XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); + XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); + XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); + iclog->ic_bp = bp; + iclog->hic_data = bp->b_addr; + log->l_iclog_bak[i] = (xfs_caddr_t)&(iclog->ic_header); head = &iclog->ic_header; @@ -1216,11 +1221,6 @@ INT_SET(head->h_fmt, ARCH_CONVERT, XLOG_FMT); memcpy(&head->h_fs_uuid, &mp->m_sb.sb_uuid, sizeof(uuid_t)); - bp = xfs_buf_get_empty(log->l_iclog_size, mp->m_logdev_targp); - XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); - XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); - XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); - iclog->ic_bp = bp; iclog->ic_size = XFS_BUF_SIZE(bp) - log->l_iclog_hsize; iclog->ic_state = XLOG_STATE_ACTIVE; @@ -1229,7 +1229,6 @@ iclog->ic_datap = (char *)iclog->hic_data + log->l_iclog_hsize; ASSERT(XFS_BUF_ISBUSY(iclog->ic_bp)); - ASSERT(XFS_BUF_VALUSEMA(iclog->ic_bp) <= 0); sv_init(&iclog->ic_forcesema, SV_DEFAULT, "iclog-force"); sv_init(&iclog->ic_writesema, SV_DEFAULT, "iclog-write"); @@ -1528,7 +1527,6 @@ } #endif next_iclog = iclog->ic_next; - kmem_free(iclog->hic_data, log->l_iclog_size); kmem_free(iclog, sizeof(xlog_in_core_t)); iclog = next_iclog; } From owner-xfs@oss.sgi.com Wed Mar 7 02:13:32 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 02:13:35 -0800 (PST) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_05 autolearn=ham 
version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27ADT6p031427 for ; Wed, 7 Mar 2007 02:13:31 -0800 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l27ADOb2030971 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 7 Mar 2007 11:13:24 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l27ADOQi030969; Wed, 7 Mar 2007 11:13:24 +0100 Date: Wed, 7 Mar 2007 11:13:24 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org Cc: linux-kernel@vger.kernel.org Subject: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr Message-ID: <20070307101324.GC30587@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 10775 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 4209 Lines: 128 Currently xfs_buf_get_noaddr allocates memory using kmem_alloc, which can end up in either kmalloc or vmalloc, and assigns it to the buffer. This patch changes it to allocate individual pages and, if there is more than one, map them into kernel virtual space using vmap. This means the minimum buffer allocation is PAGE_SIZE now. For two of the three callers (log buffers, log recovery) that is perfectly fine, because they always allocate buffers that are a power of two of the page size anyway. For xfs_zero_remaining_bytes the minimum allocation goes up from blocksize to pagesize and thus there is a potential waste of memory for blocksize < pagesize allocations, which is unfortunate but not directly solvable when block drivers expect reference countable pages. 
To fix this waste xfs_zero_remaining_bytes could be rewritten to zero more than a single block at a time, which sounds like a good idea in general. Signed-off-by: Christoph Hellwig Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-03-05 15:54:40.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c 2007-03-05 15:54:47.000000000 +0100 @@ -314,7 +314,7 @@ ASSERT(list_empty(&bp->b_hash_list)); - if (bp->b_flags & _XBF_PAGE_CACHE) { + if (bp->b_flags & (_XBF_PAGE_CACHE|_XBF_PAGES)) { uint i; if ((bp->b_flags & XBF_MAPPED) && (bp->b_page_count > 1)) @@ -323,18 +323,11 @@ for (i = 0; i < bp->b_page_count; i++) { struct page *page = bp->b_pages[i]; - ASSERT(!PagePrivate(page)); + if (bp->b_flags & _XBF_PAGE_CACHE) + ASSERT(!PagePrivate(page)); page_cache_release(page); } _xfs_buf_free_pages(bp); - } else if (bp->b_flags & _XBF_KMEM_ALLOC) { - /* - * XXX(hch): bp->b_count_desired might be incorrect (see - * xfs_buf_associate_memory for details), but fortunately - * the Linux version of kmem_free ignores the len argument.. - */ - kmem_free(bp->b_addr, bp->b_count_desired); - _xfs_buf_free_pages(bp); } xfs_buf_deallocate(bp); @@ -764,41 +757,41 @@ size_t len, xfs_buftarg_t *target) { - size_t malloc_len = len; + unsigned long page_count = PAGE_ALIGN(len) >> PAGE_SHIFT; + int error, i; xfs_buf_t *bp; - void *data; - int error; bp = xfs_buf_allocate(0); if (unlikely(bp == NULL)) goto fail; _xfs_buf_initialize(bp, target, 0, len, 0); - try_again: - data = kmem_alloc(malloc_len, KM_SLEEP | KM_MAYFAIL | KM_LARGE); - if (unlikely(data == NULL)) + error = _xfs_buf_get_pages(bp, page_count, 0); + if (error) goto fail_free_buf; - /* check whether alignment matches.. */ - if ((__psunsigned_t)data != - ((__psunsigned_t)data & ~target->bt_smask)) { - /* .. 
else double the size and try again */ - kmem_free(data, malloc_len); - malloc_len <<= 1; - goto try_again; - } - - error = xfs_buf_associate_memory(bp, data, len); - if (error) + for (i = 0; i < page_count; i++) { + bp->b_pages[i] = alloc_page(GFP_KERNEL); + if (!bp->b_pages[i]) + goto fail_free_mem; + } + bp->b_flags |= _XBF_PAGES; + + error = _xfs_buf_map_pages(bp, XBF_MAPPED); + if (unlikely(error)) { + printk(KERN_WARNING "%s: failed to map pages\n", + __FUNCTION__); goto fail_free_mem; - bp->b_flags |= _XBF_KMEM_ALLOC; + } xfs_buf_unlock(bp); XB_TRACE(bp, "no_daddr", data); return bp; + fail_free_mem: - kmem_free(data, malloc_len); + while (--i >= 0) + __free_page(bp->b_pages[i]); fail_free_buf: xfs_buf_free(bp); fail: Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.h =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.h 2007-03-05 15:54:40.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.h 2007-03-05 15:55:06.000000000 +0100 @@ -63,7 +63,7 @@ /* flags used only internally */ _XBF_PAGE_CACHE = (1 << 17),/* backed by pagecache */ - _XBF_KMEM_ALLOC = (1 << 18),/* backed by kmem_alloc() */ + _XBF_PAGES = (1 << 18), /* backed by refcounted pages */ _XBF_RUN_QUEUES = (1 << 19),/* run block device task queue */ _XBF_DELWRI_Q = (1 << 21), /* buffer on delwri queue */ } xfs_buf_flags_t; From owner-xfs@oss.sgi.com Wed Mar 7 02:13:13 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 02:13:16 -0800 (PST) X-Spam-oss-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27ADA6p031333 for ; Wed, 7 Mar 2007 02:13:12 -0800 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l27AD3b2030919 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 7 Mar 2007 11:13:03 
+0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l27AD2AS030917; Wed, 7 Mar 2007 11:13:02 +0100 Date: Wed, 7 Mar 2007 11:13:02 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org Cc: linux-kernel@vger.kernel.org Subject: [PATCH 0/2] xfs: only use refcounted pages for I/O Message-ID: <20070307101302.GA30587@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 10773 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 272 Lines: 6 Many block drivers (aoe, iscsi) really want refcountable pages in bios, which is what almost everyone sends down. XFS unfortunately has a few places where it sends down buffers that may come from kmalloc, which breaks them. The patches in this series fix this issue up. 
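The per-page allocation scheme described in patch 2/2 of this series can be approximated in userspace to show two of its consequences: the length is rounded up to whole pages (so a 512-byte request still occupies a full page), and a failure part-way through has to release only the pages already obtained. This is a rough sketch, not the kernel code: DEMO_PAGE_SIZE, demo_page_count and demo_alloc_pages are hypothetical names, malloc stands in for alloc_page, and the fail_at parameter exists only to simulate an allocation failure.

```c
#include <stdlib.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/* Round len up to whole pages, in the spirit of PAGE_ALIGN(len) >> PAGE_SHIFT. */
static unsigned long demo_page_count(size_t len)
{
	return (len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}

/*
 * Fill pages[0..count-1] one page at a time. A non-negative fail_at
 * simulates an allocation failure at that index. On failure every page
 * obtained so far is released and -1 is returned; on success 0.
 */
static int demo_alloc_pages(void **pages, unsigned long count, long fail_at)
{
	long i;

	for (i = 0; i < (long)count; i++) {
		pages[i] = (i == fail_at) ? NULL : malloc(DEMO_PAGE_SIZE);
		if (!pages[i])
			goto fail_free;
	}
	return 0;

fail_free:
	/* pages[i] is NULL here; only indexes 0..i-1 hold memory, so walk down. */
	while (--i >= 0) {
		free(pages[i]);
		pages[i] = NULL;
	}
	return -1;
}
```

With these definitions, demo_page_count(512) and demo_page_count(4096) both give one page while demo_page_count(4097) gives two, which is the blocksize < pagesize overhead the patch description mentions for xfs_zero_remaining_bytes.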
From owner-xfs@oss.sgi.com Wed Mar 7 04:06:00 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 04:06:04 -0800 (PST) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from ext.agami.com (64.221.212.177.ptr.us.xo.net [64.221.212.177] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27C5x6p028635 for ; Wed, 7 Mar 2007 04:06:00 -0800 Received: from agami.com (mail [192.168.168.5]) by ext.agami.com (8.12.5/8.12.5) with ESMTP id l27C5v3A021162 for ; Wed, 7 Mar 2007 04:05:57 -0800 Received: from mx1.agami.com (mx1.agami.com [10.123.10.30]) by agami.com (8.12.11/8.12.11) with ESMTP id l27C64UH010619 for ; Wed, 7 Mar 2007 04:06:04 -0800 Received: from [10.12.12.141] ([10.12.12.141]) by mx1.agami.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 7 Mar 2007 04:06:02 -0800 Message-ID: <45EEA92C.9080505@agami.com> Date: Wed, 07 Mar 2007 17:29:40 +0530 From: Shailendra Tripathi User-Agent: Mozilla Thunderbird 0.9 (X11/20041127) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Christoph Hellwig CC: xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 1/2] xfs: use xfs_get_buf_noaddr for iclogs References: <20070307101314.GB30587@lst.de> In-Reply-To: <20070307101314.GB30587@lst.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 07 Mar 2007 12:06:03.0295 (UTC) FILETIME=[FA490AF0:01C760B0] X-Scanned-By: MIMEDefang 2.58 on 192.168.168.13 X-archive-position: 10776 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stripathi@agami.com Precedence: bulk X-list: xfs Content-Length: 2988 Lines: 78 Does not look to me either. 
Looks logical as well because these buffers are used only in log syncing and only one thread can be ever flushing one ICLOG and, hence, no need for protection. Even split buffer (log->l_xbuf) is used by only ICLOG at a time, should not matter. I don't see protection for this even today as no locking is done in split sync path. -shailendra Christoph Hellwig wrote: > Currently xlog_alloc allocates memory for the iclogs first, then > allocates a buffer using xfs_buf_get_empty and finally assigns > the memory to the buffer. We don't really want to do this, but > rather allocate a buffer with memory attached to it using > xfs_buf_get_noaddr. There's a subtile change because > xfs_buf_get_empty returns the buffer locked, but xfs_buf_get_noaddr > returns it unlocked. From my auditing and testing nothing in the > log I/O code cares about this distincition, but I'd be happy if > someone could try to prove this independently. > > > Signed-off-by: Christoph Hellwig > > Index: linux-2.6/fs/xfs/xfs_log.c > =================================================================== > --- linux-2.6.orig/fs/xfs/xfs_log.c 2007-03-06 17:26:40.000000000 +0100 > +++ linux-2.6/fs/xfs/xfs_log.c 2007-03-06 17:28:03.000000000 +0100 > @@ -1199,11 +1199,16 @@ > *iclogp = (xlog_in_core_t *) > kmem_zalloc(sizeof(xlog_in_core_t), KM_SLEEP); > iclog = *iclogp; > - iclog->hic_data = (xlog_in_core_2_t *) > - kmem_zalloc(iclogsize, KM_SLEEP | KM_LARGE); > - > iclog->ic_prev = prev_iclog; > prev_iclog = iclog; > + > + bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp); > + XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); > + XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); > + XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); > + iclog->ic_bp = bp; > + iclog->hic_data = bp->b_addr; > + > log->l_iclog_bak[i] = (xfs_caddr_t)&(iclog->ic_header); > > head = &iclog->ic_header; > @@ -1216,11 +1221,6 @@ > INT_SET(head->h_fmt, ARCH_CONVERT, XLOG_FMT); > memcpy(&head->h_fs_uuid, &mp->m_sb.sb_uuid, sizeof(uuid_t)); > 
> - bp = xfs_buf_get_empty(log->l_iclog_size, mp->m_logdev_targp); > - XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); > - XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); > - XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); > - iclog->ic_bp = bp; > > iclog->ic_size = XFS_BUF_SIZE(bp) - log->l_iclog_hsize; > iclog->ic_state = XLOG_STATE_ACTIVE; > @@ -1229,7 +1229,6 @@ > iclog->ic_datap = (char *)iclog->hic_data + log->l_iclog_hsize; > > ASSERT(XFS_BUF_ISBUSY(iclog->ic_bp)); > - ASSERT(XFS_BUF_VALUSEMA(iclog->ic_bp) <= 0); > sv_init(&iclog->ic_forcesema, SV_DEFAULT, "iclog-force"); > sv_init(&iclog->ic_writesema, SV_DEFAULT, "iclog-write"); > > @@ -1528,7 +1527,6 @@ > } > #endif > next_iclog = iclog->ic_next; > - kmem_free(iclog->hic_data, log->l_iclog_size); > kmem_free(iclog, sizeof(xlog_in_core_t)); > iclog = next_iclog; > } > > From owner-xfs@oss.sgi.com Wed Mar 7 04:20:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 04:20:44 -0800 (PST) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_00, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from ext.agami.com (64.221.212.177.ptr.us.xo.net [64.221.212.177] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27CKe6p005930 for ; Wed, 7 Mar 2007 04:20:41 -0800 Received: from agami.com (mail [192.168.168.5]) by ext.agami.com (8.12.5/8.12.5) with ESMTP id l27CKc3A021346 for ; Wed, 7 Mar 2007 04:20:38 -0800 Received: from mx1.agami.com (mx1.agami.com [10.123.10.30]) by agami.com (8.12.11/8.12.11) with ESMTP id l27CKjiU010748 for ; Wed, 7 Mar 2007 04:20:45 -0800 Received: from [10.12.12.141] ([10.12.12.141]) by mx1.agami.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 7 Mar 2007 04:20:44 -0800 Message-ID: <45EEACA0.4050206@agami.com> Date: Wed, 07 Mar 2007 17:44:24 +0530 From: Shailendra Tripathi User-Agent: Mozilla Thunderbird 0.9 (X11/20041127) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Christoph Hellwig CC: 
xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr References: <20070307101324.GC30587@lst.de> In-Reply-To: <20070307101324.GC30587@lst.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 07 Mar 2007 12:20:45.0223 (UTC) FILETIME=[07F49370:01C760B3] X-Scanned-By: MIMEDefang 2.58 on 192.168.168.13 X-archive-position: 10777 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stripathi@agami.com Precedence: bulk X-list: xfs Content-Length: 5096 Lines: 149 Hi Christoph, Did you do some testing for recovery when end of the physical log is seen ? When you will be dealing with striped ICLOG buffers or big sized ICLOGs, header size might range from 512 to 2k. Also, this header might be split into 2 parts at the end of physical log. Then, you don't have page size buffer. Please verify that XFS_BUF_SETP_PTRs work correctly for those cases. Same thing is true when data section is split around physical log. You can get one part which is not PAGE sized. (I am referring to the function xlog_do_recovery_pass). Regards, Shailendra Christoph Hellwig wrote: > Currently xfs_buf_get_noaddr allocates memory using kmem_alloc which > can end up either in kmalloc or vmalloc and assigns it to the buffer. > This patch changes it to allocate individual pages and if there is > more then one maps it into kernel virtual space using vmap. > > This means the minimum buffer allocation is PAGE_SIZE now. For two > of the three caller (log buffers, log recovery) that is perfectly > fine, because they always allocate buffers that are a power of two > of the page size anyway. 
For xfs_zero_remaining_bytes the minimum > allocation goes up from blocksize to pagesize and thus there is > a potential waste of memory for blocksize < pagesize allocations, > which is unfortunate but not directly solveable when block > drivers expect reference countable pages. To fix this waste > xfs_zero_remaining_bytes could be rewritten to zero more than > a single block at a time, which sounds like a good idea in general. > > > Signed-off-by: Christoph Hellwig > > Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c > =================================================================== > --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-03-05 15:54:40.000000000 +0100 > +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c 2007-03-05 15:54:47.000000000 +0100 > @@ -314,7 +314,7 @@ > > ASSERT(list_empty(&bp->b_hash_list)); > > - if (bp->b_flags & _XBF_PAGE_CACHE) { > + if (bp->b_flags & (_XBF_PAGE_CACHE|_XBF_PAGES)) { > uint i; > > if ((bp->b_flags & XBF_MAPPED) && (bp->b_page_count > 1)) > @@ -323,18 +323,11 @@ > for (i = 0; i < bp->b_page_count; i++) { > struct page *page = bp->b_pages[i]; > > - ASSERT(!PagePrivate(page)); > + if (bp->b_flags & _XBF_PAGE_CACHE) > + ASSERT(!PagePrivate(page)); > page_cache_release(page); > } > _xfs_buf_free_pages(bp); > - } else if (bp->b_flags & _XBF_KMEM_ALLOC) { > - /* > - * XXX(hch): bp->b_count_desired might be incorrect (see > - * xfs_buf_associate_memory for details), but fortunately > - * the Linux version of kmem_free ignores the len argument.. 
> - */ > - kmem_free(bp->b_addr, bp->b_count_desired); > - _xfs_buf_free_pages(bp); > } > > xfs_buf_deallocate(bp); > @@ -764,41 +757,41 @@ > size_t len, > xfs_buftarg_t *target) > { > - size_t malloc_len = len; > + unsigned long page_count = PAGE_ALIGN(len) >> PAGE_SHIFT; > + int error, i; > xfs_buf_t *bp; > - void *data; > - int error; > > bp = xfs_buf_allocate(0); > if (unlikely(bp == NULL)) > goto fail; > _xfs_buf_initialize(bp, target, 0, len, 0); > > - try_again: > - data = kmem_alloc(malloc_len, KM_SLEEP | KM_MAYFAIL | KM_LARGE); > - if (unlikely(data == NULL)) > + error = _xfs_buf_get_pages(bp, page_count, 0); > + if (error) > goto fail_free_buf; > > - /* check whether alignment matches.. */ > - if ((__psunsigned_t)data != > - ((__psunsigned_t)data & ~target->bt_smask)) { > - /* .. else double the size and try again */ > - kmem_free(data, malloc_len); > - malloc_len <<= 1; > - goto try_again; > - } > - > - error = xfs_buf_associate_memory(bp, data, len); > - if (error) > + for (i = 0; i < page_count; i++) { > + bp->b_pages[i] = alloc_page(GFP_KERNEL); > + if (!bp->b_pages[i]) > + goto fail_free_mem; > + } > + bp->b_flags |= _XBF_PAGES; > + > + error = _xfs_buf_map_pages(bp, XBF_MAPPED); > + if (unlikely(error)) { > + printk(KERN_WARNING "%s: failed to map pages\n", > + __FUNCTION__); > goto fail_free_mem; > - bp->b_flags |= _XBF_KMEM_ALLOC; > + } > > xfs_buf_unlock(bp); > > XB_TRACE(bp, "no_daddr", data); > return bp; > + > fail_free_mem: > - kmem_free(data, malloc_len); > + while (--i >= 0) > + __free_page(bp->b_pages[i]); > fail_free_buf: > xfs_buf_free(bp); > fail: > Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.h > =================================================================== > --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.h 2007-03-05 15:54:40.000000000 +0100 > +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.h 2007-03-05 15:55:06.000000000 +0100 > @@ -63,7 +63,7 @@ > > /* flags used only internally */ > _XBF_PAGE_CACHE = (1 << 17),/* backed by pagecache */
> - _XBF_KMEM_ALLOC = (1 << 18),/* backed by kmem_alloc() */ > + _XBF_PAGES = (1 << 18), /* backed by refcounted pages */ > _XBF_RUN_QUEUES = (1 << 19),/* run block device task queue */ > _XBF_DELWRI_Q = (1 << 21), /* buffer on delwri queue */ > } xfs_buf_flags_t; > > From owner-xfs@oss.sgi.com Wed Mar 7 04:40:36 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 04:40:40 -0800 (PST) X-Spam-oss-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27CeZ6p011221 for ; Wed, 7 Mar 2007 04:40:36 -0800 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l27CcPb2004102 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 7 Mar 2007 13:38:25 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l27CcO4m004099; Wed, 7 Mar 2007 13:38:24 +0100 Date: Wed, 7 Mar 2007 13:38:24 +0100 From: Christoph Hellwig To: Shailendra Tripathi Cc: Christoph Hellwig , xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr Message-ID: <20070307123824.GA3996@lst.de> References: <20070307101324.GC30587@lst.de> <45EEACA0.4050206@agami.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45EEACA0.4050206@agami.com> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 10778 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 1042 Lines: 22 On Wed, Mar 07, 2007 at 05:44:24PM +0530, Shailendra Tripathi wrote: > Hi Christoph, > Did you do some testing for recovery when end of the physical > log is seen ? 
I ran xfsqa over it, which should catch this case.

> When you will be dealing with striped ICLOG buffers or big
> sized ICLOGs, header size might range from 512 to 2k. Also, this header
> might be split into 2 parts at the end of physical log. Then, you don't
> have page size buffer. Please verify that XFS_BUF_SETP_PTRs work correctly
> for those cases.
> Same thing is true when data section is split around physical log.
> You can get one part which is not PAGE sized.

I should have made my wording clearer: we always do PAGE_SIZE + buffer allocations. After XFS_BUF_SETP_PTR the actually used buffer might be smaller. I tested XFS_BUF_SETP_PTR manually with artificial test code as well, and made sure it still works. Long term I have a plan to replace XFS_BUF_SETP_PTR with better schemes, but that's irrelevant for this patch.

From owner-xfs@oss.sgi.com Wed Mar 7 06:34:53 2007
Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 06:34:59 -0800 (PST)
X-Spam-oss-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_05, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012
Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27EYq6p009623 for ; Wed, 7 Mar 2007 06:34:53 -0800
Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id B8D9C1807DF12; Wed, 7 Mar 2007 08:34:51 -0600 (CST)
Message-ID: <45EECD8A.7000503@sandeen.net>
Date: Wed, 07 Mar 2007 08:34:50 -0600
From: Eric Sandeen
User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221)
MIME-Version: 1.0
To: Christoph Hellwig
CC: xfs@oss.sgi.com
Subject: Re: [PATCH] get rid of fname[] for tracing functions
References: <45EE2EF3.8090707@sandeen.net> <20070307082753.GA5469@infradead.org>
In-Reply-To: <20070307082753.GA5469@infradead.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit X-archive-position: 10780 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 524 Lines: 14 Christoph Hellwig wrote: >> the vn tracing could probably get a similar treatment, so that every >> call to vn_trace_foo wouldn't have to include a function name and a >> __builtin_return_address, but could be done via macros... it's currently >> a mishmash of __FUNCTION__ and "function" *shrug* what do you think? > > It should probably use __FUNCTION__ and hide use of both __FUNCTION__ > and __builtin_return_address behind a macro. > Yep, that's exactly what I meant, I've started a patch to do that too. -Eric From owner-xfs@oss.sgi.com Wed Mar 7 09:19:06 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 09:19:14 -0800 (PST) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27HJ46p023039 for ; Wed, 7 Mar 2007 09:19:06 -0800 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l27HFob2022759 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 7 Mar 2007 18:15:50 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l27HFlT1022755; Wed, 7 Mar 2007 18:15:47 +0100 Date: Wed, 7 Mar 2007 18:15:47 +0100 From: Christoph Hellwig To: Michael Nishimoto Cc: Christoph Hellwig , xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr Message-ID: <20070307171547.GA22641@lst.de> References: <20070307101324.GC30587@lst.de> <45EEF0B5.40905@agami.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline 
In-Reply-To: <45EEF0B5.40905@agami.com>
User-Agent: Mutt/1.3.28i
X-Scanned-By: MIMEDefang 2.39
X-archive-position: 10781
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: hch@lst.de
Precedence: bulk
X-list: xfs
Content-Length: 653
Lines: 14

On Wed, Mar 07, 2007 at 09:04:53AM -0800, Michael Nishimoto wrote:
> Incore log buffers are not always a power of two of the page size.
> In particular, when xfs is running over software raid devices, the
> log buffers are allocated to match the size of a stripe.
>
> However, they are always a multiple of PAGE_SIZE, so we are still safe.

It's not actually about being safe - any allocation is still safe with this patch. The issue is just that we waste memory because we round up allocations to the next multiple of the page size. The power of two bit is actually wrong in this mail, it was about another optimization I have that needs some more testing first.

From owner-xfs@oss.sgi.com Wed Mar 7 10:02:37 2007
Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Mar 2007 10:02:43 -0800 (PST)
X-Spam-oss-Status: No, score=-0.1 required=5.0 tests=AWL,BAYES_00, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012
Received: from ext.agami.com (64.221.212.177.ptr.us.xo.net [64.221.212.177] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l27I2E6p008255 for ; Wed, 7 Mar 2007 10:02:18 -0800
Received: from agami.com (mail [192.168.168.5]) by ext.agami.com (8.12.5/8.12.5) with ESMTP id l27H4e3A025528 for ; Wed, 7 Mar 2007 09:04:41 -0800
Received: from mx1.agami.com (mx1.agami.com [10.123.10.30]) by agami.com (8.12.11/8.12.11) with ESMTP id l27H4mNH013692 for ; Wed, 7 Mar 2007 09:04:48 -0800
Received: from [127.0.0.1] ([10.123.0.56]) by mx1.agami.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 7 Mar 2007 09:04:47 -0800
Message-ID: <45EEF0B5.40905@agami.com>
Date: Wed, 07 Mar 2007 09:04:53 -0800
From: Michael Nishimoto
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Christoph Hellwig CC: xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr References: <20070307101324.GC30587@lst.de> In-Reply-To: <20070307101324.GC30587@lst.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 07 Mar 2007 17:04:47.0821 (UTC) FILETIME=[B622A7D0:01C760DA] X-Scanned-By: MIMEDefang 2.58 on 192.168.168.13 X-archive-position: 10782 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: miken@agami.com Precedence: bulk X-list: xfs Content-Length: 4649 Lines: 142 Incore log buffers are not always a power of two of the page size. In particular, when xfs is running over software raid devices, the log buffers are allocated to match the size of a stripe. However, they are always a multiple of PAGE_SIZE, so we are still safe. Michael Christoph Hellwig wrote: >Currently xfs_buf_get_noaddr allocates memory using kmem_alloc which >can end up either in kmalloc or vmalloc and assigns it to the buffer. >This patch changes it to allocate individual pages and if there is >more then one maps it into kernel virtual space using vmap. > >This means the minimum buffer allocation is PAGE_SIZE now. For two >of the three caller (log buffers, log recovery) that is perfectly >fine, because they always allocate buffers that are a power of two >of the page size anyway. For xfs_zero_remaining_bytes the minimum >allocation goes up from blocksize to pagesize and thus there is >a potential waste of memory for blocksize < pagesize allocations, >which is unfortunate but not directly solveable when block >drivers expect reference countable pages. 
To fix this waste >xfs_zero_remaining_bytes could be rewritten to zero more than >a single block at a time, which sounds like a good idea in general. > > >Signed-off-by: Christoph Hellwig > >Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c >=================================================================== >--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-03-05 15:54:40.000000000 +0100 >+++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c 2007-03-05 15:54:47.000000000 +0100 >@@ -314,7 +314,7 @@ > > ASSERT(list_empty(&bp->b_hash_list)); > >- if (bp->b_flags & _XBF_PAGE_CACHE) { >+ if (bp->b_flags & (_XBF_PAGE_CACHE|_XBF_PAGES)) { > uint i; > > if ((bp->b_flags & XBF_MAPPED) && (bp->b_page_count > 1)) >@@ -323,18 +323,11 @@ > for (i = 0; i < bp->b_page_count; i++) { > struct page *page = bp->b_pages[i]; > >- ASSERT(!PagePrivate(page)); >+ if (bp->b_flags & _XBF_PAGE_CACHE) >+ ASSERT(!PagePrivate(page)); > page_cache_release(page); > } > _xfs_buf_free_pages(bp); >- } else if (bp->b_flags & _XBF_KMEM_ALLOC) { >- /* >- * XXX(hch): bp->b_count_desired might be incorrect (see >- * xfs_buf_associate_memory for details), but fortunately >- * the Linux version of kmem_free ignores the len argument.. >- */ >- kmem_free(bp->b_addr, bp->b_count_desired); >- _xfs_buf_free_pages(bp); > } > > xfs_buf_deallocate(bp); >@@ -764,41 +757,41 @@ > size_t len, > xfs_buftarg_t *target) > { >- size_t malloc_len = len; >+ unsigned long page_count = PAGE_ALIGN(len) >> PAGE_SHIFT; >+ int error, i; > xfs_buf_t *bp; >- void *data; >- int error; > > bp = xfs_buf_allocate(0); > if (unlikely(bp == NULL)) > goto fail; > _xfs_buf_initialize(bp, target, 0, len, 0); > >- try_again: >- data = kmem_alloc(malloc_len, KM_SLEEP | KM_MAYFAIL | KM_LARGE); >- if (unlikely(data == NULL)) >+ error = _xfs_buf_get_pages(bp, page_count, 0); >+ if (error) > goto fail_free_buf; > >- /* check whether alignment matches.. */ >- if ((__psunsigned_t)data != >- ((__psunsigned_t)data & ~target->bt_smask)) { >- /* .. 
else double the size and try again */ >- kmem_free(data, malloc_len); >- malloc_len <<= 1; >- goto try_again; >- } >- >- error = xfs_buf_associate_memory(bp, data, len); >- if (error) >+ for (i = 0; i < page_count; i++) { >+ bp->b_pages[i] = alloc_page(GFP_KERNEL); >+ if (!bp->b_pages[i]) >+ goto fail_free_mem; >+ } >+ bp->b_flags |= _XBF_PAGES; >+ >+ error = _xfs_buf_map_pages(bp, XBF_MAPPED); >+ if (unlikely(error)) { >+ printk(KERN_WARNING "%s: failed to map pages\n", >+ __FUNCTION__); > goto fail_free_mem; >- bp->b_flags |= _XBF_KMEM_ALLOC; >+ } > > xfs_buf_unlock(bp); > > XB_TRACE(bp, "no_daddr", data); > return bp; >+ > fail_free_mem: >- kmem_free(data, malloc_len); >+ while (--i >= 0) >+ __free_page(bp->b_pages[i]); > fail_free_buf: > xfs_buf_free(bp); > fail: >Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.h >=================================================================== >--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.h 2007-03-05 15:54:40.000000000 +0100 >+++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.h 2007-03-05 15:55:06.000000000 +0100 >@@ -63,7 +63,7 @@ > > /* flags used only internally */ > _XBF_PAGE_CACHE = (1 << 17),/* backed by pagecache */ >- _XBF_KMEM_ALLOC = (1 << 18),/* backed by kmem_alloc() */ >+ _XBF_PAGES = (1 << 18), /* backed by refcounted pages */ > _XBF_RUN_QUEUES = (1 << 19),/* run block device task queue */ > _XBF_DELWRI_Q = (1 << 21), /* buffer on delwri queue */ > } xfs_buf_flags_t; > > > > From owner-xfs@oss.sgi.com Thu Mar 8 19:58:20 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Mar 2007 19:58:25 -0800 (PST) X-Spam-oss-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l293wE6p002149 for ; Thu, 8 Mar 2007 19:58:18 -0800 Received: from pcbnaujok (pc-bnaujok.melbourne.sgi.com [134.14.55.58]) by larry.melbourne.sgi.com
(950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA12270; Fri, 9 Mar 2007 14:58:13 +1100 Message-Id: <200703090358.OAA12270@larry.melbourne.sgi.com> From: "Barry Naujok" To: Cc: Subject: RE: [PATCH] xfs_repair doesn't detect corrupt btree roots in nodes Date: Fri, 9 Mar 2007 14:59:10 +1100 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0083_01C7625B.7E6A2520" X-Mailer: Microsoft Office Outlook, Build 11.0.6353 Thread-Index: AcdWIVkkHK4i+MFOTcyO7HrNHKhnvAL3cH7w X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028 In-Reply-To: <200702220126.MAA19208@larry.melbourne.sgi.com> X-archive-position: 10784 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 5735 Lines: 183 This is a multi-part message in MIME format. ------=_NextPart_000_0083_01C7625B.7E6A2520 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Ping? > -----Original Message----- > From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] > On Behalf Of Barry Naujok > Sent: Thursday, 22 February 2007 12:33 PM > To: xfs@oss.sgi.com > Cc: xfs-dev@sgi.com > Subject: [PATCH] xfs_repair doesn't detect corrupt btree > roots in nodes > > The attached patch detect invalid btree root field (numrecs = > 0, levels > > permissable value). > > The patch also does some cleanup with the level and numrecs > usage for the > process_btinode function. 
> > ------=_NextPart_000_0083_01C7625B.7E6A2520 Content-Type: application/octet-stream; name="detect_bad_btree_root.diff" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="detect_bad_btree_root.diff" --- a/xfsprogs/repair/dinode.c 2007-03-09 14:57:49.000000000 +1100 +++ b/xfsprogs/repair/dinode.c 2007-03-09 14:54:59.354264507 +1100 @@ -1223,6 +1223,8 @@ process_btinode( xfs_bmbt_key_t *pkey; char *forkname; int i; + int level; + int numrecs; bmap_cursor_t cursor; =20 dib =3D (xfs_bmdr_block_t *)XFS_DFORK_PTR(dip, whichfork); @@ -1235,13 +1237,11 @@ process_btinode( else forkname =3D _("attr"); =20 - if (INT_GET(dib->bb_level, ARCH_CONVERT) =3D=3D 0) { + level =3D INT_GET(dib->bb_level, ARCH_CONVERT); + numrecs =3D INT_GET(dib->bb_numrecs, ARCH_CONVERT); + + if ((level =3D=3D 0) || (level > XFS_BM_MAXLEVELS(mp, whichfork))) { /* - * This should never happen since a btree inode - * has to have at least one other block in the - * bmap in addition to the root block in the - * inode's data fork. - * * XXX - if we were going to fix up the inode, * we'd try to treat the fork as an interior * node and see if we could get an accurate @@ -1249,28 +1249,30 @@ process_btinode( * to by the pointers in the fork. For now * though, we just bail (and blow out the inode). */ - do_warn(_("bad level 0 in inode %llu bmap btree root block\n"), + do_warn(_("bad level %d in inode %llu bmap btree root block\n"), + level, XFS_AGINO_TO_INO(mp, agno, ino)); + return(1); + } + if (numrecs =3D=3D 0) { + do_warn(_("bad numrecs 0 in inode %llu bmap btree root block\n"), XFS_AGINO_TO_INO(mp, agno, ino)); return(1); } /* * use bmdr/dfork_dsize since the root block is in the data fork */ - init_bm_cursor(&cursor, INT_GET(dib->bb_level, ARCH_CONVERT) + 1); - - if (XFS_BMDR_SPACE_CALC(INT_GET(dib->bb_numrecs, ARCH_CONVERT)) > - ((whichfork =3D=3D XFS_DATA_FORK) ? + if (XFS_BMDR_SPACE_CALC(numrecs) > ((whichfork =3D=3D XFS_DATA_FORK) ? 
XFS_DFORK_DSIZE(dip, mp) : XFS_DFORK_ASIZE(dip, mp))) { do_warn( _("indicated size of %s btree root (%d bytes) greater than space in " "inode %llu %s fork\n"), - forkname, XFS_BMDR_SPACE_CALC(INT_GET(dib->bb_numrecs, - ARCH_CONVERT)), - lino, forkname); + forkname, XFS_BMDR_SPACE_CALC(numrecs), lino, forkname); return(1); } =20 + init_bm_cursor(&cursor, level + 1); + pp =3D XFS_BTREE_PTR_ADDR( XFS_DFORK_SIZE(dip, mp, whichfork), xfs_bmdr, dib, 1, @@ -1286,7 +1288,7 @@ process_btinode( =20 last_key =3D NULLDFILOFF; =20 - for (i =3D 0; i < INT_GET(dib->bb_numrecs, ARCH_CONVERT); i++) { + for (i =3D 0; i < numrecs; i++) { /* * XXX - if we were going to do more to fix up the inode * btree, we'd do it right here. For now, if there's a @@ -1298,8 +1300,8 @@ process_btinode( return(1); } =20 - if (scan_lbtree((xfs_dfsbno_t)INT_GET(pp[i], ARCH_CONVERT), INT_GET(dib-= >bb_level, ARCH_CONVERT), - scanfunc_bmap, type, whichfork, + if (scan_lbtree((xfs_dfsbno_t)INT_GET(pp[i], ARCH_CONVERT), + level, scanfunc_bmap, type, whichfork, lino, tot, nex, blkmapp, &cursor, 1, check_dups)) return(1); @@ -1310,8 +1312,7 @@ process_btinode( * blocks but the parent hasn't been updated */ if (check_dups =3D=3D 0 && - cursor.level[INT_GET(dib->bb_level, - ARCH_CONVERT)-1].first_key !=3D + cursor.level[level-1].first_key !=3D INT_GET(pkey[i].br_startoff, ARCH_CONVERT)) { if (!no_modify) { do_warn( @@ -1319,22 +1320,19 @@ process_btinode( "%llu %s fork\n"), INT_GET(pkey[i].br_startoff, ARCH_CONVERT), - cursor.level[INT_GET(dib->bb_level, - ARCH_CONVERT)-1].first_key, + cursor.level[level-1].first_key, XFS_AGINO_TO_INO(mp, agno, ino), forkname); *dirty =3D 1; INT_SET(pkey[i].br_startoff, ARCH_CONVERT, - cursor.level[INT_GET(dib->bb_level, - ARCH_CONVERT)-1].first_key); + cursor.level[level-1].first_key); } else { do_warn( _("bad key in bmbt root (is %llu, would reset to %llu) in inode " "%llu %s fork\n"), INT_GET(pkey[i].br_startoff, ARCH_CONVERT), - cursor.level[INT_GET(dib->bb_level, - 
ARCH_CONVERT)-1].first_key, + cursor.level[level-1].first_key, XFS_AGINO_TO_INO(mp, agno, ino), forkname); } @@ -1345,8 +1343,7 @@ process_btinode( */ if (check_dups =3D=3D 0) { if (last_key !=3D NULLDFILOFF && last_key >=3D - cursor.level[INT_GET(dib->bb_level, - ARCH_CONVERT)-1].first_key) { + cursor.level[level-1].first_key) { do_warn( _("out of order bmbt root key %llu in inode %llu %s fork\n"), first_key, @@ -1354,8 +1351,7 @@ process_btinode( forkname); return(1); } - last_key =3D cursor.level[INT_GET(dib->bb_level, - ARCH_CONVERT)-1].first_key; + last_key =3D cursor.level[level-1].first_key; } } /* ------=_NextPart_000_0083_01C7625B.7E6A2520-- From owner-xfs@oss.sgi.com Thu Mar 8 20:04:14 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Mar 2007 20:04:18 -0800 (PST) X-Spam-oss-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2944B6p004445 for ; Thu, 8 Mar 2007 20:04:12 -0800 Received: from linuxbuild.melbourne.sgi.com (linuxbuild.melbourne.sgi.com [134.14.54.115]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA12463; Fri, 9 Mar 2007 15:04:05 +1100 From: donaldd@sgi.com Received: by linuxbuild.melbourne.sgi.com (Postfix, from userid 16365) id 35F48250DE15; Fri, 9 Mar 2007 15:04:05 +1100 (EST) To: xfs@oss.sgi.com, sgi.bugs.xfs@engr.sgi.com Subject: TAKE 961964 - xfs quota information can become stale when mounted with different quota types. Message-Id: <20070309040405.35F48250DE15@linuxbuild.melbourne.sgi.com> Date: Fri, 9 Mar 2007 15:04:05 +1100 (EST) X-archive-position: 10785 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Content-Length: 843 Lines: 22 Invalidate quotacheck when mounting without a quota type. 
When quotas are mounted or remounted without a particular quota type, the quota accounting for that type becomes invalid. Previously we were ignoring this, leading to accounting errors.

Thanks to Utako Kusaka for finding this and helping develop the fix.

Date: Fri Mar 9 15:02:34 AEDT 2007
Workarea: linuxbuild.melbourne.sgi.com:/home/donaldd/isms/2.6.x-xfs
Inspected by: Utako Kusaka ,vapo

The following file(s) were checked into:
longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb

Modid: xfs-linux-melb:xfs-kern:28225a
fs/xfs/quota/xfs_qm.c - 1.48 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm.c.diff?r1=text&tr1=1.48&r2=text&tr2=1.47&f=h
- Invalidate quotacheck state when mounting without a particular quota type.

From owner-xfs@oss.sgi.com Thu Mar 8 20:17:47 2007
Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Mar 2007 20:17:52 -0800 (PST)
X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_40 autolearn=ham version=3.2.0-pre1-r499012
Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l294Hg6p009325 for ; Thu, 8 Mar 2007 20:17:46 -0800
Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA12782; Fri, 9 Mar 2007 15:17:38 +1100
Received: by chook.melbourne.sgi.com (Postfix, from userid 1116) id 3C36B58F9284; Fri, 9 Mar 2007 15:17:38 +1100 (EST)
To: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com
Subject: TAKE 961970 - __u32 test in package m4 macro instead of AC_CHECK_TYPES
Message-Id: <20070309041738.3C36B58F9284@chook.melbourne.sgi.com>
Date: Fri, 9 Mar 2007 15:17:38 +1100 (EST)
From: tes@sgi.com (Tim Shimmin)
X-archive-position: 10786
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: tes@sgi.com
Precedence: bulk
X-list: xfs
Content-Length: 1126
Lines: 24

Need to check for
__u32 in our own m4 macro instead of using AC_CHECK_TYPES which doesn't exist on some older autoconf's. Date: Fri Mar 9 15:16:33 AEDT 2007 Workarea: chook.melbourne.sgi.com:/build/tes/xfs-cmds Inspected by: nscott@aconex.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:28226a xfsprogs/configure.in - 1.38 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/configure.in.diff?r1=text&tr1=1.38&r2=text&tr2=1.37&f=h xfsprogs/doc/CHANGES - 1.236 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/doc/CHANGES.diff?r1=text&tr1=1.236&r2=text&tr2=1.235&f=h xfsprogs/aclocal.m4 - 1.25 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/aclocal.m4.diff?r1=text&tr1=1.25&r2=text&tr2=1.24&f=h xfsprogs/m4/package_types.m4 - 1.3 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/m4/package_types.m4.diff?r1=text&tr1=1.3&r2=text&tr2=1.2&f=h - Need to check for __u32 in our own m4 macro instead of using AC_CHECK_TYPES which doesn't exist on some older autoconf's. 
From owner-xfs@oss.sgi.com Thu Mar 8 20:26:17 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Mar 2007 20:26:22 -0800 (PST) X-Spam-oss-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l294QD6p017080 for ; Thu, 8 Mar 2007 20:26:15 -0800 Received: from linuxbuild.melbourne.sgi.com (linuxbuild.melbourne.sgi.com [134.14.54.115]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA12902; Fri, 9 Mar 2007 15:26:09 +1100 From: donaldd@sgi.com Received: by linuxbuild.melbourne.sgi.com (Postfix, from userid 16365) id EA86225933C5; Fri, 9 Mar 2007 15:26:08 +1100 (EST) To: xfs@oss.sgi.com, sgi.bugs.xfs@engr.sgi.com Subject: TAKE 961964 - Enabling group quota enforcement on a filesystem with group accounting fails. Message-Id: <20070309042608.EA86225933C5@linuxbuild.melbourne.sgi.com> Date: Fri, 9 Mar 2007 15:26:08 +1100 (EST) X-archive-position: 10787 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Content-Length: 767 Lines: 23 Fix quotaon syscall failures for group enforcement requests. xfs_qm_scall_quotaon was incorrectly failing requests to enable group quota enforcement. Fixes logic error in OQUOTA handling. Patch provided by Kouta Ooizumi. Signed-off-by: Kouta Ooizumi Date: Fri Mar 9 15:22:52 AEDT 2007 Workarea: linuxbuild.melbourne.sgi.com:/home/donaldd/isms/2.6.x-xfs Inspected by: donaldd The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:28227a fs/xfs/quota/xfs_qm_syscalls.c - 1.32 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm_syscalls.c.diff?r1=text&tr1=1.32&r2=text&tr2=1.31&f=h - Fix handling of OQUOTA in xfs_qm_scall_quotaon. 
From owner-xfs@oss.sgi.com Thu Mar 8 22:05:38 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Mar 2007 22:05:43 -0800 (PST) X-Spam-oss-Status: No, score=0.5 required=5.0 tests=AWL,BAYES_50, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,J_CHICKENPOX_14,J_CHICKENPOX_15, RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from ext.agami.com (64.221.212.177.ptr.us.xo.net [64.221.212.177]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2965U6p010711 for ; Thu, 8 Mar 2007 22:05:32 -0800 Received: from agami.com (mail [192.168.168.5]) by ext.agami.com (8.12.5/8.12.5) with ESMTP id l2965T3A030392 for ; Thu, 8 Mar 2007 22:05:29 -0800 Received: from mx1.agami.com (mx1.agami.com [10.123.10.30]) by agami.com (8.12.11/8.12.11) with ESMTP id l2965auY010732 for ; Thu, 8 Mar 2007 22:05:36 -0800 Received: from [10.12.12.141] ([10.12.12.141]) by mx1.agami.com with Microsoft SMTPSVC(6.0.3790.1830); Thu, 8 Mar 2007 22:05:36 -0800 Message-ID: <45F0F7B6.9070406@agami.com> Date: Fri, 09 Mar 2007 11:29:18 +0530 From: Shailendra Tripathi User-Agent: Mozilla Thunderbird 0.9 (X11/20041127) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Barry Naujok CC: xfs@oss.sgi.com, xfs-dev@sgi.com Subject: Re: [PATCH] xfs_repair doesn't detect corrupt btree roots in nodes References: <200703090358.OAA12270@larry.melbourne.sgi.com> In-Reply-To: <200703090358.OAA12270@larry.melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 09 Mar 2007 06:05:36.0586 (UTC) FILETIME=[F496A2A0:01C76210] X-Scanned-By: MIMEDefang 2.58 on 192.168.168.13 X-archive-position: 10788 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stripathi@agami.com Precedence: bulk X-list: xfs Content-Length: 954 Lines: 35 Looks ok to me. Can you test if it works by manually corrupting some B+Tree using xfs_db (or in fact, you can write a small program which does so. 
You can just write a program which reads the location on the disk directly, modifies selected fields in the corresponding 512-byte sector, and writes it back out.) Try the code when the major B+Trees are corrupted --> AGI /BCNTi/BSIZE tree. -shailendra Barry Naujok wrote: > Ping? > > >>-----Original Message----- >>From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] >>On Behalf Of Barry Naujok >>Sent: Thursday, 22 February 2007 12:33 PM >>To: xfs@oss.sgi.com >>Cc: xfs-dev@sgi.com >>Subject: [PATCH] xfs_repair doesn't detect corrupt btree >>roots in nodes >> >>The attached patch detect invalid btree root field (numrecs = >>0, levels > >>permissable value). >> >>The patch also does some cleanup with the level and numrecs >>usage for the >>process_btinode function. >> >> From owner-xfs@oss.sgi.com Thu Mar 8 22:19:53 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Mar 2007 22:19:58 -0800 (PST) X-Spam-oss-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_50, MIME_QP_LONG_LINE autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l296Jl6p014723 for ; Thu, 8 Mar 2007 22:19:49 -0800 Received: from pcbnaujok (pc-bnaujok.melbourne.sgi.com [134.14.55.58]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA15327; Fri, 9 Mar 2007 17:19:45 +1100 Message-Id: <200703090619.RAA15327@larry.melbourne.sgi.com> From: "Barry Naujok" To: , Subject: [PATCH] New xfs_repair handling for inode nlink counts Date: Fri, 9 Mar 2007 17:20:28 +1100 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_008F_01C7626F.3C00DD50" X-Mailer: Microsoft Office Outlook, Build 11.0.6353 Thread-Index: AcdiEwgGddBZtAjVToWvsumIR0RD5Q== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028 X-archive-position: 10789 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com X-original-sender: bnaujok@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 33658 Lines: 1246 This is a multi-part message in MIME format. ------=_NextPart_000_008F_01C7626F.3C00DD50 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit The attached patch has 4 parts to it: - optimised phase 7 (inode nlink count) speed - improved memory usage for inode nlink counts - memory usage tracking - other speed improvements Overall, phase 7 is almost instant, and phases 6/7 use less memory than current versions of xfs_repair. The optimised phase 7 involved the patches to: dino_chunks.c This stores the on-disk nlink count for inodes into the inode tree that is created in phase 3. phase7.c This compares the on-disk nlink counts read in phase 3 to the actual counts generated in phase 6. If they differ, it creates a transaction and updates the inode on disk. No other disk I/O is generated. incore.h Added disk_nlinks to ino_tree_node_t structure and renamed nlinks to counted_nlinks in the backptrs_t structure. Also created set/get_inode_disk_nlinks inline functions. Due to the massive increase in memory required to store these counts for each inode in the filesystem, I have implemented memory optimisation using dynamically sized elements for each inode cluster. Initially, they start at 8 bits each and double in bits as required by inodes with large nlink counts. This implementation uses a table of "nlinkops" function pointers to keep CPU usage to a minimum. This is entirely implemented in incore.h and incore_ino.c. To measure memory used by various parts of xfs_repair, I implemented memory tracking in globals.h and globals.c. By default this is not compiled in, but it can be enabled by defining TRACK_MEMORY when compiling these two files. Finally, a small enhancement was made in xfs_repair.c. 
For filesystems that fit within the libxfs block cache, phase 6 is now significantly faster by flushing dirty blocks to disk rather than purging them from memory and then re-reading them during phase 6. The flush is required as the libxfs block and inode caches are not unified. ------=_NextPart_000_008F_01C7626F.3C00DD50 Content-Type: application/octet-stream; name="improved_repair_nlink_handling.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="improved_repair_nlink_handling.patch" Index: xfsprogs/repair/globals.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- xfsprogs.orig/repair/globals.c +++ xfsprogs/repair/globals.c @@ -20,3 +20,183 @@ =20 #define EXTERN #include "globals.h" + +#ifdef TRACK_MEMORY + +#undef calloc +#undef malloc +#undef memalign +#undef realloc +#undef free + +/* + * Track by file name pointer and also by return pointer + */ + +typedef struct func { + const char *file; + int line; + int64_t acount; + int64_t fcount; + int64_t rcount; + int64_t current; + int64_t peak; +} func_t; + +typedef struct entry { + struct entry *next; + func_t *fileline; + size_t size; + void *ptr; +} entry_t; + +static int caller_count =3D 0; +static int caller_size =3D 0; +static func_t *callers =3D NULL; + +static entry_t *ptrhash[256]; + +static +void track_alloc(const char *file, int line, size_t size, void *p) +{ + int i; + entry_t *e; + + /* find an existing func call from file/line */ + for (i =3D 0; i < caller_count; i++) { + if ((callers[i].file =3D=3D file) && (callers[i].line =3D=3D line)) + break; + } + if (i =3D=3D caller_count) { /* add new func if not found */ + if (caller_count =3D=3D caller_size) { + caller_size +=3D 64; + callers =3D realloc(callers, sizeof(func_t) * caller_size); + } + memset(&callers[i], 0, sizeof(func_t)); + 
callers[i].file =3D file; + callers[i].line =3D line; + caller_count++; + } + + e =3D malloc(sizeof(entry_t)); + e->size =3D size; + e->ptr =3D p; + e->fileline =3D &callers[i]; + + callers[i].acount++; + callers[i].current +=3D size; + if (callers[i].current > callers[i].peak) + callers[i].peak =3D callers[i].current; + + /* add pointer to hash list, very basic simple hash function */ + i =3D (((size_t)p) >> 8) & 0xff; + + e->next =3D ptrhash[i]; + ptrhash[i] =3D e; +} + +void *track_calloc(const char *file, int line, size_t num, size_t size) +{ + void *retval =3D calloc(num, size); + + if (retval !=3D NULL) + track_alloc(file, line, num * size, retval); + + return retval; +} + +void *track_malloc(const char *file, int line, size_t size) +{ + void *retval =3D malloc(size); + + if (retval !=3D NULL) + track_alloc(file, line, size, retval); + + return retval; +} + +void *track_memalign(const char *file, int line, size_t boundary, size_t s= ize) +{ + void *retval =3D memalign(boundary, size); + + if (retval !=3D NULL) + track_alloc(file, line, size, retval); + + return retval; +} + +void *track_realloc(const char *file, int line, void *ptr, size_t size) +{ + int i; + entry_t *e, *prev; + void *newptr =3D realloc(ptr, size); + + if (ptr =3D=3D NULL && newptr !=3D NULL) { + track_alloc(file, line, size, newptr); + return newptr; + } + + i =3D (((size_t)ptr) >> 8) & 0xff; + + prev =3D NULL; + for (e =3D ptrhash[i]; e; e =3D e->next) { + if (e->ptr =3D=3D ptr) + break; + prev =3D e; + } + if (!e) + return newptr; + + e->fileline->rcount++; + e->fileline->current =3D e->fileline->current + size - e->size; + if (e->fileline->current > e->fileline->peak) + e->fileline->peak =3D e->fileline->current; + e->size =3D size; + e->ptr =3D newptr; + + return newptr; +} + +void track_free(const char *file, int line, void *ptr) +{ + int i; + entry_t *e, *prev; + + free(ptr); + + /* find associated entry */ + i =3D (((size_t)ptr) >> 8) & 0xff; + + prev =3D NULL; + for (e =3D 
ptrhash[i]; e; e =3D e->next) { + if (e->ptr =3D=3D ptr) + break; + prev =3D e; + } + if (!e) + return; + + e->fileline->fcount++; + e->fileline->current -=3D e->size; + + if (prev) + prev->next =3D e->next; + else + ptrhash[i] =3D e->next; + free(e); +} + +void print_memory_usage(void) +{ + int i; + + printf("%20s:line \ta_cnt\tf_cnt\tr_cnt\tremain\tpeak\n", "file"); + for (i =3D 0; i < caller_count; i++) { + printf("%20s:%-5d\t%lld\t%lld\t%lld\t%lld\t%lld\n", + callers[i].file, callers[i].line, + callers[i].acount, callers[i].fcount, callers[i].rcount, + callers[i].current, callers[i].peak); + } +} + +#endif /* TRACK_MEMORY */ \ No newline at end of file Index: xfsprogs/repair/globals.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- xfsprogs.orig/repair/globals.h +++ xfsprogs/repair/globals.h @@ -16,6 +16,16 @@ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA */ =20 +#ifdef TRACK_MEMORY + +#define calloc(n,s) track_calloc(__FILE__, __LINE__, (n), (s)) +#define malloc(s) track_malloc(__FILE__, __LINE__, (s)) +#define memalign(b,s) track_memalign(__FILE__, __LINE__, (b), (s)) +#define realloc(p,s) track_realloc(__FILE__, __LINE__, (p), (s)) +#define free(p) track_free(__FILE__, __LINE__, (p)) + +#endif + #ifndef _XFS_REPAIR_GLOBAL_H #define _XFS_REPAIR_GLOBAL_H =20 @@ -23,6 +33,21 @@ #define EXTERN extern #endif =20 +#ifdef TRACK_MEMORY + +void print_memory_usage(void); +void *track_calloc(const char *file, int line, size_t num, size_t size); +void *track_malloc(const char *file, int line, size_t size); +void *track_memalign(const char *file, int line, size_t boundary, size_t s= ize); +void *track_realloc(const char *file, int line, void *ptr, size_t size); +void track_free(const char *file, int line, void *ptr); + +#else + +#define print_memory_usage() do { } while(0) + +#endif + /* useful 
macros */ =20 #define rounddown(x, y) (((x)/(y))*(y)) Index: xfsprogs/repair/incore.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- xfsprogs.orig/repair/incore.h +++ xfsprogs/repair/incore.h @@ -328,6 +328,8 @@ =20 typedef xfs_ino_t parent_entry_t; =20 +struct nlink_ops; + typedef struct parent_list { __uint64_t pmask; parent_entry_t *pentries; @@ -339,8 +341,8 @@ typedef struct backptrs { __uint64_t ino_reached; /* bit =3D=3D 1 if reached */ __uint64_t ino_processed; /* reference checked bit mask */ - __uint32_t nlinks[XFS_INODES_PER_CHUNK]; parent_list_t *parents; + __uint8_t counted_nlinks[XFS_INODES_PER_CHUNK]; } backptrs_t; =20 typedef struct ino_tree_node { @@ -349,12 +351,24 @@ xfs_inofree_t ir_free; /* inode free bit mask */ __uint64_t ino_confirmed; /* confirmed bitmask */ __uint64_t ino_isa_dir; /* bit =3D=3D 1 if a directory */ + struct nlink_ops *nlinkops; /* pointer to current nlink ops */ + __uint8_t *disk_nlinks; /* pointer to an array of nlinks */ union { backptrs_t *backptrs; parent_list_t *plist; } ino_un; } ino_tree_node_t; =20 +typedef struct nlink_ops { + const int nlink_size; + void (*disk_nlink_set)(ino_tree_node_t *, int, __uint32_t); + __uint32_t (*disk_nlink_get)(ino_tree_node_t *, int); + __uint32_t (*counted_nlink_get)(ino_tree_node_t *, int); + __uint32_t (*counted_nlink_inc)(ino_tree_node_t *, int); + __uint32_t (*counted_nlink_dec)(ino_tree_node_t *, int); +} nlink_ops_t; + + #define INOS_PER_IREC (sizeof(__uint64_t) * NBBY) void add_ino_backptrs(xfs_mount_t *mp); =20 @@ -528,7 +542,7 @@ { ASSERT(ino_rec->ino_un.backptrs !=3D NULL); =20 - ino_rec->ino_un.backptrs->nlinks[ino_offset]++; + (*ino_rec->nlinkops->counted_nlink_inc)(ino_rec, ino_offset); XFS_INO_RCHD_SET_RCHD(ino_rec, ino_offset); =20 ASSERT(is_inode_reached(ino_rec, ino_offset)); @@ -539,16 +553,15 @@ { 
ASSERT(ino_rec->ino_un.backptrs !=3D NULL); =20 - ino_rec->ino_un.backptrs->nlinks[ino_offset]++; + (*ino_rec->nlinkops->counted_nlink_inc)(ino_rec, ino_offset); } =20 static inline void drop_inode_ref(ino_tree_node_t *ino_rec, int ino_offset) { ASSERT(ino_rec->ino_un.backptrs !=3D NULL); - ASSERT(ino_rec->ino_un.backptrs->nlinks[ino_offset] > 0); =20 - if (--ino_rec->ino_un.backptrs->nlinks[ino_offset] =3D=3D 0) + if ((*ino_rec->nlinkops->counted_nlink_dec)(ino_rec, ino_offset) =3D=3D 0) XFS_INO_RCHD_CLR_RCHD(ino_rec, ino_offset); } =20 @@ -556,14 +569,28 @@ is_inode_referenced(ino_tree_node_t *ino_rec, int ino_offset) { ASSERT(ino_rec->ino_un.backptrs !=3D NULL); - return(ino_rec->ino_un.backptrs->nlinks[ino_offset] > 0); + + return (*ino_rec->nlinkops->counted_nlink_get)(ino_rec, ino_offset) > 0; } =20 static inline __uint32_t num_inode_references(ino_tree_node_t *ino_rec, int ino_offset) { ASSERT(ino_rec->ino_un.backptrs !=3D NULL); - return(ino_rec->ino_un.backptrs->nlinks[ino_offset]); + + return (*ino_rec->nlinkops->counted_nlink_get)(ino_rec, ino_offset); +} + +static inline void +set_inode_disk_nlinks(ino_tree_node_t *ino_rec, int ino_offset, __uint32_t= nlinks) +{ + (*ino_rec->nlinkops->disk_nlink_set)(ino_rec, ino_offset, nlinks); +} + +static inline __uint32_t +get_inode_disk_nlinks(ino_tree_node_t *ino_rec, int ino_offset) +{ + return (*ino_rec->nlinkops->disk_nlink_get)(ino_rec, ino_offset); } =20 /* Index: xfsprogs/repair/incore_ino.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- xfsprogs.orig/repair/incore_ino.c +++ xfsprogs/repair/incore_ino.c @@ -50,6 +50,223 @@ =20 static ino_flist_t ino_flist; /* free list must be initialized before use = */ =20 +/* memory optimised nlink counting for all inodes */ + +static void +nlink_grow_8_to_16(ino_tree_node_t *ino_rec); +static void 
+nlink_grow_16_to_32(ino_tree_node_t *ino_rec); + +static void +disk_nlink_32_set(ino_tree_node_t *ino_rec, int ino_offset, __uint32_t nli= nks) +{ + ((__uint32_t*)ino_rec->disk_nlinks)[ino_offset] =3D nlinks; +} + +static __uint32_t +disk_nlink_32_get(ino_tree_node_t *ino_rec, int ino_offset) +{ + return ((__uint32_t*)ino_rec->disk_nlinks)[ino_offset]; +} + +static __uint32_t +counted_nlink_32_get(ino_tree_node_t *ino_rec, int ino_offset) +{ + __uint32_t *nlinks =3D (__uint32_t*)&ino_rec->ino_un.backptrs->counted_nl= inks; + + return nlinks[ino_offset]; +} + +static __uint32_t +counted_nlink_32_inc(ino_tree_node_t *ino_rec, int ino_offset) +{ + __uint32_t *nlinks =3D (__uint32_t*)&ino_rec->ino_un.backptrs->counted_nl= inks; + + return ++(nlinks[ino_offset]); +} + +static __uint32_t +counted_nlink_32_dec(ino_tree_node_t *ino_rec, int ino_offset) +{ + __uint32_t *nlinks =3D (__uint32_t*)&ino_rec->ino_un.backptrs->counted_nl= inks; + + ASSERT(nlinks[ino_offset] > 0); + return --(nlinks[ino_offset]); +} + + +static void +disk_nlink_16_set(ino_tree_node_t *ino_rec, int ino_offset, __uint32_t nli= nks) +{ + if (nlinks >=3D 0x10000) { + nlink_grow_16_to_32(ino_rec); + disk_nlink_32_set(ino_rec, ino_offset, nlinks); + } else + ((__uint16_t*)ino_rec->disk_nlinks)[ino_offset] =3D nlinks; +} + +static __uint32_t +disk_nlink_16_get(ino_tree_node_t *ino_rec, int ino_offset) +{ + return ((__uint16_t*)ino_rec->disk_nlinks)[ino_offset]; +} + +static __uint32_t +counted_nlink_16_get(ino_tree_node_t *ino_rec, int ino_offset) +{ + __uint16_t *nlinks =3D (__uint16_t*)&ino_rec->ino_un.backptrs->counted_nl= inks; + + return nlinks[ino_offset]; +} + +static __uint32_t +counted_nlink_grow_16_to_32(ino_tree_node_t *ino_rec, int ino_offset) +{ + int i; + backptrs_t *grown =3D realloc(ino_rec->ino_un.backptrs, + offsetof(backptrs_t, counted_nlinks) + + sizeof(__uint32_t) * XFS_INODES_PER_CHUNK); + if (grown =3D=3D NULL) + do_error(_("couldn't allocate memory for backptrs\n")); + + /* start 
from end working to start as we are overwriting the array */ + for (i =3D XFS_INODES_PER_CHUNK-1; i >=3D 0; i--) { + ((__uint32_t*)&grown->counted_nlinks)[i] =3D + ((__uint16_t*)&grown->counted_nlinks)[i]; + } + ino_rec->ino_un.backptrs =3D grown; + nlink_grow_16_to_32(ino_rec); + return ++((__uint32_t*)&grown->counted_nlinks)[ino_offset]; +} + +static __uint32_t +counted_nlink_16_inc(ino_tree_node_t *ino_rec, int ino_offset) +{ + __uint16_t *nlinks =3D (__uint16_t*)&ino_rec->ino_un.backptrs->counted_nl= inks; + + if (nlinks[ino_offset] =3D=3D 0xffff) + return counted_nlink_grow_16_to_32(ino_rec, ino_offset); + return ++(nlinks[ino_offset]); +} + +static __uint32_t +counted_nlink_16_dec(ino_tree_node_t *ino_rec, int ino_offset) +{ + __uint16_t *nlinks =3D (__uint16_t*)&ino_rec->ino_un.backptrs->counted_nl= inks; + + ASSERT(nlinks[ino_offset] > 0); + return --(nlinks[ino_offset]); +} + + +static void +disk_nlink_8_set(ino_tree_node_t *ino_rec, int ino_offset, __uint32_t nlin= ks) +{ + ASSERT(full_backptrs =3D=3D 0); + + if (nlinks >=3D 0x100) { + nlink_grow_8_to_16(ino_rec); + disk_nlink_16_set(ino_rec, ino_offset, nlinks); + } else + ino_rec->disk_nlinks[ino_offset] =3D nlinks; +} + +static __uint32_t +disk_nlink_8_get(ino_tree_node_t *ino_rec, int ino_offset) +{ + return ino_rec->disk_nlinks[ino_offset]; +} + +static __uint32_t +counted_nlink_8_get(ino_tree_node_t *ino_rec, int ino_offset) +{ + return ino_rec->ino_un.backptrs->counted_nlinks[ino_offset]; +} + +static __uint32_t +counted_nlink_grow_8_to_16(ino_tree_node_t *ino_rec, int ino_offset) +{ + int i; + backptrs_t *grown =3D realloc(ino_rec->ino_un.backptrs, + offsetof(backptrs_t, counted_nlinks) + + sizeof(__uint16_t) * XFS_INODES_PER_CHUNK); + if (grown =3D=3D NULL) + do_error(_("couldn't allocate memory for backptrs\n")); + + /* + * start from end working to start as we are overwriting the array + */ + for (i =3D XFS_INODES_PER_CHUNK-1; i >=3D 0; i--) { + ((__uint16_t*)&grown->counted_nlinks)[i] =3D + 
grown->counted_nlinks[i]; + } + ino_rec->ino_un.backptrs =3D grown; + nlink_grow_8_to_16(ino_rec); + return ++((__uint16_t*)&grown->counted_nlinks)[ino_offset]; +} + +static __uint32_t +counted_nlink_8_inc(ino_tree_node_t *ino_rec, int ino_offset) +{ + if (ino_rec->ino_un.backptrs->counted_nlinks[ino_offset] =3D=3D 0xff) + return counted_nlink_grow_8_to_16(ino_rec, ino_offset); + return ++(ino_rec->ino_un.backptrs->counted_nlinks[ino_offset]); +} + +static __uint32_t +counted_nlink_8_dec(ino_tree_node_t *ino_rec, int ino_offset) +{ + ASSERT(ino_rec->ino_un.backptrs->counted_nlinks[ino_offset] > 0); + return --(ino_rec->ino_un.backptrs->counted_nlinks[ino_offset]); +} + + +static nlink_ops_t nlinkops[] =3D { + {sizeof(__uint8_t) * XFS_INODES_PER_CHUNK, + disk_nlink_8_set, disk_nlink_8_get, + counted_nlink_8_get, counted_nlink_8_inc, counted_nlink_8_dec}, + {sizeof(__uint16_t) * XFS_INODES_PER_CHUNK, + disk_nlink_16_set, disk_nlink_16_get, + counted_nlink_16_get, counted_nlink_16_inc, counted_nlink_16_dec}, + {sizeof(__uint32_t) * XFS_INODES_PER_CHUNK, + disk_nlink_32_set, disk_nlink_32_get, + counted_nlink_32_get, counted_nlink_32_inc, counted_nlink_32_dec}, +}; + +static void +nlink_grow_8_to_16(ino_tree_node_t *ino_rec) +{ + __uint16_t *new_nlinks; + int i; + + new_nlinks =3D malloc(sizeof(__uint16_t) * XFS_INODES_PER_CHUNK); + if (new_nlinks =3D=3D NULL) + do_error(_("could not allocate expanded nlink array\n")); + for (i =3D 0; i < XFS_INODES_PER_CHUNK; i++) + new_nlinks[i] =3D ino_rec->disk_nlinks[i]; + free(ino_rec->disk_nlinks); + ino_rec->disk_nlinks =3D (__uint8_t*)new_nlinks; + + ino_rec->nlinkops =3D &nlinkops[1]; +} + +static void +nlink_grow_16_to_32(ino_tree_node_t *ino_rec) +{ + __uint32_t *new_nlinks; + int i; + + new_nlinks =3D malloc(sizeof(__uint32_t) * XFS_INODES_PER_CHUNK); + if (new_nlinks =3D=3D NULL) + do_error(_("could not allocate expanded nlink array\n")); + for (i =3D 0; i < XFS_INODES_PER_CHUNK; i++) + new_nlinks[i] =3D 
((__int16_t*)&ino_rec->disk_nlinks)[i]; + free(ino_rec->disk_nlinks); + ino_rec->disk_nlinks =3D (__uint8_t*)new_nlinks; + + ino_rec->nlinkops =3D &nlinkops[2]; +} + /* * next is the uncertain inode list -- a sorted (in ascending order) * list of inode records sorted on the starting inode number. There @@ -104,6 +321,10 @@ new->ino_isa_dir =3D 0; new->ir_free =3D (xfs_inofree_t) - 1; new->ino_un.backptrs =3D NULL; + new->nlinkops =3D &nlinkops[0]; + new->disk_nlinks =3D calloc(sizeof(__uint8_t), XFS_INODES_PER_CHUNK); + if (new->disk_nlinks =3D=3D NULL) + do_error(_("inode nlink array malloc failed\n")); =20 return(new); } @@ -131,6 +352,8 @@ ino_flist.list =3D ino_rec; ino_flist.cnt++; =20 + free(ino_rec->disk_nlinks); + if (ino_rec->ino_un.backptrs !=3D NULL) { if (full_backptrs && ino_rec->ino_un.backptrs->parents !=3D NULL) free(ino_rec->ino_un.backptrs->parents); @@ -555,73 +778,39 @@ return(0LL); } =20 -backptrs_t * -get_backptr(void) +void +alloc_backptr(ino_tree_node_t *irec) { - backptrs_t *ptr; - - if ((ptr =3D malloc(sizeof(backptrs_t))) =3D=3D NULL) + int size; + backptrs_t *ptr; + parent_list_t *tmp; + + tmp =3D irec->ino_un.plist; + size =3D offsetof(backptrs_t, counted_nlinks) + + irec->nlinkops->nlink_size; + irec->ino_un.backptrs =3D (backptrs_t *)malloc(size); + if (irec->ino_un.backptrs =3D=3D NULL) do_error(_("could not malloc back pointer table\n")); =20 - bzero(ptr, sizeof(backptrs_t)); - - return(ptr); + memset(irec->ino_un.backptrs, 0, size); + irec->ino_un.backptrs->parents =3D tmp; } =20 void add_ino_backptrs(xfs_mount_t *mp) { -#ifdef XR_BCKPTR_DBG - xfs_ino_t ino; - int j, k; -#endif /* XR_BCKPTR_DBG */ ino_tree_node_t *ino_rec; - parent_list_t *tmp; xfs_agnumber_t i; =20 for (i =3D 0; i < mp->m_sb.sb_agcount; i++) { ino_rec =3D findfirst_inode_rec(i); =20 while (ino_rec !=3D NULL) { - tmp =3D ino_rec->ino_un.plist; - ino_rec->ino_un.backptrs =3D get_backptr(); - ino_rec->ino_un.backptrs->parents =3D tmp; - -#ifdef XR_BCKPTR_DBG - if 
(tmp !=3D NULL) { - k =3D 0; - for (j =3D 0; j < XFS_INODES_PER_CHUNK; j++) { - ino =3D XFS_AGINO_TO_INO(mp, i, - ino_rec->ino_startnum + j); - if (ino =3D=3D 25165846) { - do_warn("THERE 1 !!!\n"); - } - if (tmp->pentries[j] !=3D 0) { - k++; - do_warn( - "inode %llu - parent %llu\n", - ino, - tmp->pentries[j]); - if (ino =3D=3D 25165846) { - do_warn("THERE!!!\n"); - } - } - } - - if (k !=3D tmp->cnt) { - do_warn( - "ERROR - count =3D %d, counted %d\n", - tmp->cnt, k); - } - } -#endif /* XR_BCKPTR_DBG */ + alloc_backptr(ino_rec); ino_rec =3D next_ino_rec(ino_rec); } } - full_backptrs =3D 1; - - return; } =20 static __psunsigned_t Index: xfsprogs/repair/xfs_repair.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- xfsprogs.orig/repair/xfs_repair.c +++ xfsprogs/repair/xfs_repair.c @@ -277,7 +277,7 @@ case 't': report_interval =3D (int) strtol(optarg, 0, 0); break; -=09=09=09 + case '?': usage(); } @@ -563,7 +563,7 @@ =20 /* XXX: nathans - something in phase4 ain't playing by */ /* the buffer cache rules.. why doesn't IRIX hit this? 
*/ - libxfs_bcache_purge(); + libxfs_bcache_flush(); =20 if (no_modify) printf(_("No modify flag set, skipping phase 5\n")); @@ -576,6 +576,8 @@ phase6(mp); timestamp(PHASE_END, 6, NULL); =20 + libxfs_bcache_flush(); + phase7(mp); timestamp(PHASE_END, 7, NULL); } else { @@ -640,6 +642,9 @@ if (do_parallel && report_interval) stop_progress_rpt(); =20 + if (verbose > 1) + print_memory_usage(); + if (no_modify) { do_log( _("No modify flag set, skipping filesystem flush and exiting.\n")); Index: xfsprogs/repair/dino_chunks.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- xfsprogs.orig/repair/dino_chunks.c +++ xfsprogs/repair/dino_chunks.c @@ -779,6 +779,13 @@ do_warn(_("would correct imap\n")); } set_inode_used(ino_rec, irec_offset); + /* + * store on-disk nlink count for comparing in phase 7 + */ + set_inode_disk_nlinks(ino_rec, irec_offset, + dino->di_core.di_version > XFS_DINODE_VERSION_1 + ? 
be32_to_cpu(dino->di_core.di_nlink) + : be16_to_cpu(dino->di_core.di_onlink)); } else { set_inode_free(ino_rec, irec_offset); } Index: xfsprogs/repair/phase7.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- xfsprogs.orig/repair/phase7.c +++ xfsprogs/repair/phase7.c @@ -30,91 +30,110 @@ #include "threads.h" =20 /* dinoc is a pointer to the IN-CORE dinode core */ -void -set_nlinks(xfs_dinode_core_t *dinoc, - xfs_ino_t ino, - __uint32_t nrefs, - int *dirty) +static void +set_nlinks( + xfs_dinode_core_t *dinoc, + xfs_ino_t ino, + __uint32_t nrefs, + int *dirty) { - if (!no_modify) { - if (dinoc->di_nlink !=3D nrefs) { - *dirty =3D 1; - do_warn( - _("resetting inode %llu nlinks from %d to %d\n"), - ino, dinoc->di_nlink, nrefs); + if (dinoc->di_nlink =3D=3D nrefs) + return; =20 - if (nrefs > XFS_MAXLINK_1) { - ASSERT(fs_inode_nlink); - do_warn( + if (!no_modify) { + *dirty =3D 1; + do_warn(_("resetting inode %llu nlinks from %d to %d\n"), + ino, dinoc->di_nlink, nrefs); + + if (nrefs > XFS_MAXLINK_1) { + ASSERT(fs_inode_nlink); + do_warn( _("nlinks %d will overflow v1 ino, ino %llu will be converted to version 2= \n"), - nrefs, ino); + nrefs, ino); =20 - } - dinoc->di_nlink =3D nrefs; } + dinoc->di_nlink =3D nrefs; } else { - if (dinoc->di_nlink !=3D nrefs) + do_warn(_("would have reset inode %llu nlinks from %d to %d\n"), + ino, dinoc->di_nlink, nrefs); + } +} + +static void +update_inode_nlinks( + xfs_mount_t *mp, + xfs_ino_t ino, + __uint32_t nlinks) +{ + xfs_trans_t *tp; + xfs_inode_t *ip; + int error; + int dirty; + + tp =3D libxfs_trans_alloc(mp, XFS_TRANS_REMOVE); + + error =3D libxfs_trans_reserve(tp, (no_modify ? 
0 : 10), + XFS_REMOVE_LOG_RES(mp), 0, XFS_TRANS_PERM_LOG_RES, + XFS_REMOVE_LOG_COUNT); + + ASSERT(error =3D=3D 0); + + error =3D libxfs_trans_iget(mp, tp, ino, 0, 0, &ip); + + if (error) { + if (!no_modify) + do_error(_("couldn't map inode %llu, err =3D %d\n"), + ino, error); + else { do_warn( - _("would have reset inode %llu nlinks from %d to %d\n"), - ino, dinoc->di_nlink, nrefs); + _("couldn't map inode %llu, err =3D %d, can't compare link counts\n"), + ino, error); + return; + } + } + + dirty =3D 0; + + /* + * compare and set links for all inodes + * but the lost+found inode. we keep + * that correct as we go. + */ + if (ino !=3D orphanage_ino) + set_nlinks(&ip->i_d, ino, nlinks, &dirty); + + if (!dirty) { + libxfs_trans_iput(tp, ip, 0); + libxfs_trans_cancel(tp, XFS_TRANS_RELEASE_LOG_RES); + } else { + libxfs_trans_log_inode(tp, ip, XFS_ILOG_CORE); + /* + * no need to do a bmap finish since + * we're not allocating anything + */ + ASSERT(error =3D=3D 0); + error =3D libxfs_trans_commit(tp, XFS_TRANS_RELEASE_LOG_RES | + XFS_TRANS_SYNC, NULL); + + ASSERT(error =3D=3D 0); } } =20 -void +static void phase7_alt_function(xfs_mount_t *mp, xfs_agnumber_t agno) { - register ino_tree_node_t *irec; + ino_tree_node_t *irec; int j; - int chunk_dirty; - int inode_dirty; - xfs_ino_t ino; __uint32_t nrefs; - xfs_agblock_t agbno; - xfs_dinode_t *dip; - ino_tree_node_t *ino_ra; - xfs_buf_t *bp; - - if (verbose) - do_log(_(" - agno =3D %d\n"), agno); - - ino_ra =3D prefetch_inode_chunks(mp, agno, NULL); =20 /* - * read on-disk inodes in chunks. then, - * look at each on-disk inode 1 at a time. - * if the number of links is bad, reset it. + * using the nlink values memorised during phase3/4, compare to the + * nlink counted in phase 6, and if different, update on-disk. 
*/ =20 irec =3D findfirst_inode_rec(agno); =20 while (irec !=3D NULL) { - - if (ino_ra && (irec->ino_startnum >=3D ino_ra->ino_startnum)) - ino_ra =3D prefetch_inode_chunks(mp, agno, ino_ra); - - agbno =3D XFS_AGINO_TO_AGBNO(mp, irec->ino_startnum); - bp =3D libxfs_readbuf(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, XFS_IALLOC_BLOCKS(mp)), 0); - if (bp =3D=3D NULL) { - if (!no_modify) { - do_error( - _("cannot read inode %llu, disk block %lld, cnt %d\n"), - XFS_AGINO_TO_INO(mp, agno, irec->ino_startnum), - XFS_AGB_TO_DADDR(mp, agno, agbno), - (int)XFS_FSB_TO_BB(mp, XFS_IALLOC_BLOCKS(mp))); - /* NOT REACHED */ - } - do_warn( - _("cannot read inode %llu, disk block %lld, cnt %d\n"), - XFS_AGINO_TO_INO(mp, agno, irec->ino_startnum), - XFS_AGB_TO_DADDR(mp, agno, agbno), - (int)XFS_FSB_TO_BB(mp, XFS_IALLOC_BLOCKS(mp))); - - irec =3D next_ino_rec(irec); - continue; /* while */ - } - chunk_dirty =3D 0; for (j =3D 0; j < XFS_INODES_PER_CHUNK; j++) { assert(is_inode_confirmed(irec, j)); =20 @@ -122,110 +141,27 @@ continue; =20 assert(no_modify || is_inode_reached(irec, j)); - assert(no_modify || - is_inode_referenced(irec, j)); + assert(no_modify || is_inode_referenced(irec, j)); =20 nrefs =3D num_inode_references(irec, j); =20 - ino =3D XFS_AGINO_TO_INO(mp, agno, - irec->ino_startnum + j); - - dip =3D (xfs_dinode_t *)(XFS_BUF_PTR(bp) + - (j << mp->m_sb.sb_inodelog)); -=09=09=09 - inode_dirty =3D 0; - - /* Swap the fields we care about to native format */ - dip->di_core.di_magic =3D INT_GET(dip->di_core.di_magic,=20 - ARCH_CONVERT); - dip->di_core.di_onlink =3D INT_GET(dip->di_core.di_onlink,=20 - ARCH_CONVERT); - if (INT_GET(dip->di_core.di_version, ARCH_CONVERT) =3D=3D - XFS_DINODE_VERSION_1)=20 - dip->di_core.di_nlink =3D dip->di_core.di_onlink; - else=20 - dip->di_core.di_nlink =3D=20 - INT_GET(dip->di_core.di_nlink,=20 - ARCH_CONVERT); - - if (dip->di_core.di_magic !=3D XFS_DINODE_MAGIC) { - if (!no_modify) { - do_error( - _("ino: %llu, bad 
d_inode magic saw: (0x%x) expecting (0x%x)\n"), - ino, dip->di_core.di_magic, XFS_DINODE_MAGIC); - /* NOT REACHED */ - } - do_warn( - _("ino: %llu, bad d_inode magic saw: (0x%x) expecting (0x%x)\n"), - ino, dip->di_core.di_magic, XFS_DINODE_MAGIC); - continue; - } - /* - * compare and set links for all inodes - * but the lost+found inode. we keep - * that correct as we go. - */ - if (dip->di_core.di_nlink !=3D nrefs) { - if (ino !=3D orphanage_ino) { - set_nlinks(&dip->di_core, ino, - nrefs, &inode_dirty); - } - } - - /* Swap the fields back */ - dip->di_core.di_magic =3D INT_GET(dip->di_core.di_magic,=20 - ARCH_CONVERT); - if (inode_dirty && INT_GET(dip->di_core.di_version,=20 - ARCH_CONVERT) =3D=3D XFS_DINODE_VERSION_1) { - if (!XFS_SB_VERSION_HASNLINK(&mp->m_sb)) { - ASSERT(dip->di_core.di_nlink <=3D=20 - XFS_MAXLINK_1); - INT_SET(dip->di_core.di_onlink,=20 - ARCH_CONVERT, - dip->di_core.di_nlink); - dip->di_core.di_nlink =3D=20 - INT_GET(dip->di_core.di_nlink,=20 - ARCH_CONVERT); - } else { - /* superblock support v2 nlinks */ - INT_SET(dip->di_core.di_version,=20 - ARCH_CONVERT, XFS_DINODE_VERSION_2); - dip->di_core.di_nlink =3D=20 - INT_GET(dip->di_core.di_nlink,=20 - ARCH_CONVERT); - dip->di_core.di_onlink =3D 0; - memset(&(dip->di_core.di_pad[0]), 0, - sizeof(dip->di_core.di_pad)); - }=09 - } else { - dip->di_core.di_nlink =3D=20 - INT_GET(dip->di_core.di_nlink,=20 - ARCH_CONVERT); - dip->di_core.di_onlink =3D=20 - INT_GET(dip->di_core.di_onlink,=20 - ARCH_CONVERT); - } - chunk_dirty |=3D inode_dirty; + if (get_inode_disk_nlinks(irec, j) !=3D nrefs) + update_inode_nlinks(mp, XFS_AGINO_TO_INO(mp, + agno, irec->ino_startnum + j), + nrefs); } - - if (chunk_dirty) - libxfs_writebuf(bp, 0); - else - libxfs_putbuf(bp); - irec =3D next_ino_rec(irec); PROG_RPT_INC(prog_rpt_done[agno], XFS_INODES_PER_CHUNK); } } =20 -void +static void phase7_alt(xfs_mount_t *mp) { int i; =20 set_progress_msg(no_modify ? 
PROGRESS_FMT_VRFY_LINK : PROGRESS_FMT_CORR_L= INK, (__uint64_t) mp->m_sb.sb_icount); - libxfs_bcache_purge(); =20 for (i =3D 0; i < glob_agcount; i++) { queue_work(phase7_alt_function, mp, i); @@ -238,13 +174,8 @@ phase7(xfs_mount_t *mp) { ino_tree_node_t *irec; - xfs_inode_t *ip; - xfs_trans_t *tp; int i; int j; - int error; - int dirty; - xfs_ino_t ino; __uint32_t nrefs; =20 if (!no_modify) @@ -252,25 +183,14 @@ else do_log(_("Phase 7 - verify link counts...\n")); =20 - if (do_prefetch) { phase7_alt(mp); return; } =20 - tp =3D libxfs_trans_alloc(mp, XFS_TRANS_REMOVE); - - error =3D libxfs_trans_reserve(tp, (no_modify ? 0 : 10), - XFS_REMOVE_LOG_RES(mp), 0, XFS_TRANS_PERM_LOG_RES, - XFS_REMOVE_LOG_COUNT); - - ASSERT(error =3D=3D 0); - /* - * for each ag, look at each inode 1 at a time using the - * sim code. if the number of links is bad, reset it, - * log the inode core, commit the transaction, and - * allocate a new transaction + * for each ag, look at each inode 1 at a time. If the number of + * links is bad, reset it, log the inode core, commit the transaction */ for (i =3D 0; i < glob_agcount; i++) { irec =3D findfirst_inode_rec(i); @@ -288,69 +208,13 @@ =20 nrefs =3D num_inode_references(irec, j); =20 - ino =3D XFS_AGINO_TO_INO(mp, i, - irec->ino_startnum + j); - - error =3D libxfs_trans_iget(mp, tp, ino, 0, 0, &ip); - - if (error) { - if (!no_modify) - do_error( - _("couldn't map inode %llu, err =3D %d\n"), - ino, error); - else { - do_warn( - _("couldn't map inode %llu, err =3D %d, can't compare link counts\n"), - ino, error); - continue; - } - } - - dirty =3D 0; - - /* - * compare and set links for all inodes - * but the lost+found inode. we keep - * that correct as we go. 
- */ - if (ino != orphanage_ino) - set_nlinks(&ip->i_d, ino, nrefs, - &dirty); - - if (!dirty) { - libxfs_trans_iput(tp, ip, 0); - } else { - libxfs_trans_log_inode(tp, ip, - XFS_ILOG_CORE); - /* - * no need to do a bmap finish since - * we're not allocating anything - */ - ASSERT(error == 0); - error = libxfs_trans_commit(tp, - XFS_TRANS_RELEASE_LOG_RES| - XFS_TRANS_SYNC, NULL); - - ASSERT(error == 0); - - tp = libxfs_trans_alloc(mp, - XFS_TRANS_REMOVE); - - error = libxfs_trans_reserve(tp, - (no_modify ? 0 : 10), - XFS_REMOVE_LOG_RES(mp), - 0, XFS_TRANS_PERM_LOG_RES, - XFS_REMOVE_LOG_COUNT); - ASSERT(error == 0); - } + if (get_inode_disk_nlinks(irec, j) != nrefs) + update_inode_nlinks(mp, + XFS_AGINO_TO_INO(mp, i, + irec->ino_startnum + j), + nrefs); } irec = next_ino_rec(irec); } } - - /* - * always have one unfinished transaction coming out - * of the loop. cancel it. - */ - libxfs_trans_cancel(tp, XFS_TRANS_RELEASE_LOG_RES); } ------=_NextPart_000_008F_01C7626F.3C00DD50--

From owner-xfs@oss.sgi.com Thu Mar 8 23:54:08 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Mar 2007 23:54:12 -0800 (PST) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_61,J_CHICKENPOX_71,J_CHICKENPOX_81 autolearn=no version=3.2.0-pre1-r499012 Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l297s66p005196 for ; Thu, 8 Mar 2007 23:54:08 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1HPZcA-0002bc-EC; Fri, 09 Mar 2007 07:34:10 +0000 Date: Fri, 9 Mar 2007 07:34:10 +0000 From: Christoph Hellwig To: Barry Naujok Cc: xfs@oss.sgi.com, xfs-dev@sgi.com Subject: Re: [PATCH] New xfs_repair handling for inode nlink counts Message-ID: <20070309073410.GA8798@infradead.org> References: <200703090619.RAA15327@larry.melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline In-Reply-To: <200703090619.RAA15327@larry.melbourne.sgi.com> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 10790 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 1227 Lines: 50

+#ifdef TRACK_MEMORY
+
+#undef calloc
+#undef malloc
+#undef memalign
+#undef realloc
+#undef free

Can you put all this into a memory_tracking.h file that gets included with:

#ifdef TRACK_MEMORY
#include "track_memory.h"
#endif

instead of polluting the implementation file?

+ /* add pointer to hash list, very basic simple hash function */
+ i = (((size_t)p) >> 8) & 0xff;

+ i = (((size_t)ptr) >> 8) & 0xff;

Note that there is no guarantee that size_t and pointers have the same
length, and there are systems where they do not (win64?); better to cast
to uintptr_t here.

--- xfsprogs.orig/repair/globals.h
+++ xfsprogs/repair/globals.h
@@ -16,6 +16,16 @@
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
 */

+#ifdef TRACK_MEMORY
+
+#define calloc(n,s) track_calloc(__FILE__, __LINE__, (n), (s))
+#define malloc(s) track_malloc(__FILE__, __LINE__, (s))
+#define memalign(b,s) track_memalign(__FILE__, __LINE__, (b), (s))
+#define realloc(p,s) track_realloc(__FILE__, __LINE__, (p), (s))
+#define free(p) track_free(__FILE__, __LINE__, (p))
+
+#endif
+
 #ifndef _XFS_REPAIR_GLOBAL_H
 #define _XFS_REPAIR_GLOBAL_H

The memory tracking should probably come after the inclusion guards.
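The uintptr_t suggestion in the review above can be sketched roughly as follows. This is only an illustration, not code from the patch: the names track_hash_ptr and TRACK_HASH_SIZE are hypothetical, and only the shift-by-8, mask-with-0xff hash is taken from the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bucket count; the quoted hunk masks with 0xff (256 buckets). */
#define TRACK_HASH_SIZE 256

/*
 * Hash a pointer into a bucket index (hypothetical helper).
 * uintptr_t is the integer type intended to hold a converted object
 * pointer, so the cast does not depend on size_t and pointers having
 * the same width.
 */
static unsigned int
track_hash_ptr(const void *p)
{
	uintptr_t v = (uintptr_t)p;

	/* the mask below only works if the bucket count is a power of two */
	assert((TRACK_HASH_SIZE & (TRACK_HASH_SIZE - 1)) == 0);

	/* same "very basic" hash as the hunk: drop the low 8 bits, keep 8 */
	return (unsigned int)((v >> 8) & (TRACK_HASH_SIZE - 1));
}
```

A caller would use it as i = track_hash_ptr(p); in place of the open-coded (size_t) casts. uintptr_t comes from <stdint.h>, which assumes a C99 toolchain.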
From owner-xfs@oss.sgi.com Fri Mar 9 03:55:23 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 09 Mar 2007 03:55:29 -0800 (PST) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l29BtJ6p003153 for ; Fri, 9 Mar 2007 03:55:22 -0800 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l29BtBb2020551 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 9 Mar 2007 12:55:11 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l29BtBtD020549; Fri, 9 Mar 2007 12:55:11 +0100 Date: Fri, 9 Mar 2007 12:55:11 +0100 From: Christoph Hellwig To: xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org Cc: linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr Message-ID: <20070309115511.GA20426@lst.de> References: <20070307101324.GC30587@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070307101324.GC30587@lst.de> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 10791 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 3368 Lines: 112 Ed Cashin found a bug in the error handling code for the case where a page allocation fails. 
Here's the updated version: Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-03-08 19:08:38.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c 2007-03-09 08:59:15.000000000 +0100 @@ -314,7 +314,7 @@ ASSERT(list_empty(&bp->b_hash_list)); - if (bp->b_flags & _XBF_PAGE_CACHE) { + if (bp->b_flags & (_XBF_PAGE_CACHE|_XBF_PAGES)) { uint i; if ((bp->b_flags & XBF_MAPPED) && (bp->b_page_count > 1)) @@ -323,18 +323,11 @@ for (i = 0; i < bp->b_page_count; i++) { struct page *page = bp->b_pages[i]; - ASSERT(!PagePrivate(page)); + if (bp->b_flags & _XBF_PAGE_CACHE) + ASSERT(!PagePrivate(page)); page_cache_release(page); } _xfs_buf_free_pages(bp); - } else if (bp->b_flags & _XBF_KMEM_ALLOC) { - /* - * XXX(hch): bp->b_count_desired might be incorrect (see - * xfs_buf_associate_memory for details), but fortunately - * the Linux version of kmem_free ignores the len argument.. - */ - kmem_free(bp->b_addr, bp->b_count_desired); - _xfs_buf_free_pages(bp); } xfs_buf_deallocate(bp); @@ -764,41 +757,41 @@ size_t len, xfs_buftarg_t *target) { - size_t malloc_len = len; + unsigned long page_count = PAGE_ALIGN(len) >> PAGE_SHIFT; + int error, i; xfs_buf_t *bp; - void *data; - int error; bp = xfs_buf_allocate(0); if (unlikely(bp == NULL)) goto fail; _xfs_buf_initialize(bp, target, 0, len, 0); - try_again: - data = kmem_alloc(malloc_len, KM_SLEEP | KM_MAYFAIL | KM_LARGE); - if (unlikely(data == NULL)) + error = _xfs_buf_get_pages(bp, page_count, 0); + if (error) goto fail_free_buf; - /* check whether alignment matches.. */ - if ((__psunsigned_t)data != - ((__psunsigned_t)data & ~target->bt_smask)) { - /* .. 
else double the size and try again */ - kmem_free(data, malloc_len); - malloc_len <<= 1; - goto try_again; - } - - error = xfs_buf_associate_memory(bp, data, len); - if (error) + for (i = 0; i < page_count; i++) { + bp->b_pages[i] = alloc_page(GFP_KERNEL); + if (!bp->b_pages[i]) + goto fail_free_mem; + } + bp->b_flags |= _XBF_PAGES; + + error = _xfs_buf_map_pages(bp, XBF_MAPPED); + if (unlikely(error)) { + printk(KERN_WARNING "%s: failed to map pages\n", + __FUNCTION__); goto fail_free_mem; - bp->b_flags |= _XBF_KMEM_ALLOC; + } xfs_buf_unlock(bp); XB_TRACE(bp, "no_daddr", data); return bp; + fail_free_mem: - kmem_free(data, malloc_len); + for ( ; i >= 0; i--) + __free_page(bp->b_pages[i]); fail_free_buf: xfs_buf_free(bp); fail: Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.h =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.h 2007-03-08 19:08:38.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.h 2007-03-09 08:58:50.000000000 +0100 @@ -63,7 +63,7 @@ /* flags used only internally */ _XBF_PAGE_CACHE = (1 << 17),/* backed by pagecache */ - _XBF_KMEM_ALLOC = (1 << 18),/* backed by kmem_alloc() */ + _XBF_PAGES = (1 << 18), /* backed by refcounted pages */ _XBF_RUN_QUEUES = (1 << 19),/* run block device task queue */ _XBF_DELWRI_Q = (1 << 21), /* buffer on delwri queue */ } xfs_buf_flags_t; From owner-xfs@oss.sgi.com Fri Mar 9 07:33:57 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 09 Mar 2007 07:34:00 -0800 (PST) X-Spam-oss-Status: No, score=0.5 required=5.0 tests=AWL,BAYES_99 autolearn=no version=3.2.0-pre1-r499012 Received: from youju.siksai.co.uk (youju.siksai.co.uk [87.127.14.180]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l29FXs6p024336 for ; Fri, 9 Mar 2007 07:33:56 -0800 Received: from xiao.siksai.co.uk ([87.127.14.179] ident=rhowe) by youju.siksai.co.uk with smtp (Exim 4.50) id 1HPgSY-0005BH-QQ for xfs@oss.sgi.com; Fri, 09 Mar 2007 14:52:52 +0000 Received: by 
xiao.siksai.co.uk (sSMTP sendmail emulation); Fri, 09 Mar 2007 14:54:01 +0000 From: "Russell Howe" Date: Fri, 9 Mar 2007 14:54:01 +0000 To: xfs@oss.sgi.com Subject: Andrew Morton talking about filesystems at FOSDEM 2007 Message-ID: <20070309145400.GA23808@xiao.rsnet> Mail-Followup-To: xfs@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.13 (2006-08-11) X-archive-position: 10792 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: rhowe@siksai.co.uk Precedence: bulk X-list: xfs Content-Length: 2909 Lines: 55 Just thought this might be of interest to some folks on the list - akpm gave a talk at FOSDEM about the kernel and various bits & pieces and someone asked about filesystems during the Q&A session. Full video available from http://www.fosdem.org/2007/media/video "Hi. Um, earlier you were a little bit scathing about ext4. Um, what are the alternatives? Hans Reiser doesn't appear to be at FOSDEM this year. Is ZFS likely to get merged in, or are there any other possibilities there and what sucks about ext4? What don't you like?" "Sorry, which filesystems did you mention?" "ext4?" "Yep, I thought you mentioned some other filesystems" "Oh, ZFS, the Sun..." "ZFS? Well, um I've yet to see the patch [chuckles]. I'm not aware of any Sun-supported effort to do that. ext3? Well, I've worked on ext3 for so long I'm kinda sick of it I guess. It doesn't perform very well. The way in which it journals is a little bit klunky, and there are many things I'd like to do to it but simply do not have time to. So I think the block based journalling probably wasn't the right way to do it. It was a good way to get a journalling filesystem that was compatible with ext2 but I think logical journalling is probably / would have been a smarter approach to take. Also, the performance of ext3 is not great. 
Occasionally when I get time I'll get down and run some benchmarks against XFS and I just scratch my head and I just do not know how XFS does some of the things it does. Particularly with respect to file layout, it is astonishingly good. But unfortunately with XFS the codebase is so complex we're not / really vendors are not supporting it, so erm in some ways that's just [??] so few people understand XFS internals, one of the attractive things about ext3 and ext4 is that so many different companies have engineers working on it. So erm given we got this excellent performance but very complicated codebase which few people understand and on the other side we've got one which doesn't perform so well but a lot of people understand it seems the decision's been taken to evolve the slower but well understood one. It could be JFS is a good filesystem now as well. JFS had reputational problems in the first couple of years when it tended to crash a lot. That may not be the case any more but I'm just not aware of anybody who's using JFS much any more. But I'd imagine xfs... ext4 should be getting as good as XFS on file layout by the end of the year. How it'll compare on the benchmarks I don't know yet." [question indistinct] "... shadow copying, things like that?" "Nope, no there are no plans for fancy features like that that I'm aware of. At this stage we're simply trying to get the bandwidth, lock contention, file layout and those sorts of issues sorted out." -- Russell Howe | Why be just another cog in the machine, rhowe@siksai.co.uk | when you can be the spanner in the works? 
From owner-xfs@oss.sgi.com Fri Mar 9 09:22:57 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 09 Mar 2007 09:23:05 -0800 (PST) X-Spam-oss-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from lab41.emea.sgi.com (lab41.emea.sgi.com [144.253.75.41]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l29HMu6p019124 for ; Fri, 9 Mar 2007 09:22:57 -0800 Received: by lab41.emea.sgi.com (Postfix, from userid 1000) id E9FA222ADE; Fri, 9 Mar 2007 17:42:15 +0000 (GMT) To: xfs@oss.sgi.com Subject: TAKE 961990 - propogate return codes from flush routines Message-Id: <20070309174215.E9FA222ADE@lab41.emea.sgi.com> Date: Fri, 9 Mar 2007 17:42:15 +0000 (GMT) From: lachlan@lab41.emea.sgi.com (Lachlan McIlroy) X-archive-position: 10793 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@lab41.emea.sgi.com Precedence: bulk X-list: xfs Content-Length: 2670 Lines: 47

propagate return codes from flush routines

This patch handles error return values in fs_flush_pages and
fs_flushinval_pages. It changes the prototype of fs_flushinval_pages
so we can propagate the errors and handle them at higher layers. I also
modified xfs_itruncate_start so that it could propagate the error further.
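The general pattern described above, a flush routine that reports failure through its return value and a caller that passes the error up instead of dropping it, might look roughly like this. It is a minimal sketch under stated assumptions, not the checked-in change: flush_pages_sketch and truncate_start_sketch are hypothetical stand-ins, and neither the real fs_flushinval_pages prototype nor the error sign convention is shown in this message.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for a flush routine. With a void prototype the
 * caller could never learn that the flush failed; returning int lets it.
 * Returning a positive errno value is an arbitrary choice for this sketch.
 */
static int
flush_pages_sketch(int simulate_failure)
{
	return simulate_failure ? EIO : 0;
}

/*
 * Hypothetical higher-layer caller: check the flush result and hand any
 * error to its own caller rather than carrying on regardless.
 */
static int
truncate_start_sketch(int simulate_failure)
{
	int error;

	error = flush_pages_sketch(simulate_failure);
	if (error)
		return error;	/* propagate instead of ignoring */

	/* the rest of the operation would run only after a clean flush */
	return 0;
}
```

Each layer has to change its prototype in turn, which may be why a change of this kind touches a header for every .c file it modifies.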
Date: Sat Mar 10 04:19:34 AEDT 2007 Workarea: vpn-emea-sw-emea-160-34.emea.sgi.com:/home/lachlan/isms/2.6.x-mod Inspected by: stewart@flamingspork.com Author: lachlan The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:28231a fs/xfs/xfs_vnodeops.c - 1.692 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.692&r2=text&tr2=1.691&f=h fs/xfs/xfs_vfsops.c - 1.517 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.517&r2=text&tr2=1.516&f=h fs/xfs/xfs_dfrag.c - 1.59 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dfrag.c.diff?r1=text&tr1=1.59&r2=text&tr2=1.58&f=h fs/xfs/xfs_inode.c - 1.462 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.462&r2=text&tr2=1.461&f=h fs/xfs/xfs_inode.h - 1.218 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.h.diff?r1=text&tr1=1.218&r2=text&tr2=1.217&f=h fs/xfs/xfs_utils.c - 1.74 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_utils.c.diff?r1=text&tr1=1.74&r2=text&tr2=1.73&f=h fs/xfs/linux-2.6/xfs_lrw.c - 1.257 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_lrw.c.diff?r1=text&tr1=1.257&r2=text&tr2=1.256&f=h fs/xfs/linux-2.6/xfs_vnode.h - 1.127 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_vnode.h.diff?r1=text&tr1=1.127&r2=text&tr2=1.126&f=h fs/xfs/linux-2.6/xfs_fs_subr.c - 1.50 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_fs_subr.c.diff?r1=text&tr1=1.50&r2=text&tr2=1.49&f=h fs/xfs/linux-2.6/xfs_fs_subr.h - 1.14 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_fs_subr.h.diff?r1=text&tr1=1.14&r2=text&tr2=1.13&f=h fs/xfs/linux-2.4/xfs_vnode.h - 1.115 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.4/xfs_vnode.h.diff?r1=text&tr1=1.115&r2=text&tr2=1.114&f=h fs/xfs/linux-2.4/xfs_fs_subr.c - 1.49 
- changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.4/xfs_fs_subr.c.diff?r1=text&tr1=1.49&r2=text&tr2=1.48&f=h fs/xfs/linux-2.4/xfs_fs_subr.h - 1.18 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.4/xfs_fs_subr.h.diff?r1=text&tr1=1.18&r2=text&tr2=1.17&f=h - propogate return codes from flush routines From owner-xfs@oss.sgi.com Sun Mar 11 21:41:38 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 11 Mar 2007 21:41:54 -0700 (PDT) X-Spam-oss-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2C4fZ6p026772 for ; Sun, 11 Mar 2007 21:41:37 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA06110; Mon, 12 Mar 2007 15:41:25 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2C4fLAf25558398; Mon, 12 Mar 2007 15:41:23 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2C4fH0L25507609; Mon, 12 Mar 2007 15:41:17 +1100 (AEDT) Date: Mon, 12 Mar 2007 15:41:17 +1100 From: David Chinner To: Christoph Hellwig Cc: xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 1/2] xfs: use xfs_get_buf_noaddr for iclogs Message-ID: <20070312044117.GK6095633@melbourne.sgi.com> References: <20070307101314.GB30587@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070307101314.GB30587@lst.de> User-Agent: Mutt/1.4.2.1i X-archive-position: 10797 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 2278 Lines: 
69

On Wed, Mar 07, 2007 at 11:13:14AM +0100, Christoph Hellwig wrote:
> xfs_buf_get_noaddr. There's a subtle change because
> xfs_buf_get_empty returns the buffer locked, but xfs_buf_get_noaddr
> returns it unlocked. From my auditing and testing nothing in the
> log I/O code cares about this distinction, but I'd be happy if
> someone could try to prove this independently.

Looks safe to me - we initialise all the fields in the xfs_buf_t when
we allocate out of the slab, so it doesn't really matter what state
the buffer is in when we free it.

OTOH, all other buffers are supposed to be locked when under I/O.
This change makes a special case for the log buffers, and I'd prefer
not to have to remember that this behaviour changed for log buffers at
some point in time. I suggest adding the unquoted XFS_BUF_PSEMA() line
below to lock the buffer:

> - iclog->hic_data = (xlog_in_core_2_t *)
> - kmem_zalloc(iclogsize, KM_SLEEP | KM_LARGE);
> -
> iclog->ic_prev = prev_iclog;
> prev_iclog = iclog;
> +
> + bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp);
> + XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone);
> + XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb);
> + XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1);
+ XFS_BUF_PSEMA(bp, PRIBIO);
> + iclog->ic_bp = bp;
> + iclog->hic_data = bp->b_addr;
> +
> log->l_iclog_bak[i] = (xfs_caddr_t)&(iclog->ic_header);
>
> head = &iclog->ic_header;

That way we don't change any semantics of the code at all.
> @@ -1216,11 +1221,6 @@ > INT_SET(head->h_fmt, ARCH_CONVERT, XLOG_FMT); > memcpy(&head->h_fs_uuid, &mp->m_sb.sb_uuid, sizeof(uuid_t)); > > - bp = xfs_buf_get_empty(log->l_iclog_size, mp->m_logdev_targp); > - XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); > - XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); > - XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); > - iclog->ic_bp = bp; > > iclog->ic_size = XFS_BUF_SIZE(bp) - log->l_iclog_hsize; > iclog->ic_state = XLOG_STATE_ACTIVE; > @@ -1229,7 +1229,6 @@ > iclog->ic_datap = (char *)iclog->hic_data + log->l_iclog_hsize; > > ASSERT(XFS_BUF_ISBUSY(iclog->ic_bp)); > - ASSERT(XFS_BUF_VALUSEMA(iclog->ic_bp) <= 0); And this assert can then stay... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Mon Mar 12 12:14:37 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 12 Mar 2007 12:14:48 -0700 (PDT) X-Spam-oss-Status: No, score=1.0 required=5.0 tests=BAYES_60,HTML_MESSAGE autolearn=ham version=3.2.0-pre1-r499012 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2CJEW6p004126 for ; Mon, 12 Mar 2007 12:14:34 -0700 Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by mga09.intel.com with ESMTP; 12 Mar 2007 12:03:37 -0700 Received: from fmsmsx334.amr.corp.intel.com ([132.233.42.1]) by fmsmga002.fm.intel.com with ESMTP; 12 Mar 2007 12:03:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: i="4.14,275,1170662400"; d="scan'208,217"; a="57173450:sNHT43053696" Received: from fmsmsx415.amr.corp.intel.com ([10.19.19.7]) by fmsmsx334.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.1830); Mon, 12 Mar 2007 12:03:36 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 MIME-Version: 1.0 Subject: one question about XFS file system Date: Mon, 12 Mar 2007 12:03:33 -0700 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: one question about XFS file system Thread-Index: Acdk2SFHr9UUh2kWSFysqHXBPvEY3g== From: 
"Meng, Nick" To: X-OriginalArrivalTime: 12 Mar 2007 19:03:36.0777 (UTC) FILETIME=[2363C390:01C764D9] Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit X-archive-position: 10799 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nick.meng@intel.com Precedence: bulk X-list: xfs Content-Length: 789 Lines: 54 Hi XFSers, I am planning to build a XFS file system in an EM64T/RHEL4U2 system. Here is my question: Can I build a XFS file system with 2 x U320SCSI controllers and 8 x R15K high speed Disks in my EM64T/RHEL4U2? =================================== SEAGATE ST373454LC 0005 SEAGATE ST373454LC 0005 SEAGATE ST373454LC 0005 SEAGATE ST373454LC 0005 SEAGATE ST373454LC 0005 SEAGATE ST373454LC 0005 SEAGATE ST373454LC 0005 SEAGATE ST373454LC 0005 =================================== Any input will be appreciated. Best Regards, Nick Meng [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Mon Mar 12 13:56:40 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 12 Mar 2007 13:56:47 -0700 (PDT) X-Spam-oss-Status: No, score=-1.2 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_33,J_CHICKENPOX_34 autolearn=no version=3.2.0-pre1-r499012 Received: from relay.sw.ru (mailhub.sw.ru [195.214.233.200]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2CKuc6p028881 for ; Mon, 12 Mar 2007 13:56:39 -0700 Received: from localhost ([192.168.3.106]) by relay.sw.ru (8.13.4/8.13.4) with ESMTP id l2CKIwoL025016; Mon, 12 Mar 2007 23:19:00 +0300 (MSK) To: linux-kernel@vger.kernel.org CC: Nick Piggin , devel@openvz.org, xfs@oss.sgi.com, linux-ntfs-dev@lists.sourceforge.net Subject: [PATCH 1/2] mm: move common segment checks to separate helper function (v7) From: Dmitriy Monakhov Date: Mon, 12 Mar 2007 23:19:31 +0300 Message-ID: <87veh6z030.fsf@sw.ru> User-Agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
X-archive-position: 10800 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dmonakhov@openvz.org Precedence: bulk X-list: xfs Content-Length: 6660 Lines: 222 Changes against v6 - remove duplicated code from xfs,ntfs - export generic_segment_checks, because it used by xfs,nfs now. - change arguments initialization pocily according to Nick's comments. Tested with: ltp readv/writev tests Signed-off-by: Monakhov Dmitriy --- fs/ntfs/file.c | 21 ++--------- fs/xfs/linux-2.6/xfs_lrw.c | 22 ++---------- include/linux/fs.h | 3 ++ mm/filemap.c | 83 ++++++++++++++++++++++++------------------- 4 files changed, 55 insertions(+), 74 deletions(-) diff --git a/fs/ntfs/file.c b/fs/ntfs/file.c index dbbac55..621de36 100644 --- a/fs/ntfs/file.c +++ b/fs/ntfs/file.c @@ -2129,28 +2129,13 @@ static ssize_t ntfs_file_aio_write_nolock(struct kiocb *iocb, struct address_space *mapping = file->f_mapping; struct inode *inode = mapping->host; loff_t pos; - unsigned long seg; size_t count; /* after file limit checks */ ssize_t written, err; count = 0; - for (seg = 0; seg < nr_segs; seg++) { - const struct iovec *iv = &iov[seg]; - /* - * If any segment has a negative length, or the cumulative - * length ever wraps negative then return -EINVAL. - */ - count += iv->iov_len; - if (unlikely((ssize_t)(count|iv->iov_len) < 0)) - return -EINVAL; - if (access_ok(VERIFY_READ, iv->iov_base, iv->iov_len)) - continue; - if (!seg) - return -EFAULT; - nr_segs = seg; - count -= iv->iov_len; /* This segment is no good */ - break; - } + err = generic_segment_checks(iov, &nr_segs, &count, VERIFY_READ); + if (err) + return err; pos = *ppos; vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE); /* We can write back this queue in page reclaim. 
*/ diff --git a/fs/xfs/linux-2.6/xfs_lrw.c b/fs/xfs/linux-2.6/xfs_lrw.c index ff8d64e..558076d 100644 --- a/fs/xfs/linux-2.6/xfs_lrw.c +++ b/fs/xfs/linux-2.6/xfs_lrw.c @@ -639,7 +639,6 @@ xfs_write( xfs_fsize_t isize, new_size; xfs_iocore_t *io; bhv_vnode_t *vp; - unsigned long seg; int iolock; int eventsent = 0; bhv_vrwlock_t locktype; @@ -652,24 +651,9 @@ xfs_write( vp = BHV_TO_VNODE(bdp); xip = XFS_BHVTOI(bdp); - for (seg = 0; seg < segs; seg++) { - const struct iovec *iv = &iovp[seg]; - - /* - * If any segment has a negative length, or the cumulative - * length ever wraps negative then return -EINVAL. - */ - ocount += iv->iov_len; - if (unlikely((ssize_t)(ocount|iv->iov_len) < 0)) - return -EINVAL; - if (access_ok(VERIFY_READ, iv->iov_base, iv->iov_len)) - continue; - if (seg == 0) - return -EFAULT; - segs = seg; - ocount -= iv->iov_len; /* This segment is no good */ - break; - } + error = generic_segment_checks(iovp, &segs, &ocount, VERIFY_READ); + if (error) + return error; count = ocount; pos = *offset; diff --git a/include/linux/fs.h b/include/linux/fs.h index 6a3d22e..3b99450 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1778,6 +1778,9 @@ extern ssize_t generic_file_sendfile(struct file *, loff_t *, size_t, read_actor extern void do_generic_mapping_read(struct address_space *mapping, struct file_ra_state *, struct file *, loff_t *, read_descriptor_t *, read_actor_t); +extern int generic_segment_checks(const struct iovec *iov, + unsigned long *nr_segs, size_t *count, + unsigned long access_flags); /* fs/splice.c */ extern ssize_t generic_file_splice_read(struct file *, loff_t *, diff --git a/mm/filemap.c b/mm/filemap.c index 8e1849a..8bd1ea4 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1159,6 +1159,46 @@ success: return size; } +/* + * Performs necessary checks before doing a write + * @iov: io vector request + * @nr_segs: number of segments in the iovec + * @count: number of bytes to write + * @access_flags: type of access: %VERIFY_READ 
or %VERIFY_WRITE + * + * Adjust number of segments and amount of bytes to write (nr_segs should be + * properly initialized first). Returns appropriate error code that caller + * should return or zero in case that write should be allowed. + */ +int generic_segment_checks(const struct iovec *iov, + unsigned long *nr_segs, size_t *count, + unsigned long access_flags) +{ + unsigned long seg; + size_t cnt = 0; + for (seg = 0; seg < *nr_segs; seg++) { + const struct iovec *iv = &iov[seg]; + + /* + * If any segment has a negative length, or the cumulative + * length ever wraps negative then return -EINVAL. + */ + cnt += iv->iov_len; + if (unlikely((ssize_t)(cnt|iv->iov_len) < 0)) + return -EINVAL; + if (access_ok(access_flags, iv->iov_base, iv->iov_len)) + continue; + if (seg == 0) + return -EFAULT; + *nr_segs = seg; + cnt -= iv->iov_len; /* This segment is no good */ + break; + } + *count = cnt; + return 0; +} +EXPORT_SYMBOL(generic_segment_checks); + /** * generic_file_aio_read - generic filesystem read routine * @iocb: kernel I/O control block @@ -1180,24 +1220,9 @@ generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov, loff_t *ppos = &iocb->ki_pos; count = 0; - for (seg = 0; seg < nr_segs; seg++) { - const struct iovec *iv = &iov[seg]; - - /* - * If any segment has a negative length, or the cumulative - * length ever wraps negative then return -EINVAL. 
- */ - count += iv->iov_len; - if (unlikely((ssize_t)(count|iv->iov_len) < 0)) - return -EINVAL; - if (access_ok(VERIFY_WRITE, iv->iov_base, iv->iov_len)) - continue; - if (seg == 0) - return -EFAULT; - nr_segs = seg; - count -= iv->iov_len; /* This segment is no good */ - break; - } + retval = generic_segment_checks(iov, &nr_segs, &count, VERIFY_WRITE); + if (retval) + return retval; /* coalesce the iovecs and go direct-to-BIO for O_DIRECT */ if (filp->f_flags & O_DIRECT) { @@ -2094,30 +2119,14 @@ __generic_file_aio_write_nolock(struct kiocb *iocb, const struct iovec *iov, size_t ocount; /* original count */ size_t count; /* after file limit checks */ struct inode *inode = mapping->host; - unsigned long seg; loff_t pos; ssize_t written; ssize_t err; ocount = 0; - for (seg = 0; seg < nr_segs; seg++) { - const struct iovec *iv = &iov[seg]; - - /* - * If any segment has a negative length, or the cumulative - * length ever wraps negative then return -EINVAL. - */ - ocount += iv->iov_len; - if (unlikely((ssize_t)(ocount|iv->iov_len) < 0)) - return -EINVAL; - if (access_ok(VERIFY_READ, iv->iov_base, iv->iov_len)) - continue; - if (seg == 0) - return -EFAULT; - nr_segs = seg; - ocount -= iv->iov_len; /* This segment is no good */ - break; - } + err = generic_segment_checks(iov, &nr_segs, &ocount, VERIFY_READ); + if (err) + return err; count = ocount; pos = *ppos; -- 1.5.0.1 From owner-xfs@oss.sgi.com Mon Mar 12 16:04:23 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 12 Mar 2007 16:04:28 -0700 (PDT) X-Spam-oss-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_20, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2CN4L6p022578 for ; Mon, 12 Mar 2007 16:04:23 -0700 Received: by lucidpixels.com (Postfix, from userid 1001) id 114041A0001D2; Mon, 12 Mar 2007 19:04:21 -0400 (EDT) Received: from localhost (localhost [127.0.0.1]) by 
lucidpixels.com (Postfix) with ESMTP id 0D458A046498; Mon, 12 Mar 2007 19:04:21 -0400 (EDT) Date: Mon, 12 Mar 2007 19:04:21 -0400 (EDT) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: "Meng, Nick" cc: xfs@oss.sgi.com Subject: Re: one question about XFS file system In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-archive-position: 10801 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs Content-Length: 927 Lines: 64 On Mon, 12 Mar 2007, Meng, Nick wrote: > Hi XFSers, > > > > I am planning to build a XFS file system in an EM64T/RHEL4U2 > system. Here is my question: > > > > Can I build a XFS file system with 2 x U320SCSI controllers and 8 x > R15K high speed Disks in my EM64T/RHEL4U2? > > > > =================================== > > SEAGATE ST373454LC 0005 > > SEAGATE ST373454LC 0005 > > SEAGATE ST373454LC 0005 > > SEAGATE ST373454LC 0005 > > SEAGATE ST373454LC 0005 > > SEAGATE ST373454LC 0005 > > SEAGATE ST373454LC 0005 > > SEAGATE ST373454LC 0005 > > =================================== > > > > Any input will be appreciated. > > > > Best Regards, > > > > Nick Meng > > > > > > > > [[HTML alternate version deleted]] > > I don't see why not? Justin. 
From owner-xfs@oss.sgi.com Mon Mar 12 17:09:02 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 12 Mar 2007 17:09:06 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00, J_CHICKENPOX_43 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2D08x6p007327 for ; Mon, 12 Mar 2007 17:09:01 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA00584; Tue, 13 Mar 2007 11:08:47 +1100 Date: Tue, 13 Mar 2007 11:08:29 +1100 From: Timothy Shimmin To: Christoph Hellwig , xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org cc: linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr Message-ID: <73E41C01F3C8F79AD31CEDAF@timothy-shimmins-power-mac-g5.local> In-Reply-To: <20070309115511.GA20426@lst.de> References: <20070307101324.GC30587@lst.de> <20070309115511.GA20426@lst.de> X-Mailer: Mulberry/4.0.8 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10802 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1489 Lines: 55 Hi, --On 9 March 2007 12:55:11 PM +0100 Christoph Hellwig wrote: > Ed Cashin found a bug in the error handling code for the case where > a page allocation fails. Here's the updated version: > > Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c > =================================================================== > --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-03-08 19:08:38.000000000 +0100 > +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c 2007-03-09 08:59:15.000000000 +0100 .... 
> + for (i = 0; i < page_count; i++) { > + bp->b_pages[i] = alloc_page(GFP_KERNEL); > + if (!bp->b_pages[i]) > + goto fail_free_mem; > + } > + bp->b_flags |= _XBF_PAGES; > + > + error = _xfs_buf_map_pages(bp, XBF_MAPPED); > + if (unlikely(error)) { > + printk(KERN_WARNING "%s: failed to map pages\n", > + __FUNCTION__); > goto fail_free_mem; > - bp->b_flags |= _XBF_KMEM_ALLOC; > + } > > xfs_buf_unlock(bp); > > XB_TRACE(bp, "no_daddr", data); > return bp; > + > fail_free_mem: > - kmem_free(data, malloc_len); > + for ( ; i >= 0; i--) > + __free_page(bp->b_pages[i]); > fail_free_buf: > xfs_buf_free(bp); > fail: It looks like you might need: for (i--; i >= 0; i--) (or: for (j = 0; j < i; j++) etc.) Because if the initial alloc_page loop goes to completion then: i == pagecount and if alloc_page loop terminates early then bp->b_pages[i] == NULL So we have gone 1 too far in both cases and need to start free'ing back one. Unless I missed something. --Tim From owner-xfs@oss.sgi.com Mon Mar 12 18:51:25 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 12 Mar 2007 18:51:32 -0700 (PDT) X-Spam-oss-Status: No, score=0.4 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_61,J_CHICKENPOX_71,J_CHICKENPOX_81,MIME_QP_LONG_LINE autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2D1pL6p030724 for ; Mon, 12 Mar 2007 18:51:23 -0700 Received: from pcbnaujok (pc-bnaujok.melbourne.sgi.com [134.14.55.58]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA03235; Tue, 13 Mar 2007 12:51:09 +1100 Message-Id: <200703130151.MAA03235@larry.melbourne.sgi.com> From: "Barry Naujok" To: "'Christoph Hellwig'" Cc: , Subject: RE: [PATCH] New xfs_repair handling for inode nlink counts Date: Tue, 13 Mar 2007 12:51:35 +1100 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0193_01C7656E.58D2DAD0" X-Mailer: Microsoft Office 
Outlook, Build 11.0.6353 Thread-Index: AcdiIEgDVZMqaloYQXmtccScAQglyAC8a3AA X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028 In-Reply-To: <20070309073410.GA8798@infradead.org> X-archive-position: 10803 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 13168 Lines: 434 This is a multi-part message in MIME format. ------=_NextPart_000_0193_01C7656E.58D2DAD0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Hi Christoph, Thanks for the feedback. I've attached an update to the trackmem stuff for review. globals.c is now unmodified. Regards, Barry. > -----Original Message----- > From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] > On Behalf Of Christoph Hellwig > Sent: Friday, 9 March 2007 6:34 PM > To: Barry Naujok > Cc: xfs@oss.sgi.com; xfs-dev@sgi.com > Subject: Re: [PATCH] New xfs_repair handling for inode nlink counts > > > +#ifdef TRACK_MEMORY > + > +#undef calloc > +#undef malloc > +#undef memalign > +#undef realloc > +#undef free > > > Can you put all these into a memory_tracking.h file that > gets included with: > > #ifdef TRACK_MEMORY > #include "track_memory.h" > #endif > > Instead of polluting the implementation file? > > + /* add pointer to hash list, very basic simple hash function */ > + i = (((size_t)p) >> 8) & 0xff; > > + i = (((size_t)ptr) >> 8) & 0xff; > > Note that there is no guarantee that size_t and > pointers have the > same length, and there are systems where they do not > (win64?), better > to cast using uintptr_t here. 
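On the review point above about casting pointers for hashing: the following is only a minimal sketch, assuming the same 256-bucket table and shift-by-8 hash as the quoted patch. The names ptr_bucket and PTR_HASH_BUCKETS are hypothetical and do not appear in xfsprogs; the only point illustrated is that uintptr_t, unlike int or size_t, is defined to hold a converted object pointer.

```c
#include <stdint.h>

/* Assumed table size, matching the 0xff mask used in the quoted patch. */
#define PTR_HASH_BUCKETS 256

/*
 * Hypothetical helper: map a pointer to a bucket index.
 * The pointer is converted through uintptr_t, so no address bits are
 * dropped before the shift, whatever the width of int or size_t.
 */
static unsigned int ptr_bucket(const void *p)
{
	return (unsigned int)(((uintptr_t)p >> 8) & (PTR_HASH_BUCKETS - 1));
}
```

Using one helper for insertion, lookup and removal would also keep the three call sites from drifting apart, which is how a stray (int) cast can end up in one of them.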
> > --- xfsprogs.orig/repair/globals.h > +++ xfsprogs/repair/globals.h > @@ -16,6 +16,16 @@ > * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA > */ > > +#ifdef TRACK_MEMORY > + > +#define calloc(n,s) track_calloc(__FILE__, __LINE__, (n), (s)) > +#define malloc(s) track_malloc(__FILE__, __LINE__, (s)) > +#define memalign(b,s) track_memalign(__FILE__, > __LINE__, (b), (s)) > +#define realloc(p,s) track_realloc(__FILE__, __LINE__, (p), (s)) > +#define free(p) track_free(__FILE__, __LINE__, (p)) > + > +#endif > + > #ifndef _XFS_REPAIR_GLOBAL_H > #define _XFS_REPAIR_GLOBAL_H > > The memory tracking should probably come after the inclusion > guards. > > ------=_NextPart_000_0193_01C7656E.58D2DAD0 Content-Type: application/octet-stream; name="trackmem_update.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="trackmem_update.patch" =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D xfsprogs/repair/Makefile =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- a/xfsprogs/repair/Makefile 2007-03-13 12:48:40.000000000 +1100 +++ b/xfsprogs/repair/Makefile 2007-03-13 12:02:05.061400111 +1100 @@ -9,13 +9,14 @@ LTCOMMAND =3D xfs_repair =20 HFILES =3D agheader.h attr_repair.h avl.h avl64.h bmap.h dinode.h dir.h \ dir2.h dir_stack.h err_protos.h globals.h incore.h protos.h rt.h \ - progress.h scan.h versions.h prefetch.h threads.h + progress.h scan.h versions.h prefetch.h threads.h trackmem.h =20 CFILES =3D agheader.c attr_repair.c avl.c avl64.c bmap.c dino_chunks.c \ dinode.c dir.c dir2.c dir_stack.c globals.c incore.c \ incore_bmc.c init.c incore_ext.c incore_ino.c 
phase1.c \ phase2.c phase3.c phase4.c phase5.c phase6.c phase7.c rt.c sb.c \ - progress.c prefetch.c scan.c versions.c xfs_repair.c threads.c + progress.c prefetch.c scan.c versions.c xfs_repair.c threads.c \ + trackmem.c =20 LLDLIBS =3D $(LIBXFS) $(LIBXLOG) $(LIBUUID) $(LIBPTHREAD) $(LIBRT) LTDEPENDENCIES =3D $(LIBXFS) $(LIBXLOG) =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D xfsprogs/repair/globals.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- a/xfsprogs/repair/globals.h 2007-03-13 12:48:40.000000000 +1100 +++ b/xfsprogs/repair/globals.h 2007-03-13 12:39:46.487391946 +1100 @@ -23,6 +23,10 @@ #define EXTERN extern #endif =20 +#ifdef TRACK_MEMORY +#include "trackmem.h" +#endif + /* useful macros */ =20 #define rounddown(x, y) (((x)/(y))*(y)) =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D xfsprogs/repair/trackmem.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- a/xfsprogs/repair/trackmem.c 2006-06-17 00:58:24.000000000 +1000 +++ b/xfsprogs/repair/trackmem.c 2007-03-13 12:48:25.331111073 +1100 @@ -0,0 +1,195 @@ +/* + * Copyright (c) 2007 Silicon Graphics, Inc. + * All Rights Reserved. 
+ * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#include + +#undef calloc +#undef malloc +#undef memalign +#undef realloc +#undef free + +/* + * Track by file name pointer and also by return pointer + */ + +typedef struct func { + const char *file; + int line; + int64_t acount; + int64_t fcount; + int64_t rcount; + int64_t current; + int64_t peak; +} func_t; + +typedef struct entry { + struct entry *next; + func_t *fileline; + size_t size; + void *ptr; +} entry_t; + +static int caller_count =3D 0; +static int caller_size =3D 0; +static func_t *callers =3D NULL; + +static entry_t *ptrhash[256]; + +static +void track_alloc(const char *file, int line, size_t size, void *p) +{ + int i; + entry_t *e; + + /* find an existing func call from file/line */ + for (i =3D 0; i < caller_count; i++) { + if ((callers[i].file =3D=3D file) && (callers[i].line =3D=3D line)) + break; + } + if (i =3D=3D caller_count) { /* add new func if not found */ + if (caller_count =3D=3D caller_size) { + caller_size +=3D 64; + callers =3D realloc(callers, sizeof(func_t) * caller_size); + } + memset(&callers[i], 0, sizeof(func_t)); + callers[i].file =3D file; + callers[i].line =3D line; + caller_count++; + } + + e =3D malloc(sizeof(entry_t)); + e->size =3D size; + e->ptr =3D p; + e->fileline =3D &callers[i]; + + callers[i].acount++; + callers[i].current +=3D size; + if (callers[i].current > 
callers[i].peak) + callers[i].peak =3D callers[i].current; + + /* add pointer to hash list, very basic simple hash function */ + i =3D (((int)p) >> 8) & 0xff; + + e->next =3D ptrhash[i]; + ptrhash[i] =3D e; +} + +void *track_calloc(const char *file, int line, size_t num, size_t size) +{ + void *retval =3D calloc(num, size); + + if (retval !=3D NULL) + track_alloc(file, line, num * size, retval); + + return retval; +} + +void *track_malloc(const char *file, int line, size_t size) +{ + void *retval =3D malloc(size); + + if (retval !=3D NULL) + track_alloc(file, line, size, retval); + + return retval; +} + +void *track_memalign(const char *file, int line, size_t boundary, size_t s= ize) +{ + void *retval =3D memalign(boundary, size); + + if (retval !=3D NULL) + track_alloc(file, line, size, retval); + + return retval; +} + +void *track_realloc(const char *file, int line, void *ptr, size_t size) +{ + int i; + entry_t *e, *prev; + void *newptr =3D realloc(ptr, size); + + if (ptr =3D=3D NULL && newptr !=3D NULL) { + track_alloc(file, line, size, newptr); + return newptr; + } + + i =3D (((size_t)ptr) >> 8) & 0xff; + + prev =3D NULL; + for (e =3D ptrhash[i]; e; e =3D e->next) { + if (e->ptr =3D=3D ptr) + break; + prev =3D e; + } + if (!e) + return newptr; + + e->fileline->rcount++; + e->fileline->current =3D e->fileline->current + size - e->size; + if (e->fileline->current > e->fileline->peak) + e->fileline->peak =3D e->fileline->current; + e->size =3D size; + e->ptr =3D newptr; + + return newptr; +} + +void track_free(const char *file, int line, void *ptr) +{ + int i; + entry_t *e, *prev; + + free(ptr); + + /* find associated entry */ + i =3D (((size_t)ptr) >> 8) & 0xff; + + prev =3D NULL; + for (e =3D ptrhash[i]; e; e =3D e->next) { + if (e->ptr =3D=3D ptr) + break; + prev =3D e; + } + if (!e) + return; + + e->fileline->fcount++; + e->fileline->current -=3D e->size; + + if (prev) + prev->next =3D e->next; + else + ptrhash[i] =3D e->next; + free(e); +} + +void 
print_memory_usage(void) +{ + int i; + + printf("%20s:line \ta_cnt\tf_cnt\tr_cnt\tremain\tpeak\n", "file"); + for (i =3D 0; i < caller_count; i++) { + printf("%20s:%-5d\t%lld\t%lld\t%lld\t%lld\t%lld\n", + callers[i].file, callers[i].line, + callers[i].acount, callers[i].fcount, callers[i].rcount, + callers[i].current, callers[i].peak); + } +} =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D xfsprogs/repair/trackmem.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- a/xfsprogs/repair/trackmem.h 2006-06-17 00:58:24.000000000 +1000 +++ b/xfsprogs/repair/trackmem.h 2007-03-13 12:47:31.226230019 +1100 @@ -0,0 +1,35 @@ +/* + * Copyright (c) 2007 Silicon Graphics, Inc. + * All Rights Reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. 
+ * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef _XFS_REPAIR_TRACK_MEM_H +#define _XFS_REPAIR_TRACK_MEM_H + +#define calloc(n,s) track_calloc(__FILE__, __LINE__, (n), (s)) +#define malloc(s) track_malloc(__FILE__, __LINE__, (s)) +#define memalign(b,s) track_memalign(__FILE__, __LINE__, (b), (s)) +#define realloc(p,s) track_realloc(__FILE__, __LINE__, (p), (s)) +#define free(p) track_free(__FILE__, __LINE__, (p)) + +void print_memory_usage(void); +void *track_calloc(const char *file, int line, size_t num, size_t size); +void *track_malloc(const char *file, int line, size_t size); +void *track_memalign(const char *file, int line, size_t boundary, size_t s= ize); +void *track_realloc(const char *file, int line, void *ptr, size_t size); +void track_free(const char *file, int line, void *ptr); + +#endif /* _XFS_REPAIR_TRACK_MEM_H */ =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D xfsprogs/repair/xfs_repair.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D --- a/xfsprogs/repair/xfs_repair.c 2007-03-13 12:48:40.000000000 +1100 +++ b/xfsprogs/repair/xfs_repair.c 2007-03-13 12:02:45.340085859 +1100 @@ -563,7 +563,7 @@ main(int argc, char **argv) =20 /* XXX: nathans - something in phase4 ain't playing by */ /* the buffer cache rules.. why doesn't IRIX hit this? 
*/ - libxfs_bcache_purge(); + libxfs_bcache_flush(); =20 if (no_modify) printf(_("No modify flag set, skipping phase 5\n")); @@ -576,6 +576,8 @@ main(int argc, char **argv) phase6(mp); timestamp(PHASE_END, 6, NULL); =20 + libxfs_bcache_flush(); + phase7(mp); timestamp(PHASE_END, 7, NULL); } else { @@ -640,6 +642,10 @@ _("Warning: project quota information w if (do_parallel && report_interval) stop_progress_rpt(); =20 +#ifdef TRACK_MEMORY + print_memory_usage(); +#endif + if (no_modify) { do_log( _("No modify flag set, skipping filesystem flush and exiting.\n")); ------=_NextPart_000_0193_01C7656E.58D2DAD0-- From owner-xfs@oss.sgi.com Tue Mar 13 08:13:26 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 13 Mar 2007 08:13:32 -0700 (PDT) X-Spam-oss-Status: No, score=3.5 required=5.0 tests=BAYES_99 autolearn=no version=3.2.0-pre1-r499012 Received: from amsfep18-int.chello.nl (amsfep17-int.chello.nl [213.46.243.15] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2DFDN6p017617 for ; Tue, 13 Mar 2007 08:13:26 -0700 Received: from cable-213-132-129-14.upc.chello.be ([213.132.129.14]) by amsfep11-int.chello.nl (InterMail vM.6.01.04.04 201-2131-118-104-20050224) with ESMTP id <20070313134057.RBZM2958.amsfep11-int.chello.nl@cable-213-132-129-14.upc.chello.be> for ; Tue, 13 Mar 2007 14:40:57 +0100 From: clflush To: xfs@oss.sgi.com Subject: Questions about XFS Date: Tue, 13 Mar 2007 14:40:56 +0100 User-Agent: KMail/1.9.6 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200703131440.56678.clflush@chello.be> X-archive-position: 10805 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: clflush@chello.be Precedence: bulk X-list: xfs Content-Length: 1828 Lines: 36 Hi, I have a few simple questions regarding the XFS file system. 
I built a new small server here (commodity hardware, x86-64) and I've installed 32-bit openSUSE 10.2 on it. After the system was installed, configured and up and running, it hung while I was browsing with Firefox. The only thing I could do was to press the reset button on the computer. After the reboot, when I opened Firefox again, I noticed that all my bookmarks were gone. Those bookmarks were imported from my desktop machine a few days after I configured the new server. All file systems on this new server are XFS because I heard good things about it and it generally performs better in database operations compared to other file systems available for Linux. However, I was pretty surprised that when I had to reset the machine because it hung for some reason, all the bookmarks in Firefox were gone, so now I have my doubts about the reliability and data integrity of XFS. My older server, which also runs openSUSE 10.2 (32-bit) but uses Ext3 as file system never had such issues and I had to reset it many times because it was hanging for some reason. Am I right to assume that XFS compared to Ext3 does not do a very good job regarding data integrity? I know a little bit about file systems and I know that most file systems depend on the application to do the right job regarding the way it opens/locks/saves files, but in reality not all applications are written in a safe way to guarantee this. Basically, my two questions are: - Why did I lose bookmarks on a machine running XFS while on another one which runs the same OS version but uses Ext3 as file system, it never happened, no matter how many times I had to reset it. - Are there any efforts currently made to increase the data integrity of XFS? 
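On the point above that file systems depend on the application to save files safely: the following is a minimal sketch in C of one common way an application can do that on a POSIX system, namely writing a temporary file, calling fsync() on it, and then rename()-ing it over the old file. The function name save_file_durably and the ".tmp" suffix are assumptions made for illustration; nothing here is taken from Firefox or from XFS.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace `path` with `len` bytes from `buf` so that
 * a crash leaves either the old or the new contents, not an empty file.
 * Returns 0 on success, -1 on failure.
 */
static int save_file_durably(const char *path, const void *buf, size_t len)
{
	char tmp[4096];
	const char *p = buf;
	int n, fd;

	n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t written = write(fd, p, len);
		if (written <= 0)
			goto fail;
		p += written;
		len -= (size_t)written;
	}

	/* Push the data to stable storage before the new name becomes visible. */
	if (fsync(fd) != 0)
		goto fail;
	if (close(fd) != 0) {
		unlink(tmp);
		return -1;
	}

	/* rename() atomically replaces the old file with the complete new one. */
	if (rename(tmp, path) != 0) {
		unlink(tmp);
		return -1;
	}
	return 0;

fail:
	close(fd);
	unlink(tmp);
	return -1;
}
```

A stricter version would also fsync() the containing directory after the rename so that the new directory entry itself is on disk; that step is left out here to keep the sketch short.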
Regards From owner-xfs@oss.sgi.com Tue Mar 13 08:49:19 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 13 Mar 2007 08:49:25 -0700 (PDT) X-Spam-oss-Status: No, score=1.5 required=5.0 tests=AWL,BAYES_80,SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012 Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.179]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2DFnH6p029616 for ; Tue, 13 Mar 2007 08:49:18 -0700 Received: from [85.115.16.62] (helo=[172.25.16.7]) by mrelayeu.kundenserver.de (node=mrelayeu1) with ESMTP (Nemesis), id 0MKwpI-1HR9352NiA-00060Z; Tue, 13 Mar 2007 16:36:28 +0100 Message-ID: <45F6C503.5010608@gmx.net> Date: Tue, 13 Mar 2007 16:36:35 +0100 From: Klaus Strebel User-Agent: Thunderbird 2.0b2 (Windows/20070116) MIME-Version: 1.0 To: clflush CC: xfs@oss.sgi.com Subject: Re: Questions about XFS References: <200703131440.56678.clflush@chello.be> In-Reply-To: <200703131440.56678.clflush@chello.be> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Provags-ID: V01U2FsdGVkX1/Oqo/5lm6/Q64Ed8XY9/xf1e1yxWc2+Jxn5TH Ns6PGbP8lZ/Pki5hqHfknjouUJouDRgySozyn/YWgUHYcsBVbr lmPfib1BpovmEd17QldXg== X-archive-position: 10806 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: klaus.strebel@gmx.net Precedence: bulk X-list: xfs Content-Length: 2488 Lines: 62 clflush schrieb: > Hi, > > I have a few simple questions regarding the XFS file system. I built a new > small server here (commodity hardware, x86-64) and I've installed 32-bit > openSUSE 10.2 on it. After the system was installed, configured and up and > running, it hung while I was browsing with Firefox. The only thing I could do > was to press the reset button on the computer. After the reboot, when I > opened Firefox again, I noticed that all my bookmarks were gone. 
Those > bookmarks were imported from my desktop machine a few days after I configured > the new server. > > All file systems on this new server are XFS because I heard good things about > it and it generally performs better in database operations compared to other > file systems available for Linux. However, I was pretty surprised that when I > had to reset the machine because it hung for some reason, all the bookmarks > in Firefox were gone, so now I have my doubts about the reliability and data > integrity of XFS. My older server, which also runs openSUSE 10.2 (32-bit) but > uses Ext3 as file system never had such issues and I had to reset it many > times because it was hanging for some reason. > > Am I right to assume that XFS compared to Ext3 does not do a very good job > regarding data integrity? I know a little bit about file systems and I know > that most file systems depend on the application to do the right job > regarding the way it opens/locks/saves files, but in reality not all > applications are written in a safe way to guarantee this. > > Basically, my two questions are: > > - Why did I lose bookmarks on a machine running XFS while on another one which > runs the same OS version but uses Ext3 as file system, it never happened, no > matter how many times I had to reset it. > > - Are there any efforts currently made to increase the data integrity of XFS? > > Regards > > Hi, short and rude answer: 'Search the archives and FAQs'. Simply short answer: no and no. Longer answer: XFS only cares about metadata integrity; if unwritten extents exist in memory, you'll get them empty on the disk if you reset your box. You should consider using the 'Magic SysRq' hotkeys to emergency-sync your disks in cases like these before you reset your box. Ciao Klaus -- Mit freundlichen Grüssen / best regards Klaus Strebel, Dipl.-Inform. 
(FH), mailto:klaus.strebel@gmx.net /"\ \ / ASCII RIBBON CAMPAIGN X AGAINST HTML MAIL / \ From owner-xfs@oss.sgi.com Tue Mar 13 08:55:32 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 13 Mar 2007 08:55:38 -0700 (PDT) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_50, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2DFtV6p031134 for ; Tue, 13 Mar 2007 08:55:32 -0700 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id CA545180173DC; Tue, 13 Mar 2007 10:55:25 -0500 (CDT) Message-ID: <45F6C972.1080508@sandeen.net> Date: Tue, 13 Mar 2007 10:55:30 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: clflush CC: xfs@oss.sgi.com Subject: Re: Questions about XFS References: <200703131440.56678.clflush@chello.be> In-Reply-To: <200703131440.56678.clflush@chello.be> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10807 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 2511 Lines: 51 clflush wrote: > Hi, > > I have a few simple questions regarding the XFS file system. I built a new > small server here (commodity hardware, x86-64) and I've installed 32-bit > openSUSE 10.2 on it. After the system was installed, configured and up and > running, it hung while I was browsing with Firefox. The only thing I could do > was to press the reset button on the computer. After the reboot, when I > opened Firefox again, I noticed that all my bookmarks were gone. Those > bookmarks were imported from my desktop machine a few days after I configured > the new server. 
> > All file systems on this new server are XFS because I heard good things about > it and it generally performs better in database operations compared to other > file systems available for Linux. However, I was pretty surprised that when I > had to reset the machine because it hung for some reason, all the bookmarks > in Firefox were gone, so now I have my doubts about the reliability and data > integrity of XFS. My older server, which also runs openSUSE 10.2 (32-bit) but > uses Ext3 as file system never had such issues and I had to reset it many > times because it was hanging for some reason. sounds like you have several reliability problems ;-) > Am I right to assume that XFS compared to Ext3 does not do a very good job > regarding data integrity? I know a little bit about file systems and I know > that most file systems depend on the application to do the right job > regarding the way it opens/locks/saves files, but in reality not all > applications are written in a safe way to guarantee this. > > Basically, my two question that I have are: > > - Why did I lost bookmarks on a machine running XFS while on another one which > runs the same OS version but uses Ext3 as file system, it never happened, no > matter how many times I had to reset it. see also http://oss.sgi.com/projects/xfs/faq.html#nulls > - Are there any efforts currently made to increase the data integrity of XFS? this is essentially a loss of buffered data in the VM, outside the realm of what xfs can realistically protect. With ext3, you probably were losing your "latest" bookmarks as well, but were luckily(?) getting back whatever used to be on-disk. On the other hand, there were some changes made to xfs to explicitly sync files on close, if they have been truncated, which should help this sort of problem. Depending on what's in OpenSuSE 10.2, that change may or may not be in your code... 
-Eric From owner-xfs@oss.sgi.com Tue Mar 13 09:53:18 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 13 Mar 2007 09:53:25 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from osl1smout1.broadpark.no (osl1smout1.broadpark.no [80.202.4.58]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2DGrF6p010511 for ; Tue, 13 Mar 2007 09:53:18 -0700 Received: from osl1sminn1.broadpark.no ([80.202.4.59]) by osl1smout1.broadpark.no (Sun Java System Messaging Server 6.1 HotFix 0.05 (built Oct 21 2004)) with ESMTP id <0JEU005X3MSP1CA0@osl1smout1.broadpark.no> for xfs@oss.sgi.com; Tue, 13 Mar 2007 16:53:13 +0100 (CET) Received: from [10.0.0.3] ([80.202.169.161]) by osl1sminn1.broadpark.no (Sun Java System Messaging Server 6.1 HotFix 0.05 (built Oct 21 2004)) with ESMTP id <0JEU003VKMSO7TH2@osl1sminn1.broadpark.no> for xfs@oss.sgi.com; Tue, 13 Mar 2007 16:53:13 +0100 (CET) Date: Tue, 13 Mar 2007 16:53:12 +0100 From: "Stein M. Hugubakken" Subject: Re: Questions about XFS In-reply-to: <200703131440.56678.clflush@chello.be> To: xfs@oss.sgi.com Message-id: <45F6C8E8.5070208@start.no> MIME-version: 1.0 Content-type: text/plain; charset=ISO-8859-1; format=flowed Content-transfer-encoding: 7BIT References: <200703131440.56678.clflush@chello.be> User-Agent: Thunderbird 1.5.0.10 (X11/20070303) X-archive-position: 10808 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dulci@start.no Precedence: bulk X-list: xfs Content-Length: 558 Lines: 19 clflush wrote: > Basically, my two question that I have are: > > - Why did I lost bookmarks on a machine running XFS while on another one which > runs the same OS version but uses Ext3 as file system, it never happened, no > matter how many times I had to reset it. > > - Are there any efforts currently made to increase the data integrity of XFS? 
> Take a look at the FAQ: http://oss.sgi.com/projects/xfs/faq.html#wcache Regarding the lost bookmarks, you might find an old backup in ~/.mozilla/firefox//bookmarkbackups. Regards Stein From owner-xfs@oss.sgi.com Wed Mar 14 09:23:12 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 14 Mar 2007 09:23:18 -0700 (PDT) X-Spam-oss-Status: No, score=2.1 required=5.0 tests=BAYES_80,J_CHICKENPOX_43, SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012 Received: from mx2.netapp.com (mx2.netapp.com [216.240.18.37]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2EGNB6p016859 for ; Wed, 14 Mar 2007 09:23:11 -0700 Received: from smtp1.corp.netapp.com ([10.57.156.124]) by mx2.netapp.com with ESMTP; 14 Mar 2007 09:12:47 -0700 X-IronPort-AV: i="4.14,285,1170662400"; d="scan'208"; a="41204839:sNHT21946064" Received: from svlexc03.hq.netapp.com (svlexc03.corp.netapp.com [10.57.156.149]) by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id l2EGCk7l007746 for ; Wed, 14 Mar 2007 09:12:46 -0700 (PDT) Received: from exsvlrb02.hq.netapp.com ([10.56.8.63]) by svlexc03.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); Wed, 14 Mar 2007 09:13:54 -0700 Received: from exnane01.hq.netapp.com ([10.97.0.61]) by exsvlrb02.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 14 Mar 2007 09:13:59 -0700 Received: from tmt.netapp.com ([10.30.32.62]) by exnane01.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); Wed, 14 Mar 2007 12:13:58 -0400 X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Wed, 14 Mar 2007 12:11:56 -0400 To: xfs@oss.sgi.com From: "Talpey, Thomas" Subject: Strange XFS issue on tiny-NAS ARM NFS server Cc: "Talpey, Thomas" Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Message-ID: X-OriginalArrivalTime: 14 Mar 2007 16:13:58.0282 (UTC) FILETIME=[C55C2EA0:01C76653] X-archive-position: 10810 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Thomas.Talpey@netapp.com 
Precedence: bulk X-list: xfs Content-Length: 3506 Lines: 103 This might be pilot error, but a *very* strange thing happens with an XFS filesystem on an NFS server I'm experimenting with. This is an NSLU2 ARM-based machine, running 2.6.20.1 and an XFS filesystem freshly built on a usb-attached 2.5" drive. Running Connectathon 04 basic tests against the server, things are fine with an EXT-formatted filesystem. However, reformatting the export as a default XFS filesystem (mkfs.xfs -f /dev/sda3), the following occurs: >[tmt@tmt2 cthon04]$ ./server -b -p /mnt/export -m /mnt 192.168.1.77 >Start tests on path /mnt/tmt2.test [y/n]? y > >sh ./runtests -b -t /mnt/tmt2.test > >Starting BASIC tests: test directory /mnt/tmt2.test (arg: -t) > >./test1: File and directory creation test > created 155 files 62 directories 5 levels deep in 0.78 seconds > ./test1 ok. > >./test2: File and directory removal test > removed 155 files 62 directories 5 levels deep in 0.60 seconds > ./test2 ok. > >./test3: lookups across mount point > 500 getcwd and stat calls in 0.0 seconds > ./test3 ok. > >./test4: setattr, getattr, and lookup > 1000 chmods and stats on 10 files in 0.80 seconds > ./test4 ok. > >./test5: read and write >rm: cannot remove `/mnt/tmt2.test/file.7': No such file or directory >rm: cannot remove `/mnt/tmt2.test/file.8': No such file or directory >rm: cannot remove `/mnt/tmt2.test/file.9': No such file or directory > ./test5: (/home/tmt/nfs/cthon04/basic) can't remove old test directory /mnt/tmt2.test >basic tests failed >Tests failed, leaving /mnt mounted >[tmt@tmt2 cthon04]$ ls -lsa /mnt/tmt2.test >total 0 >0 drwxrwxrwx 2 tmt tmt 17 Mar 14 11:23 >0 drwxrwxrwx 2 tmt tmt 17 Mar 14 11:23 >0 drwxrwxrwx 2 tmt tmt 17 Mar 14 11:23 . >0 drwxrwxrwx 3 root root 32 Mar 14 11:23 .. >[tmt@tmt2 cthon04]$ Those first two entries are entirely null - piping the output to "od" shows no filename at all. The same result is seen if listed from a login shell on the server. 
The problem stems from connectathon test4, which is attempting to create 10 files and chmod them repeatedly. If the test is run with a file count less than 8, it works fine. If >= 8, then upon removing the 6th file (file.5), the remaining files vanish, and the blank entry appears. There are no complaints in dmesg (see below fyi). The behavior is the same across all client mount options and server export options, only going away by reformatting the export to ext. Before giving the problem report a full work-up, I'm wondering if I'm missing the obvious, or if it's a known issue. A look around with google (etc) didn't turn up anything. Thanks for any info. (please include me in the reply, I'm not on the list). Tom. (dmesg) >... ><6>SGI XFS with no debug enabled ><5>XFS mounting filesystem sda3 ><7>Ending clean XFS mount for filesystem: sda3 ><6>Installing knfsd (copyright (C) 1996 okir@monad.swb.de). ><4>NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory ><4>NFSD: starting 90-second grace period >root@LKG881990:/$ >root@LKG881990:/$ cat /proc/fs/xfs/stat >extent_alloc 19 73 18 69 >abt 119 87 43 41 >blk_map 8 1 1 1 1 11 0 >bmbt 0 0 0 0 >dir 233 232 230 10 >trans 0 1726 0 >ig 2659 2426 0 233 0 227 1002 >log 67 2010 0 11 1 >push_ail 1729 0 0 0 0 0 0 0 0 0 >xstrat 0 0 >rw 0 0 >attr 0 0 0 0 >icluster 5 2 19 >vnodes 6 233 0 1382 227 227 227 0 >buf 1940 239 1709 0 0 0 0 231 64 >xpc 0 0 0 >debug 0 >root@LKG881990:/$ mkfs.xfs -v >mkfs.xfs version 2.8.16 >root@LKG881990:/$ From owner-xfs@oss.sgi.com Wed Mar 14 09:49:13 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 14 Mar 2007 09:49:19 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, MIME_QP_LONG_LINE,SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from ipmail03.adl2.internode.on.net (ipmail03.adl2.internode.on.net [203.16.214.135]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2EGnA6p021728 for ; Wed, 14 Mar 2007 09:49:13 -0700 
Received: from ppp163-199.static.internode.on.net (HELO saturn.flamingspork.com) ([150.101.163.199]) by ipmail03.adl2.internode.on.net with ESMTP; 15 Mar 2007 03:03:42 +1030 X-IronPort-AV: i="4.14,285,1170595800"; d="asc'?scan'208"; a="62478814:sNHT197025794" Received: from localhost.localdomain (saturn.flamingspork.com [127.0.0.1]) by saturn.flamingspork.com (Postfix) with ESMTP id 4D0A0C009B1; Thu, 15 Mar 2007 03:33:40 +1100 (EST) Received: by localhost.localdomain (Postfix, from userid 1000) id 6FC98140EA9D; Wed, 14 Mar 2007 17:33:37 +0100 (CET) Subject: Re: Questions about XFS From: Stewart Smith To: clflush Cc: xfs@oss.sgi.com In-Reply-To: <200703131440.56678.clflush@chello.be> References: <200703131440.56678.clflush@chello.be> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-PipK4s1B2QL96SKgRJBZ" Date: Wed, 14 Mar 2007 17:33:36 +0100 Message-Id: <1173890016.20671.11.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1 X-archive-position: 10811 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stewart@flamingspork.com Precedence: bulk X-list: xfs Content-Length: 1582 Lines: 50 --=-PipK4s1B2QL96SKgRJBZ Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Tue, 2007-03-13 at 14:40 +0100, clflush wrote: > I have a few simple questions regarding the XFS file system. I built a new > small server here (commodity hardware, x86-64) and I've installed 32-bit > openSUSE 10.2 on it. After the system was installed, configured and up and > running, it hung while I was browsing with Firefox. The only thing I could do > was to press the reset button on the computer. After the reboot, when I > opened Firefox again, I noticed that all my bookmarks were gone. Those > bookmarks were imported from my desktop machine a few days after I configured > the new server. 
This is a firefox bug - I've seen it before (on my mother's machine). It's due to firefox not doing the correct thing with IO on the bookmarks file. As mentioned in another mail you can restore from a backup that firefox makes. In a future release, firefox is going to be using sqlite for storing these things, which will mean that these problems go away (pretty sure sqlite does all the right things). -- Stewart Smith (stewart@flamingspork.com) http://www.flamingspork.com/ --=-PipK4s1B2QL96SKgRJBZ Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) iD8DBQBF+CPgKglWCUL+FDoRAvkMAJ0V6pPr9mHHJzt1FrNMKMJHhdEDUACeKk2C 4qkOj4rKEA9M8/Q2fZrA/6E= =MJqH -----END PGP SIGNATURE----- --=-PipK4s1B2QL96SKgRJBZ-- From owner-xfs@oss.sgi.com Wed Mar 14 13:32:59 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 14 Mar 2007 13:33:03 -0700 (PDT) X-Spam-oss-Status: No, score=-1.2 required=5.0 tests=AWL,BAYES_20, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2EKWv6p015177 for ; Wed, 14 Mar 2007 13:32:59 -0700 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 8EE8A180173DC; Wed, 14 Mar 2007 15:32:56 -0500 (CDT) Message-ID: <45F85BFA.1070505@sandeen.net> Date: Wed, 14 Mar 2007 15:32:58 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: "Talpey, Thomas" CC: xfs@oss.sgi.com Subject: Re: Strange XFS issue on tiny-NAS ARM NFS server References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10812 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com 
Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 658 Lines: 16 Talpey, Thomas wrote: > This might be pilot error, but a *very* strange thing happens with > an XFS filesystem on an NFS server I'm experimenting with. This is > an NSLU2 ARM-based machine, running 2.6.20.1 and an XFS filesystem > freshly built on a usb-attached 2.5" drive. > > Running Connectathon 04 basic tests against the server, things are > fine with an EXT-formatted filesystem. However, reformatting the > export as a default XFS filesystem (mkfs.xfs -f /dev/sda3), the > following occurs: arm compiler has bugs that miscompile xfs... I think if you google arm + xfs and maybe search the list archives, you'll find a possible workaround. -Eric From owner-xfs@oss.sgi.com Wed Mar 14 14:20:14 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 14 Mar 2007 14:20:19 -0700 (PDT) X-Spam-oss-Status: No, score=1.1 required=5.0 tests=AWL,BAYES_50,SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.netapp.com (mx2.netapp.com [216.240.18.37]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2ELK96p024067 for ; Wed, 14 Mar 2007 14:20:13 -0700 Received: from smtp2.corp.netapp.com ([10.57.159.114]) by mx2.netapp.com with ESMTP; 14 Mar 2007 14:20:09 -0700 X-IronPort-AV: i="4.14,285,1170662400"; d="scan'208"; a="41282793:sNHT15699292" Received: from svlexrs01.hq.netapp.com (svlexrs01.corp.netapp.com [10.57.156.158]) by smtp2.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id l2ELK8o8004752; Wed, 14 Mar 2007 14:20:08 -0700 (PDT) Received: from exsvlrb02.hq.netapp.com ([10.56.8.63]) by svlexrs01.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 14 Mar 2007 14:21:21 -0700 Received: from exnane01.hq.netapp.com ([10.97.0.61]) by exsvlrb02.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 14 Mar 2007 14:21:21 -0700 Received: from tmt.netapp.com ([10.30.32.62]) by exnane01.hq.netapp.com with Microsoft 
SMTPSVC(6.0.3790.0); Wed, 14 Mar 2007 17:21:19 -0400 X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Wed, 14 Mar 2007 17:19:20 -0400 To: Eric Sandeen From: "Talpey, Thomas" Subject: Re: Strange XFS issue on tiny-NAS ARM NFS server Cc: "Talpey, Thomas" , xfs@oss.sgi.com In-Reply-To: <45F85BFA.1070505@sandeen.net> References: <45F85BFA.1070505@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Message-ID: X-OriginalArrivalTime: 14 Mar 2007 21:21:19.0650 (UTC) FILETIME=[B545A020:01C7667E] X-archive-position: 10813 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Thomas.Talpey@netapp.com Precedence: bulk X-list: xfs Content-Length: 1203 Lines: 34 Wow good memory, 3 years ago: I'm compiling with gcc4.1.1 for arm5t big endian, patched for multiple arm ports (OpenEmbedded) and it still botches that code today. Works fine with the arithmetic decomposition in the message. And I never blame the compiler! ;-) Thanks! BTW, XFS gives this little machine a nice bump in NFS write bandwidth. Goes from ~6MB/sec to ~7MB/s. CPU limited, mainly. Tom. At 04:32 PM 3/14/2007, Eric Sandeen wrote: >Talpey, Thomas wrote: >> This might be pilot error, but a *very* strange thing happens with >> an XFS filesystem on an NFS server I'm experimenting with. This is >> an NSLU2 ARM-based machine, running 2.6.20.1 and an XFS filesystem >> freshly built on a usb-attached 2.5" drive. >> >> Running Connectathon 04 basic tests against the server, things are >> fine with an EXT-formatted filesystem. However, reformatting the >> export as a default XFS filesystem (mkfs.xfs -f /dev/sda3), the >> following occurs: > >arm compiler has bugs that miscompile xfs... I think if you google arm + >xfs and maybe search the list archives, you'll find a possible workaround. 
> >-Eric > > From owner-xfs@oss.sgi.com Wed Mar 14 14:21:56 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 14 Mar 2007 14:22:00 -0700 (PDT) X-Spam-oss-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2ELLt6p024680 for ; Wed, 14 Mar 2007 14:21:56 -0700 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id C7EA81807DF13; Wed, 14 Mar 2007 16:21:54 -0500 (CDT) Message-ID: <45F86777.9030909@sandeen.net> Date: Wed, 14 Mar 2007 16:21:59 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: "Talpey, Thomas" CC: xfs@oss.sgi.com Subject: Re: Strange XFS issue on tiny-NAS ARM NFS server References: <45F85BFA.1070505@sandeen.net> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10814 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 585 Lines: 20 Talpey, Thomas wrote: > Wow good memory, 3 years ago: > > > I'm compiling with gcc4.1.1 for arm5t big endian, patched for multiple > arm ports (OpenEmbedded) and it still botches that code today. > Works fine with the arithmetic decomposition in the message. > And I never blame the compiler! ;-) > > Thanks! > > BTW, XFS gives this little machine a nice bump in NFS write > bandwidth. Goes from ~6MB/sec to ~7MB/s. CPU limited, mainly. Glad it worked! So is netapp selling those now? 
;-) -Eric From owner-xfs@oss.sgi.com Wed Mar 14 14:32:23 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 14 Mar 2007 14:32:27 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_00, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.netapp.com (mx2.netapp.com [216.240.18.37]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2ELWL6p026848 for ; Wed, 14 Mar 2007 14:32:23 -0700 Received: from smtp1.corp.netapp.com ([10.57.156.124]) by mx2.netapp.com with ESMTP; 14 Mar 2007 14:32:21 -0700 X-IronPort-AV: i="4.14,285,1170662400"; d="scan'208"; a="41285173:sNHT17284148" Received: from svlexrs02.hq.netapp.com (svlexrs02.corp.netapp.com [10.57.156.154]) by smtp1.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id l2ELVkZu011120; Wed, 14 Mar 2007 14:32:17 -0700 (PDT) Received: from exsvlrb01.hq.netapp.com ([10.56.8.62]) by svlexrs02.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 14 Mar 2007 14:33:02 -0700 Received: from exnane01.hq.netapp.com ([10.97.0.61]) by exsvlrb01.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Wed, 14 Mar 2007 14:33:01 -0700 Received: from tmt.netapp.com ([10.30.32.62]) by exnane01.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); Wed, 14 Mar 2007 17:32:59 -0400 X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Wed, 14 Mar 2007 17:31:39 -0400 To: Eric Sandeen From: "Talpey, Thomas" Subject: Re: Strange XFS issue on tiny-NAS ARM NFS server Cc: "Talpey, Thomas" , xfs@oss.sgi.com In-Reply-To: <45F86777.9030909@sandeen.net> References: <45F85BFA.1070505@sandeen.net> <45F86777.9030909@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Message-ID: X-OriginalArrivalTime: 14 Mar 2007 21:32:59.0858 (UTC) FILETIME=[56A0E320:01C76680] X-archive-position: 10815 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Thomas.Talpey@netapp.com Precedence: bulk X-list: xfs Content-Length: 291 
Lines: 11 At 05:21 PM 3/14/2007, Eric Sandeen wrote: >Glad it worked! > >So is netapp selling those now? ;-) Nope, but they sure are instructive platforms for experimenting with Linux. Their memory bandwidth is so low (and they only have 32MB of it) that little things make a big difference. Tom. From owner-xfs@oss.sgi.com Wed Mar 14 21:53:03 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 14 Mar 2007 21:53:06 -0700 (PDT) X-Spam-oss-Status: No, score=2.2 required=5.0 tests=BAYES_80,J_CHICKENPOX_21, J_CHICKENPOX_72,SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012 Received: from s62.xrea.com (s62.xrea.com [221.186.251.67]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2F4r06p011797 for ; Wed, 14 Mar 2007 21:53:02 -0700 Received: (qmail 15370 invoked by uid 89); 15 Mar 2007 13:26:18 +0900 Received: from kd125053235206.ppp-bb.dion.ne.jp (HELO ?127.0.0.1?) (tai@125.53.235.206) by 192.168.1.21 with SMTP; 15 Mar 2007 13:26:18 +0900 Message-ID: <45F8CAEA.3050408@list.rakugaki.org> Date: Thu, 15 Mar 2007 13:26:18 +0900 From: Taisuke Yamada User-Agent: Thunderbird 1.5.0.10 (Windows/20070221) MIME-Version: 1.0 Newsgroups: gmane.comp.file-systems.xfs.general To: xfs@oss.sgi.com Subject: Re: Questions about XFS References: <200703131440.56678.clflush@chello.be> <1173890016.20671.11.camel@localhost.localdomain> In-Reply-To: <1173890016.20671.11.camel@localhost.localdomain> X-Enigmail-Version: 0.94.0.0 Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit X-archive-position: 10816 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tyamadajp@list.rakugaki.org Precedence: bulk X-list: xfs Content-Length: 1725 Lines: 44 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - From end-user's POV, this infamous XFS behavior is somewhat taken as XFS's inferiority compared to other filesystems. Even with "bad" applications (ex. 
firefox), this rarely happens on others, so regardless of what's on the FAQ, people logically concludes that the fault belongs to XFS anyway. So, what is the correct way to do IO? Is what firefox (and other bad apps) doing is so obvious(ly buggy), that it'll be acknowledged as a bug once reported? Or is it simply a mismatch between application expectation and XFS behavior, requiring a non-(obvious|generic) fix? Although I'm not a filesystem developer, I'm pretty impressed with XFS and willing to file a report/patch to those "buggy" apps if the issue is explainable to other app developers. >> was to press the reset button on the computer. After the reboot, when I >> opened Firefox again, I noticed that all my bookmarks were gone. Those >> bookmarks were imported from my desktop machine a few days after I configured >> the new server. > > This is a firefox bug - I've seen it before (on my mother's machine). > > It's due to firefox not doing the correct thing with IO on the bookmarks > file. - -- Taisuke Yamada , http://rakugaki.org/ 2268 E9A2 D4F9 014E F11D 1DF7 DCA3 83BC 78E5 CD3A Message to my public address may not be handled in a timely manner. For a direct contact, please use my private address on my namecard. 
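On the question above of what the correct way to do IO is: one commonly recommended pattern for replacing a small file such as a bookmarks file is to write the new contents to a temporary file in the same directory, fsync it, and only then rename it over the old name. The following is a minimal sketch of that pattern, not a description of what firefox actually does; the function name atomic_write and the ".tmp-" prefix are made up for illustration, and it assumes a POSIX system where a directory can be opened and fsynced.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) so that a crash or power loss
    leaves either the complete old file or the complete new file.

    Hypothetical helper for illustration; assumes POSIX semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem),
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to push the data to disk
        # Only after the data is on disk is the old name pointed at it.
        os.replace(tmp_path, path)
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory as well so that the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that mkstemp creates the temporary file with mode 0600, so a real application would probably want to copy the permissions of the file it replaces. Truncating and rewriting the existing file in place, without an fsync, is the pattern that can leave an empty or partial file after a reset.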
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.1 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFF+Mrq3KODvHjlzToRAu/vAKC8pky15WJwocHbWhbRx9f2H+c5aQCeIeYp ZJPcSeawAIbZN80GXJz+kYg= =oAY3 -----END PGP SIGNATURE----- From owner-xfs@oss.sgi.com Thu Mar 15 02:07:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 02:07:45 -0700 (PDT) X-Spam-oss-Status: No, score=1.1 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_72 autolearn=no version=3.2.0-pre1-r499012 Received: from amsfep13-int.chello.nl (amsfep17-int.chello.nl [213.46.243.15] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2F97a6p027218 for ; Thu, 15 Mar 2007 02:07:40 -0700 Received: from cable-213-132-129-14.upc.chello.be ([213.132.129.14]) by amsfep13-int.chello.nl (InterMail vM.6.01.04.04 201-2131-118-104-20050224) with ESMTP id <20070315090733.OTOX14819.amsfep13-int.chello.nl@cable-213-132-129-14.upc.chello.be>; Thu, 15 Mar 2007 10:07:33 +0100 From: clflush To: Taisuke Yamada Subject: Re: Questions about XFS Date: Thu, 15 Mar 2007 10:07:32 +0100 User-Agent: KMail/1.9.6 References: <200703131440.56678.clflush@chello.be> <1173890016.20671.11.camel@localhost.localdomain> <45F8CAEA.3050408@list.rakugaki.org> In-Reply-To: <45F8CAEA.3050408@list.rakugaki.org> Cc: xfs@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="iso-2022-jp" Content-Disposition: inline Message-Id: <200703151007.32630.clflush@chello.be> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2F97f6p027235 X-archive-position: 10817 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: clflush@chello.be Precedence: bulk X-list: xfs Content-Length: 2807 Lines: 55 From what I know, and correct me if I'm wrong, XFS relies on the application side to do the right job but real world experience shows us that *a lot* of applications out there behave 
badly and cannot be trusted; hence, if something happens, XFS cannot "correct" the problem, leaving you with headaches depending on how much data you lost or corrupted and how important it was. IMHO, XFS *should* make some effort at assuring integrity to minimize the effects of badly written applications out there. I know that XFS wasn't written for PC class hardware in the first place, but most people do not read enough to understand XFS and use it on their desktops/laptops because, to be honest, Linux doesn't really have a good file system, and XFS, out of all available file systems, is the best in performance and scalability terms. On the one hand you have the old Ext3 FS, which doesn't perform very well in many areas but IMO is a lot safer to work on (doesn't lose data that easily compared to XFS - and I'm talking from experience here because I use both file systems and I lost much more on the XFS system than on the Ext3 one), and on the other hand you have this excellent XFS file system with its clean layout and awesome performance + fancy features like GRIO, extents, allocate on flush, real time volumes, etc *but* it is not "safe" enough to work with if you have unreliable hardware and/or a lot of power outage issues - I've never lost data on Ext3 during a power outage but have already lost data twice on XFS. Just my $0.02 On Thursday 15 March 2007 05:26:18 you wrote: > From end-user's POV, this infamous XFS behavior is somewhat > taken as XFS's inferiority compared to other filesystems. > Even with "bad" applications (ex. firefox), this rarely happens > on others, so regardless of what's on the FAQ, people logically > concludes that the fault belongs to XFS anyway. > > So, what is the correct way to do IO? > Is what firefox (and other bad apps) doing is so obvious(ly buggy), > that it'll be acknowledged as a bug once reported? Or is it simply > a mismatch between application expectation and XFS behavior, > requiring a non-(obvious|generic) fix? 
> > Although I'm not a filesystem developer, I'm pretty impressed with > XFS and willing to file a report/patch to those "buggy" apps if the > issue is explainable to other app developers. > > >> was to press the reset button on the computer. After the reboot, when I > >> opened Firefox again, I noticed that all my bookmarks were gone. Those > >> bookmarks were imported from my desktop machine a few days after I > >> configured the new server. > > > > This is a firefox bug - I've seen it before (on my mother's machine). > > > > It's due to firefox not doing the correct thing with IO on the bookmarks > > file. From owner-xfs@oss.sgi.com Thu Mar 15 02:16:46 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 02:16:51 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.edu.haifa.ac.il (mail.edu.haifa.ac.il [132.74.40.10]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2F9Gi6p028910 for ; Thu, 15 Mar 2007 02:16:46 -0700 Received: from localhost (localhost [127.0.0.1]) by mail.edu.haifa.ac.il (Postfix) with ESMTP id 61C0314BA0C for ; Thu, 15 Mar 2007 11:22:20 +0200 (IST) Received: from mail.edu.haifa.ac.il ([127.0.0.1]) by localhost (mail.edu.haifa.ac.il [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Pycjp+m-uYkd for ; Thu, 15 Mar 2007 11:22:20 +0200 (IST) Received: from kozanostra (leon.edu.haifa.ac.il [132.74.41.33]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) by mail.edu.haifa.ac.il (Postfix) with ESMTP id 1B09B193FE for ; Thu, 15 Mar 2007 11:22:20 +0200 (IST) From: "Leon Kolchinsky" To: Subject: cache+barriers vs cache+nobarriers vs disabled cache+barriers vs disabled cache+nobarriers Date: Thu, 15 Mar 2007 11:16:38 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset="windows-1255" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 thread-index: 
Acdm4qKx4R68Mmc1QByLL9d3MW/aCA== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028 Message-Id: <20070315092220.1B09B193FE@mail.edu.haifa.ac.il> X-archive-position: 10818 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: leonk@construct.haifa.ac.il Precedence: bulk X-list: xfs Content-Length: 869 Lines: 28 Hello All, After reading http://oss.sgi.com/projects/xfs/faq.html#wcache and some posts on the list I've got the following question: If I have disabled write cache on the disk (hdparm -W0 /dev/hda) and by default FS is mounted with "barrier" enabled, is there any point in keeping "barrier" enabled (the default) when the write cache is disabled anyway, or might it be a good idea to mount with "nobarrier" in this case? Or maybe I'm wrong here and the write cache has nothing to do with the "barrier" option? I thought that "barrier" is on by default to somewhat minimize the potential dangers of an enabled write cache. But if the write cache is disabled, would the "barrier" option just slow down FS performance (which is already slowed down by "hdparm -W0 /dev/hda" anyway)? 
Any inside wisdom on the subject of this mail would be much appreciated :) Best Regards, Leon Kolchinsky From owner-xfs@oss.sgi.com Thu Mar 15 04:35:12 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 04:35:17 -0700 (PDT) X-Spam-oss-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_00, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.netapp.com (mx2.netapp.com [216.240.18.37]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FBZB6p000450 for ; Thu, 15 Mar 2007 04:35:12 -0700 Received: from smtp2.corp.netapp.com ([10.57.159.114]) by mx2.netapp.com with ESMTP; 15 Mar 2007 04:35:10 -0700 X-IronPort-AV: i="4.14,288,1170662400"; d="scan'208"; a="41423013:sNHT19314358" Received: from svlexrs02.hq.netapp.com (svlexrs02.corp.netapp.com [10.57.156.154]) by smtp2.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id l2FBZA13004966 for ; Thu, 15 Mar 2007 04:35:10 -0700 (PDT) Received: from exsvlrb01.hq.netapp.com ([10.56.8.62]) by svlexrs02.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Thu, 15 Mar 2007 04:36:23 -0700 Received: from exnane01.hq.netapp.com ([10.97.0.61]) by exsvlrb01.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Thu, 15 Mar 2007 04:36:23 -0700 Received: from tmt.netapp.com ([10.30.32.42]) by exnane01.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); Thu, 15 Mar 2007 07:36:21 -0400 X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Thu, 15 Mar 2007 07:34:37 -0400 To: xfs@oss.sgi.com From: "Talpey, Thomas" Subject: Re: Strange XFS issue on tiny-NAS ARM NFS server Cc: "Talpey, Thomas" In-Reply-To: References: <45F85BFA.1070505@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Message-ID: X-OriginalArrivalTime: 15 Mar 2007 11:36:21.0412 (UTC) FILETIME=[27814640:01C766F6] X-archive-position: 10819 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Thomas.Talpey@netapp.com Precedence: bulk X-list: 
xfs Content-Length: 1855 Lines: 52 Evidently that was not the only compiling issue (surprise): ><6>attempt to access beyond end of device ><6>sda3: rw=2, want=2574098408, limit=154272195 ><6>attempt to access beyond end of device ><6>sda3: rw=2, want=2574098408, limit=154272195 ><6>attempt to access beyond end of device ><6>sda3: rw=2, want=2574098408, limit=154272195 ><6>attempt to access beyond end of device ><6>sda3: rw=0, want=2574098408, limit=154272195 ><1>I/O error in filesystem ("sda3") meta-data dev sda3 block 0x996d9fe0 ("xfs_trans_read_buf") error 5 buf count 4096 Oh, well. Tom. At 05:19 PM 3/14/2007, Talpey, Thomas wrote: >Wow good memory, 3 years ago: >20287.html> > >I'm compiling with gcc4.1.1 for arm5t big endian, patched for multiple >arm ports (OpenEmbedded) and it still botches that code today. >Works fine with the arithmetic decomposition in the message. >And I never blame the compiler! ;-) > >Thanks! > >BTW, XFS gives this little machine a nice bump in NFS write >bandwidth. Goes from ~6MB/sec to ~7MB/s. CPU limited, mainly. > >Tom. > >At 04:32 PM 3/14/2007, Eric Sandeen wrote: >>Talpey, Thomas wrote: >>> This might be pilot error, but a *very* strange thing happens with >>> an XFS filesystem on an NFS server I'm experimenting with. This is >>> an NSLU2 ARM-based machine, running 2.6.20.1 and an XFS filesystem >>> freshly built on a usb-attached 2.5" drive. >>> >>> Running Connectathon 04 basic tests against the server, things are >>> fine with an EXT-formatted filesystem. However, reformatting the >>> export as a default XFS filesystem (mkfs.xfs -f /dev/sda3), the >>> following occurs: >> >>arm compiler has bugs that miscompile xfs... I think if you google arm + >>xfs and maybe search the list archives, you'll find a possible workaround. 
>> >>-Eric >> >> From owner-xfs@oss.sgi.com Thu Mar 15 04:39:38 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 04:39:42 -0700 (PDT) X-Spam-oss-Status: No, score=2.0 required=5.0 tests=BAYES_80 autolearn=no version=3.2.0-pre1-r499012 Received: from donner.stsci.edu (donner.stsci.edu [130.167.251.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FBdZ6p001711 for ; Thu, 15 Mar 2007 04:39:38 -0700 Received: from [130.167.102.81] (redhat1.stsci.edu [130.167.102.81]) by donner.stsci.edu (MOS 3.8.3-GA) with ESMTP id FJE33979; Thu, 15 Mar 2007 07:25:48 -0400 (EDT) Message-ID: <45F92D8C.3090708@stsci.edu> Date: Thu, 15 Mar 2007 07:27:08 -0400 From: Thomas Walker User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050317) X-Accept-Language: en-us, en MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: Should xfs_repair take this long? Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Junkmail-Whitelist: YES (by domain whitelist at donner.stsci.edu) X-archive-position: 10820 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: walker@stsci.edu Precedence: bulk X-list: xfs Content-Length: 1082 Lines: 24 I am trying to restore a corrupt xfs partition. It is 6TB total, it is an LVM of two 3TB fiber channel SAN volumes. The host is running RHEL4, 2.6.9-42.0.2.ELsmp, and the version of xfsprogs is xfsprogs-2.6.13-2. The host has four threaded AMD Opterons, 4GB of RAM and 2GB of swap located on an internal SCSI disk. It is unclear how the xfs partition was damaged, but it reports a bad superblock and will not mount. I am running this command; xfs_repair -o assume_xfs /dev/mapper/vg0-hladata3 This command has been running for two days now. There is cpu activity and i/o activity on the physical SAN. There is some swapping but not an unusual amount and swapon -s shows only a small amount in use. 
I have seen information implying xfs_repair needs a large amount of memory to work well, otherwise it will take a long time. My question is, given my setup, is there an estimate of how long I should wait before expecting a result? Should I add swap space? Is there anything else I should do? Thanks in advance for any help. Thomas Walker From owner-xfs@oss.sgi.com Thu Mar 15 06:01:16 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 06:01:20 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lichtvoll.de (mondschein.lichtvoll.de [194.150.191.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FD1E6p018217 for ; Thu, 15 Mar 2007 06:01:15 -0700 Received: from localhost (dslb-084-056-119-204.pools.arcor-ip.net [84.56.119.204]) by mail.lichtvoll.de (Postfix) with ESMTP id B19D95ADEC for ; Thu, 15 Mar 2007 13:39:37 +0100 (CET) From: Martin Steigerwald To: linux-xfs@oss.sgi.com Subject: Re: cache+barriers vs cache+nobarriers vs disabled cache+barriers vs disabled cache+nobarriers Date: Thu, 15 Mar 2007 13:39:30 +0100 User-Agent: KMail/1.9.6 References: <20070315092220.1B09B193FE@mail.edu.haifa.ac.il> In-Reply-To: <20070315092220.1B09B193FE@mail.edu.haifa.ac.il> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart12764108.IlLcJFcpmg"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200703151339.36259.Martin@lichtvoll.de> X-archive-position: 10821 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Martin@lichtvoll.de Precedence: bulk X-list: xfs Content-Length: 2160 Lines: 61 --nextPart12764108.IlLcJFcpmg Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Thursday 15 March 2007, Leon Kolchinsky wrote: > Hello All, > > > After reading 
http://oss.sgi.com/projects/xfs/faq.html#wcache > and some posts on the list I've got the following question: > > If I have disabled write cache on the disk (hdparm -W0 /dev/hda) and by > default FS is mounted with "barrier" enabled, Is there any taste in > enabling "barrier"(by default) because write cache is disabled anyway > or may be it's a good idea to mount with "nobarriers" in this case? Hello Leon! There is no need to enable barriers when the write cache is disabled. Enabling barriers in this case shouldn't have any visible effect, I think. > I thought that "barrier" is on by default to somewhat minimize > potential dangers of enabled write cache? But if write cache is > disabled, would "barrier" option just slow down the FS performance > (which is already slowed down by "hdparm -W0 /dev/hda" anyway)? I think it wouldn't slow things down any further, except maybe minimally due to a little more code being executed inside XFS. But why do you want to disable the write cache in the first place? As long as you are using 2.6.17.7 or later you can safely enable barriers and write cache. At least that is my experience up to 2.6.20.1 with the old IDE drivers, and now for a few hours on my ThinkPad T42 with 2.6.20.3 and the libata drivers. With write barriers and the write cache enabled, XFS will not be as fast as without write barriers and with the write cache enabled (which is the unsafe combination), but it will still be faster than with the write cache disabled. 
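The four combinations from the subject line can be put side by side as a small lookup table. This is only a sketch that restates the reasoning in this message in code form; the names COMBINATIONS and describe are made up for illustration, the notes are rough characterisations rather than measurements, and the entry for a disabled cache without barriers is an inference from the statement that barriers are not needed in that case.

```python
# (write_cache_enabled, barriers_enabled) -> (crash_safe, note)
# Hypothetical table summarising the four cases discussed in this thread.
COMBINATIONS = {
    (True, True): (
        True,
        "safe on 2.6.17.7 or later; slower than cache without barriers, "
        "but faster than running with the write cache disabled",
    ),
    (True, False): (
        False,
        "fastest, but this is the unsafe combination",
    ),
    (False, True): (
        True,
        "barriers are not needed here; at most a minimal extra cost",
    ),
    (False, False): (
        True,
        "inferred: nothing is cached on the drive, so barriers have "
        "nothing to protect; pays the cost of the disabled cache",
    ),
}


def describe(write_cache_enabled, barriers_enabled):
    """Return (crash_safe, note) for one combination."""
    return COMBINATIONS[(write_cache_enabled, barriers_enabled)]


if __name__ == "__main__":
    for (cache, barriers), (safe, note) in sorted(COMBINATIONS.items()):
        print(f"cache={cache!s:5} barriers={barriers!s:5} safe={safe!s:5} {note}")
```

The only entry marked unsafe is the one with the write cache enabled and barriers turned off.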
Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7

--nextPart12764108.IlLcJFcpmg Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQBF+T6ImRvqrKWZhMcRAq3ZAJwKXM5jZFwQYMzp3nUN5vqmCRkehwCeL8VF
WgwkoRpR/gI3Dx2DNoH1RCU=
=QpPo
-----END PGP SIGNATURE-----

--nextPart12764108.IlLcJFcpmg--

From owner-xfs@oss.sgi.com Thu Mar 15 06:07:38 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 06:07:42 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_00,RCVD_IN_PSBL autolearn=no version=3.2.0-pre1-r499012 Received: from mail.lichtvoll.de (mondschein.lichtvoll.de [194.150.191.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FD7a6p019729 for ; Thu, 15 Mar 2007 06:07:38 -0700 Received: from localhost (dslb-084-056-091-239.pools.arcor-ip.net [84.56.91.239]) by mail.lichtvoll.de (Postfix) with ESMTP id CC7095ADEC for ; Thu, 15 Mar 2007 14:07:35 +0100 (CET) From: Martin Steigerwald To: linux-xfs@oss.sgi.com Subject: Re: cache+barriers vs cache+nobarriers vs disabled cache+barriers vs disabled cache+nobarriers Date: Thu, 15 Mar 2007 14:07:34 +0100 User-Agent: KMail/1.9.6 References: <20070315092220.1B09B193FE@mail.edu.haifa.ac.il> <200703151339.36259.Martin@lichtvoll.de> In-Reply-To: <200703151339.36259.Martin@lichtvoll.de> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Disposition: inline Message-Id: <200703151407.34419.Martin@lichtvoll.de> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2FD7c6p019739 X-archive-position: 10822 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Martin@lichtvoll.de Precedence: bulk X-list: xfs Content-Length: 665 Lines: 20

On Thursday, 15 March 2007, Martin Steigerwald wrote:
> But why do you want to disable write cache
in the first place? As long
> as you are using 2.6.17.7 or later you can safely enable barriers
> and write cache.

Hello again,

Leon, of course, using write barriers is only possible if the hardware requirements (cache flushes or similar mechanisms) are met. XFS complains if it can't use barriers. See dmesg or the log for details.

It seems that the mailing list software broke my GPG signature. It was correct when I sent the mail.

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7

From owner-xfs@oss.sgi.com Thu Mar 15 07:06:51 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 07:06:55 -0700 (PDT) X-Spam-oss-Status: No, score=1.0 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from donner.stsci.edu (donner.stsci.edu [130.167.251.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FE6n6p031608 for ; Thu, 15 Mar 2007 07:06:50 -0700 Received: from [130.167.102.81] (redhat1.stsci.edu [130.167.102.81]) by donner.stsci.edu (MOS 3.8.3-GA) with ESMTP id FJF26083; Thu, 15 Mar 2007 10:05:22 -0400 (EDT) Message-ID: <45F952F2.6000008@stsci.edu> Date: Thu, 15 Mar 2007 10:06:42 -0400 From: Thomas Walker User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050317) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Emmanuel Florac CC: xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? References: <45F92D8C.3090708@stsci.edu> <20070315150422.7bc5d178@harpe.intellique.com> In-Reply-To: <20070315150422.7bc5d178@harpe.intellique.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Junkmail-Whitelist: YES (by domain whitelist at donner.stsci.edu) X-archive-position: 10823 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: walker@stsci.edu Precedence: bulk X-list: xfs Content-Length: 775 Lines: 34

The terminal shows a lot of "."
dots running across the screen quickly, and every few hours it says this; .....................................................found candidate secondary superblock... unable to verify superblock, continuing... found candidate secondary superblock... unable to verify superblock, continuing... Thomas Walker Emmanuel Florac wrote: >Le Thu, 15 Mar 2007 07:27:08 -0400 >Thomas Walker écrivait: > > > >>xfs_repair -o assume_xfs /dev/mapper/vg0-hladata3 >> >> This command has been running for two days now. >> >> > >Is there any output from xfs_repair ? This doesn't sound good. I've run >xfs_repair on some badly corrupted fs up to 13 TB, and it never took >more than a couple of minutes. > > > From owner-xfs@oss.sgi.com Thu Mar 15 07:23:46 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 07:24:57 -0700 (PDT) X-Spam-oss-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtp-ft4.fr.colt.net (smtp-ft4.fr.colt.net [213.41.78.208]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FENg6p002845 for ; Thu, 15 Mar 2007 07:23:44 -0700 Received: from harpe.intellique.com (host.93.124.68.195.rev.coltfrance.com [195.68.124.93]) by smtp-ft4.fr.colt.net (8.13.4/8.13.4/Debian-3sarge3) with ESMTP id l2FE4O6U002621; Thu, 15 Mar 2007 15:04:25 +0100 Date: Thu, 15 Mar 2007 15:04:22 +0100 From: Emmanuel Florac To: Thomas Walker Cc: xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? 
Message-ID: <20070315150422.7bc5d178@harpe.intellique.com> In-Reply-To: <45F92D8C.3090708@stsci.edu> References: <45F92D8C.3090708@stsci.edu> Organization: Intellique X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2FENk6p002853 X-archive-position: 10824 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: eflorac@intellique.com Precedence: bulk X-list: xfs Content-Length: 485 Lines: 17 Le Thu, 15 Mar 2007 07:27:08 -0400 Thomas Walker écrivait: > xfs_repair -o assume_xfs /dev/mapper/vg0-hladata3 > > This command has been running for two days now. Is there any output from xfs_repair ? This doesn't sound good. I've run xfs_repair on some badly corrupted fs up to 13 TB, and it never took more than a couple of minutes. -- ---------------------------------------- Emmanuel Florac | Intellique ---------------------------------------- From owner-xfs@oss.sgi.com Thu Mar 15 07:40:30 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 07:40:38 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=BAYES_50,HTML_MESSAGE autolearn=ham version=3.2.0-pre1-r499012 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FEeS6p006281 for ; Thu, 15 Mar 2007 07:40:30 -0700 Received: from orsmga001.jf.intel.com ([10.7.209.18]) by mga02.intel.com with ESMTP; 15 Mar 2007 07:30:07 -0700 Received: from fmsmsx334.amr.corp.intel.com ([132.233.42.1]) by orsmga001.jf.intel.com with ESMTP; 15 Mar 2007 07:30:06 -0700 X-ExtLoop1: 1 X-IronPort-AV: i="4.14,288,1170662400"; d="scan'208,217"; a="210129018:sNHT33575164" Received: from swsmsx411.ger.corp.intel.com ([172.28.128.17]) by fmsmsx334.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.1830); Thu, 15 Mar 
2007 07:30:06 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 MIME-Version: 1.0 Subject: XFS on Redhat4 U4 Date: Thu, 15 Mar 2007 14:30:03 -0000 Message-ID: <4BE67D4A1485054E99B21710FF21A9CD020BE8A8@swsmsx411.ger.corp.intel.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: XFS on Redhat4 U4 Thread-Index: AcdnDmuXu9ftQtELTXCm3OzuNi+ZOA== From: "Carassale, Mario" To: X-OriginalArrivalTime: 15 Mar 2007 14:30:06.0529 (UTC) FILETIME=[6D5BE310:01C7670E] Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit X-archive-position: 10825 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mario.carassale@intel.com Precedence: bulk X-list: xfs Content-Length: 154 Lines: 38 Hi, Is there any way to run XFS on RedHat 4 U4. Regards, Mario Carassale [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Mar 15 07:59:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 07:59:07 -0700 (PDT) X-Spam-oss-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_50,RCVD_BAD_ID autolearn=no version=3.2.0-pre1-r499012 Received: from evaldomino.Falconstor.com (mail1.falconstor.com [216.223.47.230]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FEx26q010004 for ; Thu, 15 Mar 2007 07:59:04 -0700 Received: from [10.3.4.156] ([10.3.4.156]) by falconstormail.falconstor.net (Lotus Domino Release 5.0.11) with ESMTP id 2007031510550544:4432 ; Thu, 15 Mar 2007 10:55:05 -0400 Message-ID: <45F95E77.10207@falconstor.com> Date: Thu, 15 Mar 2007 10:55:51 -0400 From: "Geir A. Myrestrand" Reply-To: geir.myrestrand@falconstor.com Organization: FalconStor Software, Inc. 
User-Agent: Thunderbird 1.5.0.10 (Windows/20070221) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: Re: XFS on Redhat4 U4 References: <4BE67D4A1485054E99B21710FF21A9CD020BE8A8@swsmsx411.ger.corp.intel.com> In-Reply-To: <4BE67D4A1485054E99B21710FF21A9CD020BE8A8@swsmsx411.ger.corp.intel.com> X-MIMETrack: Itemize by SMTP Server on FalconstorMail/FalconStor(Release 5.0.11 |July 24, 2002) at 03/15/2007 10:55:05 AM, Serialize by Router on evaldomino/FalconStor(Release 5.0.11 |July 24, 2002) at 03/15/2007 11:00:04 AM, Serialize complete at 03/15/2007 11:00:04 AM Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=ISO-8859-1; format=flowed X-archive-position: 10826 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: geir.myrestrand@falconstor.com Precedence: bulk X-list: xfs Content-Length: 245 Lines: 20 Carassale, Mario wrote: > Hi, > > > > > > Is there any way to run XFS on RedHat 4 U4. > Yes. Eric Sandeen made a source RPM for the XFS kernel modules for RHEL4: ftp://oss.sgi.com/projects/xfs/testing/RHEL4/ -- Geir A. Myrestrand From owner-xfs@oss.sgi.com Thu Mar 15 07:59:03 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 07:59:08 -0700 (PDT) X-Spam-oss-Status: No, score=0.3 required=5.0 tests=AWL,BAYES_50,RCVD_BAD_ID autolearn=no version=3.2.0-pre1-r499012 Received: from evaldomino.Falconstor.com (mail1.falconstor.com [216.223.47.230]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FEx26p010004 for ; Thu, 15 Mar 2007 07:59:03 -0700 Received: from [10.3.4.156] ([10.3.4.156]) by falconstormail.falconstor.net (Lotus Domino Release 5.0.11) with ESMTP id 2007031510405214:4423 ; Thu, 15 Mar 2007 10:40:52 -0400 Message-ID: <45F95B21.9040603@falconstor.com> Date: Thu, 15 Mar 2007 10:41:37 -0400 From: "Geir A. Myrestrand" Reply-To: geir.myrestrand@falconstor.com Organization: FalconStor Software, Inc. 
User-Agent: Thunderbird 1.5.0.10 (Windows/20070221) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: Re: Questions about XFS References: <200703131440.56678.clflush@chello.be> <1173890016.20671.11.camel@localhost.localdomain> <45F8CAEA.3050408@list.rakugaki.org> <200703151007.32630.clflush@chello.be> In-Reply-To: <200703151007.32630.clflush@chello.be> X-MIMETrack: Itemize by SMTP Server on FalconstorMail/FalconStor(Release 5.0.11 |July 24, 2002) at 03/15/2007 10:40:53 AM, Serialize by Router on evaldomino/FalconStor(Release 5.0.11 |July 24, 2002) at 03/15/2007 11:00:04 AM, Serialize complete at 03/15/2007 11:00:04 AM Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=ISO-2022-JP X-archive-position: 10827 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: geir.myrestrand@falconstor.com Precedence: bulk X-list: xfs Content-Length: 886 Lines: 23 clflush wrote: > On the one hand you have the old Ext3 FS which doesn't perform very well in > many areas but IMO is a lot safer to work on (doesn't loose data that easily > compared to XFS - and I'm talking from experience here because I use both > file systems and I lost much more on the XFS system than on the Ext3 one) and > on the other hand you have this excellent XFS file system with its clean > layout and awesome performance + fancy features like GRIO, extents, allocate > on flush, real time volumes, etc *but* is not "safe" enough to work with if > you have unreliable hardware and/or a lot of power outage issues - I've > never lost data on Ext3 during a power outage but already lost 2 times data > on XFS You *always* use a UPS when you use XFS. XFS does not prevent power outages [yet]... > Just my $0.02 Save them for a UPS. ;-) -- Geir A. 
Myrestrand From owner-xfs@oss.sgi.com Thu Mar 15 08:23:39 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 08:23:45 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_20 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtp-ft5.fr.colt.net (smtp-ft5.fr.colt.net [213.41.78.197]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FFNb6p020428 for ; Thu, 15 Mar 2007 08:23:38 -0700 Received: from harpe.intellique.com (host.93.124.68.195.rev.coltfrance.com [195.68.124.93]) by smtp-ft5.fr.colt.net (8.13.4/8.13.4/Debian-3sarge3) with ESMTP id l2FFNXhF025604; Thu, 15 Mar 2007 16:23:34 +0100 Date: Thu, 15 Mar 2007 16:23:33 +0100 From: Emmanuel Florac To: Thomas Walker , xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? Message-ID: <20070315162333.72f34d58@harpe.intellique.com> In-Reply-To: <45F96150.50001@stsci.edu> References: <45F92D8C.3090708@stsci.edu> <20070315150422.7bc5d178@harpe.intellique.com> <45F952F2.6000008@stsci.edu> <20070315160309.652a6e0c@harpe.intellique.com> <45F96150.50001@stsci.edu> Organization: Intellique X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2FFNd6p020445 X-archive-position: 10828 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: eflorac@intellique.com Precedence: bulk X-list: xfs Content-Length: 993 Lines: 31 Le Thu, 15 Mar 2007 11:08:00 -0400 Thomas Walker écrivait: > So if I see I/O activity and cpu activity, which I do, should I > assume that eventually the repair should return? The repair _MAY_ return, unfortunately... > We are thinking of > interrupting it and trying to add more memory and restarting it. > Maybe we should just let it go. Yes, let it go now... 
> If you think it might finish some
> time, even if it's going to be another day or two, then I'm willing
> to be patient. I'm just worried it might be going around in circles.
>

It should finish if you're testing the right device and the LV is properly assembled. If it reaches the end of the device and finds nothing, you should restart LVM first to check that your PV/VG/LV are correctly set up, then retry. If it doesn't work after that, your backup will be your last friend.

-- 
----------------------------------------
Emmanuel Florac | Intellique
----------------------------------------

From owner-xfs@oss.sgi.com Thu Mar 15 08:27:51 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 08:27:56 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=AWL,BAYES_20 autolearn=ham version=3.2.0-pre1-r499012 Received: from donner.stsci.edu (donner.stsci.edu [130.167.251.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FFRo6p021727 for ; Thu, 15 Mar 2007 08:27:50 -0700 Received: from [130.167.102.81] (redhat1.stsci.edu [130.167.102.81]) by donner.stsci.edu (MOS 3.8.3-GA) with ESMTP id FJF66464; Thu, 15 Mar 2007 11:26:27 -0400 (EDT) Message-ID: <45F965F3.10902@stsci.edu> Date: Thu, 15 Mar 2007 11:27:47 -0400 From: Thomas Walker User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050317) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Emmanuel Florac CC: xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long?
References: <45F92D8C.3090708@stsci.edu> <20070315150422.7bc5d178@harpe.intellique.com> <45F952F2.6000008@stsci.edu> <20070315160309.652a6e0c@harpe.intellique.com> <45F96150.50001@stsci.edu> <20070315162333.72f34d58@harpe.intellique.com> In-Reply-To: <20070315162333.72f34d58@harpe.intellique.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Junkmail-Whitelist: YES (by domain whitelist at donner.stsci.edu) X-archive-position: 10829 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: walker@stsci.edu Precedence: bulk X-list: xfs Content-Length: 1371 Lines: 53 I checked the status of the LVM before starting xfs_repair. I can't promise it's all in order, but at least the various pvdisplay, vgdisplay, lvdisplay, etc all came back normal. So... I'll just wait a few days and hope that xfs_repair comes back with something eventually. thanks for at least taking an interest, too bad that there's nothing we can really do about it. Thomas Walker Emmanuel Florac wrote: >Le Thu, 15 Mar 2007 11:08:00 -0400 >Thomas Walker écrivait: > > > >> So if I see I/O activity and cpu activity, which I do, should I >>assume that eventually the repair should return? >> >> > >The repair _MAY_ return, unfortunately... > > > >> We are thinking of >>interrupting it and trying to add more memory and restarting it. >>Maybe we should just let it go. >> >> > >Yes, let it go now... > > > >>If you think it might finish some >>time, even if it's going to be another day or two, then I'm willing >>to be patient. I'm just worried it might be going around in circles. >> >> >> > >It should finish if you're testing the right device, and the LV is >properly assembled. If it reach the end of the device and find nothing, >you should restart LVM first to check that your PV/VG/LV are correctly >set up, and retry. >If it doesn't work after that, backup will be your last friend. 
> > > From owner-xfs@oss.sgi.com Thu Mar 15 08:37:49 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 08:37:55 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from chaos.egr.duke.edu (chaos.egr.duke.edu [152.3.195.82]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FFbm6p023745 for ; Thu, 15 Mar 2007 08:37:49 -0700 Received: from chaos.egr.duke.edu (localhost.localdomain [127.0.0.1]) by chaos.egr.duke.edu (8.13.1/8.13.1) with ESMTP id l2FEl9xl009854; Thu, 15 Mar 2007 10:47:09 -0400 Received: from localhost (jlb@localhost) by chaos.egr.duke.edu (8.13.1/8.13.1/Submit) with ESMTP id l2FEl96C009850; Thu, 15 Mar 2007 10:47:09 -0400 X-Authentication-Warning: chaos.egr.duke.edu: jlb owned process doing -bs Date: Thu, 15 Mar 2007 10:47:09 -0400 (EDT) From: Joshua Baker-LePain X-X-Sender: jlb@chaos.egr.duke.edu To: "Carassale, Mario" cc: xfs@oss.sgi.com Subject: Re: XFS on Redhat4 U4 In-Reply-To: <4BE67D4A1485054E99B21710FF21A9CD020BE8A8@swsmsx411.ger.corp.intel.com> Message-ID: References: <4BE67D4A1485054E99B21710FF21A9CD020BE8A8@swsmsx411.ger.corp.in tel.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=us-ascii X-archive-position: 10830 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jlb17@duke.edu Precedence: bulk X-list: xfs Content-Length: 499 Lines: 16 On Thu, 15 Mar 2007 at 2:30pm, Carassale, Mario wrote > Is there any way to run XFS on RedHat 4 U4. > Centos (a RHEL rebuild distro) provides a kernel module RPM with updated (from that contained in the RHEL kernel source) XFS code. See (e.g.): http://mirror.centos.org/centos/4/centosplus/i386/RPMS/kernel-module-xfs-2.6.9-42.0.10.ELsmp-0.2-1.i686.rpm (modify based on kernel version and architecture, obviously). 
-- Joshua Baker-LePain Department of Biomedical Engineering Duke University From owner-xfs@oss.sgi.com Thu Mar 15 08:52:03 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 08:52:10 -0700 (PDT) X-Spam-oss-Status: No, score=-0.4 required=5.0 tests=AWL,BAYES_20 autolearn=ham version=3.2.0-pre1-r499012 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FFpx6p026890 for ; Thu, 15 Mar 2007 08:52:03 -0700 Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by mga09.intel.com with ESMTP; 15 Mar 2007 08:51:56 -0700 Received: from fmsmsx333.amr.corp.intel.com ([132.233.42.2]) by fmsmga002.fm.intel.com with ESMTP; 15 Mar 2007 08:51:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: i="4.14,289,1170662400"; d="scan'208"; a="59000496:sNHT22605687" Received: from swsmsx411.ger.corp.intel.com ([172.28.128.17]) by fmsmsx333.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.1830); Thu, 15 Mar 2007 08:51:49 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: XFS on Redhat4 U4 Date: Thu, 15 Mar 2007 15:51:34 -0000 Message-ID: <4BE67D4A1485054E99B21710FF21A9CD020BEA29@swsmsx411.ger.corp.intel.com> In-Reply-To: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: XFS on Redhat4 U4 Thread-Index: AcdnENfQNLPihV5iR3qo7Cyl8q9GMAACN0LQ From: "Carassale, Mario" To: "Joshua Baker-LePain" Cc: X-OriginalArrivalTime: 15 Mar 2007 15:51:49.0835 (UTC) FILETIME=[D7F501B0:01C76719] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2FFq36p026916 X-archive-position: 10831 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mario.carassale@intel.com Precedence: bulk X-list: xfs Content-Length: 913 Lines: 42 Do we have a guide on how to build this, or is just an rpm that need to be 
installed after OS installation. Regards, Mario Carassale tel: 01793 404729 iTel: (8) 281 4729 Intel Corporation (UK) Ltd mario.carassale@intel.com -----Original Message----- From: Joshua Baker-LePain [mailto:jlb17@duke.edu] Sent: 15 March 2007 14:47 To: Carassale, Mario Cc: xfs@oss.sgi.com Subject: Re: XFS on Redhat4 U4 On Thu, 15 Mar 2007 at 2:30pm, Carassale, Mario wrote > Is there any way to run XFS on RedHat 4 U4. > Centos (a RHEL rebuild distro) provides a kernel module RPM with updated (from that contained in the RHEL kernel source) XFS code. See (e.g.): http://mirror.centos.org/centos/4/centosplus/i386/RPMS/kernel-module-xfs -2.6.9-42.0.10.ELsmp-0.2-1.i686.rpm (modify based on kernel version and architecture, obviously). -- Joshua Baker-LePain Department of Biomedical Engineering Duke University From owner-xfs@oss.sgi.com Thu Mar 15 08:54:43 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 08:54:46 -0700 (PDT) X-Spam-oss-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_20, J_CHICKENPOX_43 autolearn=no version=3.2.0-pre1-r499012 Received: from chaos.egr.duke.edu (chaos.egr.duke.edu [152.3.195.82]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FFsg6p027785 for ; Thu, 15 Mar 2007 08:54:43 -0700 Received: from chaos.egr.duke.edu (localhost.localdomain [127.0.0.1]) by chaos.egr.duke.edu (8.13.1/8.13.1) with ESMTP id l2FFsdad010102; Thu, 15 Mar 2007 11:54:39 -0400 Received: from localhost (jlb@localhost) by chaos.egr.duke.edu (8.13.1/8.13.1/Submit) with ESMTP id l2FFscGh010099; Thu, 15 Mar 2007 11:54:39 -0400 X-Authentication-Warning: chaos.egr.duke.edu: jlb owned process doing -bs Date: Thu, 15 Mar 2007 11:54:38 -0400 (EDT) From: Joshua Baker-LePain X-X-Sender: jlb@chaos.egr.duke.edu To: "Carassale, Mario" cc: xfs@oss.sgi.com Subject: RE: XFS on Redhat4 U4 In-Reply-To: <4BE67D4A1485054E99B21710FF21A9CD020BEA29@swsmsx411.ger.corp.intel.com> Message-ID: References: <4BE67D4A1485054E99B21710FF21A9CD020BEA29@swsmsx411.ger.corp.in 
tel.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=us-ascii X-archive-position: 10832 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jlb17@duke.edu Precedence: bulk X-list: xfs Content-Length: 311 Lines: 12 On Thu, 15 Mar 2007 at 3:51pm, Carassale, Mario wrote > Do we have a guide on how to build this, or is just an rpm that need to > be installed after OS installation. The latter. You'll also need the xfsprogs RPM to get mkfs.xfs. -- Joshua Baker-LePain Department of Biomedical Engineering Duke University From owner-xfs@oss.sgi.com Thu Mar 15 09:29:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 09:29:18 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=BAYES_50,J_CHICKENPOX_23 autolearn=no version=3.2.0-pre1-r499012 Received: from mx2.suse.de (cantor2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FGT96p002553 for ; Thu, 15 Mar 2007 09:29:11 -0700 Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 0598A2158B; Thu, 15 Mar 2007 17:17:05 +0100 (CET) Date: Thu, 15 Mar 2007 17:17:04 +0100 From: Nick Piggin To: Linux Filesystems , Mark Fasheh Cc: reiserfs-list@namesys.com, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, nfs@lists.sourceforge.net, cluster-devel@redhat.com, jfs-discussion@lists.sourceforge.net Subject: Announce: new-aops-1 for 2.6.21-rc3 Message-ID: <20070315161704.GH8321@wotan.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-archive-position: 10833 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: npiggin@suse.de Precedence: bulk X-list: xfs Content-Length: 948 Lines: 22 OK, I've gone through and fixed 
several bugs until the thing actually survives fsx-linux for both ext2 and ext3 ordered and writeback (both when using the new aops, and the legacy prepare_write path). Actually ext3 sometimes breaks, but it does in unpatched kernels anyway. At 15 patches (including the initial buffered write deadlock fixes), it is too much to keep posting -- not much has fundamentally changed, so I'll just post occasionally if we make big changes. The quilt format is probably easier for someone wishing to work on it anyway. http://www.kernel.org/pub/linux/kernel/people/npiggin/patches/new-aops/ (excludes the OCFS2 patch that Mark sent, in anticipation of an update) It would be really nice if filesystem developers could take a look at the new interfaces some time, because otherwise they might get stuck with it :) So I'm cc'ing a few filesystems that come to mind, that I haven't heard anything from. Thanks, Nick From owner-xfs@oss.sgi.com Thu Mar 15 12:58:00 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 12:58:08 -0700 (PDT) X-Spam-oss-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FJvx6p013950 for ; Thu, 15 Mar 2007 12:57:59 -0700 Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 9725B2177F; Thu, 15 Mar 2007 20:57:57 +0100 (CET) Date: Thu, 15 Mar 2007 20:57:57 +0100 From: Nick Piggin To: Mark Fasheh Cc: Linux Filesystems , reiserfs-list@namesys.com, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, nfs@lists.sourceforge.net, cluster-devel@redhat.com, jfs-discussion@lists.sourceforge.net Subject: Re: Announce: new-aops-1 for 2.6.21-rc3 Message-ID: <20070315195757.GB19625@wotan.suse.de> References: <20070315161704.GH8321@wotan.suse.de> 
<20070315195351.GE21942@ca-server1.us.oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070315195351.GE21942@ca-server1.us.oracle.com> User-Agent: Mutt/1.5.9i X-archive-position: 10835 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: npiggin@suse.de Precedence: bulk X-list: xfs Content-Length: 770 Lines: 16 On Thu, Mar 15, 2007 at 12:53:51PM -0700, Mark Fasheh wrote: > On Thu, Mar 15, 2007 at 05:17:04PM +0100, Nick Piggin wrote: > > OK, I've gone through and fixed several bugs until the thing actually > > survives fsx-linux for both ext2 and ext3 ordered and writeback (both > > when using the new aops, and the legacy prepare_write path). Actually > > ext3 sometimes breaks, but it does in unpatched kernels anyway. > > > > At 15 patches (including the initial buffered write deadlock fixes), > > it is too much to keep posting -- not much has fundamentally changed, > > so I'll just post occasionally if we make big changes. The quilt > > format is probably easier for someone wishing to work on it anyway. > > Hmm, we still left out some exports... Thanks, applied. 
From owner-xfs@oss.sgi.com Thu Mar 15 12:57:09 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 12:57:15 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx1.suse.de (cantor.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FJv76p013704 for ; Thu, 15 Mar 2007 12:57:09 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 0B364122BB; Thu, 15 Mar 2007 20:57:06 +0100 (CET) Date: Thu, 15 Mar 2007 20:57:05 +0100 From: Nick Piggin To: Joel Becker Cc: Linux Filesystems , Mark Fasheh , reiserfs-list@namesys.com, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, nfs@lists.sourceforge.net, cluster-devel@redhat.com, jfs-discussion@lists.sourceforge.net Subject: Re: Announce: new-aops-1 for 2.6.21-rc3 Message-ID: <20070315195705.GA19625@wotan.suse.de> References: <20070315161704.GH8321@wotan.suse.de> <20070315193245.GC20528@ca-server1.us.oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070315193245.GC20528@ca-server1.us.oracle.com> User-Agent: Mutt/1.5.9i X-archive-position: 10834 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: npiggin@suse.de Precedence: bulk X-list: xfs Content-Length: 775 Lines: 18 On Thu, Mar 15, 2007 at 12:32:45PM -0700, Joel Becker wrote: > On Thu, Mar 15, 2007 at 05:17:04PM +0100, Nick Piggin wrote: > > At 15 patches (including the initial buffered write deadlock fixes), > > it is too much to keep posting -- not much has fundamentally changed, > > so I'll just post occasionally if we make big changes. The quilt > > format is probably easier for someone wishing to work on it anyway. 
> > > > http://www.kernel.org/pub/linux/kernel/people/npiggin/patches/new-aops/ > > For future drops, can you provide the unpacked patches too, so > lazy people like me can read them in the browser? Thanks. Sorry, I did intend to unpack that, but forgot. It's done now, the new directory containing the patches is under the same URL as above. Thanks, Nick From owner-xfs@oss.sgi.com Thu Mar 15 13:37:55 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 13:38:00 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from agminet02.oracle.com (agminet02.oracle.com [141.146.126.229]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FKbr6p021468 for ; Thu, 15 Mar 2007 13:37:55 -0700 Received: from agminet01.oracle.com (agminet01.oracle.com [141.146.126.228]) by agminet02.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id l2FJXMEY010042 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 15 Mar 2007 14:33:22 -0500 Received: from rgmsgw02.us.oracle.com (rgmsgw02.us.oracle.com [138.1.186.52]) by agminet01.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id l2FJWk59007122; Thu, 15 Mar 2007 14:32:47 -0500 Received: from ca-server1.us.oracle.com (ca-server1.us.oracle.com [139.185.48.5]) by rgmsgw02.us.oracle.com (Switch-3.2.4/Switch-3.2.4) with ESMTP id l2FJWjq0009492 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO); Thu, 15 Mar 2007 13:32:46 -0600 Received: from jlbec by ca-server1.us.oracle.com with local (Exim 4.63) (envelope-from ) id 1HRvgr-0007pE-BC; Thu, 15 Mar 2007 12:32:45 -0700 Date: Thu, 15 Mar 2007 12:32:45 -0700 From: Joel Becker To: Nick Piggin Cc: Linux Filesystems , Mark Fasheh , reiserfs-list@namesys.com, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, nfs@lists.sourceforge.net, cluster-devel@redhat.com, jfs-discussion@lists.sourceforge.net Subject: Re: Announce: new-aops-1 for 2.6.21-rc3 Message-ID: 
<20070315193245.GC20528@ca-server1.us.oracle.com> References: <20070315161704.GH8321@wotan.suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070315161704.GH8321@wotan.suse.de> X-Burt-Line: Trees are cool. X-Red-Smith: Ninety feet between bases is perhaps as close as man has ever come to perfection. User-Agent: Mutt/1.5.11 X-Brightmail-Tracker: AAAAAQAAAAI= X-Brightmail-Tracker: AAAAAQAAAAI= X-Brightmail-Tracker: AAAAAQAAAAI= X-Whitelist: TRUE X-Whitelist: TRUE X-Whitelist: TRUE X-archive-position: 10836 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Joel.Becker@oracle.com Precedence: bulk X-list: xfs Content-Length: 771 Lines: 25 On Thu, Mar 15, 2007 at 05:17:04PM +0100, Nick Piggin wrote: > At 15 patches (including the initial buffered write deadlock fixes), > it is too much to keep posting -- not much has fundamentally changed, > so I'll just post occasionally if we make big changes. The quilt > format is probably easier for someone wishing to work on it anyway. > > http://www.kernel.org/pub/linux/kernel/people/npiggin/patches/new-aops/ For future drops, can you provide the unpacked patches too, so lazy people like me can read them in the browser? Thanks. Joel -- "Here's something to think about: How come you never see a headline like ``Psychic Wins Lottery''?" 
- Jay Leno Joel Becker Principal Software Developer Oracle E-mail: joel.becker@oracle.com Phone: (650) 506-8127 From owner-xfs@oss.sgi.com Thu Mar 15 13:47:40 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 13:47:44 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from agminet02.oracle.com (agminet02.oracle.com [141.146.126.229]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FKlc6p023378 for ; Thu, 15 Mar 2007 13:47:39 -0700 Received: from agminet01.oracle.com (agminet01.oracle.com [141.146.126.228]) by agminet02.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id l2FJsEmo013249 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 15 Mar 2007 14:54:15 -0500 Received: from rgmsgw02.us.oracle.com (rgmsgw02.us.oracle.com [138.1.186.52]) by agminet01.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id l2FJrrR4011702; Thu, 15 Mar 2007 14:53:53 -0500 Received: from ca-server1.us.oracle.com (ca-server1.us.oracle.com [139.185.48.5]) by rgmsgw02.us.oracle.com (Switch-3.2.4/Switch-3.2.4) with ESMTP id l2FJrqLm020952 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO); Thu, 15 Mar 2007 13:53:52 -0600 Received: from mfasheh by ca-server1.us.oracle.com with local (Exim 4.63) (envelope-from ) id 1HRw1H-0008Cw-VK; Thu, 15 Mar 2007 12:53:52 -0700 Date: Thu, 15 Mar 2007 12:53:51 -0700 From: Mark Fasheh To: Nick Piggin Cc: Linux Filesystems , reiserfs-list@namesys.com, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, nfs@lists.sourceforge.net, cluster-devel@redhat.com, jfs-discussion@lists.sourceforge.net Subject: Re: Announce: new-aops-1 for 2.6.21-rc3 Message-ID: <20070315195351.GE21942@ca-server1.us.oracle.com> Reply-To: Mark Fasheh References: <20070315161704.GH8321@wotan.suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070315161704.GH8321@wotan.suse.de> Organization: Oracle 
Corporation User-Agent: Mutt/1.5.11 X-Brightmail-Tracker: AAAAAQAAAAI= X-Brightmail-Tracker: AAAAAQAAAAI= X-Brightmail-Tracker: AAAAAQAAAAI= X-Whitelist: TRUE X-Whitelist: TRUE X-Whitelist: TRUE X-archive-position: 10837 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mark.fasheh@oracle.com Precedence: bulk X-list: xfs Content-Length: 1530 Lines: 49 On Thu, Mar 15, 2007 at 05:17:04PM +0100, Nick Piggin wrote: > OK, I've gone through and fixed several bugs until the thing actually > survives fsx-linux for both ext2 and ext3 ordered and writeback (both > when using the new aops, and the legacy prepare_write path). Actually > ext3 sometimes breaks, but it does in unpatched kernels anyway. > > At 15 patches (including the initial buffered write deadlock fixes), > it is too much to keep posting -- not much has fundamentally changed, > so I'll just post occasionally if we make big changes. The quilt > format is probably easier for someone wishing to work on it anyway. Hmm, we still left out some exports... --Mark -- Mark Fasheh Senior Software Developer, Oracle mark.fasheh@oracle.com From: Mark Fasheh [PATCH] Export simple_write_begin, simple_write_end These are used by configfs, which can be built as a module. 
Signed-off-by: Mark Fasheh --- fs/libfs.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) 36f5d6a135c9f3f30fee3d0e4ffa887e1803ac95 diff --git a/fs/libfs.c b/fs/libfs.c index d687819..51f9748 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -656,6 +656,8 @@ EXPORT_SYMBOL(dcache_dir_open); EXPORT_SYMBOL(dcache_readdir); EXPORT_SYMBOL(generic_read_dir); EXPORT_SYMBOL(get_sb_pseudo); +EXPORT_SYMBOL(simple_write_begin); +EXPORT_SYMBOL(simple_write_end); EXPORT_SYMBOL(simple_commit_write); EXPORT_SYMBOL(simple_dir_inode_operations); EXPORT_SYMBOL(simple_dir_operations); -- 1.3.3 From owner-xfs@oss.sgi.com Thu Mar 15 14:09:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 14:09:15 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from agminet01.oracle.com (agminet01.oracle.com [141.146.126.228]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FL996p027301 for ; Thu, 15 Mar 2007 14:09:10 -0700 Received: from rgmsgw02.us.oracle.com (rgmsgw02.us.oracle.com [138.1.186.52]) by agminet01.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id l2FL8nKo020370; Thu, 15 Mar 2007 16:08:49 -0500 Received: from ca-server1.us.oracle.com (ca-server1.us.oracle.com [139.185.48.5]) by rgmsgw02.us.oracle.com (Switch-3.2.4/Switch-3.2.4) with ESMTP id l2FL8mHA014713 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO); Thu, 15 Mar 2007 15:08:48 -0600 Received: from mfasheh by ca-server1.us.oracle.com with local (Exim 4.63) (envelope-from ) id 1HRxBo-0001Ck-3B; Thu, 15 Mar 2007 14:08:48 -0700 Date: Thu, 15 Mar 2007 14:08:48 -0700 From: Mark Fasheh To: Nick Piggin Cc: Linux Filesystems , reiserfs-list@namesys.com, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, nfs@lists.sourceforge.net, cluster-devel@redhat.com, jfs-discussion@lists.sourceforge.net Subject: Re: Announce: new-aops-1 for 2.6.21-rc3 Message-ID: <20070315210848.GG21942@ca-server1.us.oracle.com> Reply-To: Mark Fasheh 
References: <20070315161704.GH8321@wotan.suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070315161704.GH8321@wotan.suse.de> Organization: Oracle Corporation User-Agent: Mutt/1.5.11 X-Brightmail-Tracker: AAAAAQAAAAI= X-Brightmail-Tracker: AAAAAQAAAAI= X-Whitelist: TRUE X-Whitelist: TRUE X-archive-position: 10838 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mark.fasheh@oracle.com Precedence: bulk X-list: xfs Content-Length: 1024 Lines: 40 On Thu, Mar 15, 2007 at 05:17:04PM +0100, Nick Piggin wrote: > OK, I've gone through and fixed several bugs until the thing actually > survives fsx-linux for both ext2 and ext3 ordered and writeback (both > when using the new aops, and the legacy prepare_write path). Actually > ext3 sometimes breaks, but it does in unpatched kernels anyway. Attached is a bugfix for a crash folks who use an initrd will hit early on. --Mark -- Mark Fasheh Senior Software Developer, Oracle mark.fasheh@oracle.com From: Mark Fasheh [PATCH] Populate pagep in simple_write_begin() This wasn't getting passed back to callers. 
Signed-off-by: Mark Fasheh cbf20bf51ddd6434db935ba29f845a85f3b1ec65 diff --git a/fs/libfs.c b/fs/libfs.c index 51f9748..602496a 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -357,6 +357,8 @@ int simple_write_begin(struct file *file if (!page) return -ENOMEM; + *pagep = page; + return simple_prepare_write(file, page, from, from+len); } -- 1.3.3 From owner-xfs@oss.sgi.com Thu Mar 15 14:44:49 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 14:44:54 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2FLil6p001941 for ; Thu, 15 Mar 2007 14:44:49 -0700 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 1E8A9AAC366; Fri, 16 Mar 2007 08:44:42 +1100 (EST) Subject: Small xfsprogs updates From: Nathan Scott Reply-To: nscott@aconex.com To: bnaujok@sgi.com Cc: xfs@oss.sgi.com Content-Type: multipart/mixed; boundary="=-5pD3emMlKEsg52JBr/7+" Organization: Aconex Date: Fri, 16 Mar 2007 08:44:35 +1100 Message-Id: <1173995075.5051.172.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 X-archive-position: 10839 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 2969 Lines: 89 --=-5pD3emMlKEsg52JBr/7+ Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Hi Barry, Could you do an xfsprogs point release (before your next big batch of repair updates if possible) via the following couple of patches? This will let me resolve a couple of Debian issues, and fixes up a dopey warning in xfs_io builds. 
I see one remaining warning, from xfs_repair:

dir2.c: In function ‘process_leaf_node_dir2’:
dir2.c:1796: warning: ‘greatest_hashval’ may be used uninitialized in this function

Which is because of the way the called function is structured - here
gcc is saying if the process_leaf_block_dir2 "for" loop is not entered
then there may be the possibility of using "greatest_hashval" without
setting it up. Not immediately obvious what the best fix is, but not
a big deal (it's always been this way, not sure it's even a real issue).

thanks!

--
Nathan

--=-5pD3emMlKEsg52JBr/7+ Content-Disposition: attachment; filename=fix-warnings Content-Type: text/x-patch; name=fix-warnings; charset=UTF-8 Content-Transfer-Encoding: 7bit Index: xfsprogs/io/pwrite.c =================================================================== --- xfsprogs.orig/io/pwrite.c 2007-03-16 08:34:17.539120750 +1100 +++ xfsprogs/io/pwrite.c 2007-03-16 08:34:26.475679250 +1100 @@ -310,6 +310,7 @@ pwrite_f( c = write_backward(offset, &count, &total); break; default: + total = 0; ASSERT(0); } if (c < 0) --=-5pD3emMlKEsg52JBr/7+ Content-Disposition: attachment; filename=bump-version Content-Type: text/x-patch; name=bump-version; charset=UTF-8 Content-Transfer-Encoding: 7bit Index: xfsprogs/VERSION =================================================================== --- xfsprogs.orig/VERSION 2007-03-16 08:22:05.601377500 +1100 +++ xfsprogs/VERSION 2007-03-16 08:22:12.641817500 +1100 @@ -3,5 +3,5 @@ # PKG_MAJOR=2 PKG_MINOR=8 -PKG_REVISION=19 +PKG_REVISION=20 PKG_BUILD=1 Index: xfsprogs/debian/changelog =================================================================== --- xfsprogs.orig/debian/changelog 2007-03-16 08:22:05.729385500 +1100 +++ xfsprogs/debian/changelog 2007-03-16 08:25:08.680819250 +1100 @@ -1,3 +1,10 @@ +xfsprogs (2.8.20-1) unstable; urgency=low + + * New upstream release (closes: #414079) + * Fixed up autoconf version dependency (closes: #414073) + + -- Nathan Scott Fri, 16 Mar 2007 08:24:33 +1100 + 
xfsprogs (2.8.19-1) unstable; urgency=low * New upstream release (closes: #409063) Index: xfsprogs/doc/CHANGES =================================================================== --- xfsprogs.orig/doc/CHANGES 2007-03-16 08:22:05.649380500 +1100 +++ xfsprogs/doc/CHANGES 2007-03-16 08:22:35.939273500 +1100 @@ -1,4 +1,4 @@ -xfsprogs-2.8.XX +xfsprogs-2.8.20 (16 March 2007) - Fix xfs_quota gracetime reporting. Thanks to Utako Kusaka for this. - Instead of using AC_CHECK_TYPES which isn't supported for --=-5pD3emMlKEsg52JBr/7+-- From owner-xfs@oss.sgi.com Thu Mar 15 16:00:10 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 16:00:13 -0700 (PDT) X-Spam-oss-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2FN076p016934 for ; Thu, 15 Mar 2007 16:00:09 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA20339; Fri, 16 Mar 2007 10:00:01 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2FN00Af29353852; Fri, 16 Mar 2007 10:00:01 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2FMxxUX29534735; Fri, 16 Mar 2007 09:59:59 +1100 (AEDT) Date: Fri, 16 Mar 2007 09:59:59 +1100 From: David Chinner To: "Talpey, Thomas" Cc: xfs@oss.sgi.com Subject: Re: Strange XFS issue on tiny-NAS ARM NFS server Message-ID: <20070315225959.GO6095633@melbourne.sgi.com> References: <45F85BFA.1070505@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-archive-position: 10840 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 901 Lines: 27 On Thu, Mar 15, 2007 at 07:34:37AM -0400, Talpey, Thomas wrote: > Evidently that was not the only compiling issue (surprise): > > ><6>attempt to access beyond end of device > ><6>sda3: rw=2, want=2574098408, limit=154272195 > ><6>attempt to access beyond end of device > ><6>sda3: rw=2, want=2574098408, limit=154272195 > ><6>attempt to access beyond end of device > ><6>sda3: rw=2, want=2574098408, limit=154272195 > ><6>attempt to access beyond end of device > ><6>sda3: rw=0, want=2574098408, limit=154272195 > ><1>I/O error in filesystem ("sda3") meta-data dev sda3 block 0x996d9fe0 ("xfs_trans_read_buf") error 5 buf count 4096 That's a long way past the end of the partition. Tom, did you run xfs_repair on that filesystem after running with a busted compiler? Who knows how it broke stuff on disk..... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Mar 15 16:10:48 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 16:10:52 -0700 (PDT) X-Spam-oss-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_05, J_CHICKENPOX_56 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2FNAj6p018535 for ; Thu, 15 Mar 2007 16:10:47 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA20742; Fri, 16 Mar 2007 10:10:37 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2FNAZAf29550545; Fri, 16 Mar 2007 10:10:35 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2FNAVhC29544191; Fri, 16 Mar 2007 10:10:31 +1100 (AEDT) Date: Fri, 16 Mar 
2007 10:10:31 +1100 From: David Chinner To: Thomas Walker Cc: Emmanuel Florac , xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? Message-ID: <20070315231031.GP6095633@melbourne.sgi.com> References: <45F92D8C.3090708@stsci.edu> <20070315150422.7bc5d178@harpe.intellique.com> <45F952F2.6000008@stsci.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45F952F2.6000008@stsci.edu> User-Agent: Mutt/1.4.2.1i X-archive-position: 10841 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 896 Lines: 31

On Thu, Mar 15, 2007 at 10:06:42AM -0400, Thomas Walker wrote:
>
> The terminal shows a lot of "." dots running across the screen
> quickly, and every few hours it says this;
>
>
> .....................................................found candidate
> secondary superblock...
> unable to verify superblock, continuing...
> found candidate secondary superblock...
> unable to verify superblock, continuing...

The primary superblock is not good, and it's trying to find a valid
secondary superblock. Doesn't sound promising so far - repair can't
start until a valid superblock is found....

Can you dump the first sector of the device the filesystem is
on:

# dd if=/dev/mapper/vg0-hladata3 bs=512 count=1 iflag=direct 2> /dev/null | od -Ax -x

So we can see if that really holds a primary XFS superblock?

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Mar 15 17:20:30 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 17:20:34 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=BAYES_20,J_CHICKENPOX_56 autolearn=no version=3.2.0-pre1-r499012 Received: from donner.stsci.edu (donner.stsci.edu [130.167.251.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2G0KT6p003909 for ; Thu, 15 Mar 2007 17:20:30 -0700 Received: from comet.stsci.edu (comet.stsci.edu [130.167.251.67]) by donner.stsci.edu (MOS 3.8.3-GA) with ESMTP id FJI25156; Thu, 15 Mar 2007 20:19:07 -0400 (EDT) Received: (from comet.stsci.edu [69.250.187.193]) by comet.stsci.edu (MOS 3.8.3-GA) with HTTPS/1.1 id CNS24438 (AUTH walker); Thu, 15 Mar 2007 20:20:27 -0400 (EDT) From: Thomas Walker Subject: Re: Should xfs_repair take this long? To: David Chinner Cc: xfs@oss.sgi.com Reply-To: walker@stsci.edu X-Mailer: Mirapoint Webmail Direct 3.8.3-GA MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20070315202027.CNS24438@comet.stsci.edu> Date: Thu, 15 Mar 2007 20:20:27 -0400 (EDT) X-Junkmail-Whitelist: YES (by domain whitelist at donner.stsci.edu) X-archive-position: 10843 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: walker@stsci.edu Precedence: bulk X-list: xfs Content-Length: 1719 Lines: 54 Ok, here's the output of the command you wanted. 
I ran it on both of the xfs file systems we have, both say bad superblock when trying to mount;

[root@hla-ags ~]# dd if=/dev/mapper/vg0-hladata3 bs=512 count=1 iflag=direct 2> /dev/null | od -Ax -x
000000
[root@hla-ags ~]# dd if=/dev/mapper/vg1-hladata2 bs=512 count=1 iflag=direct 2> /dev/null | od -Ax -x
000000

[root@hla-ags ~]# mount /hladata2
mount: wrong fs type, bad option, bad superblock on /dev/vg0/hladata3,
       or too many mounted file systems

Thomas Walker

---- Original message ----
>Date: Fri, 16 Mar 2007 10:10:31 +1100
>From: David Chinner
>Subject: Re: Should xfs_repair take this long?
>To: Thomas Walker
>Cc: Emmanuel Florac , xfs@oss.sgi.com
>
>On Thu, Mar 15, 2007 at 10:06:42AM -0400, Thomas Walker wrote:
>>
>> The terminal shows a lot of "." dots running across the screen
>> quickly, and every few hours it says this;
>>
>>
>> .....................................................found candidate
>> secondary superblock...
>> unable to verify superblock, continuing...
>> found candidate secondary superblock...
>> unable to verify superblock, continuing...
>
>The primary superblock is not good, and it's trying to find a valid
>secondary superblock. Doesn't sound promising so far - repair can't
>start until a valid superblock is found....
>
>Can you dump the first sector of the device the filesystem is
>on:
>
># dd if=/dev/mapper/vg0-hladata3 bs=512 count=1 iflag=direct 2> /dev/null | od -Ax -x
>
>So we can see if that really holds a primary XFS superblock?
>
>Cheers,
>
>Dave.
>-- >Dave Chinner >Principal Engineer >SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Mar 15 17:47:15 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 17:47:22 -0700 (PDT) X-Spam-oss-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_20 autolearn=ham version=3.2.0-pre1-r499012 Received: from rgminet02.oracle.com (rgminet02.oracle.com [148.87.113.119]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2G0lE6p007973 for ; Thu, 15 Mar 2007 17:47:15 -0700 Received: from rgminet01.oracle.com (rgminet01.oracle.com [148.87.113.118]) by rgminet02.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id l2FNlojo001027 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 15 Mar 2007 17:47:50 -0600 Received: from rgmsgw01.us.oracle.com (rgmsgw01.us.oracle.com [138.1.186.51]) by rgminet01.oracle.com (Switch-3.2.4/Switch-3.1.6) with ESMTP id l2FNlFRC014330; Thu, 15 Mar 2007 17:47:15 -0600 Received: from ca-server1.us.oracle.com (ca-server1.us.oracle.com [139.185.48.5]) by rgmsgw01.us.oracle.com (Switch-3.2.4/Switch-3.2.4) with ESMTP id l2FNlDHB010577 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO); Thu, 15 Mar 2007 17:47:14 -0600 Received: from mfasheh by ca-server1.us.oracle.com with local (Exim 4.63) (envelope-from ) id 1HRzf7-00041R-JP; Thu, 15 Mar 2007 16:47:13 -0700 Date: Thu, 15 Mar 2007 16:47:13 -0700 From: Mark Fasheh To: Nick Piggin Cc: Linux Filesystems , reiserfs-list@namesys.com, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, nfs@lists.sourceforge.net, cluster-devel@redhat.com, jfs-discussion@lists.sourceforge.net Subject: Re: Announce: new-aops-1 for 2.6.21-rc3 Message-ID: <20070315234713.GH21942@ca-server1.us.oracle.com> Reply-To: Mark Fasheh References: <20070315161704.GH8321@wotan.suse.de> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="QKdGvSO+nmPlgiQ/" Content-Disposition: inline In-Reply-To: <20070315161704.GH8321@wotan.suse.de> Organization: Oracle Corporation User-Agent: 
Mutt/1.5.11 X-Brightmail-Tracker: AAAAAQAAAAI= X-Brightmail-Tracker: AAAAAQAAAAI= X-Brightmail-Tracker: AAAAAQAAAAI= X-Whitelist: TRUE X-Whitelist: TRUE X-Whitelist: TRUE X-archive-position: 10844 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mark.fasheh@oracle.com Precedence: bulk X-list: xfs Content-Length: 6862 Lines: 268 --QKdGvSO+nmPlgiQ/ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Mar 15, 2007 at 05:17:04PM +0100, Nick Piggin wrote: > (excludes the OCFS2 patch that Mark sent, in anticipation of an update) Attached is said patch. I needed to export __grab_cache_page (ext2/ext3 also need this if they're to be built as modules), so a patch to do that is also attached. This passed some preliminary testing on a two node cluster I have here at Oracle. --Mark -- Mark Fasheh Senior Software Developer, Oracle mark.fasheh@oracle.com --QKdGvSO+nmPlgiQ/ Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0001-ocfs2-Convert-to-new-aops.txt" From: Mark Fasheh ocfs2: Convert to new aops Turn ocfs2_prepare_write() and ocfs2_commit_write() into ocfs2_write_begin() and ocfs2_write_end(). This conveniently eliminates the need for AOP_TRUNCATED_PAGE during write. Signed-off-by: Mark Fasheh e28911070b02362a9a3a543646da84a8fbf9f63b diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c index 875c114..cbec0e1 100644 --- a/fs/ocfs2/aops.c +++ b/fs/ocfs2/aops.c @@ -293,29 +293,67 @@ int ocfs2_prepare_write_nolock(struct in } /* - * ocfs2_prepare_write() can be an outer-most ocfs2 call when it is called - * from loopback. It must be able to perform its own locking around - * ocfs2_get_block(). + * ocfs2_write_begin() can be an outer-most ocfs2 call when it is + * called from elsewhere in the kernel. It must be able to perform its + * own locking around ocfs2_get_block(). 
*/ -static int ocfs2_prepare_write(struct file *file, struct page *page, - unsigned from, unsigned to) +static int ocfs2_write_begin(struct file *file, struct address_space *mapping, + loff_t pos, unsigned len, unsigned flags, + struct page **pagep, void **fsdata) { - struct inode *inode = page->mapping->host; + struct inode *inode = mapping->host; + struct buffer_head *di_bh = NULL; + struct page *page = NULL; int ret; - mlog_entry("(0x%p, 0x%p, %u, %u)\n", file, page, from, to); - - ret = ocfs2_meta_lock_with_page(inode, NULL, 0, page); + ret = ocfs2_meta_lock(inode, &di_bh, 1); if (ret != 0) { mlog_errno(ret); + return ret; + } + + ret = ocfs2_data_lock(inode, 1); + if (ret) { + ocfs2_meta_unlock(inode, 1); + + mlog_errno(ret); + return ret; + } + + /* + * Lock the page out here to preserve ordering with + * ip_alloc_sem. + */ + page = __grab_cache_page(mapping, pos >> PAGE_CACHE_SHIFT); + if (!page) { + ret = -ENOMEM; + mlog_errno(ret); goto out; } - ret = ocfs2_prepare_write_nolock(inode, page, from, to); + *pagep = page; - ocfs2_meta_unlock(inode, 0); + down_read(&OCFS2_I(inode)->ip_alloc_sem); + ret = block_write_begin(file, mapping, pos, len, flags, pagep, fsdata, + ocfs2_get_block); + up_read(&OCFS2_I(inode)->ip_alloc_sem); out: - mlog_exit(ret); + if (ret == 0) { + *fsdata = di_bh; + } else { + /* + * Error return - the caller won't call + * ocfs2_write_end, so drop cluster locks here. 
+ */ + brelse(di_bh); + if (page) { + unlock_page(page); + page_cache_release(page); + } + ocfs2_data_unlock(inode, 1); + ocfs2_meta_unlock(inode, 1); + } + return ret; } @@ -388,16 +426,18 @@ out: return handle; } -static int ocfs2_commit_write(struct file *file, struct page *page, - unsigned from, unsigned to) +static int ocfs2_write_end(struct file *file, struct address_space *mapping, + loff_t pos, unsigned len, unsigned copied, + struct page *page, void *fsdata) { int ret; - struct buffer_head *di_bh = NULL; + unsigned from, to; + struct buffer_head *di_bh = fsdata; struct inode *inode = page->mapping->host; handle_t *handle = NULL; struct ocfs2_dinode *di; - mlog_entry("(0x%p, 0x%p, %u, %u)\n", file, page, from, to); + mlog_entry("(0x%p, 0x%p)\n", file, page); /* NOTE: ocfs2_file_aio_write has ensured that it's safe for * us to continue here without rechecking the I/O against @@ -412,22 +452,13 @@ static int ocfs2_commit_write(struct fil * stale inode allocation image (i_size, i_clusters, etc). */ - ret = ocfs2_meta_lock_with_page(inode, &di_bh, 1, page); - if (ret != 0) { - mlog_errno(ret); - goto out; - } - - ret = ocfs2_data_lock_with_page(inode, 1, page); - if (ret != 0) { - mlog_errno(ret); - goto out_unlock_meta; - } + from = pos & (PAGE_CACHE_SIZE - 1); + to = from + len; handle = ocfs2_start_walk_page_trans(inode, page, from, to); if (IS_ERR(handle)) { ret = PTR_ERR(handle); - goto out_unlock_data; + goto out_unlock; } /* Mark our buffer early. 
We'd rather catch this error up here @@ -441,8 +472,10 @@ static int ocfs2_commit_write(struct fil } /* might update i_size */ - ret = generic_commit_write(file, page, from, to); - if (ret < 0) { + copied = block_write_end(file, mapping, pos, len, copied, page, fsdata); + if (copied < 0) { + ret = copied; + copied = 0; mlog_errno(ret); goto out_commit; } @@ -458,23 +491,30 @@ static int ocfs2_commit_write(struct fil di->i_size = cpu_to_le64((u64)i_size_read(inode)); ret = ocfs2_journal_dirty(handle, di_bh); - if (ret < 0) { + if (ret < 0) mlog_errno(ret); - goto out_commit; - } + ret = 0; out_commit: ocfs2_commit_trans(OCFS2_SB(inode->i_sb), handle); -out_unlock_data: +out_unlock: ocfs2_data_unlock(inode, 1); -out_unlock_meta: ocfs2_meta_unlock(inode, 1); -out: + + if (ret) { + /* + * We caught an error before block_write_end() - + * unlock and free the page. + */ + unlock_page(page); + page_cache_release(page); + } + if (di_bh) brelse(di_bh); mlog_exit(ret); - return ret; + return copied ? copied : ret; } static sector_t ocfs2_bmap(struct address_space *mapping, sector_t block) @@ -678,8 +718,8 @@ out: const struct address_space_operations ocfs2_aops = { .readpage = ocfs2_readpage, .writepage = ocfs2_writepage, - .prepare_write = ocfs2_prepare_write, - .commit_write = ocfs2_commit_write, + .write_begin = ocfs2_write_begin, + .write_end = ocfs2_write_end, .bmap = ocfs2_bmap, .sync_page = block_sync_page, .direct_IO = ocfs2_direct_IO, -- 1.3.3 --QKdGvSO+nmPlgiQ/ Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="0002-Export-__grab_cache_page.txt" From: Mark Fasheh [PATCH] Export __grab_cache_page Needed at least by ocfs2 and ext[23]. 
Signed-off-by: Mark Fasheh ec4c66f0e6012a182105405aa11813fbf836629f diff --git a/mm/filemap.c b/mm/filemap.c index 327c20f..c4a2d68 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2196,6 +2196,7 @@ repeat: } return page; } +EXPORT_SYMBOL(__grab_cache_page); static ssize_t generic_perform_write_2copy(struct file *file, struct iov_iter *i, loff_t pos) -- 1.3.3 --QKdGvSO+nmPlgiQ/-- From owner-xfs@oss.sgi.com Thu Mar 15 18:25:34 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 18:25:40 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2G1PV6p014189 for ; Thu, 15 Mar 2007 18:25:33 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA00249; Fri, 16 Mar 2007 12:25:24 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2G1PMAf29548986; Fri, 16 Mar 2007 12:25:23 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2G1PKsS27390918; Fri, 16 Mar 2007 12:25:20 +1100 (AEDT) Date: Fri, 16 Mar 2007 12:25:20 +1100 From: David Chinner To: Marco Berizzi Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com Subject: Re: XFS internal error xfs_da_do_buf(2) at line 2087 of file fs/xfs/xfs_da_btree.c. 
Caller 0xc01b00bd Message-ID: <20070316012520.GN5743@melbourne.sgi.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-archive-position: 10845 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1903 Lines: 62 On Wed, Mar 14, 2007 at 12:34:29PM +0100, Marco Berizzi wrote: > Hello everybody. > Since 2.6.19.2 + commit 7fbbb01dca7704d52ace6f45a805c98a5b0362f9 What commit is that? gitweb search tells me it's an nmi watchdog change. Doesn't seem likely to change XFS behaviour - can you post a url to the commit? > I'm experimenting these errors. > 2.6.19.1 has been worked good for more > than 30 days. With the above commit? > I have reverted back to 2.6.19.1 to see if > this problem happens again. without the above commit? > find_or_create_page+0x37/0x8e > _xfs_buf_lookup_pages+0x132/0x2ea > _xfs_buf_initialize+0xc8/0xf6 > xfs_buf_get_flags+0xf8/0x11d > xfs_buf_read_flags+0x1c/0x7f > xfs_trans_read_buf+0x16a/0x34f > xfs_itobp+0x7c/0x242 > xfs_iread+0x68/0x1d3 > xfs_iget_core+0xe7/0x687 > xfs_iget+0xd8/0x150 > xfs_dir_lookup_int+0x98/0x10e > xfs_lookup+0x5a/0x90 > xfs_vn_lookup+0x52/0x93 Curious - never seen this before - possibly a corrupted inode number in the directory has led to this. > ba 4e 8b cd > Mar 12 14:35:21 Pleiadi kernel: Filesystem "sda8": XFS internal error > xfs_da_do_buf(2) at line 2087 of file fs/xfs/xfs_da_btree.c. Caller > 0xc01b00bd > Mar 12 14:35:21 Pleiadi kernel: [] xfs_da_do_buf+0x70c/0x7b1 > Mar 12 14:35:21 Pleiadi kernel: [] xfs_da_read_buf+0x30/0x35 > Mar 12 14:35:21 Pleiadi kernel: [] xfs_da_read_buf+0x30/0x35 Hmm - these could simply be follow-on errors from the first problem - the buffer would now probably be bad or corrupted, and the directory buffer read code here is saying the buffer is bad. 
All the errors appear to have the same data in the buffer (which is lacking the correct magic numbers) so I'd say they are related to the above error. Can you run xfs_repair on that filesystem and see if it reports (and fixes) any problems? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Mar 15 19:02:05 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 19:02:10 -0700 (PDT) X-Spam-oss-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00, J_CHICKENPOX_56 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2G2226p019973 for ; Thu, 15 Mar 2007 19:02:04 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA00601; Fri, 16 Mar 2007 12:32:30 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2G1WTAf29566658; Fri, 16 Mar 2007 12:32:29 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2G1WRcM29555459; Fri, 16 Mar 2007 12:32:27 +1100 (AEDT) Date: Fri, 16 Mar 2007 12:32:27 +1100 From: David Chinner To: Thomas Walker Cc: David Chinner , xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? 
Message-ID: <20070316013227.GO5743@melbourne.sgi.com> References: <20070315202027.CNS24438@comet.stsci.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070315202027.CNS24438@comet.stsci.edu> User-Agent: Mutt/1.4.2.1i X-archive-position: 10846 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1259 Lines: 38 On Thu, Mar 15, 2007 at 08:20:27PM -0400, Thomas Walker wrote: > > Ok, here's the output of the command you wanted. I ran it on both of the xfs file systems we have, both say bad superblock when trying to mount; > > [root@hla-ags ~]# dd if=/dev/mapper/vg0-hladata3 bs=512 count=1 iflag=direct 2> /dev/null | od -Ax -x > 000000 That failed - the output should be like: # dd if=/dev/mapper/test_vg-fred bs=512 count=1 iflag=direct 2> /dev/null | od -Ax -x 000000 4658 4253 0000 0010 0000 0000 1000 0000 000010 0000 0000 0000 0000 0000 0000 0000 0000 000020 34a8 5343 d8e3 8d46 01a5 b1e4 3a76 ac05 000030 0000 0000 0800 0400 0000 0000 0000 8000 000040 0000 0000 0000 8100 0000 0000 0000 8200 000050 0000 0100 0200 0000 0000 0800 0000 0000 000060 0000 000a b430 0002 0001 1000 0000 0000 000070 0000 0000 0000 0000 090c 0408 0011 1900 000080 0000 0000 0000 803c 0000 0000 0000 0606 000090 0000 0000 0c00 16f5 0000 0000 0000 0000 0000a0 0000 0000 0000 0000 0000 0000 0000 0000 0000b0 0000 0000 0000 0200 0000 0000 0000 0000 0000c0 0000 0000 0000 0000 0000 0000 0000 0000 * 000200 Can you remove the redirect to /dev/null so we can see the error message? Sorry about that. Cheers, Dave. 
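The check being made by eye on the dump above can be sketched in a few lines of Python. This is only an illustration, assuming a little-endian host (which is why od -x shows the on-disk bytes "XFSB" as the words 4658 4253); the helper names are hypothetical and not part of xfsprogs.

```python
import struct

XFS_SB_MAGIC = b"XFSB"  # first four bytes of a valid XFS superblock


def od_words_to_bytes(words: str) -> bytes:
    """Turn `od -x` 16-bit hex words (little-endian host) back into raw bytes."""
    return b"".join(struct.pack("<H", int(w, 16)) for w in words.split())


def has_xfs_magic(sector: bytes) -> bool:
    """True if the sector starts with the XFS superblock magic."""
    return sector[:4] == XFS_SB_MAGIC


# First row of the expected dump shown above.
good = od_words_to_bytes("4658 4253 0000 0010 0000 0000 1000 0000")
# Hypothetical row of zeros, standing in for a sector with no superblock.
bad = od_words_to_bytes("0000 0000 0000 0000 0000 0000 0000 0000")

print(has_xfs_magic(good), has_xfs_magic(bad))
```

To check a real device, the first 512 bytes read from it (as root) could be passed to has_xfs_magic() instead of a pasted od row.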
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Mar 15 19:18:55 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 19:18:59 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2G2Iq6p023460 for ; Thu, 15 Mar 2007 19:18:54 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA02532; Fri, 16 Mar 2007 13:18:46 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 1161) id 7518058FA1B6; Fri, 16 Mar 2007 13:18:46 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 961389 - xfs_repair - invalid levels and 0 records for a btree root not detected Message-Id: <20070316021846.7518058FA1B6@chook.melbourne.sgi.com> Date: Fri, 16 Mar 2007 13:18:46 +1100 (EST) From: bnaujok@sgi.com (Barry Naujok) X-archive-position: 10847 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@sgi.com Precedence: bulk X-list: xfs Content-Length: 1378 Lines: 33 Make sure xfs_repair detects invalid btree roots in inodes Date: Fri Mar 16 13:18:10 AEDT 2007 Workarea: chook.melbourne.sgi.com:/home/bnaujok/isms/repair Inspected by: Shailendra Tripathi [stripathi@agami.com] The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:28254a xfsprogs/VERSION - 1.170 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/VERSION.diff?r1=text&tr1=1.170&r2=text&tr2=1.169&f=h xfsprogs/doc/CHANGES - 1.237 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/doc/CHANGES.diff?r1=text&tr1=1.237&r2=text&tr2=1.236&f=h xfsprogs/debian/changelog - 1.150 
- changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/debian/changelog.diff?r1=text&tr1=1.150&r2=text&tr2=1.149&f=h - Bump version to 2.8.20 xfsprogs/repair/dir2.c - 1.21 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/repair/dir2.c.diff?r1=text&tr1=1.21&r2=text&tr2=1.20&f=h - Fix a warning xfsprogs/repair/dinode.c - 1.27 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/repair/dinode.c.diff?r1=text&tr1=1.27&r2=text&tr2=1.26&f=h - Validate the btree root values in inodes xfsprogs/io/pwrite.c - 1.25 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/xfsprogs/io/pwrite.c.diff?r1=text&tr1=1.25&r2=text&tr2=1.24&f=h - Fix a warning From owner-xfs@oss.sgi.com Thu Mar 15 19:39:20 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 19:39:23 -0700 (PDT) X-Spam-oss-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00, J_CHICKENPOX_34,SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2G2dJ6p026825 for ; Thu, 15 Mar 2007 19:39:20 -0700 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 241541807DF13; Thu, 15 Mar 2007 21:39:17 -0500 (CDT) Message-ID: <45FA035B.2070505@sandeen.net> Date: Thu, 15 Mar 2007 21:39:23 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: geir.myrestrand@falconstor.com CC: xfs@oss.sgi.com Subject: Re: XFS on Redhat4 U4 References: <4BE67D4A1485054E99B21710FF21A9CD020BE8A8@swsmsx411.ger.corp.intel.com> <45F95E77.10207@falconstor.com> In-Reply-To: <45F95E77.10207@falconstor.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10848 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 568 Lines: 29 Geir A. Myrestrand wrote: > Carassale, Mario wrote: >> Hi, >> >> >> >> >> >> Is there any way to run XFS on RedHat 4 U4. >> > > Yes. > > Eric Sandeen made a source RPM for the XFS kernel modules for RHEL4: > ftp://oss.sgi.com/projects/xfs/testing/RHEL4/ > > Centos has the latest version of what I've done... sgi guys, you might actually take down the above link, or redirect it. The src.rpms are also mirrored at http://sandeen.net/rhel4_xfs (*) Grabbing the pre-built stuff from centos may be simplest, though. -Eric *rhel5_xfs for the adventurous... From owner-xfs@oss.sgi.com Thu Mar 15 20:25:43 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 20:25:48 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2G3Pg6p005632 for ; Thu, 15 Mar 2007 20:25:43 -0700 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 86BDCAAC3D9; Fri, 16 Mar 2007 14:25:41 +1100 (EST) Subject: xfsdump buglets From: Nathan Scott Reply-To: nscott@aconex.com To: wkendall@sgi.com Cc: xfs@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Fri, 16 Mar 2007 14:25:36 +1100 Message-Id: <1174015536.5051.193.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 10849 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 1526 Lines: 61 Hey Bill, Got a couple of minor xfsdump problems reported to me, they're probably straightforward for someone who knows xfsdump well - here ya go... cheers. 
-------- Forwarded Message -------- From: Peter Chubb Reply-To: Peter Chubb , 415080@bugs.debian.org To: submit@bugs.debian.org Subject: Bug#415080: Poor error message from xfsdump for incorrect args Date: Fri, 16 Mar 2007 09:47:50 +1100 Package: xfsdump Version: 2.2.38-1 I do: xfsdump -l 0 -p -f afile filesystem and get xfsdump: ERROR: -^@ argument missing because -p needs a numeric argument. The message should say -p argument missing. -- Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au http://www.ertos.nicta.com.au ERTOS within National ICT Australia ------- Forwarded Message -------- From: Peter Chubb Reply-To: Peter Chubb , 415081@bugs.debian.org To: submit@bugs.debian.org Subject: Bug#415081: xfsdump doesn't accept relative pathnames to mountpoints Date: Fri, 16 Mar 2007 09:54:52 +1100 Package: xfsdump Version: 2.2.38-1 I have a filesystem mounted at /export. I do: cd / xfsdump -l 0 -f /path/to/file export and see xfsdump: ERROR: export does not identify a file system xfsdump -l 0 -f /path/to/file /export works. 
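One way to read the second report, sketched in Python under a stated assumption: if the argument is compared textually against the list of mounted paths, a relative name such as "export" can never match, while the same name resolved against the current directory does. This is not xfsdump's code; the function and the mount table below are hypothetical.

```python
import posixpath


def resolve_mount_arg(arg, cwd, mount_points):
    """Resolve arg against cwd and return it if it names a known mount point."""
    candidate = posixpath.normpath(posixpath.join(cwd, arg))
    return candidate if candidate in mount_points else None


mounts = {"/", "/export"}  # hypothetical mount table

# A plain textual comparison fails for the relative name...
print("export" in mounts)
# ...while resolving against the working directory finds the mount point.
print(resolve_mount_arg("export", "/", mounts))
print(resolve_mount_arg("/export", "/", mounts))
```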
-- Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au http://www.ertos.nicta.com.au ERTOS within National ICT Australia -- Nathan From owner-xfs@oss.sgi.com Thu Mar 15 21:48:09 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 21:48:15 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_20 autolearn=ham version=3.2.0-pre1-r499012 Received: from tyo200.gate.nec.co.jp (TYO200.gate.nec.co.jp [210.143.35.50]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2G4m36p014671 for ; Thu, 15 Mar 2007 21:48:04 -0700 Received: from tyo202.gate.nec.co.jp ([10.7.69.202]) by tyo200.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id l2G4m0xE022069 for ; Fri, 16 Mar 2007 13:48:01 +0900 (JST) Received: from mailgate3.nec.co.jp (mailgate53.nec.co.jp [10.7.69.161]) by tyo202.gate.nec.co.jp (8.13.8/8.13.4) with ESMTP id l2G4glMV002727 for ; Fri, 16 Mar 2007 13:42:47 +0900 (JST) Received: (from root@localhost) by mailgate3.nec.co.jp (8.11.7/3.7W-MAILGATE-NEC) id l2G4glG29817 for xfs@oss.sgi.com; Fri, 16 Mar 2007 13:42:47 +0900 (JST) Received: from secsv3.tnes.nec.co.jp (tnesvc2.tnes.nec.co.jp [10.1.101.15]) by mailsv.nec.co.jp (8.11.7/3.7W-MAILSV-NEC) with ESMTP id l2G4glO22754 for ; Fri, 16 Mar 2007 13:42:47 +0900 (JST) Received: from tnesvc2.tnes.nec.co.jp ([10.1.101.15]) by secsv3.tnes.nec.co.jp (ExpressMail 5.10) with SMTP id 20070316.124254.20301592 for ; Fri, 16 Mar 2007 12:42:54 +0900 Received: FROM tnessv1.tnes.nec.co.jp BY tnesvc2.tnes.nec.co.jp ; Fri Mar 16 12:42:53 2007 +0900 Received: from rifu.bsd.tnes.nec.co.jp (rifu.bsd.tnes.nec.co.jp [10.1.104.1]) by tnessv1.tnes.nec.co.jp (Postfix) with ESMTP id 26DAAAE4B3; Fri, 16 Mar 2007 13:42:41 +0900 (JST) Received: from TNESG9700 (TNESG9700.bsd.tnes.nec.co.jp [10.1.104.115]) by rifu.bsd.tnes.nec.co.jp (8.12.11/3.7W/BSD-TNES-MX01) with SMTP id l2G4gjc2008590; Fri, 16 Mar 2007 13:42:45 +0900 To: nscott@aconex.com, peterc@gelato.unsw.edu.au Cc: xfs@oss.sgi.com 
Subject: Re: xfsdump buglets In-reply-to: <1174015536.5051.193.camel@edge> Message-Id: <20070316134240k-ooizumi@rifu.bsd.tnes.nec.co.jp> References: <1174015536.5051.193.camel@edge> Mime-Version: 1.0 X-Mailer: WeMail32[2.51] ID:1K0086 From: Kouta Ooizumi Date: Fri, 16 Mar 2007 13:42:40 +0900 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 10850 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: k-ooizumi@tnes.nec.co.jp Precedence: bulk X-list: xfs Content-Length: 1072 Lines: 41 Hi Nathan, Nathan Scott wrote: >Hey Bill, > >Got a couple of minor xfsdump problems reported to me, they're probably >straightforward for someone who knows xfsdump well - here ya go... > >cheers. > >-------- Forwarded Message -------- >From: Peter Chubb >Reply-To: Peter Chubb , >415080@bugs.debian.org >To: submit@bugs.debian.org >Subject: Bug#415080: Poor error message from xfsdump for incorrect args >Date: Fri, 16 Mar 2007 09:47:50 +1100 > >Package: xfsdump >Version: 2.2.38-1 > >I do: > xfsdump -l 0 -p -f afile filesystem >and get > xfsdump: ERROR: -^@ argument missing >because -p needs a numeric argument. > >The message should say -p argument missing. >-- >Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au >http://www.ertos.nicta.com.au ERTOS within National ICT Australia This bug has already been fixed. See also http://oss.sgi.com/archives/xfs/2007-02/msg00010.html Cheers. -- Kouta Ooizumi NEC Software Tohoku, Ltd. 
k-ooizumi@tnes.nec.co.jp From owner-xfs@oss.sgi.com Thu Mar 15 22:05:49 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 22:05:53 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2G55m6p016941 for ; Thu, 15 Mar 2007 22:05:49 -0700 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 1D53BAAC3D9; Fri, 16 Mar 2007 16:05:47 +1100 (EST) Subject: Re: xfsdump buglets From: Nathan Scott Reply-To: nscott@aconex.com To: Kouta Ooizumi Cc: peterc@gelato.unsw.edu.au, xfs@oss.sgi.com In-Reply-To: <20070316134240k-ooizumi@rifu.bsd.tnes.nec.co.jp> References: <1174015536.5051.193.camel@edge> <20070316134240k-ooizumi@rifu.bsd.tnes.nec.co.jp> Content-Type: text/plain Organization: Aconex Date: Fri, 16 Mar 2007 16:05:42 +1100 Message-Id: <1174021542.5051.196.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 10851 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 466 Lines: 19 On Fri, 2007-03-16 at 13:42 +0900, Kouta Ooizumi wrote: > Hi Nathan, > ... > > xfsdump: ERROR: -^@ argument missing > >because -p needs a numeric argument. > > > >The message should say -p argument missing. > ... > This bug has already been fixed. > See also http://oss.sgi.com/archives/xfs/2007-02/msg00010.html Ah, great - thanks for that. Ever seen the second problem reported there before (the weird absolute vs relative pathname thing)? cheers. 
-- Nathan From owner-xfs@oss.sgi.com Thu Mar 15 23:01:56 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 15 Mar 2007 23:02:01 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2G61r6p023662 for ; Thu, 15 Mar 2007 23:01:55 -0700 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 669ACAAC319; Fri, 16 Mar 2007 17:01:52 +1100 (EST) Subject: Re: [xfs-masters] XFS and booting From: Nathan Scott Reply-To: nscott@aconex.com To: xfs-masters@oss.sgi.com Cc: xfs@oss.sgi.com In-Reply-To: <45FA1605.6080405@zytor.com> References: <45FA1605.6080405@zytor.com> Content-Type: text/plain Organization: Aconex Date: Fri, 16 Mar 2007 17:01:48 +1100 Message-Id: <1174024908.5051.230.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 10852 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 3266 Lines: 71 Hi, On Thu, 2007-03-15 at 20:59 -0700, H. Peter Anvin wrote: > I have been looking at adding XFS support to the syslinux bootloader > suite, and discovered, to my dismay: > > No, for root partition installations because the XFS superblock is > written at block zero, where LILO would be installed. This is to > maintain compatibility with the IRIX on-disk format, and will not be > changed. This FAQ entry could probably be worded a bit more diplomatically. It is both the IRIX format (11+ years) and the last 6+ years of use of XFS on Linux by many, many people (and many SGI / other companies paying customers) that prevents this kind of change at this stage. 
> This means that it's impossible to write a boot loader that actually > plays by the x86 platform rules and still can boot from XFS. Other than Using lilo with an xfs root via the boot=/dev/hda and root=/dev/hda1 method (i.e. non-root-partition MBR) has been working OK for me and others. Is that not playing by the rules? Is that an option for your setup? It's a third option to the two you listed anyway (grub vs /boot, I mean). > the GRUB option of spreading itself all over the disk in places it > shouldn't be, like the MBR, thus breaking e.g. softraid and creating all > kinds of unnecessary interoperability problems. > > Anyway, since it looks like the damage of not offsetting the filesystem > has already been done, I'm trying to figure out what, if anything, can Or, worded another way, "the damage of designing a system having an MBR in the first sector of space that is then allocated to filesystems" - it's a shocker of a layering violation. > be done about it. A standard MBR will never be able to boot an XFS, but > perhaps a slightly modified MBR can be made to do that, without > introducing filesystem-instance-specific issues. In particular, if > there is space anywhere in the superblock for a "boot sector pointer", > *and* there is a way to safely write this pointer, then a slightly > modified MBR could detect an XFS superblock and re-read a boot sector at > that offset. > > Is this something that could be done? It could be done - that's not the sound of me volunteering though ;) - there is space available in most XFS filesystem geometries that could be reclaimed for this kind of thing. For example, for a 512 byte sector filesystem with a 4K blocksize, we have unused sectors at bytes 2048->4096 (following the space used for the first 4 sectors - SB, AGI, AGF, AGFL, but before the first fsblock). This is the default mkfs geometry, so by far most filesystems have the free space here that could be utilised by a boot loader. 
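The arithmetic in that example can be written out as a small Python sketch. It assumes, as the paragraph above describes, that the four header sectors sit at the very start of the filesystem and that any slack runs up to the end of the first filesystem block; the function name is hypothetical and nothing here comes from mkfs.xfs itself.

```python
def boot_slack(sector_size, block_size, header_sectors=4):
    """Return (start, end) byte offsets of unused space after the AG headers
    within the first filesystem block, or None if there is no such space."""
    start = header_sectors * sector_size  # end of SB, AGF, AGI, AGFL sectors
    end = block_size                      # start of the next fsblock
    return (start, end) if start < end else None


# Default geometry from the example: 512 byte sectors, 4K blocks.
print(boot_slack(512, 4096))
# Geometries where the headers fill the first block leave nothing over.
print(boot_slack(512, 2048))
print(boot_slack(4096, 4096))
```

The None cases correspond to the geometries a tool would have to detect and refuse before placing a boot sector there.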
And there's plenty of space in the superblock for a field with the meaning "the boot sector lives here". Actually, it'd be pretty straightforward to do this in XFS - the trickier part is working out in mkfs all of the geometries which do not result in freespace available for a boot sector, but that's not really very hard either. The important question is, would anyone use it, if such a feature was available? Eric/I/... well, anyone could probably hack up the XFS code in a weekend to allow someone to test this out, but I'd want to be very sure it wasn't going to be wasted effort... cheers. -- Nathan From owner-xfs@oss.sgi.com Fri Mar 16 03:36:36 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 03:36:41 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lichtvoll.de (mondschein.lichtvoll.de [194.150.191.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GAaY6p003661 for ; Fri, 16 Mar 2007 03:36:36 -0700 Received: from localhost (dslb-084-056-083-057.pools.arcor-ip.net [84.56.83.57]) by mail.lichtvoll.de (Postfix) with ESMTP id 7CC525ADE6 for ; Fri, 16 Mar 2007 11:36:33 +0100 (CET) From: Martin Steigerwald To: linux-xfs@oss.sgi.com Subject: Re: Questions about XFS Date: Fri, 16 Mar 2007 11:36:31 +0100 User-Agent: KMail/1.9.6 References: <200703131440.56678.clflush@chello.be> <45F8CAEA.3050408@list.rakugaki.org> <200703151007.32630.clflush@chello.be> In-Reply-To: <200703151007.32630.clflush@chello.be> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Disposition: inline Message-Id: <200703161136.32234.Martin@lichtvoll.de> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2GAaa6p003668 X-archive-position: 10853 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Martin@lichtvoll.de Precedence: bulk X-list: xfs 
Content-Length: 3384 Lines: 67 On Thursday 15 March 2007, clflush wrote: > From what I know, and correct me if I'm wrong, XFS relies on the > application side to do the right job but real world experience shows us > that *a lot* of applications out there behave badly and cannot be > trusted hence if something happens, XFS cannot "correct" the problem > leaving you with headaches behind depending on how much data you > lost/corrupted and the importance of it. IMHO, XFS *should* do some > effort at assuring integrity to minimize the bad behavior of badly > written applications out there. Hello, as Eric wrote in this thread, recent versions of XFS make an effort to avoid these zeros in files: "On the other hand, there were some changes made to xfs to explicitly sync files on close, if they have been truncated, which should help this sort of problem. Depending on what's in OpenSuSE 10.2, that change may or may not be in your code..." > On the one hand you have the old Ext3 FS which doesn't perform very > well in many areas but IMO is a lot safer to work on (doesn't loose > data that easily compared to XFS - and I'm talking from experience here > because I use both file systems and I lost much more on the XFS system > than on the Ext3 one) and on the other hand you have this excellent XFS > file system with its clean layout and awesome performance + fancy > features like GRIO, extents, allocate on flush, real time volumes, etc > *but* is not "safe" enough to work with if you have unreliable hardware > and/or a lot of power outage issues - I've never lost data on Ext3 > during a power outage but already lost 2 times data on XFS Since 2.6.17.7 with write barriers enabled I haven't lost metadata consistency on my laptop anymore, and I can tell you that it crashed a lot due to my experiments with whatnot (especially OSS radeon drivers and beryl;-). I also had some classic power outages. I usually do not put a battery into my laptop if not needed. 
And with recent XFS I did not encounter any data losses at all. Might have been luck, but before that, after a crash or power outage, Akregator sometimes told me that the file with the newsfeed stuff was corrupted and a backup had been restored. I haven't seen this dialog for a long time on my laptop. That said, I would like to have more safety built into the filesystem itself, but at least current ext3 is too ancient a technology for me. Coming from the Amiga, a filesystem with a hard maximum number of inodes just doesn't fit my expectations (although the original Amiga FFS has a lot of shortcomings too;-). The real challenge is to implement safety without serious loss of performance. You have more data safety in ext3, but less performance, and more performance in XFS, but potentially less data safety with badly written applications. Not every bit of additional performance in XFS comes from transferring responsibility for data safety to the application, but I believe there is a relationship between safety and performance. Maybe wandering logs / a log-structured approach as (partly) seen in Reiser 4 and NetApp's WAFL might be a good approach to get more data safety without (much) less performance. (Well, in the NetApp FAS non-volatile RAM plays an important role, too.) 
Regards, -- Martin 'Helios' Steigerwald - http://www.Lichtvoll.de GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7 From owner-xfs@oss.sgi.com Fri Mar 16 04:15:12 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 04:15:16 -0700 (PDT) X-Spam-oss-Status: No, score=-0.1 required=5.0 tests=AWL,BAYES_20, J_CHICKENPOX_56 autolearn=no version=3.2.0-pre1-r499012 Received: from donner.stsci.edu (donner.stsci.edu [130.167.251.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GBFA6p011369 for ; Fri, 16 Mar 2007 04:15:12 -0700 Received: from [130.167.102.81] (redhat1.stsci.edu [130.167.102.81]) by donner.stsci.edu (MOS 3.8.3-GA) with ESMTP id FJK99066; Fri, 16 Mar 2007 07:13:43 -0400 (EDT) Message-ID: <45FA7C37.4070001@stsci.edu> Date: Fri, 16 Mar 2007 07:15:03 -0400 From: Thomas Walker User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050317) X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Chinner CC: xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? References: <20070315202027.CNS24438@comet.stsci.edu> <20070316013227.GO5743@melbourne.sgi.com> In-Reply-To: <20070316013227.GO5743@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Junkmail-Whitelist: YES (by domain whitelist at donner.stsci.edu) X-archive-position: 10854 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: walker@stsci.edu Precedence: bulk X-list: xfs Content-Length: 4212 Lines: 116 No problem. The error was with the iflag=direct, apparently RHEL4 doesn't like that option so I took it out. 
Here's the output from each of the xfs volumes; [root@hla-ags ~]# dd if=/dev/mapper/vg0-hladata3 bs=512 count=1 | od -Ax -x 1+0 records in 1+0 records out 000000 0000 6419 0000 0c00 0000 7419 0000 0c00 000010 0000 8419 0000 0c00 0000 9419 0000 0c00 000020 0000 a419 0000 0c00 0000 b419 0000 0c00 000030 0000 c419 0000 0c00 0000 d419 0000 0c00 000040 0000 e419 0000 0c00 0000 f419 0000 0c00 000050 0000 041a 0000 0c00 0000 141a 0000 0c00 000060 0000 241a 0000 0c00 0000 341a 0000 0c00 000070 0000 441a 0000 0c00 0000 541a 0000 0c00 000080 0000 641a 0000 0c00 0000 741a 0000 0c00 000090 0000 841a 0000 0c00 0000 941a 0000 0c00 0000a0 0000 a41a 0000 0c00 0000 b41a 0000 0c00 0000b0 0000 c41a 0000 0c00 0000 d41a 0000 0c00 0000c0 0000 e41a 0000 0c00 0000 f41a 0000 0c00 0000d0 0000 041b 0000 0c00 0000 141b 0000 0c00 0000e0 0000 241b 0000 0c00 0000 341b 0000 0c00 0000f0 0000 441b 0000 0c00 0000 541b 0000 0c00 000100 0000 641b 0000 0c00 0000 741b 0000 0c00 000110 0000 841b 0000 0c00 0000 941b 0000 0c00 000120 0000 a41b 0000 0c00 0000 b41b 0000 0c00 000130 0000 c41b 0000 0c00 0000 d41b 0000 0c00 000140 0000 e41b 0000 0c00 0000 f41b 0000 0c00 000150 0000 041c 0000 0c00 0000 141c 0000 0c00 000160 0000 241c 0000 0c00 0000 341c 0000 0c00 000170 0000 441c 0000 0c00 0000 541c 0000 0c00 000180 0000 641c 0000 0c00 0000 741c 0000 0c00 000190 0000 841c 0000 0c00 0000 941c 0000 0c00 0001a0 0000 a41c 0000 0c00 0000 b41c 0000 0c00 0001b0 0000 c41c 0000 0c00 0000 d41c 0000 0c00 0001c0 0000 e41c 0000 0c00 0000 f41c 0000 0c00 0001d0 0000 041d 0000 0c00 0000 141d 0000 0c00 0001e0 0000 241d 0000 0c00 0000 341d 0000 0c00 0001f0 0000 441d 0000 0c00 0000 541d 0000 0c00 000200 [root@hla-ags ~]# dd if=/dev/mapper/vg1-hladata2 bs=512 count=1 | od -Ax -x 1+0 records in 000000 7970 6f72 746f 203a 2030 0a2f 500a 414c 1+0 records out 000010 4e49 4b0a 3120 410a 560a 3120 0a34 6964 000020 2072 2e31 2e30 3572 392f 3830 4b0a 3420 000030 690a 746f 0a61 2056 3531 660a 6c69 2065 000040 2e6b 2e30 3472 362f 3833 450a 
444e 450a 000050 444e 4552 0a50 6469 203a 2e30 2e30 3572 000060 312f 3131 0a30 7974 6570 203a 6964 0a72 000070 7270 6465 203a 2e30 2e30 3472 382f 3034 000080 630a 756f 746e 203a 0a35 6574 7478 203a 000090 2035 3031 3733 3620 2030 3036 3220 6262 0000a0 3538 6237 3138 3165 3235 6134 3939 6163 0000b0 6337 6165 3163 3636 3433 6663 0a66 7063 0000c0 7461 3a68 2f20 630a 706f 7279 6f6f 3a74 0000d0 3020 2f20 0a0a 2e63 2e30 3372 392f 3135 0000e0 6420 6c65 7465 2065 6166 736c 2065 6166 0000f0 736c 2065 412f 442f 472f 722f 6f68 0a0a 000100 2e6c 2e30 3474 312d 6d20 646f 6669 2079 000110 7274 6575 6620 6c61 6573 2f20 2f41 2f43 000120 7065 6973 6f6c 0a6e 0a0a 3131 3031 3120 000130 3332 0a38 0000 0000 0000 0000 0000 0000 000140 0000 0000 0000 0000 0000 0000 0000 0000 * 000200 Does that help with a diag? Thomas Walker David Chinner wrote: >On Thu, Mar 15, 2007 at 08:20:27PM -0400, Thomas Walker wrote: > > >> Ok, here's the output of the command you wanted. I ran it on both of the xfs file systems we have, both say bad superblock when trying to mount; >> >>[root@hla-ags ~]# dd if=/dev/mapper/vg0-hladata3 bs=512 count=1 iflag=direct 2> /dev/null | od -Ax -x >>000000 >> >> > >That failed - the output should be like: > > # dd if=/dev/mapper/test_vg-fred bs=512 count=1 iflag=direct 2> /dev/null | od -Ax -x >000000 4658 4253 0000 0010 0000 0000 1000 0000 >000010 0000 0000 0000 0000 0000 0000 0000 0000 >000020 34a8 5343 d8e3 8d46 01a5 b1e4 3a76 ac05 >000030 0000 0000 0800 0400 0000 0000 0000 8000 >000040 0000 0000 0000 8100 0000 0000 0000 8200 >000050 0000 0100 0200 0000 0000 0800 0000 0000 >000060 0000 000a b430 0002 0001 1000 0000 0000 >000070 0000 0000 0000 0000 090c 0408 0011 1900 >000080 0000 0000 0000 803c 0000 0000 0000 0606 >000090 0000 0000 0c00 16f5 0000 0000 0000 0000 >0000a0 0000 0000 0000 0000 0000 0000 0000 0000 >0000b0 0000 0000 0000 0200 0000 0000 0000 0000 >0000c0 0000 0000 0000 0000 0000 0000 0000 0000 >* >000200 > >Can you remove the redirect to /dev/null so we can see 
the error message? > >Sorry about that. > >Cheers, > >Dave. > > From owner-xfs@oss.sgi.com Fri Mar 16 04:31:24 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 04:31:28 -0700 (PDT) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_05, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.netapp.com (mx2.netapp.com [216.240.18.37]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GBVM6p020213 for ; Fri, 16 Mar 2007 04:31:24 -0700 Received: from smtp2.corp.netapp.com ([10.57.159.114]) by mx2.netapp.com with ESMTP; 16 Mar 2007 04:31:18 -0700 X-IronPort-AV: i="4.14,292,1170662400"; d="scan'208"; a="41738931:sNHT47492907" Received: from svlexrs02.hq.netapp.com (svlexrs02.corp.netapp.com [10.57.156.154]) by smtp2.corp.netapp.com (8.13.1/8.13.1/NTAP-1.6) with ESMTP id l2GBVHPu020975; Fri, 16 Mar 2007 04:31:17 -0700 (PDT) Received: from exsvlrb01.hq.netapp.com ([10.56.8.62]) by svlexrs02.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Fri, 16 Mar 2007 04:32:31 -0700 Received: from exnane01.hq.netapp.com ([10.97.0.61]) by exsvlrb01.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.1830); Fri, 16 Mar 2007 04:32:30 -0700 Received: from tmt.netapp.com ([10.30.32.32]) by exnane01.hq.netapp.com with Microsoft SMTPSVC(6.0.3790.0); Fri, 16 Mar 2007 07:32:29 -0400 X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Fri, 16 Mar 2007 07:30:28 -0400 To: David Chinner From: "Talpey, Thomas" Subject: Re: Strange XFS issue on tiny-NAS ARM NFS server Cc: "Talpey, Thomas" , xfs@oss.sgi.com In-Reply-To: <20070315225959.GO6095633@melbourne.sgi.com> References: <45F85BFA.1070505@sandeen.net> <20070315225959.GO6095633@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Message-ID: X-OriginalArrivalTime: 16 Mar 2007 11:32:29.0245 (UTC) FILETIME=[C78926D0:01C767BE] X-archive-position: 10855 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com 
X-original-sender: Thomas.Talpey@netapp.com Precedence: bulk X-list: xfs Content-Length: 686 Lines: 19 At 06:59 PM 3/15/2007, David Chinner wrote: >On Thu, Mar 15, 2007 at 07:34:37AM -0400, Talpey, Thomas wrote: >> Evidently that was not the only compiling issue (surprise): >> >> ><6>attempt to access beyond end of device >> ><6>sda3: rw=2, want=2574098408, limit=154272195... >That's a long way past the end of the partition. > >Tom, did you run xfs_repair on that filesystem after >running with a busted compiler? Who knows how >it broke stuff on disk..... No, this was a freshly rebuilt ~80GB xfs. I was running a test on it over NFS which wrote and read 500MB files repeatedly. And, it ran for quite some time before failing. I haven't had a chance to track it down at all. Tom. From owner-xfs@oss.sgi.com Fri Mar 16 07:08:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 07:08:07 -0700 (PDT) X-Spam-oss-Status: No, score=0.4 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_34,RCVD_BAD_ID autolearn=no version=3.2.0-pre1-r499012 Received: from evaldomino.Falconstor.com (mail1.falconstor.com [216.223.47.230]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GE826p017322 for ; Fri, 16 Mar 2007 07:08:04 -0700 Received: from [10.1.10.203] ([10.1.10.203]) by falconstormail.falconstor.net (Lotus Domino Release 5.0.11) with ESMTP id 2007031610070863:4724 ; Fri, 16 Mar 2007 10:07:08 -0400 Message-ID: <45FAA4BB.20908@falconstor.com> Date: Fri, 16 Mar 2007 10:07:55 -0400 From: "Geir A. Myrestrand" Reply-To: geir.myrestrand@falconstor.com Organization: FalconStor Software, Inc. 
User-Agent: Thunderbird 1.5.0.10 (Windows/20070221) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: Re: XFS on Redhat4 U4 References: <4BE67D4A1485054E99B21710FF21A9CD020BE8A8@swsmsx411.ger.corp.intel.com> <45F95E77.10207@falconstor.com> <45FA035B.2070505@sandeen.net> In-Reply-To: <45FA035B.2070505@sandeen.net> X-MIMETrack: Itemize by SMTP Server on FalconstorMail/FalconStor(Release 5.0.11 |July 24, 2002) at 03/16/2007 10:07:08 AM, Serialize by Router on evaldomino/FalconStor(Release 5.0.11 |July 24, 2002) at 03/16/2007 11:09:00 AM, Serialize complete at 03/16/2007 11:09:00 AM Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=ISO-8859-1; format=flowed X-archive-position: 10856 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: geir.myrestrand@falconstor.com Precedence: bulk X-list: xfs Content-Length: 528 Lines: 20 Eric Sandeen wrote: > Centos has the latest version of what I've done... sgi guys, you might > actually take down the above link, or redirect it. > > The src.rpms are also mirrored at http://sandeen.net/rhel4_xfs (*) > > Grabbing the pre-built stuff from centos may be simplest, though. > > -Eric > > *rhel5_xfs for the adventurous... I've started using your XFS modules for RHEL5, but it takes another week before I've ported our application so that I can start giving it a more serious spin. -- Geir A. 
Myrestrand From owner-xfs@oss.sgi.com Fri Mar 16 07:59:21 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 07:59:30 -0700 (PDT) X-Spam-oss-Status: No, score=1.0 required=5.0 tests=BAYES_60 autolearn=ham version=3.2.0-pre1-r499012 Received: from over.ny.us.ibm.com (over.ny.us.ibm.com [32.97.182.150]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GExI6p027300 for ; Fri, 16 Mar 2007 07:59:20 -0700 Received: from e2.ny.us.ibm.com ([192.168.1.102]) by pokfb.esmtp.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2GEXglR001095 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Fri, 16 Mar 2007 10:33:42 -0400 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e2.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id l2GEXbbF007386 for ; Fri, 16 Mar 2007 10:33:37 -0400 Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay02.pok.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2GEV5B9211774 for ; Fri, 16 Mar 2007 10:31:05 -0400 Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2GEV4sC017776 for ; Fri, 16 Mar 2007 10:31:04 -0400 Received: from amitarora.in.ibm.com (amitarora.in.ibm.com [9.124.31.34]) by d01av03.pok.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2GEV2Q6017620; Fri, 16 Mar 2007 10:31:03 -0400 Received: from amitarora.in.ibm.com (localhost.localdomain [127.0.0.1]) by amitarora.in.ibm.com (Postfix) with ESMTP id 4714E29EC8B; Fri, 16 Mar 2007 20:01:03 +0530 (IST) Received: (from amit@localhost) by amitarora.in.ibm.com (8.13.1/8.13.1/Submit) id l2GEV10f027499; Fri, 16 Mar 2007 20:01:01 +0530 Date: Fri, 16 Mar 2007 20:01:01 +0530 From: "Amit K. 
Arora" To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com Cc: Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070316143101.GA10152@amitarora.in.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070301183445.GA7911@amitarora.in.ibm.com> User-Agent: Mutt/1.4.1i X-archive-position: 10857 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: aarora@linux.vnet.ibm.com Precedence: bulk X-list: xfs Content-Length: 7479 Lines: 222 First of all, thanks for the overwhelming response! Based on the suggestions received, I have added a new parameter to the sys_fallocate() system call - an integer called "mode", just after the "fd". Now the system call looks like this: asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) Currently we have two modes, FA_ALLOCATE and FA_DEALLOCATE, for preallocation and deallocation of preallocated blocks respectively. More modes can be added when required. And these modes can be renamed, since I am sure these are in no way the best ones! :) Attached below is the patch which implements this system call. It has currently been implemented and tested on i386, ppc64 and x86_64 architectures. I am facing some problems while trying to implement this on s390, and thus the delay. While I try to get it right on s390(x), we thought of posting this patch, so that we can save some time. In parallel, we will work on getting the patch to work on s390, and probably it will come as a separate patch.
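To illustrate the basic argument checks, here is a minimal userspace sketch (not part of the patch) that mirrors the order of the tests sys_fallocate() makes in the fs/open.c hunk below. The helper name fallocate_check_args(), the writable flag (standing in for file->f_mode & FMODE_WRITE) and the max_bytes parameter (standing in for inode->i_sb->s_maxbytes) are assumptions made for this example only. The "mode" argument does not appear because the patch passes it through to the filesystem unchecked.

```c
#define _XOPEN_SOURCE 700   /* for the S_IF* constants under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper, for illustration only: returns the error code that
 * the checks in the posted sys_fallocate() would produce, or 0 at the point
 * where the kernel would go on to call inode->i_op->fallocate().
 * long long stands in for the kernel's loff_t.
 */
long fallocate_check_args(int writable, mode_t i_mode,
                          long long offset, long long len,
                          long long max_bytes)
{
    assert(max_bytes >= 0);

    if (len == 0 || offset < 0)
        return -EINVAL;     /* nothing to do, or negative offset */
    if (!writable)
        return -EBADF;      /* file not opened for writing */
    if (S_ISFIFO(i_mode))
        return -ESPIPE;     /* pipes cannot be preallocated */
    if (!S_ISREG(i_mode))
        return -ENODEV;     /* only regular files are accepted */
    /*
     * The patch tests offset + len > s_maxbytes. It is written as a
     * subtraction here so that this sketch cannot overflow a signed integer.
     */
    if (len > max_bytes - offset)
        return -EFBIG;
    return 0;
}
```

For example, fallocate_check_args(1, S_IFREG, 0, 0, 1LL << 40) yields -EINVAL because len is zero, and the same call on a directory with a non-zero len yields -ENODEV.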
ToDos: ===== Following is pending: 1> Implementation on other architectures (other than i386, x86_64 and ppc64) like s390(x) 2> A generic file system operation to handle fallocate (generic_fallocate), for filesystems that do _not_ have the fallocate inode operation implemented. 3> ext4 patches that support fallocate inode operation are ready. I plan to submit those separately to just ext4 mailing list. 4> Changes to glibc, so that posix_fallocate() and posix_fallocate64() call fallocate() system call 5> Changes to XFS to implement the fallocate inode operation Signed-off-by: Amit K Arora --- arch/i386/kernel/syscall_table.S | 1 arch/x86_64/kernel/functionlist | 1 fs/open.c | 41 +++++++++++++++++++++++++++++++++++++++ include/asm-i386/unistd.h | 3 +- include/asm-powerpc/systbl.h | 1 include/asm-powerpc/unistd.h | 3 +- include/asm-x86_64/unistd.h | 4 ++- include/linux/fs.h | 7 ++++++ include/linux/syscalls.h | 1 9 files changed, 59 insertions(+), 3 deletions(-) Index: linux-2.6.20.1/arch/i386/kernel/syscall_table.S =================================================================== --- linux-2.6.20.1.orig/arch/i386/kernel/syscall_table.S +++ linux-2.6.20.1/arch/i386/kernel/syscall_table.S @@ -319,3 +319,4 @@ ENTRY(sys_call_table) .long sys_move_pages .long sys_getcpu .long sys_epoll_pwait + .long sys_fallocate /* 320 */ Index: linux-2.6.20.1/fs/open.c =================================================================== --- linux-2.6.20.1.orig/fs/open.c +++ linux-2.6.20.1/fs/open.c @@ -350,6 +350,47 @@ asmlinkage long sys_ftruncate64(unsigned } #endif +asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) +{ + struct file *file; + struct inode *inode; + long ret = -EINVAL; + + if (len == 0 || offset < 0) + goto out; + + ret = -EBADF; + file = fget(fd); + if (!file) + goto out; + if (!(file->f_mode & FMODE_WRITE)) + goto out_fput; + + inode = file->f_path.dentry->d_inode; + + ret = -ESPIPE; + if (S_ISFIFO(inode->i_mode)) + goto out_fput; + + ret = 
-ENODEV; + if (!S_ISREG(inode->i_mode)) + goto out_fput; + + ret = -EFBIG; + if (offset + len > inode->i_sb->s_maxbytes) + goto out_fput; + + if (inode->i_op && inode->i_op->fallocate) + ret = inode->i_op->fallocate(inode, mode, offset, len); + else + ret = -ENOSYS; +out_fput: + fput(file); +out: + return ret; +} +EXPORT_SYMBOL(sys_fallocate); + /* * access() needs to use the real uid/gid, not the effective uid/gid. * We do this by temporarily clearing all FS-related capabilities and Index: linux-2.6.20.1/include/asm-i386/unistd.h =================================================================== --- linux-2.6.20.1.orig/include/asm-i386/unistd.h +++ linux-2.6.20.1/include/asm-i386/unistd.h @@ -325,10 +325,11 @@ #define __NR_move_pages 317 #define __NR_getcpu 318 #define __NR_epoll_pwait 319 +#define __NR_fallocate 320 #ifdef __KERNEL__ -#define NR_syscalls 320 +#define NR_syscalls 321 #define __ARCH_WANT_IPC_PARSE_VERSION #define __ARCH_WANT_OLD_READDIR Index: linux-2.6.20.1/include/linux/fs.h =================================================================== --- linux-2.6.20.1.orig/include/linux/fs.h +++ linux-2.6.20.1/include/linux/fs.h @@ -263,6 +263,12 @@ extern int dir_notify_enable; #define SYNC_FILE_RANGE_WRITE 2 #define SYNC_FILE_RANGE_WAIT_AFTER 4 +/* + * fallocate() modes + */ +#define FA_ALLOCATE 0x1 +#define FA_DEALLOCATE 0x2 + #ifdef __KERNEL__ #include @@ -1124,6 +1130,7 @@ struct inode_operations { ssize_t (*listxattr) (struct dentry *, char *, size_t); int (*removexattr) (struct dentry *, const char *); void (*truncate_range)(struct inode *, loff_t, loff_t); + int (*fallocate)(struct inode *, int, loff_t, loff_t); }; struct seq_file; Index: linux-2.6.20.1/include/linux/syscalls.h =================================================================== --- linux-2.6.20.1.orig/include/linux/syscalls.h +++ linux-2.6.20.1/include/linux/syscalls.h @@ -602,6 +602,7 @@ asmlinkage long sys_get_robust_list(int asmlinkage long sys_set_robust_list(struct 
robust_list_head __user *head, size_t len); asmlinkage long sys_getcpu(unsigned __user *cpu, unsigned __user *node, struct getcpu_cache __user *cache); +asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len); int kernel_execve(const char *filename, char *const argv[], char *const envp[]); Index: linux-2.6.20.1/include/asm-x86_64/unistd.h =================================================================== --- linux-2.6.20.1.orig/include/asm-x86_64/unistd.h +++ linux-2.6.20.1/include/asm-x86_64/unistd.h @@ -619,8 +619,10 @@ __SYSCALL(__NR_sync_file_range, sys_sync __SYSCALL(__NR_vmsplice, sys_vmsplice) #define __NR_move_pages 279 __SYSCALL(__NR_move_pages, sys_move_pages) +#define __NR_fallocate 280 +__SYSCALL(__NR_fallocate, sys_fallocate) -#define __NR_syscall_max __NR_move_pages +#define __NR_syscall_max __NR_fallocate #ifndef __NO_STUBS #define __ARCH_WANT_OLD_READDIR Index: linux-2.6.20.1/include/asm-powerpc/unistd.h =================================================================== --- linux-2.6.20.1.orig/include/asm-powerpc/unistd.h +++ linux-2.6.20.1/include/asm-powerpc/unistd.h @@ -324,10 +324,11 @@ #define __NR_get_robust_list 299 #define __NR_set_robust_list 300 #define __NR_move_pages 301 +#define __NR_fallocate 302 #ifdef __KERNEL__ -#define __NR_syscalls 302 +#define __NR_syscalls 303 #define __NR__exit __NR_exit #define NR_syscalls __NR_syscalls Index: linux-2.6.20.1/arch/x86_64/kernel/functionlist =================================================================== --- linux-2.6.20.1.orig/arch/x86_64/kernel/functionlist +++ linux-2.6.20.1/arch/x86_64/kernel/functionlist @@ -932,6 +932,7 @@ *(.text.sys_getitimer) *(.text.sys_getgroups) *(.text.sys_ftruncate) +*(.text.sys_fallocate) *(.text.sysfs_lookup) *(.text.sys_exit_group) *(.text.stub_fork) Index: linux-2.6.20.1/include/asm-powerpc/systbl.h =================================================================== --- linux-2.6.20.1.orig/include/asm-powerpc/systbl.h +++ 
linux-2.6.20.1/include/asm-powerpc/systbl.h @@ -305,3 +305,4 @@ SYSCALL_SPU(faccessat) COMPAT_SYS_SPU(get_robust_list) COMPAT_SYS_SPU(set_robust_list) COMPAT_SYS(move_pages) +SYSCALL(fallocate) -- Regards, Amit Arora From owner-xfs@oss.sgi.com Fri Mar 16 08:15:24 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 08:15:31 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from donner.stsci.edu (donner.stsci.edu [130.167.251.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GFFM6p030986 for ; Fri, 16 Mar 2007 08:15:24 -0700 Received: from [130.167.102.81] (redhat1.stsci.edu [130.167.102.81]) by donner.stsci.edu (MOS 3.8.3-GA) with ESMTP id FJM72314; Fri, 16 Mar 2007 11:13:53 -0400 (EDT) Message-ID: <45FAB480.908@stsci.edu> Date: Fri, 16 Mar 2007 11:15:12 -0400 From: Thomas Walker User-Agent: Mozilla Thunderbird 1.0.2 (X11/20050317) X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Chinner CC: xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? References: <45F92D8C.3090708@stsci.edu> <20070315150422.7bc5d178@harpe.intellique.com> <45F952F2.6000008@stsci.edu> <20070315231031.GP6095633@melbourne.sgi.com> In-Reply-To: <20070315231031.GP6095633@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Junkmail-Whitelist: YES (by domain whitelist at donner.stsci.edu) X-archive-position: 10858 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: walker@stsci.edu Precedence: bulk X-list: xfs Content-Length: 1807 Lines: 58 I see in other posts that "parted" is sometimes a culprit in these problems. Indeed, in my case I did use "parted" to create a gpt partition table on these xfs volumes and I was asked my parted to use the secondary signature. --- snip --- i also remember something about parted (maybe...) 
finding a backup gpt signature at the end of a disk, and "helpfully" copying it over the front end if so. This was a bug. sgi guys do you remember? But for this one has to invoke parted, and commit the operations done, am I right? if I recall, even invoking parted could do this. --- snip --- So maybe I got bit the same way. parted may have overwritten something at the head of the volume. Is there any way to repair the super block though? It seems that everyone agrees xfs can't do anything until it has a super block somewhere and I don't seem to have one. If there's no way to repair, then what about recovery? I see mention of possibly doing an xfs dump to another disk, reformatting the original volume, and then doing an xfs restore back. Is there any online procedure for how to do that if it applies to me here? Thomas Walker David Chinner wrote: >On Thu, Mar 15, 2007 at 10:06:42AM -0400, Thomas Walker wrote: > > >> The terminal shows a lot of "." dots running across the screen >>quickly, and every few hours it says this; >> >> >>.....................................................found candidate >>secondary superblock... >>unable to verify superblock, continuing... >>found candidate secondary superblock... >>unable to verify superblock, continuing... >> >> > >The primary superblock is not good, and it's trying to find a valid >secondary superblock. Doesn't sound promising so far - repair can't >start until a valid superblock is found....
> > > > From owner-xfs@oss.sgi.com Fri Mar 16 08:42:18 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 08:42:26 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mtagate3.uk.ibm.com (mtagate3.uk.ibm.com [195.212.29.136]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GFgG6p008944 for ; Fri, 16 Mar 2007 08:42:18 -0700 Received: from d06nrmr1407.portsmouth.uk.ibm.com (d06nrmr1407.portsmouth.uk.ibm.com [9.149.38.185]) by mtagate3.uk.ibm.com (8.13.8/8.13.8) with ESMTP id l2GFMhJb073280 for ; Fri, 16 Mar 2007 15:22:43 GMT Received: from d06av04.portsmouth.uk.ibm.com (d06av04.portsmouth.uk.ibm.com [9.149.37.216]) by d06nrmr1407.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2GFMhdx2105422 for ; Fri, 16 Mar 2007 15:22:43 GMT Received: from d06av04.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av04.portsmouth.uk.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2GFMgXn017555 for ; Fri, 16 Mar 2007 15:22:43 GMT Received: from localhost (dyn-9-152-198-39.boeblingen.de.ibm.com [9.152.198.39]) by d06av04.portsmouth.uk.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2GFMfIR017520; Fri, 16 Mar 2007 15:22:42 GMT Date: Fri, 16 Mar 2007 16:21:03 +0100 From: Heiko Carstens To: "Amit K. 
Arora" Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070316152103.GD8525@osiris.boeblingen.de.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070316143101.GA10152@amitarora.in.ibm.com> User-Agent: mutt-ng/devel-r804 (Linux) X-archive-position: 10859 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: heiko.carstens@de.ibm.com Precedence: bulk X-list: xfs Content-Length: 1272 Lines: 25 On Fri, Mar 16, 2007 at 08:01:01PM +0530, Amit K. Arora wrote: > First of all, thanks for the overwhelming response! > > Based on the suggestions received, I have added a new parameter to the > sys_fallocate() system call - an interger called "mode", just after the > "fd". Now the system call looks like this: > > asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) > > Currently we have two modes FA_ALLOCATE and FA_DEALLOCATE, for > preallocation and deallocation of preallocated blocks respectively. More > modes can be added, when required. And these modes can be renamed, since > I am sure these are no way the best ones ! :) > > Attached below is the patch which implements this system call. It has > been currently implemented and tested on i386, ppc64 and x86_64 > architectures. I am facing some problems while trying to implement this > on s390, and thus the delay. While I try to get it right on s390(x), we > thought of posting this patch, so that we can save some time. 
Parallely > we will work on getting the patch work on s390, and probably it will > come as a separate patch. What's the problem you face on s390? If it's just the compat wrapper, you may look at sys_sync_file_range_wrapper. Or I will send a patch if needed. From owner-xfs@oss.sgi.com Fri Mar 16 09:18:45 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 09:18:50 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mtagate1.de.ibm.com (mtagate1.de.ibm.com [195.212.29.150]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GGIg6p017366 for ; Fri, 16 Mar 2007 09:18:44 -0700 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate1.de.ibm.com (8.13.8/8.13.8) with ESMTP id l2GGIfIw056630 for ; Fri, 16 Mar 2007 16:18:41 GMT Received: from d12av01.megacenter.de.ibm.com (d12av01.megacenter.de.ibm.com [9.149.165.212]) by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2GGIe5e2080954 for ; Fri, 16 Mar 2007 17:18:41 +0100 Received: from d12av01.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av01.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2GGIex3006302 for ; Fri, 16 Mar 2007 17:18:40 +0100 Received: from localhost (dyn-9-152-198-39.boeblingen.de.ibm.com [9.152.198.39]) by d12av01.megacenter.de.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2GGIefA006299; Fri, 16 Mar 2007 17:18:40 +0100 Date: Fri, 16 Mar 2007 17:17:04 +0100 From: Heiko Carstens To: "Amit K. 
Arora" Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070316143101.GA10152@amitarora.in.ibm.com> User-Agent: mutt-ng/devel-r804 (Linux) X-archive-position: 10860 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: heiko.carstens@de.ibm.com Precedence: bulk X-list: xfs Content-Length: 1107 Lines: 29 > on s390, and thus the delay. While I try to get it right on s390(x), we > thought of posting this patch, so that we can save some time. Parallely > we will work on getting the patch work on s390, and probably it will > come as a separate patch. > > +asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) > +{ There is something here that will not work on s390 (31bit): the arguments would end up in: fd -> r2, mode -> r3, offset -> r4 + r5, len -> r6 + second half on stack. But the s390 ABI says that a long long will be put into two consecutive registers if the first register is smaller than 6, or it will be put completely on the stack. So both 32 bit parts of len will end up on the stack. That would make it a syscall with seven arguments, which we currently don't support on s390. There is no way to access the second half of len from kernel space and that is why it is not working for you. So you either rearrange the parameters or convert the loff_t's to pointers. e.g.
asmlinkage long sys_fallocate(int fd, loff_t offset, loff_t len, int mode) would work even on s390 ;) From owner-xfs@oss.sgi.com Fri Mar 16 11:08:56 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 11:09:01 -0700 (PDT) X-Spam-oss-Status: No, score=1.3 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtp2.mundo-r.com (smtp7.mundo-r.com [212.51.32.154]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GI8r6p006243 for ; Fri, 16 Mar 2007 11:08:55 -0700 Received: from cm44039.red83-165.mundo-r.com (HELO [192.168.1.36]) ([83.165.44.39]) by smtp2.mundo-r.com with ESMTP; 16 Mar 2007 19:08:51 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAFZ6+kVTpSwn/2dsb2JhbAAN X-IronPort-AV: i="4.14,293,1170630000"; d="scan'208"; a="161176163:sNHT103015304" Message-ID: <45FADD32.40203@mundo-r.com> Date: Fri, 16 Mar 2007 19:08:50 +0100 From: Antonio Trueba User-Agent: IceDove 1.5.0.10 (X11/20070307) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: Re: Spanish and Galician translation References: <45EC7F67.3050308@mundo-r.com> <1173182784.5051.7.camel@edge> In-Reply-To: <1173182784.5051.7.camel@edge> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 10861 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: atrueba@mundo-r.com Precedence: bulk X-list: xfs Content-Length: 772 Lines: 21 Nathan Scott wrote: > On Mon, 2007-03-05 at 21:36 +0100, Antonio Trueba wrote: >> Hello, >> >> I'd like to know if someone is already translating XFS to Spanish (es) >> or Galician (gl). If not, I'd like to do both myself. > > Go for it, no one else is working on those AFAIK. You should probably > rebuild the translation database ("cd xfsprogs/po && make xfsprogs.pot" > IIRC) first, it's probably not been updated in a while, and there are new > strings in xfs_repair at least.
> > Let me know if theres any issues getting it working - I know a little > bit about the build system in this area, so can help. I have something to submit, at last :-). What is the preferred method for this? Do I send .po files to some coordinator, or just to the list? Regards, -- From owner-xfs@oss.sgi.com Fri Mar 16 12:20:52 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 12:20:55 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_20, J_CHICKENPOX_56 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2GJKm6p025204 for ; Fri, 16 Mar 2007 12:20:50 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id GAA18453; Sat, 17 Mar 2007 06:20:41 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2GJKeAf29530930; Sat, 17 Mar 2007 06:20:40 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2GJKbBe29836882; Sat, 17 Mar 2007 06:20:37 +1100 (AEDT) Date: Sat, 17 Mar 2007 06:20:37 +1100 From: David Chinner To: Thomas Walker Cc: xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? 
Message-ID: <20070316192037.GZ5743@melbourne.sgi.com> References: <20070315202027.CNS24438@comet.stsci.edu> <20070316013227.GO5743@melbourne.sgi.com> <45FA7C37.4070001@stsci.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45FA7C37.4070001@stsci.edu> User-Agent: Mutt/1.4.2.1i X-archive-position: 10862 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1263 Lines: 41 On Fri, Mar 16, 2007 at 07:15:03AM -0400, Thomas Walker wrote: > > No problem. The error was with the iflag=direct, apparently RHEL4 > doesn't like that option so I took it out. Here's the output from each > of the xfs volumes; > > [root@hla-ags ~]# dd if=/dev/mapper/vg0-hladata3 bs=512 count=1 | od -Ax -x > 1+0 records in > 1+0 records out > 000000 0000 6419 0000 0c00 0000 7419 0000 0c00 > 000010 0000 8419 0000 0c00 0000 9419 0000 0c00 That's not an XFS superblock :( > [root@hla-ags ~]# dd if=/dev/mapper/vg1-hladata2 bs=512 count=1 | od -Ax -x > 1+0 records in > 000000 7970 6f72 746f 203a 2030 0a2f 500a 414c > 000010 4e49 4b0a 3120 410a 560a 3120 0a34 6964 > 000020 2072 2e31 2e30 3572 392f 3830 4b0a 3420 > 000030 690a 746f 0a61 2056 3531 660a 6c69 2065 Neither is that - it's a bunch of text that doesn't make much sense to me.... This would explain why xfs_repair is having trouble. > Does that help with a diag? It tells us that something has either overwritten the start of the partitions, or the lvm volumes have been put together incorrectly so the superblocks are not where they should be. I'd check that the LVM config is correct (again).... Cheers, Dave.
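A note on reading the dumps above: od -x prints 16-bit words in host byte order, so on a little-endian machine each pair of bytes appears swapped. The XFS superblock magic is the ASCII string "XFSB", which is why a good first sector starts with the words 4658 4253. The sketch below undoes that swap and tests for the magic. The helper names od_words_to_bytes() and has_xfs_magic() are made up for this illustration and are not part of xfsprogs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand 16-bit words, as printed by `od -x` on a little-endian host,
 * back into the byte sequence that is actually on disk.
 * out must have room for 2 * nwords bytes.
 */
void od_words_to_bytes(const unsigned short *words, size_t nwords,
                       unsigned char *out)
{
    assert(words != NULL && out != NULL);
    for (size_t i = 0; i < nwords; i++) {
        out[2 * i]     = (unsigned char)(words[i] & 0xff);        /* low byte comes first on disk */
        out[2 * i + 1] = (unsigned char)((words[i] >> 8) & 0xff); /* then the high byte */
    }
}

/* Return 1 if the buffer starts with the XFS superblock magic "XFSB". */
int has_xfs_magic(const unsigned char *sector, size_t len)
{
    return len >= 4 && memcmp(sector, "XFSB", 4) == 0;
}
```

Fed the words 4658 4253 this reports a match, while the first four words of the vg1-hladata2 dump (7970 6f72 746f 203a) decode to the text "pyroot: " and do not.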
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Mar 16 12:30:39 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 12:30:43 -0700 (PDT) X-Spam-oss-Status: No, score=-1.2 required=5.0 tests=AWL,BAYES_20, J_CHICKENPOX_56,SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GJUZ6p027199 for ; Fri, 16 Mar 2007 12:30:38 -0700 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 307A2180173DC; Fri, 16 Mar 2007 14:30:34 -0500 (CDT) Message-ID: <45FAF061.8070800@sandeen.net> Date: Fri, 16 Mar 2007 14:30:41 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: David Chinner CC: Thomas Walker , xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? References: <20070315202027.CNS24438@comet.stsci.edu> <20070316013227.GO5743@melbourne.sgi.com> <45FA7C37.4070001@stsci.edu> <20070316192037.GZ5743@melbourne.sgi.com> In-Reply-To: <20070316192037.GZ5743@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10863 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1003 Lines: 33 David Chinner wrote: > On Fri, Mar 16, 2007 at 07:15:03AM -0400, Thomas Walker wrote: >> No problem. The error was with the iflag=direct, apparently RHEL4 >> doesn't like that option so I took it out. 
Here's the output from each >> of the xfs volumes; >> >> [root@hla-ags ~]# dd if=/dev/mapper/vg0-hladata3 bs=512 count=1 | od -Ax -x >> 1+0 records in >> 1+0 records out >> 000000 0000 6419 0000 0c00 0000 7419 0000 0c00 >> 000010 0000 8419 0000 0c00 0000 9419 0000 0c00 > > That's not a XFS superblock :( lmdd pattern? >> [root@hla-ags ~]# dd if=/dev/mapper/vg1-hladata2 bs=512 count=1 | od -Ax -x >> 1+0 records in >> 000000 7970 6f72 746f 203a 2030 0a2f 500a 414c >> 000010 4e49 4b0a 3120 410a 560a 3120 0a34 6964 >> 000020 2072 2e31 2e30 3572 392f 3830 4b0a 3420 >> 000030 690a 746f 0a61 2056 3531 660a 6c69 2065 > > Neither is that - it's a bunch of text that doesn't > make much sense to me.... [esandeen@neon ~]$ echo "pyroot" | hexdump 0000000 7970 6f72 746f 000a python? *shrug* From owner-xfs@oss.sgi.com Fri Mar 16 12:37:36 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 12:37:40 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_40, J_CHICKENPOX_45 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2GJbX6p028624 for ; Fri, 16 Mar 2007 12:37:35 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id GAA19322; Sat, 17 Mar 2007 06:37:25 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2GJbNAf30512670; Sat, 17 Mar 2007 06:37:24 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2GJbMgo30545043; Sat, 17 Mar 2007 06:37:22 +1100 (AEDT) Date: Sat, 17 Mar 2007 06:37:22 +1100 From: David Chinner To: Thomas Walker Cc: David Chinner , xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? 
Message-ID: <20070316193722.GA5743@melbourne.sgi.com> References: <45F92D8C.3090708@stsci.edu> <20070315150422.7bc5d178@harpe.intellique.com> <45F952F2.6000008@stsci.edu> <20070315231031.GP6095633@melbourne.sgi.com> <45FAB480.908@stsci.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45FAB480.908@stsci.edu> User-Agent: Mutt/1.4.2.1i X-archive-position: 10864 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1833 Lines: 49 On Fri, Mar 16, 2007 at 11:15:12AM -0400, Thomas Walker wrote: > So maybe I got bit the same way. parted may be overwritten something > at the head of the volume. Doesn't look like partition blocks at the start of each volume, though. > Is there any way to repair the super block > though? It seems that everyone agrees xfs can't do anything until it > has a super block somewhere and I don't seem to have one. That's because repair can't work out where things are supposed to be without a superblock to tell it critical information. Manually trying to find and repair a superblock is a hit-and-miss affair - at this point we don't even know if the primary superblocks have been overwritten or whether something else is wrong with LVM... > If there's no > way to repair, then what about recovery? In a word: backups. > I see mention of possibly > doing an xfs dump to another disk, reformat the original volume, and > then xfs restore back. Is there any online procedure for how to do that > if it applies to me here? You need to be able to mount the filesystem to dump it, so until you can run repair there's no simple recovery option. If the lvm config is correct and repair cannot find a valid secondary superblock, then you really need to start doing dangerous things to try to recover. I'd suggest taking a copy of the lvm volumes before doing anything else.
Then, find a secondary superblock in the volume (first 4 bytes of the sector are "XFSB", i.e. 58 46 53 42 in hex) and copy that sector to block zero of the filesystem. If repair still won't do its stuff, then you need to use xfs_db to modify that superblock until it does. Then when repair runs, you get to look in lost+found and try to work out what all the broken bits are..... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Mar 16 13:00:06 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 13:00:11 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2GK016p001050 for ; Fri, 16 Mar 2007 13:00:05 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id GAA20651; Sat, 17 Mar 2007 06:59:57 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2GJxtAf30517548; Sat, 17 Mar 2007 06:59:55 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2GJxpFC30528272; Sat, 17 Mar 2007 06:59:51 +1100 (AEDT) Date: Sat, 17 Mar 2007 06:59:51 +1100 From: David Chinner To: Marco Berizzi Cc: David Chinner , linux-kernel@vger.kernel.org, xfs@oss.sgi.com Subject: Re: XFS internal error xfs_da_do_buf(2) at line 2087 of file fs/xfs/xfs_da_btree.c. 
Caller 0xc01b00bd Message-ID: <20070316195951.GB5743@melbourne.sgi.com> References: <20070316012520.GN5743@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-archive-position: 10865 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1166 Lines: 40 On Fri, Mar 16, 2007 at 12:05:33PM +0100, Marco Berizzi wrote: > David Chinner wrote: > > > can > > you post a url to the commit? > > http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.19.y.git;a=commit;h=7fbbb01dca7704d52ace6f45a805c98a5b0362f9 Ok, so an ipsec change. And I see from the history below it really has nothing to do with this problem. it seems the problem has something to do with changes between 2.6.19.1 and 2.6.19.2. There were no changes to XFS between 2.6.19.1 and 2.6.19.2, so I'm thinking that your problems are related to something other than XFS. Can you do a git bisect to determine what the bad patch is? > > Can you run xfs_repair on that filesystem and see if reports > > (and fixes) any problems? > > I don't need to run xfs_repair to fix the problem, Except that the trigger might be on-disk corruption so we need to rule that out first. > I only unplug the power cable and reboot the system, > xfs filesystem are correctly mounted. > However tell me if I must run xfs_repair to check > the filesystem. Yes, you need to run xfs_repair. Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Mar 16 13:09:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 13:09:08 -0700 (PDT) X-Spam-oss-Status: No, score=-0.3 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_45 autolearn=no version=3.2.0-pre1-r499012 Received: from donner.stsci.edu (donner.stsci.edu [130.167.251.65]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2GK926p002775 for ; Fri, 16 Mar 2007 13:09:04 -0700 Received: from comet.stsci.edu (comet.stsci.edu [130.167.251.67]) by donner.stsci.edu (MOS 3.8.3-GA) with ESMTP id FJO02632; Fri, 16 Mar 2007 16:07:40 -0400 (EDT) Received: (from comet.stsci.edu [69.250.187.193]) by comet.stsci.edu (MOS 3.8.3-GA) with HTTPS/1.1 id CNS43969 (AUTH walker); Fri, 16 Mar 2007 16:09:00 -0400 (EDT) From: Thomas Walker Subject: Re: Should xfs_repair take this long? To: David Chinner Cc: xfs@oss.sgi.com Reply-To: walker@stsci.edu X-Mailer: Mirapoint Webmail Direct 3.8.3-GA MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20070316160900.CNS43969@comet.stsci.edu> Date: Fri, 16 Mar 2007 16:09:00 -0400 (EDT) X-Junkmail-Whitelist: YES (by domain whitelist at donner.stsci.edu) X-archive-position: 10866 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: walker@stsci.edu Precedence: bulk X-list: xfs Content-Length: 3334 Lines: 67 I already had xfs_repair scan the entire 6TB (took it 56 hours, which is the reason for the subject line). So it couldn't find a SB anywhere on that volume and it walked all over it. Therefore I guess the SB has been overwritten by something, maybe parted. As for the LVM physicals being in the wrong order, I can try to reverse them but I'm really pretty sure I have it right. Still, since the scan by xfs_repair couldn't find a SB anywhere I don't know what I would gain. 
We don't have a backup of these volumes, but I'm told by the user that almost all the data can be retrieved again from our archive, it's just a pain in the neck to do that. So while it would be nice to recover it won't be critical. Before wrapping this up, if you could just clarify a couple things. If I look at the bytes at the beginning of each physical part of the LVM's, what am I looking for? "XFSB"? If I do find that byte string, why couldn't xfs_repair find it when it did the scan and what do I do with it if I do find one? We see a software product call ufsexplorer that claims to be able to recover data without an XFS super block, anybody try it? I appreciate your help and time, Thomas Walker ---- Original message ---- >Date: Sat, 17 Mar 2007 06:37:22 +1100 >From: David Chinner >Subject: Re: Should xfs_repair take this long? >To: Thomas Walker >Cc: David Chinner , xfs@oss.sgi.com > >On Fri, Mar 16, 2007 at 11:15:12AM -0400, Thomas Walker wrote: >> So maybe I got bit the same way. parted may be overwritten something >> at the head of the volume. > >Doesn't look like partition blocks at the start of each volume, though. > >> Is there any way to repair the super block >> though? It seems that everyone agrees xfs can't do anything until it >> has a super block somewhere and I don't seem to have one. > >That's beacuse repair can't work out where things are supposed to >be without a superblock to tell it critical information. >Manually trying to find and repair a superblock is a hit and miss >affair - at this point we don't even know if the primary superblocks >have been overwritten or whether something else is wrong with LVM... > >> If there's no >> way to repair, then what about recovery? > >In a word: backups. > >> I see mention of possibly >> doing an xfs dump to another disk, reformat the original volume, and >> then xfs restore back. Is there any online procedure for how to do that >> if it applies to me here? 
> >You need to be able to mount the filesystem to dump it, so until you >can run repair there's no simple recovery option. > >If the lvm config is correct and repair cannot find a valid >secondary superblock, then you really need to start doing dangerous >things to try to recover. i'd suggest taking a copy of the lvm >volumes before doing anything else. > >Then, find a secondary superblock in the volume (first 4 bytes of >the sector are "XFSB" in hex) and copy that sector to block zero of >the filesystem. If repair still won't do it's stuff, then you need >to use xfs_db to modify that superblock until it does. Then when >repair runs, you get to look in lost+found and try to work out what >all the broken bits are..... > >Cheers, > >Dave. >-- >Dave Chinner >Principal Engineer >SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Mar 16 13:52:25 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 13:52:29 -0700 (PDT) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_40 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2GKqL6p011912 for ; Fri, 16 Mar 2007 13:52:24 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id HAA23422; Sat, 17 Mar 2007 07:52:12 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2GKqAAf30559242; Sat, 17 Mar 2007 07:52:11 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2GKq8u630549098; Sat, 17 Mar 2007 07:52:08 +1100 (AEDT) Date: Sat, 17 Mar 2007 07:52:08 +1100 From: David Chinner To: Thomas Walker Cc: xfs@oss.sgi.com Subject: Re: Should xfs_repair take this long? 
Message-ID: <20070316205208.GT6095633@melbourne.sgi.com> References: <20070316160900.CNS43969@comet.stsci.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070316160900.CNS43969@comet.stsci.edu> User-Agent: Mutt/1.4.2.1i X-archive-position: 10867 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1846 Lines: 46 On Fri, Mar 16, 2007 at 04:09:00PM -0400, Thomas Walker wrote: > > I already had xfs_repair scan the entire 6TB (took it 56 hours, which is > the reason for the subject line). So it couldn't find a SB anywhere on > that volume and it walked all over it. Therefore I guess the SB has been > overwritten by something, maybe parted. As for the LVM physicals being in > the wrong order, I can try to reverse them but I'm really pretty sure I > have it right. Still, since the scan by xfs_repair couldn't find a SB > anywhere I don't know what I would gain. xfs_repair did find candidate secondary superblocks - it discarded them for some reason or another. If they were ok, all repair would have done is copy them to block zero and then continue. I'm suggesting that you manually do this step, and then see if repair will run. > Before wrapping this up, if you could just clarify a couple things. If I > look at the bytes at the beginning of each physical part of the LVM's, > what am I looking for? "XFSB"? Yes. > If I do find that byte string, why > couldn't xfs_repair find it when it did the scan and what do I do with it > if I do find one? As I said above, xfs_repair did find some, but rejected them for some (unknown) reason. If you find one, copy it over block zero of the partition, and see if repair will run. Like I said, though, you'll probably want to back up the partition first, or at least run repair in no-modify mode. 
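The manual search described above could be sketched roughly as follows. This is only an illustration, not what xfs_repair does internally: it assumes 512-byte sectors, opens the volume read-only, and merely lists sectors that begin with the "XFSB" magic. A match is a candidate only, since repair may still reject it. The function name and its parameters are made up for this sketch.

```python
XFS_SB_MAGIC = b"XFSB"   # first four bytes of an XFS superblock
SECTOR_SIZE = 512        # assumption: 512-byte sectors


def find_superblock_candidates(path, sector_size=SECTOR_SIZE,
                               chunk_sectors=2048, limit=None):
    """Return byte offsets of sectors in `path` that start with the XFS magic.

    `path` may be a block device or an image copy of one; it is opened
    read-only. `limit` (hypothetical convenience) stops after that many hits.
    """
    offsets = []
    chunk_bytes = sector_size * chunk_sectors
    base = 0  # byte offset of the current chunk within the volume
    with open(path, "rb") as dev:
        while True:
            chunk = dev.read(chunk_bytes)
            if not chunk:
                break
            # Only look at sector-aligned positions, as the advice above says.
            for rel in range(0, len(chunk), sector_size):
                if chunk[rel:rel + 4] == XFS_SB_MAGIC:
                    offsets.append(base + rel)
                    if limit is not None and len(offsets) >= limit:
                        return offsets
            base += len(chunk)
    return offsets
```

Calling, say, find_superblock_candidates("/dev/mapper/vg0-hladata3", limit=8) would return byte offsets; dividing each by 512 gives the sector number to inspect. It is safer to run this against a copy of the volume, and to check the result with xfs_repair -n before letting anything write.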
> We see a software product call ufsexplorer that claims > to be able to recover data without an XFS super block, anybody try it? Given that a) it runs on windows, and b) XFS support was apparently adding only a week ago, I doubt there's many ppl here that have tried it.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Mar 16 17:36:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 17:36:27 -0700 (PDT) X-Spam-oss-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00, J_CHICKENPOX_43 autolearn=no version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2H0aL6p017384 for ; Fri, 16 Mar 2007 17:36:22 -0700 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l2H0aHb2017463 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Sat, 17 Mar 2007 01:36:17 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l2H0aHaD017461; Sat, 17 Mar 2007 01:36:17 +0100 Date: Sat, 17 Mar 2007 01:36:17 +0100 From: Christoph Hellwig To: Timothy Shimmin Cc: Christoph Hellwig , xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 2/2] xfs: stop using kmalloc in xfs_buf_get_noaddr Message-ID: <20070317003617.GB17362@lst.de> References: <20070307101324.GC30587@lst.de> <20070309115511.GA20426@lst.de> <73E41C01F3C8F79AD31CEDAF@timothy-shimmins-power-mac-g5.local> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <73E41C01F3C8F79AD31CEDAF@timothy-shimmins-power-mac-g5.local> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 10869 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 
3729 Lines: 125 > It looks like you might need: for (i--; i >= 0; i--) > (or: for (j = 0; j < i; j++) etc.) > > Because if the initial alloc_page loop goes to completion then: > i == pagecount > and if alloc_page loop terminates early then > bp->b_pages[i] == NULL > So we have gone 1 too far in both cases and need to > start free'ing back one. > Unless I missed something. No, I was missing something :) Here's the updated version: Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-03-16 15:32:20.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c 2007-03-16 15:35:10.000000000 +0100 @@ -314,7 +314,7 @@ xfs_buf_free( ASSERT(list_empty(&bp->b_hash_list)); - if (bp->b_flags & _XBF_PAGE_CACHE) { + if (bp->b_flags & (_XBF_PAGE_CACHE|_XBF_PAGES)) { uint i; if ((bp->b_flags & XBF_MAPPED) && (bp->b_page_count > 1)) @@ -323,18 +323,11 @@ xfs_buf_free( for (i = 0; i < bp->b_page_count; i++) { struct page *page = bp->b_pages[i]; - ASSERT(!PagePrivate(page)); + if (bp->b_flags & _XBF_PAGE_CACHE) + ASSERT(!PagePrivate(page)); page_cache_release(page); } _xfs_buf_free_pages(bp); - } else if (bp->b_flags & _XBF_KMEM_ALLOC) { - /* - * XXX(hch): bp->b_count_desired might be incorrect (see - * xfs_buf_associate_memory for details), but fortunately - * the Linux version of kmem_free ignores the len argument.. 
- */ - kmem_free(bp->b_addr, bp->b_count_desired); - _xfs_buf_free_pages(bp); } xfs_buf_deallocate(bp); @@ -764,41 +757,41 @@ xfs_buf_get_noaddr( size_t len, xfs_buftarg_t *target) { - size_t malloc_len = len; + unsigned long page_count = PAGE_ALIGN(len) >> PAGE_SHIFT; + int error, i; xfs_buf_t *bp; - void *data; - int error; bp = xfs_buf_allocate(0); if (unlikely(bp == NULL)) goto fail; _xfs_buf_initialize(bp, target, 0, len, 0); - try_again: - data = kmem_alloc(malloc_len, KM_SLEEP | KM_MAYFAIL | KM_LARGE); - if (unlikely(data == NULL)) + error = _xfs_buf_get_pages(bp, page_count, 0); + if (error) goto fail_free_buf; - /* check whether alignment matches.. */ - if ((__psunsigned_t)data != - ((__psunsigned_t)data & ~target->bt_smask)) { - /* .. else double the size and try again */ - kmem_free(data, malloc_len); - malloc_len <<= 1; - goto try_again; - } - - error = xfs_buf_associate_memory(bp, data, len); - if (error) + for (i = 0; i < page_count; i++) { + bp->b_pages[i] = alloc_page(GFP_KERNEL); + if (!bp->b_pages[i]) + goto fail_free_mem; + } + bp->b_flags |= _XBF_PAGES; + + error = _xfs_buf_map_pages(bp, XBF_MAPPED); + if (unlikely(error)) { + printk(KERN_WARNING "%s: failed to map pages\n", + __FUNCTION__); goto fail_free_mem; - bp->b_flags |= _XBF_KMEM_ALLOC; + } xfs_buf_unlock(bp); XB_TRACE(bp, "no_daddr", data); return bp; + fail_free_mem: - kmem_free(data, malloc_len); + while (--i >= 0) + __free_page(bp->b_pages[i]); fail_free_buf: xfs_buf_free(bp); fail: Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.h =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.h 2007-03-13 18:18:05.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.h 2007-03-16 15:34:20.000000000 +0100 @@ -63,7 +63,7 @@ typedef enum { /* flags used only internally */ _XBF_PAGE_CACHE = (1 << 17),/* backed by pagecache */ - _XBF_KMEM_ALLOC = (1 << 18),/* backed by kmem_alloc() */ + _XBF_PAGES = (1 << 18), /* backed by refcounted pages */ 
_XBF_RUN_QUEUES = (1 << 19),/* run block device task queue */ _XBF_DELWRI_Q = (1 << 21), /* buffer on delwri queue */ } xfs_buf_flags_t; From owner-xfs@oss.sgi.com Fri Mar 16 17:35:36 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 17:35:42 -0700 (PDT) X-Spam-oss-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_40 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2H0ZX6p017138 for ; Fri, 16 Mar 2007 17:35:35 -0700 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l2H0ZRb2017427 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Sat, 17 Mar 2007 01:35:27 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l2H0ZOl6017425; Sat, 17 Mar 2007 01:35:24 +0100 Date: Sat, 17 Mar 2007 01:35:24 +0100 From: Christoph Hellwig To: David Chinner Cc: Christoph Hellwig , xfs@oss.sgi.com, ecashin@coraid.com, akpm@osdl.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 1/2] xfs: use xfs_get_buf_noaddr for iclogs Message-ID: <20070317003524.GA17362@lst.de> References: <20070307101314.GB30587@lst.de> <20070312044117.GK6095633@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070312044117.GK6095633@melbourne.sgi.com> User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 10868 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 2494 Lines: 72 On Mon, Mar 12, 2007 at 03:41:17PM +1100, David Chinner wrote: > OTOH, all other buffers are supposed to be locked when under I/O. > This change makes a special case for the log buffers, and I'd prefer > not to have to remember that this behaviour changed fo log buffers > at some point in time. 
> > I suggest that adding: ... > + XFS_BUF_PSEMA(bp, PRIBIO); ... > To lock the buffer should be added here. That way we don't change > any semantics of the code at all. Here's a patch with your suggestion implemented. Seems to work fine under heavy NFS load for me. Note that the log recovery has some inconsistancies already about doing I/O both on locked and unlocked buffers. Long-term it might be a good idea to change xfs_get_buf_noaddr to return a locked buffer like xfs_get_buf(_flags) does already. Index: linux-2.6/fs/xfs/xfs_log.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_log.c 2007-03-16 15:21:43.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_log.c 2007-03-16 15:34:15.000000000 +0100 @@ -1199,11 +1199,18 @@ xlog_alloc_log(xfs_mount_t *mp, *iclogp = (xlog_in_core_t *) kmem_zalloc(sizeof(xlog_in_core_t), KM_SLEEP); iclog = *iclogp; - iclog->hic_data = (xlog_in_core_2_t *) - kmem_zalloc(iclogsize, KM_SLEEP | KM_LARGE); - iclog->ic_prev = prev_iclog; prev_iclog = iclog; + + bp = xfs_buf_get_noaddr(log->l_iclog_size, mp->m_logdev_targp); + if (!XFS_BUF_CPSEMA(bp)) + ASSERT(0); + XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); + XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); + XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); + iclog->ic_bp = bp; + iclog->hic_data = bp->b_addr; + log->l_iclog_bak[i] = (xfs_caddr_t)&(iclog->ic_header); head = &iclog->ic_header; @@ -1216,11 +1223,6 @@ xlog_alloc_log(xfs_mount_t *mp, INT_SET(head->h_fmt, ARCH_CONVERT, XLOG_FMT); memcpy(&head->h_fs_uuid, &mp->m_sb.sb_uuid, sizeof(uuid_t)); - bp = xfs_buf_get_empty(log->l_iclog_size, mp->m_logdev_targp); - XFS_BUF_SET_IODONE_FUNC(bp, xlog_iodone); - XFS_BUF_SET_BDSTRAT_FUNC(bp, xlog_bdstrat_cb); - XFS_BUF_SET_FSPRIVATE2(bp, (unsigned long)1); - iclog->ic_bp = bp; iclog->ic_size = XFS_BUF_SIZE(bp) - log->l_iclog_hsize; iclog->ic_state = XLOG_STATE_ACTIVE; @@ -1528,7 +1530,6 @@ xlog_dealloc_log(xlog_t *log) } #endif next_iclog = iclog->ic_next; - 
kmem_free(iclog->hic_data, log->l_iclog_size); kmem_free(iclog, sizeof(xlog_in_core_t)); iclog = next_iclog; } From owner-xfs@oss.sgi.com Fri Mar 16 18:11:17 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 18:11:23 -0700 (PDT) X-Spam-oss-Status: No, score=-0.2 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from home.jason.bur.st (ppp76-251.lns1.mel3.internode.on.net [59.167.76.251]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2H1BF6p024226 for ; Fri, 16 Mar 2007 18:11:17 -0700 Received: by home.jason.bur.st (Postfix, from userid 1000) id 09412780B692; Sat, 17 Mar 2007 11:47:32 +1100 (EST) Date: Sat, 17 Mar 2007 11:47:31 +1100 From: Jason White To: linux-xfs@oss.sgi.com Subject: Re: Questions about XFS Message-ID: <20070317004731.GA5236@jdc.local> Mail-Followup-To: linux-xfs@oss.sgi.com References: <200703131440.56678.clflush@chello.be> <45F8CAEA.3050408@list.rakugaki.org> <200703151007.32630.clflush@chello.be> <200703161136.32234.Martin@lichtvoll.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200703161136.32234.Martin@lichtvoll.de> User-Agent: Mutt/1.5.13 (2006-08-11) X-archive-position: 10870 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jasonjgw@internode.on.net Precedence: bulk X-list: xfs Content-Length: 788 Lines: 15 On Fri, Mar 16, 2007 at 11:36:31AM +0100, Martin Steigerwald wrote: > Since 2.6.17.7 and enabled write barriers I didn't loose meta data > consistency on my laptop anymore and I can tell you that it crashed a lot > due to my experiments with what not (especially OSS radeon drivers and > beryl;-). I also had some classical power outages. My laptop also supports write barriers, but I leave the battery in place in case there's a power outage; effectively it's operating as a UPS. 
This might be slightly off-topic, but in choosing a SATA drive for a desktop machine, what features/standards compliance should one look for in order to ensure that write barriers work? I know this involves flushing the drive cache, but is this support mandatory in any of the applicable standards? From owner-xfs@oss.sgi.com Fri Mar 16 18:39:16 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 18:39:22 -0700 (PDT) X-Spam-oss-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2H1dE6p030266 for ; Fri, 16 Mar 2007 18:39:15 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1HSNQp-0007Qb-Kf; Sat, 17 Mar 2007 01:10:03 +0000 Date: Sat, 17 Mar 2007 01:10:03 +0000 From: Christoph Hellwig To: nfs@lists.sourceforge.net, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com Subject: [PATCH 1/18] xfs: kill struct fid/fid_t namespace pollution Message-ID: <20070317011003.GB24947@infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 10871 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 10212 Lines: 282 XFS currently defines various fid types internally, including an fid_t which I will introduce kernel wide in the next patch. This patch kills the fid_t and xfs_fid2_t types inside xfs and uses xfs_fid_t consistently. 
Signed-off-by: Christoph Hellwig Index: linux-2.6/fs/xfs/linux-2.6/xfs_export.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_export.c 2007-03-13 18:21:13.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_export.c 2007-03-13 18:21:55.000000000 +0100 @@ -48,8 +48,8 @@ xfs_fs_decode_fh( struct dentry *de), void *context) { - xfs_fid2_t ifid; - xfs_fid2_t pfid; + xfs_fid_t ifid; + xfs_fid_t pfid; void *parent = NULL; int is64 = 0; __u32 *p = fh; @@ -141,7 +141,7 @@ xfs_fs_get_dentry( bhv_vfs_t *vfsp = vfs_from_sb(sb); int error; - error = bhv_vfs_vget(vfsp, &vp, (fid_t *)data); + error = bhv_vfs_vget(vfsp, &vp, data); if (error || vp == NULL) return ERR_PTR(-ESTALE) ; Index: linux-2.6/fs/xfs/linux-2.6/xfs_vfs.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_vfs.c 2007-03-13 18:21:13.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_vfs.c 2007-03-13 18:21:55.000000000 +0100 @@ -145,7 +145,7 @@ int vfs_vget( struct bhv_desc *bdp, struct bhv_vnode **vpp, - struct fid *fidp) + struct xfs_fid *fidp) { struct bhv_desc *next = bdp; Index: linux-2.6/fs/xfs/linux-2.6/xfs_vfs.h =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_vfs.h 2007-03-13 18:21:13.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_vfs.h 2007-03-13 18:21:55.000000000 +0100 @@ -24,10 +24,10 @@ struct bhv_vfs; struct bhv_vnode; -struct fid; struct cred; struct seq_file; struct super_block; +struct xfs_fid; struct xfs_mount_args; typedef struct kstatfs bhv_statvfs_t; @@ -112,7 +112,8 @@ typedef int (*vfs_root_t)(bhv_desc_t *, typedef int (*vfs_statvfs_t)(bhv_desc_t *, bhv_statvfs_t *, struct bhv_vnode *); typedef int (*vfs_sync_t)(bhv_desc_t *, int, struct cred *); -typedef int (*vfs_vget_t)(bhv_desc_t *, struct bhv_vnode **, struct fid *); +typedef int (*vfs_vget_t)(bhv_desc_t *, struct bhv_vnode **, + struct xfs_fid *); typedef 
int (*vfs_dmapiops_t)(bhv_desc_t *, caddr_t); typedef int (*vfs_quotactl_t)(bhv_desc_t *, int, int, caddr_t); typedef void (*vfs_init_vnode_t)(bhv_desc_t *, @@ -183,7 +184,7 @@ extern int vfs_mntupdate(bhv_desc_t *, i extern int vfs_root(bhv_desc_t *, struct bhv_vnode **); extern int vfs_statvfs(bhv_desc_t *, bhv_statvfs_t *, struct bhv_vnode *); extern int vfs_sync(bhv_desc_t *, int, struct cred *); -extern int vfs_vget(bhv_desc_t *, struct bhv_vnode **, struct fid *); +extern int vfs_vget(bhv_desc_t *, struct bhv_vnode **, struct xfs_fid *); extern int vfs_dmapiops(bhv_desc_t *, caddr_t); extern int vfs_quotactl(bhv_desc_t *, int, int, caddr_t); extern void vfs_init_vnode(bhv_desc_t *, struct bhv_vnode *, bhv_desc_t *, int); Index: linux-2.6/fs/xfs/xfs_fs.h =================================================================== --- linux-2.6.orig/fs/xfs/xfs_fs.h 2007-03-13 18:21:13.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_fs.h 2007-03-13 18:21:55.000000000 +0100 @@ -387,30 +387,13 @@ typedef struct xfs_fsop_attrmulti_handle */ typedef struct { __u32 val[2]; } xfs_fsid_t; /* file system id type */ - -#ifndef HAVE_FID -#define MAXFIDSZ 46 - -typedef struct fid { - __u16 fid_len; /* length of data in bytes */ - unsigned char fid_data[MAXFIDSZ]; /* data (fid_len worth) */ -} fid_t; -#endif - typedef struct xfs_fid { - __u16 xfs_fid_len; /* length of remainder */ - __u16 xfs_fid_pad; - __u32 xfs_fid_gen; /* generation number */ - __u64 xfs_fid_ino; /* 64 bits inode number */ + __u16 fid_len; /* length of remainder */ + __u16 fid_pad; + __u32 fid_gen; /* generation number */ + __u64 fid_ino; /* 64 bits inode number */ } xfs_fid_t; -typedef struct xfs_fid2 { - __u16 fid_len; /* length of remainder */ - __u16 fid_pad; /* padding, must be zero */ - __u32 fid_gen; /* generation number */ - __u64 fid_ino; /* inode number */ -} xfs_fid2_t; - typedef struct xfs_handle { union { __s64 align; /* force alignment of ha_fid */ @@ -420,9 +403,9 @@ typedef struct xfs_handle { } 
xfs_handle_t; #define ha_fsid ha_u._ha_fsid -#define XFS_HSIZE(handle) (((char *) &(handle).ha_fid.xfs_fid_pad \ +#define XFS_HSIZE(handle) (((char *) &(handle).ha_fid.fid_pad \ - (char *) &(handle)) \ - + (handle).ha_fid.xfs_fid_len) + + (handle).ha_fid.fid_len) /* * Flags for going down operation Index: linux-2.6/fs/xfs/xfs_vfsops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_vfsops.c 2007-03-13 18:21:13.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_vfsops.c 2007-03-13 18:21:55.000000000 +0100 @@ -1565,10 +1565,9 @@ STATIC int xfs_vget( bhv_desc_t *bdp, bhv_vnode_t **vpp, - fid_t *fidp) + xfs_fid_t *xfid) { xfs_mount_t *mp = XFS_BHVTOM(bdp); - xfs_fid_t *xfid = (struct xfs_fid *)fidp; xfs_inode_t *ip; int error; xfs_ino_t ino; @@ -1578,11 +1577,11 @@ xfs_vget( * Invalid. Since handles can be created in user space and passed in * via gethandle(), this is not cause for a panic. */ - if (xfid->xfs_fid_len != sizeof(*xfid) - sizeof(xfid->xfs_fid_len)) + if (xfid->fid_len != sizeof(*xfid) - sizeof(xfid->fid_len)) return XFS_ERROR(EINVAL); - ino = xfid->xfs_fid_ino; - igen = xfid->xfs_fid_gen; + ino = xfid->fid_ino; + igen = xfid->fid_gen; /* * NFS can sometimes send requests for ino 0. Fail them gracefully. Index: linux-2.6/fs/xfs/xfs_vnodeops.c =================================================================== --- linux-2.6.orig/fs/xfs/xfs_vnodeops.c 2007-03-13 18:21:13.000000000 +0100 +++ linux-2.6/fs/xfs/xfs_vnodeops.c 2007-03-13 18:21:55.000000000 +0100 @@ -3577,28 +3577,18 @@ std_return: goto std_return; } - -/* - * xfs_fid2 - * - * A fid routine that takes a pointer to a previously allocated - * fid structure (like xfs_fast_fid) but uses a 64 bit inode number. 
- */ STATIC int xfs_fid2( bhv_desc_t *bdp, - fid_t *fidp) + xfs_fid_t *xfid) { xfs_inode_t *ip; - xfs_fid2_t *xfid; vn_trace_entry(BHV_TO_VNODE(bdp), __FUNCTION__, (inst_t *)__return_address); - ASSERT(sizeof(fid_t) >= sizeof(xfs_fid2_t)); - xfid = (xfs_fid2_t *)fidp; ip = XFS_BHVTOI(bdp); - xfid->fid_len = sizeof(xfs_fid2_t) - sizeof(xfid->fid_len); + xfid->fid_len = sizeof(xfs_fid_t) - sizeof(xfid->fid_len); xfid->fid_pad = 0; /* * use memcpy because the inode is a long long and there's no Index: linux-2.6/fs/xfs/linux-2.6/xfs_vnode.h =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_vnode.h 2007-03-13 18:21:13.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_vnode.h 2007-03-13 18:21:55.000000000 +0100 @@ -176,7 +176,7 @@ typedef int (*vop_readlink_t)(bhv_desc_t typedef int (*vop_fsync_t)(bhv_desc_t *, int, struct cred *, xfs_off_t, xfs_off_t); typedef int (*vop_inactive_t)(bhv_desc_t *, struct cred *); -typedef int (*vop_fid2_t)(bhv_desc_t *, struct fid *); +typedef int (*vop_fid2_t)(bhv_desc_t *, struct xfs_fid *); typedef int (*vop_release_t)(bhv_desc_t *); typedef int (*vop_rwlock_t)(bhv_desc_t *, bhv_vrwlock_t); typedef void (*vop_rwunlock_t)(bhv_desc_t *, bhv_vrwlock_t); Index: linux-2.6/fs/xfs/linux-2.6/xfs_ioctl.c =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_ioctl.c 2007-03-13 18:21:13.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_ioctl.c 2007-03-13 18:21:55.000000000 +0100 @@ -150,11 +150,11 @@ xfs_find_handle( lock_mode = xfs_ilock_map_shared(ip); /* fill in fid section of handle from inode */ - handle.ha_fid.xfs_fid_len = sizeof(xfs_fid_t) - - sizeof(handle.ha_fid.xfs_fid_len); - handle.ha_fid.xfs_fid_pad = 0; - handle.ha_fid.xfs_fid_gen = ip->i_d.di_gen; - handle.ha_fid.xfs_fid_ino = ip->i_ino; + handle.ha_fid.fid_len = sizeof(xfs_fid_t) - + sizeof(handle.ha_fid.fid_len); + handle.ha_fid.fid_pad = 0; + handle.ha_fid.fid_gen = 
ip->i_d.di_gen; + handle.ha_fid.fid_ino = ip->i_ino; xfs_iunlock_map_shared(ip, lock_mode); @@ -220,10 +220,10 @@ xfs_vget_fsop_handlereq( if (hlen < sizeof(*handlep)) memset(((char *)handlep) + hlen, 0, sizeof(*handlep) - hlen); if (hlen > sizeof(handlep->ha_fsid)) { - if (handlep->ha_fid.xfs_fid_len != - (hlen - sizeof(handlep->ha_fsid) - - sizeof(handlep->ha_fid.xfs_fid_len)) - || handlep->ha_fid.xfs_fid_pad) + if (handlep->ha_fid.fid_len != + (hlen - sizeof(handlep->ha_fsid) - + sizeof(handlep->ha_fid.fid_len)) || + handlep->ha_fid.fid_pad) return XFS_ERROR(EINVAL); } @@ -231,9 +231,9 @@ xfs_vget_fsop_handlereq( * Crack the handle, obtain the inode # & generation # */ xfid = (struct xfs_fid *)&handlep->ha_fid; - if (xfid->xfs_fid_len == sizeof(*xfid) - sizeof(xfid->xfs_fid_len)) { - ino = xfid->xfs_fid_ino; - igen = xfid->xfs_fid_gen; + if (xfid->fid_len == sizeof(*xfid) - sizeof(xfid->fid_len)) { + ino = xfid->fid_ino; + igen = xfid->fid_gen; } else { return XFS_ERROR(EINVAL); } Index: linux-2.6/fs/xfs/linux-2.6/xfs_export.h =================================================================== --- linux-2.6.orig/fs/xfs/linux-2.6/xfs_export.h 2007-03-13 18:23:42.000000000 +0100 +++ linux-2.6/fs/xfs/linux-2.6/xfs_export.h 2007-03-13 18:23:59.000000000 +0100 @@ -71,13 +71,13 @@ xfs_fileid_length(int hasparent, int is6 /* * Decode encoded inode information (either for the inode itself - * or the parent) into an xfs_fid2_t structure. Advances and + * or the parent) into an xfs_fid_t structure. 
Advances and
  * returns the new data pointer
  */
 static inline __u32 *
-xfs_fileid_decode_fid2(__u32 *p, xfs_fid2_t *fid, int is64)
+xfs_fileid_decode_fid2(__u32 *p, xfs_fid_t *fid, int is64)
 {
-	fid->fid_len = sizeof(xfs_fid2_t) - sizeof(fid->fid_len);
+	fid->fid_len = sizeof(xfs_fid_t) - sizeof(fid->fid_len);
 	fid->fid_pad = 0;
 	fid->fid_ino = *p++;
 #if XFS_BIG_INUMS

From owner-xfs@oss.sgi.com Fri Mar 16 19:27:14 2007
Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 19:27:18 -0700 (PDT)
X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_45,J_CHICKENPOX_61,J_CHICKENPOX_62,J_CHICKENPOX_63, J_CHICKENPOX_65,SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012
Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2H2RC6p006313 for ; Fri, 16 Mar 2007 19:27:13 -0700
Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 529FE1807DF1F for ; Fri, 16 Mar 2007 21:27:11 -0500 (CDT)
Message-ID: <45FB51FE.3060400@sandeen.net>
Date: Fri, 16 Mar 2007 21:27:10 -0500
From: Eric Sandeen
User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221)
MIME-Version: 1.0
To: xfs@oss.sgi.com
Subject: Re: [PATCH] get rid of fname[] for tracing functions
References: <45EE2EF3.8090707@sandeen.net>
In-Reply-To: <45EE2EF3.8090707@sandeen.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 10872
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: sandeen@sandeen.net
Precedence: bulk
X-list: xfs
Content-Length: 50087
Lines: 1359

slight quilt misfire... here's an updated version

------------------------------------------------------------------------

Get rid of fname[] arrays for tracing code, and just use gcc's __FUNCTION__ instead.
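For readers skimming the diff, the idea can be sketched in isolation. This is only an illustration, not code from the patch: demo_trace_enter(), DEMO_TRACE(), demo_lookup() and last_trace are hypothetical names, and the sketch uses the C99 spelling __func__, for which gcc's __FUNCTION__ is another name. The identifier behaves like a const char array, which is presumably why the trace helpers in the patch change their first argument from "char *" to "const char *".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a ktrace-style sink: keeps the last entry. */
static char last_trace[128];

static void
demo_trace_enter(
	const char	*func,	/* function tag string */
	const char	*s,	/* additional string */
	int		line)	/* source line number */
{
	assert(func != NULL && s != NULL);
	snprintf(last_trace, sizeof(last_trace), "%s:%s:%d", func, s, line);
}

/*
 * Old style: every traced function carried its own name by hand, e.g.
 *	static char fname[] = "demo_lookup";
 *	demo_trace_enter(fname, "entry", __LINE__);
 * New style: the macro picks up the caller's name from the compiler.
 */
#define DEMO_TRACE(s)	demo_trace_enter(__func__, s, __LINE__)

static const char *
demo_lookup(void)
{
	DEMO_TRACE("entry");	/* expands inside demo_lookup() */
	return last_trace;
}
```

Calling demo_lookup() leaves a tag of the form "demo_lookup:entry:<line>" in last_trace, with no per-function string to keep in sync when a function is renamed.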
 xfs_alloc.c      |   53 ++--------
 xfs_bmap.c       |  266 +++++++++++++++++++++++--------------------------------
 xfs_bmap.h       |    6 -
 xfs_bmap_btree.c |   88 +++---------------
 xfs_inode.c      |    8 -
 5 files changed, 149 insertions(+), 272 deletions(-)

Signed-off-by: Eric Sandeen

Index: linux/fs/xfs/xfs_bmap_btree.c
===================================================================
--- linux.orig/fs/xfs/xfs_bmap_btree.c
+++ linux/fs/xfs/xfs_bmap_btree.c
@@ -76,7 +76,7 @@ static char EXIT[] = "exit";
  */
 STATIC void
 xfs_bmbt_trace_enter(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	char *s,
 	int type,
@@ -117,7 +117,7 @@ xfs_bmbt_trace_enter(
  */
 STATIC void
 xfs_bmbt_trace_argbi(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	xfs_buf_t *b,
 	int i,
@@ -134,7 +134,7 @@ xfs_bmbt_trace_argbi(
  */
 STATIC void
 xfs_bmbt_trace_argbii(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	xfs_buf_t *b,
 	int i0,
@@ -153,7 +153,7 @@ xfs_bmbt_trace_argbii(
  */
 STATIC void
 xfs_bmbt_trace_argfffi(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	xfs_dfiloff_t o,
 	xfs_dfsbno_t b,
@@ -172,7 +172,7 @@ xfs_bmbt_trace_argfffi(
  */
 STATIC void
 xfs_bmbt_trace_argi(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	int i,
 	int line)
@@ -188,7 +188,7 @@ xfs_bmbt_trace_argi(
  */
 STATIC void
 xfs_bmbt_trace_argifk(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	int i,
 	xfs_fsblock_t f,
@@ -206,7 +206,7 @@ xfs_bmbt_trace_argifk(
  */
 STATIC void
 xfs_bmbt_trace_argifr(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	int i,
 	xfs_fsblock_t f,
@@ -235,7 +235,7 @@ xfs_bmbt_trace_argifr(
  */
 STATIC void
 xfs_bmbt_trace_argik(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	int i,
 	xfs_bmbt_key_t *k,
@@ -255,7 +255,7 @@ xfs_bmbt_trace_argik(
  */
 STATIC void
 xfs_bmbt_trace_cursor(
-	char *func,
+	const char *func,
 	xfs_btree_cur_t *cur,
 	char *s,
 	int line)
@@ -274,21 +274,21 @@ xfs_bmbt_trace_cursor(
 }

 #define XFS_BMBT_TRACE_ARGBI(c,b,i) \
-	xfs_bmbt_trace_argbi(fname, c, b, i,
__LINE__) + xfs_bmbt_trace_argbi(__FUNCTION__, c, b, i, __LINE__) #define XFS_BMBT_TRACE_ARGBII(c,b,i,j) \ - xfs_bmbt_trace_argbii(fname, c, b, i, j, __LINE__) + xfs_bmbt_trace_argbii(__FUNCTION__, c, b, i, j, __LINE__) #define XFS_BMBT_TRACE_ARGFFFI(c,o,b,i,j) \ - xfs_bmbt_trace_argfffi(fname, c, o, b, i, j, __LINE__) + xfs_bmbt_trace_argfffi(__FUNCTION__, c, o, b, i, j, __LINE__) #define XFS_BMBT_TRACE_ARGI(c,i) \ - xfs_bmbt_trace_argi(fname, c, i, __LINE__) + xfs_bmbt_trace_argi(__FUNCTION__, c, i, __LINE__) #define XFS_BMBT_TRACE_ARGIFK(c,i,f,s) \ - xfs_bmbt_trace_argifk(fname, c, i, f, s, __LINE__) + xfs_bmbt_trace_argifk(__FUNCTION__, c, i, f, s, __LINE__) #define XFS_BMBT_TRACE_ARGIFR(c,i,f,r) \ - xfs_bmbt_trace_argifr(fname, c, i, f, r, __LINE__) + xfs_bmbt_trace_argifr(__FUNCTION__, c, i, f, r, __LINE__) #define XFS_BMBT_TRACE_ARGIK(c,i,k) \ - xfs_bmbt_trace_argik(fname, c, i, k, __LINE__) + xfs_bmbt_trace_argik(__FUNCTION__, c, i, k, __LINE__) #define XFS_BMBT_TRACE_CURSOR(c,s) \ - xfs_bmbt_trace_cursor(fname, c, s, __LINE__) + xfs_bmbt_trace_cursor(__FUNCTION__, c, s, __LINE__) #else #define XFS_BMBT_TRACE_ARGBI(c,b,i) #define XFS_BMBT_TRACE_ARGBII(c,b,i,j) @@ -318,9 +318,6 @@ xfs_bmbt_delrec( xfs_fsblock_t bno; /* fs-relative block number */ xfs_buf_t *bp; /* buffer for block */ int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_delrec"; -#endif int i; /* loop counter */ int j; /* temp state */ xfs_bmbt_key_t key; /* bmap btree key */ @@ -694,9 +691,6 @@ xfs_bmbt_insrec( xfs_bmbt_block_t *block; /* bmap btree block */ xfs_buf_t *bp; /* buffer for block */ int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_insrec"; -#endif int i; /* loop index */ xfs_bmbt_key_t key; /* bmap btree key */ xfs_bmbt_key_t *kp=NULL; /* pointer to bmap btree key */ @@ -881,9 +875,6 @@ xfs_bmbt_killroot( #ifdef DEBUG int error; #endif -#ifdef XFS_BMBT_TRACE - static char fname[] = 
"xfs_bmbt_killroot"; -#endif int i; xfs_bmbt_key_t *kp; xfs_inode_t *ip; @@ -973,9 +964,6 @@ xfs_bmbt_log_keys( int kfirst, int klast) { -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_log_keys"; -#endif xfs_trans_t *tp; XFS_BMBT_TRACE_CURSOR(cur, ENTRY); @@ -1012,9 +1000,6 @@ xfs_bmbt_log_ptrs( int pfirst, int plast) { -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_log_ptrs"; -#endif xfs_trans_t *tp; XFS_BMBT_TRACE_CURSOR(cur, ENTRY); @@ -1055,9 +1040,6 @@ xfs_bmbt_lookup( xfs_daddr_t d; xfs_sfiloff_t diff; int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_lookup"; -#endif xfs_fsblock_t fsbno=0; int high; int i; @@ -1195,9 +1177,6 @@ xfs_bmbt_lshift( int *stat) /* success/failure */ { int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_lshift"; -#endif #ifdef DEBUG int i; /* loop counter */ #endif @@ -1331,9 +1310,6 @@ xfs_bmbt_rshift( int *stat) /* success/failure */ { int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_rshift"; -#endif int i; /* loop counter */ xfs_bmbt_key_t key; /* bmap btree key */ xfs_buf_t *lbp; /* left buffer pointer */ @@ -1492,9 +1468,6 @@ xfs_bmbt_split( { xfs_alloc_arg_t args; /* block allocation args */ int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_split"; -#endif int i; /* loop counter */ xfs_fsblock_t lbno; /* left sibling block number */ xfs_buf_t *lbp; /* left buffer pointer */ @@ -1641,9 +1614,6 @@ xfs_bmbt_updkey( #ifdef DEBUG int error; #endif -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_updkey"; -#endif xfs_bmbt_key_t *kp; int ptr; @@ -1712,9 +1682,6 @@ xfs_bmbt_decrement( xfs_bmbt_block_t *block; xfs_buf_t *bp; int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_decrement"; -#endif xfs_fsblock_t fsbno; int lev; xfs_mount_t *mp; @@ -1785,9 +1752,6 @@ xfs_bmbt_delete( int *stat) /* 
success/failure */ { int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_delete"; -#endif int i; int level; @@ -2000,9 +1964,6 @@ xfs_bmbt_increment( xfs_bmbt_block_t *block; xfs_buf_t *bp; int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_increment"; -#endif xfs_fsblock_t fsbno; int lev; xfs_mount_t *mp; @@ -2080,9 +2041,6 @@ xfs_bmbt_insert( int *stat) /* success/failure */ { int error; /* error return value */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_insert"; -#endif int i; int level; xfs_fsblock_t nbno; @@ -2142,9 +2100,6 @@ xfs_bmbt_log_block( int fields) { int first; -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_log_block"; -#endif int last; xfs_trans_t *tp; static const short offsets[] = { @@ -2181,9 +2136,6 @@ xfs_bmbt_log_recs( { xfs_bmbt_block_t *block; int first; -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_log_recs"; -#endif int last; xfs_bmbt_rec_t *rp; xfs_trans_t *tp; @@ -2245,9 +2197,6 @@ xfs_bmbt_newroot( xfs_bmbt_key_t *ckp; /* child key pointer */ xfs_bmbt_ptr_t *cpp; /* child ptr pointer */ int error; /* error return code */ -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_newroot"; -#endif #ifdef DEBUG int i; /* loop counter */ #endif @@ -2630,9 +2579,6 @@ xfs_bmbt_update( xfs_bmbt_block_t *block; xfs_buf_t *bp; int error; -#ifdef XFS_BMBT_TRACE - static char fname[] = "xfs_bmbt_update"; -#endif xfs_bmbt_key_t key; int ptr; xfs_bmbt_rec_t *rp; Index: linux/fs/xfs/xfs_alloc.c =================================================================== --- linux.orig/fs/xfs/xfs_alloc.c +++ linux/fs/xfs/xfs_alloc.c @@ -55,17 +55,17 @@ xfs_alloc_search_busy(xfs_trans_t *tp, ktrace_t *xfs_alloc_trace_buf; #define TRACE_ALLOC(s,a) \ - xfs_alloc_trace_alloc(fname, s, a, __LINE__) + xfs_alloc_trace_alloc(__FUNCTION__, s, a, __LINE__) #define TRACE_FREE(s,a,b,x,f) \ - xfs_alloc_trace_free(fname, s, mp, a, b, x, f, __LINE__) + 
xfs_alloc_trace_free(__FUNCTION__, s, mp, a, b, x, f, __LINE__) #define TRACE_MODAGF(s,a,f) \ - xfs_alloc_trace_modagf(fname, s, mp, a, f, __LINE__) -#define TRACE_BUSY(fname,s,ag,agb,l,sl,tp) \ - xfs_alloc_trace_busy(fname, s, mp, ag, agb, l, sl, tp, XFS_ALLOC_KTRACE_BUSY, __LINE__) -#define TRACE_UNBUSY(fname,s,ag,sl,tp) \ - xfs_alloc_trace_busy(fname, s, mp, ag, -1, -1, sl, tp, XFS_ALLOC_KTRACE_UNBUSY, __LINE__) -#define TRACE_BUSYSEARCH(fname,s,ag,agb,l,sl,tp) \ - xfs_alloc_trace_busy(fname, s, mp, ag, agb, l, sl, tp, XFS_ALLOC_KTRACE_BUSYSEARCH, __LINE__) + xfs_alloc_trace_modagf(__FUNCTION__, s, mp, a, f, __LINE__) +#define TRACE_BUSY(__FUNCTION__,s,ag,agb,l,sl,tp) \ + xfs_alloc_trace_busy(__FUNCTION__, s, mp, ag, agb, l, sl, tp, XFS_ALLOC_KTRACE_BUSY, __LINE__) +#define TRACE_UNBUSY(__FUNCTION__,s,ag,sl,tp) \ + xfs_alloc_trace_busy(__FUNCTION__, s, mp, ag, -1, -1, sl, tp, XFS_ALLOC_KTRACE_UNBUSY, __LINE__) +#define TRACE_BUSYSEARCH(__FUNCTION__,s,ag,agb,l,sl,tp) \ + xfs_alloc_trace_busy(__FUNCTION__, s, mp, ag, agb, l, sl, tp, XFS_ALLOC_KTRACE_BUSYSEARCH, __LINE__) #else #define TRACE_ALLOC(s,a) #define TRACE_FREE(s,a,b,x,f) @@ -420,7 +420,7 @@ xfs_alloc_read_agfl( */ STATIC void xfs_alloc_trace_alloc( - char *name, /* function tag string */ + const char *name, /* function tag string */ char *str, /* additional string */ xfs_alloc_arg_t *args, /* allocation argument structure */ int line) /* source line number */ @@ -453,7 +453,7 @@ xfs_alloc_trace_alloc( */ STATIC void xfs_alloc_trace_free( - char *name, /* function tag string */ + const char *name, /* function tag string */ char *str, /* additional string */ xfs_mount_t *mp, /* file system mount point */ xfs_agnumber_t agno, /* allocation group number */ @@ -479,7 +479,7 @@ xfs_alloc_trace_free( */ STATIC void xfs_alloc_trace_modagf( - char *name, /* function tag string */ + const char *name, /* function tag string */ char *str, /* additional string */ xfs_mount_t *mp, /* file system mount point */ 
xfs_agf_t *agf, /* new agf value */ @@ -507,7 +507,7 @@ xfs_alloc_trace_modagf( STATIC void xfs_alloc_trace_busy( - char *name, /* function tag string */ + const char *name, /* function tag string */ char *str, /* additional string */ xfs_mount_t *mp, /* file system mount point */ xfs_agnumber_t agno, /* allocation group number */ @@ -549,9 +549,6 @@ xfs_alloc_ag_vextent( xfs_alloc_arg_t *args) /* argument structure for allocation */ { int error=0; -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent"; -#endif ASSERT(args->minlen > 0); ASSERT(args->maxlen > 0); @@ -635,9 +632,6 @@ xfs_alloc_ag_vextent_exact( xfs_agblock_t fbno; /* start block of found extent */ xfs_agblock_t fend; /* end block of found extent */ xfs_extlen_t flen; /* length of found extent */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent_exact"; -#endif int i; /* success/failure of operation */ xfs_agblock_t maxend; /* end of maximal extent */ xfs_agblock_t minend; /* end of minimal extent */ @@ -737,9 +731,6 @@ xfs_alloc_ag_vextent_near( xfs_btree_cur_t *bno_cur_gt; /* cursor for bno btree, right side */ xfs_btree_cur_t *bno_cur_lt; /* cursor for bno btree, left side */ xfs_btree_cur_t *cnt_cur; /* cursor for count btree */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent_near"; -#endif xfs_agblock_t gtbno; /* start bno of right side entry */ xfs_agblock_t gtbnoa; /* aligned ... 
*/ xfs_extlen_t gtdiff; /* difference to right side entry */ @@ -1270,9 +1261,6 @@ xfs_alloc_ag_vextent_size( int error; /* error result */ xfs_agblock_t fbno; /* start of found freespace */ xfs_extlen_t flen; /* length of found freespace */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent_size"; -#endif int i; /* temp status variable */ xfs_agblock_t rbno; /* returned block number */ xfs_extlen_t rlen; /* length of returned extent */ @@ -1427,9 +1415,6 @@ xfs_alloc_ag_vextent_small( int error; xfs_agblock_t fbno; xfs_extlen_t flen; -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_ag_vextent_small"; -#endif int i; if ((error = xfs_alloc_decrement(ccur, 0, &i))) @@ -1515,9 +1500,6 @@ xfs_free_ag_extent( xfs_btree_cur_t *bno_cur; /* cursor for by-block btree */ xfs_btree_cur_t *cnt_cur; /* cursor for by-size btree */ int error; /* error return value */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_free_ag_extent"; -#endif xfs_agblock_t gtbno; /* start of right neighbor block */ xfs_extlen_t gtlen; /* length of right neighbor block */ int haveleft; /* have a left neighbor block */ @@ -1998,9 +1980,6 @@ xfs_alloc_get_freelist( xfs_buf_t *agflbp;/* buffer for a.g. freelist structure */ xfs_agblock_t bno; /* block number returned */ int error; -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_get_freelist"; -#endif xfs_mount_t *mp; /* mount structure */ xfs_perag_t *pag; /* per allocation group data */ @@ -2112,9 +2091,6 @@ xfs_alloc_put_freelist( xfs_agfl_t *agfl; /* a.g. free block array */ __be32 *blockp;/* pointer to array entry */ int error; -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_put_freelist"; -#endif xfs_mount_t *mp; /* mount structure */ xfs_perag_t *pag; /* per allocation group data */ @@ -2235,9 +2211,6 @@ xfs_alloc_vextent( xfs_agblock_t agsize; /* allocation group size */ int error; int flags; /* XFS_ALLOC_FLAG_... 
locking flags */ -#ifdef XFS_ALLOC_TRACE - static char fname[] = "xfs_alloc_vextent"; -#endif xfs_extlen_t minleft;/* minimum left value, temp copy */ xfs_mount_t *mp; /* mount structure pointer */ xfs_agnumber_t sagno; /* starting allocation group number */ Index: linux/fs/xfs/xfs_bmap.c =================================================================== --- linux.orig/fs/xfs/xfs_bmap.c +++ linux/fs/xfs/xfs_bmap.c @@ -277,7 +277,7 @@ xfs_bmap_isaeof( STATIC void xfs_bmap_trace_addentry( int opcode, /* operation */ - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(ies) */ @@ -291,7 +291,7 @@ xfs_bmap_trace_addentry( */ STATIC void xfs_bmap_trace_delete( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(entries) deleted */ @@ -304,7 +304,7 @@ xfs_bmap_trace_delete( */ STATIC void xfs_bmap_trace_insert( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(entries) inserted */ @@ -318,7 +318,7 @@ xfs_bmap_trace_insert( */ STATIC void xfs_bmap_trace_post_update( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry updated */ @@ -329,17 +329,25 @@ xfs_bmap_trace_post_update( */ STATIC void xfs_bmap_trace_pre_update( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry to be updated */ int whichfork); /* data or attr fork */ +#define 
XFS_BMAP_TRACE_DELETE(d,ip,i,c,w) \ + xfs_bmap_trace_delete(__FUNCTION__,d,ip,i,c,w) +#define XFS_BMAP_TRACE_INSERT(d,ip,i,c,r1,r2,w) \ + xfs_bmap_trace_insert(__FUNCTION__,d,ip,i,c,r1,r2,w) +#define XFS_BMAP_TRACE_POST_UPDATE(d,ip,i,w) \ + xfs_bmap_trace_post_update(__FUNCTION__,d,ip,i,w) +#define XFS_BMAP_TRACE_PRE_UPDATE(d,ip,i,w) \ + xfs_bmap_trace_pre_update(__FUNCTION__,d,ip,i,w) #else -#define xfs_bmap_trace_delete(f,d,ip,i,c,w) -#define xfs_bmap_trace_insert(f,d,ip,i,c,r1,r2,w) -#define xfs_bmap_trace_post_update(f,d,ip,i,w) -#define xfs_bmap_trace_pre_update(f,d,ip,i,w) +#define XFS_BMAP_TRACE_DELETE(d,ip,i,c,w) +#define XFS_BMAP_TRACE_INSERT(d,ip,i,c,r1,r2,w) +#define XFS_BMAP_TRACE_POST_UPDATE(d,ip,i,w) +#define XFS_BMAP_TRACE_PRE_UPDATE(d,ip,i,w) #endif /* XFS_BMAP_TRACE */ /* @@ -531,9 +539,6 @@ xfs_bmap_add_extent( xfs_filblks_t da_new; /* new count del alloc blocks used */ xfs_filblks_t da_old; /* old count del alloc blocks used */ int error; /* error return value */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent"; -#endif xfs_ifork_t *ifp; /* inode fork ptr */ int logflags; /* returned value */ xfs_extnum_t nextents; /* number of extents in file now */ @@ -551,8 +556,8 @@ xfs_bmap_add_extent( * already extents in the list. 
*/ if (nextents == 0) { - xfs_bmap_trace_insert(fname, "insert empty", ip, 0, 1, new, - NULL, whichfork); + XFS_BMAP_TRACE_INSERT("insert empty", ip, 0, 1, new, NULL, + whichfork); xfs_iext_insert(ifp, 0, 1, new); ASSERT(cur == NULL); ifp->if_lastex = 0; @@ -710,9 +715,6 @@ xfs_bmap_add_extent_delay_real( int diff; /* temp value */ xfs_bmbt_rec_t *ep; /* extent entry for idx */ int error; /* error return value */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent_delay_real"; -#endif int i; /* temp state */ xfs_ifork_t *ifp; /* inode fork pointer */ xfs_fileoff_t new_endoff; /* end offset of new entry */ @@ -808,15 +810,14 @@ xfs_bmap_add_extent_delay_real( * Filling in all of a previously delayed allocation extent. * The left and right neighbors are both contiguous with new. */ - xfs_bmap_trace_pre_update(fname, "LF|RF|LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|LC|RC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + PREV.br_blockcount + RIGHT.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|LC|RC", ip, idx - 1, - XFS_DATA_FORK); - xfs_bmap_trace_delete(fname, "LF|RF|LC|RC", ip, idx, 2, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|LC|RC", ip, idx - 1, XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|LC|RC", ip, idx, 2, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 2); ip->i_df.if_lastex = idx - 1; ip->i_d.di_nextents--; @@ -855,15 +856,14 @@ xfs_bmap_add_extent_delay_real( * Filling in all of a previously delayed allocation extent. * The left neighbor is contiguous, the right is not. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|RF|LC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + PREV.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|LC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|LC", ip, idx - 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx - 1; - xfs_bmap_trace_delete(fname, "LF|RF|LC", ip, idx, 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|LC", ip, idx, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 1); if (cur == NULL) rval = XFS_ILOG_DEXT; @@ -892,16 +892,13 @@ xfs_bmap_add_extent_delay_real( * Filling in all of a previously delayed allocation extent. * The right neighbor is contiguous, the left is not. */ - xfs_bmap_trace_pre_update(fname, "LF|RF|RC", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|RC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_startblock(ep, new->br_startblock); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount + RIGHT.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|RC", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|RC", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; - xfs_bmap_trace_delete(fname, "LF|RF|RC", ip, idx + 1, 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|RC", ip, idx + 1, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx + 1, 1); if (cur == NULL) rval = XFS_ILOG_DEXT; @@ -931,11 +928,9 @@ xfs_bmap_add_extent_delay_real( * Neither the left nor right neighbors are contiguous with * the new one. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|RF", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_startblock(ep, new->br_startblock); - xfs_bmap_trace_post_update(fname, "LF|RF", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF|RF", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; ip->i_d.di_nextents++; if (cur == NULL) @@ -963,17 +958,14 @@ xfs_bmap_add_extent_delay_real( * Filling in the first part of a previous delayed allocation. * The left neighbor is contiguous. */ - xfs_bmap_trace_pre_update(fname, "LF|LC", ip, idx - 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF|LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + new->br_blockcount); xfs_bmbt_set_startoff(ep, PREV.br_startoff + new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|LC", ip, idx - 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF|LC", ip, idx - 1, XFS_DATA_FORK); temp = PREV.br_blockcount - new->br_blockcount; - xfs_bmap_trace_pre_update(fname, "LF|LC", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF|LC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, temp); ip->i_df.if_lastex = idx - 1; if (cur == NULL) @@ -995,8 +987,7 @@ xfs_bmap_add_extent_delay_real( temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), STARTBLOCKVAL(PREV.br_startblock)); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "LF|LC", ip, idx, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF|LC", ip, idx, XFS_DATA_FORK); *dnew = temp; /* DELTA: The boundary between two in-core extents moved. */ temp = LEFT.br_startoff; @@ -1009,11 +1000,11 @@ xfs_bmap_add_extent_delay_real( * Filling in the first part of a previous delayed allocation. * The left neighbor is not contiguous. 
*/ - xfs_bmap_trace_pre_update(fname, "LF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_startoff(ep, new_endoff); temp = PREV.br_blockcount - new->br_blockcount; xfs_bmbt_set_blockcount(ep, temp); - xfs_bmap_trace_insert(fname, "LF", ip, idx, 1, new, NULL, + XFS_BMAP_TRACE_INSERT("LF", ip, idx, 1, new, NULL, XFS_DATA_FORK); xfs_iext_insert(ifp, idx, 1, new); ip->i_df.if_lastex = idx; @@ -1046,8 +1037,7 @@ xfs_bmap_add_extent_delay_real( (cur ? cur->bc_private.b.allocated : 0)); ep = xfs_iext_get_ext(ifp, idx + 1); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "LF", ip, idx + 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("LF", ip, idx + 1, XFS_DATA_FORK); *dnew = temp; /* DELTA: One in-core extent is split in two. */ temp = PREV.br_startoff; @@ -1060,17 +1050,14 @@ xfs_bmap_add_extent_delay_real( * The right neighbor is contiguous with the new allocation. */ temp = PREV.br_blockcount - new->br_blockcount; - xfs_bmap_trace_pre_update(fname, "RF|RC", ip, idx, - XFS_DATA_FORK); - xfs_bmap_trace_pre_update(fname, "RF|RC", ip, idx + 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RF|RC", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RF|RC", ip, idx + 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, temp); xfs_bmbt_set_allf(xfs_iext_get_ext(ifp, idx + 1), new->br_startoff, new->br_startblock, new->br_blockcount + RIGHT.br_blockcount, RIGHT.br_state); - xfs_bmap_trace_post_update(fname, "RF|RC", ip, idx + 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("RF|RC", ip, idx + 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx + 1; if (cur == NULL) rval = XFS_ILOG_DEXT; @@ -1091,8 +1078,7 @@ xfs_bmap_add_extent_delay_real( temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), STARTBLOCKVAL(PREV.br_startblock)); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "RF|RC", ip, idx, - XFS_DATA_FORK); + 
XFS_BMAP_TRACE_POST_UPDATE("RF|RC", ip, idx, XFS_DATA_FORK); *dnew = temp; /* DELTA: The boundary between two in-core extents moved. */ temp = PREV.br_startoff; @@ -1106,10 +1092,10 @@ xfs_bmap_add_extent_delay_real( * The right neighbor is not contiguous. */ temp = PREV.br_blockcount - new->br_blockcount; - xfs_bmap_trace_pre_update(fname, "RF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, temp); - xfs_bmap_trace_insert(fname, "RF", ip, idx + 1, 1, - new, NULL, XFS_DATA_FORK); + XFS_BMAP_TRACE_INSERT("RF", ip, idx + 1, 1, new, NULL, + XFS_DATA_FORK); xfs_iext_insert(ifp, idx + 1, 1, new); ip->i_df.if_lastex = idx + 1; ip->i_d.di_nextents++; @@ -1141,7 +1127,7 @@ xfs_bmap_add_extent_delay_real( (cur ? cur->bc_private.b.allocated : 0)); ep = xfs_iext_get_ext(ifp, idx); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "RF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("RF", ip, idx, XFS_DATA_FORK); *dnew = temp; /* DELTA: One in-core extent is split in two. */ temp = PREV.br_startoff; @@ -1155,7 +1141,7 @@ xfs_bmap_add_extent_delay_real( * This case is avoided almost all the time. 
*/ temp = new->br_startoff - PREV.br_startoff; - xfs_bmap_trace_pre_update(fname, "0", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("0", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, temp); r[0] = *new; r[1].br_state = PREV.br_state; @@ -1163,7 +1149,7 @@ xfs_bmap_add_extent_delay_real( r[1].br_startoff = new_endoff; temp2 = PREV.br_startoff + PREV.br_blockcount - new_endoff; r[1].br_blockcount = temp2; - xfs_bmap_trace_insert(fname, "0", ip, idx + 1, 2, &r[0], &r[1], + XFS_BMAP_TRACE_INSERT("0", ip, idx + 1, 2, &r[0], &r[1], XFS_DATA_FORK); xfs_iext_insert(ifp, idx + 1, 2, &r[0]); ip->i_df.if_lastex = idx + 1; @@ -1222,13 +1208,11 @@ xfs_bmap_add_extent_delay_real( } ep = xfs_iext_get_ext(ifp, idx); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "0", ip, idx, XFS_DATA_FORK); - xfs_bmap_trace_pre_update(fname, "0", ip, idx + 2, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("0", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("0", ip, idx + 2, XFS_DATA_FORK); xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, idx + 2), NULLSTARTBLOCK((int)temp2)); - xfs_bmap_trace_post_update(fname, "0", ip, idx + 2, - XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("0", ip, idx + 2, XFS_DATA_FORK); *dnew = temp + temp2; /* DELTA: One in-core extent is split in three. */ temp = PREV.br_startoff; @@ -1287,9 +1271,6 @@ xfs_bmap_add_extent_unwritten_real( xfs_btree_cur_t *cur; /* btree cursor */ xfs_bmbt_rec_t *ep; /* extent entry for idx */ int error; /* error return value */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent_unwritten_real"; -#endif int i; /* temp state */ xfs_ifork_t *ifp; /* inode fork pointer */ xfs_fileoff_t new_endoff; /* end offset of new entry */ @@ -1390,15 +1371,14 @@ xfs_bmap_add_extent_unwritten_real( * Setting all of a previous oldext extent to newext. * The left and right neighbors are both contiguous with new. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|RF|LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|LC|RC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + PREV.br_blockcount + RIGHT.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|LC|RC", ip, idx - 1, - XFS_DATA_FORK); - xfs_bmap_trace_delete(fname, "LF|RF|LC|RC", ip, idx, 2, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|LC|RC", ip, idx - 1, XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|LC|RC", ip, idx, 2, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 2); ip->i_df.if_lastex = idx - 1; ip->i_d.di_nextents -= 2; @@ -1441,15 +1421,14 @@ xfs_bmap_add_extent_unwritten_real( * Setting all of a previous oldext extent to newext. * The left neighbor is contiguous, the right is not. */ - xfs_bmap_trace_pre_update(fname, "LF|RF|LC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + PREV.br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|RF|LC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|LC", ip, idx - 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx - 1; - xfs_bmap_trace_delete(fname, "LF|RF|LC", ip, idx, 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|LC", ip, idx, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 1); ip->i_d.di_nextents--; if (cur == NULL) @@ -1484,16 +1463,15 @@ xfs_bmap_add_extent_unwritten_real( * Setting all of a previous oldext extent to newext. * The right neighbor is contiguous, the left is not. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|RF|RC", ip, idx, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF|RC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount + RIGHT.br_blockcount); xfs_bmbt_set_state(ep, newext); - xfs_bmap_trace_post_update(fname, "LF|RF|RC", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF|RC", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; - xfs_bmap_trace_delete(fname, "LF|RF|RC", ip, idx + 1, 1, - XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LF|RF|RC", ip, idx + 1, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx + 1, 1); ip->i_d.di_nextents--; if (cur == NULL) @@ -1529,10 +1507,10 @@ xfs_bmap_add_extent_unwritten_real( * Neither the left nor right neighbors are contiguous with * the new one. */ - xfs_bmap_trace_pre_update(fname, "LF|RF", ip, idx, + XFS_BMAP_TRACE_PRE_UPDATE("LF|RF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_state(ep, newext); - xfs_bmap_trace_post_update(fname, "LF|RF", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("LF|RF", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; if (cur == NULL) @@ -1559,21 +1537,21 @@ xfs_bmap_add_extent_unwritten_real( * Setting the first part of a previous oldext extent to newext. * The left neighbor is contiguous. 
*/ - xfs_bmap_trace_pre_update(fname, "LF|LC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LF|LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), LEFT.br_blockcount + new->br_blockcount); xfs_bmbt_set_startoff(ep, PREV.br_startoff + new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|LC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LF|LC", ip, idx - 1, XFS_DATA_FORK); - xfs_bmap_trace_pre_update(fname, "LF|LC", ip, idx, + XFS_BMAP_TRACE_PRE_UPDATE("LF|LC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_startblock(ep, new->br_startblock + new->br_blockcount); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount - new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LF|LC", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("LF|LC", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx - 1; if (cur == NULL) @@ -1610,15 +1588,15 @@ xfs_bmap_add_extent_unwritten_real( * Setting the first part of a previous oldext extent to newext. * The left neighbor is not contiguous. */ - xfs_bmap_trace_pre_update(fname, "LF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("LF", ip, idx, XFS_DATA_FORK); ASSERT(ep && xfs_bmbt_get_state(ep) == oldext); xfs_bmbt_set_startoff(ep, new_endoff); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount - new->br_blockcount); xfs_bmbt_set_startblock(ep, new->br_startblock + new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LF", ip, idx, XFS_DATA_FORK); - xfs_bmap_trace_insert(fname, "LF", ip, idx, 1, new, NULL, + XFS_BMAP_TRACE_POST_UPDATE("LF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_INSERT("LF", ip, idx, 1, new, NULL, XFS_DATA_FORK); xfs_iext_insert(ifp, idx, 1, new); ip->i_df.if_lastex = idx; @@ -1653,18 +1631,18 @@ xfs_bmap_add_extent_unwritten_real( * Setting the last part of a previous oldext extent to newext. * The right neighbor is contiguous with the new allocation. 
*/ - xfs_bmap_trace_pre_update(fname, "RF|RC", ip, idx, + XFS_BMAP_TRACE_PRE_UPDATE("RF|RC", ip, idx, XFS_DATA_FORK); - xfs_bmap_trace_pre_update(fname, "RF|RC", ip, idx + 1, + XFS_BMAP_TRACE_PRE_UPDATE("RF|RC", ip, idx + 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount - new->br_blockcount); - xfs_bmap_trace_post_update(fname, "RF|RC", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("RF|RC", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_allf(xfs_iext_get_ext(ifp, idx + 1), new->br_startoff, new->br_startblock, new->br_blockcount + RIGHT.br_blockcount, newext); - xfs_bmap_trace_post_update(fname, "RF|RC", ip, idx + 1, + XFS_BMAP_TRACE_POST_UPDATE("RF|RC", ip, idx + 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx + 1; if (cur == NULL) @@ -1700,12 +1678,12 @@ xfs_bmap_add_extent_unwritten_real( * Setting the last part of a previous oldext extent to newext. * The right neighbor is not contiguous. */ - xfs_bmap_trace_pre_update(fname, "RF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RF", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, PREV.br_blockcount - new->br_blockcount); - xfs_bmap_trace_post_update(fname, "RF", ip, idx, XFS_DATA_FORK); - xfs_bmap_trace_insert(fname, "RF", ip, idx + 1, 1, - new, NULL, XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("RF", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_INSERT("RF", ip, idx + 1, 1, new, NULL, + XFS_DATA_FORK); xfs_iext_insert(ifp, idx + 1, 1, new); ip->i_df.if_lastex = idx + 1; ip->i_d.di_nextents++; @@ -1744,17 +1722,17 @@ xfs_bmap_add_extent_unwritten_real( * newext. Contiguity is impossible here. * One extent becomes three extents. 
*/ - xfs_bmap_trace_pre_update(fname, "0", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("0", ip, idx, XFS_DATA_FORK); xfs_bmbt_set_blockcount(ep, new->br_startoff - PREV.br_startoff); - xfs_bmap_trace_post_update(fname, "0", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("0", ip, idx, XFS_DATA_FORK); r[0] = *new; r[1].br_startoff = new_endoff; r[1].br_blockcount = PREV.br_startoff + PREV.br_blockcount - new_endoff; r[1].br_startblock = new->br_startblock + new->br_blockcount; r[1].br_state = oldext; - xfs_bmap_trace_insert(fname, "0", ip, idx + 1, 2, &r[0], &r[1], + XFS_BMAP_TRACE_INSERT("0", ip, idx + 1, 2, &r[0], &r[1], XFS_DATA_FORK); xfs_iext_insert(ifp, idx + 1, 2, &r[0]); ip->i_df.if_lastex = idx + 1; @@ -1845,9 +1823,6 @@ xfs_bmap_add_extent_hole_delay( int rsvd) /* OK to allocate reserved blocks */ { xfs_bmbt_rec_t *ep; /* extent record for idx */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent_hole_delay"; -#endif xfs_ifork_t *ifp; /* inode fork pointer */ xfs_bmbt_irec_t left; /* left neighbor extent entry */ xfs_filblks_t newlen=0; /* new indirect size */ @@ -1919,7 +1894,7 @@ xfs_bmap_add_extent_hole_delay( */ temp = left.br_blockcount + new->br_blockcount + right.br_blockcount; - xfs_bmap_trace_pre_update(fname, "LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LC|RC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), temp); oldlen = STARTBLOCKVAL(left.br_startblock) + @@ -1928,10 +1903,9 @@ xfs_bmap_add_extent_hole_delay( newlen = xfs_bmap_worst_indlen(ip, temp); xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, idx - 1), NULLSTARTBLOCK((int)newlen)); - xfs_bmap_trace_post_update(fname, "LC|RC", ip, idx - 1, - XFS_DATA_FORK); - xfs_bmap_trace_delete(fname, "LC|RC", ip, idx, 1, + XFS_BMAP_TRACE_POST_UPDATE("LC|RC", ip, idx - 1, XFS_DATA_FORK); + XFS_BMAP_TRACE_DELETE("LC|RC", ip, idx, 1, XFS_DATA_FORK); xfs_iext_remove(ifp, idx, 1); ip->i_df.if_lastex = idx - 1; /* DELTA: Two in-core 
extents were replaced by one. */ @@ -1946,7 +1920,7 @@ xfs_bmap_add_extent_hole_delay( * Merge the new allocation with the left neighbor. */ temp = left.br_blockcount + new->br_blockcount; - xfs_bmap_trace_pre_update(fname, "LC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LC", ip, idx - 1, XFS_DATA_FORK); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), temp); oldlen = STARTBLOCKVAL(left.br_startblock) + @@ -1954,7 +1928,7 @@ xfs_bmap_add_extent_hole_delay( newlen = xfs_bmap_worst_indlen(ip, temp); xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, idx - 1), NULLSTARTBLOCK((int)newlen)); - xfs_bmap_trace_post_update(fname, "LC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LC", ip, idx - 1, XFS_DATA_FORK); ip->i_df.if_lastex = idx - 1; /* DELTA: One in-core extent grew into a hole. */ @@ -1968,14 +1942,14 @@ xfs_bmap_add_extent_hole_delay( * on the right. * Merge the new allocation with the right neighbor. */ - xfs_bmap_trace_pre_update(fname, "RC", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_PRE_UPDATE("RC", ip, idx, XFS_DATA_FORK); temp = new->br_blockcount + right.br_blockcount; oldlen = STARTBLOCKVAL(new->br_startblock) + STARTBLOCKVAL(right.br_startblock); newlen = xfs_bmap_worst_indlen(ip, temp); xfs_bmbt_set_allf(ep, new->br_startoff, NULLSTARTBLOCK((int)newlen), temp, right.br_state); - xfs_bmap_trace_post_update(fname, "RC", ip, idx, XFS_DATA_FORK); + XFS_BMAP_TRACE_POST_UPDATE("RC", ip, idx, XFS_DATA_FORK); ip->i_df.if_lastex = idx; /* DELTA: One in-core extent grew into a hole. */ temp2 = temp; @@ -1989,7 +1963,7 @@ xfs_bmap_add_extent_hole_delay( * Insert a new entry. */ oldlen = newlen = 0; - xfs_bmap_trace_insert(fname, "0", ip, idx, 1, new, NULL, + XFS_BMAP_TRACE_INSERT("0", ip, idx, 1, new, NULL, XFS_DATA_FORK); xfs_iext_insert(ifp, idx, 1, new); ip->i_df.if_lastex = idx; @@ -2039,9 +2013,6 @@ xfs_bmap_add_extent_hole_real( { xfs_bmbt_rec_t *ep; /* pointer to extent entry ins. 
point */ int error; /* error return value */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_add_extent_hole_real"; -#endif int i; /* temp state */ xfs_ifork_t *ifp; /* inode fork pointer */ xfs_bmbt_irec_t left; /* left neighbor extent entry */ @@ -2118,15 +2089,14 @@ xfs_bmap_add_extent_hole_real( * left and on the right. * Merge all three into a single extent record. */ - xfs_bmap_trace_pre_update(fname, "LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_PRE_UPDATE("LC|RC", ip, idx - 1, whichfork); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), left.br_blockcount + new->br_blockcount + right.br_blockcount); - xfs_bmap_trace_post_update(fname, "LC|RC", ip, idx - 1, + XFS_BMAP_TRACE_POST_UPDATE("LC|RC", ip, idx - 1, whichfork); - xfs_bmap_trace_delete(fname, "LC|RC", ip, - idx, 1, whichfork); + XFS_BMAP_TRACE_DELETE("LC|RC", ip, idx, 1, whichfork); xfs_iext_remove(ifp, idx, 1); ifp->if_lastex = idx - 1; XFS_IFORK_NEXT_SET(ip, whichfork, @@ -2168,10 +2138,10 @@ xfs_bmap_add_extent_hole_real( * on the left. * Merge the new allocation with the left neighbor. */ - xfs_bmap_trace_pre_update(fname, "LC", ip, idx - 1, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("LC", ip, idx - 1, whichfork); xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, idx - 1), left.br_blockcount + new->br_blockcount); - xfs_bmap_trace_post_update(fname, "LC", ip, idx - 1, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("LC", ip, idx - 1, whichfork); ifp->if_lastex = idx - 1; if (cur == NULL) { rval = XFS_ILOG_FEXT(whichfork); @@ -2202,11 +2172,11 @@ xfs_bmap_add_extent_hole_real( * on the right. * Merge the new allocation with the right neighbor. 
*/ - xfs_bmap_trace_pre_update(fname, "RC", ip, idx, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("RC", ip, idx, whichfork); xfs_bmbt_set_allf(ep, new->br_startoff, new->br_startblock, new->br_blockcount + right.br_blockcount, right.br_state); - xfs_bmap_trace_post_update(fname, "RC", ip, idx, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("RC", ip, idx, whichfork); ifp->if_lastex = idx; if (cur == NULL) { rval = XFS_ILOG_FEXT(whichfork); @@ -2237,8 +2207,7 @@ xfs_bmap_add_extent_hole_real( * real allocation. * Insert a new entry. */ - xfs_bmap_trace_insert(fname, "0", ip, idx, 1, new, NULL, - whichfork); + XFS_BMAP_TRACE_INSERT("0", ip, idx, 1, new, NULL, whichfork); xfs_iext_insert(ifp, idx, 1, new); ifp->if_lastex = idx; XFS_IFORK_NEXT_SET(ip, whichfork, @@ -3051,9 +3020,6 @@ xfs_bmap_del_extent( xfs_bmbt_rec_t *ep; /* current extent entry pointer */ int error; /* error return value */ int flags; /* inode logging flags */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_del_extent"; -#endif xfs_bmbt_irec_t got; /* current extent entry */ xfs_fileoff_t got_endoff; /* first offset past got */ int i; /* temp state */ @@ -3147,7 +3113,7 @@ xfs_bmap_del_extent( /* * Matches the whole extent. Delete the entry. */ - xfs_bmap_trace_delete(fname, "3", ip, idx, 1, whichfork); + XFS_BMAP_TRACE_DELETE("3", ip, idx, 1, whichfork); xfs_iext_remove(ifp, idx, 1); ifp->if_lastex = idx; if (delay) @@ -3168,7 +3134,7 @@ xfs_bmap_del_extent( /* * Deleting the first part of the extent. 
*/ - xfs_bmap_trace_pre_update(fname, "2", ip, idx, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("2", ip, idx, whichfork); xfs_bmbt_set_startoff(ep, del_endoff); temp = got.br_blockcount - del->br_blockcount; xfs_bmbt_set_blockcount(ep, temp); @@ -3177,13 +3143,13 @@ xfs_bmap_del_extent( temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), da_old); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "2", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("2", ip, idx, whichfork); da_new = temp; break; } xfs_bmbt_set_startblock(ep, del_endblock); - xfs_bmap_trace_post_update(fname, "2", ip, idx, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("2", ip, idx, whichfork); if (!cur) { flags |= XFS_ILOG_FEXT(whichfork); break; @@ -3199,19 +3165,19 @@ xfs_bmap_del_extent( * Deleting the last part of the extent. */ temp = got.br_blockcount - del->br_blockcount; - xfs_bmap_trace_pre_update(fname, "1", ip, idx, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("1", ip, idx, whichfork); xfs_bmbt_set_blockcount(ep, temp); ifp->if_lastex = idx; if (delay) { temp = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp), da_old); xfs_bmbt_set_startblock(ep, NULLSTARTBLOCK((int)temp)); - xfs_bmap_trace_post_update(fname, "1", ip, idx, + XFS_BMAP_TRACE_POST_UPDATE("1", ip, idx, whichfork); da_new = temp; break; } - xfs_bmap_trace_post_update(fname, "1", ip, idx, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("1", ip, idx, whichfork); if (!cur) { flags |= XFS_ILOG_FEXT(whichfork); break; @@ -3228,7 +3194,7 @@ xfs_bmap_del_extent( * Deleting the middle of the extent. 
*/ temp = del->br_startoff - got.br_startoff; - xfs_bmap_trace_pre_update(fname, "0", ip, idx, whichfork); + XFS_BMAP_TRACE_PRE_UPDATE("0", ip, idx, whichfork); xfs_bmbt_set_blockcount(ep, temp); new.br_startoff = del_endoff; temp2 = got_endoff - del_endoff; @@ -3315,8 +3281,8 @@ xfs_bmap_del_extent( } } } - xfs_bmap_trace_post_update(fname, "0", ip, idx, whichfork); - xfs_bmap_trace_insert(fname, "0", ip, idx + 1, 1, &new, NULL, + XFS_BMAP_TRACE_POST_UPDATE("0", ip, idx, whichfork); + XFS_BMAP_TRACE_INSERT("0", ip, idx + 1, 1, &new, NULL, whichfork); xfs_iext_insert(ifp, idx + 1, 1, &new); ifp->if_lastex = idx + 1; @@ -3556,9 +3522,6 @@ xfs_bmap_local_to_extents( { int error; /* error return value */ int flags; /* logging flags returned */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_local_to_extents"; -#endif xfs_ifork_t *ifp; /* inode fork pointer */ /* @@ -3613,7 +3576,7 @@ xfs_bmap_local_to_extents( xfs_iext_add(ifp, 0, 1); ep = xfs_iext_get_ext(ifp, 0); xfs_bmbt_set_allf(ep, 0, args.fsbno, 1, XFS_EXT_NORM); - xfs_bmap_trace_post_update(fname, "new", ip, 0, whichfork); + XFS_BMAP_TRACE_POST_UPDATE("new", ip, 0, whichfork); XFS_IFORK_NEXT_SET(ip, whichfork, 1); ip->i_d.di_nblocks = 1; XFS_TRANS_MOD_DQUOT_BYINO(args.mp, tp, ip, @@ -3736,7 +3699,7 @@ ktrace_t *xfs_bmap_trace_buf; STATIC void xfs_bmap_trace_addentry( int opcode, /* operation */ - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(ies) */ @@ -3795,7 +3758,7 @@ xfs_bmap_trace_addentry( */ STATIC void xfs_bmap_trace_delete( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(entries) deleted */ @@ -3817,7 +3780,7 @@ xfs_bmap_trace_delete( */ STATIC void xfs_bmap_trace_insert( - char *fname, /* function name 
*/ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry(entries) inserted */ @@ -3846,7 +3809,7 @@ xfs_bmap_trace_insert( */ STATIC void xfs_bmap_trace_post_update( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry updated */ @@ -3864,7 +3827,7 @@ xfs_bmap_trace_post_update( */ STATIC void xfs_bmap_trace_pre_update( - char *fname, /* function name */ + const char *fname, /* function name */ char *desc, /* operation description */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t idx, /* index of entry to be updated */ @@ -4478,9 +4441,6 @@ xfs_bmap_read_extents( xfs_buf_t *bp; /* buffer for "block" */ int error; /* error return value */ xfs_exntfmt_t exntf; /* XFS_EXTFMT_NOSTATE, if checking */ -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_bmap_read_extents"; -#endif xfs_extnum_t i, j; /* index into the extents list */ xfs_ifork_t *ifp; /* fork structure */ int level; /* btree level, for checking */ @@ -4597,7 +4557,7 @@ xfs_bmap_read_extents( } ASSERT(i == (ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t))); ASSERT(i == XFS_IFORK_NEXTENTS(ip, whichfork)); - xfs_bmap_trace_exlist(fname, ip, i, whichfork); + XFS_BMAP_TRACE_EXLIST(ip, i, whichfork); return 0; error0: xfs_trans_brelse(tp, bp); @@ -4610,7 +4570,7 @@ error0: */ void xfs_bmap_trace_exlist( - char *fname, /* function name */ + const char *fname, /* function name */ xfs_inode_t *ip, /* incore inode pointer */ xfs_extnum_t cnt, /* count of entries in the list */ int whichfork) /* data or attr fork */ @@ -4625,7 +4585,7 @@ xfs_bmap_trace_exlist( for (idx = 0; idx < cnt; idx++) { ep = xfs_iext_get_ext(ifp, idx); xfs_bmbt_get_all(ep, &s); - xfs_bmap_trace_insert(fname, "exlist", ip, idx, 1, &s, NULL, + XFS_BMAP_TRACE_INSERT("exlist", ip, idx, 1, &s, 
NULL, whichfork); } } Index: linux/fs/xfs/xfs_bmap.h =================================================================== --- linux.orig/fs/xfs/xfs_bmap.h +++ linux/fs/xfs/xfs_bmap.h @@ -144,12 +144,14 @@ extern ktrace_t *xfs_bmap_trace_buf; */ void xfs_bmap_trace_exlist( - char *fname, /* function name */ + const char *fname, /* function name */ struct xfs_inode *ip, /* incore inode pointer */ xfs_extnum_t cnt, /* count of entries in list */ int whichfork); /* data or attr fork */ +#define XFS_BMAP_TRACE_EXLIST(ip,c,w) \ + xfs_bmap_trace_exlist(__FUNCTION__,ip,c,w) #else -#define xfs_bmap_trace_exlist(f,ip,c,w) +#define XFS_BMAP_TRACE_EXLIST(ip,c,w) #endif /* Index: linux/fs/xfs/xfs_inode.c =================================================================== --- linux.orig/fs/xfs/xfs_inode.c +++ linux/fs/xfs/xfs_inode.c @@ -642,8 +642,7 @@ xfs_iformat_extents( ep->l1 = INT_GET(get_unaligned((__uint64_t*)&dp->l1), ARCH_CONVERT); } - xfs_bmap_trace_exlist("xfs_iformat_extents", ip, nex, - whichfork); + XFS_BMAP_TRACE_EXLIST(ip, nex, whichfork); if (whichfork != XFS_DATA_FORK || XFS_EXTFMT_INODE(ip) == XFS_EXTFMT_NOSTATE) if (unlikely(xfs_check_nostate_extents( @@ -2845,9 +2844,6 @@ xfs_iextents_copy( int copied; xfs_bmbt_rec_t *dest_ep; xfs_bmbt_rec_t *ep; -#ifdef XFS_BMAP_TRACE - static char fname[] = "xfs_iextents_copy"; -#endif int i; xfs_ifork_t *ifp; int nrecs; @@ -2858,7 +2854,7 @@ xfs_iextents_copy( ASSERT(ifp->if_bytes > 0); nrecs = ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t); - xfs_bmap_trace_exlist(fname, ip, nrecs, whichfork); + XFS_BMAP_TRACE_EXLIST(ip, nrecs, whichfork); ASSERT(nrecs > 0); /* From owner-xfs@oss.sgi.com Fri Mar 16 23:03:40 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 16 Mar 2007 23:03:49 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtps.tip.net.au (chilli.pcug.org.au [203.10.76.44]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id 
l2H63d6p020297 for ; Fri, 16 Mar 2007 23:03:40 -0700 Received: from localhost (ta-1-1.tip.net.au [203.11.71.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by smtps.tip.net.au (Postfix) with ESMTP id 5B5EE368073; Sat, 17 Mar 2007 16:33:48 +1100 (EST) Date: Sat, 17 Mar 2007 16:33:50 +1100 From: Stephen Rothwell To: "Amit K. Arora" Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-Id: <20070317163350.6b676c26.sfr@canb.auug.org.au> In-Reply-To: <20070316143101.GA10152@amitarora.in.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> X-Mailer: Sylpheed version 2.3.0beta5 (GTK+ 2.8.20; i486-pc-linux-gnu) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Sat__17_Mar_2007_16_33_50_+1100_RDGr9JcpHSSG4wvw" X-archive-position: 10873 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sfr@canb.auug.org.au Precedence: bulk X-list: xfs Content-Length: 1240 Lines: 40 --Signature=_Sat__17_Mar_2007_16_33_50_+1100_RDGr9JcpHSSG4wvw Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: 7bit On Fri, 16 Mar 2007 20:01:01 +0530 "Amit K. 
Arora" wrote: > > +asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len); > > --- linux-2.6.20.1.orig/include/asm-powerpc/systbl.h > +++ linux-2.6.20.1/include/asm-powerpc/systbl.h > @@ -305,3 +305,4 @@ SYSCALL_SPU(faccessat) > COMPAT_SYS_SPU(get_robust_list) > COMPAT_SYS_SPU(set_robust_list) > COMPAT_SYS(move_pages) > +SYSCALL(fallocate) It is going to need to be a COMPAT_SYS call in powerpc because 32 bit powerpc will pass the two loff_t's in pairs of registers while 64bit passes them in one register each. -- Cheers, Stephen Rothwell sfr@canb.auug.org.au http://www.canb.auug.org.au/~sfr/ --Signature=_Sat__17_Mar_2007_16_33_50_+1100_RDGr9JcpHSSG4wvw Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF+33HFdBgD/zoJvwRAsUAAJ4lt3jRPS6CE09R0kbH9uY2YBf1CwCcC71K ciATEa4iyLt7aSVc9IaUZ6w= =4vjD -----END PGP SIGNATURE----- --Signature=_Sat__17_Mar_2007_16_33_50_+1100_RDGr9JcpHSSG4wvw-- From owner-xfs@oss.sgi.com Sat Mar 17 03:33:27 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 03:33:37 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from ozlabs.org (ozlabs.org [203.10.76.45]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HAXP6p011024 for ; Sat, 17 Mar 2007 03:33:27 -0700 Received: by ozlabs.org (Postfix, from userid 1003) id 215C1DDED6; Sat, 17 Mar 2007 21:05:53 +1100 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17915.48105.70494.958220@cargo.ozlabs.ibm.com> Date: Sat, 17 Mar 2007 20:59:05 +1100 From: Paul Mackerras To: Heiko Carstens Cc: "Amit K. 
Arora" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call In-Reply-To: <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-archive-position: 10874 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: paulus@samba.org Precedence: bulk X-list: xfs Content-Length: 416 Lines: 16 Heiko Carstens writes: > So you either rearrange the parameters or convert the loff_t's to pointers. > > e.g. > > asmlinkage long sys_fallocate(int fd, loff_t offset, loff_t len, int mode) > > would work even on s390 ;) ... but wouldn't work on 32-bit powerpc. :( We would end up with a pad argument between fd and offset, giving 7 arguments in all (counting the loff_t's as 2), but we only support 6. Paul. From owner-xfs@oss.sgi.com Sat Mar 17 04:26:13 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 04:26:19 -0700 (PDT) X-Spam-oss-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.0-pre1-r499012 Received: from mail.parisc-linux.org ([192.25.206.14]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HBQC6p024580 for ; Sat, 17 Mar 2007 04:26:13 -0700 Received: by mail.parisc-linux.org (Postfix, from userid 26919) id 1AA7E49400A; Sat, 17 Mar 2007 05:07:06 -0600 (MDT) Date: Sat, 17 Mar 2007 05:07:06 -0600 From: Matthew Wilcox To: Paul Mackerras Cc: Heiko Carstens , "Amit K. 
Arora" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070317110706.GB29931@parisc-linux.org> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <17915.48105.70494.958220@cargo.ozlabs.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <17915.48105.70494.958220@cargo.ozlabs.ibm.com> User-Agent: Mutt/1.5.13 (2006-08-11) X-archive-position: 10875 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: matthew@wil.cx Precedence: bulk X-list: xfs Content-Length: 278 Lines: 7 On Sat, Mar 17, 2007 at 08:59:05PM +1100, Paul Mackerras wrote: > ... but wouldn't work on 32-bit powerpc. :( We would end up with a > pad argument between fd and offset, giving 7 arguments in all > (counting the loff_t's as 2), but we only support 6. Ditto mips and parisc. From owner-xfs@oss.sgi.com Sat Mar 17 04:42:53 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 04:43:01 -0700 (PDT) X-Spam-oss-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00,RDNS_NONE autolearn=no version=3.2.0-pre1-r499012 Received: from mail.parisc-linux.org ([192.25.206.14]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HBgq6p028048 for ; Sat, 17 Mar 2007 04:42:53 -0700 Received: by mail.parisc-linux.org (Postfix, from userid 26919) id 3DD3549400B; Sat, 17 Mar 2007 05:10:37 -0600 (MDT) Date: Sat, 17 Mar 2007 05:10:37 -0600 From: Matthew Wilcox To: Heiko Carstens Cc: "Amit K. 
Arora" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070317111036.GC29931@parisc-linux.org> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> User-Agent: Mutt/1.5.13 (2006-08-11) X-archive-position: 10876 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: matthew@wil.cx Precedence: bulk X-list: xfs Content-Length: 423 Lines: 16 On Fri, Mar 16, 2007 at 05:17:04PM +0100, Heiko Carstens wrote: > > +asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) > > e.g. > > asmlinkage long sys_fallocate(int fd, loff_t offset, loff_t len, int mode) > > would work even on s390 ;) How about: asmlinkage long sys_fallocate(int fd, int mode, u32 off_low, u32 off_high, u32 len_low, u32 len_high); That way we all suffer equally ... 
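[Illustration, not part of the thread] The split-argument prototype suggested above relies on each 64-bit loff_t being carried as two 32-bit halves and rebuilt on the receiving side. A minimal user-space sketch of that packing, assuming the low half is passed first as in the proposed prototype; the helper names split_u64 and join_u32 are hypothetical and do not come from any posted patch:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; they are not taken from
 * the fallocate patch under discussion.
 */

/* Split a 64-bit value into the two 32-bit halves a caller would pass. */
static inline void split_u64(uint64_t v, uint32_t *low, uint32_t *high)
{
	assert(low && high);
	*low = (uint32_t)(v & 0xffffffffu);	/* bits 0..31  */
	*high = (uint32_t)(v >> 32);		/* bits 32..63 */
}

/* Rebuild the 64-bit value from its halves on the receiving side. */
static inline uint64_t join_u32(uint32_t low, uint32_t high)
{
	/* Widen before shifting so the high half is not truncated. */
	return ((uint64_t)high << 32) | low;
}
```

With this layout every argument occupies exactly one 32-bit slot, so no architecture needs to insert alignment padding; the cost is that offset and len together consume four slots on every architecture, including 64-bit ones.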
From owner-xfs@oss.sgi.com Sat Mar 17 04:45:24 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 04:45:28 -0700 (PDT) X-Spam-oss-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HBjL6p028607 for ; Sat, 17 Mar 2007 04:45:24 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1HSXLc-0002Bi-21; Sat, 17 Mar 2007 11:45:20 +0000 Date: Sat, 17 Mar 2007 11:45:20 +0000 From: "'Christoph Hellwig'" To: Barry Naujok Cc: "'Christoph Hellwig'" , xfs@oss.sgi.com, xfs-dev@sgi.com Subject: Re: [PATCH] New xfs_repair handling for inode nlink counts Message-ID: <20070317114519.GA7922@infradead.org> References: <20070309073410.GA8798@infradead.org> <200703130151.MAA03235@larry.melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200703130151.MAA03235@larry.melbourne.sgi.com> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 10877 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 218 Lines: 8 On Tue, Mar 13, 2007 at 12:51:35PM +1100, Barry Naujok wrote: > Hi Christoph, > > Thanks for the feedback. I've attached an update to the trackmem stuff > for review. globals.c is now unmodified. Looks good to me. 
From owner-xfs@oss.sgi.com Sat Mar 17 06:01:21 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 06:01:26 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from bay0-omc3-s31.bay0.hotmail.com (bay0-omc3-s31.bay0.hotmail.com [65.54.246.231]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HD1K6p008578 for ; Sat, 17 Mar 2007 06:01:21 -0700 Received: from hotmail.com ([65.54.174.86]) by bay0-omc3-s31.bay0.hotmail.com with Microsoft SMTPSVC(6.0.3790.2668); Sat, 17 Mar 2007 06:01:19 -0700 Received: from mail pickup service by hotmail.com with Microsoft SMTPSVC; Sat, 17 Mar 2007 06:01:19 -0700 Message-ID: Received: from 85.41.109.158 by BAY103-DAV14.phx.gbl with DAV; Sat, 17 Mar 2007 13:01:15 +0000 X-Originating-IP: [85.41.109.158] X-Originating-Email: [pupilla@hotmail.com] X-Sender: pupilla@hotmail.com From: "Marco Berizzi" To: "David Chinner" Cc: "David Chinner" , , References: <20070316012520.GN5743@melbourne.sgi.com> <20070316195951.GB5743@melbourne.sgi.com> Subject: Re: XFS internal error xfs_da_do_buf(2) at line 2087 of file fs/xfs/xfs_da_btree.c. Caller 0xc01b00bd Date: Sat, 17 Mar 2007 14:00:30 +0100 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.2180 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 X-OriginalArrivalTime: 17 Mar 2007 13:01:19.0230 (UTC) FILETIME=[5ADE11E0:01C76894] Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2HD1L6p008581 X-archive-position: 10878 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: pupilla@hotmail.com Precedence: bulk X-list: xfs Content-Length: 1612 Lines: 44 David Chinner wrote: > Ok, so an ipsec change. And I see from the history below it > really has nothing to do with this problem. 
it seems the problem > has something to do with changes between 2.6.19.1 and 2.6.19.2. Indeed. Yesterday at 13:00 I switched from 2.6.19.1 to 2.6.19.2 (without the ipsec fix) and at about 17:30 Linux crashed again. I have recompiled 2.6.19.2 with all kernel debugging options enabled and rebooted. Now I'm waiting for the crash... > There were no changes to XFS between 2.6.19.1 and 2.6.19.2, > so I'm thinking that your problems are related to something > other than XFS. Can you do a git bisect to determine what the > bad patch is? Ok, Monday morning I will try to do a git bisect. I just want to tell you that this machine is a firewall (3 NICs) + ipsec gateway (openswan) + http proxy (squid). I think the problem is related to networking. I have other 2.6.19/2.6.20 Linux boxes acting as proftpd/samba/sendmail servers without any problem. >> > Can you run xfs_repair on that filesystem and see if reports >> > (and fixes) any problems? >> >> I don't need to run xfs_repair to fix the problem, > > Except that the trigger might be on-disk corruption so we > need to rule that out first. Ok, I will run xfs_repair and post the results. >> I only unplug the power cable and reboot the system, >> xfs filesystem are correctly mounted. >> However tell me if I must run xfs_repair to check >> the filesystem. > > Yes, you need to run xfs_repair. Ok, thanks a lot for the feedback. PS: I'm running Slackware 11 with xfsprogs-2.8.10. Is that version fine, or should I upgrade?
From owner-xfs@oss.sgi.com Sat Mar 17 07:32:43 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 07:32:52 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mtagate1.uk.ibm.com (mtagate1.uk.ibm.com [195.212.29.134]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HEWe6p025842 for ; Sat, 17 Mar 2007 07:32:42 -0700 Received: from d06nrmr1407.portsmouth.uk.ibm.com (d06nrmr1407.portsmouth.uk.ibm.com [9.149.38.185]) by mtagate1.uk.ibm.com (8.13.8/8.13.8) with ESMTP id l2HEWcAX084742 for ; Sat, 17 Mar 2007 14:32:38 GMT Received: from d06av03.portsmouth.uk.ibm.com (d06av03.portsmouth.uk.ibm.com [9.149.37.213]) by d06nrmr1407.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2HEWcUc1626332 for ; Sat, 17 Mar 2007 14:32:38 GMT Received: from d06av03.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av03.portsmouth.uk.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2HEWb7Z005719 for ; Sat, 17 Mar 2007 14:32:38 GMT Received: from localhost (ICON-9-164-142-244.megacenter.de.ibm.com [9.164.142.244]) by d06av03.portsmouth.uk.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2HEWbt6005714; Sat, 17 Mar 2007 14:32:37 GMT Date: Sat, 17 Mar 2007 15:30:43 +0100 From: Heiko Carstens To: Matthew Wilcox Cc: Paul Mackerras , "Amit K. 
Arora" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070317143043.GA8577@osiris.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <17915.48105.70494.958220@cargo.ozlabs.ibm.com> <20070317110706.GB29931@parisc-linux.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070317110706.GB29931@parisc-linux.org> User-Agent: mutt-ng/devel-r804 (Linux) X-archive-position: 10879 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: heiko.carstens@de.ibm.com Precedence: bulk X-list: xfs Content-Length: 666 Lines: 16 On Sat, Mar 17, 2007 at 05:07:06AM -0600, Matthew Wilcox wrote: > On Sat, Mar 17, 2007 at 08:59:05PM +1100, Paul Mackerras wrote: > > ... but wouldn't work on 32-bit powerpc. :( We would end up with a > > pad argument between fd and offset, giving 7 arguments in all > > (counting the loff_t's as 2), but we only support 6. > > Ditto mips and parisc. Can't be. Or: mips supports 7 arguments and parisc doesn't pad. Otherwise they couldn't have wired up sys_sync_file_range(int fd, loff_t offset, loff_t nbytes, unsigned int flags) But from what I read, it's currently not possible for 32-bit powerpc to wire up the already present sync_file_range system call. 
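[Illustration, not part of the thread] The argument counts quoted in this exchange can be reproduced with a simplified model. The sketch below assumes that every 32-bit argument takes one register slot, every 64-bit argument takes two, and, on ABIs that pad, a 64-bit argument must start on an even slot. The names arg_width and count_arg_slots are hypothetical, and the model ignores everything else a real calling convention specifies:

```c
#include <assert.h>
#include <stddef.h>

/* Width of one argument, measured in 32-bit register slots. */
enum arg_width { ARG_32 = 1, ARG_64 = 2 };

/*
 * Count the register slots a prototype needs under the simplified model.
 * If align_pairs is nonzero, a 64-bit argument that would start on an
 * odd slot is preceded by one padding slot.
 */
static unsigned int count_arg_slots(const enum arg_width *args, size_t nargs,
				    int align_pairs)
{
	unsigned int slots = 0;

	assert(args != NULL || nargs == 0);
	for (size_t i = 0; i < nargs; i++) {
		if (args[i] == ARG_64 && align_pairs && (slots & 1))
			slots++;	/* padding slot before the pair */
		slots += (unsigned int)args[i];
	}
	return slots;
}
```

Under these assumptions the ordering (fd, mode, offset, len) needs six slots, while (fd, offset, len, mode) needs seven when padding applies and six when it does not. The same arithmetic covers sync_file_range(fd, offset, nbytes, flags), which has the same shape as the second ordering.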
From owner-xfs@oss.sgi.com Sat Mar 17 07:38:44 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 07:38:51 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtps.tip.net.au (chilli.pcug.org.au [203.10.76.44]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HEcf6p027268 for ; Sat, 17 Mar 2007 07:38:44 -0700 Received: from localhost (ta-1-1.tip.net.au [203.11.71.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by smtps.tip.net.au (Postfix) with ESMTP id 0BBA5368073; Sun, 18 Mar 2007 01:38:33 +1100 (EST) Date: Sun, 18 Mar 2007 01:38:38 +1100 From: Stephen Rothwell To: Heiko Carstens Cc: Matthew Wilcox , Paul Mackerras , "Amit K. Arora" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-Id: <20070318013838.d00d420e.sfr@canb.auug.org.au> In-Reply-To: <20070317143043.GA8577@osiris.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <17915.48105.70494.958220@cargo.ozlabs.ibm.com> <20070317110706.GB29931@parisc-linux.org> <20070317143043.GA8577@osiris.ibm.com> X-Mailer: Sylpheed version 2.3.0beta5 (GTK+ 2.8.20; i486-pc-linux-gnu) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Sun__18_Mar_2007_01_38_38_+1100_GAQF/_Zvfh5SpWYk" X-archive-position: 10880 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sfr@canb.auug.org.au Precedence: bulk X-list: xfs 
Content-Length: 1137 Lines: 34 --Signature=_Sun__18_Mar_2007_01_38_38_+1100_GAQF/_Zvfh5SpWYk Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: 7bit On Sat, 17 Mar 2007 15:30:43 +0100 Heiko Carstens wrote: > > sys_sync_file_range(int fd, loff_t offset, loff_t nbytes, unsigned int flags) > > But from what I read, it's currently not possible for 32-bit powerpc to > wire up the already present sync_file_range system call. 32bit native is fine (as the ABI in user mode is the same as that in the kernel). For 32bit on a 64bit kernel you need the arch specific compat routine that I used in the patch I posted a little while ago. -- Cheers, Stephen Rothwell sfr@canb.auug.org.au http://www.canb.auug.org.au/~sfr/ --Signature=_Sun__18_Mar_2007_01_38_38_+1100_GAQF/_Zvfh5SpWYk Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF+/10FdBgD/zoJvwRAhZ2AJ41njj68tFPjdgD3BIaJ9BJUMMl6ACdHnHT 1gtx0s+ho29+CsAiNUBHOns= =y+RJ -----END PGP SIGNATURE----- --Signature=_Sun__18_Mar_2007_01_38_38_+1100_GAQF/_Zvfh5SpWYk-- From owner-xfs@oss.sgi.com Sat Mar 17 07:42:12 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 07:42:18 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtps.tip.net.au (chilli.pcug.org.au [203.10.76.44]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HEgB6p028449 for ; Sat, 17 Mar 2007 07:42:12 -0700 Received: from localhost (ta-1-1.tip.net.au [203.11.71.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by smtps.tip.net.au (Postfix) with ESMTP id D058B368074; Sun, 18 Mar 2007 01:42:09 +1100 (EST) Date: Sun, 18 Mar 2007 01:42:14 +1100 From: Stephen Rothwell To: Heiko Carstens Cc: Matthew Wilcox , Paul Mackerras , "Amit K. 
Arora" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-Id: <20070318014214.bd7ce48a.sfr@canb.auug.org.au> In-Reply-To: <20070318013838.d00d420e.sfr@canb.auug.org.au> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <17915.48105.70494.958220@cargo.ozlabs.ibm.com> <20070317110706.GB29931@parisc-linux.org> <20070317143043.GA8577@osiris.ibm.com> <20070318013838.d00d420e.sfr@canb.auug.org.au> X-Mailer: Sylpheed version 2.3.0beta5 (GTK+ 2.8.20; i486-pc-linux-gnu) Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Sun__18_Mar_2007_01_42_14_+1100_F1IKO8d_lae5GUnw" X-archive-position: 10881 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sfr@canb.auug.org.au Precedence: bulk X-list: xfs Content-Length: 1132 Lines: 36 --Signature=_Sun__18_Mar_2007_01_42_14_+1100_F1IKO8d_lae5GUnw Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: 7bit On Sun, 18 Mar 2007 01:38:38 +1100 Stephen Rothwell wrote: > > On Sat, 17 Mar 2007 15:30:43 +0100 Heiko Carstens wrote: > > > > sys_sync_file_range(int fd, loff_t offset, loff_t nbytes, unsigned int flags) > > > > But from what I read, it's currently not possible for 32-bit powerpc to > > wire up the already present sync_file_range system call. > > 32bit native is fine (as the ABI in user mode is the same as that in the Sorry, I take that back ... 
-- Cheers, Stephen Rothwell sfr@canb.auug.org.au http://www.canb.auug.org.au/~sfr/ --Signature=_Sun__18_Mar_2007_01_42_14_+1100_F1IKO8d_lae5GUnw Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFF+/5MFdBgD/zoJvwRAqmAAJ4l3dpzjhJtieFkiWyo6Ert8Ua8EQCfR2cF cGgls4pRiPxNQNH1AgKchbU= =mGuz -----END PGP SIGNATURE----- --Signature=_Sun__18_Mar_2007_01_42_14_+1100_F1IKO8d_lae5GUnw-- From owner-xfs@oss.sgi.com Sat Mar 17 08:27:40 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 08:27:46 -0700 (PDT) X-Spam-oss-Status: No, score=-0.2 required=5.0 tests=BAYES_40 autolearn=ham version=3.2.0-pre1-r499012 Received: from caramon.arm.linux.org.uk (caramon.arm.linux.org.uk [217.147.92.249]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HFRb6p010125 for ; Sat, 17 Mar 2007 08:27:39 -0700 Received: from flint.arm.linux.org.uk ([2002:d993:5cf9:1:201:2ff:fe14:8fad]) by caramon.arm.linux.org.uk with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.62) (envelope-from ) id 1HSaI4-000579-1U; Sat, 17 Mar 2007 14:53:52 +0000 Received: from rmk by flint.arm.linux.org.uk with local (Exim 4.62) (envelope-from ) id 1HSaI1-00017k-93; Sat, 17 Mar 2007 14:53:49 +0000 Date: Sat, 17 Mar 2007 14:53:48 +0000 From: Russell King To: "Amit K. Arora" Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070317145348.GA32278@flint.arm.linux.org.uk> Mail-Followup-To: "Amit K. 
Arora" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070316143101.GA10152@amitarora.in.ibm.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 10882 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: rmk+lkml@arm.linux.org.uk Precedence: bulk X-list: xfs Content-Length: 1395 Lines: 31 On Fri, Mar 16, 2007 at 08:01:01PM +0530, Amit K. Arora wrote: > Attached below is the patch which implements this system call. It has > been currently implemented and tested on i386, ppc64 and x86_64 > architectures. I am facing some problems while trying to implement this > on s390, and thus the delay. While I try to get it right on s390(x), we > thought of posting this patch, so that we can save some time. Parallely > we will work on getting the patch work on s390, and probably it will > come as a separate patch. I suggest reading the very end of arch/arm/kernel/sys_arm.c; I'd rather avoid adding more and more hacks like that to the kernel if at all possible. One solution (already mentioned elsewhere) is that we start avoiding passing 64-bit arguments and instead pass two 32-bit values. This nicely avoids the alignment restrictions for 64-bit args in ABIs. (The issue for ARM is that with anything other than the "fd, mode, offset, len" layout we will have to deal with different ABI argument layouts, or implement our own wrapper function as done for sys_arm_sync_file_range.) 
I think the problem comes down to "what is the argument layout which causes the least amount of problems for the complete set of architectures." For ARM, that's the "fd, mode, offset, len" layout. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: From owner-xfs@oss.sgi.com Sat Mar 17 08:40:37 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 08:40:41 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lst.de (verein.lst.de [213.95.11.210]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HFeY6p012804 for ; Sat, 17 Mar 2007 08:40:36 -0700 Received: from verein.lst.de (localhost [127.0.0.1]) by mail.lst.de (8.12.3/8.12.3/Debian-7.1) with ESMTP id l2HFeSb2019038 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Sat, 17 Mar 2007 16:40:28 +0100 Received: (from hch@localhost) by verein.lst.de (8.12.3/8.12.3/Debian-6.6) id l2HFeSPP019036; Sat, 17 Mar 2007 16:40:28 +0100 Date: Sat, 17 Mar 2007 16:40:28 +0100 From: Christoph Hellwig To: tes@melbourne.sgi.com, xfs@oss.sgi.com Subject: HAVE_FORMAT32 never enabled in mainline Message-ID: <20070317154028.GA18997@lst.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.3.28i X-Scanned-By: MIMEDefang 2.39 X-archive-position: 10883 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@lst.de Precedence: bulk X-list: xfs Content-Length: 130 Lines: 3 Looks like the code for dealing with x86/amd64 logs is never actually enable in mainline because nothing defines HAVE_FORMAT32. 
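For readers following along, a minimal sketch of how a preprocessor guard of this kind behaves. It assumes the definitions in question sit under "#ifndef HAVE_FORMAT32" (rather than "#ifdef"); the struct, macro, and function names below are hypothetical stand-ins and are not the actual XFS definitions.

```c
/*
 * Nothing in this file defines HAVE_FORMAT32, so the preprocessor keeps
 * everything between #ifndef and #else and drops the #else branch.
 */
#ifndef HAVE_FORMAT32
struct example_format_32 {	/* hypothetical stand-in for a 32-bit log format */
	unsigned int len;
};
#define EXAMPLE_FORMAT32_COMPILED 1
#else
#define EXAMPLE_FORMAT32_COMPILED 0
#endif

/* Reports at run time which branch of the guard was compiled. */
static int example_format32_compiled(void)
{
	return EXAMPLE_FORMAT32_COMPILED;
}
```

In other words, with an #ifndef guard, code is compiled in precisely when nobody defines the macro.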
From owner-xfs@oss.sgi.com Sat Mar 17 15:00:51 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 17 Mar 2007 15:00:56 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2HM0o6p021658 for ; Sat, 17 Mar 2007 15:00:51 -0700 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id C7C58185FC857; Sat, 17 Mar 2007 17:00:48 -0500 (CDT) Message-ID: <45FC650F.70208@sandeen.net> Date: Sat, 17 Mar 2007 17:00:47 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: Christoph Hellwig CC: tes@melbourne.sgi.com, xfs@oss.sgi.com Subject: Re: HAVE_FORMAT32 never enabled in mainline References: <20070317154028.GA18997@lst.de> In-Reply-To: <20070317154028.GA18997@lst.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10884 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 431 Lines: 11 Christoph Hellwig wrote: > Looks like the code for dealing with x86/amd64 logs is never actually > enable in mainline because nothing defines HAVE_FORMAT32. Actually since it's all #ifndef (note the n) everything under the #ifndefs -is- enabled (since it's never defined), so I think it's all functional.... structures under the #ifndefs are used outside the #ifndefs, so the #ifndefs can probably just be removed...? 
-Eric From owner-xfs@oss.sgi.com Sun Mar 18 03:07:26 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 18 Mar 2007 03:07:30 -0700 (PDT) X-Spam-oss-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_38 autolearn=no version=3.2.0-pre1-r499012 Received: from mail.edu.haifa.ac.il (mail.edu.haifa.ac.il [132.74.40.10]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2IA7M6p005390 for ; Sun, 18 Mar 2007 03:07:25 -0700 Received: from localhost (localhost [127.0.0.1]) by mail.edu.haifa.ac.il (Postfix) with ESMTP id ED0A91F8C2; Sun, 18 Mar 2007 12:13:04 +0200 (IST) Received: from mail.edu.haifa.ac.il ([127.0.0.1]) by localhost (mail.edu.haifa.ac.il [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id m1IiGPZ6rByb; Sun, 18 Mar 2007 12:13:04 +0200 (IST) Received: from kozanostra (leon.edu.haifa.ac.il [132.74.41.33]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) by mail.edu.haifa.ac.il (Postfix) with ESMTP id BAA2E193E9; Sun, 18 Mar 2007 12:13:04 +0200 (IST) From: "Leon Kolchinsky" To: "'Martin Steigerwald'" , Subject: RE: cache+barriers vs cache+nobarriers vs disabled cache+barriers vs disabled cache+nobarriers Date: Sun, 18 Mar 2007 12:07:11 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset="windows-1255" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.5510 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028 In-Reply-To: <200703151339.36259.Martin@lichtvoll.de> thread-index: AcdnAthP/ohJyY/dQoi80dtD3Gr+UQCQJ2aQ Message-Id: <20070318101304.BAA2E193E9@mail.edu.haifa.ac.il> X-archive-position: 10886 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: leonk@construct.haifa.ac.il Precedence: bulk X-list: xfs Content-Length: 1256 Lines: 36 > > Hello All, > > > > > > After reading http://oss.sgi.com/projects/xfs/faq.html#wcache > > and some posts on the list I've got the following question: > > > > If I 
have disabled write cache on the disk (hdparm -W0 /dev/hda) and by > > default FS is mounted with "barrier" enabled, Is there any taste in > > enabling "barrier"(by default) because write cache is disabled anyway > > or may be it's a good idea to mount with "nobarriers" in this case? > > Hello Leon! > > It is not needed to enable barriers when write cache is disabled. Enabling > barriers in this case shouldn't have any visible effect I think. > Thanks for your reply Martin, My goal is to avoid filesystem corruption at any cost (while trying to use fastest FS for linux) and according to the FAQ disabling write cache is the right way to do it. Power/Hardware failure may occur in-between the flushes (with write barrier enabled) so the safe way (I think) is to disable write cache. It's interesting if there is a significant drop in the performance with disabled disk "write cache" and XFS filesystem comparing to ext3+enabled "write cache". Has anyone some statistics or tests comparing ext3+enabled write cache vs. xfs+disabled write cache? Best Regards, Leon Kolchinsky From owner-xfs@oss.sgi.com Sun Mar 18 14:23:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 18 Mar 2007 14:23:05 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2ILMx6p003167 for ; Sun, 18 Mar 2007 14:23:01 -0700 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 246DBAAC2FF; Mon, 19 Mar 2007 08:22:57 +1100 (EST) Subject: But wait, theres more... 
[Fwd: Bug#415123: -R option can't append to a plain file] From: Nathan Scott Reply-To: nscott@aconex.com To: wkendall@sgi.com Cc: xfs@oss.sgi.com Content-Type: text/plain Organization: Aconex Date: Mon, 19 Mar 2007 08:23:13 +1100 Message-Id: <1174252993.5051.233.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 10887 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 832 Lines: 28 And a free set of steak knives if you fix this bug... ;-) -------- Forwarded Message -------- From: Peter Chubb Reply-To: Peter Chubb , 415123@bugs.debian.org To: submit@bugs.debian.org Subject: Bug#415123: -R option can't append to a plain file Date: Fri, 16 Mar 2007 19:54:37 +1100 Package: xfsdump Version: 2.2.38-1 If I use xfsdump to dump to a plain file, interrupt it, then restart with the -R option, xfsdump complains: xfsdump: ERROR: media contains valid xfsdump but does not support append which is misleading: of *course* you can append to a regular file if there's space on the filesystem. 
-- Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au http://www.ertos.nicta.com.au ERTOS within National ICT Australia -- Nathan From owner-xfs@oss.sgi.com Sun Mar 18 14:33:18 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 18 Mar 2007 14:33:21 -0700 (PDT) X-Spam-oss-Status: No, score=-0.3 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_38 autolearn=no version=3.2.0-pre1-r499012 Received: from smtp107.sbc.mail.mud.yahoo.com (smtp107.sbc.mail.mud.yahoo.com [68.142.198.206]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2ILXG6p005661 for ; Sun, 18 Mar 2007 14:33:18 -0700 Received: (qmail 70514 invoked from network); 18 Mar 2007 21:33:14 -0000 Received: from unknown (HELO stupidest.org) (cwedgwood@sbcglobal.net@24.5.75.45 with login) by smtp107.sbc.mail.mud.yahoo.com with SMTP; 18 Mar 2007 21:33:14 -0000 X-YMail-OSG: X9QztVgVM1nF6ezks.z84N.paHAvbSdsBNgQntGSdOgGd5oq2NqFWjfkG_VbUTzWPpoNeOZHXdM72pS6raE4HZEoZNXMWeKqpFgfZD9kQe_ELiBL795gZ1ZwdlPipqk4DO79L7UjOCKIT60- Received: by tuatara.stupidest.org (Postfix, from userid 10000) id BCBB51826129; Sun, 18 Mar 2007 14:33:13 -0700 (PDT) Date: Sun, 18 Mar 2007 14:33:13 -0700 From: Chris Wedgwood To: Leon Kolchinsky Cc: "'Martin Steigerwald'" , linux-xfs@oss.sgi.com Subject: Re: cache+barriers vs cache+nobarriers vs disabled cache+barriers vs disabled cache+nobarriers Message-ID: <20070318213313.GA23121@tuatara.stupidest.org> References: <200703151339.36259.Martin@lichtvoll.de> <20070318101304.BAA2E193E9@mail.edu.haifa.ac.il> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070318101304.BAA2E193E9@mail.edu.haifa.ac.il> X-archive-position: 10888 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cw@f00f.org Precedence: bulk X-list: xfs Content-Length: 1472 Lines: 44 On Sun, Mar 18, 2007 at 12:07:11PM +0200, Leon Kolchinsky wrote: > My goal is to avoid filesystem 
corruption at any cost doing what? applications can still lose data if they're not careful > (while trying to use fastest FS for linux) and according to the FAQ > disabling write cache is the right way to do it. unless your applications are careful, it's typical that reliability is going to cost in terms of performance > Power/Hardware failure may occur in-between the flushes (with write > barrier enabled) so the safe way (I think) is to disable write > cache. i've heard (but nobody has been able to give concrete details on this) that disabling the write-cache on modern drives will lessen their lifespan, often considerably with write barriers sane applications should be just as safe as when you disable the write cache > It's interesting if there is a significant drop in the performance > with disabled disk "write cache" and XFS filesystem comparing to > ext3+enabled "write cache". that's expected, w/o the write-cache drives are typically a lot slower for many loads (why else would they put a write cache in disks after all) > Has anyone some statistics or tests comparing ext3+enabled write > cache vs. xfs+disabled write cache? why would you compare those two? why ext3 w/ caches enabled and xfs w/ caches disabled? anyhow, it depends on your load, for some loads ext3 is faster and others xfs is faster what access patterns are you expecting? 
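On the point that applications have to be careful with their own data: a minimal sketch, assuming a POSIX system, of an application that does not treat a write as complete until fsync() has returned. write_durably is a hypothetical helper name used only for this illustration. Whether the data then survives a power failure still depends on the drive write cache and barrier settings discussed in this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write len bytes from buf to path, then flush them with fsync() before
 * reporting success. Returns 0 on success, -1 on error.
 */
static int write_durably(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			close(fd);
			return -1;
		}
		p += n;			/* handle short writes */
		len -= (size_t)n;
	}

	/* Ask the kernel to push the file data out to the storage device. */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

The loop matters because write() may store fewer bytes than requested, and checking the fsync() result matters because that is where a deferred I/O error is typically reported.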
From owner-xfs@oss.sgi.com Sun Mar 18 18:04:13 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 18 Mar 2007 18:04:16 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2J1496p023688 for ; Sun, 18 Mar 2007 18:04:11 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA00527; Mon, 19 Mar 2007 12:03:55 +1100 Date: Mon, 19 Mar 2007 12:04:00 +1100 From: Timothy Shimmin To: Christoph Hellwig , tes@melbourne.sgi.com, xfs@oss.sgi.com Subject: Re: HAVE_FORMAT32 never enabled in mainline Message-ID: In-Reply-To: <20070317154028.GA18997@lst.de> References: <20070317154028.GA18997@lst.de> X-Mailer: Mulberry/4.0.8 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10889 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 772 Lines: 20 Hi Christoph, --On 17 March 2007 4:40:28 PM +0100 Christoph Hellwig wrote: > Looks like the code for dealing with x86/amd64 logs is never actually > enable in mainline because nothing defines HAVE_FORMAT32. It is enabled, the HAVE_FORMAT32 is #ifndef around some particular type definitions. (The code has been tested:) It is currently used because those header files are shared between userspace and kernel. And in userspace, there is a case where the 32bit format definition is already defined in another header file - that case is for IRIX, whose pack syntax is different with its compiler. 
So if this is a concern then I would need to have these defined elsewhere or break the kernel/userspace link on this file, etc... Suggestions welcome... --Tim From owner-xfs@oss.sgi.com Sun Mar 18 18:06:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 18 Mar 2007 18:06:05 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2J15w6p024237 for ; Sun, 18 Mar 2007 18:06:00 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA00560; Mon, 19 Mar 2007 12:05:49 +1100 Date: Mon, 19 Mar 2007 12:05:54 +1100 From: Timothy Shimmin To: Eric Sandeen , Christoph Hellwig cc: tes@melbourne.sgi.com, xfs@oss.sgi.com Subject: Re: HAVE_FORMAT32 never enabled in mainline Message-ID: <3B879A0369DABD1EDF88296F@timothy-shimmins-power-mac-g5.local> In-Reply-To: <45FC650F.70208@sandeen.net> References: <20070317154028.GA18997@lst.de> <45FC650F.70208@sandeen.net> X-Mailer: Mulberry/4.0.8 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10890 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 679 Lines: 26 Hi Eric, --On 17 March 2007 5:00:47 PM -0500 Eric Sandeen wrote: > Christoph Hellwig wrote: >> Looks like the code for dealing with x86/amd64 logs is never actually >> enable in mainline because nothing defines HAVE_FORMAT32. > > Actually since it's all #ifndef (note the n) everything under the #ifndefs -is- enabled (since > it's never defined), so I think it's all functional.... Yep (sorry, just missed your email). 
> structures under the #ifndefs are used > outside the #ifndefs, so the #ifndefs can probably just be removed...? Would need to sort out the userspace header sharing situation first as mentioned in the hch email. Cheers, Tim. From owner-xfs@oss.sgi.com Mon Mar 19 00:49:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 00:49:28 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from relay.sw.ru (mailhub.sw.ru [195.214.233.200]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2J7nJ6p027244 for ; Mon, 19 Mar 2007 00:49:21 -0700 Received: from localhost ([192.168.3.106]) by relay.sw.ru (8.13.4/8.13.4) with ESMTP id l2J7mRlh007562; Mon, 19 Mar 2007 10:48:28 +0300 (MSK) To: linux-kernel@vger.kernel.org CC: linux-fsdevel@vger.kernel.org, linux-ntfs-dev@lists.sourceforge.net, xfs@oss.sgi.com, devel@openvz.org Subject: [PATCH 1/2] fs: remove duplicated iovec checking code v8 From: Dmitriy Monakhov Date: Mon, 19 Mar 2007 10:49:01 +0300 Message-ID: <87slc18yhe.fsf@sw.ru> User-Agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 10891 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dmonakhov@openvz.org Precedence: bulk X-list: xfs Content-Length: 6911 Lines: 226 There are several places where the same code is used for iovec checks. This patch just moves this code to a separate helper function and replaces the duplicated code with it. IMHO it is better because these are checks that we want for all filesystems/drivers that use vectored I/O. 
Signed-off-by: Dmitriy Monakhov --- fs/ntfs/file.c | 21 +++------------------ fs/read_write.c | 40 ++++++++++++++++++++++++++++++++++++++++ fs/xfs/linux-2.6/xfs_lrw.c | 22 +++------------------- include/linux/fs.h | 3 +++ mm/filemap.c | 43 ++++++------------------------------------- 5 files changed, 55 insertions(+), 74 deletions(-) diff --git a/fs/ntfs/file.c b/fs/ntfs/file.c index dbbac55..2c672a4 100644 --- a/fs/ntfs/file.c +++ b/fs/ntfs/file.c @@ -2129,28 +2129,13 @@ static ssize_t ntfs_file_aio_write_nolock(struct kiocb *iocb, struct address_space *mapping = file->f_mapping; struct inode *inode = mapping->host; loff_t pos; - unsigned long seg; size_t count; /* after file limit checks */ ssize_t written, err; count = 0; - for (seg = 0; seg < nr_segs; seg++) { - const struct iovec *iv = &iov[seg]; - /* - * If any segment has a negative length, or the cumulative - * length ever wraps negative then return -EINVAL. - */ - count += iv->iov_len; - if (unlikely((ssize_t)(count|iv->iov_len) < 0)) - return -EINVAL; - if (access_ok(VERIFY_READ, iv->iov_base, iv->iov_len)) - continue; - if (!seg) - return -EFAULT; - nr_segs = seg; - count -= iv->iov_len; /* This segment is no good */ - break; - } + err = generic_iovec_checks(iov, &nr_segs, &count, VERIFY_READ); + if (err) + return err; pos = *ppos; vfs_check_frozen(inode->i_sb, SB_FREEZE_WRITE); /* We can write back this queue in page reclaim. */ diff --git a/fs/read_write.c b/fs/read_write.c index 4d03008..22ec324 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -217,6 +217,46 @@ Einval: return -EINVAL; } +/* + * Performs necessary iovec checks before doing a write + * @iov: io vector request + * @nr_segs: number of segments in the iovec + * @count: number of bytes to write + * @access_flags: type of access: %VERIFY_READ or %VERIFY_WRITE + * + * Adjust number of segments and amount of bytes to write (nr_segs should be + * properly initialized first). 
Returns appropriate error code that caller + * should return or zero in case that write should be allowed. + */ +int generic_iovec_checks(const struct iovec *iov, + unsigned long *nr_segs, size_t *count, + unsigned long access_flags) +{ + unsigned long seg; + size_t cnt = 0; + for (seg = 0; seg < *nr_segs; seg++) { + const struct iovec *iv = &iov[seg]; + + /* + * If any segment has a negative length, or the cumulative + * length ever wraps negative then return -EINVAL. + */ + cnt += iv->iov_len; + if (unlikely((ssize_t)(cnt|iv->iov_len) < 0)) + return -EINVAL; + if (access_ok(access_flags, iv->iov_base, iv->iov_len)) + continue; + if (seg == 0) + return -EFAULT; + *nr_segs = seg; + cnt -= iv->iov_len; /* This segment is no good */ + break; + } + *count = cnt; + return 0; +} +EXPORT_SYMBOL(generic_iovec_checks); + static void wait_on_retry_sync_kiocb(struct kiocb *iocb) { set_current_state(TASK_UNINTERRUPTIBLE); diff --git a/fs/xfs/linux-2.6/xfs_lrw.c b/fs/xfs/linux-2.6/xfs_lrw.c index ff8d64e..9a11b00 100644 --- a/fs/xfs/linux-2.6/xfs_lrw.c +++ b/fs/xfs/linux-2.6/xfs_lrw.c @@ -639,7 +639,6 @@ xfs_write( xfs_fsize_t isize, new_size; xfs_iocore_t *io; bhv_vnode_t *vp; - unsigned long seg; int iolock; int eventsent = 0; bhv_vrwlock_t locktype; @@ -652,24 +651,9 @@ xfs_write( vp = BHV_TO_VNODE(bdp); xip = XFS_BHVTOI(bdp); - for (seg = 0; seg < segs; seg++) { - const struct iovec *iv = &iovp[seg]; - - /* - * If any segment has a negative length, or the cumulative - * length ever wraps negative then return -EINVAL. 
- */ - ocount += iv->iov_len; - if (unlikely((ssize_t)(ocount|iv->iov_len) < 0)) - return -EINVAL; - if (access_ok(VERIFY_READ, iv->iov_base, iv->iov_len)) - continue; - if (seg == 0) - return -EFAULT; - segs = seg; - ocount -= iv->iov_len; /* This segment is no good */ - break; - } + error = generic_iovec_checks(iovp, &segs, &ocount, VERIFY_READ); + if (error) + return error; count = ocount; pos = *offset; diff --git a/include/linux/fs.h b/include/linux/fs.h index b07d505..032b907 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1771,6 +1771,9 @@ extern ssize_t generic_file_sendfile(struct file *, loff_t *, size_t, read_actor extern void do_generic_mapping_read(struct address_space *mapping, struct file_ra_state *, struct file *, loff_t *, read_descriptor_t *, read_actor_t); +extern int generic_iovec_checks(const struct iovec *iov, + unsigned long *nr_segs, size_t *count, + unsigned long access_flags); /* fs/splice.c */ extern ssize_t generic_file_splice_read(struct file *, loff_t *, diff --git a/mm/filemap.c b/mm/filemap.c index 8e1849a..bbef42f 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1180,24 +1180,9 @@ generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov, loff_t *ppos = &iocb->ki_pos; count = 0; - for (seg = 0; seg < nr_segs; seg++) { - const struct iovec *iv = &iov[seg]; - - /* - * If any segment has a negative length, or the cumulative - * length ever wraps negative then return -EINVAL. 
- */ - count += iv->iov_len; - if (unlikely((ssize_t)(count|iv->iov_len) < 0)) - return -EINVAL; - if (access_ok(VERIFY_WRITE, iv->iov_base, iv->iov_len)) - continue; - if (seg == 0) - return -EFAULT; - nr_segs = seg; - count -= iv->iov_len; /* This segment is no good */ - break; - } + retval = generic_iovec_checks(iov, &nr_segs, &count, VERIFY_WRITE); + if (retval) + return retval; /* coalesce the iovecs and go direct-to-BIO for O_DIRECT */ if (filp->f_flags & O_DIRECT) { @@ -2094,30 +2079,14 @@ __generic_file_aio_write_nolock(struct kiocb *iocb, const struct iovec *iov, size_t ocount; /* original count */ size_t count; /* after file limit checks */ struct inode *inode = mapping->host; - unsigned long seg; loff_t pos; ssize_t written; ssize_t err; ocount = 0; - for (seg = 0; seg < nr_segs; seg++) { - const struct iovec *iv = &iov[seg]; - - /* - * If any segment has a negative length, or the cumulative - * length ever wraps negative then return -EINVAL. - */ - ocount += iv->iov_len; - if (unlikely((ssize_t)(ocount|iv->iov_len) < 0)) - return -EINVAL; - if (access_ok(VERIFY_READ, iv->iov_base, iv->iov_len)) - continue; - if (seg == 0) - return -EFAULT; - nr_segs = seg; - ocount -= iv->iov_len; /* This segment is no good */ - break; - } + err = generic_iovec_checks(iov, &nr_segs, &ocount, VERIFY_READ); + if (err) + return err; count = ocount; pos = *ppos; -- 1.4.4.2 From owner-xfs@oss.sgi.com Mon Mar 19 02:31:28 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 02:31:39 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com [32.97.182.142]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2J9VR6p020551 for ; Mon, 19 Mar 2007 02:31:28 -0700 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e2.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id l2J9VOvt026036 for ; Mon, 19 Mar 2007 05:31:24 -0400 Received: 
from d01av01.pok.ibm.com (d01av01.pok.ibm.com [9.56.224.215]) by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2J9U83A183388 for ; Mon, 19 Mar 2007 05:30:08 -0400 Received: from d01av01.pok.ibm.com (loopback [127.0.0.1]) by d01av01.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2J9U7Zk011279 for ; Mon, 19 Mar 2007 05:30:08 -0400 Received: from amitarora.in.ibm.com ([9.124.93.76]) by d01av01.pok.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2J9U62j011136; Mon, 19 Mar 2007 05:30:06 -0400 Received: from amitarora.in.ibm.com (localhost.localdomain [127.0.0.1]) by amitarora.in.ibm.com (Postfix) with ESMTP id 3706D29ECC3; Mon, 19 Mar 2007 15:00:08 +0530 (IST) Received: (from amit@localhost) by amitarora.in.ibm.com (8.13.1/8.13.1/Submit) id l2J9U6oE003648; Mon, 19 Mar 2007 15:00:06 +0530 Date: Mon, 19 Mar 2007 15:00:06 +0530 From: "Amit K. Arora" To: Stephen Rothwell Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070319093006.GB12092@amitarora.in.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070317163350.6b676c26.sfr@canb.auug.org.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070317163350.6b676c26.sfr@canb.auug.org.au> User-Agent: Mutt/1.4.1i X-archive-position: 10892 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: aarora@linux.vnet.ibm.com Precedence: bulk X-list: xfs Content-Length: 836 Lines: 25 On Sat, Mar 17, 2007 at 04:33:50PM +1100, Stephen Rothwell wrote: > On Fri, 16 Mar 2007 20:01:01 +0530 "Amit K. 
Arora" wrote: > > > > > +asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len); > > > > --- linux-2.6.20.1.orig/include/asm-powerpc/systbl.h > > +++ linux-2.6.20.1/include/asm-powerpc/systbl.h > > @@ -305,3 +305,4 @@ SYSCALL_SPU(faccessat) > > COMPAT_SYS_SPU(get_robust_list) > > COMPAT_SYS_SPU(set_robust_list) > > COMPAT_SYS(move_pages) > > +SYSCALL(fallocate) > > It is going to need to be a COMPAT_SYS call in powerpc because 32 bit > powerpc will pass the two loff_t's in pairs of registers while > 64bit passes them in one register each. Ok. Will make that change, unless it is decided to pass each loff_t argument as two "u32"s. Thanks! -- Regards, Amit Arora From owner-xfs@oss.sgi.com Mon Mar 19 02:49:51 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 02:49:58 -0700 (PDT) X-Spam-oss-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from over.ny.us.ibm.com (over.ny.us.ibm.com [32.97.182.150]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2J9nm6p023850 for ; Mon, 19 Mar 2007 02:49:51 -0700 Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.149]) by pokfb.esmtp.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2J9OIHG013790 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Mon, 19 Mar 2007 05:24:18 -0400 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e31.co.us.ibm.com (8.13.8/8.13.8) with ESMTP id l2J9OEIR011701 for ; Mon, 19 Mar 2007 05:24:14 -0400 Received: from d03av04.boulder.ibm.com (d03av04.boulder.ibm.com [9.17.195.170]) by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2J9OEbW057976 for ; Mon, 19 Mar 2007 03:24:14 -0600 Received: from d03av04.boulder.ibm.com (loopback [127.0.0.1]) by d03av04.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2J9ODFD002219 for ; Mon, 19 Mar 2007 03:24:13 -0600 Received: from amitarora.in.ibm.com ([9.124.93.76]) by 
d03av04.boulder.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2J9OBGV002148; Mon, 19 Mar 2007 03:24:12 -0600 Received: from amitarora.in.ibm.com (localhost.localdomain [127.0.0.1]) by amitarora.in.ibm.com (Postfix) with ESMTP id E97D429ECC3; Mon, 19 Mar 2007 14:54:07 +0530 (IST) Received: (from amit@localhost) by amitarora.in.ibm.com (8.13.1/8.13.1/Submit) id l2J9O59N032401; Mon, 19 Mar 2007 14:54:05 +0530 Date: Mon, 19 Mar 2007 14:54:04 +0530 From: "Amit K. Arora" To: Heiko Carstens Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070319092404.GA12092@amitarora.in.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316152103.GD8525@osiris.boeblingen.de.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070316152103.GD8525@osiris.boeblingen.de.ibm.com> User-Agent: Mutt/1.4.1i X-archive-position: 10893 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: aarora@linux.vnet.ibm.com Precedence: bulk X-list: xfs Content-Length: 1614 Lines: 36 On Fri, Mar 16, 2007 at 04:21:03PM +0100, Heiko Carstens wrote: > On Fri, Mar 16, 2007 at 08:01:01PM +0530, Amit K. Arora wrote: > > First of all, thanks for the overwhelming response! > > > > Based on the suggestions received, I have added a new parameter to the > > sys_fallocate() system call - an interger called "mode", just after the > > "fd". 
Now the system call looks like this: > > > > asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) > > > > Currently we have two modes FA_ALLOCATE and FA_DEALLOCATE, for > > preallocation and deallocation of preallocated blocks respectively. More > > modes can be added, when required. And these modes can be renamed, since > > I am sure these are no way the best ones ! :) > > > > Attached below is the patch which implements this system call. It has > > been currently implemented and tested on i386, ppc64 and x86_64 > > architectures. I am facing some problems while trying to implement this > > on s390, and thus the delay. While I try to get it right on s390(x), we > > thought of posting this patch, so that we can save some time. Parallely > > we will work on getting the patch work on s390, and probably it will > > come as a separate patch. > > What's the problem you face on s390? If it's just the compat wrapper, you > may look at sys_sync_file_range_wrapper. Or I will send a patch if needed. Hi Heiko, Yes, the problem was adding the compat wrapper for this. I would appreciate your help in writing it. The only thing is that we might have to wait until the order of the arguments is decided upon. Thanks!
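For illustration only (this is not part of the posted patch): a minimal userspace sketch of the alternative mentioned earlier in this thread, where each loff_t is passed as two 32-bit halves and recombined on the other side. The demo_ names are hypothetical, and the low/high ordering is an assumption, since the argument order was still undecided at this point.

```c
#include <stdint.h>

/* Hypothetical helper: rebuild a 64-bit value from two 32-bit halves. */
static int64_t demo_combine_halves(uint32_t low, uint32_t high)
{
    return (int64_t)(((uint64_t)high << 32) | (uint64_t)low);
}

/*
 * Hypothetical stand-in for an entry point that takes split arguments.
 * Returns the end of the requested range (offset + len), or -1 if the
 * recombined values are not a usable range.
 */
static int64_t demo_range_end(uint32_t off_low, uint32_t off_high,
                              uint32_t len_low, uint32_t len_high)
{
    int64_t offset = demo_combine_halves(off_low, off_high);
    int64_t len = demo_combine_halves(len_low, len_high);

    if (offset < 0 || len <= 0)
        return -1;
    if (len > INT64_MAX - offset)   /* offset + len would overflow */
        return -1;
    return offset + len;
}
```

With this shape every argument fits in a 32-bit register, so the same argument list can be used by 32-bit and 64-bit callers; the cost is that the recombination has to happen somewhere on every architecture.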
-- Regards, Amit Arora From owner-xfs@oss.sgi.com Mon Mar 19 02:54:18 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 02:54:23 -0700 (PDT) X-Spam-oss-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2J9sH6p025288 for ; Mon, 19 Mar 2007 02:54:18 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1HTEAG-0001Eh-Sd; Mon, 19 Mar 2007 09:28:28 +0000 Date: Mon, 19 Mar 2007 09:28:28 +0000 From: Christoph Hellwig To: Dmitriy Monakhov Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ntfs-dev@lists.sourceforge.net, xfs@oss.sgi.com, devel@openvz.org Subject: Re: [PATCH 1/2] fs: remove duplicated iovec checking code v8 Message-ID: <20070319092828.GA3364@infradead.org> Mail-Followup-To: Christoph Hellwig , Dmitriy Monakhov , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ntfs-dev@lists.sourceforge.net, xfs@oss.sgi.com, devel@openvz.org References: <87slc18yhe.fsf@sw.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87slc18yhe.fsf@sw.ru> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 10894 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 593 Lines: 12 On Mon, Mar 19, 2007 at 10:49:01AM +0300, Dmitriy Monakhov wrote: > > Where are several places where the same code used for iovec checks. > This patch just move this code to separate helper function, and replace > duplicated code with it. 
IMHO it is better because these are checks that > we want for all filesystems/drivers that use vectored I/O. Please move this into the common code path, so it's checked before entering the filesystem. This won't cover calculating the count until we have an iodesc/uio structure to pass it down, so feel free to add a tiny helper for that temporarily. From owner-xfs@oss.sgi.com Mon Mar 19 04:25:14 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 04:25:23 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mtagate4.uk.ibm.com (mtagate4.uk.ibm.com [195.212.29.137]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2JBPB6p020176 for ; Mon, 19 Mar 2007 04:25:13 -0700 Received: from d06nrmr1407.portsmouth.uk.ibm.com (d06nrmr1407.portsmouth.uk.ibm.com [9.149.38.185]) by mtagate4.uk.ibm.com (8.13.8/8.13.8) with ESMTP id l2JBP8s6025314 for ; Mon, 19 Mar 2007 11:25:08 GMT Received: from d06av02.portsmouth.uk.ibm.com (d06av02.portsmouth.uk.ibm.com [9.149.37.228]) by d06nrmr1407.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2JBP8sj2039814 for ; Mon, 19 Mar 2007 11:25:08 GMT Received: from d06av02.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av02.portsmouth.uk.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2JBP7Pc017691 for ; Mon, 19 Mar 2007 11:25:08 GMT Received: from localhost (dyn-9-152-198-39.boeblingen.de.ibm.com [9.152.198.39]) by d06av02.portsmouth.uk.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2JBP7YG017676; Mon, 19 Mar 2007 11:25:07 GMT Date: Mon, 19 Mar 2007 12:23:12 +0100 From: Heiko Carstens To: "Amit K.
Arora" Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com, Matthew Wilcox , Paul Mackerras , Stephen Rothwell , Russell King Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070319112312.GA8331@osiris.boeblingen.de.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316152103.GD8525@osiris.boeblingen.de.ibm.com> <20070319092404.GA12092@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070319092404.GA12092@amitarora.in.ibm.com> User-Agent: mutt-ng/devel-r804 (Linux) X-archive-position: 10895 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: heiko.carstens@de.ibm.com Precedence: bulk X-list: xfs Content-Length: 1261 Lines: 26 On Mon, Mar 19, 2007 at 02:54:04PM +0530, Amit K. Arora wrote: > On Fri, Mar 16, 2007 at 04:21:03PM +0100, Heiko Carstens wrote: > > On Fri, Mar 16, 2007 at 08:01:01PM +0530, Amit K. Arora wrote: > > > asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len) > > > > > > Currently we have two modes FA_ALLOCATE and FA_DEALLOCATE, for > > > preallocation and deallocation of preallocated blocks respectively. More > > > modes can be added, when required. And these modes can be renamed, since > > > I am sure these are no way the best ones ! :) > > > > Yes, the problem was adding compat wrapper for this. I will appreciate > your help in writing it. Only thing is that we might have to wait till > the order of the arguments is decided upon. Thanks! There is probably not much choice. 
If you want to stay with the loff_t arguments it won't work on 31-bit s390 or 32-bit powerpc, depending on the order of the arguments. So you should go for what Matthew Wilcox suggested: asmlinkage long sys_fallocate(int fd, int mode, u32 off_low, u32 off_high, u32 len_low, u32 len_high); That way it will work on all architectures and, in addition, no architecture has to do some magic to combine the split 64-bit arguments in compat mode. From owner-xfs@oss.sgi.com Mon Mar 19 17:59:26 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 17:59:29 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2K0xM6p023091 for ; Mon, 19 Mar 2007 17:59:25 -0700 Received: from linuxbuild.melbourne.sgi.com (linuxbuild.melbourne.sgi.com [134.14.54.115]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA16266; Tue, 20 Mar 2007 11:59:16 +1100 From: donaldd@sgi.com Received: by linuxbuild.melbourne.sgi.com (Postfix, from userid 16365) id 2197B26AB4C4; Tue, 20 Mar 2007 11:59:15 +1100 (EST) To: xfs@oss.sgi.com, sgi.bugs.xfs@engr.sgi.com Subject: TAKE 962291 - Quota enforcement active for all quota types when enforcement is active for any Message-Id: <20070320005916.2197B26AB4C4@linuxbuild.melbourne.sgi.com> Date: Tue, 20 Mar 2007 11:59:15 +1100 (EST) X-archive-position: 10898 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Content-Length: 1472 Lines: 40 Fix uquota and oquota enforcement problems. When uquota and oquota (gquota/pquota) are enabled for accounting, both are enforced if either has enforcement active. Conditions: - Both XFS_UQUOTA_ACCT and XFS_GQUOTA_ACCT are enabled.
- Either XFS_UQUOTA_ENFD or XFS_OQUOTA_ENFD is enabled. - The usage without enforce is reached at the soft limit. Problems: 1. "repquota" shows all grace time even if no enforcement. 2. we cannot make a file over a hard limits even if no enforcement. Signed-off-by: Kouta Ooizumi Date: Tue Mar 20 11:54:43 AEDT 2007 Workarea: linuxbuild.melbourne.sgi.com:/home/donaldd/isms/2.6.x-xfs Inspected by: doanldd The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:28272a fs/xfs/xfs_quota.h - 1.48 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_quota.h.diff?r1=text&tr1=1.48&r2=text&tr2=1.47&f=h - Split out XFS_IS_QUOTA_ENFORCED into XFS_IS_UQUOTA_ENFORCED and XFS_IS_OQUOTA_ENFORCED. fs/xfs/quota/xfs_trans_dquot.c - 1.19 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_trans_dquot.c.diff?r1=text&tr1=1.19&r2=text&tr2=1.18&f=h - Fix uquota and oquota enforcement problems. fs/xfs/quota/xfs_qm_syscalls.c - 1.33 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/quota/xfs_qm_syscalls.c.diff?r1=text&tr1=1.33&r2=text&tr2=1.32&f=h - Fix uquota and oquota enforcement problems. 
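For illustration only (not the checked-in code): a minimal userspace sketch of the difference between one combined "is quota enforced" test and the per-type tests described in the checkin note above. The DEMO_ flag names and their bit values are hypothetical; the real flags and macros live in fs/xfs/xfs_quota.h.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_UQUOTA_ACCT 0x1u  /* user quota accounting on */
#define DEMO_UQUOTA_ENFD 0x2u  /* user quota enforcement on */
#define DEMO_OQUOTA_ACCT 0x4u  /* group/project quota accounting on */
#define DEMO_OQUOTA_ENFD 0x8u  /* group/project quota enforcement on */

/* Combined test: true as soon as either quota type is enforced. */
static bool demo_any_enforced(unsigned int flags)
{
    return (flags & (DEMO_UQUOTA_ENFD | DEMO_OQUOTA_ENFD)) != 0;
}

/* Per-type tests: each quota type is judged by its own enforcement bit. */
static bool demo_uquota_enforced(unsigned int flags)
{
    return (flags & DEMO_UQUOTA_ENFD) != 0;
}

static bool demo_oquota_enforced(unsigned int flags)
{
    return (flags & DEMO_OQUOTA_ENFD) != 0;
}
```

With accounting on for both types but enforcement on for user quota only, the combined test reports "enforced" for everything, which matches the symptoms listed above; the per-type tests report enforcement for user quota alone.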
From owner-xfs@oss.sgi.com Mon Mar 19 18:09:53 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 18:09:57 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_05 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2K19n6p025943 for ; Mon, 19 Mar 2007 18:09:51 -0700 Received: from [134.14.55.84] (shark.melbourne.sgi.com [134.14.55.84]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA17077; Tue, 20 Mar 2007 12:09:41 +1100 Message-ID: <45FF344C.2070000@sgi.com> Date: Tue, 20 Mar 2007 12:09:32 +1100 From: Donald Douwsma User-Agent: Thunderbird 1.5.0.10 (X11/20070306) MIME-Version: 1.0 To: Kouta Ooizumi CC: xfs@oss.sgi.com Subject: Re: [PATCH 2/2] Fix a bug which is in user or group quota enforcement disabled References: <20070223132318k-ooizumi@rifu.bsd.tnes.nec.co.jp> In-Reply-To: <20070223132318k-ooizumi@rifu.bsd.tnes.nec.co.jp> X-Enigmail-Version: 0.94.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-archive-position: 10899 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: donaldd@sgi.com Precedence: bulk X-list: xfs Content-Length: 523 Lines: 21 Kouta Ooizumi wrote: > Hi! > > I found a bug in the following conditions. > > Conditions: > - Both XFS_UQUOTA_ACCT and XFS_GQUOTA_ACCT are enabled. > - Either XFS_UQUOTA_ENFD or XFS_OQUOTA_ENFD is enabled. > - The usage without enforce is reached at the soft limit. > > Problems: > 1. "repquota" shows all grace time even if no enforcement. > 2. we cannot make a file over a hard limits even if no enforcement. Hi Kouta, Thanks for the patches, they're both in now. Sorry for the delay picking them up. 
Donald From owner-xfs@oss.sgi.com Mon Mar 19 20:17:26 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 20:17:31 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2K3HO6p023499 for ; Mon, 19 Mar 2007 20:17:25 -0700 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 7E660187B8C54 for ; Mon, 19 Mar 2007 22:17:21 -0500 (CDT) Message-ID: <45FF5241.4000605@sandeen.net> Date: Mon, 19 Mar 2007 22:17:21 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: [PATCH] kill off xfs_count_bits Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10900 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 5349 Lines: 168 xfs_count_bits is only called once, and is then compared to 0. IOW, what it really wants to know is, is the bitmap empty. This can be done more simply, certainly. This is compile-tested and quickly unit-tested in userspace, but feel free to scrutinize & live with it in the patch stack for a while. I need to get an xfs test box at home someday. :) xfs_bit.c | 93 ++++++--------------------------------------------------- xfs_bit.h | 4 +- xfs_buf_item.c | 4 +- 3 files changed, 14 insertions(+), 87 deletions(-) === Remove xfs_count_bits in favor of a much simpler xfs_bitmap_empty, because that's how the only caller ever uses it. 
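For illustration only (this is not the posted code): a minimal userspace sketch of the equivalence the change relies on, namely that "the count of set bits is zero" and "the OR of all words is zero" give the same answer. The demo_ names are hypothetical, and the counting routine here is a simple reference rather than the table-driven one being removed.

```c
/* OR all words together; the bitmap is empty iff the result is 0. */
static int demo_bitmap_empty(const unsigned int *map, unsigned int size)
{
    unsigned int i;
    unsigned int acc = 0;

    for (i = 0; i < size; i++)
        acc |= map[i];
    return acc == 0;
}

/* Reference implementation: count the set bits one at a time. */
static unsigned int demo_count_bits(const unsigned int *map, unsigned int size)
{
    unsigned int i;
    unsigned int bits = 0;

    for (i = 0; i < size; i++) {
        unsigned int w = map[i];

        while (w != 0) {
            w &= w - 1;     /* clear the lowest set bit */
            bits++;
        }
    }
    return bits;
}
```

For any map, demo_bitmap_empty(map, size) is 1 exactly when demo_count_bits(map, size) is 0, so a caller that only compares the count against zero loses nothing by switching.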
Signed-off-by: Eric Sandeen Index: linux/fs/xfs/xfs_bit.c =================================================================== --- linux.orig/fs/xfs/xfs_bit.c +++ linux/fs/xfs/xfs_bit.c @@ -66,44 +66,6 @@ static const char xfs_highbit[256] = { #endif /* - * Count of bits set in byte, 0..8. - */ -static const char xfs_countbit[256] = { - 0, 1, 1, 2, 1, 2, 2, 3, /* 00 .. 07 */ - 1, 2, 2, 3, 2, 3, 3, 4, /* 08 .. 0f */ - 1, 2, 2, 3, 2, 3, 3, 4, /* 10 .. 17 */ - 2, 3, 3, 4, 3, 4, 4, 5, /* 18 .. 1f */ - 1, 2, 2, 3, 2, 3, 3, 4, /* 20 .. 27 */ - 2, 3, 3, 4, 3, 4, 4, 5, /* 28 .. 2f */ - 2, 3, 3, 4, 3, 4, 4, 5, /* 30 .. 37 */ - 3, 4, 4, 5, 4, 5, 5, 6, /* 38 .. 3f */ - 1, 2, 2, 3, 2, 3, 3, 4, /* 40 .. 47 */ - 2, 3, 3, 4, 3, 4, 4, 5, /* 48 .. 4f */ - 2, 3, 3, 4, 3, 4, 4, 5, /* 50 .. 57 */ - 3, 4, 4, 5, 4, 5, 5, 6, /* 58 .. 5f */ - 2, 3, 3, 4, 3, 4, 4, 5, /* 60 .. 67 */ - 3, 4, 4, 5, 4, 5, 5, 6, /* 68 .. 6f */ - 3, 4, 4, 5, 4, 5, 5, 6, /* 70 .. 77 */ - 4, 5, 5, 6, 5, 6, 6, 7, /* 78 .. 7f */ - 1, 2, 2, 3, 2, 3, 3, 4, /* 80 .. 87 */ - 2, 3, 3, 4, 3, 4, 4, 5, /* 88 .. 8f */ - 2, 3, 3, 4, 3, 4, 4, 5, /* 90 .. 97 */ - 3, 4, 4, 5, 4, 5, 5, 6, /* 98 .. 9f */ - 2, 3, 3, 4, 3, 4, 4, 5, /* a0 .. a7 */ - 3, 4, 4, 5, 4, 5, 5, 6, /* a8 .. af */ - 3, 4, 4, 5, 4, 5, 5, 6, /* b0 .. b7 */ - 4, 5, 5, 6, 5, 6, 6, 7, /* b8 .. bf */ - 2, 3, 3, 4, 3, 4, 4, 5, /* c0 .. c7 */ - 3, 4, 4, 5, 4, 5, 5, 6, /* c8 .. cf */ - 3, 4, 4, 5, 4, 5, 5, 6, /* d0 .. d7 */ - 4, 5, 5, 6, 5, 6, 6, 7, /* d8 .. df */ - 3, 4, 4, 5, 4, 5, 5, 6, /* e0 .. e7 */ - 4, 5, 5, 6, 5, 6, 6, 7, /* e8 .. ef */ - 4, 5, 5, 6, 5, 6, 6, 7, /* f0 .. f7 */ - 5, 6, 6, 7, 6, 7, 7, 8, /* f8 .. ff */ -}; - -/* * xfs_highbit32: get high bit set out of 32-bit argument, -1 if none set. */ inline int @@ -167,56 +129,21 @@ xfs_highbit64( /* - * Count the number of bits set in the bitmap starting with bit - * start_bit. Size is the size of the bitmap in words. 
- * - * Do the counting by mapping a byte value to the number of set - * bits for that value using the xfs_countbit array, i.e. - * xfs_countbit[0] == 0, xfs_countbit[1] == 1, xfs_countbit[2] == 1, - * xfs_countbit[3] == 2, etc. + * Return whether bitmap is empty. + * Size is number of words in the bitmap, which is padded to word boundary + * Returns 1 for empty, 0 for non-empty. */ int -xfs_count_bits(uint *map, uint size, uint start_bit) +xfs_bitmap_empty(uint *map, uint size) { - register int bits; - register unsigned char *bytep; - register unsigned char *end_map; - int byte_bit; - - bits = 0; - end_map = (char*)(map + size); - bytep = (char*)(map + (start_bit & ~0x7)); - byte_bit = start_bit & 0x7; - - /* - * If the caller fell off the end of the map, return 0. - */ - if (bytep >= end_map) { - return (0); - } - - /* - * If start_bit is not byte aligned, then process the - * first byte separately. - */ - if (byte_bit != 0) { - /* - * Shift off the bits we don't want to look at, - * before indexing into xfs_countbit. - */ - bits += xfs_countbit[(*bytep >> byte_bit)]; - bytep++; - } - - /* - * Count the bits in each byte until the end of the bitmap. 
- */ - while (bytep < end_map) { - bits += xfs_countbit[*bytep]; - bytep++; + uint i; + uint ret = 0; + + for (i = 0; i < size; i++) { + ret |= map[i]; } - return (bits); + return (ret == 0); } /* Index: linux/fs/xfs/xfs_bit.h =================================================================== --- linux.orig/fs/xfs/xfs_bit.h +++ linux/fs/xfs/xfs_bit.h @@ -55,8 +55,8 @@ extern int xfs_lowbit64(__uint64_t v); /* Get high bit set out of 64-bit argument, -1 if none set */ extern int xfs_highbit64(__uint64_t); -/* Count set bits in map starting with start_bit */ -extern int xfs_count_bits(uint *map, uint size, uint start_bit); +/* Return whether bitmap is empty (1 == empty) */ +extern int xfs_bitmap_empty(uint *map, uint size); /* Count continuous one bits in map starting with start_bit */ extern int xfs_contig_bits(uint *map, uint size, uint start_bit); Index: linux/fs/xfs/xfs_buf_item.c =================================================================== --- linux.orig/fs/xfs/xfs_buf_item.c +++ linux/fs/xfs/xfs_buf_item.c @@ -580,8 +580,8 @@ xfs_buf_item_unlock( * If the buf item isn't tracking any data, free it. * Otherwise, if XFS_BLI_HOLD is set clear it.
*/ - if (xfs_count_bits(bip->bli_format.blf_data_map, - bip->bli_format.blf_map_size, 0) == 0) { + if (xfs_bitmap_empty(bip->bli_format.blf_data_map, + bip->bli_format.blf_map_size)) { xfs_buf_item_relse(bp); } else if (hold) { bip->bli_flags &= ~XFS_BLI_HOLD; From owner-xfs@oss.sgi.com Mon Mar 19 20:26:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 20:26:15 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2K3Q56p031541 for ; Mon, 19 Mar 2007 20:26:09 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA27144; Tue, 20 Mar 2007 14:25:50 +1100 Date: Tue, 20 Mar 2007 14:26:00 +1100 From: Timothy Shimmin To: nscott@aconex.com, wkendall@sgi.com cc: xfs@oss.sgi.com Subject: Re: But wait, theres more... [Fwd: Bug#415123: -R option can't append to a plain file] Message-ID: <17CB3A8C59A80064581C0A32@timothy-shimmins-power-mac-g5.local> In-Reply-To: <1174252993.5051.233.camel@edge> References: <1174252993.5051.233.camel@edge> X-Mailer: Mulberry/4.0.8 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10901 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1578 Lines: 50 Hi, I'll leave this to Bill. Just a couple of comments. I don't know if you can call it a "bug", more of a restriction. There is a layer provided for the I/O using drive_*.c - i.e drive_scsitape.c, drive_minrmt.c, drive_simple.c (which show up as a "drive strategy" on output). 
As part of the ds_instantiate() interface one typically sets up the d_capabilities that this drive strategy supports and for files (drive_simple), DRIVE_CAP_APPEND is not given. What the ramifications for drive_simple (file strategy) are, I wouldn't know without looking at the code and playing with it. But I'd call it an RFE :) --Tim --On 19 March 2007 8:23:13 AM +1100 Nathan Scott wrote: > And a free set of steak knives if you fix this bug... ;-) > > -------- Forwarded Message -------- > From: Peter Chubb > Reply-To: Peter Chubb , > 415123@bugs.debian.org > To: submit@bugs.debian.org > Subject: Bug#415123: -R option can't append to a plain file > Date: Fri, 16 Mar 2007 19:54:37 +1100 > > Package: xfsdump > Version: 2.2.38-1 > > If I use xfsdump to dump to a plain file, interrupt it, then restart > with the -R option, xfsdump complains: > > xfsdump: ERROR: media contains valid xfsdump but does not support append > > which is misleading: of *course* you can append to a regular file if > there's space on the filesystem. 
> > -- > Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au > http://www.ertos.nicta.com.au ERTOS within National ICT Australia > > -- > Nathan > From owner-xfs@oss.sgi.com Mon Mar 19 22:36:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 22:36:29 -0700 (PDT) X-Spam-oss-Status: No, score=-0.4 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx1.suse.de (mail.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2K5aL6p005294 for ; Mon, 19 Mar 2007 22:36:22 -0700 Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id D4A6812207; Tue, 20 Mar 2007 06:36:18 +0100 (CET) Date: Tue, 20 Mar 2007 06:36:18 +0100 From: Nick Piggin To: Mark Fasheh Cc: Linux Filesystems , reiserfs-list@namesys.com, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, nfs@lists.sourceforge.net, cluster-devel@redhat.com, jfs-discussion@lists.sourceforge.net Subject: Re: Announce: new-aops-1 for 2.6.21-rc3 Message-ID: <20070320053618.GB30766@wotan.suse.de> References: <20070315161704.GH8321@wotan.suse.de> <20070315234713.GH21942@ca-server1.us.oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070315234713.GH21942@ca-server1.us.oracle.com> User-Agent: Mutt/1.5.9i X-archive-position: 10902 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: npiggin@suse.de Precedence: bulk X-list: xfs Content-Length: 501 Lines: 15 On Thu, Mar 15, 2007 at 04:47:13PM -0700, Mark Fasheh wrote: > On Thu, Mar 15, 2007 at 05:17:04PM +0100, Nick Piggin wrote: > > (excludes the OCFS2 patch that Mark sent, in anticipation of an update) > > Attached is said patch. 
I needed to export __grab_cache_page (ext2/ext3 also > need this if they're to be built as modules), so a patch to do that is also > attached. > > This passed some preliminary testing on a two node cluster I have here at > Oracle. Thanks Mark, I've merged these. Nick From owner-xfs@oss.sgi.com Mon Mar 19 23:46:44 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 19 Mar 2007 23:46:48 -0700 (PDT) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_40 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2K6kf6p024723 for ; Mon, 19 Mar 2007 23:46:42 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA12023; Tue, 20 Mar 2007 17:46:37 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2K6kZAf33555414; Tue, 20 Mar 2007 17:46:36 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2K6kWKa33602683; Tue, 20 Mar 2007 17:46:32 +1100 (AEDT) Date: Tue, 20 Mar 2007 17:46:32 +1100 From: David Chinner To: Marco Berizzi Cc: David Chinner , linux-kernel@vger.kernel.org, xfs@oss.sgi.com Subject: Re: XFS internal error xfs_da_do_buf(2) at line 2087 of file fs/xfs/xfs_da_btree.c. 
Caller 0xc01b00bd Message-ID: <20070320064632.GO32602149@melbourne.sgi.com> References: <20070316012520.GN5743@melbourne.sgi.com> <20070316195951.GB5743@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-archive-position: 10903 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 2781 Lines: 82 On Mon, Mar 19, 2007 at 11:32:27AM +0100, Marco Berizzi wrote: > Marco Berizzi wrote: > > David Chinner wrote: > > > >> Ok, so an ipsec change. And I see from the history below it > >> really has nothing to do with this problem. it seems the problem > >> has something to do with changes between 2.6.19.1 and 2.6.19.2. > > > > indeed. Yesterday at 13:00 I have switched from 2.6.19.1 to 2.6.19.2 > > (without the ipsec fix) and at about 17:30 linux has crashed again. > > I have recompiled 2.6.19.2 with all kernel debugging options enabled > > and rebooted. Now I'm waiting for the crash... > > Linux has not been crashed. However here is dmesg output > with all debugging option enabled: (search for 'INFO: > possible recursive locking detected'). Is that normal? ..... 
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.19.2 #1
> ---------------------------------------------
> rm/470 is trying to acquire lock:
> (&(&ip->i_lock)->mr_lock){----}, at: [] xfs_ilock+0x5b/0xa1
>
> but task is already holding lock:
> (&(&ip->i_lock)->mr_lock){----}, at: [] xfs_ilock+0x5b/0xa1
>
> other info that might help us debug this:
> 3 locks held by rm/470:
> #0: (&inode->i_mutex/1){--..}, at: [] do_unlinkat+0x70/0x115
> #1: (&inode->i_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
> #2: (&(&ip->i_lock)->mr_lock){----}, at: []
> xfs_ilock+0x5b/0xa1
>
> stack backtrace:
> [] dump_trace+0x215/0x21a
> [] show_trace_log_lvl+0x1a/0x30
> [] show_trace+0x12/0x14
> [] dump_stack+0x19/0x1b
> [] print_deadlock_bug+0xc0/0xcf
> [] check_deadlock+0x6a/0x79
> [] __lock_acquire+0x350/0x970
> [] lock_acquire+0x75/0x97
> [] down_write+0x3a/0x54
> [] xfs_ilock+0x5b/0xa1
> [] xfs_lock_dir_and_entry+0x105/0x11b
> [] xfs_remove+0x180/0x47f
> [] xfs_vn_unlink+0x22/0x4f
> [] vfs_unlink+0x9e/0xa2
> [] do_unlinkat+0xa8/0x115
> [] sys_unlink+0x10/0x12
> [] syscall_call+0x7/0xb
> [] 0xb7efaa7d
> =======================

That's no problem - lockdep just doesn't know that we can nest i_lock
(we've got to get the annotations for this sorted out).

> Here is the relevant results:
>
> Phase 2 - found root inode chunk
> Phase 3 - ...
> agno = 0
> ...
> agno = 12
> LEAFN node level is 1 inode 1610612918 bno = 8388608

Hmmm - single bit error in the bno - that reminds me of this:

http://oss.sgi.com/projects/xfs/faq.html#dir2

So I'd definitely make sure that is repaired....

Cheers,

Dave.
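For illustration only (not from the original message): the bno value quoted above, 8388608, equals 2^23 (0x800000), so exactly one bit is set in it. A minimal sketch of how one might check that property for any value follows; the demo_ names are hypothetical, and the check says nothing about whether a given value is actually corrupt.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit is set in v, i.e. v is a power of two. */
static bool demo_single_bit_set(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Index of the lowest set bit; only meaningful for non-zero v. */
static int demo_lowest_bit_index(uint64_t v)
{
    int idx = 0;

    while ((v & 1) == 0) {
        v >>= 1;
        idx++;
    }
    return idx;
}
```

For 8388608, demo_single_bit_set() is true and demo_lowest_bit_index() gives 23.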
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Mar 20 07:51:48 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 20 Mar 2007 07:51:55 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=BAYES_50,J_CHICKENPOX_45 autolearn=no version=3.2.0-pre1-r499012 Received: from mail.interline.it (mail.interline.it [195.182.241.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2KEpj6p030549 for ; Tue, 20 Mar 2007 07:51:48 -0700 Received: from localhost (localhost [127.0.0.1]) by mail.interline.it (Postfix) with ESMTP id 68DBADDA for ; Tue, 20 Mar 2007 15:23:44 +0100 (CET) Received: from mail.interline.it ([127.0.0.1]) by localhost (pin [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 11358-24 for ; Tue, 20 Mar 2007 15:23:14 +0100 (CET) From: "Daniele P." Organization: Interline To: xfs@oss.sgi.com Subject: xfsrepair memory consumption Date: Tue, 20 Mar 2007 15:32:05 +0100 User-Agent: KMail/1.9.5 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200703201532.06076.daniele@interline.it> X-archive-position: 10905 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: daniele@interline.it Precedence: bulk X-list: xfs Content-Length: 2069 Lines: 59 Hi all, I'm just asking if any work has been done/is in progress to improve xfs_repair memory consumption. I know that xfs tools use a lot of memory but IIRC someone wrote on this mailing list that s/he'll work on this issue. But now I see that the memory requirements are increasing instead of lowering. I discovered this because running xfs_repair 2.6.20-1 on a 300GB file system fills up the entire memory on a debian sarge (256MB Mem/256MB swap) and eventually the process is killed .... Phase 6 - check inode connectivity... 
- resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... Killed Next I tried a self compiled xfsprogs 2.8.18-1 from cvs but things get worse. So I increased the memory to 512MB, and made sure not to use the multi thread but still no luck: .... Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 Killed Finally using the *old* version 2.6.20-1 with 512MB of memory xfs_repair finished with success, but using near all the available memory. More info: enceladus:~# xfs_info /dev/sdb1 meta-data=/media/iomega300 isize=256 agcount=16, agsize=4578901 blks = sectsz=512 data = bsize=4096 blocks=73262416, imaxpct=25 = sunit=0 swidth=0 blks, unwritten=1 naming =version 2 bsize=4096 log =internal bsize=4096 blocks=32768, version=1 = sectsz=512 sunit=0 blks realtime =none extsz=65536 blocks=0, rtextents=0 enceladus:~# df /dev/sdb1 Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb1 292918592 164698268 128220324 57% /media/300 enceladus:~# df -i /dev/sdb1 Filesystem Inodes IUsed IFree IUse% Mounted on /dev/sdb1 293049664 6511481 286538183 3% /media/300 Thanks in Advance, Daniele P. 
From owner-xfs@oss.sgi.com Tue Mar 20 15:13:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 20 Mar 2007 15:13:27 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2KMDJ6p007133 for ; Tue, 20 Mar 2007 15:13:21 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA22108; Wed, 21 Mar 2007 09:13:10 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2KMD8Af34492969; Wed, 21 Mar 2007 09:13:08 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2KMD5Pi34490796; Wed, 21 Mar 2007 09:13:05 +1100 (AEDT) Date: Wed, 21 Mar 2007 09:13:05 +1100 From: David Chinner To: "Daniele P." Cc: xfs@oss.sgi.com Subject: Re: xfsrepair memory consumption Message-ID: <20070320221305.GR32602149@melbourne.sgi.com> References: <200703201532.06076.daniele@interline.it> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200703201532.06076.daniele@interline.it> User-Agent: Mutt/1.4.2.1i X-archive-position: 10906 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1058 Lines: 33 On Tue, Mar 20, 2007 at 03:32:05PM +0100, Daniele P. wrote: > Hi all, > I'm just asking if any work has been done/is in progress to improve > xfs_repair memory consumption. Work is in progress, but it won't really solve your problem. 
See, you've got: > enceladus:~# df -i /dev/sdb1 > Filesystem Inodes IUsed IFree IUse% Mounted on > /dev/sdb1 293049664 6511481 286538183 3% /media/300 6 million inodes in your filesystem, and a certain points in repair we have to hold indexes of them all (plus some state) in memory. Phase 6 is one of these points. In terms of inode count, I generally use the rule that for every 10million inodes you need a gigabyte of RAM for repair - you needed about 500MB for 6million inodes. We have been trimming bits and pieces off this per-inode usage but there comes a point where you just need more memory. That is, as filesystem size grows, so does the amount of memory needed to repair it in a finite time.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Mar 20 16:43:42 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 20 Mar 2007 16:43:45 -0700 (PDT) X-Spam-oss-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2KNhc6p025597 for ; Tue, 20 Mar 2007 16:43:40 -0700 Received: from pcbnaujok (pc-bnaujok.melbourne.sgi.com [134.14.55.58]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA28746; Wed, 21 Mar 2007 10:43:29 +1100 Message-Id: <200703202343.KAA28746@larry.melbourne.sgi.com> From: "Barry Naujok" To: "'Daniele P.'" Cc: Subject: RE: xfsrepair memory consumption Date: Wed, 21 Mar 2007 10:48:53 +1100 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook, Build 11.0.6353 Thread-Index: AcdrPSAnvXapeeAST/uXJCZntRqVaQADQGAA X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3028 In-Reply-To: <20070320221305.GR32602149@melbourne.sgi.com> X-archive-position: 10907 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com 
Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@melbourne.sgi.com Precedence: bulk X-list: xfs Content-Length: 1639 Lines: 55 Hi Daniele, The nlink/phase 7 patch I recently sent out does reduce this inode memory requirements which should address the issue you see. I'm going to make one further optimisation in this patch which I will repost most likely next week. Regards, Barry. > -----Original Message----- > From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] > On Behalf Of David Chinner > Sent: Wednesday, 21 March 2007 9:13 AM > To: Daniele P. > Cc: xfs@oss.sgi.com > Subject: Re: xfsrepair memory consumption > > On Tue, Mar 20, 2007 at 03:32:05PM +0100, Daniele P. wrote: > > Hi all, > > I'm just asking if any work has been done/is in progress to improve > > xfs_repair memory consumption. > > Work is in progress, but it won't really solve your problem. > See, you've got: > > > enceladus:~# df -i /dev/sdb1 > > Filesystem Inodes IUsed IFree IUse% Mounted on > > /dev/sdb1 293049664 6511481 286538183 3% /media/300 > > 6 million inodes in your filesystem, and a certain points in > repair we have to hold indexes of them all (plus some state) in > memory. Phase 6 is one of these points. > > In terms of inode count, I generally use the rule that for every > 10million inodes you need a gigabyte of RAM for repair - you needed > about 500MB for 6million inodes. > > We have been trimming bits and pieces off this per-inode usage > but there comes a point where you just need more memory. That > is, as filesystem size grows, so does the amount of memory > needed to repair it in a finite time.... > > Cheers, > > Dave. 
> -- > Dave Chinner > Principal Engineer > SGI Australian Software Group > > From owner-xfs@oss.sgi.com Tue Mar 20 18:16:32 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 20 Mar 2007 18:16:38 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2L1GT6p008334 for ; Tue, 20 Mar 2007 18:16:31 -0700 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 56BA5AACB75; Wed, 21 Mar 2007 12:16:28 +1100 (EST) Subject: Re: XFS bug??? From: Nathan Scott Reply-To: nscott@aconex.com To: Peter Chubb Cc: xfs@oss.sgi.com In-Reply-To: <87y7lrmnra.wl%peterc@chubb.wattle.id.au> References: <87y7lrmnra.wl%peterc@chubb.wattle.id.au> Content-Type: text/plain Organization: Aconex Date: Wed, 21 Mar 2007 12:17:01 +1100 Message-Id: <1174439821.5051.314.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 10908 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 2848 Lines: 47 On Wed, 2007-03-21 at 11:41 +1100, Peter Chubb wrote: > Hi Nathan, > Our main backup machine is showing XFS errors. Any ideas how > to fix? Hi Peter, I don't have as much time to spend on XFS as I used to, so better to contact the list at SGI (CC'd). What kernel version are you running there? Looks like a corrupt directory - was this machine exposed to the 2.6.17 corruption issue perhaps? cheers. > Mar 20 08:12:29 bitburger kernel: Filesystem "dm-0": XFS internal error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c. 
Caller 0xf91d5d60 > Mar 20 08:12:29 bitburger kernel: [pg0+953910291/1069454336] xfs_trans_cancel+0x103/0x140 [xfs] > Mar 20 08:12:29 bitburger kernel: [pg0+953945440/1069454336] xfs_create+0x3d0/0x780 [xfs] > Mar 20 08:12:29 bitburger kernel: [pg0+953945440/1069454336] xfs_create+0x3d0/0x780 [xfs] > Mar 20 08:12:29 bitburger kernel: [pg0+954000638/1069454336] xfs_vn_mknod+0x3ae/0x4b0 [xfs] > Mar 20 08:12:29 bitburger kernel: [pg0+953977031/1069454336] xfs_buf_free+0x47/0xc0 [xfs] > Mar 20 08:12:29 bitburger kernel: [pg0+953722066/1069454336] xfs_da_state_free+0x52/0x70 [xfs] > Mar 20 08:12:29 bitburger kernel: [pg0+953759517/1069454336] xfs_dir2_node_lookup+0x9d/0xd0 [xfs] > Mar 20 08:12:29 bitburger kernel: [pg0+953724888/1069454336] xfs_dir_lookup+0x138/0x160 [xfs] > Mar 20 08:12:29 bitburger kernel: [__link_path_walk+3778/3808] __link_path_walk+0xec2/0xee0 > Mar 20 08:12:29 bitburger kernel: [_atomic_dec_and_lock+43/80] _atomic_dec_and_lock+0x2b/0x50 > Mar 20 08:12:29 bitburger kernel: [mntput_no_expire+35/208] mntput_no_expire+0x23/0xd0 > Mar 20 08:12:29 bitburger kernel: [pg0+953918840/1069454336] xfs_dir_lookup_int+0x48/0x130 [xfs] > Mar 20 08:12:29 bitburger kernel: [permission+211/272] permission+0xd3/0x110 > Mar 20 08:12:29 bitburger kernel: [vfs_create+210/384] vfs_create+0xd2/0x180 > Mar 20 08:12:29 bitburger kernel: [open_namei+1806/1904] open_namei+0x70e/0x770 > Mar 20 08:12:29 bitburger kernel: [netif_receive_skb+651/880] netif_receive_skb+0x28b/0x370 > Mar 20 08:12:29 bitburger kernel: [do_filp_open+64/96] do_filp_open+0x40/0x60 > Mar 20 08:12:29 bitburger kernel: [get_unused_fd+168/240] get_unused_fd+0xa8/0xf0 > Mar 20 08:12:29 bitburger kernel: [do_sys_open+87/240] do_sys_open+0x57/0xf0 > Mar 20 08:12:29 bitburger kernel: [sys_open+39/48] sys_open+0x27/0x30 > Mar 20 08:12:29 bitburger kernel: [syscall_call+7/11] syscall_call+0x7/0xb > Mar 20 08:12:29 bitburger kernel: Filesystem "dm-0": Corruption of in-memory data detected. 
Shutting down filesystem: dm-0 > Mar 20 08:12:29 bitburger kernel: Please umount the filesystem, and rectify the > problem(s) > > > -- > Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT gelato.unsw.edu.au > http://www.ertos.nicta.com.au ERTOS within National ICT Australia -- Nathan From owner-xfs@oss.sgi.com Tue Mar 20 19:24:46 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 20 Mar 2007 19:24:49 -0700 (PDT) X-Spam-oss-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2L2Oh6p019627 for ; Tue, 20 Mar 2007 19:24:45 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA23050; Wed, 21 Mar 2007 13:24:28 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2L2OQAf34591458; Wed, 21 Mar 2007 13:24:26 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2L2ONV630524858; Wed, 21 Mar 2007 13:24:23 +1100 (AEDT) Date: Wed, 21 Mar 2007 13:24:23 +1100 From: David Chinner To: Nathan Scott Cc: Peter Chubb , xfs@oss.sgi.com Subject: Re: XFS bug??? 
Message-ID: <20070321022423.GB32602149@melbourne.sgi.com> References: <87y7lrmnra.wl%peterc@chubb.wattle.id.au> <1174439821.5051.314.camel@edge> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1174439821.5051.314.camel@edge> User-Agent: Mutt/1.4.2.1i X-archive-position: 10909 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1007 Lines: 32 On Wed, Mar 21, 2007 at 12:17:01PM +1100, Nathan Scott wrote: > On Wed, 2007-03-21 at 11:41 +1100, Peter Chubb wrote: > > Hi Nathan, > > Our main backup machine is showing XFS errors. Any ideas how > > to fix? > > Hi Peter, > > I don't have as much time to spend on XFS as I used to, so better to > contact the list at SGI (CC'd). What kernel version are you running > there? Looks like a corrupt directory - was this machine exposed to > the 2.6.17 corruption issue perhaps? > > > > Mar 20 08:12:29 bitburger kernel: Filesystem "dm-0": XFS internal > > error xfs_trans_cancel at line 1138 of file fs/xfs/xfs_trans.c. Caller 0xf91d5d60 Oh, yet another report of this. Peter, can you run with the patch posted here: http://www.mail-archive.com/linux-kernel%40vger.kernel.org/msg105975.html And if you trip over the problem again it will tell us a bit more about what triggered the error that caused the shutdown. Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Mar 21 01:35:19 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 21 Mar 2007 01:35:22 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.interline.it (mail.interline.it [195.182.241.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2L8ZI6p007297 for ; Wed, 21 Mar 2007 01:35:19 -0700 Received: from localhost (localhost [127.0.0.1]) by mail.interline.it (Postfix) with ESMTP id B4B2CE82 for ; Wed, 21 Mar 2007 09:26:20 +0100 (CET) Received: from mail.interline.it ([127.0.0.1]) by localhost (pin [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 02089-20 for ; Wed, 21 Mar 2007 09:26:14 +0100 (CET) From: "Daniele P." Organization: Interline To: xfs@oss.sgi.com Subject: Re: xfsrepair memory consumption User-Agent: KMail/1.9.5 References: <200703201532.06076.daniele@interline.it> <20070320221305.GR32602149@melbourne.sgi.com> In-Reply-To: <20070320221305.GR32602149@melbourne.sgi.com> MIME-Version: 1.0 Content-Disposition: inline X-Length: 1460 X-UID: 1372 Date: Wed, 21 Mar 2007 09:34:50 +0100 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200703210934.50447.daniele@interline.it> X-archive-position: 10910 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: daniele@interline.it Precedence: bulk X-list: xfs Content-Length: 723 Lines: 22 On Tuesday 20 March 2007 23:13, you wrote: > On Tue, Mar 20, 2007 at 03:32:05PM +0100, Daniele P. wrote: > 6 million inodes in your filesystem, and a certain points in > repair we have to hold indexes of them all (plus some state) in > memory. Phase 6 is one of these points. Hi David, thanks for the explanation. 
> In terms of inode count, I generally use the rule that for every > 10million inodes you need a gigabyte of RAM for repair - you needed > about 500MB for 6million inodes. This is true for 2.6.20-1, but 2.8.18-1 use a lot of memory in phase 2. I just want to point out the *increasing* memory requirements rather than the total amount of memory needed to repair an xfs file system. Regards, Daniele P. From owner-xfs@oss.sgi.com Wed Mar 21 01:35:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 21 Mar 2007 01:35:25 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.interline.it (mail.interline.it [195.182.241.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2L8ZK6p007315 for ; Wed, 21 Mar 2007 01:35:21 -0700 Received: from localhost (localhost [127.0.0.1]) by mail.interline.it (Postfix) with ESMTP id 1F9DBE67 for ; Wed, 21 Mar 2007 09:26:21 +0100 (CET) Received: from mail.interline.it ([127.0.0.1]) by localhost (pin [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 02163-19 for ; Wed, 21 Mar 2007 09:26:14 +0100 (CET) From: "Daniele P." 
Organization: Interline To: xfs@oss.sgi.com Subject: Re: xfsrepair memory consumption User-Agent: KMail/1.9.5 References: <200703202343.KAA28746@larry.melbourne.sgi.com> In-Reply-To: <200703202343.KAA28746@larry.melbourne.sgi.com> MIME-Version: 1.0 Content-Disposition: inline X-Length: 1131 X-UID: 1373 Date: Wed, 21 Mar 2007 09:35:07 +0100 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200703210935.07809.daniele@interline.it> X-archive-position: 10911 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: daniele@interline.it Precedence: bulk X-list: xfs Content-Length: 417 Lines: 16 On Wednesday 21 March 2007 00:48, you wrote: > Hi Daniele, > > The nlink/phase 7 patch I recently sent out does reduce this > inode memory requirements which should address the issue you > see. Hi Barry, maybe this could help. A simple ps -u during xfs_repair show an increasing memory usage in phase 6 and 7 for 2.8.18-1. Unfortunately with 2.6.20-1 the memory usage is exploding in phase 2. Regards, Daniele P. From owner-xfs@oss.sgi.com Wed Mar 21 04:08:39 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 21 Mar 2007 04:08:42 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.interline.it (mail.interline.it [195.182.241.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2LB8a6p008462 for ; Wed, 21 Mar 2007 04:08:38 -0700 Received: from localhost (localhost [127.0.0.1]) by mail.interline.it (Postfix) with ESMTP id D45BBE8D for ; Wed, 21 Mar 2007 11:59:39 +0100 (CET) Received: from mail.interline.it ([127.0.0.1]) by localhost (pin [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 21722-14 for ; Wed, 21 Mar 2007 11:59:33 +0100 (CET) From: "Daniele P." 
Organization: Interline Subject: Re: xfsrepair memory consumption Date: Wed, 21 Mar 2007 12:08:27 +0100 User-Agent: KMail/1.9.5 References: <200703210843.TAA08491@larry.melbourne.sgi.com> In-Reply-To: <200703210843.TAA08491@larry.melbourne.sgi.com> MIME-Version: 1.0 Content-Disposition: inline X-Length: 1114 X-UID: 69 To: xfs@oss.sgi.com Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <200703211208.27665.daniele@interline.it> X-archive-position: 10912 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: daniele@interline.it Precedence: bulk X-list: xfs Content-Length: 623 Lines: 24 On Wednesday 21 March 2007 09:48, Barry Naujok wrote: > This could be libxfs caching in action. > > Another option you can try is: > > -o bhash=256 > > Also, -M and -P options may reduce memory consumption. > I recommend -M (I'm in the middle of making that default). Hi Barry, In the previous test I already used the -M option (no threads). Also using -P and -o bhash=256 brings the memory usage of the new version of xfs_repair close to that of the old version run with no options, for phases 1-5. This is really useful. Phase 6 still uses a little more memory (killed!). I will add more memory and test again later. Thanks, Daniele P.
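As a rough illustration of the rule of thumb quoted earlier in this thread (about a gigabyte of RAM for every 10 million inodes), the arithmetic can be sketched as below. This is only a sketch: the helper name and the choice of 1024 MB per GB are assumptions made here, nothing like this exists in xfsprogs, and the thread itself shows that real usage differs between xfs_repair versions, phases and options.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of xfsprogs: turns the
 * "1 GB of RAM per 10 million inodes" rule of thumb into a number.
 */
#define INODES_PER_GB 10000000ULL  /* from the rule of thumb in the thread */
#define MB_PER_GB     1024ULL      /* assumption: binary megabytes */

/* Estimated RAM in megabytes for a repair covering used_inodes inodes. */
uint64_t repair_ram_estimate_mb(uint64_t used_inodes)
{
    return used_inodes * MB_PER_GB / INODES_PER_GB;
}
```

Feeding in the IUsed figure from `df -i` (6511481 in the original report) gives an estimate of a few hundred megabytes, the same order as the "about 500MB" mentioned in the reply.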
From owner-xfs@oss.sgi.com Wed Mar 21 05:05:45 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 21 Mar 2007 05:05:49 -0700 (PDT) X-Spam-oss-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.146]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2LC5h6p025464 for ; Wed, 21 Mar 2007 05:05:44 -0700 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e6.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id l2LC6TDf003617 for ; Wed, 21 Mar 2007 08:06:29 -0400 Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2LC4OPI233464 for ; Wed, 21 Mar 2007 08:04:24 -0400 Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2LC4NuI032560 for ; Wed, 21 Mar 2007 08:04:24 -0400 Received: from amitarora.in.ibm.com (amitarora.in.ibm.com [9.124.31.34]) by d01av03.pok.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2LC4MDW032407; Wed, 21 Mar 2007 08:04:22 -0400 Received: from amitarora.in.ibm.com (localhost.localdomain [127.0.0.1]) by amitarora.in.ibm.com (Postfix) with ESMTP id 0390229ECCE; Wed, 21 Mar 2007 17:34:26 +0530 (IST) Received: (from amit@localhost) by amitarora.in.ibm.com (8.13.1/8.13.1/Submit) id l2LC4PvK001662; Wed, 21 Mar 2007 17:34:25 +0530 Date: Wed, 21 Mar 2007 17:34:25 +0530 From: "Amit K. 
Arora" To: Matthew Wilcox Cc: Heiko Carstens , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070321120425.GA27273@amitarora.in.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070317111036.GC29931@parisc-linux.org> User-Agent: Mutt/1.4.1i X-archive-position: 10913 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: aarora@linux.vnet.ibm.com Precedence: bulk X-list: xfs Content-Length: 6495 Lines: 204 On Sat, Mar 17, 2007 at 05:10:37AM -0600, Matthew Wilcox wrote: > How about: > > asmlinkage long sys_fallocate(int fd, int mode, u32 off_low, u32 off_high, > u32 len_low, u32 len_high); > > That way we all suffer equally ... As suggested by you and Russel, I have made this change to the patch. Here is how it looks like now. Please let me know if anyone has concerns about passing arguments this way (breaking each "loff_t" into two "u32"s). 
Signed-off-by: Amit K Arora --- arch/i386/kernel/syscall_table.S | 1 arch/x86_64/kernel/functionlist | 1 fs/open.c | 46 +++++++++++++++++++++++++++++++++++++++ include/asm-i386/unistd.h | 3 +- include/asm-powerpc/systbl.h | 1 include/asm-powerpc/unistd.h | 3 +- include/asm-x86_64/unistd.h | 4 ++- include/linux/fs.h | 7 +++++ include/linux/syscalls.h | 2 + 9 files changed, 65 insertions(+), 3 deletions(-) Index: linux-2.6.20.1/arch/i386/kernel/syscall_table.S =================================================================== --- linux-2.6.20.1.orig/arch/i386/kernel/syscall_table.S +++ linux-2.6.20.1/arch/i386/kernel/syscall_table.S @@ -319,3 +319,4 @@ ENTRY(sys_call_table) .long sys_move_pages .long sys_getcpu .long sys_epoll_pwait + .long sys_fallocate /* 320 */ Index: linux-2.6.20.1/fs/open.c =================================================================== --- linux-2.6.20.1.orig/fs/open.c +++ linux-2.6.20.1/fs/open.c @@ -350,6 +350,52 @@ asmlinkage long sys_ftruncate64(unsigned } #endif +asmlinkage long sys_fallocate(int fd, int mode, u32 off_low, u32 off_high, + u32 len_low, u32 len_high) +{ + struct file *file; + struct inode *inode; + loff_t offset, len; + long ret = -EINVAL; + + offset = (off_high << 32) + off_low; + len = (len_high << 32) + len_low; + + if (len == 0 || offset < 0) + goto out; + + ret = -EBADF; + file = fget(fd); + if (!file) + goto out; + if (!(file->f_mode & FMODE_WRITE)) + goto out_fput; + + inode = file->f_path.dentry->d_inode; + + ret = -ESPIPE; + if (S_ISFIFO(inode->i_mode)) + goto out_fput; + + ret = -ENODEV; + if (!S_ISREG(inode->i_mode)) + goto out_fput; + + ret = -EFBIG; + if (offset + len > inode->i_sb->s_maxbytes) + goto out_fput; + + if (inode->i_op && inode->i_op->fallocate) + ret = inode->i_op->fallocate(inode, mode, offset, len); + else + ret = -ENOSYS; +out_fput: + fput(file); +out: + return ret; +} +EXPORT_SYMBOL(sys_fallocate); + /* * access() needs to use the real uid/gid, not the effective uid/gid. 
* We do this by temporarily clearing all FS-related capabilities and Index: linux-2.6.20.1/include/asm-i386/unistd.h =================================================================== --- linux-2.6.20.1.orig/include/asm-i386/unistd.h +++ linux-2.6.20.1/include/asm-i386/unistd.h @@ -325,10 +325,11 @@ #define __NR_move_pages 317 #define __NR_getcpu 318 #define __NR_epoll_pwait 319 +#define __NR_fallocate 320 #ifdef __KERNEL__ -#define NR_syscalls 320 +#define NR_syscalls 321 #define __ARCH_WANT_IPC_PARSE_VERSION #define __ARCH_WANT_OLD_READDIR Index: linux-2.6.20.1/include/linux/fs.h =================================================================== --- linux-2.6.20.1.orig/include/linux/fs.h +++ linux-2.6.20.1/include/linux/fs.h @@ -263,6 +263,12 @@ extern int dir_notify_enable; #define SYNC_FILE_RANGE_WRITE 2 #define SYNC_FILE_RANGE_WAIT_AFTER 4 +/* + * fallocate() modes + */ +#define FA_ALLOCATE 0x1 +#define FA_DEALLOCATE 0x2 + #ifdef __KERNEL__ #include @@ -1124,6 +1130,7 @@ struct inode_operations { ssize_t (*listxattr) (struct dentry *, char *, size_t); int (*removexattr) (struct dentry *, const char *); void (*truncate_range)(struct inode *, loff_t, loff_t); + int (*fallocate)(struct inode *, int, loff_t, loff_t); }; struct seq_file; Index: linux-2.6.20.1/include/linux/syscalls.h =================================================================== --- linux-2.6.20.1.orig/include/linux/syscalls.h +++ linux-2.6.20.1/include/linux/syscalls.h @@ -602,6 +602,8 @@ asmlinkage long sys_get_robust_list(int asmlinkage long sys_set_robust_list(struct robust_list_head __user *head, size_t len); asmlinkage long sys_getcpu(unsigned __user *cpu, unsigned __user *node, struct getcpu_cache __user *cache); +asmlinkage long sys_fallocate(int fd, int mode, u32 off_low, u32 off_high, + u32 len_low, u32 len_high); int kernel_execve(const char *filename, char *const argv[], char *const envp[]); Index: linux-2.6.20.1/include/asm-x86_64/unistd.h 
=================================================================== --- linux-2.6.20.1.orig/include/asm-x86_64/unistd.h +++ linux-2.6.20.1/include/asm-x86_64/unistd.h @@ -619,8 +619,10 @@ __SYSCALL(__NR_sync_file_range, sys_sync __SYSCALL(__NR_vmsplice, sys_vmsplice) #define __NR_move_pages 279 __SYSCALL(__NR_move_pages, sys_move_pages) +#define __NR_fallocate 280 +__SYSCALL(__NR_fallocate, sys_fallocate) -#define __NR_syscall_max __NR_move_pages +#define __NR_syscall_max __NR_fallocate #ifndef __NO_STUBS #define __ARCH_WANT_OLD_READDIR Index: linux-2.6.20.1/include/asm-powerpc/unistd.h =================================================================== --- linux-2.6.20.1.orig/include/asm-powerpc/unistd.h +++ linux-2.6.20.1/include/asm-powerpc/unistd.h @@ -324,10 +324,11 @@ #define __NR_get_robust_list 299 #define __NR_set_robust_list 300 #define __NR_move_pages 301 +#define __NR_fallocate 302 #ifdef __KERNEL__ -#define __NR_syscalls 302 +#define __NR_syscalls 303 #define __NR__exit __NR_exit #define NR_syscalls __NR_syscalls Index: linux-2.6.20.1/arch/x86_64/kernel/functionlist =================================================================== --- linux-2.6.20.1.orig/arch/x86_64/kernel/functionlist +++ linux-2.6.20.1/arch/x86_64/kernel/functionlist @@ -932,6 +932,7 @@ *(.text.sys_getitimer) *(.text.sys_getgroups) *(.text.sys_ftruncate) +*(.text.sys_fallocate) *(.text.sysfs_lookup) *(.text.sys_exit_group) *(.text.stub_fork) Index: linux-2.6.20.1/include/asm-powerpc/systbl.h =================================================================== --- linux-2.6.20.1.orig/include/asm-powerpc/systbl.h +++ linux-2.6.20.1/include/asm-powerpc/systbl.h @@ -305,3 +305,4 @@ SYSCALL_SPU(faccessat) COMPAT_SYS_SPU(get_robust_list) COMPAT_SYS_SPU(set_robust_list) COMPAT_SYS(move_pages) +SYSCALL(fallocate) -- Regards, Amit Arora From owner-xfs@oss.sgi.com Wed Mar 21 15:02:57 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 21 Mar 2007 15:03:02 -0700 (PDT) X-Spam-oss-Status: No, 
score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtp109.sbc.mail.mud.yahoo.com (smtp109.sbc.mail.mud.yahoo.com [68.142.198.208]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2LM2t6p023566 for ; Wed, 21 Mar 2007 15:02:56 -0700 Received: (qmail 66904 invoked from network); 21 Mar 2007 21:36:13 -0000 Received: from unknown (HELO stupidest.org) (cwedgwood@sbcglobal.net@24.5.75.45 with login) by smtp109.sbc.mail.mud.yahoo.com with SMTP; 21 Mar 2007 21:36:12 -0000 X-YMail-OSG: KK2MqVEVM1lAHPoRPHPPG7b91HN1AhEDe3HhCi_3f_p.2hju.P2.Ofmfk1Ap5cZxJCM5KOvlYTLI2M9O70O9tfToyORihNQfBjev5FXcCj9FZC8kcN9gJrjNuFI.AEnQGNCpv8761QjBxA-- Received: by tuatara.stupidest.org (Postfix, from userid 10000) id 8C6E41826127; Wed, 21 Mar 2007 14:36:11 -0700 (PDT) Date: Wed, 21 Mar 2007 14:36:11 -0700 From: Chris Wedgwood To: "Daniele P." Cc: xfs@oss.sgi.com Subject: Re: xfsrepair memory consumption Message-ID: <20070321213611.GB1208@tuatara.stupidest.org> References: <200703210843.TAA08491@larry.melbourne.sgi.com> <200703211208.27665.daniele@interline.it> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200703211208.27665.daniele@interline.it> X-archive-position: 10916 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cw@f00f.org Precedence: bulk X-list: xfs Content-Length: 304 Lines: 9 On Wed, Mar 21, 2007 at 12:08:27PM +0100, Daniele P. wrote: > Phase 6 still uses little more memory (killed!). > I will add more memory and will test later. Stupid question (Sorry, I didn't read the thread that carefully), is there any reason you can't just add swap-space and let it thrash a little? 
From owner-xfs@oss.sgi.com Wed Mar 21 15:01:49 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 21 Mar 2007 15:01:57 -0700 (PDT) X-Spam-oss-Status: No, score=-0.4 required=5.0 tests=AWL,BAYES_40 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtp106.sbc.mail.mud.yahoo.com (smtp106.sbc.mail.mud.yahoo.com [68.142.198.205]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2LM1m6p023250 for ; Wed, 21 Mar 2007 15:01:49 -0700 Received: (qmail 67788 invoked from network); 21 Mar 2007 21:35:06 -0000 Received: from unknown (HELO stupidest.org) (cwedgwood@sbcglobal.net@24.5.75.45 with login) by smtp106.sbc.mail.mud.yahoo.com with SMTP; 21 Mar 2007 21:35:06 -0000 X-YMail-OSG: pjw7UZkVM1nFyALxVVqcIST4efuApn9dwh3R7L1UVUbwYfruJq.UdMmUHdwiJM8vh6rmFfam3HlbLLlxtaPAwFXf7zg6WNVPSaT6jtPVLzG5nzkpmWg2ivceCMy9UeFYawpiJtQpfh8F25w- Received: by tuatara.stupidest.org (Postfix, from userid 10000) id 3B5EB1826129; Wed, 21 Mar 2007 14:35:02 -0700 (PDT) Date: Wed, 21 Mar 2007 14:35:02 -0700 From: Chris Wedgwood To: "Amit K. 
Arora" Cc: Matthew Wilcox , Heiko Carstens , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, Andrew Morton , suparna@in.ibm.com, cmm@us.ibm.com, alex@clusterfs.com, suzuki@in.ibm.com Subject: Re: [RFC][PATCH] sys_fallocate() system call Message-ID: <20070321213501.GA1208@tuatara.stupidest.org> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070321120425.GA27273@amitarora.in.ibm.com> X-archive-position: 10915 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cw@f00f.org Precedence: bulk X-list: xfs Content-Length: 1181 Lines: 33 I hate to comment at this late stage, especially on something that I think is really a great idea (I did something similar but more complex, sys_blkalloc with even more arguments, some time ago --- I'm glad, given how complex this thread has become, that I didn't post them now). In the past there wasn't that much incentive to get this functionality exposed because of various other issues (mmap + page dirty didn't flush reliably) which are close to being resolved, so I think the timing of this is really great.... On Wed, Mar 21, 2007 at 05:34:25PM +0530, Amit K. Arora wrote: > As suggested by you and Russel, I have made this change to the > patch. Here is how it looks like now. Please let me know if anyone > has concerns about passing arguments this way (breaking each > "loff_t" into two "u32"s). I really dislike breaking 64-bit args up unless it's necessary. I guess it doesn't really hurt, but it feels needlessly ugly. 
> + .long sys_fallocate /* 320 */ > +/* > + * fallocate() modes > + */ > +#define FA_ALLOCATE 0x1 > +#define FA_DEALLOCATE 0x2 > + given there are the only TWO modes right now, why not leave the arguments as 64-bit sane and simply have two syscalls, one for each? From owner-xfs@oss.sgi.com Wed Mar 21 20:21:59 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 21 Mar 2007 20:22:03 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from relay.sgi.com (netops-testserver-4.corp.sgi.com [192.26.58.214]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2M3Lv6p010028 for ; Wed, 21 Mar 2007 20:21:59 -0700 Received: from estes.americas.sgi.com (estes.americas.sgi.com [128.162.236.10]) by netops-testserver-4.corp.sgi.com (Postfix) with ESMTP id 9A2F861BBE for ; Wed, 21 Mar 2007 19:04:53 -0800 (PST) Received: from [134.15.64.27] (cf-vpn-sw-corp-64-27.corp.sgi.com [134.15.64.27]) by estes.americas.sgi.com (Postfix) with ESMTP id 3BC31700072E; Wed, 21 Mar 2007 21:37:23 -0500 (CDT) Message-ID: <4601F9EB.30201@sgi.com> Date: Wed, 21 Mar 2007 21:37:15 -0600 From: Bill Kendall User-Agent: Thunderbird 1.5.0.10 (Windows/20070221) MIME-Version: 1.0 To: Timothy Shimmin Cc: nscott@aconex.com, xfs@oss.sgi.com Subject: Re: But wait, theres more... [Fwd: Bug#415123: -R option can't append to a plain file] References: <1174252993.5051.233.camel@edge> <17CB3A8C59A80064581C0A32@timothy-shimmins-power-mac-g5.local> In-Reply-To: <17CB3A8C59A80064581C0A32@timothy-shimmins-power-mac-g5.local> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10917 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: wkendall@sgi.com Precedence: bulk X-list: xfs Content-Length: 1939 Lines: 63 Tim summed it up pretty well. I'd agree it's more of a feature request. 
In case it isn't clear, the way to resume a dump to a regular file is to point xfsdump at a new file, just like incremental dumps go in a different file than the base dump. Bill Timothy Shimmin wrote: > Hi, > > I'll leave this to Bill. Just a couple of comments. > > I don't know if you can call it a "bug", more of a restriction. > There is a layer provided for the I/O using drive_*.c - > i.e drive_scsitape.c, drive_minrmt.c, drive_simple.c (which show up > as a "drive strategy" on output). > As part of the ds_instantiate() interface one typically sets up the > d_capabilities > that this drive strategy supports and for files (drive_simple), > DRIVE_CAP_APPEND is > not given. What the ramifications for drive_simple (file strategy) are, > I wouldn't know without looking at the code and playing with it. > But I'd call it an RFE :) > > --Tim > > --On 19 March 2007 8:23:13 AM +1100 Nathan Scott wrote: > >> And a free set of steak knives if you fix this bug... ;-) >> >> -------- Forwarded Message -------- >> From: Peter Chubb >> Reply-To: Peter Chubb , >> 415123@bugs.debian.org >> To: submit@bugs.debian.org >> Subject: Bug#415123: -R option can't append to a plain file >> Date: Fri, 16 Mar 2007 19:54:37 +1100 >> >> Package: xfsdump >> Version: 2.2.38-1 >> >> If I use xfsdump to dump to a plain file, interrupt it, then restart >> with the -R option, xfsdump complains: >> >> xfsdump: ERROR: media contains valid xfsdump but does not support append >> >> which is misleading: of *course* you can append to a regular file if >> there's space on the filesystem. 
>> >> -- >> Dr Peter Chubb http://www.gelato.unsw.edu.au peterc AT >> gelato.unsw.edu.au >> http://www.ertos.nicta.com.au ERTOS within National ICT >> Australia >> >> -- >> Nathan >> > > > From owner-xfs@oss.sgi.com Thu Mar 22 09:41:28 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 22 Mar 2007 09:41:34 -0700 (PDT) X-Spam-oss-Status: No, score=4.1 required=5.0 tests=AWL,BAYES_99, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,HTML_MESSAGE,J_CHICKENPOX_42, RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from service.eng.exegy.net (68-191-203-42.static.stls.mo.charter.com [68.191.203.42]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2MGfO6p004798 for ; Thu, 22 Mar 2007 09:41:26 -0700 Received: from HANAFORD.eng.exegy.net (hanaford.eng.exegy.net [10.19.1.4]) by service.eng.exegy.net (8.13.1/8.13.1) with ESMTP id l2MG0tBb023225; Thu, 22 Mar 2007 11:01:00 -0500 X-Ninja-PIM: Scanned by Ninja X-Ninja-AttachmentFiltering: (no action) thread-index: Acdsm0XEWqEP6p2PTk6D/v+ouqag0w== Received: from [10.19.4.86] ([10.19.4.86]) by HANAFORD.eng.exegy.net over TLS secured channel with Microsoft SMTPSVC(6.0.3790.1830); Thu, 22 Mar 2007 11:00:54 -0500 Message-ID: <4602A836.2010004@exegy.com> Date: Thu, 22 Mar 2007 11:00:54 -0500 From: "Mr. 
Berkley Shands" User-Agent: Thunderbird 1.5.0.9 (X11/20070105) Importance: normal X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.2826 MIME-Version: 1.0 To: linux-kernel@vger.kernel.org Cc: linux-xfs@oss.sgi.com, linux-scsi@vger.kernel.org, Dave Lloyd Subject: 2.6.20, XFS, mptsas and LSI-8888ELP write lockup X-OriginalArrivalTime: 22 Mar 2007 16:00:54.0722 (UTC) FILETIME=[45A06A20:01C76C9B] Priority: Normal Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit X-archive-position: 10919 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bshands@exegy.com Precedence: bulk X-list: xfs Content-Length: 2197 Lines: 59 I have a uniwide 3346, 20GB 8-core Opteron (2.2GHz) With one LSI-8888ELP PCI-e SAS controller. Each of the two external mini-sas 4-lane connectors goes to an Xtore XJ-SAxx-316J-2 (16 drive 3U with dual SAS ports). Running both Centos 4.4 or RedHat-5 Server with kernel 2.6.20. The raids are all 4 drive raid0, 128KB stripes, using Seagate 7200.10 320GB drives (all firmware AAJ or later) Sata-II drives. XFS filesystem, properly aligned, mkfs'ed etc. If I write to all 4 VDs on one Xtore at a time, I get ~550MB/Sec combined write rate. If I try writing to 7 of the 8 VDs at one time, I get just under 800MB/Sec. Any 7 of 8 VDs. But If I try to write to ALL 8 at once, the system crawls to a stop, and I get Micro-Bytes per second. xfs_datad/ process runs every few seconds, then pdflush runs two or three at a time every few seconds. But iostat reports 0 writes. every 10 seconds or so (iostat 5) reports a burst of write activity, and then 10 seconds of nothing. If I use the Adaptec 4805 (two of them), I get about the same write rate, but it does not hang (AACRAID driver). same type of file systems, same external enclosures. I had a "echo 10 > /proc/sys/vm/dirty_ratio" fix in to correct slow writes (vs 2.6.18), but altering this value changed nothing. 
It still runs oh-so-slow. What to look at next? 2.6.21-rc4 ? berkley -- //E. F. Berkley Shands, MSc// **Exegy Inc.** 3668 S. Geyer Road, Suite 300 St. Louis, MO 63127 Direct: (314) 450-5348 Cell: (314) 303-2546 Office: (314) 450-5353 Fax: (314) 450-5354 The Usual Disclaimer follows... This e-mail and any documents accompanying it may contain legally privileged and/or confidential information belonging to Exegy, Inc. Such information may be protected from disclosure by law. The information is intended for use by only the addressee. If you are not the intended recipient, you are hereby notified that any disclosure or use of the information is strictly prohibited. If you have received this e-mail in error, please immediately contact the sender by e-mail or phone regarding instructions for return or destruction and do not use or disclose the content to others. [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Mar 22 09:44:16 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 22 Mar 2007 09:44:20 -0700 (PDT) X-Spam-oss-Status: No, score=4.4 required=5.0 tests=AWL,BAYES_99, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,J_CHICKENPOX_42,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from service.eng.exegy.net (68-191-203-42.static.stls.mo.charter.com [68.191.203.42]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2MGiE6p005768 for ; Thu, 22 Mar 2007 09:44:16 -0700 Received: from HANAFORD.eng.exegy.net (hanaford.eng.exegy.net [10.19.1.4]) by service.eng.exegy.net (8.13.1/8.13.1) with ESMTP id l2MGiDJr026289 for ; Thu, 22 Mar 2007 11:44:13 -0500 X-Ninja-PIM: Scanned by Ninja X-Ninja-AttachmentFiltering: (no action) thread-index: AcdsoVItJYe0slh3TXWXn0OJveo8eQ== Received: from [10.19.4.86] ([10.19.4.86]) by HANAFORD.eng.exegy.net over TLS secured channel with Microsoft SMTPSVC(6.0.3790.1830); Thu, 22 Mar 2007 11:44:12 -0500 Message-ID: <4602B25A.80009@exegy.com> Date: Thu, 22 Mar 2007 11:44:10 -0500 From: "Mr. 
Berkley Shands" Content-Class: urn:content-classes:message Importance: normal User-Agent: Thunderbird 1.5.0.9 (X11/20070105) X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.2826 MIME-Version: 1.0 To: linux-xfs@oss.sgi.com Subject: 2.6.20 XFS and LSI8888ELP race with 8 Virtual drives Content-Type: text/plain; charset="iso-8859-1"; format="flowed" X-OriginalArrivalTime: 22 Mar 2007 16:44:12.0559 (UTC) FILETIME=[520EE1F0:01C76CA1] Priority: Normal Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2MGiG6p005780 X-archive-position: 10920 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bshands@exegy.com Precedence: bulk X-list: xfs Content-Length: 2159 Lines: 55 I have a uniwide 3346, 20GB 8-core Opteron (2.2GHz) With one LSI-8888ELP PCI-e SAS controller. Each of the two external mini-sas 4-lane connectors goes to an Xtore XJ-SAxx-316J-2 (16 drive 3U with dual SAS ports). Running both Centos 4.4 or RedHat-5 Server with kernel 2.6.20. The raids are all 4 drive raid0, 128KB stripes, using Seagate 7200.10 320GB drives (all firmware AAJ or later) Sata-II drives. XFS filesystem, properly aligned, mkfs'ed etc. If I write to all 4 VDs on one Xtore at a time, I get ~550MB/Sec combined write rate. If I try writing to 7 of the 8 VDs at one time, I get just under 800MB/Sec. Any 7 of 8 VDs. But If I try to write to ALL 8 at once, the system crawls to a stop, and I get Micro-Bytes per second. xfs_datad/ process runs every few seconds, then pdflush runs two or three at a time every few seconds. But iostat reports 0 writes. every 10 seconds or so (iostat 5) reports a burst of write activity, and then 10 seconds of nothing. If I use the Adaptec 4805 (two of them), I get about the same write rate, but it does not hang (AACRAID driver). same type of file systems, same external enclosures. 
I had a "echo 10 > /proc/sys/vm/dirty_ratio" fix in to correct slow writes (vs 2.6.18), but altering this value changed nothing. It still runs oh-so-slow. What to look at next? 2.6.21-rc4 ? berkley -- //E. F. Berkley Shands, MSc// **Exegy Inc.** 3668 S. Geyer Road, Suite 300 St. Louis, MO 63127 Direct: (314) 450-5348 Cell: (314) 303-2546 Office: (314) 450-5353 Fax: (314) 450-5354 The Usual Disclaimer follows... This e-mail and any documents accompanying it may contain legally privileged and/or confidential information belonging to Exegy, Inc. Such information may be protected from disclosure by law. The information is intended for use by only the addressee. If you are not the intended recipient, you are hereby notified that any disclosure or use of the information is strictly prohibited. If you have received this e-mail in error, please immediately contact the sender by e-mail or phone regarding instructions for return or destruction and do not use or disclose the content to others. From owner-xfs@oss.sgi.com Thu Mar 22 10:41:47 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 22 Mar 2007 10:41:49 -0700 (PDT) X-Spam-oss-Status: No, score=4.6 required=5.0 tests=AWL,BAYES_99, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from service.eng.exegy.net (68-191-203-42.static.stls.mo.charter.com [68.191.203.42]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2MHfi6p018498 for ; Thu, 22 Mar 2007 10:41:46 -0700 Received: from HANAFORD.eng.exegy.net (hanaford.eng.exegy.net [10.19.1.4]) by service.eng.exegy.net (8.13.1/8.13.1) with ESMTP id l2MHeg2P030026; Thu, 22 Mar 2007 12:40:58 -0500 X-Ninja-PIM: Scanned by Ninja X-Ninja-AttachmentFiltering: (no action) thread-index: AcdsqTa+50LTdCGLRX2maAZdEaPN/A== Received: from [10.19.4.86] ([10.19.4.86]) by HANAFORD.eng.exegy.net over TLS secured channel with Microsoft SMTPSVC(6.0.3790.1830); Thu, 22 Mar 2007 12:40:42 -0500 Message-ID: 
<4602BF9A.2080402@exegy.com> Date: Thu, 22 Mar 2007 12:40:42 -0500 From: "Mr. Berkley Shands" Content-Class: urn:content-classes:message User-Agent: Thunderbird 1.5.0.9 (X11/20070105) Importance: normal X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.2826 MIME-Version: 1.0 To: linux-xfs@oss.sgi.com, linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org Cc: Dave Lloyd Subject: 2.6.20 (+2.6.21-rc4) write hang modifier Content-Type: text/plain; charset="iso-8859-1"; format="flowed" X-OriginalArrivalTime: 22 Mar 2007 17:40:42.0426 (UTC) FILETIME=[369389A0:01C76CA9] Priority: Normal Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2MHfl6p018517 X-archive-position: 10921 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bshands@exegy.com Precedence: bulk X-list: xfs Content-Length: 1138 Lines: 33 If I set /proc/sys/vm/dirty_ratio to "5" the write lockup happens within the first MilliSecond, at about 100MB per file system. If I slew the dirty_ratio to 80, the lockup happens at about 2GB per file system. 2.6.21-rc4 has the same symptoms, and same lockups. It appears that mptsas is dropping something. berkley -- //E. F. Berkley Shands, MSc// **Exegy Inc.** 3668 S. Geyer Road, Suite 300 St. Louis, MO 63127 Direct: (314) 450-5348 Cell: (314) 303-2546 Office: (314) 450-5353 Fax: (314) 450-5354 The Usual Disclaimer follows... This e-mail and any documents accompanying it may contain legally privileged and/or confidential information belonging to Exegy, Inc. Such information may be protected from disclosure by law. The information is intended for use by only the addressee. If you are not the intended recipient, you are hereby notified that any disclosure or use of the information is strictly prohibited. 
If you have received this e-mail in error, please immediately contact the sender by e-mail or phone regarding instructions for return or destruction and do not use or disclose the content to others. From owner-xfs@oss.sgi.com Thu Mar 22 12:10:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 22 Mar 2007 12:10:05 -0700 (PDT) X-Spam-oss-Status: No, score=4.7 required=5.0 tests=AWL,BAYES_99, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from service.eng.exegy.net (68-191-203-42.static.stls.mo.charter.com [68.191.203.42]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2MJA06p003855 for ; Thu, 22 Mar 2007 12:10:01 -0700 Received: from HANAFORD.eng.exegy.net (hanaford.eng.exegy.net [10.19.1.4]) by service.eng.exegy.net (8.13.1/8.13.1) with ESMTP id l2MJ8h5f003448; Thu, 22 Mar 2007 14:09:00 -0500 X-Ninja-PIM: Scanned by Ninja X-Ninja-AttachmentFiltering: (no action) thread-index: AcdstYJR9Nn3qPLNQ5W7+oKl0M2Klg== Received: from [10.19.4.86] ([10.19.4.86]) by HANAFORD.eng.exegy.net over TLS secured channel with Microsoft SMTPSVC(6.0.3790.1830); Thu, 22 Mar 2007 14:08:43 -0500 Message-ID: <4602D433.3070406@exegy.com> Date: Thu, 22 Mar 2007 14:08:35 -0500 From: "Mr. 
Berkley Shands" Content-Class: urn:content-classes:message Importance: normal X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.2826 User-Agent: Thunderbird 1.5.0.9 (X11/20070105) MIME-Version: 1.0 To: linux-xfs@oss.sgi.com, linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org Subject: Quick fix for LSI8888ELP overruns in 2.6.20 and 2.6.21-rc4 Content-Type: text/plain; charset="iso-8859-1"; format="flowed" X-OriginalArrivalTime: 22 Mar 2007 19:08:43.0275 (UTC) FILETIME=[823525B0:01C76CB5] Priority: Normal Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2MJA16p003869 X-archive-position: 10922 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bshands@exegy.com Precedence: bulk X-list: xfs Content-Length: 1107 Lines: 21

My local kernel guru recognized this problem as a duplicate of the LSI8408E bus lockup problem. The controller gets flooded with work, and just climbs under a rock to hide. In the 8408E case, it froze the PCI-e bus :-)

#!/bin/csh
#
# set the max request queue down from 128. any more than 32
# quickly slows down from 810MB/Sec to 11MB/Sec.
#
setenv X 32
#
foreach i ( /sys/block/sd{g,h,i,j,k,l,m,n}/queue/nr_requests)
echo $X > $i
end

Lots of time was spent in the congestion queue, like 99% of the time.

Berkley

This e-mail and any documents accompanying it may contain legally privileged and/or confidential information belonging to Exegy, Inc. Such information may be protected from disclosure by law. The information is intended for use by only the addressee. If you are not the intended recipient, you are hereby notified that any disclosure or use of the information is strictly prohibited. If you have received this e-mail in error, please immediately contact the sender by e-mail or phone regarding instructions for return or destruction and do not use or disclose the content to others. 
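The csh loop above translates to plain sh roughly as follows. This is a sketch, not Berkley's script: the SYSFS_ROOT override is an assumption added here so the function can be tried against a scratch directory, and the device names should be adjusted to whichever sd devices sit behind the controller.

```shell
#!/bin/sh
# Sketch: lower the block-layer request queue depth for a list of disks.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
set_nr_requests() {
    depth=$1    # new nr_requests value, e.g. 32
    shift       # remaining arguments are device names such as sdg
    for dev in "$@"; do
        f="${SYSFS_ROOT:-/sys}/block/$dev/queue/nr_requests"
        if [ -w "$f" ]; then
            echo "$depth" > "$f"
        else
            # Device missing or not writable (not root): report and move on.
            echo "skipping $dev: $f not writable" >&2
        fi
    done
}

# Example, as root, matching the devices in the csh loop:
#   set_nr_requests 32 sdg sdh sdi sdj sdk sdl sdm sdn
```

The setting does not survive a reboot, so it would need to be reapplied from a boot script for as long as the workaround is in use.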
From owner-xfs@oss.sgi.com Thu Mar 22 18:38:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 22 Mar 2007 18:38:08 -0700 (PDT) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.suse.de (ns2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2N1c16p028009 for ; Thu, 22 Mar 2007 18:38:04 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 5FEBB21BA5 for ; Fri, 23 Mar 2007 02:26:35 +0100 (CET) From: Neil Brown To: xfs@oss.sgi.com Date: Fri, 23 Mar 2007 12:26:31 +1100 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17923.11463.459927.628762@notabene.brown> Subject: XFS and write barriers. X-Mailer: VM 7.19 under Emacs 21.4.1 X-face: [Gw_3E*Gng}4rRrKRYotwlE?.2|**#s9D Hi, I have two concerns related to XFS and write barrier support that I'm hoping can be resolved. Firstly in xfs_mountfs_check_barriers in fs/xfs/linux-2.6/xfs_super.c, it tests ....->queue->ordered to see if that is QUEUE_ORDERED_NONE. If it is, then barriers are disabled. I think this is a layering violation - xfs really has no business looking that deeply into the device. For dm and md devices, ->ordered is never used and so never set, so xfs will never use barriers on those devices (as the default value is 0 or NONE). It is true that md and dm could set ->ordered to some non-zero value just to please XFS, but that would be telling a lie and there is no possible value that is relevant to a layered devices. I think this test should just be removed and the xfs_barrier_test should be the main mechanism for seeing if barriers work. Secondly, if a barrier write fails due to EOPNOTSUPP, it should be retried without the barrier (after possibly waiting for dependant requests to complete). This is what other filesystems do, but I cannot find the code in xfs which does this. The approach taken by xfs_barrier_test seems to suggest that xfs does do this... 
could someone please point me to the code ? This is particularly important for md/raid1 as it is quite possible that barriers will be supported at first, but after a failure and different device on a different controller could be swapped in that does not support barriers. Thanks for your time, NeilBrown From owner-xfs@oss.sgi.com Thu Mar 22 22:31:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 22 Mar 2007 22:31:05 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2N5Uu6p008117 for ; Thu, 22 Mar 2007 22:30:59 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA06047; Fri, 23 Mar 2007 16:30:46 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2N5UjAf36631137; Fri, 23 Mar 2007 16:30:45 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2N5Uh4r36747957; Fri, 23 Mar 2007 16:30:43 +1100 (AEDT) Date: Fri, 23 Mar 2007 16:30:43 +1100 From: David Chinner To: Neil Brown Cc: xfs@oss.sgi.com, hch@infradead.org Subject: Re: XFS and write barriers. 
Message-ID: <20070323053043.GD32602149@melbourne.sgi.com> References: <17923.11463.459927.628762@notabene.brown> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <17923.11463.459927.628762@notabene.brown> User-Agent: Mutt/1.4.2.1i X-archive-position: 10924 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 5477 Lines: 144 On Fri, Mar 23, 2007 at 12:26:31PM +1100, Neil Brown wrote: > > Hi, > I have two concerns related to XFS and write barrier support that I'm > hoping can be resolved. > > Firstly in xfs_mountfs_check_barriers in fs/xfs/linux-2.6/xfs_super.c, > it tests ....->queue->ordered to see if that is QUEUE_ORDERED_NONE. > If it is, then barriers are disabled. > > I think this is a layering violation - xfs really has no business > looking that deeply into the device. Except that the device behaviour determines what XFS needs to do and there used to be no other way to find out. Christoph, any reason for needing this check anymore? I can't see any particular reason for needing to do this as __make_request() will check it for us when we test now. > I think this test should just be removed and the xfs_barrier_test > should be the main mechanism for seeing if barriers work. Yup. > Secondly, if a barrier write fails due to EOPNOTSUPP, it should be > retried without the barrier (after possibly waiting for dependant > requests to complete). This is what other filesystems do, but I > cannot find the code in xfs which does this. XFS doesn't handle this - I was unaware that the barrier status of the underlying block device could change.... OOC, when did this behaviour get introduced? > The approach taken by xfs_barrier_test seems to suggest that xfs does > do this... could someone please point me to the code ? 
We test at mount time if barriers are supported, and the decision lasts the life of the mount. > This is particularly important for md/raid1 as it is quite possible > that barriers will be supported at first, but after a failure and > different device on a different controller could be swapped in that > does not support barriers. I/O errors are not the way this should be handled. What happens if the opposite happens? A drive that needs barriers is used as a replacement on a filesystem that has barriers disabled because they weren't needed? Now a crash can result in filesystem corruption, but the filesystem has not been able to warn the admin that this situation occurred. /waves hands At the recent FS/IO workshop in San Jose I raised the issue of how we can get the I/O layers to tell the filesystems about changes in status of the block layer that can affect filesystem behaviour. This is a perfect example of the sort of communication that is needed.... In the mean time, we'll need to do something like the untested patch below. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group --- fs/xfs/linux-2.6/xfs_buf.c | 13 ++++++++++++- fs/xfs/linux-2.6/xfs_super.c | 8 -------- fs/xfs/xfs_log.c | 13 +++++++++++++ 3 files changed, 25 insertions(+), 9 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_buf.c 2007-02-07 15:51:09.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_buf.c 2007-03-23 16:19:19.790517132 +1100 @@ -1000,7 +1000,18 @@ xfs_buf_iodone_work( xfs_buf_t *bp = container_of(work, xfs_buf_t, b_iodone_work); - if (bp->b_iodone) + /* + * We can get an EOPNOTSUPP to ordered writes. Here we clear the + * ordered flag and reissue them. Because we can't tell the higher + * layers directly that they should not issue ordered I/O anymore, they + * need to check if the ordered flag was cleared during I/O completion. 
+ */ + if ((bp->b_error == EOPNOTSUPP) && + (bp->b_flags & (XBF_ORDERED|XBF_ASYNC)) == (XBF_ORDERED|XBF_ASYNC)) { + XB_TRACE(bp, "ordered_retry", bp->b_iodone); + bp->b_flags &= ~XBF_ORDERED; + xfs_buf_iorequest(bp); + } else if (bp->b_iodone) (*(bp->b_iodone))(bp); else if (bp->b_flags & XBF_ASYNC) xfs_buf_relse(bp); Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2007-03-23 15:00:05.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2007-03-23 16:19:40.355818889 +1100 @@ -961,6 +961,19 @@ xlog_iodone(xfs_buf_t *bp) l = iclog->ic_log; /* + * If the ordered flag has been removed by a lower + * layer, it means the underlyin device no longer supports + * barrier I/O. Warn loudly and turn off barriers. + */ + if ((l->l_mp->m_flags & XFS_MOUNT_BARRIER) && !XFS_BUF_ORDERED(bp)) { + l->l_mp->m_flags &= ~XFS_MOUNT_BARRIER; + xfs_fs_cmn_err(CE_WARN, l->l_mp, + "xlog_iodone: Barriers are no longer supported" + " by device. Disabling barriers\n"); + xfs_buftrace("XLOG_IODONE BARRIERS OFF", bp); + } + + /* * Race to shutdown the filesystem if we see an error. 
*/ if (XFS_TEST_ERROR((XFS_BUF_GETERROR(bp)), l->l_mp, Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c 2007-03-16 12:48:54.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c 2007-03-23 16:24:26.998220227 +1100 @@ -314,14 +314,6 @@ xfs_mountfs_check_barriers(xfs_mount_t * return; } - if (mp->m_ddev_targp->bt_bdev->bd_disk->queue->ordered == - QUEUE_ORDERED_NONE) { - xfs_fs_cmn_err(CE_NOTE, mp, - "Disabling barriers, not supported by the underlying device"); - mp->m_flags &= ~XFS_MOUNT_BARRIER; - return; - } - if (xfs_readonly_buftarg(mp->m_ddev_targp)) { xfs_fs_cmn_err(CE_NOTE, mp, "Disabling barriers, underlying device is readonly"); From owner-xfs@oss.sgi.com Thu Mar 22 23:20:33 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 22 Mar 2007 23:20:35 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_42,J_CHICKENPOX_74 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2N6KU6p021242 for ; Thu, 22 Mar 2007 23:20:32 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA07412; Fri, 23 Mar 2007 17:20:18 +1100 Date: Fri, 23 Mar 2007 17:20:40 +1100 From: Timothy Shimmin To: Neil Brown , xfs@oss.sgi.com Subject: Re: XFS and write barriers. 
Message-ID: <1755676AA526FF7790546385@timothy-shimmins-power-mac-g5.local> In-Reply-To: <17923.11463.459927.628762@notabene.brown> References: <17923.11463.459927.628762@notabene.brown> X-Mailer: Mulberry/4.0.8 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10925 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 2807 Lines: 68 Hi Neil, --On 23 March 2007 12:26:31 PM +1100 Neil Brown wrote: > > Hi, > I have two concerns related to XFS and write barrier support that I'm > hoping can be resolved. > 1. > Firstly in xfs_mountfs_check_barriers in fs/xfs/linux-2.6/xfs_super.c, > it tests ....->queue->ordered to see if that is QUEUE_ORDERED_NONE. > If it is, then barriers are disabled. > > I think this is a layering violation - xfs really has no business > looking that deeply into the device. > For dm and md devices, ->ordered is never used and so never set, so > xfs will never use barriers on those devices (as the default value is > 0 or NONE). It is true that md and dm could set ->ordered to some > non-zero value just to please XFS, but that would be telling a lie and > there is no possible value that is relevant to a layered devices. > > I think this test should just be removed and the xfs_barrier_test > should be the main mechanism for seeing if barriers work. > Oh okay. This is all Christoph's (hch) code, so it would be good for him to comment here. The external log and readonly tests can stay though. 2. > Secondly, if a barrier write fails due to EOPNOTSUPP, it should be > retried without the barrier (after possibly waiting for dependant > requests to complete). This is what other filesystems do, but I > cannot find the code in xfs which does this. 
> The approach taken by xfs_barrier_test seems to suggest that xfs does > do this... could someone please point me to the code ? > You got me confused here. I was wondering why the test write of the superblock (in xfs_barrier_test) should be retried without barriers :) But you were referring to the writing of the log buffers using barriers. Yeah, if we get an EOPNOTSUPP AFAIK, we will report the error and shutdown the filesystem (xlog_iodone()). This will happen when one of our (up to 8) incore log buffers I/O completes and xlog_iodone handler is called. I don't believe we have a notion of barrier'ness changing for us, and we just test it at mount time. Which bit of code led you to believe we do a retry? > This is particularly important for md/raid1 as it is quite possible > that barriers will be supported at first, but after a failure and > different device on a different controller could be swapped in that > does not support barriers. > Oh okay, I see. And then later one that supported them can be swapped back in? So the other FSs are doing a sync'ed write out and then if there is an EOPNOTSUPP they retry and disable barrier support henceforth? Yeah, I guess we could do that in xlog_iodone() on failed completion and retry the write without the ORDERED flag on EOPNOTSUPP error case (and turn off the flag). Dave (dgc) can you see a problem with that? > Thanks for your time, Thanks for pointing it out. 
--Tim From owner-xfs@oss.sgi.com Fri Mar 23 00:50:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 23 Mar 2007 00:50:26 -0700 (PDT) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2N7oK6p010545 for ; Fri, 23 Mar 2007 00:50:22 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 6BED92159F; Fri, 23 Mar 2007 08:50:13 +0100 (CET) From: Neil Brown To: David Chinner Date: Fri, 23 Mar 2007 18:49:50 +1100 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17923.34462.210758.852042@notabene.brown> Cc: xfs@oss.sgi.com, hch@infradead.org Subject: Re: XFS and write barriers. In-Reply-To: message from David Chinner on Friday March 23 References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-face: [Gw_3E*Gng}4rRrKRYotwlE?.2|**#s9D On Fri, Mar 23, 2007 at 12:26:31PM +1100, Neil Brown wrote: > > Secondly, if a barrier write fails due to EOPNOTSUPP, it should be > > retried without the barrier (after possibly waiting for dependant > > requests to complete). This is what other filesystems do, but I > > cannot find the code in xfs which does this. > > XFS doesn't handle this - I was unaware that the barrier status of the > underlying block device could change.... > > OOC, when did this behaviour get introduced? Probably when md/raid1 started supported barriers.... The problem is that this interface is (as far as I can see) undocumented and not fully specified. Barriers only make sense inside drive firmware. 
Trying to emulate it in the md layer doesn't make any sense as the
filesystem is in a much better position to do any emulation required.

So as the devices can change underneath md/raid1, it must be able to
fail a barrier request at any point.

The first file systems to use barriers (ext3, reiserfs) submit a
barrier request and if that fails they decide that barriers don't
work any more and use the fall-back mechanism.

This seemed to mesh perfectly with what I needed for md, so I assumed
it was an intended feature of the interface and made md/raid1 depend
on it.

> > This is particularly important for md/raid1 as it is quite possible
> > that barriers will be supported at first, but after a failure and
> > different device on a different controller could be swapped in that
> > does not support barriers.
>
> I/O errors are not the way this should be handled. What happens if
> the opposite happens? A drive that needs barriers is used as a
> replacement on a filesystem that has barriers disabled because they
> weren't needed? Now a crash can result in filesystem corruption, but
> the filesystem has not been able to warn the admin that this
> situation occurred.

There should never be a possibility of filesystem corruption.
If a barrier request fails, the filesystem should:
  wait for any dependent request to complete
  call blkdev_issue_flush
  schedule the write of the 'barrier' block
  call blkdev_issue_flush again.

My understanding is that that sequence is as safe as a barrier, but
maybe not as fast.

The patch looks at least believable. As you can imagine it is awkward
to test thoroughly.
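
For readers following the thread, the fall-back sequence described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: `mock_dev`, `mock_fs`, `submit_barrier_write` and `fs_commit_write` are hypothetical names invented for the sketch, and the recorded `OP_FLUSH` steps merely stand in for the `blkdev_issue_flush` calls mentioned in the mail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock device: it only records the order of operations. */
enum op { OP_WAIT, OP_FLUSH, OP_WRITE, OP_BARRIER_WRITE };

struct mock_dev {
    bool    supports_barriers;  /* may change when a mirror leg is swapped */
    enum op ops[16];            /* operations seen so far, in order */
    size_t  nops;
};

struct mock_fs {
    struct mock_dev *dev;
    bool use_barriers;          /* the filesystem's current belief */
};

static void record(struct mock_dev *d, enum op o)
{
    assert(d->nops < sizeof(d->ops) / sizeof(d->ops[0]));
    d->ops[d->nops++] = o;
}

/* Stand-in for a barrier write; fails with -EOPNOTSUPP if unsupported. */
static int submit_barrier_write(struct mock_dev *d)
{
    if (!d->supports_barriers)
        return -EOPNOTSUPP;
    record(d, OP_BARRIER_WRITE);
    return 0;
}

/*
 * Write one 'barrier' block. Try a real barrier first; if the device
 * rejects it, stop using barriers and fall back to the sequence
 * wait / flush / write / flush.
 */
int fs_commit_write(struct mock_fs *fs)
{
    if (fs->use_barriers) {
        int err = submit_barrier_write(fs->dev);
        if (err != -EOPNOTSUPP)
            return err;            /* success, or an unrelated error */
        fs->use_barriers = false;  /* barriers no longer work here */
    }
    record(fs->dev, OP_WAIT);      /* wait for dependent requests */
    record(fs->dev, OP_FLUSH);     /* stands in for blkdev_issue_flush */
    record(fs->dev, OP_WRITE);     /* the 'barrier' block, written plainly */
    record(fs->dev, OP_FLUSH);     /* flush again */
    return 0;
}
```

In this model the caller never sees the EOPNOTSUPP; the only visible effect is that later writes go through the slower flush-based path.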
Thanks, NeilBrown From owner-xfs@oss.sgi.com Fri Mar 23 01:00:53 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 23 Mar 2007 01:00:56 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_42,J_CHICKENPOX_74 autolearn=no version=3.2.0-pre1-r499012 Received: from mx2.suse.de (ns2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2N80q6p012514 for ; Fri, 23 Mar 2007 01:00:53 -0700 Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 60056213ED; Fri, 23 Mar 2007 09:00:50 +0100 (CET) From: Neil Brown To: Timothy Shimmin Date: Fri, 23 Mar 2007 19:00:46 +1100 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17923.35118.139991.252734@notabene.brown> Cc: xfs@oss.sgi.com Subject: Re: XFS and write barriers. In-Reply-To: message from Timothy Shimmin on Friday March 23 References: <17923.11463.459927.628762@notabene.brown> <1755676AA526FF7790546385@timothy-shimmins-power-mac-g5.local> X-Mailer: VM 7.19 under Emacs 21.4.1 X-face: [Gw_3E*Gng}4rRrKRYotwlE?.2|**#s9D > > > I think this test should just be removed and the xfs_barrier_test > > should be the main mechanism for seeing if barriers work. > > > Oh okay. > This is all Christoph's (hch) code, so it would be good for him to comment here. > The external log and readonly tests can stay though. > Why no barriers on an external log device??? Not important, just curious. > 2. > > Secondly, if a barrier write fails due to EOPNOTSUPP, it should be > > retried without the barrier (after possibly waiting for dependant > > requests to complete). This is what other filesystems do, but I > > cannot find the code in xfs which does this. > > The approach taken by xfs_barrier_test seems to suggest that xfs does > > do this... 
could someone please point me to the code ?
> >
> You got me confused here.
> I was wondering why the test write of the superblock (in xfs_barrier_test)
> should be retried without barriers :)
> But you were referring to the writing of the log buffers using barriers.
> Yeah, if we get an EOPNOTSUPP AFAIK, we will report the error and shutdown
> the filesystem (xlog_iodone()). This will happen when one of our (up to 8)
> incore log buffers I/O completes and xlog_iodone handler is called.
> I don't believe we have a notion of barrier'ness changing for us, and
> we just test it at mount time.
> Which bit of code led you to believe we do a retry?

Uhmm.. I think I just got confused reading xfs_barrier_test; I cannot
see it anymore (I think I didn't see the error return and so assumed
some lower layer must be setting some state flag).

>
> > This is particularly important for md/raid1 as it is quite possible
> > that barriers will be supported at first, but after a failure and
> > different device on a different controller could be swapped in that
> > does not support barriers.
> >
>
> Oh okay, I see. And then later one that supported them can be swapped back in?
> So the other FSs are doing a sync'ed write out and then if there is an
> EOPNOTSUPP they retry and disable barrier support henceforth?
> Yeah, I guess we could do that in xlog_iodone() on failed completion and retry the write without
> the ORDERED flag on EOPNOTSUPP error case (and turn off the flag).
> Dave (dgc) can you see a problem with that?

If an md/raid1 disables barriers and subsequently is restored to a
state where all drives support barriers, it currently does *not*
re-enable them device-wide. This would probably be quite easy to
achieve, but as no existing filesystem would ever try barriers
again.....
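
A rough userspace model of the retry idea discussed here (reissue the log write without the ordered flag on EOPNOTSUPP, and stop asking for barriers afterwards) might look like the following. Everything in it is an assumption made for illustration: `BUF_ORDERED`, `BUF_ASYNC`, `log_buf`, `mount_state`, `device`, `log_write` and `log_io_done` are hypothetical names, I/O completes synchronously, and the flushes a real fall-back would need are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define BUF_ORDERED 0x1u   /* hypothetical: this write must act as a barrier */
#define BUF_ASYNC   0x2u   /* hypothetical: completion runs in a handler */

struct device      { bool supports_barriers; };
struct mount_state { bool barriers_enabled; };
struct log_buf     { unsigned flags; int error; int submissions; };

/* Model of issuing the I/O: an ordered write fails if unsupported. */
static void submit(struct device *dev, struct log_buf *bp)
{
    bp->submissions++;
    bp->error = ((bp->flags & BUF_ORDERED) && !dev->supports_barriers)
                    ? EOPNOTSUPP : 0;
}

/*
 * Completion handler: on EOPNOTSUPP for an ordered write, clear the
 * flag, remember that barriers are off, and reissue the whole buffer.
 * Returns true if the buffer was reissued.
 */
bool log_io_done(struct mount_state *mp, struct device *dev,
                 struct log_buf *bp)
{
    if (bp->error == EOPNOTSUPP && (bp->flags & BUF_ORDERED)) {
        bp->flags &= ~BUF_ORDERED;
        mp->barriers_enabled = false;   /* one-way: never switched back on */
        submit(dev, bp);
        assert(bp->error != EOPNOTSUPP); /* the retry carries no barrier */
        return true;
    }
    return false;
}

/* Issue one log buffer, asking for a barrier only while still enabled. */
void log_write(struct mount_state *mp, struct device *dev,
               struct log_buf *bp)
{
    bp->flags = BUF_ASYNC | (mp->barriers_enabled ? BUF_ORDERED : 0u);
    bp->submissions = 0;
    submit(dev, bp);
    log_io_done(mp, dev, bp);
}
```

The model also shows the point made just above: once `barriers_enabled` has been cleared, a device that later regains barrier support is never asked for an ordered write again.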
Thanks, NeilBrown From owner-xfs@oss.sgi.com Fri Mar 23 01:37:31 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 23 Mar 2007 01:37:34 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.interline.it (mail.interline.it [195.182.241.4]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2N8bS6p018402 for ; Fri, 23 Mar 2007 01:37:31 -0700 Received: from localhost (localhost [127.0.0.1]) by mail.interline.it (Postfix) with ESMTP id D864D103B for ; Fri, 23 Mar 2007 09:28:23 +0100 (CET) Received: from mail.interline.it ([127.0.0.1]) by localhost (pin [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 27574-05 for ; Fri, 23 Mar 2007 09:28:17 +0100 (CET) From: "Daniele P." Organization: Interline Subject: Re: xfsrepair memory consumption Date: Fri, 23 Mar 2007 09:36:47 +0100 User-Agent: KMail/1.9.5 References: <200703210843.TAA08491@larry.melbourne.sgi.com> <200703211208.27665.daniele@interline.it> <20070321213611.GB1208@tuatara.stupidest.org> In-Reply-To: <20070321213611.GB1208@tuatara.stupidest.org> MIME-Version: 1.0 Content-Disposition: inline X-Length: 1255 X-UID: 70 To: xfs@oss.sgi.com Message-Id: <200703230936.47788.daniele@interline.it> Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-archive-position: 10928 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: daniele@interline.it Precedence: bulk X-list: xfs Content-Length: 602 Lines: 19 On Wednesday 21 March 2007 22:36, Chris Wedgwood wrote: > On Wed, Mar 21, 2007 at 12:08:27PM +0100, Daniele P. wrote: > > Phase 6 still uses little more memory (killed!). > > I will add more memory and will test later. > > Stupid question (Sorry, I didn't read the thread that carefully), is > there any reason you can't just add swap-space and let it thrash a > little? 
Oh, it's less expensive (less typing) to add ram because it's a virtual machine. Finally using the 2.8.20-1 version requires 1 GB of memory vs 750 MB for 2.6.20-1. Next time I will add more swap by default. Regards, Daniele P. From owner-xfs@oss.sgi.com Fri Mar 23 03:12:38 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 23 Mar 2007 03:12:41 -0700 (PDT) X-Spam-oss-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_40 autolearn=ham version=3.2.0-pre1-r499012 Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2NACa6p002894 for ; Fri, 23 Mar 2007 03:12:37 -0700 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1HUgQB-0003nT-Ee; Fri, 23 Mar 2007 09:50:55 +0000 Date: Fri, 23 Mar 2007 09:50:55 +0000 From: Christoph Hellwig To: David Chinner Cc: Neil Brown , xfs@oss.sgi.com, hch@infradead.org Subject: Re: XFS and write barriers. Message-ID: <20070323095055.GA13478@infradead.org> References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070323053043.GD32602149@melbourne.sgi.com> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 10929 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs Content-Length: 2739 Lines: 64 On Fri, Mar 23, 2007 at 04:30:43PM +1100, David Chinner wrote: > On Fri, Mar 23, 2007 at 12:26:31PM +1100, Neil Brown wrote: > > > > Hi, > > I have two concerns related to XFS and write barrier support that I'm > > hoping can be resolved. 
> >
> > Firstly in xfs_mountfs_check_barriers in fs/xfs/linux-2.6/xfs_super.c,
> > it tests ....->queue->ordered to see if that is QUEUE_ORDERED_NONE.
> > If it is, then barriers are disabled.
> >
> > I think this is a layering violation - xfs really has no business
> > looking that deeply into the device.
>
> Except that the device behaviour determines what XFS needs to do
> and there used to be no other way to find out.
>
> Christoph, any reason for needing this check anymore? I can't see
> any particular reason for needing to do this as __make_request()
> will check it for us when we test now.

When I first implemented it I really disliked the idea of having requests
fail asynchronously due to the lack of barriers. Then someone (Jens?)
told me we need to do this check anyway because devices might lie to
us, at which point I implemented the test superblock writeback to
check if it actually works.

So yes, we could probably get rid of the check now, although I'd
prefer the block layer exporting an API to the filesystem to tell
it whether there is any point in trying to use barriers.

> > Secondly, if a barrier write fails due to EOPNOTSUPP, it should be
> > retried without the barrier (after possibly waiting for dependant
> > requests to complete). This is what other filesystems do, but I
> > cannot find the code in xfs which does this.
>
> XFS doesn't handle this - I was unaware that the barrier status of the
> underlying block device could change....
>
> OOC, when did this behaviour get introduced?

That would be really bad. XFS metadata buffers can have multiple bios
and retrying a single one would be rather difficult.

> + /*
> + * We can get an EOPNOTSUPP to ordered writes. Here we clear the
> + * ordered flag and reissue them. Because we can't tell the higher
> + * layers directly that they should not issue ordered I/O anymore, they
> + * need to check if the ordered flag was cleared during I/O completion.
> + */ > + if ((bp->b_error == EOPNOTSUPP) && > + (bp->b_flags & (XBF_ORDERED|XBF_ASYNC)) == (XBF_ORDERED|XBF_ASYNC)) { > + XB_TRACE(bp, "ordered_retry", bp->b_iodone); > + bp->b_flags &= ~XBF_ORDERED; > + xfs_buf_iorequest(bp); > + } else if (bp->b_iodone) > (*(bp->b_iodone))(bp); > else if (bp->b_flags & XBF_ASYNC) > xfs_buf_relse(bp); So you're retrying the whole I/O, this is probably better than trying to handle this at the bio level. I still don't quite like doing another I/O from the I/O completion handler. From owner-xfs@oss.sgi.com Fri Mar 23 06:28:56 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 23 Mar 2007 06:29:04 -0700 (PDT) X-Spam-oss-Status: No, score=0.5 required=5.0 tests=AWL,BAYES_50, MIME_8BIT_HEADER autolearn=no version=3.2.0-pre1-r499012 Received: from srv-mailrelais-dz1.argus.int (smtp.argus-presse.fr [213.244.9.225]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2NDSs6p013977 for ; Fri, 23 Mar 2007 06:28:56 -0700 Received: from srv-mail-dz2.argus.int (srv-mail-dz2.argus.int [192.168.2.10]) by srv-mailrelais-dz1.argus.int (Postfix) with ESMTP id 6979426968 for ; Fri, 23 Mar 2007 14:10:42 +0100 (CET) Received: from [10.0.0.111] (pnoel-as.argus.int [10.0.0.111]) by srv-mail-dz2.argus.int (Postfix) with ESMTP id 595283D94 for ; Fri, 23 Mar 2007 14:10:42 +0100 (CET) Subject: xfs_repair on Debian testing 64bits From: Patrick =?ISO-8859-1?Q?No=EBl?= To: xfs@oss.sgi.com Content-Type: text/plain Organization: Argus de la Presse Date: Fri, 23 Mar 2007 14:10:42 +0100 Message-Id: <1174655443.5441.37.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1 Content-Transfer-Encoding: 7bit X-archive-position: 10930 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: patrick.noel@argus-presse.fr Precedence: bulk X-list: xfs Content-Length: 730 Lines: 36 Hi, i have a device with 5,3To under a debian (sarge) 32bits server. 
When i try a xfs_repair (2.8.20) on 5,3To i have a message : xfs_repair: libxfs_initbuf can't memalign 4096 bytes: Cannot allocate memory [ ... ] i tried mounting the device on a 64 bits (Debian testing) and with xfs_repair 2.8.20 there is no problem with memory but is very long (12 hours between phase 1 and phase 6) when i use xfs_repair on 32 bits i saw 8 thread but on 64 bits just only on thread. i try to force 8 thread with -o thread=8 and i have this message : xfs_repair -o thread=8 /dev/sdb1 - creating 8 worker thread(s) but with ps xa | grep repair i see only one thread. have an idea to have several thread ? Thanks Patrick From owner-xfs@oss.sgi.com Fri Mar 23 09:48:39 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 23 Mar 2007 09:48:45 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, MIME_8BIT_HEADER autolearn=no version=3.2.0-pre1-r499012 Received: from rrzmta2.rz.uni-regensburg.de (rrzmta2.rz.uni-regensburg.de [194.94.155.53]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2NGmb6p024868 for ; Fri, 23 Mar 2007 09:48:38 -0700 Received: from rrzmta2.rz.uni-regensburg.de (localhost [127.0.0.1]) by localhost (Postfix) with SMTP id B2606FC; Fri, 23 Mar 2007 17:25:39 +0100 (CET) Received: from pc51072.physik.uni-regensburg.de (pc51072.physik.uni-regensburg.de [132.199.98.129]) by rrzmta2.rz.uni-regensburg.de (Postfix) with ESMTP id A9B0CD9; Fri, 23 Mar 2007 17:25:29 +0100 (CET) Received: by pc51072.physik.uni-regensburg.de (Postfix, from userid 28561) id C7F13507069; Fri, 23 Mar 2007 17:25:22 +0100 (CET) Date: Fri, 23 Mar 2007 17:25:22 +0100 From: Christian Guggenberger To: Patrick =?iso-8859-1?Q?No=EBl?= Cc: xfs@oss.sgi.com Subject: Re: xfs_repair on Debian testing 64bits Message-ID: <20070323162522.GA27240@pc51072.physik.uni-regensburg.de> Reply-To: christian.guggenberger@physik.uni-regensburg.de References: <1174655443.5441.37.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline In-Reply-To: <1174655443.5441.37.camel@localhost.localdomain> User-Agent: Mutt/1.5.9i X-archive-position: 10931 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: christian.guggenberger@physik.uni-regensburg.de Precedence: bulk X-list: xfs Content-Length: 261 Lines: 16 > i try to force 8 thread with -o thread=8 > > and i have this message : > > xfs_repair -o thread=8 /dev/sdb1 > - creating 8 worker thread(s) > > > but with ps xa | grep repair i see only one thread. > IMHO, you'll need 'ps xaH' cheers. - Christian From owner-xfs@oss.sgi.com Fri Mar 23 14:32:50 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 23 Mar 2007 14:32:54 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from postfix1-g20.free.fr (postfix1-g20.free.fr [212.27.60.42]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2NLWn6p017830 for ; Fri, 23 Mar 2007 14:32:50 -0700 Received: from smtp5-g19.free.fr (smtp5-g19.free.fr [212.27.42.35]) by postfix1-g20.free.fr (Postfix) with ESMTP id 76297BFDD98 for ; Fri, 23 Mar 2007 22:10:22 +0100 (CET) Received: from [192.168.100.11] (alf94-14-88-166-211-59.fbx.proxad.net [88.166.211.59]) by smtp5-g19.free.fr (Postfix) with ESMTP id BE7798B9D for ; Fri, 23 Mar 2007 22:10:19 +0100 (CET) Subject: Re: xfs_repair on Debian testing 64bits From: Patrick Noel To: xfs@oss.sgi.com In-Reply-To: <20070323162522.GA27240@pc51072.physik.uni-regensburg.de> References: <1174655443.5441.37.camel@localhost.localdomain> <20070323162522.GA27240@pc51072.physik.uni-regensburg.de> Content-Type: text/plain; charset=ISO-8859-1 Date: Fri, 23 Mar 2007 22:10:14 +0100 Message-Id: <1174684214.5758.6.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1 Content-Transfer-Encoding: 8bit X-archive-position: 10932 X-ecartis-version: Ecartis v1.0.0 Sender: 
xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: patrick.noel.g@free.fr Precedence: bulk X-list: xfs Content-Length: 480 Lines: 25 Yes H for show threads as if they were processes with ps xa under debian sarge i saw the threads Thanks Patrick Le vendredi 23 mars 2007 à 17:25 +0100, Christian Guggenberger a écrit : > > i try to force 8 thread with -o thread=8 > > > > and i have this message : > > > > xfs_repair -o thread=8 /dev/sdb1 > > - creating 8 worker thread(s) > > > > > > but with ps xa | grep repair i see only one thread. > > > > IMHO, you'll need 'ps xaH' > > cheers. > - Christian From owner-xfs@oss.sgi.com Sat Mar 24 20:19:48 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 24 Mar 2007 20:19:51 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_42 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2P3Jj6p018695 for ; Sat, 24 Mar 2007 20:19:47 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA10973; Sun, 25 Mar 2007 14:19:30 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2P3JTAf38525304; Sun, 25 Mar 2007 14:19:29 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2P3JRri38507258; Sun, 25 Mar 2007 14:19:27 +1100 (AEDT) Date: Sun, 25 Mar 2007 14:19:27 +1100 From: David Chinner To: Neil Brown Cc: Timothy Shimmin , xfs@oss.sgi.com Subject: Re: XFS and write barriers. 
Message-ID: <20070325031927.GG32602149@melbourne.sgi.com> References: <17923.11463.459927.628762@notabene.brown> <1755676AA526FF7790546385@timothy-shimmins-power-mac-g5.local> <17923.35118.139991.252734@notabene.brown> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <17923.35118.139991.252734@notabene.brown> User-Agent: Mutt/1.4.2.1i X-archive-position: 10933 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1934 Lines: 49 On Fri, Mar 23, 2007 at 07:00:46PM +1100, Neil Brown wrote: > On Friday March 23, tes@sgi.com wrote: > > > > > > I think this test should just be removed and the xfs_barrier_test > > > should be the main mechanism for seeing if barriers work. > > > > > Oh okay. > > This is all Christoph's (hch) code, so it would be good for him to comment here. > > The external log and readonly tests can stay though. > > > > Why no barriers on an external log device??? Not important, just > curious. because we need to synchronize across 2 devices, not one, so issuing barriers on an external log device does nothing to order the metadata written to the other device... > > > This is particularly important for md/raid1 as it is quite possible > > > that barriers will be supported at first, but after a failure and > > > different device on a different controller could be swapped in that > > > does not support barriers. > > > > > > > Oh okay, I see. And then later one that supported them can be swapped back in? > > So the other FSs are doing a sync'ed write out and then if there is an > > EOPNOTSUPP they retry and disable barrier support henceforth? > > Yeah, I guess we could do that in xlog_iodone() on failed completion and retry the write without > > the ORDERED flag on EOPNOTSUPP error case (and turn off the flag). > > Dave (dgc) can you see a problem with that? 
> > If an md/raid1 disables barriers and subsequently is restored to a > state where all drives support barriers, it currently does *not* > re-enable them device-wide. This would probably be quite easy to > achieve, but as no existing filesystem would ever try barriers > again..... And this is exactly why I think we need a block->fs communications channel for these sorts of things. Think of something like the CPU hotplug notifier mechanisms as a rough example framework.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sat Mar 24 20:24:36 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 24 Mar 2007 20:24:38 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50, MIME_8BIT_HEADER autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2P3OX6p019912 for ; Sat, 24 Mar 2007 20:24:35 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA11027; Sun, 25 Mar 2007 14:24:23 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2P3OKAf38546464; Sun, 25 Mar 2007 14:24:21 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2P3OIWT38517866; Sun, 25 Mar 2007 14:24:18 +1100 (AEDT) Date: Sun, 25 Mar 2007 14:24:18 +1100 From: David Chinner To: Patrick =?iso-8859-1?Q?No=EBl?= Cc: xfs@oss.sgi.com Subject: Re: xfs_repair on Debian testing 64bits Message-ID: <20070325032418.GH32602149@melbourne.sgi.com> References: <1174655443.5441.37.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: 
<1174655443.5441.37.camel@localhost.localdomain>
User-Agent: Mutt/1.4.2.1i
X-archive-position: 10934
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: dgc@sgi.com
Precedence: bulk
X-list: xfs
Content-Length: 927
Lines: 30

On Fri, Mar 23, 2007 at 02:10:42PM +0100, Patrick Noël wrote:
>
> Hi,
>
> i have a device with 5,3To under a debian (sarge) 32bits server.
>
> When i try a xfs_repair (2.8.20) on 5,3To i have a message :
>
> xfs_repair: libxfs_initbuf can't memalign 4096 bytes: Cannot allocate
> memory [ ... ]

Out of memory. Not surprising - 5.3To of filesystem could take
between 10-20GB of RAM to repair successfully. Can't do that on
a 32bit machine.

> i tried mounting the device on a 64 bits (Debian testing) and with
> xfs_repair 2.8.20 there is no problem with memory but is very long (12
> hours between phase 1 and phase 6)

Repair duration is determined by the number of inodes in the
filesystem. Given the runtime you are reporting, I'd say you've got
millions (perhaps even tens of millions) of inodes in the
filesystem. Is that correct?

Cheers,

Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sat Mar 24 20:51:40 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 24 Mar 2007 20:51:42 -0700 (PDT) X-Spam-oss-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_05 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2P3pa6p025220 for ; Sat, 24 Mar 2007 20:51:38 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA11580; Sun, 25 Mar 2007 14:51:29 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2P3pSAf38547998; Sun, 25 Mar 2007 14:51:28 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2P3pQ1x38545190; Sun, 25 Mar 2007 14:51:26 +1100 (AEDT) Date: Sun, 25 Mar 2007 14:51:26 +1100 From: David Chinner To: Christoph Hellwig Cc: David Chinner , Neil Brown , xfs@oss.sgi.com Subject: Re: XFS and write barriers. 
Message-ID: <20070325035126.GI32602149@melbourne.sgi.com> References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> <20070323095055.GA13478@infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070323095055.GA13478@infradead.org> User-Agent: Mutt/1.4.2.1i X-archive-position: 10935 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 3531 Lines: 86 On Fri, Mar 23, 2007 at 09:50:55AM +0000, Christoph Hellwig wrote: > On Fri, Mar 23, 2007 at 04:30:43PM +1100, David Chinner wrote: > > On Fri, Mar 23, 2007 at 12:26:31PM +1100, Neil Brown wrote: > > > > > > Hi, > > > I have two concerns related to XFS and write barrier support that I'm > > > hoping can be resolved. > > > > > > Firstly in xfs_mountfs_check_barriers in fs/xfs/linux-2.6/xfs_super.c, > > > it tests ....->queue->ordered to see if that is QUEUE_ORDERED_NONE. > > > If it is, then barriers are disabled. > > > > > > I think this is a layering violation - xfs really has no business > > > looking that deeply into the device. > > > > Except that the device behaviour determines what XFS needs to do > > and there used to be no other way to find out. > > > > Christoph, any reason for needing this check anymore? I can't see > > any particular reason for needing to do this as __make_request() > > will check it for us when we test now. > > When I first implemented it I really dislike the idea of having request > fail asynchrnously due to the lack of barriers. Then someone (Jens?) > told me we need to do this check anyway because devices might lie to > us, at which point I implemented the test superblock writeback to > check if it actually works. 
> > So yes, we could probably get rid of the check now, although I'd > prefer the block layer exporting an API to the filesystem to tell > it whether there is any point in trying to use barriers. Ditto. > > > Secondly, if a barrier write fails due to EOPNOTSUPP, it should be > > > retried without the barrier (after possibly waiting for dependant > > > requests to complete). This is what other filesystems do, but I > > > cannot find the code in xfs which does this. > > > > XFS doesn't handle this - I was unaware that the barrier status of the > > underlying block device could change.... > > > > OOC, when did this behaviour get introduced? > > That would be really bad. XFS metadata buffers can have multiple bios > and retrying a single one would be rather difficult. > > > + /* > > + * We can get an EOPNOTSUPP to ordered writes. Here we clear the > > + * ordered flag and reissue them. Because we can't tell the higher > > + * layers directly that they should not issue ordered I/O anymore, they > > + * need to check if the ordered flag was cleared during I/O completion. > > + */ > > + if ((bp->b_error == EOPNOTSUPP) && > > + (bp->b_flags & (XBF_ORDERED|XBF_ASYNC)) == (XBF_ORDERED|XBF_ASYNC)) { > > + XB_TRACE(bp, "ordered_retry", bp->b_iodone); > > + bp->b_flags &= ~XBF_ORDERED; > > + xfs_buf_iorequest(bp); > > + } else if (bp->b_iodone) > > (*(bp->b_iodone))(bp); > > else if (bp->b_flags & XBF_ASYNC) > > xfs_buf_relse(bp); > > So you're retrying the whole I/O, this is probably better than trying > to handle this at the bio level. I still don't quite like doing another > I/O from the I/O completion handler. You're not the only one, Christoph. This may be better than trying to handle it at lower layers, and far better than having to handle it at every point in the higher layers where we may issue barrier I/Os. 
But I *seriously dislike* having to reissue async I/Os in this manner and then having to rely on a higher layer's I/O completion handler to detect the fact that the I/O was retried in order to change the way the filesystem issues I/Os in the future. It's a really crappy way of communicating between layers.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sat Mar 24 21:18:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Sat, 24 Mar 2007 21:18:14 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2P4I86p030386 for ; Sat, 24 Mar 2007 21:18:09 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA12206; Sun, 25 Mar 2007 15:17:59 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2P4HvAf38541051; Sun, 25 Mar 2007 15:17:58 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2P4HtPF38361693; Sun, 25 Mar 2007 15:17:55 +1100 (AEDT) Date: Sun, 25 Mar 2007 15:17:55 +1100 From: David Chinner To: Neil Brown Cc: David Chinner , xfs@oss.sgi.com, hch@infradead.org Subject: Re: XFS and write barriers. 
Message-ID: <20070325041755.GJ32602149@melbourne.sgi.com> References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> <17923.34462.210758.852042@notabene.brown> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <17923.34462.210758.852042@notabene.brown> User-Agent: Mutt/1.4.2.1i X-archive-position: 10936 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 3344 Lines: 88 On Fri, Mar 23, 2007 at 06:49:50PM +1100, Neil Brown wrote: > On Friday March 23, dgc@sgi.com wrote: > > On Fri, Mar 23, 2007 at 12:26:31PM +1100, Neil Brown wrote: > > > Secondly, if a barrier write fails due to EOPNOTSUPP, it should be > > > retried without the barrier (after possibly waiting for dependant > > > requests to complete). This is what other filesystems do, but I > > > cannot find the code in xfs which does this. > > > > XFS doesn't handle this - I was unaware that the barrier status of the > > underlying block device could change.... > > > > OOC, when did this behaviour get introduced? > > Probably when md/raid1 started supported barriers.... > > The problem is that this interface is (as far as I can see) undocumented > and not fully specified. And not communicated very far, either. > Barriers only make sense inside drive firmware. I disagree. e.g. Barriers have to be handled by the block layer to prevent reordering of I/O in the request queues as well. The block layer is responsible for ensuring barrier I/Os, as indicated by the filesystem, act as real barriers. > Trying to emulate it > in the md layer doesn't make any sense as the filesystem is in a much > better position to do any emulation required. You're saying that the emulation of block layer functionality is the responsibility of layers above the block layer. Why is this not considered a layering violation? 
> > > This is particularly important for md/raid1 as it is quite possible > > > that barriers will be supported at first, but after a failure and > > > different device on a different controller could be swapped in that > > > does not support barriers. > > > > I/O errors are not the way this should be handled. What happens if > > the opposite happens? A drive that needs barriers is used as a > > replacement on a filesystem that has barriers disabled because they > > weren't needed? Now a crash can result in filesystem corruption, but > > the filesystem has not been able to warn the admin that this > > situation occurred. > > There should never be a possibility of filesystem corruption. > If the a barrier request fails, the filesystem should: > wait for any dependant request to complete > call blkdev_issue_flush > schedule the write of the 'barrier' block > call blkdev_issue_flush again. IOWs, the filesystem has to use block device calls to emulate a block device barrier I/O. Why can't the block layer, on reception of a barrier write and detecting that barriers are no longer supported by the underlying device (i.e. in MD), do: wait for all queued I/Os to complete call blkdev_issue_flush schedule the write of the 'barrier' block call blkdev_issue_flush again. And not involve the filesystem at all? i.e. why should the filesystem have to do this? > My understand is that that sequence is as safe as a barrier, but maybe > not as fast. Yes, and my understanding is that the block device is perfectly capable of implementing this just as safely as the filesystem. > The patch looks at least believable. As you can imagine it is awkward > to test thoroughly. As well as being pretty much impossible to test reliably with an automated testing framework. Hence ongoing test coverage will approach zero..... Cheers, Dave. 
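The four-step fallback discussed above can be sketched as follows. This is a minimal user-space illustration in C, not kernel code: wait_for_queued_io(), issue_flush() and write_block() are hypothetical stand-ins (issue_flush() plays the role of blkdev_issue_flush), they are assumed to be synchronous, and all they do here is record the order in which they were called.

```c
#include <stddef.h>

/* Steps of the fallback, in the order the thread lists them. */
enum step { STEP_WAIT, STEP_FLUSH, STEP_WRITE };

#define MAX_STEPS 8
static enum step trace[MAX_STEPS];
static size_t ntrace;

/* Record a step so the ordering can be inspected afterwards. */
static void record(enum step s)
{
    if (ntrace < MAX_STEPS)
        trace[ntrace++] = s;
}

/* Hypothetical stand-ins; real code would wait on or submit actual I/O. */
static void wait_for_queued_io(void) { record(STEP_WAIT); }
static void issue_flush(void)        { record(STEP_FLUSH); } /* role of blkdev_issue_flush */
static void write_block(void)        { record(STEP_WRITE); }

/* Emulate one barrier write for a device that rejects barrier requests. */
static void emulate_barrier_write(void)
{
    wait_for_queued_io(); /* everything issued earlier must have completed */
    issue_flush();        /* ... and must have reached stable storage */
    write_block();        /* the 'barrier' block itself */
    issue_flush();        /* make the barrier block durable before later writes */
}
```

Whether a routine like emulate_barrier_write() belongs in the filesystem or in the block layer (e.g. MD) is exactly what this thread is debating; the sequence of steps is the same either way.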
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Mar 25 06:09:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 06:09:19 -0700 (PDT) X-Spam-oss-Status: No, score=0.8 required=5.0 tests=AWL,BAYES_60, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,J_CHICKENPOX_34,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from ty.sabi.co.UK (82-69-39-138.dsl.in-addr.zen.co.uk [82.69.39.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2PD996p031553 for ; Sun, 25 Mar 2007 06:09:11 -0700 Resent-From: pg_mh@sabi.co.UK Received: from from [127.0.0.1] (helo=base.ty.sabi.co.UK) by ty.sabi.co.UK with esmtp(Exim 4.62 #1) id 1HVReN-0008Mo-FG for ; Sun, 25 Mar 2007 13:16:43 +0100 Resent-Message-ID: <17926.26665.807869.199983@base.ty.sabi.co.UK> Resent-Date: Sun, 25 Mar 2007 13:16:41 +0100 Resent-To: linux-xfs@oss.sgi.com X-Face: SMJE]JPYVBO-9UR%/8d'mG.F!@.,l@c[f'[%S8'BZIcbQc3/">GrXDwb#;fTRGNmHr^JFb SAptvwWc,0+z+~p~"Gdr4H$(|N(yF(wwCM2bW0~U?HPEE^fkPGx^u[*[yV.gyB!hDOli}EF[\cW*S H&spRGFL}{`bj1TaD^l/"[ msn( /TH#THs{Hpj>)]f> Message-ID: <17926.26297.892974.270267@base.ty.sabi.co.UK> References: <1174655443.5441.37.camel@localhost.localdomain> In-Reply-To: <1174655443.5441.37.camel@localhost.localdomain> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2PD9B6p031564 X-archive-position: 10937 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: pg_mh@sabi.co.UK Precedence: bulk X-list: xfs Content-Length: 2961 Lines: 69 >>> On Fri, 23 Mar 2007 14:10:42 +0100, Patrick Noël >>> said: patrick.noel> Hi, i have a device with 5,3To under a debian patrick.noel> (sarge) 32bits server. 
When i try a xfs_repair patrick.noel> (2.8.20) on 5,3To i have a message : xfs_repair: patrick.noel> libxfs_initbuf can't memalign 4096 bytes: Cannot patrick.noel> allocate memory http://OSS.SGI.com/archives/linux-xfs/2005-08/msg00045.html «From some quick tests I just ran, for 32bit binaries xfs_check needs around 1GiB RAM per TiB of filesystem plus about 100MiB RAM per 1million inodes in the filesystem (more if you have lots of fragmented files). Double this for 64bit binaries. e.g. it took 1.5GiB RAM for 32bit xfs_check and 2.7GiB RAM for a 64bit xfs_check on a 1.1TiB filesystem with 3million inodes in it.» And from a message that is missing from the list archive: «To successfully check or run repair on a multi-terabyte filesystem, you need: - a 64bit machine - a 64bit xfs_repair/xfs_check binary - ~2GB RAM per terabyte of filesystem - 100-200MB of RAM per million inodes in the filesystem. xfs_repair will usually use less memory than this, but these numbers give you a ballpark figure for what a large filesystem that is > 80% full can require to repair. FWIW, last time this came up internally, the 29TB filesystem in question took ~75GB of RAM+swap to repair.» patrick.noel> [ ... ] i tried mounting the device on a 64 bits patrick.noel> (Debian testing) and with xfs_repair 2.8.20 there patrick.noel> is no problem with memory but is very long (12 patrick.noel> hours between phase 1 and phase 6) [ ... ] That seems to me quite fast, like 500GB per hour; you don't say how many files, but assuming say 50KB/file that's 10M files/hour. For comparison sometimes it takes more than two months to 'fsck' a 1.5TB filesystem with 'ext3': http://UKAI.org/b/log/debian/snapshot/1_month_fsck-2005-07-22-00-00.html http://UKAI.org/b/log/debian/snapshot/fsck_completed_but-2005-09-04-15-00.html A rarely appreciated aspect of 'fsck' is that check and repair of an error-free filesystem is usually *much* faster than for one with errors. 
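The rule of thumb quoted above (~2GB of RAM per terabyte of filesystem plus 100-200MB per million inodes) is easy to turn into a rough calculation. What follows is a minimal sketch in C; the function and struct names are made up for illustration, and the result is only the ballpark the quoted message describes, not a measured requirement.

```c
/* Rule-of-thumb figures from the quoted message; ballpark only. */
#define GB_PER_TB            2.0 /* ~2GB RAM per terabyte of filesystem */
#define GB_PER_MINODES_LOW   0.1 /* 100MB per million inodes */
#define GB_PER_MINODES_HIGH  0.2 /* 200MB per million inodes */

/* Hypothetical result type: lower and upper end of the estimate, in GB. */
struct ram_estimate {
    double low_gb;
    double high_gb;
};

/* Estimate RAM for a 64bit xfs_repair from size (TB) and inodes (millions). */
static struct ram_estimate repair_ram_estimate(double fs_tb, double million_inodes)
{
    struct ram_estimate e;

    e.low_gb  = fs_tb * GB_PER_TB + million_inodes * GB_PER_MINODES_LOW;
    e.high_gb = fs_tb * GB_PER_TB + million_inodes * GB_PER_MINODES_HIGH;
    return e;
}
```

The original report does not give an inode count, so as a placeholder take 10 million inodes: repair_ram_estimate(5.3, 10.0) yields roughly 11.6 to 12.6GB for the 5.3TB filesystem. Even allowing for xfs_repair usually needing less than the rule of thumb, that is well beyond what a 32bit process can address, which fits the memalign failure seen on the 32bit server.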
From some small scale tests I did some time ago: http://WWW.sabi.co.UK/blog/anno06-2nd.html#060424b «If one extrapolates linearly from the larger filesystem, which has an 800KiB average file size, a filesystem with 6.5TiB of data in 8.5M inodes will take at least 50 minutes to check, and one with 65TiB of data in around 85M inodes at least 11 hours. If one extrapolates from the smaller filesystem, which has a 16KiB average file size, those times must be at least 4 times larger. Again, these are optimal times, assuming that there the filesystem is freshly restored and there are no errors. One can imagine that recovery in a filesystem with damage and fragmentation be a lot slower.» From owner-xfs@oss.sgi.com Sun Mar 25 10:49:18 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 10:49:21 -0700 (PDT) X-Spam-oss-Status: No, score=2.1 required=5.0 tests=AWL,BAYES_80 autolearn=no version=3.2.0-pre1-r499012 Received: from smtp2.mundo-r.com (smtp5.mundo-r.com [212.51.32.152]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2PHnE6p031806 for ; Sun, 25 Mar 2007 10:49:16 -0700 Received: from cm44039.red83-165.mundo-r.com (HELO [192.168.1.36]) ([83.165.44.39]) by smtp2.mundo-r.com with ESMTP; 25 Mar 2007 19:49:11 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAK9SBkZTpSwn/2dsb2JhbAAN X-IronPort-AV: i="4.14,326,1170630000"; d="po'?scan'208"; a="168582179:sNHT141266265" Message-ID: <4606B617.4020807@mundo-r.com> Date: Sun, 25 Mar 2007 19:49:11 +0200 From: Antonio Trueba User-Agent: IceDove 1.5.0.10 (X11/20070307) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: New ACL translations X-Enigmail-Version: 0.94.2.0 Content-Type: multipart/mixed; boundary="------------070900010502020704060304" X-archive-position: 10938 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: atrueba@mundo-r.com Precedence: bulk X-list: xfs Content-Length: 23741 Lines: 717 This is a 
multi-part message in MIME format. --------------070900010502020704060304 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Hello all, Here I send new ACL translations to Spanish (es) and Galician (gl), both against current CVS tree. Regards, -- --------------070900010502020704060304 Content-Type: text/x-gettext-translation; name="es.po" Content-Transfer-Encoding: 8bit Content-Disposition: inline; filename="es.po" # SOME DESCRIPTIVE TITLE. # Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER # This file is distributed under the same license as the PACKAGE package. # FIRST AUTHOR , YEAR. # msgid "" msgstr "" "Project-Id-Version: acl-2.2.43.1\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2007-03-16 19:00+0100\n" "PO-Revision-Date: 2007-03-16 23:25+0100\n" "Last-Translator: Antonio Trueba \n" "Language-Team: Spanish\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=utf-8\n" "Content-Transfer-Encoding: 8bit\n" "X-Poedit-Language: Spanish\n" #: ../chacl/chacl.c:45 #, c-format msgid "Usage:\n" msgstr "Uso:\n" #: ../chacl/chacl.c:46 #, c-format msgid "\t%s acl pathname...\n" msgstr "\t%s nombre de ruta de ACL...\n" #: ../chacl/chacl.c:47 #, c-format msgid "\t%s -b acl dacl pathname...\n" msgstr "\t%s -b nombre de ruta ACL DACL...\n" #: ../chacl/chacl.c:48 #, c-format msgid "\t%s -d dacl pathname...\n" msgstr "\t%s -d nombre de ruta de ACL...\n" #: ../chacl/chacl.c:49 #, c-format msgid "\t%s -R pathname...\n" msgstr "\t%s -R ruta...\n" #: ../chacl/chacl.c:50 #, c-format msgid "\t%s -D pathname...\n" msgstr "\t%s -D ruta...\n" #: ../chacl/chacl.c:51 #, c-format msgid "\t%s -B pathname...\n" msgstr "\t%s -B ruta...\n" #: ../chacl/chacl.c:52 #, c-format msgid "\t%s -l pathname...\t[not IRIX compatible]\n" msgstr "\t%s -l ruta...\t[no compatible con IRIX]\n" #: ../chacl/chacl.c:54 #, c-format msgid "\t%s -r pathname...\t[not IRIX compatible]\n" msgstr "\t%s -r ruta...\t[no compatible con IRIX]\n" #: ../chacl/chacl.c:145 #, c-format msgid "%s: 
error removing access acl on \"%s\": %s\n" msgstr "%s: error borrando ACL de acceso en \"%s\": %s\n" #: ../chacl/chacl.c:152 #, c-format msgid "%s: error removing default acl on \"%s\": %s\n" msgstr "%s: error borrando ACL predeterminado en \"%s\": %s\n" #: ../chacl/chacl.c:171 #: ../chacl/chacl.c:190 #, c-format msgid "%s: access ACL '%s': %s at entry %d\n" msgstr "%s: ACL de acceso '%s': %s en posición %d\n" #: ../chacl/chacl.c:258 #, c-format msgid "%s: cannot get access ACL on '%s': %s\n" msgstr "%s: no se pudo obtener ACL de acceso en '%s': %s\n" #: ../chacl/chacl.c:264 #, c-format msgid "%s: cannot get default ACL on '%s': %s\n" msgstr "%s: no se pudo obtener ACL predeterminado en '%s': %s\n" #: ../chacl/chacl.c:270 #, c-format msgid "%s: cannot get access ACL text on '%s': %s\n" msgstr "%s: no se pudo obtener texto ACL de acceso en '%s': %s\n" #: ../chacl/chacl.c:277 #, c-format msgid "%s: cannot get default ACL text on '%s': %s\n" msgstr "%s: no se pudo obtener texto de ACL predeterminado en '%s': %s\n" #: ../chacl/chacl.c:303 #, c-format msgid "%s: cannot set access acl on \"%s\": %s\n" msgstr "%s: no se pudo establecer ACL de acceso en \"%s\": %s\n" #: ../chacl/chacl.c:309 #, c-format msgid "%s: cannot set default acl on \"%s\": %s\n" msgstr "%s: no se pudo establecer ACL predeterminado a \"%s\": %s\n" #: ../chacl/chacl.c:327 #, c-format msgid "%s: opendir failed: %s\n" msgstr "%s: falló la apertura: %s\n" #: ../chacl/chacl.c:341 #, c-format msgid "%s: malloc failed: %s\n" msgstr "%s: falló la asignación de memoria: %s\n" #: ../setfacl/do_set.c:391 #, c-format msgid "%s: %s: Malformed access ACL `%s': %s at entry %d\n" msgstr "%s: %s: ACL incorrecto `%s': %s en posición %d\n" #: ../setfacl/do_set.c:418 #, c-format msgid "%s: %s: Malformed default ACL `%s': %s at entry %d\n" msgstr "%s: %s: ACL predeterminado incorrecto `%s': %s en posición %d\n" #: ../setfacl/do_set.c:480 #, c-format msgid "%s: %s: Only directories can have default ACLs\n" msgstr "%s: %s: 
Sólo los directorios pueden tener ACLs predeterminados\n" #: ../setfacl/setfacl.c:151 #, c-format msgid "%s: %s: No filename found in line %d, aborting\n" msgstr "%s: %s: No se encontró nombre de archivo en línea %d, abortando\n" #: ../setfacl/setfacl.c:156 #, c-format msgid "%s: No filename found in line %d of standard input, aborting\n" msgstr "%s: No se encontró nombre de archivo en línea %d de entrada estándar, abortando\n" #: ../setfacl/setfacl.c:176 #, c-format msgid "%s: %s: %s in line %d\n" msgstr "%s: %s: %s en línea %d\n" #: ../setfacl/setfacl.c:200 #, c-format msgid "%s: %s: Cannot change owner/group: %s\n" msgstr "%s: %s: No se pudo cambiar el propietario/grupo: %s\n" #: ../setfacl/setfacl.c:240 #, c-format msgid "%s %s -- set file access control lists\n" msgstr "%s %s -- establecer listas de control de acceso a archivo\n" #: ../setfacl/setfacl.c:242 #: ../setfacl/setfacl.c:692 #, c-format msgid "Usage: %s %s\n" msgstr "Uso: %s %s\n" #: ../setfacl/setfacl.c:245 #, c-format msgid "" " -m, --modify=acl modify the current ACL(s) of file(s)\n" " -M, --modify-file=file read ACL entries to modify from file\n" " -x, --remove=acl remove entries from the ACL(s) of file(s)\n" " -X, --remove-file=file read ACL entries to remove from file\n" " -b, --remove-all remove all extended ACL entries\n" " -k, --remove-default remove the default ACL\n" msgstr "" " -m, --modify=acl modificar ACL actual(es) de archivo(s)\n" " -M, --modify-file=arch leer entradas ACL desde \"arch\"\n" " -x, --remove=acl eliminar entradas desde ACL(s) de archivo(s)\n" " -X, --remove-file=arch leer entradas de ACL a borrar desde \"arch\"\n" " -b, --remove-all eliminar todas las entradas ACL extendidas\n" " -k, --remove-default eliminar el ACL predeterminado\n" #: ../setfacl/setfacl.c:254 #, c-format msgid "" " --set=acl set the ACL of file(s), replacing the current ACL\n" " --set-file=file read ACL entries to set from file\n" " --mask do recalculate the effective rights mask\n" msgstr "" " 
--set=acl establecer ACL(s) de archivo(s), reemplazando el actual\n" " --set-file=arch leer entradas ACL a establecer desde \"arch\"\n" " --mask recalcular la máscara de permisos efectivos\n" #: ../setfacl/setfacl.c:260 #, c-format msgid "" " -n, --no-mask don't recalculate the effective rights mask\n" " -d, --default operations apply to the default ACL\n" msgstr "" " -n, --no-mask no recalcular la máscara de derechos efectivos\n" " -d, --default las operaciones afectan al ACL predeterminado\n" #: ../setfacl/setfacl.c:265 #, c-format msgid "" " -R, --recursive recurse into subdirectories\n" " -L, --logical logical walk, follow symbolic links\n" " -P, --physical physical walk, do not follow symbolic links\n" " --restore=file restore ACLs (inverse of `getfacl -R')\n" " --test test mode (ACLs are not modified)\n" msgstr "" " -R, --recursive recorrer subdirectorios recursivamente\n" " -L, --logical recorrido lógico, siguiendo enlaces simbólicos\n" " -P, --physical recorrido físico, sin seguir enlaces simbólicos\n" " --restore=file restaurar ACLs (inverso de `getfacl -R')\n" " --test modo de prueba (los ACLs no se modifican)\n" #: ../setfacl/setfacl.c:273 #: ../getfacl/getfacl.c:559 #, c-format msgid "" " --version print version and exit\n" " --help this help text\n" msgstr "" " --version escribir versión y salir\n" " --help este texto de ayuda\n" #: ../setfacl/setfacl.c:358 #: ../getfacl/getfacl.c:768 #, c-format msgid "%s: Standard input: %s\n" msgstr "%s: Entrada estándar: %s\n" #: ../setfacl/setfacl.c:494 #, c-format msgid "%s: Option -%c incomplete\n" msgstr "%s: Opción -%c incompleta\n" #: ../setfacl/setfacl.c:499 #, c-format msgid "%s: Option -%c: %s near character %d\n" msgstr "%s: Opción -%c: %s cerca del carácter %d\n" #: ../setfacl/setfacl.c:575 #, c-format msgid "%s: %s in line %d of file %s\n" msgstr "%s: %s en línea %d de archivo %s\n" #: ../setfacl/setfacl.c:583 #, c-format msgid "%s: %s in line %d of standard input\n" msgstr "%s: %s en línea %d de 
entrada estándar\n" #: ../setfacl/setfacl.c:694 #: ../getfacl/getfacl.c:782 #, c-format msgid "Try `%s --help' for more information.\n" msgstr "Escriba `%s --help' para más información.\n" #: ../getfacl/getfacl.c:463 #, c-format msgid "%s: Removing leading '/' from absolute path names\n" msgstr "%s: Eliminando '/' inicial en nombres de ruta absolutos\n" #: ../getfacl/getfacl.c:532 #, c-format msgid "%s %s -- get file access control lists\n" msgstr "%s %s -- obtener listas de control de acceso a archivo\n" #: ../getfacl/getfacl.c:534 #: ../getfacl/getfacl.c:780 #, c-format msgid "Usage: %s [-%s] file ...\n" msgstr "Uso: %s [-%s] archivo ...\n" #: ../getfacl/getfacl.c:540 #, c-format msgid " -d, --default display the default access control list\n" msgstr " -d, --default mostrar la lista de control de acceso predeterminada\n" #: ../getfacl/getfacl.c:544 #, c-format msgid "" " --access display the file access control list only\n" " -d, --default display the default access control list only\n" " --omit-header do not display the comment header\n" " --all-effective print all effective rights\n" " --no-effective print no effective rights\n" " --skip-base skip files that only have the base entries\n" " -R, --recursive recurse into subdirectories\n" " -L, --logical logical walk, follow symbolic links\n" " -P --physical physical walk, do not follow symbolic links\n" " --tabular use tabular output format\n" " --numeric print numeric user/group identifiers\n" " --absolute-names don't strip leading '/' in pathnames\n" msgstr "" " --access sólo mostrar la lista de control de acceso a fichero\n" " -d, --default sólo mostrar la lista de acceso predeterminada\n" " --omit-header no mostrar el encabezado de comentarios\n" " --all-effective mostrar todos los permisos efectivos\n" " --no-effective mostrar los permisos no efectivos\n" " --skip-base ignorar archivos que sólo tienen las entradas básicas\n" " -R, --recursive descender recursivamente en los subdirectorios\n" " -L, --logical 
recorrido lógico, siguiendo enlaces simbólicos\n" " -P --physical recorrido físico, sin seguir enlaces simbólicos\n" " --tabular usar formato de salida tabular\n" " --numeric mostrar identificadores numéricos de usuario/grupo\n" " --absolute-names no eliminar '/' inicial en nombres de ruta\n" #: ../libacl/acl_error.c:34 msgid "Multiple entries of same type" msgstr "Múltiples entradas del mismo tipo" #: ../libacl/acl_error.c:36 msgid "Duplicate entries" msgstr "Entradas duplicadas" #: ../libacl/acl_error.c:38 msgid "Missing or wrong entry" msgstr "Falta una posición o es errónea" #: ../libacl/acl_error.c:40 msgid "Invalid entry type" msgstr "Tipo de posición inválido" #: ../libacl/perm_copy_fd.c:124 #: ../libacl/perm_copy_fd.c:136 #: ../libacl/perm_copy_fd.c:198 #: ../libacl/perm_copy_file.c:124 #: ../libacl/perm_copy_file.c:139 #: ../libacl/perm_copy_file.c:150 #: ../libacl/perm_copy_file.c:235 #, c-format msgid "setting permissions for %s" msgstr "estableciendo permisos a %s" #: ../libacl/perm_copy_fd.c:186 #: ../libacl/perm_copy_file.c:199 #: ../libacl/perm_copy_file.c:224 #, c-format msgid "preserving permissions for %s" msgstr "manteniendo permisos a %s" --------------070900010502020704060304 Content-Type: text/x-gettext-translation; name="gl.po" Content-Transfer-Encoding: 8bit Content-Disposition: inline; filename="gl.po" # SOME DESCRIPTIVE TITLE. # Copyright (C) YEAR THE PACKAGE'S COPYRIGHT HOLDER # This file is distributed under the same license as the PACKAGE package. # FIRST AUTHOR , YEAR. 
# msgid "" msgstr "" "Project-Id-Version: acl-2.2.43.1\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2007-03-06 12:08+0100\n" "PO-Revision-Date: 2007-03-16 18:52+0100\n" "Last-Translator: Antonio Trueba \n" "Language-Team: Galician\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=utf-8\n" "Content-Transfer-Encoding: 8bit\n" "X-Poedit-Language: Galician\n" #: ../chacl/chacl.c:45 #, c-format msgid "Usage:\n" msgstr "Uso:\n" #: ../chacl/chacl.c:46 #, c-format msgid "\t%s acl pathname...\n" msgstr "\t%s nome de rota do ACL...\n" #: ../chacl/chacl.c:47 #, c-format msgid "\t%s -b acl dacl pathname...\n" msgstr "\t%s -b nome de ruta ACL DACL..\n" #: ../chacl/chacl.c:48 #, c-format msgid "\t%s -d dacl pathname...\n" msgstr "\t%s -d rota ó ACL...\n" #: ../chacl/chacl.c:49 #, c-format msgid "\t%s -R pathname...\n" msgstr "\t%s -R rota...\n" #: ../chacl/chacl.c:50 #, c-format msgid "\t%s -D pathname...\n" msgstr "\t%s -D rota...\n" #: ../chacl/chacl.c:51 #, c-format msgid "\t%s -B pathname...\n" msgstr "\t%s -B rota...\n" #: ../chacl/chacl.c:52 #, c-format msgid "\t%s -l pathname...\t[not IRIX compatible]\n" msgstr "\t%s -l rota...\t[non compatible con IRIX]\n" #: ../chacl/chacl.c:54 #, c-format msgid "\t%s -r pathname...\t[not IRIX compatible]\n" msgstr "\t%s -r rota...\t[non compatible con IRIX]\n" #: ../chacl/chacl.c:145 #, c-format msgid "%s: error removing access acl on \"%s\": %s\n" msgstr "%s: erro borrando ACL de acceso en \"%s\": %s\n" #: ../chacl/chacl.c:152 #, c-format msgid "%s: error removing default acl on \"%s\": %s\n" msgstr "%s: erro borrando ACL predeterminado en \"%s\": %s\n" #: ../chacl/chacl.c:171 #: ../chacl/chacl.c:190 #, c-format msgid "%s: access ACL '%s': %s at entry %d\n" msgstr "%s: ACL de acceso '%s': %s en posición %d\n" #: ../chacl/chacl.c:258 #, c-format msgid "%s: cannot get access ACL on '%s': %s\n" msgstr "%s: non se puido obter ACL de acceso en '%s': %s\n" #: ../chacl/chacl.c:264 #, c-format msgid "%s: cannot get default ACL 
on '%s': %s\n" msgstr "%s: non se puido obter ACL predeterminado en '%s': %s\n" #: ../chacl/chacl.c:270 #, c-format msgid "%s: cannot get access ACL text on '%s': %s\n" msgstr "%s: non se puido obter texto ACL de acceso en '%s': %s\n" #: ../chacl/chacl.c:277 #, c-format msgid "%s: cannot get default ACL text on '%s': %s\n" msgstr "%s: non se puido obter texto de ACL predeterminado en '%s': %s\n" #: ../chacl/chacl.c:303 #, c-format msgid "%s: cannot set access acl on \"%s\": %s\n" msgstr "%s: non se puido establecé-lo ACL de acceso en \"%s\": %s\n" #: ../chacl/chacl.c:309 #, c-format msgid "%s: cannot set default acl on \"%s\": %s\n" msgstr "%s: non se puido establecé-lo ACL predeterminado en \"%s\": %s\n" #: ../chacl/chacl.c:327 #, c-format msgid "%s: opendir failed: %s\n" msgstr "%s: a chamada a opendir fallou: %s\n" #: ../chacl/chacl.c:341 #, c-format msgid "%s: malloc failed: %s\n" msgstr "%s: a chamada a malloc fallou: %s\n" #: ../setfacl/do_set.c:391 #, c-format msgid "%s: %s: Malformed access ACL `%s': %s at entry %d\n" msgstr "%s: %s: ACL incorrecto `%s': %s na posición %d\n" #: ../setfacl/do_set.c:418 #, c-format msgid "%s: %s: Malformed default ACL `%s': %s at entry %d\n" msgstr "%s: %s: ACL predeterminado incorrecto `%s': %s na posición %d\n" #: ../setfacl/do_set.c:480 #, c-format msgid "%s: %s: Only directories can have default ACLs\n" msgstr "%s: %s: Só os directorios poden ter ACLs predeterminados\n" #: ../setfacl/setfacl.c:151 #, c-format msgid "%s: %s: No filename found in line %d, aborting\n" msgstr "%s: %s: Non se atopou nome de ficheiro na liña %d, abortando\n" #: ../setfacl/setfacl.c:156 #, c-format msgid "%s: No filename found in line %d of standard input, aborting\n" msgstr "%s: Non se atopou nome de ficheiro na liña %d da entrada estándar, abortando\n" #: ../setfacl/setfacl.c:176 #, c-format msgid "%s: %s: %s in line %d\n" msgstr "%s: %s: %s na liña %d\n" #: ../setfacl/setfacl.c:200 #, c-format msgid "%s: %s: Cannot change owner/group: %s\n" 
msgstr "%s: %s: Non se pode cambiá-lo propietario/grupo: %s\n" #: ../setfacl/setfacl.c:240 #, c-format msgid "%s %s -- set file access control lists\n" msgstr "%s %s -- establecer listas de control de acceso a ficheiro\n" #: ../setfacl/setfacl.c:242 #: ../setfacl/setfacl.c:692 #, c-format msgid "Usage: %s %s\n" msgstr "Uso: %s %s\n" #: ../setfacl/setfacl.c:245 #, c-format msgid "" " -m, --modify=acl modify the current ACL(s) of file(s)\n" " -M, --modify-file=file read ACL entries to modify from file\n" " -x, --remove=acl remove entries from the ACL(s) of file(s)\n" " -X, --remove-file=file read ACL entries to remove from file\n" " -b, --remove-all remove all extended ACL entries\n" " -k, --remove-default remove the default ACL\n" msgstr "" " -m, --modify=ACL modificá-lo ACL actual de ficheiro(s)\n" " -M, --modify-file=fich ler entradas ACL a modificar dende ficheiro\n" " -x, --remove=ACL borrar entradas do ACL de ficheiro(s)\n" " -X, --remove-file=fich ler entradas de ACL a borrar dende ficheiro\n" " -b, --remove-all borrar tódalas entradas de ACL extendidas\n" " -k, --remove-default borrar ó ACL predeterminado\n" #: ../setfacl/setfacl.c:254 #, c-format msgid "" " --set=acl set the ACL of file(s), replacing the current ACL\n" " --set-file=file read ACL entries to set from file\n" " --mask do recalculate the effective rights mask\n" msgstr "" " --set=ACL establecé-lo ACL de ficheiro(s), substituindo ó ACL actual\n" " --set-file=fich ler entradas ACL a establecer dende ficheiro\n" " --mask recalculá-la máscara de dereitos efectiva\n" #: ../setfacl/setfacl.c:260 #, c-format msgid "" " -n, --no-mask don't recalculate the effective rights mask\n" " -d, --default operations apply to the default ACL\n" msgstr "" " -n, --no-mask non recalculá-la máscara de dereitos efectiva\n" " -d, --default as operacións afectan ó ACL predeterminado\n" #: ../setfacl/setfacl.c:265 #, c-format msgid "" " -R, --recursive recurse into subdirectories\n" " -L, --logical logical walk, follow 
symbolic links\n" " -P, --physical physical walk, do not follow symbolic links\n" " --restore=file restore ACLs (inverse of `getfacl -R')\n" " --test test mode (ACLs are not modified)\n" msgstr "" " -R, --recursive recorrer subdirectorios recursivamente\n" " -L, --logical percorrido lóxico, seguindo enlaces simbólicos\n" " -P, --physical percorrido físico, non seguir enlaces simbólicos\n" " --restore=fich restaurar ACLs (inverso de 'getfacl -R')\n" " --test modo de proba (os ACLs non son modificados)\n" #: ../setfacl/setfacl.c:273 #: ../getfacl/getfacl.c:559 #, c-format msgid "" " --version print version and exit\n" " --help this help text\n" msgstr "" " --version amosar versión e sair\n" " --help este texto de axuda\n" #: ../setfacl/setfacl.c:358 #: ../getfacl/getfacl.c:768 #, c-format msgid "%s: Standard input: %s\n" msgstr "%s: Entrada estándar: %s\n" #: ../setfacl/setfacl.c:494 #, c-format msgid "%s: Option -%c incomplete\n" msgstr "%s: Opción -%c incompleta\n" #: ../setfacl/setfacl.c:499 #, c-format msgid "%s: Option -%c: %s near character %d\n" msgstr "%s: Opción -%c: %s preto do carácter %d\n" #: ../setfacl/setfacl.c:575 #, c-format msgid "%s: %s in line %d of file %s\n" msgstr "%s: %s na liña %d do ficheiro %s\n" #: ../setfacl/setfacl.c:583 #, c-format msgid "%s: %s in line %d of standard input\n" msgstr "%s: %s na liña %d da entrada estándar\n" #: ../setfacl/setfacl.c:694 #: ../getfacl/getfacl.c:782 #, c-format msgid "Try `%s --help' for more information.\n" msgstr "Escriba \"%s --help\" para máis información.\n" #: ../getfacl/getfacl.c:463 #, c-format msgid "%s: Removing leading '/' from absolute path names\n" msgstr "%s: Eliminando '/' iniciais en nomes de ruta absolutos\n" #: ../getfacl/getfacl.c:532 #, c-format msgid "%s %s -- get file access control lists\n" msgstr "%s %s -- obter listas de control de acceso a ficheiro\n" #: ../getfacl/getfacl.c:534 #: ../getfacl/getfacl.c:780 #, c-format msgid "Usage: %s [-%s] file ...\n" msgstr "Uso: %s [-%s] 
ficheiro ...\n" #: ../getfacl/getfacl.c:540 #, c-format msgid " -d, --default display the default access control list\n" msgstr " -d, --default amosá-la lista de control de acceso predeterminada\n" #: ../getfacl/getfacl.c:544 #, c-format msgid "" " --access display the file access control list only\n" " -d, --default display the default access control list only\n" " --omit-header do not display the comment header\n" " --all-effective print all effective rights\n" " --no-effective print no effective rights\n" " --skip-base skip files that only have the base entries\n" " -R, --recursive recurse into subdirectories\n" " -L, --logical logical walk, follow symbolic links\n" " -P --physical physical walk, do not follow symbolic links\n" " --tabular use tabular output format\n" " --numeric print numeric user/group identifiers\n" " --absolute-names don't strip leading '/' in pathnames\n" msgstr "" " --access só amosá-la lista de control de acceso ó ficheiro\n" " -d, --default só mostrá-la lista de control de acceso predeterminada\n" " --omit-header non amosá-lo comentario de encabezamento\n" " --all-effective amosar tódolos dereitos efectivos\n" " --no-effective non amosar ningún dereito efectivo\n" " --skip-base saltar ficheiros que só teñan as entradas básicas\n" " -R, --recursive descender recursivamente nos subdirectorios\n" " -L, --logical percorrido lóxico, seguindo enlaces simbólicos\n" " -P --physical percorrido físico, non seguir enlaces simbólicos\n" " --tabular usar formato de saída tabular\n" " --numeric amosar identificadores numéricos de propietario/grupo\n" " --absolute-names non eliminá-la '/' en nomes de ruta\n" #: ../libacl/acl_error.c:34 msgid "Multiple entries of same type" msgstr "Varias entradas do mesmo tipo" #: ../libacl/acl_error.c:36 msgid "Duplicate entries" msgstr "Entradas duplicadas" #: ../libacl/acl_error.c:38 msgid "Missing or wrong entry" msgstr "Falta un atributo, ou está mal formado" #: ../libacl/acl_error.c:40 msgid "Invalid entry type" 
msgstr "Tipo de entrada non válido" #: ../libacl/perm_copy_fd.c:124 #: ../libacl/perm_copy_fd.c:136 #: ../libacl/perm_copy_fd.c:198 #: ../libacl/perm_copy_file.c:124 #: ../libacl/perm_copy_file.c:139 #: ../libacl/perm_copy_file.c:150 #: ../libacl/perm_copy_file.c:235 #, c-format msgid "setting permissions for %s" msgstr "establecendo permisos para %s" #: ../libacl/perm_copy_fd.c:186 #: ../libacl/perm_copy_file.c:199 #: ../libacl/perm_copy_file.c:224 #, c-format msgid "preserving permissions for %s" msgstr "mantendo permisos de %s" --------------070900010502020704060304-- From owner-xfs@oss.sgi.com Sun Mar 25 16:22:08 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 16:22:11 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2PNM66p030936 for ; Sun, 25 Mar 2007 16:22:07 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 8A973215AC; Mon, 26 Mar 2007 01:21:59 +0200 (CEST) From: Neil Brown To: David Chinner Date: Mon, 26 Mar 2007 09:21:43 +1000 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17927.1031.996460.858328@notabene.brown> Cc: xfs@oss.sgi.com, hch@infradead.org Subject: Re: XFS and write barriers. In-Reply-To: message from David Chinner on Sunday March 25 References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> <17923.34462.210758.852042@notabene.brown> <20070325041755.GJ32602149@melbourne.sgi.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-face: [Gw_3E*Gng}4rRrKRYotwlE?.2|**#s9D > Barriers only make sense inside drive firmware. > > I disagree. e.g. 
Barriers have to be handled by the block layer to
> prevent reordering of I/O in the request queues as well. The
> block layer is responsible for ensuring barrier I/Os, as
> indicated by the filesystem, act as real barriers.

Absolutely. The block layer needs to understand about barriers and
allow them to do their job, which means not re-ordering requests
around barriers.
My point was that if the functionality cannot be provided in the
lowest-level firmware (as it cannot for raid0 as there is no single
lowest-level firmware), then it should be implemented at the
filesystem level. Implementing barriers in md or dm doesn't make any
sense (though passing barriers through can in some situations).

>
> > Trying to emulate it
> > in the md layer doesn't make any sense as the filesystem is in a much
> > better position to do any emulation required.
>
> You're saying that the emulation of block layer functionality is the
> responsibility of layers above the block layer. Why is this not
> considered a layering violation?

:-)
Maybe it depends on your perspective. I think this is filesystem
layer functionality. Making sure blocks are written in the right
order sounds like something that the filesystem should be primarily
responsible for.

The most straight-forward way to implement this is to make sure all
preceding blocks have been written before writing the barrier block.
All filesystems should be able to do this (if it is important to them).

Because block IO tends to have long pipelines and because this
operation will stall the pipeline, it makes sense for a block IO
subsystem to provide the possibility of implementing this sequencing
without a complete stall, and the 'barrier' flag makes that possible.
But that doesn't mean it is block-layer functionality. It means (to
me) it is common fs functionality that the block layer is helping out
with.

>
>
> > There should never be a possibility of filesystem corruption.
> > If a barrier request fails, the filesystem should:
> >   wait for any dependent request to complete
> >   call blkdev_issue_flush
> >   schedule the write of the 'barrier' block
> >   call blkdev_issue_flush again.
>
> IOWs, the filesystem has to use block device calls to emulate a block device
> barrier I/O. Why can't the block layer, on reception of a barrier write
> and detecting that barriers are no longer supported by the underlying
> device (i.e. in MD), do:
>
>   wait for all queued I/Os to complete
>   call blkdev_issue_flush
>   schedule the write of the 'barrier' block
>   call blkdev_issue_flush again.
>
> And not involve the filesystem at all? i.e. why should the filesystem
> have to do this?

Certainly it could.
However
 a/ The block layer would have to wait for *all* queued I/O,
    whereas the filesystem would only have to wait for queued IO
    which has a semantic dependence on the barrier block. So the
    filesystem can potentially perform the operation more efficiently.
 b/ Some block devices don't support barriers, so the filesystem needs
    to have the mechanisms in place to do this already. Why duplicate
    it in the block layer?
 (c/ md/raid0 doesn't track all the outstanding requests...:-)

I think the block device should support barriers when it can do so
more efficiently than the filesystem. For a single SCSI drive, it
can. For a logical volume striped over multiple physical devices, it
cannot.

>
> > My understanding is that that sequence is as safe as a barrier, but maybe
> > not as fast.
>
> Yes, and my understanding is that the block device is perfectly capable
> of implementing this just as safely as the filesystem.
>

But possibly not as efficiently...

What did XFS do before the block layer supported barriers?

> > The patch looks at least believable. As you can imagine it is awkward
> > to test thoroughly.
>
> As well as being pretty much impossible to test reliably with an
> automated testing framework. Hence ongoing test coverage will
> approach zero.....
This is a problem with barriers in general.... it is very hard to test that the data is encoded on the platter at any given time :-( NeilBrown From owner-xfs@oss.sgi.com Sun Mar 25 16:58:32 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 16:58:36 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_20 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx1.suse.de (mx1.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2PNwU6p005463 for ; Sun, 25 Mar 2007 16:58:31 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 52111124AA; Mon, 26 Mar 2007 01:58:28 +0200 (CEST) From: Neil Brown To: David Chinner Date: Mon, 26 Mar 2007 09:58:22 +1000 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17927.3230.273341.837191@notabene.brown> Cc: Christoph Hellwig , xfs@oss.sgi.com Subject: Re: XFS and write barriers. In-Reply-To: message from David Chinner on Sunday March 25 References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> <20070323095055.GA13478@infradead.org> <20070325035126.GI32602149@melbourne.sgi.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-face: [Gw_3E*Gng}4rRrKRYotwlE?.2|**#s9D On Fri, Mar 23, 2007 at 09:50:55AM +0000, Christoph Hellwig wrote: > > > > So yes, we could probably get rid of the check now, although I'd > > prefer the block layer exporting an API to the filesystem to tell > > it whether there is any point in trying to use barriers. > > Ditto. What would be the point of that interface? If it only says "It might be worth testing", then you still have to test. And if you have to test, where is the value in asking in advance. 
There is no important difference between "the device said 'don't
bother trying'" and "We tried and the device said 'no'".

> >
> > So you're retrying the whole I/O, this is probably better than trying
> > to handle this at the bio level. I still don't quite like doing another
> > I/O from the I/O completion handler.
>
> You're not the only one, Christoph. This may be better than trying
> to handle it at lower layers, and far better than having to handle
> it at every point in the higher layers where we may issue barrier
> I/Os.

But I think that has to be where it is handled.

What other filesystems do is something like:

    if (barriers_supported) {
        submit barrier request;
        wait for completion
        if (fail with -EOPNOTSUPP)
            barriers_supported = 0;
    }
    if (!barriers_supported) {
        wait for other requests to complete;
        submit non-barrier request;
        wait for completion
    }
    handle_error

Obviously if you are going to issue barrier writes from multiple
places you would put this in a function...
I'm not sure that other filesystems call blkdev_issue_flush....
As you said elsewhere, not a very effectively communicated interface.

>
> But I *seriously dislike* having to reissue async I/Os in this
> manner and then having to rely on a higher layer's I/O completion
> handler to detect the fact that the I/O was retried to change the
> way the filesystem issues I/Os in the future. It's a really crappy
> way of communicating between layers....

md/dm do add extra complexity to the blockdev interface that I don't
think was fully considered when the interface was designed.
We would really like a client to say "I'm starting to build a bio" so
that the device can either block that until a reconfiguration
completes, or can block any reconfiguration until the bio is fully
built and submitted (or aborted).

Once you have that bio-being-built handle, it would probably make
sense to test 'are barriers supported' for that bio without having to
submit an IO..
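
The try-a-barrier-then-fall-back pseudocode above can be made concrete with a small user-space model. This is only a sketch under stated assumptions: `FakeDevice`, `Filesystem`, `submit`, `drain` and `write_barrier_block` are hypothetical names invented for illustration, not kernel interfaces, and "waiting" is modelled by simply moving queued writes to a completed list.

```python
import errno


class FakeDevice:
    """Toy block device (hypothetical, not a kernel interface)."""

    def __init__(self, supports_barriers):
        self.supports_barriers = supports_barriers
        self.in_flight = []   # writes submitted but not yet completed
        self.completed = []   # writes in the order they completed

    def submit(self, block, barrier=False):
        """Queue a write; return 0 or a negative errno value."""
        if barrier and not self.supports_barriers:
            return -errno.EOPNOTSUPP
        if barrier:
            # A barrier write is ordered after everything queued before it.
            self.drain()
        self.in_flight.append(block)
        return 0

    def drain(self):
        """Model 'wait for completion' of every queued write."""
        self.completed.extend(self.in_flight)
        self.in_flight.clear()


class Filesystem:
    def __init__(self, dev):
        self.dev = dev
        self.barriers_supported = True   # optimistic until the device says no

    def write_barrier_block(self, block):
        err = 0
        if self.barriers_supported:
            err = self.dev.submit(block, barrier=True)
            if err == 0:
                self.dev.drain()                  # wait for completion
            elif err == -errno.EOPNOTSUPP:
                self.barriers_supported = False   # remember for next time
        if not self.barriers_supported:
            self.dev.drain()                      # wait for other requests
            err = self.dev.submit(block)          # plain, non-barrier write
            self.dev.drain()                      # wait for completion
        return err                                # caller handles any error
```

Like the pseudocode, the sketch does not call anything resembling blkdev_issue_flush, so it only models ordering of writes, not flushing of a volatile drive cache.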
NeilBrown From owner-xfs@oss.sgi.com Sun Mar 25 17:01:10 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 17:01:14 -0700 (PDT) X-Spam-oss-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx1.suse.de (mail.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2Q0196p006266 for ; Sun, 25 Mar 2007 17:01:10 -0700 Received: from Relay1.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id 8B2FA121D9; Mon, 26 Mar 2007 02:01:07 +0200 (CEST) From: Neil Brown To: David Chinner Date: Mon, 26 Mar 2007 10:01:01 +1000 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17927.3389.655835.610940@notabene.brown> Cc: Neil Brown , Timothy Shimmin , xfs@oss.sgi.com Subject: Re: XFS and write barriers. In-Reply-To: message from David Chinner on Sunday March 25 References: <17923.11463.459927.628762@notabene.brown> <1755676AA526FF7790546385@timothy-shimmins-power-mac-g5.local> <17923.35118.139991.252734@notabene.brown> <20070325031927.GG32602149@melbourne.sgi.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-face: [Gw_3E*Gng}4rRrKRYotwlE?.2|**#s9D On Fri, Mar 23, 2007 at 07:00:46PM +1100, Neil Brown wrote: > > > > Why no barriers on an external log device??? Not important, just > > curious. > > because we need to synchronize across 2 devices, not one, so issuing > barriers on an external log device does nothing to order the metadata > written to the other device... Right, of course. Just like over a raid0. So you must have code to wait for all writes to the main device before writing the commit block on the journal. How hard is it to fall-back to that if the barrier fails? 
NeilBrown From owner-xfs@oss.sgi.com Sun Mar 25 18:11:36 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 18:11:39 -0700 (PDT) X-Spam-oss-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2Q1BZ6p018396 for ; Sun, 25 Mar 2007 18:11:36 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx2.suse.de (Postfix) with ESMTP id 3E7BE2190E; Mon, 26 Mar 2007 03:11:23 +0200 (CEST) From: Neil Brown To: Christoph Hellwig Date: Mon, 26 Mar 2007 11:11:11 +1000 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17927.7599.592215.21546@notabene.brown> Cc: David Chinner , xfs@oss.sgi.com Subject: Re: XFS and write barriers. In-Reply-To: message from Christoph Hellwig on Friday March 23 References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> <20070323095055.GA13478@infradead.org> X-Mailer: VM 7.19 under Emacs 21.4.1 X-face: [Gw_3E*Gng}4rRrKRYotwlE?.2|**#s9D > That would be really bad. XFS metadata buffers can have multiple bios > and retrying a single one would be rather difficult. > But would you have multiple bios for a write that had BIO_RW_BARRIER set? That would seem .... odd. 
NeilBrown From owner-xfs@oss.sgi.com Sun Mar 25 20:14:25 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 20:14:29 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_48 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2Q3EL6p003689 for ; Sun, 25 Mar 2007 20:14:24 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA06263; Mon, 26 Mar 2007 13:14:11 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2Q3EAAf39400042; Mon, 26 Mar 2007 14:14:10 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2Q3E7jV38948985; Mon, 26 Mar 2007 14:14:07 +1100 (AEDT) Date: Mon, 26 Mar 2007 14:14:07 +1100 From: David Chinner To: Neil Brown Cc: David Chinner , xfs@oss.sgi.com, hch@infradead.org Subject: Re: XFS and write barriers. 
Message-ID: <20070326031407.GG32597093@melbourne.sgi.com>
References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> <17923.34462.210758.852042@notabene.brown> <20070325041755.GJ32602149@melbourne.sgi.com> <17927.1031.996460.858328@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <17927.1031.996460.858328@notabene.brown>
User-Agent: Mutt/1.4.2.1i
X-archive-position: 10943
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: dgc@sgi.com
Precedence: bulk
X-list: xfs
Content-Length: 4254
Lines: 105

On Mon, Mar 26, 2007 at 09:21:43AM +1000, Neil Brown wrote:
> My point was that if the functionality cannot be provided in the
> lowest-level firmware (as it cannot for raid0 as there is no single
> lowest-level firmware), then it should be implemented at the
> filesystem level. Implementing barriers in md or dm doesn't make any
> sense (though passing barriers through can in some situations).

Hold on - you've said that the barrier support in a block device
can change because of MD doing hot swap. Now you're saying
there is no barrier implementation in md. Can you explain
*exactly* what barrier support there is in MD?

> > > Trying to emulate it
> > > in the md layer doesn't make any sense as the filesystem is in a much
> > > better position to do any emulation required.
> >
> > You're saying that the emulation of block layer functionality is the
> > responsibility of layers above the block layer. Why is this not
> > considered a layering violation?
>
> :-)
> Maybe it depends on your perspective. I think this is filesystem
> layer functionality. Making sure blocks are written in the right
> order sounds like something that the filesystem should be primarily
> responsible for.

Sure, but the filesystem requires the block layer to provide
those ordering semantics to it, e.g. barrier I/Os.
Remember, different filesystems have different levels of
data+metadata safety and many of them do nothing to guarantee write
ordering.

> The most straight-forward way to implement this is to make sure all
> preceding blocks have been written before writing the barrier block.
> All filesystems should be able to do this (if it is important to them).
                                             ^^^^^^^^^^^^^^^^^^^^^^^^^^

And that is the key point - XFS provides no guarantee that your
data is on spinning rust other than I/O barriers when you have
volatile write caches.

IOWs, if you turn barriers off, we provide *no guarantees*
about the consistency of your filesystem after a power failure
if you are using volatile write caching. This mode is for
use with non-cached disks or disks with NVRAM caches where there
is no need for barriers.

> Because block IO tends to have long pipelines and because this
> operation will stall the pipeline, it makes sense for a block IO
> subsystem to provide the possibility of implementing this sequencing
> without a complete stall, and the 'barrier' flag makes that possible.
> But that doesn't mean it is block-layer functionality. It means (to
> me) it is common fs functionality that the block layer is helping out
> with.

I disagree - it is a function supported and defined by the block
layer. Errors returned to the filesystem are directly defined
in the block layer, the ordering guarantees are provided by the
block layer and changes in semantics appear to be defined by
the block layer......

> >   wait for all queued I/Os to complete
> >   call blkdev_issue_flush
> >   schedule the write of the 'barrier' block
> >   call blkdev_issue_flush again.
> >
> > And not involve the filesystem at all? i.e. why should the filesystem
> > have to do this?
>
> Certainly it could.
> However
>  a/ The block layer would have to wait for *all* queued I/O,
>     whereas the filesystem would only have to wait for queued IO
>     which has a semantic dependence on the barrier block.
So the > filesystem can potentially perform the operation more efficiently. Assuming the filesystem can do it more efficiently. What if it can't? What if, like XFS, when barriers are turned off, the filesystem provides *no* guarantees? > b/ Some block devices don't support barriers, so the filesystem needs > to have the mechanisms in place to do this already. No, you turn write caching off on the drive. This is an especially important consideration given that many older drives lied about cache flushes being complete (i.e. they were implemented as no-ops). > (c/ md/raid0 doesn't track all the outstanding requests...:-) XFS doesn't track all outstanding requests either.... > What did XFS do before the block layer supported barriers? Either turn off write caching or use non-volatile write caches. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Mar 25 20:58:13 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 20:58:15 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2Q3wA6p015106 for ; Sun, 25 Mar 2007 20:58:12 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA07510; Mon, 26 Mar 2007 13:58:07 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2Q3w6Af39405567; Mon, 26 Mar 2007 14:58:06 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2Q3w3bB39433351; Mon, 26 Mar 2007 14:58:03 +1100 (AEDT) Date: Mon, 26 Mar 2007 14:58:03 +1100 From: David Chinner To: Neil Brown Cc: David Chinner , Timothy Shimmin , xfs@oss.sgi.com Subject: Re: XFS and write 
barriers. Message-ID: <20070326035803.GH32597093@melbourne.sgi.com> References: <17923.11463.459927.628762@notabene.brown> <1755676AA526FF7790546385@timothy-shimmins-power-mac-g5.local> <17923.35118.139991.252734@notabene.brown> <20070325031927.GG32602149@melbourne.sgi.com> <17927.3389.655835.610940@notabene.brown> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <17927.3389.655835.610940@notabene.brown> User-Agent: Mutt/1.4.2.1i X-archive-position: 10944 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1720 Lines: 43 On Mon, Mar 26, 2007 at 10:01:01AM +1000, Neil Brown wrote: > On Sunday March 25, dgc@sgi.com wrote: > > On Fri, Mar 23, 2007 at 07:00:46PM +1100, Neil Brown wrote: > > > > > > Why no barriers on an external log device??? Not important, just > > > curious. > > > > because we need to synchronize across 2 devices, not one, so issuing > > barriers on an external log device does nothing to order the metadata > > written to the other device... > > Right, of course. Just like over a raid0. > > So you must have code to wait for all writes to the main device before > writing the commit block on the journal. Forget about what you know about journalling from ext3, XFS is vastly different and much more complex..... ;) We wait for space in the log to become available during transaction reservation; we don't wait for specific I/Os to complete because we just push a bunch out. Once we have a reservation, we know we have space in the log for our transaction commit and so we don't have to wait for any I/O to complete when we do our transaction commit. Hence we don't wait for the I/Os we may have issued to make space available; another thread's push may have made enough space for our reservation. 
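
The reservation scheme described above can be illustrated with a toy model. This is a minimal sketch, assuming a log that is tracked only as a free-space counter; `ToyLog`, `reserve`, `push_tail` and `commit` are hypothetical names for illustration and are not XFS code. The point it shows is that the only place a transaction can wait is at reservation time, and that the commit itself never waits on any particular I/O.

```python
class ToyLog:
    """Fixed-size log modelled as a free-space counter (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.free = size
        self.pushes = 0     # how many times the tail had to be pushed
        self.commits = []   # sizes of committed transactions, in order

    def push_tail(self):
        """Stand-in for pushing out a batch of old metadata so the tail moves.

        The caller does not know or care which individual writes free the
        space; in this toy model a push simply frees the whole log.
        """
        self.pushes += 1
        self.free = self.size

    def reserve(self, nbytes):
        """Reserve log space for a transaction, waiting for space if needed."""
        if nbytes > self.size:
            raise ValueError("transaction larger than the log")
        if self.free < nbytes:
            self.push_tail()      # the only point where a transaction waits
        self.free -= nbytes
        return nbytes

    def commit(self, reservation):
        """Write the commit; space was set aside earlier, so nothing to wait for."""
        self.commits.append(reservation)
```

For example, with a 100-byte log, two 60-byte reservations in a row force exactly one push at the second `reserve` call, while neither `commit` call waits for anything.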
IOWs, we've got *no idea* what the dependent I/Os are when writing the transaction commit to disk because we have no clue as to what we are overwriting in the journal. This journalling method assumes that we either have no drive level caching, non-volatile caching, or barrier-based log I/Os to prevent corruption on drive power loss. Hence with external logs on XFS you have the option of no caching or non-volatile caching.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Sun Mar 25 21:27:46 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 25 Mar 2007 21:27:51 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_12 autolearn=no version=3.2.0-pre1-r499012 Received: from mx1.suse.de (ns.suse.de [195.135.220.2]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2Q4Ri6p018708 for ; Sun, 25 Mar 2007 21:27:46 -0700 Received: from Relay2.suse.de (mail2.suse.de [195.135.221.8]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.suse.de (Postfix) with ESMTP id DCBFF1224E; Mon, 26 Mar 2007 06:27:42 +0200 (CEST) From: Neil Brown To: David Chinner Date: Mon, 26 Mar 2007 14:27:24 +1000 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17927.19372.553410.527506@notabene.brown> Cc: xfs@oss.sgi.com, hch@infradead.org Subject: Re: XFS and write barriers. 
In-Reply-To: message from David Chinner on Monday March 26
References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> <17923.34462.210758.852042@notabene.brown> <20070325041755.GJ32602149@melbourne.sgi.com> <17927.1031.996460.858328@notabene.brown> <20070326031407.GG32597093@melbourne.sgi.com>
X-Mailer: VM 7.19 under Emacs 21.4.1
X-face: [Gw_3E*Gng}4rRrKRYotwlE?.2|**#s9D

On Monday March 26, dgc@sgi.com wrote:
> On Mon, Mar 26, 2007 at 09:21:43AM +1000, Neil Brown wrote:
> > My point was that if the functionality cannot be provided in the
> > lowest-level firmware (as it cannot for raid0 as there is no single
> > lowest-level firmware), then it should be implemented at the
> > filesystem level. Implementing barriers in md or dm doesn't make any
> > sense (though passing barriers through can in some situations).
>
> Hold on - you've said that the barrier support in a block device
> can change because of MD doing hot swap. Now you're saying
> there is no barrier implementation in md. Can you explain
> *exactly* what barrier support there is in MD?

For all levels other than md/raid1, md rejects bio_barrier() requests
as -EOPNOTSUPP.

For raid1 it tests barrier support when writing the superblock and,
if all devices support barriers, then md/raid1 will allow
bio_barrier() down. If it gets an unexpected failure it just rewrites
it without the barrier flag and fails any future write requests (which
isn't ideal, but is the best available, and should happen effectively
never).

So md/raid1 barrier support is completely dependent on the underlying
devices. md/raid1 is aware of barriers but does not *implement*
them. Does that make it clearer?

> > The most straight-forward way to implement this is to make sure all
> > preceding blocks have been written before writing the barrier block.
> > All filesystems should be able to do this (if it is important to them).
>                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> And that is the key point - XFS provides no guarantee that your
> data is on spinning rust other than I/O barriers when you have
> volatile write caches.
>
> IOWs, if you turn barriers off, we provide *no guarantees*
> about the consistency of your filesystem after a power failure
> if you are using volatile write caching. This mode is for
> use with non-cached disks or disks with NVRAM caches where there
> is no need for barriers.

But.... as the block layer can re-order writes, even non-cached disks
could get the writes in a different order to the order in which you
sent them.

I have a report of xfs over md/raid1 going about 10% faster once we
managed to let barrier writes through, so presumably XFS does
something different if barriers are not enabled??? What does it do
differently?

>
> > Because block IO tends to have long pipelines and because this
> > operation will stall the pipeline, it makes sense for a block IO
> > subsystem to provide the possibility of implementing this sequencing
> > without a complete stall, and the 'barrier' flag makes that possible.
> > But that doesn't mean it is block-layer functionality. It means (to
> > me) it is common fs functionality that the block layer is helping out
> > with.
>
> I disagree - it is a function supported and defined by the block
> layer. Errors returned to the filesystem are directly defined
> in the block layer, the ordering guarantees are provided by the
> block layer and changes in semantics appear to be defined by
> the block layer......

chuckle....
You can tell we are on different sides of the fence, can't you?

There is certainly some validity in your position...

>
> > >   wait for all queued I/Os to complete
> > >   call blkdev_issue_flush
> > >   schedule the write of the 'barrier' block
> > >   call blkdev_issue_flush again.
> > >
> > > And not involve the filesystem at all? i.e. why should the filesystem
> > > have to do this?
> >
> > Certainly it could.
> > However
> >  a/ The block layer would have to wait for *all* queued I/O,
> >     whereas the filesystem would only have to wait for queued IO
> >     which has a semantic dependence on the barrier block. So the
> >     filesystem can potentially perform the operation more efficiently.
>
> Assuming the filesystem can do it more efficiently. What if it
> can't? What if, like XFS, when barriers are turned off, the
> filesystem provides *no* guarantees?

(Yes.... Ted Ts'o likes casting aspersions on XFS... I guess this is
why :-)

Is there some mount flag to say "cope without barriers" or "require
barriers"?

I can imagine implementing barriers in raid5 (which keeps careful
track of everything) but I suspect it would be a performance hit. It
might be nice if the sysadmin has to explicitly ask...

For that matter, I could get raid1 to reject replacement devices that
didn't support barriers, if there was a way for the filesystem to
explicitly ask for them. I think we are getting back to interface
issues, aren't we?

>
> > (c/ md/raid0 doesn't track all the outstanding requests...:-)
>
> XFS doesn't track all outstanding requests either....

That surprises me... but maybe it shouldn't.

Thanks.
NeilBrown From owner-xfs@oss.sgi.com Mon Mar 26 02:05:17 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 26 Mar 2007 02:05:20 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_12 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2Q95C6p004018 for ; Mon, 26 Mar 2007 02:05:15 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id TAA16039; Mon, 26 Mar 2007 19:05:00 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2Q94wAf39505669; Mon, 26 Mar 2007 20:04:59 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2Q94u4S39494038; Mon, 26 Mar 2007 20:04:56 +1100 (AEDT) Date: Mon, 26 Mar 2007 20:04:56 +1100 From: David Chinner To: Neil Brown Cc: David Chinner , xfs@oss.sgi.com, hch@infradead.org Subject: Re: XFS and write barriers. 
Message-ID: <20070326090456.GO32597093@melbourne.sgi.com>
References: <17923.11463.459927.628762@notabene.brown> <20070323053043.GD32602149@melbourne.sgi.com> <17923.34462.210758.852042@notabene.brown> <20070325041755.GJ32602149@melbourne.sgi.com> <17927.1031.996460.858328@notabene.brown> <20070326031407.GG32597093@melbourne.sgi.com> <17927.19372.553410.527506@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <17927.19372.553410.527506@notabene.brown>
User-Agent: Mutt/1.4.2.1i
X-archive-position: 10946
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: dgc@sgi.com
Precedence: bulk
X-list: xfs
Content-Length: 6528
Lines: 149

On Mon, Mar 26, 2007 at 02:27:24PM +1000, Neil Brown wrote:
> On Monday March 26, dgc@sgi.com wrote:
> > On Mon, Mar 26, 2007 at 09:21:43AM +1000, Neil Brown wrote:
> > > My point was that if the functionality cannot be provided in the
> > > lowest-level firmware (as it cannot for raid0 as there is no single
> > > lowest-level firmware), then it should be implemented at the
> > > filesystem level. Implementing barriers in md or dm doesn't make any
> > > sense (though passing barriers through can in some situations).
> >
> > Hold on - you've said that the barrier support in a block device
> > can change because of MD doing hot swap. Now you're saying
> > there is no barrier implementation in md. Can you explain
> > *exactly* what barrier support there is in MD?
>
> For all levels other than md/raid1, md rejects bio_barrier() requests
> as -EOPNOTSUPP.
>
> For raid1 it tests barrier support when writing the superblock and,
> if all devices support barriers, then md/raid1 will allow
> bio_barrier() down. If it gets an unexpected failure it just rewrites
> it without the barrier flag and fails any future write requests (which
> isn't ideal, but is the best available, and should happen effectively
> never).
>
> So md/raid1 barrier support is completely dependent on the underlying
> devices. md/raid1 is aware of barriers but does not *implement*
> them. Does that make it clearer?

Ah, that clears up the picture - thanks Neil.

> > > The most straight-forward way to implement this is to make sure all
> > > preceding blocks have been written before writing the barrier block.
> > > All filesystems should be able to do this (if it is important to them).
> >                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > And that is the key point - XFS provides no guarantee that your
> > data is on spinning rust other than I/O barriers when you have
> > volatile write caches.
> >
> > IOWs, if you turn barriers off, we provide *no guarantees*
> > about the consistency of your filesystem after a power failure
> > if you are using volatile write caching. This mode is for
> > use with non-cached disks or disks with NVRAM caches where there
> > is no need for barriers.
>
> But.... as the block layer can re-order writes, even non-cached disks
> could get the writes in a different order to the order in which you
> sent them.

But on a non-cached disk we have to have received an I/O
completion before the tail of the log moves, and hence the metadata
is on stable storage. The problem arises when volatile write caches
are used and I/O completion no longer means "data on stable storage".

> I have a report of xfs over md/raid1 going about 10% faster once we
> managed to let barrier writes through, so presumably XFS does
> something different if barriers are not enabled??? What does it do
> differently?

I bet that the disk doesn't have its write cache turned on.
For disks with write cache turned on, barriers can slow down
XFS by a factor of 5. Safety, not speed, was all we were after
with barriers.
> > > Because block IO tends to have long pipelines and because this > > > operation will stall the pipeline, it makes sense for a block IO > > > subsystem to provide the possibility of implementing this sequencing > > > without a complete stall, and the 'barrier' flag makes that possible. > > > But that doesn't mean it is block-layer functionality. It means (to > > > me) it is common fs functionality that the block layer is helping out > > > with. > > > > I disagree - it is a function supported and defined by the block > > layer. Errors returned to the filesystem are directly defined > > in the block layer, the ordering guarantees are provided by the > > block layer and changes in semantics appear to be defined by > > the block layer...... > > chuckle.... > You can tell we are on different sides of the fence, can't you ? Yup - no fence sitting here ;) > There is certainly some validity in your position... And likewise yours - I just don't think the responsibility here is quite so black and white... > > > > wait for all queued I/Os to complete > > > > call blkdev_issue_flush > > > > schedule the write of the 'barrier' block > > > > call blkdev_issue_flush again. > > > > > > > > And not involve the filesystem at all? i.e. why should the filesystem > > > > have to do this? > > > > > > Certainly it could. > > > However > > > a/ The the block layer would have to wait for *all* queued I/O, > > > where-as the filesystem would only have to wait for queued IO > > > which has a semantic dependence on the barrier block. So the > > > filesystem can potentially perform the operation more efficiently. > > > > Assuming the filesystem can do it more efficiently. What if it > > can't? What if, like XFS, when barriers are turned off, the > > filesystem provides *no* guarantees? > > (Yes.... Ted T'so like casting aspersions on XFS... I guess this is > why :-) Different design criteria. ext3 is great doing what it was designed for, and the same can be said for XFS. 
Take them outside their comfort area (like putting XFS on commodity disks with volatile write caches or putting millions of files into a single directory in ext3) and you get problems. It's just that they were designed for different purposes, and that includes data resilience during failure conditions. That being said, we are doing a lot in XFS to address some of these shortcomings - it's just that ordered writes can be very difficult to retrofit to an existing filesystem.... > Is there some mount flag to say "cope without barriers" or "require > barriers" ?? XFS has "-o nobarrier" to say don't use barriers, and this is *not* the default. If barriers don't work, we drop back to "-o nobarrier" after leaving a loud warning in the log.... > I can imagine implementing barriers in raid5 (which keeps careful > track of everything) but I suspect it would be a performance hit. It > might be nice if the sysadmin has to explicitly ask... > > For that matter, I could get raid1 to reject replacement devices that > didn't support barriers, if there was a way for the filesystem to > explicitly ask for them. I think we are getting back to interface > issues, aren't we? Yeah, very much so. If you need the filesystem to be aware of smart things the block device can do or tell it, then we really don't want to have to communicate them via mount options ;) Cheers, Dave. 
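Since "-o nobarrier" is an ordinary mount option, one way to see which XFS filesystems were mounted with barriers turned off is to read the mount table. What follows is only a rough Python sketch: it assumes the usual /proc/mounts field layout (device, mount point, type, options), the function name and the sample text are invented for illustration, and it only sees options that are actually listed in the table. Whether the automatic fallback to "-o nobarrier" shows up there is not something this thread states.

```python
def xfs_barrier_status(mounts_text):
    """Map each XFS mount point to True (no 'nobarrier' option) or False.

    mounts_text is text in /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip blank or malformed lines and every non-XFS filesystem.
        if len(fields) < 4 or fields[2] != "xfs":
            continue
        options = fields[3].split(",")
        status[fields[1]] = "nobarrier" not in options
    return status


# Hypothetical mount table, used only to exercise the function.
SAMPLE = (
    "/dev/sda3 / xfs rw,nobarrier 0 0\n"
    "/dev/sdb1 /data xfs rw 0 0\n"
    "/dev/sdc1 /boot ext3 rw 0 0\n"
)

if __name__ == "__main__":
    for mount_point, barriers in sorted(xfs_barrier_status(SAMPLE).items()):
        print(mount_point, "barriers requested" if barriers else "nobarrier")
```

On a live system the same function could be fed the contents of /proc/mounts. A filesystem reported as "barriers requested" may still be running without them if the device rejected barrier writes, so the kernel log remains the place to look for the warning.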
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Mon Mar 26 03:14:48 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 26 Mar 2007 03:14:55 -0700 (PDT) X-Spam-oss-Status: No, score=0.5 required=5.0 tests=AWL,BAYES_50, MIME_8BIT_HEADER autolearn=no version=3.2.0-pre1-r499012 Received: from srv-mailrelais-dz1.argus.int (smtp.argus-presse.fr [213.244.9.225]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2QAEi6p018549 for ; Mon, 26 Mar 2007 03:14:46 -0700 Received: from srv-mail-dz2.argus.int (srv-mail-dz2.argus.int [192.168.2.10]) by srv-mailrelais-dz1.argus.int (Postfix) with ESMTP id B1C54269BB; Mon, 26 Mar 2007 12:14:39 +0200 (CEST) Received: from [10.0.0.111] (pnoel-as.argus.int [10.0.0.111]) by srv-mail-dz2.argus.int (Postfix) with ESMTP id 77EE53D98; Mon, 26 Mar 2007 12:14:38 +0200 (CEST) Subject: Re: xfs_repair on Debian testing 64bits From: Patrick =?ISO-8859-1?Q?No=EBl?= To: Peter Grandi , xfs@oss.sgi.com In-Reply-To: <17926.26297.892974.270267@base.ty.sabi.co.UK> References: <1174655443.5441.37.camel@localhost.localdomain> <17926.26297.892974.270267@base.ty.sabi.co.UK> Content-Type: text/plain; charset=ISO-8859-1 Organization: Argus de la Presse Date: Mon, 26 Mar 2007 12:14:38 +0200 Message-Id: <1174904078.5978.32.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.8.1 Content-Transfer-Encoding: 8bit X-archive-position: 10947 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: patrick.noel@argus-presse.fr Precedence: bulk X-list: xfs Content-Length: 703 Lines: 33 On Sunday 25 March 2007 at 13:10 +0100, Peter Grandi wrote: > «To successfully check or run repair on a multi-terabyte > filesystem, you need: > > - a 64bit machine > - a 64bit xfs_repair/xfs_check binary > - ~2GB RAM per terabyte of filesystem > - 100-200MB of RAM per million inodes in the filesystem. 
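The two RAM figures in the rule quoted above can be worked through as plain arithmetic. The Python sketch below is only an illustration: the function name is invented, it treats 1 GB as 1000 MB, and the example inputs (a 5.3 TB filesystem holding 45 million inodes) are the numbers used in this thread, not measurements.

```python
def repair_ram_estimate_gb(fs_terabytes, million_inodes,
                           gb_per_tb=2.0, mb_per_million_inodes=(100, 200)):
    """Return (low, high) RAM estimates in GB for xfs_repair/xfs_check.

    Applies the quoted rule of thumb: about 2 GB per terabyte of
    filesystem plus 100-200 MB per million inodes.
    """
    base_gb = fs_terabytes * gb_per_tb
    low_mb, high_mb = mb_per_million_inodes
    low = base_gb + million_inodes * low_mb / 1000.0
    high = base_gb + million_inodes * high_mb / 1000.0
    return low, high


if __name__ == "__main__":
    low, high = repair_ram_estimate_gb(5.3, 45)
    print(f"{low:.1f} to {high:.1f} GB")  # prints "15.1 to 19.6 GB"
```

The low end uses 100 MB per million inodes and the high end 200 MB, so the spread grows with the inode count rather than with the size of the device.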
> > For 5.3 TB: 2 GB * 5.3 = 10.6 GB, plus 100 MB per million inodes * 45 = 4.5 GB, so about 16 GB of RAM in total. I presume the number of inodes means the number of used inodes, because the total number of inodes on 5.3 TB is 5119 million. The goal is to use a 15 TB device with the same data, so I estimate the total amount of RAM at 45 GB. I will go to the corner store to buy RAM :) Thanks, Patrick From owner-xfs@oss.sgi.com Mon Mar 26 13:25:55 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 26 Mar 2007 13:25:57 -0700 (PDT) X-Spam-oss-Status: No, score=2.4 required=5.0 tests=AWL,BAYES_99, J_CHICKENPOX_43,J_CHICKENPOX_56 autolearn=no version=3.2.0-pre1-r499012 Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.168]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2QKPq6p013129 for ; Mon, 26 Mar 2007 13:25:54 -0700 Received: by ug-out-1314.google.com with SMTP id a2so1630408ugf for ; Mon, 26 Mar 2007 13:25:47 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=AOAD4K9meWBk3FYrWEVPXdtEbJyne0J+DmdlPiF2mTNKmlJEH03Or+twApBBJI46Q5n261BFgEOcQThVlDNoFsumrDHTptNlXa+9i5QBwzGI+26IvNPDOX684OTJ+81N47nKgdaVPgtbw3JmWF+Q1TcF6ug1T9v5gnnmRmzEUvY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=Otw5gyH4sHtL1p+KLW+flD2OscBhLw6x0x0v45iAQ8WWzkmKOjCe+vB63KXGIx/ebfSU3Ny6TuJa92oiRRkCUZLtP8zAv4g/m+iUILBBz/Bvp2zk44apS0uaxb5dh6E0JA1T3BgKWF4e1aoKIAD4eNOpQ/pbDmQ8sHR/iEtEe2Q= Received: by 10.114.132.5 with SMTP id f5mr2827338wad.1174940746131; Mon, 26 Mar 2007 13:25:46 -0700 (PDT) Received: by 10.115.107.15 with HTTP; Mon, 26 Mar 2007 13:25:46 -0700 (PDT) Message-ID: <5d96567b0703261325m8f17b2eg55e04264fee4832a@mail.gmail.com> Date: Mon, 26 Mar 2007 22:25:46 +0200 From: "Raz Ben-Jehuda(caro)" 
To: linux-xfs@oss.sgi.com Subject: changing xfs file system tunnings(sunit,swidth) after mkfs.xfs MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10949 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: raziebe@gmail.com Precedence: bulk X-list: xfs Content-Length: 187 Lines: 9 I want to change the sunit,swidth of a file system which was already created with a different tunnings, or at least mount it with a different values. Is it possible ? thank you -- Raz From owner-xfs@oss.sgi.com Mon Mar 26 15:09:15 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 26 Mar 2007 15:09:19 -0700 (PDT) X-Spam-oss-Status: No, score=-0.4 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_43,J_CHICKENPOX_55,J_CHICKENPOX_56,J_CHICKENPOX_65 autolearn=no version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2QM9E6p003285 for ; Mon, 26 Mar 2007 15:09:15 -0700 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id F19ECAAC36C; Tue, 27 Mar 2007 08:09:10 +1000 (EST) Subject: Re: changing xfs file system tunnings(sunit,swidth) after mkfs.xfs From: Nathan Scott Reply-To: nscott@aconex.com To: "Raz Ben-Jehuda(caro)" Cc: xfs@oss.sgi.com In-Reply-To: <5d96567b0703261325m8f17b2eg55e04264fee4832a@mail.gmail.com> References: <5d96567b0703261325m8f17b2eg55e04264fee4832a@mail.gmail.com> Content-Type: text/plain; charset=UTF-8 Organization: Aconex Date: Tue, 27 Mar 2007 08:10:30 +1000 Message-Id: <1174947030.5051.586.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 8bit X-archive-position: 10950 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk 
X-list: xfs Content-Length: 1111 Lines: 27 On Mon, 2007-03-26 at 22:25 +0200, Raz Ben-Jehuda(caro) wrote: > I want to change the sunit,swidth of a file system which > was already created with a different tunnings, or at least > mount it with a different values. > Is it possible ? From the XFS section in mount(8) ... sunit=value and swidth=value Used to specify the stripe unit and width for a RAID device or a stripe volume. value must be specified in 512-byte block units. If this option is not specified and the filesystem was made on a stripe volume or the stripe width or unit were specified for the RAID device at mkfs time, then the mount system call will restore the value from the superblock. For filesystems that are made directly on RAID devices, these options can be used to override the information in the superblock if the underlying disk layout changes after the filesystem has been created. The swidth option is required if the sunit option has been specified, and must be a multiple of the sunit value. cheers. 
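Because the man page text above gives sunit and swidth in 512-byte units, the values passed to mount are easy to get wrong by a factor of the sector size. A small Python sketch of the conversion follows; the function name and the example geometry (a 64 KiB stripe unit across 4 data disks) are assumptions chosen purely for illustration, not values from this thread.

```python
SECTOR_BYTES = 512  # mount(8): sunit and swidth are given in 512-byte units


def stripe_mount_options(stripe_unit_bytes, data_disks):
    """Build an 'sunit=...,swidth=...' option string for an XFS mount.

    stripe_unit_bytes: per-disk stripe unit (chunk size) in bytes.
    data_disks: number of data-bearing disks in the stripe.
    """
    if stripe_unit_bytes <= 0 or stripe_unit_bytes % SECTOR_BYTES != 0:
        raise ValueError("stripe unit must be a positive multiple of 512 bytes")
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    sunit = stripe_unit_bytes // SECTOR_BYTES
    # swidth is built as sunit * data_disks, so it is always a multiple
    # of sunit, as the man page requires.
    swidth = sunit * data_disks
    return f"sunit={sunit},swidth={swidth}"


if __name__ == "__main__":
    print(stripe_mount_options(64 * 1024, 4))  # prints "sunit=128,swidth=512"
```

The resulting string would be passed to mount with -o. Whether a given filesystem accepts the new values still depends on its existing layout, which this sketch does not check.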
-- Nathan From owner-xfs@oss.sgi.com Mon Mar 26 18:41:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 26 Mar 2007 18:41:27 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2R1fI6p018034 for ; Mon, 26 Mar 2007 18:41:20 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA18114; Tue, 27 Mar 2007 11:41:10 +1000 Date: Tue, 27 Mar 2007 11:41:47 +1100 From: Timothy Shimmin To: Antonio Trueba , xfs@oss.sgi.com Subject: Re: New ACL translations Message-ID: <6F618AD8B7E7C29C505E6AA6@timothy-shimmins-power-mac-g5.local> In-Reply-To: <4606B617.4020807@mundo-r.com> References: <4606B617.4020807@mundo-r.com> X-Mailer: Mulberry/4.0.8 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10951 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 274 Lines: 19 Hi Antonio, Thanks. I'll add them in now... --Tim --On 25 March 2007 7:49:11 PM +0200 Antonio Trueba wrote: > Hello all, > > Here I send new ACL translations to Spanish (es) and Galician (gl), both > against current CVS tree. 
> > Regards, > -- From owner-xfs@oss.sgi.com Mon Mar 26 19:07:15 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 26 Mar 2007 19:07:19 -0700 (PDT) X-Spam-oss-Status: No, score=-1.0 required=5.0 tests=AWL,BAYES_20 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2R27C6p025803 for ; Mon, 26 Mar 2007 19:07:14 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA18855 for ; Tue, 27 Mar 2007 12:07:10 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 1116) id 409A258FF76B; Tue, 27 Mar 2007 12:07:10 +1000 (EST) To: xfs@oss.sgi.com Subject: TAKE Add some more ACL translations. Message-Id: <20070327020710.409A258FF76B@chook.melbourne.sgi.com> Date: Tue, 27 Mar 2007 12:07:10 +1000 (EST) From: tes@sgi.com (Tim Shimmin) X-archive-position: 10952 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1074 Lines: 33 Add Spanish and Galician translations. Thanks to Antonio Trueba. --Tim Date: Tue Mar 27 12:06:04 AEST 2007 Workarea: chook.melbourne.sgi.com:/build/tes/xfs-cmds Inspected by: atrueba@mundo-r.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:28306a acl/po/gl.po - 1.1 - new http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/acl/po/gl.po - Add Galician translation. acl/po/es.po - 1.1 - new http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/acl/po/es.po - Add Spanish translation. acl/VERSION - 1.84 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/acl/VERSION.diff?r1=text&tr1=1.84&r2=text&tr2=1.83&f=h - Bump version# for Spanish and Galician translations. 
acl/doc/CHANGES - 1.94 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/acl/doc/CHANGES.diff?r1=text&tr1=1.94&r2=text&tr2=1.93&f=h acl/po/Makefile - 1.13 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/acl/po/Makefile.diff?r1=text&tr1=1.13&r2=text&tr2=1.12&f=h - Add Spanish and Galician translations. From owner-xfs@oss.sgi.com Mon Mar 26 21:57:42 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 26 Mar 2007 21:57:46 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2R4vd6p007222 for ; Mon, 26 Mar 2007 21:57:41 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA23917; Tue, 27 Mar 2007 14:57:28 +1000 Date: Tue, 27 Mar 2007 14:58:06 +1100 From: Timothy Shimmin To: David Chinner , Neil Brown cc: xfs@oss.sgi.com Subject: Re: XFS and write barriers. 
Message-ID: In-Reply-To: <20070325031927.GG32602149@melbourne.sgi.com> References: <17923.11463.459927.628762@notabene.brown> <1755676AA526FF7790546385@timothy-shimmins-power-mac-g5.local> <17923.35118.139991.252734@notabene.brown> <20070325031927.GG32602149@melbourne.sgi.com> X-Mailer: Mulberry/4.0.8 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10953 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1901 Lines: 46 --On 25 March 2007 2:19:27 PM +1100 David Chinner wrote: > On Fri, Mar 23, 2007 at 07:00:46PM +1100, Neil Brown wrote: >> On Friday March 23, tes@sgi.com wrote: >> > > >> > > I think this test should just be removed and the xfs_barrier_test >> > > should be the main mechanism for seeing if barriers work. >> > > >> > Oh okay. >> > This is all Christoph's (hch) code, so it would be good for him to comment here. >> > The external log and readonly tests can stay though. >> > >> >> Why no barriers on an external log device??? Not important, just >> curious. > > because we need to synchronize across 2 devices, not one, so issuing > barriers on an external log device does nothing to order the metadata > written to the other device... > I have wondered in the past (sgi-bug#954969) about doing a blk_issue_flush on the metadata device at xlog_sync time prior to the log write on the log device. 27/July/06 - pv#954969 Currently, if one uses external logs then the barrier support is turned off. The reason for this is that a write barrier is normally only done on the data device which has the log. With an external log it means that a write barrier on a log device will not do any flushing on the metadata device. 
This pv is opened to explore the possibility of issuing an explicit metadata device flush at xlog_sync time before doing a write barrier on the log data to the log device. This would guarantee that, if the tail moved because metadata was believed to be on disk, that belief would now be true, as we would have flushed its device. Then we could do our log write without worrying that it will overwrite log data whose metadata hadn't really made it to disk. Perhaps I'm missing something. Dave (dgc) said he'd think about it. I haven't heard back from Christoph yet, and he added the code for our barrier support in xfs. --Tim From owner-xfs@oss.sgi.com Tue Mar 27 01:09:45 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 27 Mar 2007 01:09:48 -0700 (PDT) X-Spam-oss-Status: No, score=1.4 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_43,J_CHICKENPOX_55,J_CHICKENPOX_56,J_CHICKENPOX_65 autolearn=no version=3.2.0-pre1-r499012 Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.174]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2R89h6p023517 for ; Tue, 27 Mar 2007 01:09:45 -0700 Received: by ug-out-1314.google.com with SMTP id a2so1733593ugf for ; Tue, 27 Mar 2007 01:09:41 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=PyZClNIvZQBUfqtOX/Zl7JN+BPez4RIUt6/sJkTR81h6IiOlrX9XsnlXlnRakQd1Aq2G7HET0TZhqx7PegO8KHA3c1nV2GAZpmJ9PpG5vPsfhV3Yjy4htdCOHfTAss8wIuL0fw2psV43hR5Ac4XnDgR3SJSkS6CFes+/q9YpMtI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; 
b=RBsbGf78Citi+ud1cyFYdUTQh+1Z+2ljrCtKkG71DSXMS95ixWtU6BpUXpTkszPL3DA8eaNprP13ssWGqPrzhDZut5cmPKkhkrAV68AzasgqsOk5rwlUlvQx8jnYtYokwLpQKF2AE3tpTssogvyFswtl0Uq6Vu6m/GQd1MxnJt8= Received: by 10.114.13.1 with SMTP id 1mr3034443wam.1174979074227; Tue, 27 Mar 2007 00:04:34 -0700 (PDT) Received: by 10.115.107.15 with HTTP; Tue, 27 Mar 2007 00:04:33 -0700 (PDT) Message-ID: <5d96567b0703270004j72d89618xe27bbf3d5e44eea1@mail.gmail.com> Date: Tue, 27 Mar 2007 09:04:33 +0200 From: "Raz Ben-Jehuda(caro)" To: nscott@aconex.com Subject: Re: changing xfs file system tunnings(sunit,swidth) after mkfs.xfs Cc: xfs@oss.sgi.com In-Reply-To: <1174947030.5051.586.camel@edge> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-2022-JP; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <5d96567b0703261325m8f17b2eg55e04264fee4832a@mail.gmail.com> <1174947030.5051.586.camel@edge> X-archive-position: 10955 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: raziebe@gmail.com Precedence: bulk X-list: xfs Content-Length: 1328 Lines: 38 ahha... but did you ever tried to set an sunit value bigger than the one set during the mkfs.xfs ? On 3/27/07, Nathan Scott wrote: > On Mon, 2007-03-26 at 22:25 +0200, Raz Ben-Jehuda(caro) wrote: > > I want to change the sunit,swidth of a file system which > > was already created with a different tunnings, or at least > > mount it with a different values. > > Is it possible ? > > >From the XFS section in mount(8) ... > > sunit=value and swidth=value > Used to specify the stripe unit and width for a RAID device or a > stripe volume. value must be specified in 512-byte block units. > If this option is not specified and the filesystem was made on a > stripe volume or the stripe width or unit were specified for the > RAID device at mkfs time, then the mount system call will > restore the value from the superblock. 
For filesystems that are > made directly on RAID devices, these options can be used to > override the information in the superblock if the underlying > disk layout changes after the filesystem has been created. The > swidth option is required if the sunit option has been > specified, and must be a multiple of the sunit value. > > > cheers. > > -- > Nathan > > -- Raz From owner-xfs@oss.sgi.com Tue Mar 27 02:48:54 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 27 Mar 2007 02:48:58 -0700 (PDT) X-Spam-oss-Status: No, score=0.5 required=5.0 tests=AWL,BAYES_50, FH_HOST_EQ_D_D_D_D,FH_HOST_EQ_D_D_D_DB,J_CHICKENPOX_43,RDNS_DYNAMIC autolearn=no version=3.2.0-pre1-r499012 Received: from ext.agami.com (64.221.212.177.ptr.us.xo.net [64.221.212.177]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2R9mr6p010611 for ; Tue, 27 Mar 2007 02:48:54 -0700 Received: from agami.com (mail [192.168.168.5]) by ext.agami.com (8.12.5/8.12.5) with ESMTP id l2R9mNc4015494 for ; Tue, 27 Mar 2007 02:48:23 -0700 Received: from mx1.agami.com (mx1.agami.com [10.123.10.30]) by agami.com (8.12.11/8.12.11) with ESMTP id l2R9mX1T028243 for ; Tue, 27 Mar 2007 02:48:33 -0700 Received: from [10.125.200.31] ([10.125.200.31]) by mx1.agami.com with Microsoft SMTPSVC(6.0.3790.1830); Tue, 27 Mar 2007 02:48:55 -0700 Message-ID: <4608E86E.7000906@agami.com> Date: Tue, 27 Mar 2007 15:18:30 +0530 From: Shailendra Tripathi User-Agent: Mozilla Thunderbird 1.0.6 (Windows/20050716) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Raz Ben-Jehuda(caro)" CC: nscott@aconex.com, xfs@oss.sgi.com Subject: Re: changing xfs file system tunnings(sunit,swidth) after mkfs.xfs References: <5d96567b0703261325m8f17b2eg55e04264fee4832a@mail.gmail.com> <1174947030.5051.586.camel@edge> <5d96567b0703270004j72d89618xe27bbf3d5e44eea1@mail.gmail.com> In-Reply-To: <5d96567b0703270004j72d89618xe27bbf3d5e44eea1@mail.gmail.com> Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit 
X-OriginalArrivalTime: 27 Mar 2007 09:48:55.0955 (UTC) FILETIME=[22AB9630:01C77055] X-Scanned-By: MIMEDefang 2.58 on 192.168.168.13 X-archive-position: 10957 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: stripathi@agami.com Precedence: bulk X-list: xfs Content-Length: 193 Lines: 8 Raz Ben-Jehuda(caro) wrote: > ahha... > but did you ever tried to set an sunit value bigger than the one > set during the mkfs.xfs ? > Yes, I do it many times and I have always got it right. From owner-xfs@oss.sgi.com Tue Mar 27 05:21:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 27 Mar 2007 05:21:05 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_43 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2RCKw6p012906 for ; Tue, 27 Mar 2007 05:21:00 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id WAA06522; Tue, 27 Mar 2007 22:20:51 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2RCKnAf39588104; Tue, 27 Mar 2007 23:20:49 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2RCKlSE40478592; Tue, 27 Mar 2007 23:20:47 +1100 (AEDT) Date: Tue, 27 Mar 2007 23:20:47 +1100 From: David Chinner To: "Raz Ben-Jehuda(caro)" Cc: nscott@aconex.com, xfs@oss.sgi.com Subject: Re: changing xfs file system tunnings(sunit,swidth) after mkfs.xfs Message-ID: <20070327122047.GA32597093@melbourne.sgi.com> References: <5d96567b0703261325m8f17b2eg55e04264fee4832a@mail.gmail.com> <1174947030.5051.586.camel@edge> <5d96567b0703270004j72d89618xe27bbf3d5e44eea1@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline In-Reply-To: <5d96567b0703270004j72d89618xe27bbf3d5e44eea1@mail.gmail.com> User-Agent: Mutt/1.4.2.1i X-archive-position: 10958 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 696 Lines: 25 On Tue, Mar 27, 2007 at 09:04:33AM +0200, Raz Ben-Jehuda(caro) wrote: > ahha... > but did you ever tried to set an sunit value bigger than the one > set during the mkfs.xfs ? Sure - but you've got to satisfy a whole heap of alignment and size considerations when changing sunit, e.g. AGs must be stripe unit aligned and their size must be a whole multiple of the stripe unit. Hence any new value generally needs to be a multiple of the original one so that filesystem alignment remains valid. You might want to get a recent kernel, too, as it will tell you more (in dmesg) about the reason why the mount with the new sunit is failing. Cheers, dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Mar 27 16:50:28 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 27 Mar 2007 16:50:36 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_43 autolearn=no version=3.2.0-pre1-r499012 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2RNoP6p011319 for ; Tue, 27 Mar 2007 16:50:28 -0700 Received: from edge (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id F0C75AAC2A5; Wed, 28 Mar 2007 09:50:21 +1000 (EST) Subject: Re: changing xfs file system tunnings(sunit,swidth) after mkfs.xfs From: Nathan Scott Reply-To: nscott@aconex.com To: "Raz Ben-Jehuda(caro)" Cc: xfs@oss.sgi.com In-Reply-To: <5d96567b0703270004j72d89618xe27bbf3d5e44eea1@mail.gmail.com> References: <5d96567b0703261325m8f17b2eg55e04264fee4832a@mail.gmail.com> <1174947030.5051.586.camel@edge> 
<5d96567b0703270004j72d89618xe27bbf3d5e44eea1@mail.gmail.com> Content-Type: text/plain Organization: Aconex Date: Wed, 28 Mar 2007 09:50:30 +1000 Message-Id: <1175039430.4645.14.camel@edge> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-archive-position: 10960 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs Content-Length: 259 Lines: 15 On Tue, 2007-03-27 at 09:04 +0200, Raz Ben-Jehuda(caro) wrote: > EST) > > > ahha... > but did you ever tried to set an sunit value bigger than the one > set during the mkfs.xfs ? I've not tried it, no - is that a trick question? :) cheers. -- Nathan From owner-xfs@oss.sgi.com Wed Mar 28 04:31:56 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 28 Mar 2007 04:32:00 -0700 (PDT) X-Spam-oss-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_45 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2SBVr6p003686 for ; Wed, 28 Mar 2007 04:31:55 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id VAA18707; Wed, 28 Mar 2007 21:31:45 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2SBVhAf41347177; Wed, 28 Mar 2007 22:31:43 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2SBVfja41346123; Wed, 28 Mar 2007 22:31:41 +1100 (AEDT) Date: Wed, 28 Mar 2007 22:31:41 +1100 From: David Chinner To: Oliver Joa Cc: linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel Message-ID: <20070328113141.GQ32597093@melbourne.sgi.com> References: <46094344.4090007@j-o-a.de> 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <46094344.4090007@j-o-a.de> User-Agent: Mutt/1.4.2.1i X-archive-position: 10963 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1210 Lines: 46 On Tue, Mar 27, 2007 at 06:16:04PM +0200, Oliver Joa wrote: > Hi, > > since some weeks i try to get my new hardware running: > > Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz > Intel DP965LT Mainboard > Seagate SATA-Harddisk in AHCI-Mode > > After some hours of running or after some heavy file-i/o > (find / | cpio -padm /test) I always get a corrupted > XFS-filesystem. What is the corruption message in the log from XFS? Can you please post that? Without it we really can't help you. Also, please check to see if there are any I/O errors in the log around the time the corruption message appears. > I used already the following Kernels: > 2.6.19.2 > 2.6.19.7 > 2.6.20.2 > 2.6.20.4 > > After xfs_repair I get damaged files in lost+found. > > I read in newsgroups that the write-cache of the harddisk > should be turned off, but the messages are all very old. That's really only an issue for crashes, not runtime failures. > I also often get a sata-bus-reset with the kernels 2.6.19.2 > and 2.6.20.2. I/O errors. That's what we need to isolate first. The reports in your logs are the first thing we need to see. Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Mar 28 06:08:46 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 28 Mar 2007 06:08:51 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.gatrixx.com (mail.gatrixx.com [217.111.11.44]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2SD8j6p021066 for ; Wed, 28 Mar 2007 06:08:46 -0700 Received: (qmail 23747 invoked by uid 1008); 28 Mar 2007 14:42:02 +0200 Received: from unknown (HELO majestix.gallier.de) (ojoa@gatrixx.com@89.54.92.66) by 0 with AES256-SHA encrypted SMTP; 28 Mar 2007 14:42:02 +0200 Received: from [192.168.10.3] (olli@gutemine.gallier.de [192.168.10.3]) by majestix.gallier.de (8.13.8/8.13.8/Debian-2) with ESMTP id l2SCg0co016319; Wed, 28 Mar 2007 14:42:00 +0200 Message-ID: <460A6298.4040702@j-o-a.de> Date: Wed, 28 Mar 2007 14:42:00 +0200 From: Oliver Joa User-Agent: Icedove 1.5.0.10 (X11/20070307) MIME-Version: 1.0 To: David Chinner CC: linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> In-Reply-To: <20070328113141.GQ32597093@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10964 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: oliver@j-o-a.de Precedence: bulk X-list: xfs Content-Length: 3915 Lines: 101 Hi, David Chinner wrote: [...] > What is the corruption message in the log from XFS? > Can you please post that? Without it we really can't help you. > > Also, please check to see if there are any I/O errors > in the log around the time the corruption message appears. 
Ok, here is a test: test:/# find / -xdev | cpio -padm /test/ cpio: /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt: Structure needs cleaning 3648371 blocks test:/# test:/home/olli# uname -a Linux test 2.6.20.4-majestix-1 #1 SMP PREEMPT Tue Mar 27 12:15:41 CEST 2007 i686 GNU/Linux dmesg gives the following: [15442.935941] Filesystem "sda3": XFS internal error xfs_iformat(6) at line 492 of file fs/xfs/xfs_inode.c. Caller 0xc0211f94 [15442.936003] [] xfs_iread+0x4ee/0x6e8 [15442.936039] [] xfs_iget+0x2e4/0x714 [15442.936071] [] xfs_iget+0x2e4/0x714 [15442.936101] [] xfs_dir_lookup_int+0x7d/0xd4 [15442.936135] [] xfs_lookup+0x52/0x78 [15442.936167] [] xfs_vn_lookup+0x3b/0x70 [15442.936201] [] do_lookup+0xa3/0x140 [15442.936234] [] __link_path_walk+0x73d/0xb5e [15442.936278] [] xfs_iunlock+0x51/0x6d [15442.936309] [] link_path_walk+0x44/0xb3 [15442.936342] [] do_path_lookup+0x176/0x191 [15442.936373] [] getname+0x59/0x8f [15442.936402] [] __user_walk_fd+0x2f/0x45 [15442.936431] [] vfs_lstat_fd+0x16/0x3d [15442.936461] [] sys_lstat64+0xf/0x23 [15442.936490] [] syscall_call+0x7/0xb [15442.936519] ======================= And after this command: test:/# rm /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt rm: cannot remove `/usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt': Structure needs cleaning test:/# I got: [18359.750604] Filesystem "sda3": XFS internal error xfs_iformat(6) at line 492 of file fs/xfs/xfs_inode.c. 
Caller 0xc0211f94 [18359.750701] [] xfs_iread+0x4ee/0x6e8 [18359.750755] [] xfs_iget+0x2e4/0x714 [18359.750802] [] xfs_iget+0x2e4/0x714 [18359.750849] [] xfs_dir_lookup_int+0x7d/0xd4 [18359.750897] [] xfs_lookup+0x52/0x78 [18359.750943] [] xfs_vn_lookup+0x3b/0x70 [18359.750990] [] do_lookup+0xa3/0x140 [18359.751036] [] __link_path_walk+0x73d/0xb5e [18359.751086] [] link_path_walk+0x44/0xb3 [18359.751133] [] rb_insert_color+0x4c/0xad [18359.751180] [] vma_link+0x54/0xcd [18359.751226] [] do_path_lookup+0x176/0x191 [18359.751273] [] getname+0x59/0x8f [18359.751318] [] __user_walk_fd+0x2f/0x45 [18359.751364] [] vfs_lstat_fd+0x16/0x3d [18359.751410] [] rb_insert_color+0x4c/0xad [18359.751457] [] vma_link+0x54/0xcd [18359.751501] [] sys_lstat64+0xf/0x23 [18359.751546] [] do_page_fault+0x277/0x526 [18359.751595] [] do_page_fault+0x0/0x526 [18359.751640] [] syscall_call+0x7/0xb [18359.751686] [] rsc_parse+0x6f/0x37f [18359.751732] ======================= [18359.751784] Filesystem "sda3": XFS internal error xfs_iformat(6) at line 492 of file fs/xfs/xfs_inode.c. 
Caller 0xc0211f94 [18359.751859] [] xfs_iread+0x4ee/0x6e8 [18359.751906] [] xfs_iget+0x2e4/0x714 [18359.751952] [] xfs_iget+0x2e4/0x714 [18359.751998] [] xfs_dir_lookup_int+0x7d/0xd4 [18359.752047] [] xfs_lookup+0x52/0x78 [18359.752094] [] xfs_vn_lookup+0x3b/0x70 [18359.752140] [] __lookup_hash+0xb1/0xe1 [18359.752191] [] do_unlinkat+0x5f/0x126 [18359.752237] [] do_page_fault+0x277/0x526 [18359.752285] [] syscall_call+0x7/0xb [18359.752331] [] rsc_parse+0x6f/0x37f [18359.752376] ======================= Thanks a Lot Oliver From owner-xfs@oss.sgi.com Wed Mar 28 07:56:29 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 28 Mar 2007 07:56:35 -0700 (PDT) X-Spam-oss-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00, SPF_HELO_PASS autolearn=ham version=3.2.0-pre1-r499012 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2SEuR6p009535 for ; Wed, 28 Mar 2007 07:56:29 -0700 Received: from [10.0.0.4] (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id A4FA0187B8C45; Wed, 28 Mar 2007 09:56:25 -0500 (CDT) Message-ID: <460A821B.4080308@sandeen.net> Date: Wed, 28 Mar 2007 09:56:27 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: Oliver Joa CC: David Chinner , linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> <460A6298.4040702@j-o-a.de> In-Reply-To: <460A6298.4040702@j-o-a.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10965 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Content-Length: 1826 Lines: 45 Oliver Joa 
wrote: > Ok, here is a test: > > test:/# find / -xdev | cpio -padm /test/ > cpio: /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt: > Structure needs cleaning > 3648371 blocks > test:/# That, cryptically enough, means that the filesystem has detected a problem and has shut down. > test:/home/olli# uname -a > Linux test 2.6.20.4-majestix-1 #1 SMP PREEMPT Tue Mar 27 12:15:41 CEST > 2007 i686 GNU/Linux > > dmesg gives the following: > [15442.935941] Filesystem "sda3": XFS internal error xfs_iformat(6) at > line 492 of file fs/xfs/xfs_inode.c. Caller 0xc0211f94 > [15442.936003] [] xfs_iread+0x4ee/0x6e8 > [15442.936039] [] xfs_iget+0x2e4/0x714 > [15442.936071] [] xfs_iget+0x2e4/0x714 > [15442.936101] [] xfs_dir_lookup_int+0x7d/0xd4 > [15442.936135] [] xfs_lookup+0x52/0x78 > [15442.936167] [] xfs_vn_lookup+0x3b/0x70 > [15442.936201] [] do_lookup+0xa3/0x140 > [15442.936234] [] __link_path_walk+0x73d/0xb5e > [15442.936278] [] xfs_iunlock+0x51/0x6d > [15442.936309] [] link_path_walk+0x44/0xb3 > [15442.936342] [] do_path_lookup+0x176/0x191 > [15442.936373] [] getname+0x59/0x8f > [15442.936402] [] __user_walk_fd+0x2f/0x45 > [15442.936431] [] vfs_lstat_fd+0x16/0x3d > [15442.936461] [] sys_lstat64+0xf/0x23 > [15442.936490] [] syscall_call+0x7/0xb > [15442.936519] ======================= For one reason or another, xfs has detected a corrupted on-disk inode format which it cannot recognize, and shuts down. It is likely the result of something which has gone wrong previously. xfs_repair should fix it. Are there other non-xfs messages in your logs indicating other problems prior to this? 
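For anyone scripting around this: "Structure needs cleaning" is the C library's message text for the errno EUCLEAN, which, as far as I know, is what XFS hands back when it reports on-disk corruption. Below is a minimal Python sketch, assuming x86 Linux (where EUCLEAN is 117), that walks a list of paths and collects the ones whose lstat() fails that way. The helper names are made up for illustration.

```python
import errno
import os

# EUCLEAN is the errno whose glibc message is "Structure needs cleaning".
# The fallback value 117 is an assumption that holds on x86 Linux.
EUCLEAN = getattr(errno, "EUCLEAN", 117)


def is_corruption_error(exc):
    """Return True if an OSError carries the 'Structure needs cleaning' errno."""
    return isinstance(exc, OSError) and exc.errno == EUCLEAN


def find_unreadable(paths):
    """lstat() each path and collect those that fail with EUCLEAN."""
    bad = []
    for path in paths:
        try:
            os.lstat(path)
        except OSError as exc:
            if is_corruption_error(exc):
                bad.append(path)
            # Other errors (ENOENT, EACCES, ...) are not corruption reports.
    return bad
```

Feeding it the paths printed by find gives a rough list of affected files without stopping at the first error. It is no substitute for xfs_repair.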
-Eric From owner-xfs@oss.sgi.com Wed Mar 28 12:56:16 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 28 Mar 2007 12:56:20 -0700 (PDT) X-Spam-oss-Status: No, score=-1.2 required=5.0 tests=AWL,BAYES_00, J_CHICKENPOX_45 autolearn=no version=3.2.0-pre1-r499012 Received: from mail.gatrixx.com (mail.gatrixx.com [217.111.11.44]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2SJuD6p007683 for ; Wed, 28 Mar 2007 12:56:15 -0700 Received: (qmail 8811 invoked by uid 1008); 28 Mar 2007 21:56:09 +0200 Received: from unknown (HELO majestix.gallier.de) (ojoa@gatrixx.com@89.54.92.66) by 0 with AES256-SHA encrypted SMTP; 28 Mar 2007 21:56:09 +0200 Received: from [192.168.10.3] (olli@gutemine.gallier.de [192.168.10.3]) by majestix.gallier.de (8.13.8/8.13.8/Debian-2) with ESMTP id l2SJu8w4000747; Wed, 28 Mar 2007 21:56:08 +0200 Message-ID: <460AC857.6040305@j-o-a.de> Date: Wed, 28 Mar 2007 21:56:07 +0200 From: Oliver Joa User-Agent: Icedove 1.5.0.10 (X11/20070307) MIME-Version: 1.0 To: Eric Sandeen CC: David Chinner , linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> <460A6298.4040702@j-o-a.de> <460A821B.4080308@sandeen.net> In-Reply-To: <460A821B.4080308@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10966 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: oliver@j-o-a.de Precedence: bulk X-list: xfs Content-Length: 4709 Lines: 107 Hi, Eric Sandeen wrote: [...] > For one reason or another, xfs has detected a corrupted on-disk inode > format which it cannot recognize, and shuts down. It is likely the > result of something which has gone wrong previously. xfs_repair should > fix it. Are there other non-xfs messages in your logs indicating other > problems prior to this? 
i sent already the dmesg output to the list. there is nothing else. I made a xfs_repair. Now I have some Files in lost+found. So I tried it again with a new cable: test:/# find / -xdev | cpio -padm /test/ 3648526 blocks test:/# rm -rf test test:/# find / -xdev | cpio -padm /test/ find: /usr/src/linux-2.6.19.2/arch/sh/kernel/cpufreq.c: Structure needs cleaning find: /usr/src/linux-2.6.19.2/arch/sh/kernel/head.S: Structure needs cleaning find: /usr/src/linux-2.6.19.2/arch/sh/kernel/irq.c: Structure needs cleaning 3653268 blocks test:/# Since the reboot I did not get any bus-reset, but the following: [ 1878.777203] Filesystem "sda3": XFS internal error xfs_iformat(6) at line 492 of file fs/xfs/xfs_inode.c. Caller 0xc0211f94 [ 1878.777264] [] xfs_iread+0x4ee/0x6e8 [ 1878.777298] [] xfs_iget+0x2e4/0x714 [ 1878.777451] [] xfs_iget+0x2e4/0x714 [ 1878.777513] [] xfs_dir_lookup_int+0x7d/0xd4 [ 1878.777576] [] xfs_lookup+0x52/0x78 [ 1878.777636] [] xfs_vn_lookup+0x3b/0x70 [ 1878.777696] [] do_lookup+0xa3/0x140 [ 1878.777757] [] __link_path_walk+0x73d/0xb5e [ 1878.777819] [] mntput_no_expire+0x11/0x63 [ 1878.777879] [] link_path_walk+0xa9/0xb3 [ 1878.777941] [] link_path_walk+0x44/0xb3 [ 1878.778001] [] nameidata_to_filp+0x24/0x33 [ 1878.778074] [] do_filp_open+0x32/0x39 [ 1878.778145] [] do_path_lookup+0x176/0x191 [ 1878.778209] [] getname+0x59/0x8f [ 1878.778270] [] __user_walk_fd+0x2f/0x45 [ 1878.778334] [] vfs_lstat_fd+0x16/0x3d [ 1878.778397] [] nameidata_to_filp+0x24/0x33 [ 1878.778461] [] do_filp_open+0x32/0x39 [ 1878.778524] [] sys_lstat64+0xf/0x23 [ 1878.778585] [] __fput+0x112/0x13c [ 1878.778647] [] mntput_no_expire+0x11/0x63 [ 1878.778709] [] filp_close+0x51/0x58 [ 1878.778771] [] sys_close+0x67/0x9e [ 1878.778832] [] syscall_call+0x7/0xb [ 1878.778895] ======================= [ 1878.974434] Filesystem "sda3": XFS internal error xfs_iformat(6) at line 492 of file fs/xfs/xfs_inode.c. 
Caller 0xc0211f94 [ 1878.974493] [] xfs_iread+0x4ee/0x6e8 [ 1878.974599] [] xfs_iget+0x2e4/0x714 [ 1878.974692] [] xfs_iget+0x2e4/0x714 [ 1878.974759] [] xfs_dir_lookup_int+0x7d/0xd4 [ 1878.974799] [] xfs_lookup+0x52/0x78 [ 1878.974888] [] xfs_vn_lookup+0x3b/0x70 [ 1878.974950] [] do_lookup+0xa3/0x140 [ 1878.975015] [] __link_path_walk+0x73d/0xb5e [ 1878.975080] [] _spin_unlock_irqrestore+0xf/0x23 [ 1878.975145] [] n_tty_receive_buf+0xc77/0xd1a [ 1878.975210] [] link_path_walk+0x44/0xb3 [ 1878.975275] [] do_path_lookup+0x176/0x191 [ 1878.975338] [] getname+0x59/0x8f [ 1878.975399] [] __user_walk_fd+0x2f/0x45 [ 1878.975461] [] vfs_lstat_fd+0x16/0x3d [ 1878.975525] [] sys_lstat64+0xf/0x23 [ 1878.975588] [] syscall_call+0x7/0xb [ 1878.975650] [] rsc_parse+0x6f/0x37f [ 1878.975712] ======================= [ 1878.975956] Filesystem "sda3": XFS internal error xfs_iformat(6) at line 492 of file fs/xfs/xfs_inode.c. Caller 0xc0211f94 [ 1878.976012] [] xfs_iread+0x4ee/0x6e8 [ 1878.976111] [] xfs_iget+0x2e4/0x714 [ 1878.976184] [] xfs_iget+0x2e4/0x714 [ 1878.976249] [] xfs_dir_lookup_int+0x7d/0xd4 [ 1878.976314] [] xfs_lookup+0x52/0x78 [ 1878.976376] [] xfs_vn_lookup+0x3b/0x70 [ 1878.976438] [] do_lookup+0xa3/0x140 [ 1878.976500] [] __link_path_walk+0x73d/0xb5e [ 1878.976564] [] _spin_unlock_irqrestore+0xf/0x23 [ 1878.976629] [] n_tty_receive_buf+0xc77/0xd1a [ 1878.976701] [] link_path_walk+0x44/0xb3 [ 1878.976766] [] do_path_lookup+0x176/0x191 [ 1878.976835] [] getname+0x59/0x8f [ 1878.976898] [] __user_walk_fd+0x2f/0x45 [ 1878.976961] [] vfs_lstat_fd+0x16/0x3d [ 1878.977024] [] sys_lstat64+0xf/0x23 [ 1878.977088] [] syscall_call+0x7/0xb [ 1878.977150] [] rsc_parse+0x6f/0x37f [ 1878.977212] ======================= Thanks Oliver From owner-xfs@oss.sgi.com Wed Mar 28 16:47:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 28 Mar 2007 16:47:04 -0700 (PDT) X-Spam-oss-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00, 
J_CHICKENPOX_44,J_CHICKENPOX_45,J_CHICKENPOX_46,J_CHICKENPOX_47 autolearn=no version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2SNkw6p020945 for ; Wed, 28 Mar 2007 16:47:00 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA12422; Thu, 29 Mar 2007 09:46:53 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2SNkpAf42040929; Thu, 29 Mar 2007 10:46:52 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2SNkm0142083931; Thu, 29 Mar 2007 10:46:48 +1100 (AEDT) Date: Thu, 29 Mar 2007 10:46:48 +1100 From: David Chinner To: Oliver Joa Cc: David Chinner , linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel Message-ID: <20070328234647.GT32597093@melbourne.sgi.com> References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> <460A6298.4040702@j-o-a.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <460A6298.4040702@j-o-a.de> User-Agent: Mutt/1.4.2.1i X-archive-position: 10967 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 4385 Lines: 127 On Wed, Mar 28, 2007 at 02:42:00PM +0200, Oliver Joa wrote: > Hi, > > David Chinner wrote: > > [...] > > >What is the corruption message in the log from XFS? > >Can you please post that? Without it we really can't help you. > > > >Also, please check to see if there are any I/O errors > >in the log around the time the corruption message appears. 
> > Ok, here is a test: > > test:/# find / -xdev | cpio -padm /test/ > cpio: /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt: > Structure needs cleaning > 3648371 blocks > test:/# > > test:/home/olli# uname -a > Linux test 2.6.20.4-majestix-1 #1 SMP PREEMPT Tue Mar 27 12:15:41 CEST > 2007 i686 GNU/Linux > > dmesg gives the following: > [15442.935941] Filesystem "sda3": XFS internal error xfs_iformat(6) at > line 492 of file fs/xfs/xfs_inode.c. Caller 0xc0211f94 > [15442.936003] [] xfs_iread+0x4ee/0x6e8 > [15442.936039] [] xfs_iget+0x2e4/0x714 > [15442.936071] [] xfs_iget+0x2e4/0x714 > [15442.936101] [] xfs_dir_lookup_int+0x7d/0xd4 So we have a corrupt inode. The error tells me that the corrupted inode is either a regular file, directory or link. Unfortunately it doesn't tell us the inode number that is corrupted. > test:/# rm /usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt > rm: cannot remove > `/usr/src/linux-2.6.20.2/Documentation/networking/NAPI_HOWTO.txt': > Structure needs cleaning > test:/# Once the filesystem shuts down this will happen to every operation. Next time you get a shutdown, can you unmount the filesystems and run xfs_check and then "xfs_repair -n" on the filesystem. These will tell you the inode numbers that are bad. Can you post the errors reported by these tools? Once you have the bad inode numbers, can you run the following on the bad inodes: # xfs_db -r -c "inode " -c "p" E.g.: # xfs_db -r -c "inode 128" -c p /dev/sdb8 core.magic = 0x494e core.mode = 040755 core.version = 2 core.format = 2 (extents) ...... and post the output for us? That will enable us to see exactly what the corruption is on the inode. Cheers, Dave. > > I got: > > [18359.750604] Filesystem "sda3": XFS internal error xfs_iformat(6) at > line 492 of file fs/xfs/xfs_inode.c. 
Caller 0xc0211f94 > [18359.750701] [] xfs_iread+0x4ee/0x6e8 > [18359.750755] [] xfs_iget+0x2e4/0x714 > [18359.750802] [] xfs_iget+0x2e4/0x714 > [18359.750849] [] xfs_dir_lookup_int+0x7d/0xd4 > [18359.750897] [] xfs_lookup+0x52/0x78 > [18359.750943] [] xfs_vn_lookup+0x3b/0x70 > [18359.750990] [] do_lookup+0xa3/0x140 > [18359.751036] [] __link_path_walk+0x73d/0xb5e > [18359.751086] [] link_path_walk+0x44/0xb3 > [18359.751133] [] rb_insert_color+0x4c/0xad > [18359.751180] [] vma_link+0x54/0xcd > [18359.751226] [] do_path_lookup+0x176/0x191 > [18359.751273] [] getname+0x59/0x8f > [18359.751318] [] __user_walk_fd+0x2f/0x45 > [18359.751364] [] vfs_lstat_fd+0x16/0x3d > [18359.751410] [] rb_insert_color+0x4c/0xad > [18359.751457] [] vma_link+0x54/0xcd > [18359.751501] [] sys_lstat64+0xf/0x23 > [18359.751546] [] do_page_fault+0x277/0x526 > [18359.751595] [] do_page_fault+0x0/0x526 > [18359.751640] [] syscall_call+0x7/0xb > [18359.751686] [] rsc_parse+0x6f/0x37f > [18359.751732] ======================= > [18359.751784] Filesystem "sda3": XFS internal error xfs_iformat(6) at > line 492 of file fs/xfs/xfs_inode.c. 
Caller 0xc0211f94 > [18359.751859] [] xfs_iread+0x4ee/0x6e8 > [18359.751906] [] xfs_iget+0x2e4/0x714 > [18359.751952] [] xfs_iget+0x2e4/0x714 > [18359.751998] [] xfs_dir_lookup_int+0x7d/0xd4 > [18359.752047] [] xfs_lookup+0x52/0x78 > [18359.752094] [] xfs_vn_lookup+0x3b/0x70 > [18359.752140] [] __lookup_hash+0xb1/0xe1 > [18359.752191] [] do_unlinkat+0x5f/0x126 > [18359.752237] [] do_page_fault+0x277/0x526 > [18359.752285] [] syscall_call+0x7/0xb > [18359.752331] [] rsc_parse+0x6f/0x37f > [18359.752376] ======================= > > > > Thanks a Lot > > Oliver -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Mar 28 17:33:10 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 28 Mar 2007 17:33:13 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_45 autolearn=no version=3.2.0-pre1-r499012 Received: from ishtar.tlinx.org (ishtar.tlinx.org [64.81.245.74]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2T0X96p029987 for ; Wed, 28 Mar 2007 17:33:09 -0700 Received: from [192.168.3.11] (Athena [192.168.3.11]) by ishtar.tlinx.org (8.13.3/8.12.10/SuSE Linux 0.7) with ESMTP id l2T0LWUR023886; Wed, 28 Mar 2007 17:21:32 -0700 Message-ID: <460B068C.6060903@tlinx.org> Date: Wed, 28 Mar 2007 17:21:32 -0700 From: Linda Walsh User-Agent: Thunderbird 1.5.0.10 (Windows/20070221) MIME-Version: 1.0 To: Oliver Joa CC: Eric Sandeen , David Chinner , linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> <460A6298.4040702@j-o-a.de> <460A821B.4080308@sandeen.net> <460AC857.6040305@j-o-a.de> In-Reply-To: <460AC857.6040305@j-o-a.de> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10968 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: 
lkml@tlinx.org Precedence: bulk X-list: xfs Content-Length: 2260 Lines: 46 Oliver Joa wrote: >> eason or another, xfs has detected a corrupted on-disk inode format >> which it cannot recognize, and shuts down. It is likely the result >> of something which has gone wrong previously. xfs_repair should fix >> it. Are there other non-xfs messages in your logs indicating other >> problems prior to this? > i sent already the dmesg output to the list. there is nothing else. > I made a xfs_repair. Now I have some Files in lost+found. > So I tried it again with a new cable: --- I doubt it has changed significantly, but xfs was designed for stable hardware. That doesn't mean you can't pull the plug, but if you are getting SATA resets, you may be getting some writes aborted, with subsequent writes going through (speculation). I know when I had a flakey SCSI disk problem (was cable or connector in my case), I'd get a rare XFS corruption (out of ~10 years of XFS use, maybe 2-3 corruptions, all caused by loose connections, cables, etc). I'd strongly suggest you get to the bottom of the SATA reset problem. After that is fixed, then try to clean up your XFS disks (or restore from backups). Sometimes, after some intermittent hardware problems, my xfs file system was too corrupt for me to repair (at least with default xfs_repair options). Doesn't mean it was irreparable, just, I didn't know how to proceed and it was easier to restore from a daily backup than attempt to manually repair the damage. The above is based solely on my own experience. I use xfs with max(8?) logbuffs, and noatime/nodiratime, and find it to have among the best performance characteristics of any file system (overall; lowest performance aspect was file delete). XFS has a low fragmentation rate, due to how it allocates space and can delay writes. Even so, it is also one of the few file systems (only?) that comes with a "defragmenter" (xfs_fsr (file system reorganizer)). 
Sgi used to ship systems with xfs_fsr configured to run weekly to "watch out for" rare, degenerate cases (important for some real-time video apps). My cron runs it nightly, but often it will pass through all file systems making no changes. Fix the flakey hw -- then see if your xfs probs don't "magically" go away...however, YMMV... Linda From owner-xfs@oss.sgi.com Wed Mar 28 19:34:45 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 28 Mar 2007 19:34:48 -0700 (PDT) X-Spam-oss-Status: No, score=-0.4 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from ishtar.tlinx.org (ishtar.tlinx.org [64.81.245.74]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2T2Yi6p027537 for ; Wed, 28 Mar 2007 19:34:45 -0700 Received: from [192.168.3.11] (Athena [192.168.3.11]) by ishtar.tlinx.org (8.13.3/8.12.10/SuSE Linux 0.7) with ESMTP id l2T2Yc5L024692; Wed, 28 Mar 2007 19:34:38 -0700 Message-ID: <460B25BE.3050808@tlinx.org> Date: Wed, 28 Mar 2007 19:34:38 -0700 From: Linda Walsh User-Agent: Thunderbird 1.5.0.10 (Windows/20070221) MIME-Version: 1.0 To: Linda Walsh CC: Oliver Joa , Eric Sandeen , David Chinner , linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> <460A6298.4040702@j-o-a.de> <460A821B.4080308@sandeen.net> <460AC857.6040305@j-o-a.de> <460B068C.6060903@tlinx.org> In-Reply-To: <460B068C.6060903@tlinx.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10969 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lkml@tlinx.org Precedence: bulk X-list: xfs Content-Length: 1478 Lines: 34 Oliver Joa wrote: >> eason or another, xfs has detected a corrupted on-disk inode format >> which it cannot recognize, and shuts down. 
---- Oh, one other thing that may not apply in your case, but may.
Does your SATA disk support write caching? Does it support something called a barrier function? (I'm not real clear on all the ways this can go wrong, but I believe barriers are supposed to guarantee that previous data has been fixed on disk, not just in the write cache.) If the SATA controller issues a reset, it may very well purge the write cache. Theoretically, I can think of a _possibility_ that the reset disk would purge the write cache and the barrier indicator would tell xfs to resume writing. From a recent thread on the xfs list, it would appear this could be a "bad" thing (like crossing the streams a la "Ghostbusters", but in a data-integrity context).

Just a "shot in the dark" -- absent knowing anything specific about your hardware or situation...

If that's the case, you might want to turn off write caching, since when xfs thinks "barriers" work, it turns off some "protection", which can enable a significant speedup in some situations.

As an aside, some disks, I gather, may "claim" to support barriers, but really don't. Xfs tries to verify the barrier claim, but I don't know that a reset issued to the disk will have deterministic behavior across all manufacturers' disks.

A bunch of "coulds" and "maybes", but just thinking off the top of my head...
Linda From owner-xfs@oss.sgi.com Thu Mar 29 02:58:40 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 02:58:45 -0700 (PDT) X-Spam-oss-Status: No, score=0.0 required=5.0 tests=BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from atrey.karlin.mff.cuni.cz (atrey.karlin.mff.cuni.cz [195.113.31.123]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2T9wc6p016037 for ; Thu, 29 Mar 2007 02:58:39 -0700 Received: by atrey.karlin.mff.cuni.cz (Postfix, from userid 4043) id BB73AC7C0C; Thu, 29 Mar 2007 11:34:00 +0200 (CEST) Date: Thu, 29 Mar 2007 11:34:00 +0200 From: Jan Kara To: Linda Walsh Cc: Oliver Joa , Eric Sandeen , David Chinner , linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel Message-ID: <20070329093400.GB14616@atrey.karlin.mff.cuni.cz> References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> <460A6298.4040702@j-o-a.de> <460A821B.4080308@sandeen.net> <460AC857.6040305@j-o-a.de> <460B068C.6060903@tlinx.org> <460B25BE.3050808@tlinx.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <460B25BE.3050808@tlinx.org> User-Agent: Mutt/1.5.9i X-archive-position: 10971 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jack@suse.cz Precedence: bulk X-list: xfs Content-Length: 1220 Lines: 26 > Oliver Joa wrote: > >>eason or another, xfs has detected a corrupted on-disk inode format > >>which it cannot recognize, and shuts down. > ---- > Oh, one other thing that may not apply in your case, but may. > Does your SATA disk support write caching? Does it support > something called a barrier function? (not real clear on all > the ways this can go wrong, but I believe barriers are supposed > to guarantee previous data has been fixed on disk (not in write > cache). 
If the SATA controller issues a reset, it may very well > purge the write cache. Theoretically, I can think of a _possibility_, > that the reset disk would purge the write cache and the barrier > indicator would tell xfs to resume writing. From a recent thread > on the xfs list, it would appear this could be a "bad" thing (like > crossing the streams ala "ghostbusters", but in a data-integrity > context). As far as I can remember, barrier does not mean that data is fixed on disk. It is only a command that forces all the writes before the barrier to be performed before all the writes after the barrier. So this is more an ordering restriction than a data integrity thing... Honza -- Jan Kara SuSE CR Labs From owner-xfs@oss.sgi.com Thu Mar 29 04:51:27 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 04:51:31 -0700 (PDT) X-Spam-oss-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_93 autolearn=no version=3.2.0-pre1-r499012 Received: from e6.ny.us.ibm.com (e6.ny.us.ibm.com [32.97.182.146]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TBpP6p011068 for ; Thu, 29 Mar 2007 04:51:27 -0700 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e6.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id l2TBq9Ub012186 for ; Thu, 29 Mar 2007 07:52:09 -0400 Received: from d01av04.pok.ibm.com (d01av04.pok.ibm.com [9.56.224.64]) by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2TBpMb9293154 for ; Thu, 29 Mar 2007 07:51:22 -0400 Received: from d01av04.pok.ibm.com (loopback [127.0.0.1]) by d01av04.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2TBpMb0030817 for ; Thu, 29 Mar 2007 07:51:22 -0400 Received: from amitarora.in.ibm.com ([9.124.31.181]) by d01av04.pok.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2TBpK7A030777; Thu, 29 Mar 2007 07:51:21 -0400 Received: from amitarora.in.ibm.com (localhost.localdomain [127.0.0.1]) by amitarora.in.ibm.com (Postfix) with ESMTP id B59E229EC8B; Thu, 29 
Mar 2007 17:21:26 +0530 (IST) Received: (from amit@localhost) by amitarora.in.ibm.com (8.13.1/8.13.1/Submit) id l2TBpQUU016337; Thu, 29 Mar 2007 17:21:26 +0530 Date: Thu, 29 Mar 2007 17:21:26 +0530 From: "Amit K. Arora" To: torvalds@osdl.org, akpm@linux-foundation.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Interface for the new fallocate() system call Message-ID: <20070329115126.GB7374@amitarora.in.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070321120425.GA27273@amitarora.in.ibm.com> User-Agent: Mutt/1.4.1i X-archive-position: 10972 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: aarora@linux.vnet.ibm.com Precedence: bulk X-list: xfs Content-Length: 1727 Lines: 53 Hello, We need to come up with the best possible layout of arguments for the fallocate() system call. Various architectures have different requirements for how the arguments should look like. Since the mail chain has become huge, here is the summary of various inputs received so far. Platform: s390 -------------- s390 prefers following layout: int fallocate(int fd, loff_t offset, loff_t len, int mode) For details on why and how "int, int, loff_t, loff_t" is a problem on s390, please see Heiko's mail on 16th March. 
Here is the link:
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg133595.html

Platform: ppc, arm
------------------
ppc (32 bit) has a problem with the "int, loff_t, loff_t, int" layout, since this results in a pad between fd and offset, making seven arguments in total, which ppc32 does not support. It supports only 6 arguments. Thus the layout desired by ppc32 is:

   int fallocate(int fd, int mode, loff_t offset, loff_t len)

ARM also prefers this kind of layout. For details please see the definition of sys_arm_sync_file_range().

Option of loff_t => high u32 + low u32
--------------------------------------
Matthew and Russell have suggested another option: breaking each "loff_t" into two "u32"s. This results in 6 arguments in total.

The following people think that this is a good alternative:
	Matthew Wilcox, Russell King, Heiko Carstens
The following do not like this idea:
	Chris Wedgwood

What are your thoughts on this? Which layout should we finalize on?

Perhaps, since the sync_file_range() system call has similar arguments, we can take a hint from the challenges faced in implementing it on various architectures, and decide.

Please suggest. Thanks!
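To make the argument counting above concrete, here is a rough Python sketch. It assumes a simplified rule (my simplification, not taken from any ABI document verbatim) that on ppc32 and ARM a 64-bit argument must start on an even 32-bit slot, and it does not model the s390 constraints at all. The function names are invented for illustration.

```python
def register_slots(arg_sizes, align_64bit=True):
    """Count 32-bit argument slots used by a syscall signature.

    arg_sizes: argument widths in bits (32 or 64), in declaration order.
    align_64bit: if True, a 64-bit argument must start on an even slot
    (an assumed simplification of the ppc32/ARM register-pair rule).
    """
    slots = 0
    for size in arg_sizes:
        if size == 64:
            if align_64bit and slots % 2 == 1:
                slots += 1  # padding slot so the pair starts on an even slot
            slots += 2
        else:
            slots += 1
    return slots


def split_loff(value):
    """Split a 64-bit offset into (high, low) unsigned 32-bit halves."""
    value &= 0xFFFFFFFFFFFFFFFF  # treat the offset as a raw 64-bit pattern
    return value >> 32, value & 0xFFFFFFFF


def join_loff(high, low):
    """Rebuild the 64-bit pattern from its two 32-bit halves."""
    return (high << 32) | low
```

Under that assumption, register_slots([32, 64, 64, 32]) comes out as 7 for "int, loff_t, loff_t, int", while register_slots([32, 32, 64, 64]) and the six-u32 variant both come out as 6, which matches the counts given above.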
-- Regards, Amit Arora From owner-xfs@oss.sgi.com Thu Mar 29 05:07:34 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 05:07:38 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from rgminet02.oracle.com (rgminet02.oracle.com [148.87.113.119]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TC7X6p015022 for ; Thu, 29 Mar 2007 05:07:33 -0700 Received: from rgminet01.oracle.com (rgminet01.oracle.com [148.87.113.118]) by rgminet02.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id l2TBEgl4031466 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 29 Mar 2007 05:14:42 -0600 Received: from rgmgw2.us.oracle.com (rgmgw2.us.oracle.com [138.1.186.111]) by rgminet01.oracle.com (Switch-3.2.4/Switch-3.1.6) with ESMTP id l2TBEaE8015537; Thu, 29 Mar 2007 05:14:36 -0600 Received: from carl.home.kernel.dk (dhcp-ballerup-10-172-131-146.dk.oracle.com [10.172.131.146]) by rgmgw2.us.oracle.com (Switch-3.2.4/Switch-3.1.7) with ESMTP id l2TBEY25001451; Thu, 29 Mar 2007 05:14:35 -0600 Received: by carl.home.kernel.dk (Postfix, from userid 1000) id AF56A942FB; Thu, 29 Mar 2007 13:14:07 +0200 (CEST) Date: Thu, 29 Mar 2007 13:14:07 +0200 From: Jens Axboe To: Jan Kara Cc: Linda Walsh , Oliver Joa , Eric Sandeen , David Chinner , linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel Message-ID: <20070329111407.GA9959@kernel.dk> References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> <460A6298.4040702@j-o-a.de> <460A821B.4080308@sandeen.net> <460AC857.6040305@j-o-a.de> <460B068C.6060903@tlinx.org> <460B25BE.3050808@tlinx.org> <20070329093400.GB14616@atrey.karlin.mff.cuni.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070329093400.GB14616@atrey.karlin.mff.cuni.cz> X-Brightmail-Tracker: AAAAAQAAAAI= X-Brightmail-Tracker: 
AAAAAQAAAAI= X-Brightmail-Tracker: AAAAAQAAAAQ= X-Whitelist: TRUE X-Whitelist: TRUE X-Whitelist: TRUE X-archive-position: 10973 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jens.axboe@oracle.com Precedence: bulk X-list: xfs Content-Length: 1382 Lines: 28

On Thu, Mar 29 2007, Jan Kara wrote:
> > Oliver Joa wrote:
> > >>eason or another, xfs has detected a corrupted on-disk inode format
> > >>which it cannot recognize, and shuts down.
> > ----
> > Oh, one other thing that may not apply in your case, but may.
> > Does your SATA disk support write caching? Does it support
> > something called a barrier function? (not real clear on all
> > the ways this can go wrong, but I believe barriers are supposed
> > to guarantee previous data has been fixed on disk (not in write
> > cache). If the SATA controller issues a reset, it may very well
> > purge the write cache. Theoretically, I can think of a _possibility_,
> > that the reset disk would purge the write cache and the barrier
> > indicator would tell xfs to resume writing. From a recent thread
> > on the xfs list, it would appear this could be a "bad" thing (like
> > crossing the streams ala "ghostbusters", but in a data-integrity
> > context).
> As far as I can remember, barrier does not mean that data is fixed on
> disk. It is only a command that forces all the writes before the barrier
> to be performed before all the writes after the barrier. So this is more
> an ordering restriction than a data integrity thing...

A barrier write guarantees both that the data before the barrier is on disk and that the barrier itself is on disk when completion is signalled.
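As a toy illustration of that guarantee (a sketch of the semantics only, not of how the block layer or any drive actually implements it), consider a disk model with a volatile write cache. The class and method names are invented, and the idea that a reset throws away the cache is an assumption carried over from the speculation earlier in the thread.

```python
class ToyDisk:
    """Toy model of a drive with a volatile write cache (not real hardware)."""

    def __init__(self):
        self.cache = []    # writes acknowledged but not yet on stable media
        self.platter = []  # writes that survive a reset or power loss

    def write(self, block):
        # An ordinary write completes once it reaches the volatile cache.
        self.cache.append(block)

    def barrier_write(self, block):
        # Everything queued before the barrier is made durable first,
        # then the barrier block itself, and only then does it complete.
        self.platter.extend(self.cache)
        self.cache.clear()
        self.platter.append(block)

    def reset(self):
        # Assumption for this sketch: a reset discards the volatile cache.
        self.cache.clear()
```

In this model, writing "data1" and "data2", then a barrier write of "log-commit", then "data3", then a reset leaves data1, data2 and log-commit on the platter. Only the write issued after the barrier is lost.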
-- Jens Axboe From owner-xfs@oss.sgi.com Thu Mar 29 07:56:31 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 07:56:38 -0700 (PDT) X-Spam-oss-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lichtvoll.de (mondschein.lichtvoll.de [194.150.191.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TEuQ6p016012 for ; Thu, 29 Mar 2007 07:56:31 -0700 Received: from localhost (dslb-084-057-114-201.pools.arcor-ip.net [84.57.114.201]) by mail.lichtvoll.de (Postfix) with ESMTP id BD4865AD36 for ; Thu, 29 Mar 2007 16:56:23 +0200 (CEST) From: Martin Steigerwald To: linux-xfs@oss.sgi.com Subject: Re: XFS and write barriers. Date: Thu, 29 Mar 2007 16:56:21 +0200 User-Agent: KMail/1.9.6 References: <17923.11463.459927.628762@notabene.brown> <17927.19372.553410.527506@notabene.brown> <20070326090456.GO32597093@melbourne.sgi.com> In-Reply-To: <20070326090456.GO32597093@melbourne.sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Disposition: inline Message-Id: <200703291656.22084.Martin@lichtvoll.de> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2TEuV6p016024 X-archive-position: 10974 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Martin@lichtvoll.de Precedence: bulk X-list: xfs Content-Length: 1137 Lines: 29 Am Montag 26 März 2007 schrieb David Chinner: > > Is there some mount flag to say "cope without barriers" or "require > > barriers" ?? > > XFs has "-o nobarrier" to say don't use barriers, and this is > *not* the default. If barriers don't work, we drop back to "-o > nobarrier" after leaving a loud warning inthe log.... Hello David! Just a thought, maybe it shouldn't do that automatically, but require the sysadmin to explicitely state "-o nobarrier" in that case. 
Safest default behavior IMHO would be either not to mount at all without "-o nobarrier" if the device has no barrier support or disable the write cache of that device. The latter can be considered a layering violation in itself. BTW XFS copes really well here with commodity hardware such as my ThinkPads with 2.5 inch notebook harddisks *since* 2.6.17.7. But right now I wondered about barrier support on USB connected devices? I have to check whether XFS does barriers on those. Does the usb mass storage driver support barriers? Regards, -- Martin 'Helios' Steigerwald - http://www.Lichtvoll.de GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7 From owner-xfs@oss.sgi.com Thu Mar 29 08:07:23 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 08:07:30 -0700 (PDT) X-Spam-oss-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lichtvoll.de (mondschein.lichtvoll.de [194.150.191.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TF7L6p018552 for ; Thu, 29 Mar 2007 08:07:22 -0700 Received: from localhost (dslb-084-057-114-201.pools.arcor-ip.net [84.57.114.201]) by mail.lichtvoll.de (Postfix) with ESMTP id 4542B5AD36 for ; Thu, 29 Mar 2007 17:07:19 +0200 (CEST) From: Martin Steigerwald To: linux-xfs@oss.sgi.com Subject: cache flush support in SATA drives (was: Re: Questions about XFS) Date: Thu, 29 Mar 2007 17:07:18 +0200 User-Agent: KMail/1.9.6 References: <200703131440.56678.clflush@chello.be> <200703161136.32234.Martin@lichtvoll.de> <20070317004731.GA5236@jdc.local> In-Reply-To: <20070317004731.GA5236@jdc.local> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Disposition: inline Message-Id: <200703291707.18246.Martin@lichtvoll.de> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2TF7N6p018559 X-archive-position: 10975 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: 
xfs-bounce@oss.sgi.com X-original-sender: Martin@lichtvoll.de Precedence: bulk X-list: xfs Content-Length: 1379 Lines: 35 Am Samstag 17 März 2007 schrieb Jason White: > This might be slightly off-topic, but in choosing a SATA drive for a > desktop machine, what features/standard-complaince should one look for > in order to ensure that write barriers work? I know this involves > flushing the drive cache, but is this support mandatory in any of the > applicable standards? Hello Jason! I have no exact idea. I just know that dmesg usually tells you whether cache flushes are supported. But shouldn't modern SATA drives support NCQ anyway? Since NCQ doesn't make any sense without the ability to flush the cache, I *think* any SATA drive with NCQ support should do. NCQ support would allow the block layer to offload the write barrier request ordering at least partly to the device firmware. "ii. For devices which have queue depth greater than 1 but don't support ordered tags, block layer ensures that the requests preceding a barrier request finishes before issuing the barrier request. Also, it defers requests following the barrier until the barrier request is finished. Older SCSI controllers/drives and SATA drives fall in this category." (Documentation/block/barrier.txt of Linux Kernel 2.6.20.4) This also indicates that SATA drives should support NCQ.
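The ordering rule quoted from barrier.txt can be sketched as a small model. This is an illustration only: dispatch_order and the request names are hypothetical, and the sketch assumes that requests in one batch may complete in any order while a batch is issued only after the previous one has fully completed.

```python
def dispatch_order(requests):
    """Group requests into batches in the order a drain-based scheme would issue them.

    `requests` is a list of (name, is_barrier) tuples in submission order.
    """
    batches, current = [], []
    for name, is_barrier in requests:
        if is_barrier:
            if current:
                batches.append(current)   # drain everything queued before the barrier
            batches.append([name])        # the barrier request is issued on its own
            current = []                  # later requests are deferred until it finishes
        else:
            current.append(name)
    if current:
        batches.append(current)
    return batches


batches = dispatch_order(
    [("w1", False), ("w2", False), ("B", True), ("w3", False), ("w4", False)]
)
print(batches)
```

Here w1 and w2 form the first batch, the barrier B is issued alone, and w3 and w4 are held back until B has finished.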
Regards, -- Martin 'Helios' Steigerwald - http://www.Lichtvoll.de GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7 From owner-xfs@oss.sgi.com Thu Mar 29 08:16:17 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 08:16:24 -0700 (PDT) X-Spam-oss-Status: No, score=2.0 required=5.0 tests=BAYES_80 autolearn=no version=3.2.0-pre1-r499012 Received: from an-out-0708.google.com (an-out-0708.google.com [209.85.132.244]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TFGE6p020866 for ; Thu, 29 Mar 2007 08:16:17 -0700 Received: by an-out-0708.google.com with SMTP id c5so181576anc for ; Thu, 29 Mar 2007 08:16:12 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=fZ8LQc5aC7ScBNcjIBnQiFaSlgOwI80+MN8A14kmygaQoNEr++1BeK4KwsBM2vN48xMdjQ9OBy1pm9TmqwVrXbNq0mKjK64naSN/oOGrif9TOdAmOOwZausvuylTK8mUu3zGjwUcVv1qDMNLzT26qlIClnXJP5gB7zLH5XqnZ24= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=YU0NN2nbzoLMKF8R1HNBw72j631DiTIuo/8nq6gh7gacokVuISD8ZEZbORjnwHMEYnqs/ZDXGax3ynwb7uwp2VNTkAi8FfpCDYdY386GII168EptDlSdwQ+fKxoFurTmlobIAXsXghMlyZtEccTY4g1OMz6M7zk2kT5ihHsuoEs= Received: by 10.100.91.6 with SMTP id o6mr353247anb.1175175336735; Thu, 29 Mar 2007 06:35:36 -0700 (PDT) Received: by 10.100.7.9 with HTTP; Thu, 29 Mar 2007 06:35:36 -0700 (PDT) Message-ID: <54fb6c0f0703290635i50cd51ddv16e1dd0941435d77@mail.gmail.com> Date: Thu, 29 Mar 2007 15:35:36 +0200 From: "Mike Machuidel" To: xfs@oss.sgi.com Subject: ISCSI / Xen: EIP at xfs_bmap_local_to_extents MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10976 X-ecartis-version: Ecartis v1.0.0 Sender: 
xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: machuidel@gmail.com Precedence: bulk X-list: xfs Content-Length: 613 Lines: 26 Hi, I am using XFS over ISCSI in Xen and after a while I get Oops's saying "EIP is at xfs_bmap_local_to_extents". The errors occur mainly when opening files (for reading?). For example when using the program "file". ISCSI seems to be working fine. No connection errors occured which means it should behave as a typical block device. I am using the following software: Debian Etch Linux Kernel: 2.6.18 Xen: 3.0.2 (I am not using SMP inside the virtual machine) Oops can be found at: http://www.satl.com/~machuidel/xfs_oops.txt Hope anyone can help to improve this situation. Thanks. Cheers, Mike Machuidel From owner-xfs@oss.sgi.com Thu Mar 29 08:33:24 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 08:33:31 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_20 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2TFXL6p030157 for ; Thu, 29 Mar 2007 08:33:23 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id BAA10576; Fri, 30 Mar 2007 01:19:00 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2TFIxAf42840290; Fri, 30 Mar 2007 02:18:59 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2TFIwPM42855875; Fri, 30 Mar 2007 02:18:58 +1100 (AEDT) Date: Fri, 30 Mar 2007 02:18:58 +1100 From: David Chinner To: Martin Steigerwald Cc: linux-xfs@oss.sgi.com Subject: Re: XFS and write barriers. 
Message-ID: <20070329151858.GI32597093@melbourne.sgi.com> References: <17923.11463.459927.628762@notabene.brown> <17927.19372.553410.527506@notabene.brown> <20070326090456.GO32597093@melbourne.sgi.com> <200703291656.22084.Martin@lichtvoll.de> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <200703291656.22084.Martin@lichtvoll.de> User-Agent: Mutt/1.4.2.1i X-archive-position: 10978 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1017 Lines: 32 On Thu, Mar 29, 2007 at 04:56:21PM +0200, Martin Steigerwald wrote: > Am Montag 26 März 2007 schrieb David Chinner: > > > > Is there some mount flag to say "cope without barriers" or "require > > > barriers" ?? > > > > XFs has "-o nobarrier" to say don't use barriers, and this is > > *not* the default. If barriers don't work, we drop back to "-o > > nobarrier" after leaving a loud warning inthe log.... > > Hello David! > > Just a thought, maybe it shouldn't do that automatically, but require the > sysadmin to explicitely state "-o nobarrier" in that case. And prevent most existing XFS filesystems from mounting after a kernel upgrade? Think about the problems that might cause with XFs root filesystems on hardware/software that doesn't support barriers.... Default behaviour is tolerant - it tries the safest method known and if it can't use that it tells you and then continues onwards. That's a good default to have. Cheers, Dave. 
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Mar 29 08:33:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 08:33:08 -0700 (PDT) X-Spam-oss-Status: No, score=2.1 required=5.0 tests=AWL,BAYES_50,RCVD_IN_PSBL autolearn=no version=3.2.0-pre1-r499012 Received: from smtp2.mundo-r.com (smtp3.mundo-r.com [212.51.32.191]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TFWw6p030001 for ; Thu, 29 Mar 2007 08:32:59 -0700 Received: from cm44039.red83-165.mundo-r.com (HELO [192.168.1.36]) ([83.165.44.39]) by smtp2.mundo-r.com with ESMTP; 29 Mar 2007 17:32:48 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAAx5C0ZTpSwn/2dsb2JhbAAN X-IronPort-AV: i="4.14,347,1170630000"; d="po'?scan'208"; a="81501772:sNHT1419559533" Message-ID: <460BDC1F.9030300@mundo-r.com> Date: Thu, 29 Mar 2007 17:32:47 +0200 From: Antonio Trueba User-Agent: IceDove 1.5.0.10 (X11/20070307) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: New ATTR translations X-Enigmail-Version: 0.94.2.0 Content-Type: multipart/mixed; boundary="------------090700090804050503040904" X-archive-position: 10977 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: atrueba@mundo-r.com Precedence: bulk X-list: xfs Content-Length: 17114 Lines: 511 This is a multi-part message in MIME format. --------------090700090804050503040904 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Hello all, Atached are new translations of attr to Spanish (es) and Galician (gl), both complete against current CVS tree. Regards, -- --------------090700090804050503040904 Content-Type: text/x-gettext-translation; name="es.po" Content-Transfer-Encoding: 8bit Content-Disposition: inline; filename="es.po" # xfsprogs' ATTR package # Copyright (C) 2007 Free Software Foundation # This file is distributed under the same license as the xfsprogs package. 
# Antonio Trueba , 2007 # msgid "" msgstr "" "Project-Id-Version: attr-2.4.37.0\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2007-03-06 12:02+0100\n" "PO-Revision-Date: 2007-03-16 23:29+0100\n" "Last-Translator: Antonio Trueba \n" "Language-Team: Spanish\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=utf-8\n" "Content-Transfer-Encoding: 8bit\n" "X-Poedit-Language: Spanish\n" #: ../attr/attr.c:46 #, c-format msgid "" "Usage: %s [-LRSq] -s attrname [-V attrvalue] pathname # set value\n" " %s [-LRSq] -g attrname pathname # get value\n" " %s [-LRSq] -r attrname pathname # remove attr\n" " %s [-LRq] -l pathname # list attrs \n" " -s reads a value from stdin and -g writes a value to stdout\n" msgstr "" "Uso: %s [-LRSq] -s nomatrib [-V valoratr] ruta # establecer valor\n" " %s [-LRSq] -g nomatrib ruta # obtener valor\n" " %s [-LRSq] -r nomatrib ruta # borrar atributo\n" " %s [-LRq] -l ruta # listar atributos \n" " -s lee un valor de la entrada estándar y -g escribe un valor a la salida estándar\n" #: ../attr/attr.c:83 #: ../attr/attr.c:100 #: ../attr/attr.c:109 #: ../attr/attr.c:118 #, c-format msgid "Only one of -s, -g, -r, or -l allowed\n" msgstr "Sólo está permitido usar uno de -s, -g, -r, o -l\n" #: ../attr/attr.c:91 #, c-format msgid "-V only allowed with -s\n" msgstr "-V sólo está permitido con -s\n" #: ../attr/attr.c:136 #, c-format msgid "Unrecognized option: %c\n" msgstr "Opción no reconocida: %c\n" #: ../attr/attr.c:143 #, c-format msgid "A filename to operate on is required\n" msgstr "Se necesita un nombre de archivo sobre el que operar\n" #: ../attr/attr.c:171 #, c-format msgid "Could not set \"%s\" for %s\n" msgstr "No se pudo establecer \"%s\" para %s\n" #: ../attr/attr.c:176 #, c-format msgid "Attribute \"%s\" set to a %d byte value for %s:\n" msgstr "Atributo \"%s\" establecido al valor byte %d para %s:\n" #: ../attr/attr.c:194 #, c-format msgid "Could not get \"%s\" for %s\n" msgstr "No se pudo obtener \"%s\" para %s\n" #:
../attr/attr.c:199 #, c-format msgid "Attribute \"%s\" had a %d byte value for %s:\n" msgstr "El atributo \"%s\" tenía el valor byte %d para %s:\n" #: ../attr/attr.c:212 #, c-format msgid "Could not remove \"%s\" for %s\n" msgstr "No se pudo eliminar \"%s\" para %s\n" #: ../attr/attr.c:230 #, c-format msgid "Could not list \"%s\" for %s\n" msgstr "No se pudo listar \"%s\" para %s\n" #: ../attr/attr.c:240 #, c-format msgid "Attribute \"%s\" has a %d byte value for %s\n" msgstr "El atributo \"%s\" tiene el valor byte %d para %s\n" #: ../attr/attr.c:252 #, c-format msgid "At least one of -s, -g, -r, or -l is required\n" msgstr "Se necesita al menos uno de -s, -g, -r, o -l\n" #: ../getfattr/getfattr.c:98 #: ../setfattr/setfattr.c:70 msgid "No such attribute" msgstr "Atributo inexistente" #: ../getfattr/getfattr.c:256 #, c-format msgid "%s: Removing leading '/' from absolute path names\n" msgstr "%s: Eliminando '/' inicial en nombres de ruta absolutos\n" #: ../getfattr/getfattr.c:394 #, c-format msgid "%s %s -- get extended attributes\n" msgstr "%s %s -- obtener atributos extendidos\n" #: ../getfattr/getfattr.c:396 #: ../setfattr/setfattr.c:175 #, c-format msgid "Usage: %s %s\n" msgstr "Uso: %s %s\n" #: ../getfattr/getfattr.c:399 #, c-format msgid "" " -n, --name=name get the named extended attribute value\n" " -d, --dump get all extended attribute values\n" " -e, --encoding=... 
encode values (as 'text', 'hex' or 'base64')\n" " --match=pattern only get attributes with names matching pattern\n" " --only-values print the bare values only\n" " -h, --no-dereference do not dereference symbolic links\n" " --absolute-names don't strip leading '/' in pathnames\n" " -R, --recursive recurse into subdirectories\n" " -L, --logical logical walk, follow symbolic links\n" " -P --physical physical walk, do not follow symbolic links\n" " --version print version and exit\n" " --help this help text\n" msgstr "" " -n, --name=nombre obtener el atributo extendido \"nombre\"\n" " -d, --dump obtener valor de todos los atributos extendidos\n" " -e, --encoding=... codificar valores (como 'text', 'hex' o 'base64')\n" " --match=patrón sólo obtener atributos cuyo nombre coincida con \"patrón\"\n" " --only-values sólo mostrar los valores en crudo\n" " -h, --no-dereference no resolver enlaces simbólicos\n" " --absolute-names no eliminar '/' iniciales en nombres de ruta\n" " -R, --recursive recorrer subdirectorios recursivamente\n" " -L, --logical recorrido lógico, siguiendo enlaces simbólicos\n" " -P --physical recorrido físico, no seguir enlaces simbólicos\n" " --version mostrar versión y salir\n" " --help este texto de ayuda\n" #: ../getfattr/getfattr.c:496 #, c-format msgid "%s: invalid regular expression \"%s\"\n" msgstr "%s: expresión regular inválida \"%s\"\n" #: ../getfattr/getfattr.c:514 #, c-format msgid "" "Usage: %s %s\n" "Try `%s --help' for more information.\n" msgstr "" "Uso: %s %s\n" "Escriba `%s --help' para más información.\n" #: ../setfattr/setfattr.c:123 #, c-format msgid "%s: %s: No filename found in line %d, aborting\n" msgstr "%s: %s: No se encontró nombre de archivo en línea %d, abortando\n" #: ../setfattr/setfattr.c:127 #, c-format msgid "%s: No filename found in line %d of standard input, aborting\n" msgstr "%s: No se encontró nombre de archivo en línea %d de entrada estándar, abortando\n" #: ../setfattr/setfattr.c:174 #, c-format msgid "%s
%s -- set extended attributes\n" msgstr "%s %s -- establecer atributos extendidos\n" #: ../setfattr/setfattr.c:176 #, c-format msgid " %s %s\n" msgstr " %s %s\n" #: ../setfattr/setfattr.c:178 #, c-format msgid "" " -n, --name=name set the value of the named extended attribute\n" " -x, --remove=name remove the named extended attribute\n" " -v, --value=value use value as the attribute value\n" " -h, --no-dereference do not dereference symbolic links\n" " --restore=file restore extended attributes\n" " --version print version and exit\n" " --help this help text\n" msgstr "" " -n, --name=nombre establecer valor para el atributo extendido \"nombre\"\n" " -x, --remove=nombre eliminar atributo extendido \"nombre\"\n" " -v, --value=valor usar \"valor\" como el valor del atributo\n" " -h, --no-dereference no resolver enlaces simbólicos\n" " --restore=archivo restaurar atributos extendidos\n" " --version mostrar versión y salir\n" " --help este texto de ayuda\n" #: ../setfattr/setfattr.c:253 #, c-format msgid "" "Usage: %s %s\n" " %s %s\n" "Try `%s --help' for more information.\n" msgstr "" "Uso: %s %s\n" " %s %s\n" "Escriba `%s --help' para más información.\n" #: ../libattr/attr_copy_fd.c:82 #: ../libattr/attr_copy_fd.c:97 #: ../libattr/attr_copy_file.c:80 #: ../libattr/attr_copy_file.c:95 #, c-format msgid "listing attributes of %s" msgstr "listando atributos de %s" #: ../libattr/attr_copy_fd.c:117 #: ../libattr/attr_copy_fd.c:134 #: ../libattr/attr_copy_file.c:115 #: ../libattr/attr_copy_file.c:132 #, c-format msgid "getting attribute %s of %s" msgstr "obteniendo atributo %s de %s" #: ../libattr/attr_copy_fd.c:147 #: ../libattr/attr_copy_fd.c:165 #: ../libattr/attr_copy_file.c:144 #: ../libattr/attr_copy_file.c:163 #, c-format msgid "setting attributes for %s" msgstr "estableciendo atributos para %s" #: ../libattr/attr_copy_fd.c:153 #: ../libattr/attr_copy_file.c:151 #, c-format msgid "setting attribute %s for %s" msgstr "estableciendo atributo %s para %s" 
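Since these entries are flagged c-format, each msgstr has to consume the same printf conversions, in the same order, as its msgid. GNU gettext's `msgfmt --check` performs this check properly; the following is only a rough sketch of the idea. The regular expression and the helper names are assumptions made for illustration, and positional directives such as %1$s are not handled.

```python
import re

# Matches printf-style conversions such as %s, %d, %c; "%%" is a literal percent sign.
DIRECTIVE = re.compile(r"%(?:%|[-+ #0]*\d*(?:\.\d+)?[hlLqjzt]*[diouxXcsfeEgGp])")


def directives(text):
    """Return the conversions found in `text`, in order, ignoring literal '%%'."""
    return [d for d in DIRECTIVE.findall(text) if d != "%%"]


def same_directives(msgid, msgstr):
    """True when the translation consumes the same arguments as the original."""
    return directives(msgid) == directives(msgstr)


# Hypothetical entries, not taken from the catalogue above.
ok = same_directives("%s: line %d\n", "%s: línea %d\n")
bad = same_directives("%s: line %d\n", "%s: %s: línea %d\n")
print(ok, bad)
```

A mismatch like the second pair matters at run time: the translated string would make printf read an argument that the caller never passed.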
--------------090700090804050503040904 Content-Type: text/x-gettext-translation; name="gl.po" Content-Transfer-Encoding: 8bit Content-Disposition: inline; filename="gl.po" # xfsprogs' ATTR package # Copyright (C) 2007 Free Software Foundation # This file is distributed under the same license as the xfsprogs package. # Antonio Trueba , 2007 # msgid "" msgstr "" "Project-Id-Version: attr-2.4.37\n" "Report-Msgid-Bugs-To: \n" "POT-Creation-Date: 2007-03-06 12:02+0100\n" "PO-Revision-Date: 2007-03-16 23:28+0100\n" "Last-Translator: Antonio Trueba \n" "Language-Team: Galician\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=utf-8\n" "Content-Transfer-Encoding: 8bit\n" "X-Poedit-Language: Galician\n" #: ../attr/attr.c:46 #, c-format msgid "" "Usage: %s [-LRSq] -s attrname [-V attrvalue] pathname # set value\n" " %s [-LRSq] -g attrname pathname # get value\n" " %s [-LRSq] -r attrname pathname # remove attr\n" " %s [-LRq] -l pathname # list attrs \n" " -s reads a value from stdin and -g writes a value to stdout\n" msgstr "" "Uso: %s [-LRSq] -s nomatrib [-V valatrib] rota # establecer valor\n" " %s [-LRSq] -g nomatrib rota # obter valor\n" " %s [-LRSq] -r nomatrib rota # borrar atributo\n" " %s [-LRq] -l rota # listar atributos \n" " -s le un valor da entrada estándar e -g escrebe un valor á saída estándar\n" #: ../attr/attr.c:83 #: ../attr/attr.c:100 #: ../attr/attr.c:109 #: ../attr/attr.c:118 #, c-format msgid "Only one of -s, -g, -r, or -l allowed\n" msgstr "Só se permite un de -s, -g, -r, ou -l\n" #: ../attr/attr.c:91 #, c-format msgid "-V only allowed with -s\n" msgstr "-V só está permitido con -s\n" #: ../attr/attr.c:136 #, c-format msgid "Unrecognized option: %c\n" msgstr "Opción non recoñecida: %c\n" #: ../attr/attr.c:143 #, c-format msgid "A filename to operate on is required\n" msgstr "Precísase un nome de ficheiro a tratar\n" #: ../attr/attr.c:171 #, c-format msgid "Could not set \"%s\" for %s\n" msgstr "Non se puido establecer \"%s\" para %s\n" #:
../attr/attr.c:176 #, c-format msgid "Attribute \"%s\" set to a %d byte value for %s:\n" msgstr "Atributo \"%s\" establecido ó valor byte %d para %s:\n" #: ../attr/attr.c:194 #, c-format msgid "Could not get \"%s\" for %s\n" msgstr "Non se puido obter \"%s\" para %s\n" #: ../attr/attr.c:199 #, c-format msgid "Attribute \"%s\" had a %d byte value for %s:\n" msgstr "O atributo \"%s\" tiña o valor byte %d para %s:\n" #: ../attr/attr.c:212 #, c-format msgid "Could not remove \"%s\" for %s\n" msgstr "Non se puido eliminar \"%s\" para %s\n" #: ../attr/attr.c:230 #, c-format msgid "Could not list \"%s\" for %s\n" msgstr "Non se puido listar \"%s\" para %s\n" #: ../attr/attr.c:240 #, c-format msgid "Attribute \"%s\" has a %d byte value for %s\n" msgstr "O atributo \"%s\" ten o valor byte %d para %s\n" #: ../attr/attr.c:252 #, c-format msgid "At least one of -s, -g, -r, or -l is required\n" msgstr "É preciso alomenos un de -s, -g, -r, ou -l\n" #: ../getfattr/getfattr.c:98 #: ../setfattr/setfattr.c:70 msgid "No such attribute" msgstr "Non hai tal atributo" #: ../getfattr/getfattr.c:256 #, c-format msgid "%s: Removing leading '/' from absolute path names\n" msgstr "%s: Borrando '/' iniciais dos nomes de rota absolutos\n" #: ../getfattr/getfattr.c:394 #, c-format msgid "%s %s -- get extended attributes\n" msgstr "%s %s -- obter atributos estendidos\n" #: ../getfattr/getfattr.c:396 #: ../setfattr/setfattr.c:175 #, c-format msgid "Usage: %s %s\n" msgstr "Uso: %s %s\n" #: ../getfattr/getfattr.c:399 #, c-format msgid "" " -n, --name=name get the named extended attribute value\n" " -d, --dump get all extended attribute values\n" " -e, --encoding=... 
encode values (as 'text', 'hex' or 'base64')\n" " --match=pattern only get attributes with names matching pattern\n" " --only-values print the bare values only\n" " -h, --no-dereference do not dereference symbolic links\n" " --absolute-names don't strip leading '/' in pathnames\n" " -R, --recursive recurse into subdirectories\n" " -L, --logical logical walk, follow symbolic links\n" " -P --physical physical walk, do not follow symbolic links\n" " --version print version and exit\n" " --help this help text\n" msgstr "" " -n, --name=nome obter o valor de atributo estendido especificado\n" " -d, --dump obté-lo valor de tódolos atributos estendidos\n" " -e, --encoding=... codificar valores (coma 'text', 'hex' ou 'base64')\n" " --match=patrón só obté-los atributos de nome coincidente co patrón\n" " --only-values só amosá-los valores crus dos atributos\n" " -h, --no-dereference non resolvé-los enlaces simbólicos\n" " --absolute-names non eliminá-los '/' iniciáis en nomes de rota\n" " -R, --recursive recorrer subdirectorios recursivamente\n" " -L, --logical percorrido lóxico, seguindo enlaces simbólicos\n" " -P --physical percorrido físico, non segui-los enlaces simbólicos\n" " --version amosar versión e sair\n" " --help este texto de axuda\n" #: ../getfattr/getfattr.c:496 #, c-format msgid "%s: invalid regular expression \"%s\"\n" msgstr "%s: expresión regular incorrecta: \"%s\"\n" #: ../getfattr/getfattr.c:514 #, c-format msgid "" "Usage: %s %s\n" "Try `%s --help' for more information.\n" msgstr "" "Uso: %s %s\n" "Escriba `%s --help' para obter máis información.\n" #: ../setfattr/setfattr.c:123 #, c-format msgid "%s: %s: No filename found in line %d, aborting\n" msgstr "%s: %s: Non se atopou un nome de ficheiro na liña %d, abortando\n" #: ../setfattr/setfattr.c:127 #, c-format msgid "%s: No filename found in line %d of standard input, aborting\n" msgstr "%s: Non se atopou un nome de ficheiro na liña %d da entrada estándar, abortando\n" #: ../setfattr/setfattr.c:174 #,
c-format msgid "%s %s -- set extended attributes\n" msgstr "%s %s -- establecer atributos estendidos\n" #: ../setfattr/setfattr.c:176 #, c-format msgid " %s %s\n" msgstr " %s %s\n" #: ../setfattr/setfattr.c:178 #, c-format msgid "" " -n, --name=name set the value of the named extended attribute\n" " -x, --remove=name remove the named extended attribute\n" " -v, --value=value use value as the attribute value\n" " -h, --no-dereference do not dereference symbolic links\n" " --restore=file restore extended attributes\n" " --version print version and exit\n" " --help this help text\n" msgstr "" " -n, --name=nome establecé-lo valor do atributo estendido especificado\n" " -x, --remove=nome eliminá-lo atributo estendido especificado\n" " -v, --value=valor usar 'valor' coma o valor do atributo\n" " -h, --no-dereference non resolver enlaces simbólicos\n" " --restore=fich restaurar atributos estendidos\n" " --version amosar versión e sair\n" " --help este texto de axuda\n" #: ../setfattr/setfattr.c:253 #, c-format msgid "" "Usage: %s %s\n" " %s %s\n" "Try `%s --help' for more information.\n" msgstr "" "Uso: %s %s\n" " %s %s\n" "Escriba `%s --help' para obter máis información.\n" #: ../libattr/attr_copy_fd.c:82 #: ../libattr/attr_copy_fd.c:97 #: ../libattr/attr_copy_file.c:80 #: ../libattr/attr_copy_file.c:95 #, c-format msgid "listing attributes of %s" msgstr "listando atributos de %s" #: ../libattr/attr_copy_fd.c:117 #: ../libattr/attr_copy_fd.c:134 #: ../libattr/attr_copy_file.c:115 #: ../libattr/attr_copy_file.c:132 #, c-format msgid "getting attribute %s of %s" msgstr "obtendo atributo %s de %s" #: ../libattr/attr_copy_fd.c:147 #: ../libattr/attr_copy_fd.c:165 #: ../libattr/attr_copy_file.c:144 #: ../libattr/attr_copy_file.c:163 #, c-format msgid "setting attributes for %s" msgstr "establecendo atributos para %s" #: ../libattr/attr_copy_fd.c:153 #: ../libattr/attr_copy_file.c:151 #, c-format msgid "setting attribute %s for %s" msgstr "establecendo atributo %s para %s" 
--------------090700090804050503040904-- From owner-xfs@oss.sgi.com Thu Mar 29 09:35:28 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 09:35:37 -0700 (PDT) X-Spam-oss-Status: No, score=-0.3 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_93 autolearn=no version=3.2.0-pre1-r499012 Received: from smtp112.sbc.mail.mud.yahoo.com (smtp112.sbc.mail.mud.yahoo.com [68.142.198.211]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2TGZP6p013100 for ; Thu, 29 Mar 2007 09:35:26 -0700 Received: (qmail 27494 invoked from network); 29 Mar 2007 16:35:23 -0000 Received: from unknown (HELO stupidest.org) (cwedgwood@sbcglobal.net@24.5.75.45 with login) by smtp112.sbc.mail.mud.yahoo.com with SMTP; 29 Mar 2007 16:35:22 -0000 Received: by tuatara.stupidest.org (Postfix, from userid 10000) id 6F5281826129; Thu, 29 Mar 2007 09:35:20 -0700 (PDT) Date: Thu, 29 Mar 2007 09:35:20 -0700 From: Chris Wedgwood To: "Amit K. Arora" Cc: torvalds@osdl.org, akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call Message-ID: <20070329163520.GA16632@tuatara.stupidest.org> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070329115126.GB7374@amitarora.in.ibm.com> X-archive-position: 10980 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cw@f00f.org Precedence: bulk X-list: xfs Content-Length: 1166 Lines: 
31 On Thu, Mar 29, 2007 at 05:21:26PM +0530, Amit K. Arora wrote: > int fallocate(int fd, loff_t offset, loff_t len, int mode) Right now there are only two possible values for mode --- it's not clear what additional values there will be in the future. How about two syscalls? If we decide later on we need something more complicated we can revisit this and *THEN* add another system call which may end up being a superset of the other two. I know that sounds somewhat icky but: * it's fairly simple * we get nice argument handling on all arches by dropping u32 mode (don't we?) * syscalls don't really cost a lot to keep around, they do cost in terms of maintenance though, but in this case I don't see it being all that much of a problem * IMO badly/over designed syscalls are going to be a bigger problem long term Given that *NO* single fs in mainline right now can *reliably* use this functionality for a while maybe whatever solution people come up with next should sit in -mm for a while? At least that gives people exposure to it and a chance to make some changes as once it's merged to mainline it's pretty hard to change. From owner-xfs@oss.sgi.com Thu Mar 29 09:49:50 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 09:49:55 -0700 (PDT) X-Spam-oss-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lichtvoll.de (mondschein.lichtvoll.de [194.150.191.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TGnm6p016654 for ; Thu, 29 Mar 2007 09:49:50 -0700 Received: from localhost (dslb-084-057-114-201.pools.arcor-ip.net [84.57.114.201]) by mail.lichtvoll.de (Postfix) with ESMTP id 9D24B5AD36; Thu, 29 Mar 2007 18:49:45 +0200 (CEST) From: Martin Steigerwald To: linux-xfs@oss.sgi.com Subject: Re: XFS and write barriers.
Date: Thu, 29 Mar 2007 18:49:38 +0200 User-Agent: KMail/1.9.6 Cc: David Chinner References: <17923.11463.459927.628762@notabene.brown> <200703291656.22084.Martin@lichtvoll.de> <20070329151858.GI32597093@melbourne.sgi.com> In-Reply-To: <20070329151858.GI32597093@melbourne.sgi.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart5603208.FTvvu8bBA2"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200703291849.44297.Martin@lichtvoll.de> X-archive-position: 10981 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Martin@lichtvoll.de Precedence: bulk X-list: xfs Content-Length: 1954 Lines: 58 --nextPart5603208.FTvvu8bBA2 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Am Donnerstag 29 M=E4rz 2007 schrieb David Chinner: > On Thu, Mar 29, 2007 at 04:56:21PM +0200, Martin Steigerwald wrote: > > Am Montag 26 M=E4rz 2007 schrieb David Chinner: > > > > Is there some mount flag to say "cope without barriers" or > > > > "require barriers" ?? > > > > > > XFs has "-o nobarrier" to say don't use barriers, and this is > > > *not* the default. If barriers don't work, we drop back to "-o > > > nobarrier" after leaving a loud warning inthe log.... > > > > Hello David! > > > > Just a thought, maybe it shouldn't do that automatically, but require > > the sysadmin to explicitely state "-o nobarrier" in that case. > > And prevent most existing XFS filesystems from mounting after > a kernel upgrade? Think about the problems that might cause > with XFs root filesystems on hardware/software that doesn't > support barriers.... Hello David! Granted. So it might turn out to be a decision between does not boot or is= =20 not totally safe in power outages or crashes. I see no easy default=20 answer to that. 
So while it is probably a layering violation, at least trying to disable the write cache on devices without cache flush support, unless "-o nobarrier" (as in "I know what I am doing") is given, might help safety. But this adds complexity and a possible source of bugs. And maybe trying to disable the write cache isn't safe on all setups?

Regards,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7

--nextPart5603208.FTvvu8bBA2
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQBGC+4omRvqrKWZhMcRAqBqAJ9aGM4oYzmMEyfYl3OpLCz/0jacIQCfUBeQ
L4Oc/JUnOhYE/bkd78nHygk=
=EVVf
-----END PGP SIGNATURE-----

--nextPart5603208.FTvvu8bBA2--

From owner-xfs@oss.sgi.com Thu Mar 29 10:36:15 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 10:36:23 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_93 autolearn=no version=3.2.0-pre1-r499012 Received: from tmailer.gwdg.de (tmailer.gwdg.de [134.76.10.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2THaD6p026706 for ; Thu, 29 Mar 2007 10:36:14 -0700 Received: from linux01.gwdg.de ([134.76.13.21]) by mailer.gwdg.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.66) (envelope-from ) id 1HWy9G-0005fd-Sn; Thu, 29 Mar 2007 19:10:55 +0200 Received: from linux01.gwdg.de (localhost [127.0.0.1]) by linux01.gwdg.de (8.13.3/8.13.3/SuSE Linux 0.7) with ESMTP id l2TH1sfQ023759; Thu, 29 Mar 2007 19:01:56 +0200 Received: from localhost (jengelh@localhost) by linux01.gwdg.de (8.13.3/8.13.3/Submit) with ESMTP id l2TH1saw023753; Thu, 29 Mar 2007 19:01:54 +0200 Date: Thu, 29 Mar 2007 19:01:54 +0200 (MEST) From: Jan Engelhardt To: "Amit K.
Arora" cc: torvalds@osdl.org, akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call In-Reply-To: <20070329115126.GB7374@amitarora.in.ibm.com> Message-ID: References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 10984 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jengelh@linux01.gwdg.de Precedence: bulk X-list: xfs Content-Length: 1903 Lines: 57 Hi, On Mar 29 2007 17:21, Amit K. Arora wrote: > >We need to come up with the best possible layout of arguments for the >fallocate() system call. Various architectures have different >requirements for how the arguments should look like. Since the mail >chain has become huge, here is the summary of various inputs received >so far. >s390 prefers following layout: > int fallocate(int fd, loff_t offset, loff_t len, int mode) >For details on why and how "int, int, loff_t, loff_t" is a problem on >s390, please see Heiko's mail on 16th March. Here is the link: >http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg133595.html Quoting that... |len -> r6 + second halve on stack Then, is not this a gcc glitch? (IMO, it should put all of "len" on the stack) >Platform: ppc, arm >------------------ >6 arguments. 
Thus the desired layout by ppc32 is: > int fallocate(int fd, int mode, loff_t offset, loff_t len) > >Option of loff_t => high u32 + low u32 >-------------------------------------- >Matthew and Russell have suggested another option of breaking each >"loff_t" into two "u32"s. This will result in 6 arguments in total. > >What are your thoughts on this ? What layout should we finalize on ? >Perhaps, since sync_file_range() system call has similar arguments, we >can take hint from the challenges faced on implementing it on various >architectures, and decide. > >Please suggest. Thanks! Does it actually matter? Glibc can have its own argument ordering different from the syscalls, so at least it would be possible to lay out the syscall arguments in the most portable way while retaining nice userspace C code. Hey, glibc might even wrap it up in a struct! (Using a pointer, as suggested in one of the proposals.) int fallocate(int fd, loff_t offset, loff_t len, int mode) { struct fallocate_foobar d = {fd, offset, len, mode}; return _syscall(..., &d); } Jan -- From owner-xfs@oss.sgi.com Thu Mar 29 10:40:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 10:40:45 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=BAYES_50,J_CHICKENPOX_93 autolearn=no version=3.2.0-pre1-r499012 Received: from smtp.osdl.org (smtp.osdl.org [65.172.181.24]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2THee6p027854 for ; Thu, 29 Mar 2007 10:40:41 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id l2THABU2023848 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 29 Mar 2007 10:10:11 -0700 Received: from box (shell0.pdx.osdl.net [10.9.0.31]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with SMTP id l2THAAOM007528; Thu, 29 Mar 2007 10:10:10 -0700 Date: Thu, 29 Mar 2007 10:10:10 -0700 From: Andrew Morton To: "Amit K. 
Arora" Cc: torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call Message-Id: <20070329101010.7a2b8783.akpm@linux-foundation.org> In-Reply-To: <20070329115126.GB7374@amitarora.in.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> X-Mailer: Sylpheed version 2.2.7 (GTK+ 2.8.17; x86_64-unknown-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.177 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 10985 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: akpm@linux-foundation.org Precedence: bulk X-list: xfs Content-Length: 2118 Lines: 58 On Thu, 29 Mar 2007 17:21:26 +0530 "Amit K. Arora" wrote: > Hello, > > We need to come up with the best possible layout of arguments for the > fallocate() system call. Various architectures have different > requirements for how the arguments should look like. Since the mail > chain has become huge, here is the summary of various inputs received > so far. > > Platform: s390 > -------------- > s390 prefers following layout: > > int fallocate(int fd, loff_t offset, loff_t len, int mode) > > For details on why and how "int, int, loff_t, loff_t" is a problem on > s390, please see Heiko's mail on 16th March. 
Here is the link:
> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg133595.html
>
> Platform: ppc, arm
> ------------------
> ppc (32 bit) has a problem with "int, loff_t, loff_t, int" layout,
> since this will result in a pad between fd and offset, making seven
> arguments total - which is not supported by ppc32. It supports only
> 6 arguments. Thus the desired layout by ppc32 is:
>
> int fallocate(int fd, int mode, loff_t offset, loff_t len)
>
> Even ARM prefers above kind of layout. For details please see the
> definition of sys_arm_sync_file_range().

This is a clean-looking option. Can s390 be changed to support seven-arg syscalls?

> Option of loff_t => high u32 + low u32
> --------------------------------------
> Matthew and Russell have suggested another option of breaking each
> "loff_t" into two "u32"s. This will result in 6 arguments in total.
>
> Following think that this is a good alternative:
> Matthew Wilcox, Russell King, Heiko Carstens
>
> Following do not like this idea:
> Chris Wedgwood

It's a bit weird-looking, but the six-32-bit-args approach is simple enough to understand and implement. Presumably the glibc wrapper would hide that detail from everyone.

>
> What are your thoughts on this ? What layout should we finalize on ?
> Perhaps, since sync_file_range() system call has similar arguments, we
> can take hint from the challenges faced on implementing it on various
> architectures, and decide.
> From owner-xfs@oss.sgi.com Thu Mar 29 10:49:30 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 10:49:37 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=BAYES_50,J_CHICKENPOX_93 autolearn=no version=3.2.0-pre1-r499012 Received: from odyssey.analogic.com (odyssey.analogic.com [204.178.40.5]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2THnR6p030520 for ; Thu, 29 Mar 2007 10:49:29 -0700 Received: from chaos.analogic.com ([10.112.50.11]) by phoenix.analogic.com with Microsoft SMTPSVC(6.0.3790.211); Thu, 29 Mar 2007 13:18:53 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-MimeOLE: Produced By Microsoft Exchange V6.5 Received: from chaos.analogic.com (localhost [127.0.0.1]) by chaos.analogic.com (8.12.11/8.12.11) with ESMTP id l2THIr0l004373; Thu, 29 Mar 2007 13:18:53 -0400 Received: (from linux-os@localhost) by chaos.analogic.com (8.12.11/8.12.11/Submit) id l2THIrCA004372; Thu, 29 Mar 2007 13:18:53 -0400 X-OriginalArrivalTime: 29 Mar 2007 17:18:53.0407 (UTC) FILETIME=[533B56F0:01C77226] Content-class: urn:content-classes:message Subject: Re: Interface for the new fallocate() system call Date: Thu, 29 Mar 2007 13:18:53 -0400 Message-ID: In-Reply-To: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Interface for the new fallocate() system call thread-index: AcdyJlNCqkMDRiU0Q1y6AxdFfz37oQ== References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> From: "linux-os \(Dick Johnson\)" To: "Jan Engelhardt" Cc: "Amit K. 
Arora" , , , , , , , , Reply-To: "linux-os \(Dick Johnson\)" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id l2THnU6p030540 X-archive-position: 10986 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: linux-os@analogic.com Precedence: bulk X-list: xfs Content-Length: 2841 Lines: 72 On Thu, 29 Mar 2007, Jan Engelhardt wrote: > Hi, > > On Mar 29 2007 17:21, Amit K. Arora wrote: >> >> We need to come up with the best possible layout of arguments for the >> fallocate() system call. Various architectures have different >> requirements for how the arguments should look like. Since the mail >> chain has become huge, here is the summary of various inputs received >> so far. > >> s390 prefers following layout: >> int fallocate(int fd, loff_t offset, loff_t len, int mode) >> For details on why and how "int, int, loff_t, loff_t" is a problem on >> s390, please see Heiko's mail on 16th March. Here is the link: >> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg133595.html > > Quoting that... > |len -> r6 + second halve on stack > > Then, is not this a gcc glitch? (IMO, it should put all of "len" on the > stack) > >> Platform: ppc, arm >> ------------------ >> 6 arguments. Thus the desired layout by ppc32 is: >> int fallocate(int fd, int mode, loff_t offset, loff_t len) >> >> Option of loff_t => high u32 + low u32 >> -------------------------------------- >> Matthew and Russell have suggested another option of breaking each >> "loff_t" into two "u32"s. This will result in 6 arguments in total. >> >> What are your thoughts on this ? What layout should we finalize on ? >> Perhaps, since sync_file_range() system call has similar arguments, we >> can take hint from the challenges faced on implementing it on various >> architectures, and decide. >> >> Please suggest. Thanks! > > Does it actually matter? 
Glibc can have its own argument ordering > different from the syscalls, so at least it would be possible to lay out > the syscall arguments in the most portable way while retaining nice > userspace C code. Hey, glibc might even wrap it up in a struct! (Using a > pointer, as suggested in one of the proposals.) > > int fallocate(int fd, loff_t offset, loff_t len, int mode) > { > struct fallocate_foobar d = {fd, offset, len, mode}; > return _syscall(..., &d); > } > > Jan > -- I think it's always better to put only a pointer on the stack as above. Cheers, Dick Johnson Penguin : Linux version 2.6.16.24 on an i686 machine (5592.62 BogoMips). New book: http://www.AbominableFirebug.com/
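
[Archive note] The "high u32 + low u32" option discussed in this thread can be sketched in plain C. This is only an illustration under stated assumptions, not the interface that was eventually merged: every name below (demo_loff_t, demo_args6, demo_pack, demo_join and so on) is hypothetical and does not come from the kernel or glibc. The sketch shows a userspace wrapper splitting each 64-bit value into two 32-bit halves so that six register-sized arguments cross the boundary, and the receiving side joining them again.

```c
#include <stdint.h>

/* All names here are hypothetical; this is not kernel or glibc code. */
typedef int64_t demo_loff_t;

/* The six 32-bit values that would be passed by value. */
struct demo_args6 {
	int32_t  fd;
	int32_t  mode;
	uint32_t off_hi, off_lo;
	uint32_t len_hi, len_lo;
};

/* Upper 32 bits of a 64-bit offset. */
uint32_t demo_hi32(demo_loff_t v)
{
	return (uint32_t)((uint64_t)v >> 32);
}

/* Lower 32 bits of a 64-bit offset. */
uint32_t demo_lo32(demo_loff_t v)
{
	return (uint32_t)((uint64_t)v & 0xffffffffu);
}

/* Receiving side: rebuild the 64-bit value from its two halves. */
demo_loff_t demo_join(uint32_t hi, uint32_t lo)
{
	return (demo_loff_t)(((uint64_t)hi << 32) | (uint64_t)lo);
}

/* Wrapper side: keep the "nice" four-argument C signature for callers. */
struct demo_args6 demo_pack(int fd, demo_loff_t offset, demo_loff_t len,
			    int mode)
{
	struct demo_args6 a = {
		fd, mode,
		demo_hi32(offset), demo_lo32(offset),
		demo_hi32(len), demo_lo32(len)
	};
	return a;
}
```

Because every value travels in a register-sized slot, no alignment padding is introduced between arguments, which is the property the ppc32 and ARM discussion above is concerned with.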
From owner-xfs@oss.sgi.com Thu Mar 29 11:17:57 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 11:18:02 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from tmailer.gwdg.de (tmailer.gwdg.de [134.76.10.23]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TIHu6p005322 for ; Thu, 29 Mar 2007 11:17:57 -0700 Received: from linux01.gwdg.de ([134.76.13.21]) by mailer.gwdg.de with esmtps (TLSv1:AES256-SHA:256) (Exim 4.66) (envelope-from ) id 1HWzAt-0000MW-Ux; Thu, 29 Mar 2007 20:16:40 +0200 Received: from linux01.gwdg.de (localhost [127.0.0.1]) by linux01.gwdg.de (8.13.3/8.13.3/SuSE Linux 0.7) with ESMTP id l2TI5PES029241; Thu, 29 Mar 2007 20:05:27 +0200 Received: from localhost (jengelh@localhost) by linux01.gwdg.de (8.13.3/8.13.3/Submit) with ESMTP id l2TI5P51029235; Thu, 29 Mar 2007 20:05:25 +0200 Date: Thu, 29 Mar 2007 20:05:24 +0200 (MEST) From: Jan Engelhardt To: "linux-os (Dick Johnson)" cc: "Amit K. 
Arora" , torvalds@osdl.org, akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call In-Reply-To: Message-ID: References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 10987 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jengelh@linux01.gwdg.de Precedence: bulk X-list: xfs Content-Length: 515 Lines: 17 On Mar 29 2007 13:18, linux-os (Dick Johnson) wrote: > >I think it's always better to put only a pointer on the stack as >above. I have to disagree, since wrapping it into a struct and copying the struct in kernelspace from userspace requires more code. Pointers only become useful at 3 (rarely) or 4 (yeah, more likely) and 5+ (definitely) arguments, (3) see above about copying, (4) middle thing and (5) tons of arguments like mmap() should be wrapped up... for simplicity of dealing with it later. 
Jan -- From owner-xfs@oss.sgi.com Thu Mar 29 11:55:43 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 11:55:50 -0700 (PDT) X-Spam-oss-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_40 autolearn=ham version=3.2.0-pre1-r499012 Received: from smtp.osdl.org (smtp.osdl.org [65.172.181.24]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2TItf6p012971 for ; Thu, 29 Mar 2007 11:55:42 -0700 Received: from shell0.pdx.osdl.net (fw.osdl.org [65.172.181.6]) by smtp.osdl.org (8.12.8/8.12.8) with ESMTP id l2TIb5U2026535 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 29 Mar 2007 11:37:05 -0700 Received: from localhost (shell0.pdx.osdl.net [10.9.0.31]) by shell0.pdx.osdl.net (8.13.1/8.11.6) with ESMTP id l2TIb3ex009285; Thu, 29 Mar 2007 11:37:03 -0700 Date: Thu, 29 Mar 2007 11:37:03 -0700 (PDT) From: Linus Torvalds To: Jan Engelhardt cc: "linux-os (Dick Johnson)" , "Amit K. Arora" , akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call In-Reply-To: Message-ID: References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-MIMEDefang-Filter: osdl$Revision: 1.177 $ X-Scanned-By: MIMEDefang 2.36 X-archive-position: 10988 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: torvalds@linux-foundation.org Precedence: bulk X-list: xfs Content-Length: 691 Lines: 20 On Thu, 29 Mar 2007, Jan Engelhardt wrote: > > 
I have to disagree, since wrapping it into a struct and copying the struct > in kernelspace from userspace requires more code. Not just more code, but more security issues too. Passing system call arguments by value means that there are no subtle security issues - the value you use is the value you got. But once you pass-by-reference, you have to make damn sure that you do the proper user space accesses and verify the pointer correctly. User-space (aka "user-supplied") pointers are just more dangerous. We obviously can't avoid them, but they need much more care than just a random value directly passed in a register. Linus From owner-xfs@oss.sgi.com Thu Mar 29 16:33:32 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 16:33:36 -0700 (PDT) X-Spam-oss-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2TNXU6p004715 for ; Thu, 29 Mar 2007 16:33:31 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA27417; Fri, 30 Mar 2007 09:33:24 +1000 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id l2TNXNAf42501678; Fri, 30 Mar 2007 10:33:24 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id l2TNXNKj43075328; Fri, 30 Mar 2007 10:33:23 +1100 (AEDT) Date: Fri, 30 Mar 2007 10:33:23 +1100 From: David Chinner To: xfs-dev Cc: xfs-oss Subject: Review: Make xfs_dm_sync_by_handle really sync data Message-ID: <20070329233323.GM32597093@melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-archive-position: 10989 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com 
Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 1581 Lines: 61

xfs_dm_sync_by_handle doesn't sync data right now. Never has.
It is supposed to work exactly like fsync(), but it only ever
calls XFS functions that log the inode and not the generic
functions that actually sync out the data.

Fix it.

--
Dave Chinner
Principal Engineer
SGI Australian Software Group

---
 fs/xfs/dmapi/xfs_dm.c |   18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/dmapi/xfs_dm.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/dmapi/xfs_dm.c	2007-03-30 09:02:07.000000000 +1000
+++ 2.6.x-xfs-new/fs/xfs/dmapi/xfs_dm.c	2007-03-30 09:28:54.348239386 +1000
@@ -3054,22 +3054,36 @@ xfs_dm_symlink_by_handle(
 }
 
 
+/*
+ * xfs_dm_sync_by_handle needs to do the same thing as sys_fsync()
+ */
 STATIC int
 xfs_dm_sync_by_handle(
 	struct inode	*inode,
 	dm_right_t	right)
 {
+	int		err, ret;
 	bhv_vnode_t	*vp = vn_from_inode(inode);
 
 	/* Returns negative errors to DMAPI */
-
 	if (right < DM_RIGHT_EXCL)
 		return(-EACCES);
 
+	/* We need to protect against concurrent writers.. */
+	ret = filemap_fdatawrite(inode->i_mapping);
+	down_rw_sems(inode, DM_FLAGS_IMUX);
+	err = bhv_vop_fsync(vp, FSYNC_WAIT, NULL, (xfs_off_t)0,(xfs_off_t)-1);
+	if (!ret)
+		ret = err;
+	up_rw_sems(inode, DM_FLAGS_IMUX);
+	err = filemap_fdatawait(inode->i_mapping);
+	if (!ret)
+		ret = err;
+
 	if (VN_TRUNC(vp))
 		VUNTRUNCATE(vp);
 
-	return -bhv_vop_fsync(vp, FSYNC_WAIT, NULL, (xfs_off_t)0,(xfs_off_t)-1);
+	return(-ret); /* Return negative error to DMAPI */
 }

From owner-xfs@oss.sgi.com Thu Mar 29 17:46:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 17:46:07 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2U0k16p013360 for ; Thu, 29 Mar 2007 17:46:03 -0700 Received: from boing.melbourne.sgi.com (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA29637; Fri, 30 Mar 2007 10:45:53 +1000 Date: Fri, 30 Mar 2007 10:46:42 +1100 From: Timothy Shimmin To: Antonio Trueba , xfs@oss.sgi.com Subject: Re: New ATTR translations Message-ID: <227C987A103A946C20981C05@timothy-shimmins-power-mac-g5.local> In-Reply-To: <460BDC1F.9030300@mundo-r.com> References: <460BDC1F.9030300@mundo-r.com> X-Mailer: Mulberry/4.0.8 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 10991 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 272 Lines: 18

I will check them in.
Thanks.

--Tim

--On 29 March 2007 5:32:47 PM +0200 Antonio Trueba wrote:

> Hello all,
>
> Attached are new translations of attr to Spanish (es) and Galician (gl),
> both complete against current CVS tree.
> > Regards, > -- From owner-xfs@oss.sgi.com Thu Mar 29 17:46:12 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 17:46:18 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2U0kA6p013392 for ; Thu, 29 Mar 2007 17:46:11 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA29655 for ; Fri, 30 Mar 2007 10:46:07 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 1116) id 5037658FF808; Fri, 30 Mar 2007 10:46:07 +1000 (EST) To: xfs@oss.sgi.com Subject: TAKE attr translations - Spanish and Galician Message-Id: <20070330004607.5037658FF808@chook.melbourne.sgi.com> Date: Fri, 30 Mar 2007 10:46:07 +1000 (EST) From: tes@sgi.com (Tim Shimmin) X-archive-position: 10992 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Content-Length: 1142 Lines: 33 Attr translations for Spanish (es) and Galician (gl). Contributed by Antonio Trueba. --Tim Date: Fri Mar 30 10:44:37 AEST 2007 Workarea: chook.melbourne.sgi.com:/build/tes/xfs-cmds Inspected by: atrueba@mundo-r.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/xfs-cmds/master-melb Modid: master-melb:xfs-cmds:28323a attr/po/gl.po - 1.1 - new http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/attr/po/gl.po - Add Galician translation. attr/po/es.po - 1.1 - new http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/attr/po/es.po - Add Spanish translation. attr/VERSION - 1.68 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/attr/VERSION.diff?r1=text&tr1=1.68&r2=text&tr2=1.67&f=h - Bump version# for Attr translations for Spanish (es) and Galician (gl). 
attr/doc/CHANGES - 1.80 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/attr/doc/CHANGES.diff?r1=text&tr1=1.80&r2=text&tr2=1.79&f=h attr/po/Makefile - 1.10 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-cmds/attr/po/Makefile.diff?r1=text&tr1=1.10&r2=text&tr2=1.9&f=h - Add Attr translations for Spanish (es) and Galician (gl). From owner-xfs@oss.sgi.com Thu Mar 29 18:03:25 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 18:03:28 -0700 (PDT) X-Spam-oss-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2U13M6p016348 for ; Thu, 29 Mar 2007 18:03:23 -0700 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA29634; Fri, 30 Mar 2007 10:45:49 +1000 Message-ID: <460C5E38.5040006@sgi.com> Date: Fri, 30 Mar 2007 10:47:52 +1000 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.10 (X11/20070221) MIME-Version: 1.0 To: David Chinner CC: xfs-dev , xfs-oss Subject: Re: Review: Make xfs_dm_sync_by_handle really sync data References: <20070329233323.GM32597093@melbourne.sgi.com> In-Reply-To: <20070329233323.GM32597093@melbourne.sgi.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10993 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 346 Lines: 15 It is looking good Dave. I will test it today. Regards, Vlad David Chinner wrote: > xfs_dm_sync_by_handle doesn't sync data right now. Never has. > It is supposed to work exactly like fsync(), except that it > only ever calls XFS functions that log the inode and > not the generic functions that actually sync out the data. > > Fix it. 
> > From owner-xfs@oss.sgi.com Thu Mar 29 18:32:27 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 18:32:29 -0700 (PDT) X-Spam-oss-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2U1WO6p021325 for ; Thu, 29 Mar 2007 18:32:25 -0700 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA01131; Fri, 30 Mar 2007 11:32:18 +1000 Message-ID: <460C691D.9080100@sgi.com> Date: Fri, 30 Mar 2007 11:34:21 +1000 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.10 (X11/20070221) MIME-Version: 1.0 To: xfs-dev CC: xfs-oss Subject: Review: do not hold dm_reg_lock spinlock when calling dm_add_fsys_entry() Content-Type: multipart/mixed; boundary="------------010501070808020005050807" X-archive-position: 10994 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 1067 Lines: 43 This is a multi-part message in MIME format. --------------010501070808020005050807 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit dm_reg_lock spinlock has been held when calling dm_add_fsys_entry() Attached is a fix. 
Regards,
Vlad

--------------010501070808020005050807
Content-Type: text/x-patch; name="sleeping-in-atomic-in-dmapi.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="sleeping-in-atomic-in-dmapi.patch"

Index: linux/fs/dmapi/dmapi_register.c
===================================================================
--- linux.orig/fs/dmapi/dmapi_register.c
+++ linux/fs/dmapi/dmapi_register.c
@@ -252,6 +252,7 @@ dm_add_fsys_entry(
 	fsrp->fr_next = dm_registers;
 	dm_registers = fsrp;
 	dm_fsys_cnt++;
+	mutex_spinunlock(&dm_reg_lock, lc);
 #ifdef CONFIG_PROC_FS
 	{
 	char buf[100];
@@ -262,7 +263,6 @@ dm_add_fsys_entry(
 		entry->owner = THIS_MODULE;
 	}
 #endif
-	mutex_spinunlock(&dm_reg_lock, lc);
 	return(0);
 }
 

--------------010501070808020005050807--

From owner-xfs@oss.sgi.com Thu Mar 29 23:49:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 29 Mar 2007 23:49:45 -0700 (PDT) X-Spam-oss-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2U6nb6p004921 for ; Thu, 29 Mar 2007 23:49:39 -0700 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA10000; Fri, 30 Mar 2007 16:35:59 +1000 Message-ID: <460CB049.5070105@sgi.com> Date: Fri, 30 Mar 2007 16:38:01 +1000 From: Vlad Apostolov User-Agent: Thunderbird 1.5.0.10 (X11/20070221) MIME-Version: 1.0 To: sgi.bugs.xfs@engr.sgi.com CC: linux-xfs@oss.sgi.com Subject: TAKE 962866 - sleeping in atomic in dmapi Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 10995 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs Content-Length: 523 Lines: 15

Date: Fri Mar 30 16:33:02 AEST
2007 Workarea: soarer.melbourne.sgi.com:/home/vapo/isms/linux-xfs Inspected by: dgc Author: vapo The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: linux-melb:dmapi:28328a fs/dmapi/dmapi_register.c - 1.51 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/linux-2.6-xfs/fs/dmapi/dmapi_register.c.diff?r1=text&tr1=1.51&r2=text&tr2=1.50&f=h - pv 962866, rv dgc - do not hold dm_reg_lock while calling create_proc_read_entry() From owner-xfs@oss.sgi.com Fri Mar 30 00:00:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 00:00:33 -0700 (PDT) X-Spam-oss-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_20, J_CHICKENPOX_93 autolearn=no version=3.2.0-pre1-r499012 Received: from mtagate7.uk.ibm.com (mtagate7.uk.ibm.com [195.212.29.140]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2U70K6p009382 for ; Fri, 30 Mar 2007 00:00:22 -0700 Received: from d06nrmr1407.portsmouth.uk.ibm.com (d06nrmr1407.portsmouth.uk.ibm.com [9.149.38.185]) by mtagate7.uk.ibm.com (8.13.8/8.13.8) with ESMTP id l2U70I6Z109950 for ; Fri, 30 Mar 2007 07:00:18 GMT Received: from d06av01.portsmouth.uk.ibm.com (d06av01.portsmouth.uk.ibm.com [9.149.37.212]) by d06nrmr1407.portsmouth.uk.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2U70H672773026 for ; Fri, 30 Mar 2007 08:00:17 +0100 Received: from d06av01.portsmouth.uk.ibm.com (loopback [127.0.0.1]) by d06av01.portsmouth.uk.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2U70GxP004510 for ; Fri, 30 Mar 2007 08:00:17 +0100 Received: from localhost (dyn-9-152-198-39.boeblingen.de.ibm.com [9.152.198.39]) by d06av01.portsmouth.uk.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2U70G2Y004505; Fri, 30 Mar 2007 08:00:16 +0100 Date: Fri, 30 Mar 2007 09:00:16 +0200 From: Heiko Carstens To: Jan Engelhardt Cc: "Amit K. 
Arora" , torvalds@osdl.org, akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call Message-ID: <20070330070016.GB8365@osiris.boeblingen.de.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: mutt-ng/devel-r804 (Linux) X-archive-position: 10996 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: heiko.carstens@de.ibm.com Precedence: bulk X-list: xfs Content-Length: 997 Lines: 26 On Thu, Mar 29, 2007 at 07:01:54PM +0200, Jan Engelhardt wrote: > Hi, > > On Mar 29 2007 17:21, Amit K. Arora wrote: > > > >We need to come up with the best possible layout of arguments for the > >fallocate() system call. Various architectures have different > >requirements for how the arguments should look like. Since the mail > >chain has become huge, here is the summary of various inputs received > >so far. > > >s390 prefers following layout: > > int fallocate(int fd, loff_t offset, loff_t len, int mode) > >For details on why and how "int, int, loff_t, loff_t" is a problem on > >s390, please see Heiko's mail on 16th March. Here is the link: > >http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg133595.html > > Quoting that... > |len -> r6 + second halve on stack > > Then, is not this a gcc glitch? (IMO, it should put all of "len" on the > stack) It _does_ put all of "len" on the stack. 
That is what I tried to explain in the section that follows what you quoted. From owner-xfs@oss.sgi.com Fri Mar 30 00:04:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 00:04:43 -0700 (PDT) X-Spam-oss-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id l2U74b6p010395 for ; Fri, 30 Mar 2007 00:04:39 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA10133; Fri, 30 Mar 2007 16:41:58 +1000 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 7663458FF807; Fri, 30 Mar 2007 16:41:58 +1000 (EST) To: xfs@oss.sgi.com, sgi.bugs.xfs@engr.sgi.com Subject: TAKE 962571 - xfs_dm_sync_by_handle is busted Message-Id: <20070330064158.7663458FF807@chook.melbourne.sgi.com> Date: Fri, 30 Mar 2007 16:41:58 +1000 (EST) From: dgc@sgi.com (David Chinner) X-archive-position: 10997 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Content-Length: 763 Lines: 21 xfs_dm_sync_by_handle does not sync file data. xfs_dm_sync_by_handle() is supposed to behave like fsync. When it returns, the handle against which it was called is supposed to be completely on stable storage. xfs_dm_sync_by_handle syncs the inode but not the data. Make it sync the data as well. 
Date: Fri Mar 30 16:41:20 AEST 2007 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: vapo The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:28329a fs/xfs/dmapi/xfs_dm.c - 1.35 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/dmapi/xfs_dm.c.diff?r1=text&tr1=1.35&r2=text&tr2=1.34&f=h - sync data as well as the inode in xfs_dm_sync_by_handle(). From owner-xfs@oss.sgi.com Fri Mar 30 00:19:38 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 00:19:44 -0700 (PDT) X-Spam-oss-Status: No, score=-1.1 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.2.0-pre1-r499012 Received: from mtagate7.de.ibm.com (mtagate7.de.ibm.com [195.212.29.156]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2U7Ja6p018351 for ; Fri, 30 Mar 2007 00:19:37 -0700 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate7.de.ibm.com (8.13.8/8.13.8) with ESMTP id l2U7JUdW131392 for ; Fri, 30 Mar 2007 07:19:30 GMT Received: from d12av04.megacenter.de.ibm.com (d12av04.megacenter.de.ibm.com [9.149.165.229]) by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2U7JUCX1917088 for ; Fri, 30 Mar 2007 09:19:30 +0200 Received: from d12av04.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av04.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2U7JTNx030448 for ; Fri, 30 Mar 2007 09:19:30 +0200 Received: from localhost (dyn-9-152-198-39.boeblingen.de.ibm.com [9.152.198.39]) by d12av04.megacenter.de.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2U7JTrG030442; Fri, 30 Mar 2007 09:19:29 +0200 Date: Fri, 30 Mar 2007 09:19:29 +0200 From: Heiko Carstens To: Andrew Morton Cc: "Amit K. 
Arora" , torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call Message-ID: <20070330071929.GC8365@osiris.boeblingen.de.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> <20070329101010.7a2b8783.akpm@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070329101010.7a2b8783.akpm@linux-foundation.org> User-Agent: mutt-ng/devel-r804 (Linux) X-archive-position: 10998 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: heiko.carstens@de.ibm.com Precedence: bulk X-list: xfs Content-Length: 1204 Lines: 28 > > Even ARM prefers above kind of layout. For details please see the > > definition of sys_arm_sync_file_range(). > > This is a clean-looking option. Can s390 be changed to support seven-arg > syscalls? > > > Option of loff_t => high u32 + low u32 > > -------------------------------------- > > Matthew and Russell have suggested another option of breaking each > > "loff_t" into two "u32"s. This will result in 6 arguments in total. > > > > Following think that this is a good alternative: > > Matthew Wilcox, Russell King, Heiko Carstens > > > > Following do not like this idea: > > Chris Wedgwood > > It's a bit weird-looking, but the six-32-bit-args approach is simple > enought to understand and implement. Presumably the glibc wrapper > would hide that detail from everyone. s390 can be changed to support seven-arg syscalls. 
But that would require creating an additional stackframe in *libc to save original register contents and in addition it would make our syscall hotpath slower. That is because we have to take care of an additional register that might contain user space passed contents and needs to be put on the kernel stack. If possible I'd prefer the six-32-bit-args approach. From owner-xfs@oss.sgi.com Fri Mar 30 00:49:09 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 00:49:13 -0700 (PDT) X-Spam-oss-Status: No, score=0.1 required=5.0 tests=BAYES_50,J_CHICKENPOX_93, SPF_HELO_PASS autolearn=no version=3.2.0-pre1-r499012 Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2U7n66p023571 for ; Fri, 30 Mar 2007 00:49:08 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.1/8.13.1) with ESMTP id l2U7EWo8009955; Fri, 30 Mar 2007 03:14:33 -0400 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id l2U7EWJB024759; Fri, 30 Mar 2007 03:14:32 -0400 Received: from devserv.devel.redhat.com (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id l2U7EWmQ032418; Fri, 30 Mar 2007 02:14:32 -0500 Received: (from jakub@localhost) by devserv.devel.redhat.com (8.12.11.20060308/8.12.11/Submit) id l2U7EHQQ032414; Fri, 30 Mar 2007 03:14:17 -0400 Date: Fri, 30 Mar 2007 02:14:17 -0500 From: Jakub Jelinek To: Andrew Morton Cc: "Amit K. 
Arora" , torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call Message-ID: <20070330071417.GI355@devserv.devel.redhat.com> Reply-To: Jakub Jelinek References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> <20070329101010.7a2b8783.akpm@linux-foundation.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070329101010.7a2b8783.akpm@linux-foundation.org> User-Agent: Mutt/1.4.1i X-archive-position: 10999 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jakub@redhat.com Precedence: bulk X-list: xfs Content-Length: 1245 Lines: 33 On Thu, Mar 29, 2007 at 10:10:10AM -0700, Andrew Morton wrote: > > Platform: s390 > > -------------- > > s390 prefers following layout: > > > > int fallocate(int fd, loff_t offset, loff_t len, int mode) > > > > For details on why and how "int, int, loff_t, loff_t" is a problem on > > s390, please see Heiko's mail on 16th March. Here is the link: > > http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg133595.html > > > > Platform: ppc, arm > > ------------------ > > ppc (32 bit) has a problem with "int, loff_t, loff_t, int" layout, > > since this will result in a pad between fd and offset, making seven > > arguments total - which is not supported by ppc32. It supports only > > 6 arguments. 
Thus the desired layout by ppc32 is: > > > > int fallocate(int fd, int mode, loff_t offset, loff_t len) > > > > Even ARM prefers above kind of layout. For details please see the > > definition of sys_arm_sync_file_range(). > > This is a clean-looking option. Can s390 be changed to support seven-arg > syscalls? Wouldn't int fallocate(loff_t offset, loff_t len, int fd, int mode) work on both s390 and ppc/arm? glibc will certainly wrap it and reorder the arguments as needed, so there is no need to keep fd first. Jakub From owner-xfs@oss.sgi.com Fri Mar 30 01:39:24 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 01:39:29 -0700 (PDT) X-Spam-oss-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_40, J_CHICKENPOX_93 autolearn=no version=3.2.0-pre1-r499012 Received: from mtagate7.de.ibm.com (mtagate7.de.ibm.com [195.212.29.156]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2U8dL6p002770 for ; Fri, 30 Mar 2007 01:39:22 -0700 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate7.de.ibm.com (8.13.8/8.13.8) with ESMTP id l2U8dJDo146422 for ; Fri, 30 Mar 2007 08:39:19 GMT Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2U8dJRu2220066 for ; Fri, 30 Mar 2007 10:39:19 +0200 Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2U8dIpv001587 for ; Fri, 30 Mar 2007 10:39:19 +0200 Received: from localhost (dyn-9-152-198-39.boeblingen.de.ibm.com [9.152.198.39]) by d12av02.megacenter.de.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2U8dIpH001584; Fri, 30 Mar 2007 10:39:18 +0200 Date: Fri, 30 Mar 2007 10:39:18 +0200 From: Heiko Carstens To: Jakub Jelinek Cc: Andrew Morton , "Amit K. 
Arora" , torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call Message-ID: <20070330083918.GD8365@osiris.boeblingen.de.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> <20070329101010.7a2b8783.akpm@linux-foundation.org> <20070330071417.GI355@devserv.devel.redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070330071417.GI355@devserv.devel.redhat.com> User-Agent: mutt-ng/devel-r804 (Linux) X-archive-position: 11000 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: heiko.carstens@de.ibm.com Precedence: bulk X-list: xfs Content-Length: 1390 Lines: 34 On Fri, Mar 30, 2007 at 02:14:17AM -0500, Jakub Jelinek wrote: > On Thu, Mar 29, 2007 at 10:10:10AM -0700, Andrew Morton wrote: > > > Platform: s390 > > > -------------- > > > s390 prefers following layout: > > > > > > int fallocate(int fd, loff_t offset, loff_t len, int mode) > > > > > > For details on why and how "int, int, loff_t, loff_t" is a problem on > > > s390, please see Heiko's mail on 16th March. Here is the link: > > > http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg133595.html > > > > > > Platform: ppc, arm > > > ------------------ > > > ppc (32 bit) has a problem with "int, loff_t, loff_t, int" layout, > > > since this will result in a pad between fd and offset, making seven > > > arguments total - which is not supported by ppc32. 
It supports only > > > 6 arguments. Thus the desired layout by ppc32 is: > > > > > > int fallocate(int fd, int mode, loff_t offset, loff_t len) > > > > > > Even ARM prefers above kind of layout. For details please see the > > > definition of sys_arm_sync_file_range(). > > > > This is a clean-looking option. Can s390 be changed to support seven-arg > > syscalls? > > Wouldn't > int fallocate(loff_t offset, loff_t len, int fd, int mode) > work on both s390 and ppc/arm? glibc will certainly wrap it and > reorder the arguments as needed, so there is no need to keep fd first. That would be fine for s390. From owner-xfs@oss.sgi.com Fri Mar 30 03:32:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 03:32:50 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from ozlabs.org (ozlabs.org [203.10.76.45]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2UAWd6p000674 for ; Fri, 30 Mar 2007 03:32:41 -0700 Received: by ozlabs.org (Postfix, from userid 1003) id 73BBFDDE47; Fri, 30 Mar 2007 20:32:37 +1000 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17932.54577.163137.202721@cargo.ozlabs.ibm.com> Date: Fri, 30 Mar 2007 19:15:29 +1000 From: Paul Mackerras To: Jakub Jelinek Cc: Andrew Morton , "Amit K. 
Arora" , torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call In-Reply-To: <20070330071417.GI355@devserv.devel.redhat.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> <20070329101010.7a2b8783.akpm@linux-foundation.org> <20070330071417.GI355@devserv.devel.redhat.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-archive-position: 11002 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: paulus@samba.org Precedence: bulk X-list: xfs Content-Length: 267 Lines: 11 Jakub Jelinek writes: > Wouldn't > int fallocate(loff_t offset, loff_t len, int fd, int mode) > work on both s390 and ppc/arm? glibc will certainly wrap it and > reorder the arguments as needed, so there is no need to keep fd first. That looks fine to me. Paul. 
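[Editor's sketch, not part of the original thread.] The argument-order debate above comes down to how many 32-bit argument slots each layout consumes. The C fragment below is a rough model only: it assumes every int takes one slot and every loff_t takes two slots that must start on an even slot, which is the padding rule the thread describes for ppc32. The names arg_kind and slots_needed are hypothetical, and the model does not attempt to capture the s390 calling convention.

```c
#include <assert.h>

/* Hypothetical model of syscall argument slots (illustration only):
 * an int uses one 32-bit slot; a loff_t uses two slots and must begin
 * on an even slot, so a pad slot may be inserted before it. */
enum arg_kind { ARG_INT, ARG_LOFF };

static int slots_needed(const enum arg_kind *args, int n)
{
    int slot = 0;

    assert(args != 0 && n >= 0);
    for (int i = 0; i < n; i++) {
        if (args[i] == ARG_LOFF) {
            if (slot % 2)
                slot++;     /* pad so the 64-bit value is pair-aligned */
            slot += 2;
        } else {
            slot += 1;
        }
    }
    return slot;
}
```

Under this model, "int fd, loff_t offset, loff_t len, int mode" needs seven slots, while both "int fd, int mode, loff_t offset, loff_t len" and Jakub's "loff_t offset, loff_t len, int fd, int mode" fit in six, which matches the padding problem reported for ppc32.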
From owner-xfs@oss.sgi.com Fri Mar 30 03:32:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 03:32:47 -0700 (PDT) X-Spam-oss-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from ozlabs.org (ozlabs.org [203.10.76.45]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2UAWd6p000675 for ; Fri, 30 Mar 2007 03:32:41 -0700 Received: by ozlabs.org (Postfix, from userid 1003) id 6EB68DDEA0; Fri, 30 Mar 2007 20:32:37 +1000 (EST) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <17932.54606.323431.491736@cargo.ozlabs.ibm.com> Date: Fri, 30 Mar 2007 19:15:58 +1000 From: Paul Mackerras To: Heiko Carstens Cc: Andrew Morton , "Amit K. Arora" , torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call In-Reply-To: <20070330071929.GC8365@osiris.boeblingen.de.ibm.com> References: <20070117094658.GA17390@amitarora.in.ibm.com> <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> <20070329101010.7a2b8783.akpm@linux-foundation.org> <20070330071929.GC8365@osiris.boeblingen.de.ibm.com> X-Mailer: VM 7.19 under Emacs 21.4.1 X-archive-position: 11001 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: paulus@samba.org Precedence: bulk X-list: xfs Content-Length: 156 Lines: 8 Heiko Carstens writes: > If possible I'd prefer the six-32-bit-args approach. It does mean extra unnecessary work for 64-bit platforms, though... Paul. 
From owner-xfs@oss.sgi.com Fri Mar 30 05:55:53 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 05:55:58 -0700 (PDT) X-Spam-oss-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00, MIME_8BIT_HEADER autolearn=no version=3.2.0-pre1-r499012 Received: from mtagate4.de.ibm.com (mtagate4.de.ibm.com [195.212.29.153]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2UCtp6p007392 for ; Fri, 30 Mar 2007 05:55:52 -0700 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate4.de.ibm.com (8.13.8/8.13.8) with ESMTP id l2UCtnFg119100 for ; Fri, 30 Mar 2007 12:55:49 GMT Received: from d12av01.megacenter.de.ibm.com (d12av01.megacenter.de.ibm.com [9.149.165.212]) by d12nrmr1607.megacenter.de.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2UCtmYu2150494 for ; Fri, 30 Mar 2007 14:55:48 +0200 Received: from d12av01.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av01.megacenter.de.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2UCtcZg003995 for ; Fri, 30 Mar 2007 14:55:39 +0200 Received: from localhost (dyn-9-152-198-39.boeblingen.de.ibm.com [9.152.198.39]) by d12av01.megacenter.de.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2UCtc5w003992; Fri, 30 Mar 2007 14:55:38 +0200 Date: Fri, 30 Mar 2007 14:55:38 +0200 From: Heiko Carstens To: =?iso-8859-1?Q?J=F6rn?= Engel Cc: Paul Mackerras , Andrew Morton , "Amit K. 
Arora" , torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call Message-ID: <20070330125538.GE8365@osiris.boeblingen.de.ibm.com> References: <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> <20070329101010.7a2b8783.akpm@linux-foundation.org> <20070330071929.GC8365@osiris.boeblingen.de.ibm.com> <17932.54606.323431.491736@cargo.ozlabs.ibm.com> <20070330104449.GA9371@lazybastard.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20070330104449.GA9371@lazybastard.org> User-Agent: mutt-ng/devel-r804 (Linux) X-archive-position: 11004 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: heiko.carstens@de.ibm.com Precedence: bulk X-list: xfs Content-Length: 401 Lines: 10 On Fri, Mar 30, 2007 at 12:44:49PM +0200, Jörn Engel wrote: > On Fri, 30 March 2007 19:15:58 +1000, Paul Mackerras wrote: > > It does mean extra unnecessary work for 64-bit platforms, though... > > Wouldn't that work be confined to fallocate()? If I understand Heiko > correctly, the alternative would slow s390 down for every syscall, > including more performance-critical ones. That is correct. 
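[Editor's sketch, not part of the original thread.] The "six-32-bit-args" option means each loff_t is passed as a high and a low 32-bit word and reassembled on the other side. The fragment below shows one way that split and join could look. The helper names split_off and join_off are hypothetical, int64_t stands in for loff_t, and the actual kernel and glibc code may differ. The final cast back to a signed type relies on the usual two's-complement behaviour of common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit offset into two 32-bit halves (hypothetical helper). */
static void split_off(int64_t value, uint32_t *hi, uint32_t *lo)
{
    uint64_t u = (uint64_t)value;

    assert(hi != 0 && lo != 0);
    *hi = (uint32_t)(u >> 32);
    *lo = (uint32_t)(u & 0xffffffffu);
}

/* Rebuild the 64-bit offset from its halves (hypothetical helper). */
static int64_t join_off(uint32_t hi, uint32_t lo)
{
    return (int64_t)(((uint64_t)hi << 32) | (uint64_t)lo);
}
```

On a 64-bit platform the join step is presumably the "extra unnecessary work" Paul mentions, since a single register could have carried the whole value; as Jörn notes, that cost would be confined to fallocate() itself.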
From owner-xfs@oss.sgi.com Fri Mar 30 06:24:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 06:24:05 -0700 (PDT) X-Spam-oss-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.2.0-pre1-r499012 Received: from mail.lichtvoll.de (mondschein.lichtvoll.de [194.150.191.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2UDO06p013024 for ; Fri, 30 Mar 2007 06:24:01 -0700 Received: from localhost (dslb-084-056-071-160.pools.arcor-ip.net [84.56.71.160]) by mail.lichtvoll.de (Postfix) with ESMTP id A639F5AD2C for ; Fri, 30 Mar 2007 15:23:58 +0200 (CEST) From: Martin Steigerwald To: linux-xfs@oss.sgi.com Subject: write barrier and USB devices Date: Fri, 30 Mar 2007 15:23:57 +0200 User-Agent: KMail/1.9.6 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200703301523.58027.Martin@lichtvoll.de> X-archive-position: 11005 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Martin@lichtvoll.de Precedence: bulk X-list: xfs Content-Length: 662 Lines: 23 Hello! Does the usb mass storage driver support write barriers? I think it does, since XFS doesn't complain: --------------------------------------------------------------------- shambala:~> mount -o barrier /dev/sdb2 /mnt/daten (note barrier is default since 2.6.17, but anyway) shambala:~> tail -f /var/log/syslog [...] 
Mar 30 15:11:59 shambala kernel: XFS mounting filesystem sdb2 Mar 30 15:11:59 shambala kernel: Ending clean XFS mount for filesystem: sdb2 --------------------------------------------------------------------- Regards, -- Martin 'Helios' Steigerwald - http://www.Lichtvoll.de GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7 From owner-xfs@oss.sgi.com Fri Mar 30 06:29:26 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 06:29:30 -0700 (PDT) X-Spam-oss-Status: No, score=-0.4 required=5.0 tests=BAYES_20,MIME_8BIT_HEADER autolearn=no version=3.2.0-pre1-r499012 Received: from longford.lazybastard.org (lazybastard.de [212.112.238.170]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2UDTO6p014552 for ; Fri, 30 Mar 2007 06:29:25 -0700 Received: from joern by longford.lazybastard.org with local (Exim 4.50) id 1HXEbE-0002U0-Tt; Fri, 30 Mar 2007 12:44:53 +0200 Date: Fri, 30 Mar 2007 12:44:49 +0200 From: =?utf-8?B?SsO2cm4=?= Engel To: Paul Mackerras Cc: Heiko Carstens , Andrew Morton , "Amit K. 
Arora" , torvalds@linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org, xfs@oss.sgi.com, suparna@in.ibm.com, cmm@us.ibm.com Subject: Re: Interface for the new fallocate() system call Message-ID: <20070330104449.GA9371@lazybastard.org> References: <20070225022326.137b4875.akpm@linux-foundation.org> <20070301183445.GA7911@amitarora.in.ibm.com> <20070316143101.GA10152@amitarora.in.ibm.com> <20070316161704.GE8525@osiris.boeblingen.de.ibm.com> <20070317111036.GC29931@parisc-linux.org> <20070321120425.GA27273@amitarora.in.ibm.com> <20070329115126.GB7374@amitarora.in.ibm.com> <20070329101010.7a2b8783.akpm@linux-foundation.org> <20070330071929.GC8365@osiris.boeblingen.de.ibm.com> <17932.54606.323431.491736@cargo.ozlabs.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <17932.54606.323431.491736@cargo.ozlabs.ibm.com> User-Agent: Mutt/1.5.9i X-archive-position: 11006 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: joern@lazybastard.org Precedence: bulk X-list: xfs Content-Length: 495 Lines: 17 On Fri, 30 March 2007 19:15:58 +1000, Paul Mackerras wrote: > Heiko Carstens writes: > > > If possible I'd prefer the six-32-bit-args approach. > > It does mean extra unnecessary work for 64-bit platforms, though... Wouldn't that work be confined to fallocate()? If I understand Heiko correctly, the alternative would slow s390 down for every syscall, including more performance-critical ones. 
Jörn -- tglx1 thinks that joern should get a (TM) for "Thinking Is Hard" -- Thomas Gleixner From owner-xfs@oss.sgi.com Fri Mar 30 06:45:16 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 30 Mar 2007 06:45:23 -0700 (PDT) X-Spam-oss-Status: No, score=-0.3 required=5.0 tests=AWL,BAYES_50, J_CHICKENPOX_13,J_CHICKENPOX_43,J_CHICKENPOX_44,J_CHICKENPOX_45, J_CHICKENPOX_46,J_CHICKENPOX_47,J_CHICKENPOX_48 autolearn=no version=3.2.0-pre1-r499012 Received: from mail.gatrixx.com (mail.gatrixx.com [217.111.11.44]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id l2UDjE6p018560 for ; Fri, 30 Mar 2007 06:45:15 -0700 Received: (qmail 4550 invoked by uid 1008); 30 Mar 2007 15:45:02 +0200 Received: from unknown (HELO majestix.gallier.de) (ojoa@gatrixx.com@89.54.37.137) by 0 with AES256-SHA encrypted SMTP; 30 Mar 2007 15:45:02 +0200 Received: from [192.168.10.3] (olli@gutemine.gallier.de [192.168.10.3]) by majestix.gallier.de (8.13.8/8.13.8/Debian-2) with ESMTP id l2UDj1IP020997; Fri, 30 Mar 2007 15:45:01 +0200 Message-ID: <460D145C.6080807@j-o-a.de> Date: Fri, 30 Mar 2007 15:45:00 +0200 From: Oliver Joa User-Agent: Icedove 1.5.0.10 (X11/20070307) MIME-Version: 1.0 To: David Chinner CC: linux-kernel@vger.kernel.org, xfs-oss Subject: Re: Corrupt XFS -Filesystems on new Hardware and Kernel References: <46094344.4090007@j-o-a.de> <20070328113141.GQ32597093@melbourne.sgi.com> <460A6298.4040702@j-o-a.de> <20070328234647.GT32597093@melbourne.sgi.com> In-Reply-To: <20070328234647.GT32597093@melbourne.sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 11007 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: oliver@j-o-a.de Precedence: bulk X-list: xfs Content-Length: 6523 Lines: 263 Hi, David Chinner wrote: [...] > Next time you get a shutdown, can you unmount the filesystems and > run xfs_check and then "xfs_repair -n" on the filesystem. 
These will > tell you the inode numbers that are bad. Can you post the errors > reported by these tools? xfs_check gives this: bad format 0 for inode 8458341 type 0100000 bad format 0 for inode 8458344 type 0100000 bad format 0 for inode 8458348 type 0100000 block 1/4962 type unknown not expected block 1/4963 type unknown not expected block 1/4970 type unknown not expected block 1/4975 type unknown not expected block 1/4976 type unknown not expected link count mismatch for inode 8458341 (name ?), nlink 0, counted 1 link count mismatch for inode 8458344 (name ?), nlink 0, counted 1 link count mismatch for inode 8458348 (name ?), nlink 0, counted 1 xfs_repair -n gives this: Phase 1 - find and verify superblock... Phase 2 - using internal log - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan (but don't clear) agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 bad inode format in inode 8458341 bad inode format in inode 8458344 bad inode format in inode 8458348 bad inode format in inode 8458341 would have cleared inode 8458341 bad inode format in inode 8458344 would have cleared inode 8458344 bad inode format in inode 8458348 would have cleared inode 8458348 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 entry "cpufreq.c" at block 0 offset 152 in directory inode 8458336 references free inode 8458341 would clear inode number in entry at offset 152... entry "head.S" at block 0 offset 232 in directory inode 8458336 references free inode 8458344 would clear inode number in entry at offset 232... 
entry "irq.c" at block 0 offset 320 in directory inode 8458336 references free inode 8458348 would clear inode number in entry at offset 320... bad inode format in inode 8458341 would have cleared inode 8458341 bad inode format in inode 8458344 would have cleared inode 8458344 bad inode format in inode 8458348 would have cleared inode 8458348 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 No modify flag set, skipping phase 5 Phase 6 - check inode connectivity... - traversing filesystem starting at / ... entry "cpufreq.c" in directory inode 8458336 points to free inode 8458341, would junk entry entry "head.S" in directory inode 8458336 points to free inode 8458344, would junk entry entry "irq.c" in directory inode 8458336 points to free inode 8458348, would junk entry - traversal finished ... - traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... Phase 7 - verify link counts... No modify flag set, skipping filesystem flush and exiting. 
> Once you have the bad inode numbers, can you run the following > on the bad inodes: > > # xfs_db -r -c "inode " -c "p" xfs_db on inode 8458341 gives: core.magic = 0x494e core.mode = 0100644 core.version = 1 core.format = 0 (dev) core.nlinkv1 = 1 core.uid = 0 core.gid = 0 core.flushiter = 6 core.atime.sec = Tue Jan 30 22:42:51 2007 core.atime.nsec = 000000000 core.mtime.sec = Wed Jan 10 19:10:37 2007 core.mtime.nsec = 000000000 core.ctime.sec = Wed Mar 28 18:15:36 2007 core.ctime.nsec = 612718490 core.size = 6209 core.nblocks = 2 core.extsize = 0 core.nextents = 1 core.naextents = 0 core.forkoff = 0 core.aformat = 2 (extents) core.dmevmask = 0 core.dmstate = 0 core.newrtbm = 0 core.prealloc = 0 core.realtime = 0 core.immutable = 0 core.append = 0 core.sync = 0 core.noatime = 0 core.nodump = 0 core.rtinherit = 0 core.projinherit = 0 core.nosymlinks = 0 core.extsz = 0 core.extszinherit = 0 core.nodefrag = 0 core.gen = 0 next_unlinked = null u.dev = 0 xfs_db on inode 8458344 gives: core.magic = 0x494e core.mode = 0100644 core.version = 1 core.format = 0 (dev) core.nlinkv1 = 1 core.uid = 0 core.gid = 0 core.flushiter = 6 core.atime.sec = Tue Jan 30 22:42:51 2007 core.atime.nsec = 000000000 core.mtime.sec = Wed Jan 10 19:10:37 2007 core.mtime.nsec = 000000000 core.ctime.sec = Wed Mar 28 18:15:36 2007 core.ctime.nsec = 612849562 core.size = 2326 core.nblocks = 1 core.extsize = 0 core.nextents = 1 core.naextents = 0 core.forkoff = 0 core.aformat = 2 (extents) core.dmevmask = 0 core.dmstate = 0 core.newrtbm = 0 core.prealloc = 0 core.realtime = 0 core.immutable = 0 core.append = 0 core.sync = 0 core.noatime = 0 core.nodump = 0 core.rtinherit = 0 core.projinherit = 0 core.nosymlinks = 0 core.extsz = 0 core.extszinherit = 0 core.nodefrag = 0 core.gen = 0 next_unlinked = null u.dev = 0 xfs_db on inode 8458336 gives: core.magic = 0x494e core.mode = 040755 core.version = 1 core.format = 2 (extents) core.nlinkv1 = 5 core.uid = 0 core.gid = 0 core.flushiter = 1 core.atime.sec = 
Tue Jan 30 22:42:51 2007 core.atime.nsec = 906063000 core.mtime.sec = Wed Jan 10 19:10:37 2007 core.mtime.nsec = 000000000 core.ctime.sec = Tue Jan 30 22:44:48 2007 core.ctime.nsec = 428077021 core.size = 4096 core.nblocks = 1 core.extsize = 0 core.nextents = 1 core.naextents = 0 core.forkoff = 0 core.aformat = 2 (extents) core.dmevmask = 0 core.dmstate = 0 core.newrtbm = 0 core.prealloc = 0 core.realtime = 0 core.immutable = 0 core.append = 0 core.sync = 0 core.noatime = 0 core.nodump = 0 core.rtinherit = 0 core.projinherit = 0 core.nosymlinks = 0 core.extsz = 0 core.extszinherit = 0 core.nodefrag = 0 core.gen = 0 next_unlinked = null u.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,528704,1,0] [...] > and post the output for us? That will enable us to see exactly what > the corruption is on the inode. Here is it... Thanks a lot... Olli
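[Editor's sketch, not part of the original thread.] The xfs_check message "bad format 0 for inode ... type 0100000" and the xfs_db dumps above point at the same mismatch: core.mode says regular file (0100644) while core.format is 0 (dev). The C fragment below is a simplified consistency check in that spirit. The names are hypothetical, the format values 0 (dev) and 2 (extents) are taken from the output above while 1 (local) and 3 (btree) are assumed from the XFS on-disk format, and the real xfs_check and xfs_repair logic is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* File-type bits of the mode, as printed in the xfs_check output. */
#define MODE_TYPE_MASK 0170000u
#define MODE_REG       0100000u
#define MODE_DIR       0040000u
#define MODE_CHR       0020000u
#define MODE_BLK       0060000u

/* Data fork formats; 0 and 2 appear in the dumps, 1 and 3 are assumed. */
enum fork_format { FMT_DEV = 0, FMT_LOCAL = 1, FMT_EXTENTS = 2, FMT_BTREE = 3 };

/* Return true if the data fork format is plausible for the file type.
 * Types not modelled here (symlink, fifo, socket) are not flagged. */
static bool format_plausible(unsigned int mode, int format)
{
    assert(format >= 0);
    switch (mode & MODE_TYPE_MASK) {
    case MODE_REG:
        return format == FMT_EXTENTS || format == FMT_BTREE;
    case MODE_DIR:
        return format == FMT_LOCAL || format == FMT_EXTENTS ||
               format == FMT_BTREE;
    case MODE_CHR:
    case MODE_BLK:
        return format == FMT_DEV;
    default:
        return true;
    }
}
```

Applied to the dumps above, inodes 8458341 and 8458344 (mode 0100644, format 0) fail the check, while the directory inode 8458336 (mode 040755, format 2) passes.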