From owner-linux-xfs@oss.sgi.com Tue Nov 1 01:24:25 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Tue, 01 Nov 2005 01:24:28 -0800 (PST)
Received: from smtp.pzkagis.cz (gis6.netbox.cz [83.240.30.214]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA19OHO0017518 for ; Tue, 1 Nov 2005 01:24:24 -0800
Received: (from luf@localhost) by smtp.pzkagis.cz (8.11.6/8.11.6) id jA19Kj026714; Tue, 1 Nov 2005 10:20:45 +0100
Date: Tue, 1 Nov 2005 10:20:45 +0100
From: Ludek Finstrle
To: Eric Sandeen
Cc: Renaat Dumon, linux-xfs@oss.sgi.com
Subject: Re: XFS corruption on 2.4.28
Message-ID: <20051101092045.GB26576@soptik.pzkagis.cz>
References: <200510302326.j9UNPw4u005031@outmx013.isp.belgacom.be> <20051031090429.GA17240@soptik.pzkagis.cz> <436650D4.9000307@sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <436650D4.9000307@sgi.com>
User-Agent: Mutt/1.4i
X-archive-position: 6480
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: luf@pzkagis.cz
Precedence: bulk
X-list: linux-xfs

> >I notice this behaviour few weeks ago (max. 4 weeks). There is a patch for
> >it in CVS. Try search through mail-archiv for "df vs du -sk" (or similar).
>
> I -think- that that fix is for a different problem... in that previous
> case, xfs_repair could correctly repair the filesystem, without moving
> files to lost+found.

I'm not so sure. xfs_repair moved / and some other files to lost+found
on the first run after a long time (and after xfs_fsr). The next time, I
tried xfs_fsr and xfs_repair after a few hours.
So it seems consistent with my problems. I didn't try a remount.
Finally, files with the same size had the same odd size (I don't remember
the exact numbers: e.g. 4321 -> 774329921, ...)

I think Renaat could try the fix. He won't run into bigger problems by
trying it. I know I had to run xfs_fsr, but my filesystem isn't under
heavy load.
Luf

From owner-linux-xfs@oss.sgi.com Tue Nov 1 02:22:26 2005
From: "Renaat Dumon"
To: "'Ludek Finstrle'", "'Eric Sandeen'"
Cc:
Subject: RE: XFS corruption on 2.4.28
Date: Tue, 1 Nov 2005 11:19:08 +0100
In-Reply-To: <20051101092045.GB26576@soptik.pzkagis.cz>
Message-Id: <20051101101910.488ED1774FE@postit.belbone.be>
X-archive-position: 6481

Thanks, but putting another kernel on that box might not be that easy, I get
this box in an "appliance" form. I don't have the kernel .config etc

R.

-----Original Message-----
From: Ludek Finstrle [mailto:luf@pzkagis.cz]
Sent: 01 November 2005 10:21
To: Eric Sandeen
Cc: Renaat Dumon; linux-xfs@oss.sgi.com
Subject: Re: XFS corruption on 2.4.28

> >I notice this behaviour few weeks ago (max. 4 weeks). There is a
> >patch for it in CVS. Try search through mail-archiv for "df vs du -sk" (or similar).
>
> I -think- that that fix is for a different problem... in that previous
> case, xfs_repair could correctly repair the filesystem, without moving
> files to lost+found.

I'm not so sure.
xfs_repair moved / and some other files to lost+found on the first run
after a long time (and after xfs_fsr). The next time, I tried xfs_fsr and
xfs_repair after a few hours.
So it seems consistent with my problems. I didn't try a remount.
Finally, files with the same size had the same odd size (I don't remember
the exact numbers: e.g. 4321 -> 774329921, ...)

I think Renaat could try the fix. He won't run into bigger problems by
trying it. I know I had to run xfs_fsr, but my filesystem isn't under
heavy load.

Luf

From owner-linux-xfs@oss.sgi.com Tue Nov 1 03:17:36 2005
Date: Tue, 1 Nov 2005 12:14:00 +0100
From: "'Ludek Finstrle'"
To: Renaat Dumon
Cc: "'Eric Sandeen'", linux-xfs@oss.sgi.com
Subject: Re: XFS corruption on 2.4.28
Message-ID: <20051101111400.GA27417@soptik.pzkagis.cz>
References: <20051101092045.GB26576@soptik.pzkagis.cz> <20051101101910.488ED1774FE@postit.belbone.be>
In-Reply-To: <20051101101910.488ED1774FE@postit.belbone.be>
X-archive-position: 6482

> Thanks, but putting another kernel on that box might not be that easy, I get
> this box in an "appliance" form. I don't have the kernel .config etc

I'm sorry, I misunderstood the word "appliance" before.
Why don't you file a claim for it? Or why don't you at least consult your
supplier about it?
Luf

From owner-linux-xfs@oss.sgi.com Tue Nov 1 15:39:57 2005
Message-ID: <4367FC0A.7060401@sgi.com>
Date: Tue, 01 Nov 2005 17:36:42 -0600
From: Eric Sandeen
To: linux@horizon.com
CC: linux-xfs@oss.sgi.com
Subject: Re: 2.6.13.2 amd64: XFS: xlog_recover_process_data: bad clientid
References: <20051101233237.18777.qmail@science.horizon.com>
In-Reply-To: <20051101233237.18777.qmail@science.horizon.com>
X-archive-position: 6485

linux@horizon.com wrote:
>>Is the problematic filesystem on the aforementioned flakey driver?
>
>
> Yes. Sorry I wasn't clear. The SATA driver hung (the machine was still
> "up", but with all the root FS inaccessible, I couldn't do much), and
> when I rebooted it, the root FS wouldn't come back.

Well, xfs does assume that if the underlying IO layers tell it that
something is written, that it is in fact written.
Depending on the level of flakiness in your SATA driver, it looks quite
possible that you have encountered a SATA bug, not an xfs bug.

-Eric

From owner-linux-xfs@oss.sgi.com Tue Nov 1 15:35:51 2005
Date: 1 Nov 2005 18:32:37 -0500
Message-ID: <20051101233237.18777.qmail@science.horizon.com>
From: linux@horizon.com
To: linux@horizon.com, sandeen@sgi.com
Subject: Re: 2.6.13.2 amd64: XFS: xlog_recover_process_data: bad clientid
Cc: linux-xfs@oss.sgi.com
In-Reply-To:
X-archive-position: 6484

> Is the problematic filesystem on the aforementioned flakey driver?

Yes. Sorry I wasn't clear. The SATA driver hung (the machine was still
"up", but with all the root FS inaccessible, I couldn't do much), and
when I rebooted it, the root FS wouldn't come back.

> Any kernel messages prior to the fs problems? (related to underlying
> IO problems?)

I couldn't get dmesg off the machine, but FWIW, the scrollback buffer
was reported in
http://marc.theaimsgroup.com/?l=linux-ide&m=113035431221239

My understanding was that this was a "freeze" type failure, so there
shouldn't be any garbage on the disk, but it's possible I'm wrong;
after all, we don't understand the failures very well.

Thanks for the reply!
From owner-linux-xfs@oss.sgi.com Tue Nov 1 17:21:11 2005
Date: 1 Nov 2005 20:17:53 -0500
Message-ID: <20051102011753.28539.qmail@science.horizon.com>
From: linux@horizon.com
To: linux@horizon.com, sandeen@sgi.com
Subject: Re: 2.6.13.2 amd64: XFS: xlog_recover_process_data: bad clientid
Cc: linux-xfs@oss.sgi.com
In-Reply-To: <4367FC0A.7060401@sgi.com>
X-archive-position: 6486

> Well, xfs does assume that if the underlying IO layers tell it that
> something is written, that it is in fact written. Depending on the level
> of flakiness in your SATA driver, it looks quite possible that you have
> encountered a SATA bug, not an xfs bug.

I assure you, I don't expect perfection in the face of such
flakiness, but it did seem a little bit less than robust.

Mostly, I'm wondering:
- Can we extract any information about what misbehaved to help the SATA
  debugging process?
- Is running xfs_repair the best thing to do? Are all of those error
  messages reasonably harmless? I don't know what's "normal" in
  xfs_repair output the way that I know that complaints about dtime,
  too many blocks allocated, and bitmap inconsistencies are basically
  harmless in e2fsck output, and I only need to worry about other
  messages.

Thank you very much for your help!
From owner-linux-xfs@oss.sgi.com Tue Nov 1 19:49:55 2005
Message-ID: <43683663.9030807@sgi.com>
Date: Tue, 01 Nov 2005 21:45:39 -0600
From: Eric Sandeen
To: linux@horizon.com
CC: linux-xfs@oss.sgi.com
Subject: Re: 2.6.13.2 amd64: XFS: xlog_recover_process_data: bad clientid
References: <20051102011753.28539.qmail@science.horizon.com>
In-Reply-To: <20051102011753.28539.qmail@science.horizon.com>
X-archive-position: 6487

linux@horizon.com wrote:
>>Well, xfs does assume that if the underlying IO layers tell it that
>>something is written, that it is in fact written. Depending on the level
>>of flakiness in your SATA driver, it looks quite possible that you have
>>encountered a SATA bug, not an xfs bug.
>
>
> I assure you, I don't expect perfection in the face of such
> flakiness, but it did seem a little bit less than robust.
If the underlying layers tell XFS that data is safe, it is not XFS's
fault when it's not there later, and in fact there is nothing that XFS
or any other journaling filesystem can do about it... things are
journaled only until they are safe on disk, as reported by the IO
subsystems.

> Mostly, I'm wondering:
> - Can we extract any information about what misbehaved to help the SATA
>   debugging process

I doubt it.

> - Is running xfs_repair the best thing to do? Are all of those error
>   messages reasonably harmless? I don't know what's "normal" in
>   xfs_repair output the way that I know that complaints about dtime,
>   too many blocks allocated, and bitmap inconsistencies are basically
>   harmless in e2fsck output, and I only need to worry about other
>   messages.

xfs_repair is your only option. Run it and hope for the best.

-Eric

From owner-linux-xfs@oss.sgi.com Tue Nov 1 22:47:36 2005
To: linux-xfs@oss.sgi.com
Subject: TAKE 904196 - use gfp_t where needed
Message-Id: <20051102064417.8BD9549BFB65@chook.melbourne.sgi.com>
Date: Wed, 2 Nov 2005 17:44:17 +1100 (EST)
From: nathans@sgi.com (Nathan Scott)
X-archive-position: 6488

Use the gfp_t type in the places we need to, for
sparse - thanks to Chris Wedgwood.

Date: Wed Nov 2 17:42:53 AEDT 2005
Workarea: chook.melbourne.sgi.com:/build/nathans/xfs-linux
Inspected by: cw@f00f.org

The following file(s) were checked into:
  longdrop.melbourne.sgi.com:/isms/xfs-kern/xfs-linux-melb

Modid: xfs-linux-melb:xfs-kern:24276a
linux-2.6/xfs_aops.c - 1.100 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_aops.c.diff?r1=text&tr1=1.100&r2=text&tr2=1.99&f=h
linux-2.6/xfs_buf.c - 1.211 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_buf.c.diff?r1=text&tr1=1.211&r2=text&tr2=1.210&f=h
linux-2.6/kmem.h - 1.32 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/kmem.h.diff?r1=text&tr1=1.32&r2=text&tr2=1.31&f=h

From owner-linux-xfs@oss.sgi.com Wed Nov 2 01:11:08 2005
Message-ID: <436881E3.3040208@katalix.com>
Date: Wed, 02 Nov 2005 09:07:47 +0000
From: Chris Elston
To: celston@katalix.com
CC: linux-xfs@oss.sgi.com, nathans@sgi.com, sandeen@sgi.com, jchapman@katalix.com
Subject: Re: [PATCH] Re: Files >4GB in XFS realtime partition
Content-Type: multipart/mixed; boundary="------------060802050105000708020709"
X-archive-position: 6489

This is a multi-part message in MIME format.
--------------060802050105000708020709
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

> Thanks! Could you resend that patch as just a regular text file (inline text at the end of your mail would be fine)?

Should be straight text now, was UUENCODED before.

Cheers,

--
Chris Elston
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development

--------------060802050105000708020709
Content-Type: text/plain; name="XFS_RT_4GB.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="XFS_RT_4GB.patch"

--- TDC775-2.4.29/fs/xfs/xfs_iomap.h 2005-10-31 15:37:54.000000000 +0000
+++ PATCHED/fs/xfs/xfs_iomap.h 2005-10-31 15:18:29.000000000 +0000
@@ -86,7 +86,7 @@
 	xfs_buftarg_t		*iomap_target;
 	loff_t			iomap_offset;	/* offset of mapping, bytes */
 	loff_t			iomap_bsize;	/* size of mapping, bytes */
-	size_t			iomap_delta;	/* offset into mapping, bytes */
+	loff_t			iomap_delta;	/* offset into mapping, bytes */
 	iomap_flags_t		iomap_flags;
 } xfs_iomap_t;

--------------060802050105000708020709--

From owner-linux-xfs@oss.sgi.com Wed Nov 2 04:43:11 2005
Message-Id: <14058202.53011130935194511.JavaMail.servlet@kundenserver>
From: o.otahal@omnikum.de
To:
Subject: xfsdump request
Date: Wed, 02 Nov 2005 13:39:54 +0100
X-archive-position: 6490

Hi

Is it possible to extend the xfsdump command with a more verbose mode,
like "stat -t" for every dumped file, in the next version of xfsdump?
Additionally, a checksum like md5 for every file would be very nice.
The intention is to create a table of contents with all information
about a backed-up file.
Thanks,
Othmar Otahal

From owner-linux-xfs@oss.sgi.com Wed Nov 2 05:53:28 2005
From: "Renaat Dumon"
To: "'Eric Sandeen'"
Cc:
Subject: RE: XFS corruption on 2.4.28
Date: Wed, 2 Nov 2005 14:50:13 +0100
In-Reply-To:
Message-Id: <20051102135010.99B69177505@postit.belbone.be>
X-archive-position: 6492

Hi Eric,

I did the tests (added one "for" loop because of
[0-9a-f]/[0-9a-f]/[0-9a-f]/somereallylongfilename.somenumber.db).
I did not observe the behaviour then :(

In the meantime I have gotten a chance to remount the filesystem too on
this particular box (which is the worst box I have for the phenomenon,
due to the amount of data that is sitting on it, I guess). Tonight new
backups will occur, so I'll know pretty soon whether or not the geometry
options have anything to do with it.

One question though: suppose I create an XFS filesystem using a 2.6
bootdisk, untar a 2.4 system backup on the disk, and then boot from disk
(so a 2.4 kernel). Could that interfere?
That's what I did originally, but I have in the meantime recreated the
filesystem under the running 2.4 kernel, so I guess that shouldn't be an
issue ..

Kind regards,

Renaat

-----Original Message-----
From: Eric Sandeen [mailto:sandeen@sgi.com]
Sent: 01 November 2005 05:17
To: Renaat Dumon
Subject: RE: XFS corruption on 2.4.28

Renaat Dumon wrote:
> Hi Eric,
>
> Thanks for taking the time to look at this.
>
> bacardi root # xfs_info /Storage
> meta-data=/Storage               isize=256    agcount=56, agsize=1048576 blks
>          =                       sectsz=512
> data     =                       bsize=4096   blocks=58663328, imaxpct=25
>          =                       sunit=0      swidth=0 blks, unwritten=0
> naming   =version 2              bsize=4096
> log      =internal               bsize=4096   blocks=7161, version=1
>          =                       sectsz=512   sunit=0 blks
> realtime =none                   extsz=65536  blocks=0, rtextents=0

FWIW I tried this test with stock 2.4.28:

[root@penguin5 src2]# mkfs.xfs -f -bsize=4096 -dfile,name=testfs,agsize=1048576b,size=58663328b,unwritten=0
meta-data=testfs                 isize=256    agcount=56, agsize=1048576 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=58663328, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=0
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=28644, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
[root@penguin5 src2]# mount -o loop,noatime,sunit=128,swidth=256 testfs /mnt/test/
[root@penguin5 src2]# cd /mnt/test/
[root@penguin5 test]# ls
[root@penguin5 test]# echo abcdefghijklmnopqrstuvwxyza > file
[root@penguin5 test]# ls -l file
-rw-r--r--    1 root     root           28 Oct 31 22:03 file
[root@penguin5 test]# for a in `seq 1 3`; do for b in `seq 1 3`; do for c in `seq 1 10000`; do mkdir -p $a/$b; cp file $a/$b/00005d697a5a05795f53cb7b081f242d.$c.db; done; done; done
[root@penguin5 test]# find . | xargs du -sk | grep -v ^4
366124  .
122040  ./1
122040  ./2
122040  ./3

so that did not trip it. perhaps you could try a similar test with your
kernel.... either on loopback like this, or on your real filesystem?
Does the above tree structure / file naming more or less match your real
application?

-Eric

From owner-linux-xfs@oss.sgi.com Wed Nov 2 09:04:05 2005
Message-ID: <4368F086.2060503@sgi.com>
Date: Wed, 02 Nov 2005 10:59:50 -0600
From: Yingping Lu
To: linux-xfs@oss.sgi.com, sgi.bugs.xfs@fido.engr.sgi.com
Subject: TAKE 940655 - XFS corruption caused by serial fsstresses
X-archive-position: 6493

Fixed the inconsistency between attribute b-tree intermediate node and
leaf blocks. The problem came from xfsqa test 117.
Date: Wed Nov 2 08:40:48 PST 2005
Workarea: penguin.americas.sgi.com:/src/yingping/xfs-kern
Inspected by: tes nathans
Author: yingping

The following file(s) were checked into:
  bonnie.engr.sgi.com:/isms/xfs-kern/xfs-linux

Modid: xfs-linux:xfs-kern:201527a
xfs_da_btree.c - 1.159 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_da_btree.c.diff?r1=text&tr1=1.159&r2=text&tr2=1.158&f=h
  - Enabled useextra flag for attribute fork in v2 directory format.

From owner-linux-xfs@oss.sgi.com Wed Nov 2 11:10:59 2005
Date: 2 Nov 2005 14:07:44 -0500
Message-ID: <20051102190744.29534.qmail@science.horizon.com>
From: linux@horizon.com
To: sandeen@sgi.com
Subject: Re: 2.6.13.2 amd64: XFS: xlog_recover_process_data: bad clientid
Cc: linux@horizon.com, linux-xfs@oss.sgi.com
In-Reply-To: <43683663.9030807@sgi.com>
X-archive-position: 6494

> xfs_repair is your only option. Run it and hope for the best.

Ah, it complains about an unflushed log and won't run.

It might be a worthwhile addition to the xfs_repair man page to mention
that "-n" implies "-L". If it is indeed the case that the *only* code
which can replay a log is in the kernel, that might be worth saying
explicitly, too.

I'm poking at xfs_logprint wondering if there's a way to get it to do
something useful.

>> - Can we extract any information about what misbehaved to help the SATA
>>   debugging process

> I doubt it.
Well, we can at least conclude that it didn't "fail fast" and freeze at
a particular point in time, right? Because that would have left
consistent metadata.

(Of course, it could have been the RAID-10 setup. If I have mirror pairs
A/B and C/D, and the B&C driver got wedged, so the last write went only
to A and D, and on recovery the RAID system synchronized A to B and C to
D, that would leave a half-written log entry. But I'm using a 256K
stripe size, and log entries are 32K, so they shouldn't be split across
stripes....)

Anyway, thanks a lot for your help!

From owner-linux-xfs@oss.sgi.com Wed Nov 2 13:43:01 2005
From: "Renaat Dumon"
To: "'Eric Sandeen'"
Cc:
Subject: RE: XFS corruption on 2.4.28
Date: Wed, 2 Nov 2005 22:39:50 +0100
In-Reply-To:
Message-Id: <20051102213946.1EEA31775D1@postit.belbone.be>
X-archive-position: 6495

Mounting the filesystem without the geometry doesn't change things, the
size is still wrongly reported for .db files.
I have however found files that should be 44 bytes (from ls -al) and

bacardi 0 # ls -al 000b21176cda012e1c0a7828f75347c3.289492.db
-rw-------    1 root     root           44 Nov  2 20:18 000b21176cda012e1c0a7828f75347c3.289492.db
bacardi 0 # du -sk 000b21176cda012e1c0a7828f75347c3.289492.db
2147483532      000b21176cda012e1c0a7828f75347c3.289492.db

The value that du reports is the same as the one for 28-byte files, so
it's a constant, regardless of the real size of the file.

FWIW, this is a filesystem with a huge amount of files:

bacardi 0 # df -hi
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/root               8.0M     44K    8.0M    1% /
/dev/md0                 12K      33     12K    1% /boot
/dev/md3                224M     11M    214M    5% /Storage
none                     63K       1     63K    1% /dev/shm

I have requested my vendor to actually build a kernel with the most
recent patches + include the most recent userland progs. Could it be
that a newer version of xfs_repair might catch some inconsistencies the
current version would not?

Kind regards,

Renaat

-----Original Message-----
From: Eric Sandeen [mailto:sandeen@sgi.com]
Sent: 01 November 2005 05:17
To: Renaat Dumon
Subject: RE: XFS corruption on 2.4.28

Renaat Dumon wrote:
> Hi Eric,
>
> Thanks for taking the time to look at this.
>
> bacardi root # xfs_info /Storage
> meta-data=/Storage               isize=256    agcount=56, agsize=1048576 blks
>          =                       sectsz=512
> data     =                       bsize=4096   blocks=58663328, imaxpct=25
>          =                       sunit=0      swidth=0 blks, unwritten=0
> naming   =version 2              bsize=4096
> log      =internal               bsize=4096   blocks=7161, version=1
>          =                       sectsz=512   sunit=0 blks
> realtime =none                   extsz=65536  blocks=0, rtextents=0

FWIW I tried this test with stock 2.4.28:

[root@penguin5 src2]# mkfs.xfs -f -bsize=4096 -dfile,name=testfs,agsize=1048576b,size=58663328b,unwritten=0
meta-data=testfs                 isize=256    agcount=56, agsize=1048576 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=58663328, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=0
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=28644, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
[root@penguin5 src2]# mount -o loop,noatime,sunit=128,swidth=256 testfs /mnt/test/
[root@penguin5 src2]# cd /mnt/test/
[root@penguin5 test]# ls
[root@penguin5 test]# echo abcdefghijklmnopqrstuvwxyza > file
[root@penguin5 test]# ls -l file
-rw-r--r--    1 root     root           28 Oct 31 22:03 file
[root@penguin5 test]# for a in `seq 1 3`; do for b in `seq 1 3`; do for c in `seq 1 10000`; do mkdir -p $a/$b; cp file $a/$b/00005d697a5a05795f53cb7b081f242d.$c.db; done; done; done
[root@penguin5 test]# find . | xargs du -sk | grep -v ^4
366124  .
122040  ./1
122040  ./2
122040  ./3

so that did not trip it. perhaps you could try a similar test with your
kernel.... either on loopback like this, or on your real filesystem?

Does the above tree structure / file naming more or less match your real
application?
-Eric From owner-linux-xfs@oss.sgi.com Wed Nov 2 14:00:05 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 14:00:07 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA2M04O0011592 for ; Wed, 2 Nov 2005 14:00:04 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id jA2N5KaX032360 for ; Wed, 2 Nov 2005 15:05:20 -0800 Received: from [128.162.232.50] (stout.americas.sgi.com [128.162.232.50]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id jA2LunDN19076523; Wed, 2 Nov 2005 15:56:50 -0600 (CST) Message-ID: <43693621.6040902@sgi.com> Date: Wed, 02 Nov 2005 15:56:49 -0600 From: Eric Sandeen User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Renaat Dumon CC: linux-xfs@oss.sgi.com Subject: Re: XFS corruption on 2.4.28 References: <20051102213946.1EEA31775D1@postit.belbone.be> In-Reply-To: <20051102213946.1EEA31775D1@postit.belbone.be> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 6496 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: sandeen@sgi.com Precedence: bulk X-list: linux-xfs Content-Length: 1465 Lines: 39 Renaat Dumon wrote: > Mounting the filesystem without the geometry doesn't change things, the size > is still wrongly reported for .db files. Ok, good to know. 
You should put it back now :)

> I have however found files that should be 44 bytes (from ls -al) and
>
> bacardi 0 # ls -al 000b21176cda012e1c0a7828f75347c3.289492.db
> -rw------- 1 root root 44 Nov 2 20:18 000b21176cda012e1c0a7828f75347c3.289492.db
> bacardi 0 # du -sk 000b21176cda012e1c0a7828f75347c3.289492.db
> 2147483532 000b21176cda012e1c0a7828f75347c3.289492.db
>
>
> The value that du reports is the same as the one for 28-byte files, so it's
> a constant, regardless of the real size of the file.

Interesting.

> FWIW, this is a filesystem with a huge amount of files:
>
> bacardi 0 # df -hi
> Filesystem            Inodes   IUsed   IFree IUse% Mounted on
> /dev/root               8.0M     44K    8.0M    1% /
> /dev/md0                 12K      33     12K    1% /boot
> /dev/md3                224M     11M    214M    5% /Storage
> none                     63K       1     63K    1% /dev/shm
>
> I have requested my vendor to actually build a kernel with the most recent
> patches + include the most recent userland progs. Could it be that a newer
> version of xfs_repair might catch some inconsistencies the current version
> would not ?

It would probably be good to see some of the repair output when you run
it on a problematic filesystem...
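
One way to look at the constant du figure quoted above is as a 32-bit counter that has wrapped. The Python sketch below is only arithmetic on the number in the report. It assumes du -sk derives its output from st_blocks in 512-byte units, takes the 4096-byte block size from the xfs_info output earlier in the thread, and uses a helper name, as_signed_32, that is made up for illustration. It does not establish the cause of the problem.

```python
def as_signed_32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


DU_KIB = 2147483532   # du -sk output from the report
BLOCK_SIZE = 4096     # bsize from the xfs_info output

# Assumption: du's KiB figure is st_blocks (512-byte units) divided by 2.
sectors = DU_KIB * 2
signed_sectors = as_signed_32(sectors)
fs_blocks = signed_sectors * 512 // BLOCK_SIZE

print(f"sectors as unsigned: {sectors}")
print(f"sectors as signed 32-bit: {signed_sectors}")
print(f"equivalent filesystem blocks: {fs_blocks}")
```

Under those assumptions the figure sits just below 2^32 sectors, so it corresponds to a small negative block count. That would be consistent with a per-inode block counter that underflowed, though nothing in the thread confirms that reading.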
-Eric From owner-linux-xfs@oss.sgi.com Wed Nov 2 14:21:15 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 14:21:17 -0800 (PST) Received: from science.horizon.com (science.horizon.com [192.35.100.1]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA2MLEO0013478 for ; Wed, 2 Nov 2005 14:21:14 -0800 Received: (qmail 16065 invoked by uid 1000); 2 Nov 2005 17:18:00 -0500 Date: 2 Nov 2005 17:18:00 -0500 Message-ID: <20051102221800.16064.qmail@science.horizon.com> From: linux@horizon.com To: sandeen@sgi.com Subject: Re: 2.6.13.2 amd64: XFS: xlog_recover_process_data: bad clientid Cc: linux-xfs@oss.sgi.com, linux@horizon.com In-Reply-To: <20051102190744.29534.qmail@science.horizon.com> X-archive-position: 6497 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: linux@horizon.com Precedence: bulk X-list: linux-xfs Content-Length: 2557 Lines: 54 Just some more info... I ran xfs_repair -L, but then ran xfs_check, and it found a lot of problems remaining, in particular a lot of: link count mismatch for inode 514608143 (name ?), nlink 4, counted 3 link count mismatch for inode 514608156 (name ?), nlink 4, counted 3 link count mismatch for inode 514608168 (name ?), nlink 4, counted 3 link count mismatch for inode 514608172 (name ?), nlink 4, counted 3 link count mismatch for inode 514608180 (name ?), nlink 4, counted 3 link count mismatch for inode 514964512 (name ?), nlink 18, counted 17 link count mismatch for inode 514964526 (name ?), nlink 18, counted 17 link count mismatch for inode 514964528 (name ?), nlink 18, counted 17 link count mismatch for inode 514964531 (name ?), nlink 18, counted 17 link count mismatch for inode 514964532 (name ?), nlink 18, counted 17 link count mismatch for inode 514964533 (name ?), nlink 18, counted 17 link count mismatch for inode 514964535 (name ?), nlink 18, counted 17 link count mismatch for inode 514939937 (name ?), nlink 26, 
counted 25
link count mismatch for inode 515270699 (name ?), nlink 4, counted 3
link count mismatch for inode 515270700 (name ?), nlink 4, counted 3
link count mismatch for inode 515270701 (name ?), nlink 4, counted 3
link count mismatch for inode 515270703 (name ?), nlink 4, counted 3
link count mismatch for inode 515270704 (name ?), nlink 4, counted 3
link count mismatch for inode 514965508 (name ?), nlink 16, counted 15
link count mismatch for inode 514610208 (name ?), nlink 3, counted 4
link count mismatch for inode 514610209 (name ?), nlink 3, counted 4
link count mismatch for inode 523361301 (name ?), nlink 17, counted 16
link count mismatch for inode 523361312 (name ?), nlink 0, counted 2

So I'm running xfs_repair again, and it's not coming up clean.
Is this supposed to happen? I didn't see a message about
"re-run xfs_repair" anywhere...

# xfs_repair /dev/md4
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
- agno = 0 - agno = 1 - agno = 2 - agno = 3 imap claims a free inode 129571903 is in use, correcting imap and clearing inode cleared inode 129571903 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 (in progress as I write this) From owner-linux-xfs@oss.sgi.com Wed Nov 2 15:14:19 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 15:14:22 -0800 (PST) Received: from larry.melbourne.sgi.com (mverd138.asia.info.net [61.14.31.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA2NEHO0016624 for ; Wed, 2 Nov 2005 15:14:18 -0800 Received: from wobbly.melbourne.sgi.com (wobbly.melbourne.sgi.com [134.14.55.135]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA14990; Thu, 3 Nov 2005 10:10:57 +1100 Received: from wobbly.melbourne.sgi.com (localhost [127.0.0.1]) by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id jA2NB9kt6179512; Thu, 3 Nov 2005 10:11:10 +1100 (EST) Received: (from nathans@localhost) by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id jA2NB75Z6220653; Thu, 3 Nov 2005 10:11:07 +1100 (EST) Date: Thu, 3 Nov 2005 10:11:07 +1100 From: Nathan Scott To: Jan Kasprzak Cc: linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com Subject: Re: XFS information leak during crash Message-ID: <20051103101107.O6239737@wobbly.melbourne.sgi.com> References: <20051102212722.GC6759@fi.muni.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20051102212722.GC6759@fi.muni.cz>; from kas@fi.muni.cz on Wed, Nov 02, 2005 at 10:27:22PM +0100 X-archive-position: 6498 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: nathans@sgi.com Precedence: bulk X-list: linux-xfs Content-Length: 2103 Lines: 48 Hello Jan, On Wed, Nov 02, 2005 at 10:27:22PM +0100, Jan Kasprzak wrote: > Hello, world!\n > > I have found that after the system crash (e.h. 
a hard reset or a power
> failure) XFS corrupts files which have been written to just before the crash:
> The result is that those files contain data from random blocks on the
> disk (e.g. from previously deleted files). This can have security/privacy
> implications - users can see the contents of other users' old files.

If you think you have found a security issue, it would be courteous
to at least discuss this with the maintainers first. And since you
are a frequent linux-xfs list poster too, it seems a bit odd that
you're reporting this on linux-kernel instead... *shrug*, whatever.

This issue affects every filesystem, right? Or are you claiming it's
only XFS affected here? Have you run your parallel-buffered-writers
test case on any other filesystems? I'd be interested in the results,
in particular, with all of the data=xxx modes of other filesystems.

> either). Does XFS support something like ext3's "data=ordered" mount
> option?

No, it doesn't.

> Otherwise it is pretty unusable on multi-user systems.

That's a ridiculous assertion. While this small metadata vs.
buffered-data-write window exists on _any_ filesystem not using a
data=ordered/data=journaled mode (which I believe is quite a common
mode of operation even on filesystems that offer those modes), it is
impossible to exploit this in any sane way. You'd think people on a
multi-user system might actually notice the machine being frequently
rebooted while you try to tickle this window to get at "interesting"
uninitialised freespace, no?

Having said that, a data=ordered mode for XFS would be a nice feature.
It just hasn't reached the top of our priority list, and it's not been
offered up as a patch by anyone yet. If anyone's interested in writing
this, they should coordinate with hch and myself - there's a fair bit
of I/O path work being done at the moment, which in the end will make
a data=ordered mode a lot easier to implement.

cheers.
-- Nathan From owner-linux-xfs@oss.sgi.com Wed Nov 2 15:39:48 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 15:39:53 -0800 (PST) Received: from tirith.ics.muni.cz (tirith.ics.muni.cz [147.251.4.36]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA2NdlO0018155 for ; Wed, 2 Nov 2005 15:39:47 -0800 Received: from anxur.fi.muni.cz (anxur.fi.muni.cz [147.251.48.3]) by tirith.ics.muni.cz (8.13.2/8.13.2) with ESMTP id jA2NaTNi031908; Thu, 3 Nov 2005 00:36:30 +0100 Received: by anxur.fi.muni.cz (Postfix, from userid 11561) id 5708322AF74; Thu, 3 Nov 2005 00:36:29 +0100 (CET) Date: Thu, 3 Nov 2005 00:36:29 +0100 From: Jan Kasprzak To: Nathan Scott Cc: linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com Subject: Re: XFS information leak during crash Message-ID: <20051102233629.GD6759@fi.muni.cz> References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20051103101107.O6239737@wobbly.melbourne.sgi.com> User-Agent: Mutt/1.4.1i X-Muni-Spam-TestIP: 147.251.48.3 X-Muni-Envelope-From: kas@fi.muni.cz X-Muni-Virus-Test: Clean X-archive-position: 6499 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: kas@fi.muni.cz Precedence: bulk X-list: linux-xfs Content-Length: 3726 Lines: 83 Nathan Scott wrote: : > The result is that those files contain data from random blocks on the : > disk (e.g. from previously deleted files). This can have security/privacy : > implications - users can see the contents of other users' old files. : : If you think you have found a security issue, it would be courteous : to at least discuss this with the maintainers first. Well, I think while it is a security issue, it is not serious enough to make it secret (it is not exploitable by anyone except those who are able to crash the machine). 
: And since you
: are a frequent linux-xfs list poster too, it seems a bit odd that
: you're reporting this on linux-kernel instead... *shrug*, whatever.

I am sorry for this one - I am not subscribed to linux-xfs. Next time
I will post to linux-xfs first.
:
: This issue affects every filesystem, right? Or are you claiming it's
: only XFS affected here? Have you run your parallel-buffered-writers
: test case on any other filesystems? I'd be interested in the results,
: in particular, with all of the data=xxx modes of other filesystems.
:
I will do this tomorrow or the day after and post the results.

: > either). Does XFS support something like ext3's "data=ordered" mount
: > option?
:
: No, it doesn't.
:
OK.

: > Otherwise it is pretty unusable on multi-user systems.
:
: That's a ridiculous assertion. While this small metadata vs.
: buffered-data-write window exists on _any_ filesystem not using a
: data=ordered/data=journaled mode (which I believe is quite a common
: mode of operation even on filesystems that offer those modes),

As for ext3, I believe the vast majority of ext3 filesystems run in
data=ordered mode. But yes, the same problem affects all filesystems
except those running in data=ordered/journal mode.

: it is impossible to exploit
: this in any sane way. You'd think people on a multi-user system might
: actually notice the machine being frequently rebooted while you try to
: tickle this window to get at "interesting" uninitialised freespace, no?

Yes, of course. However, the issue is probably much worse on XFS,
because on XFS it probably affects not only the files being
created/extended, but also the files being rewritten. Most other
filesystems rewrite the files in-place, so when you rewrite the file,
even with data=writeback you get only a mix of the old and new
contents. Not somebody else's random data.

The particular problem was that one of my users apparently opened
his TeX document just to fix a few typos, and ended up with a file
which contained part of a shell script and some binary data :-(

I agree this is hard to exploit on purpose, however it can still
leak sensitive data. For example, this particular server also runs
a mail server for ~2200 users, so private mail can end up in
somebody else's files.

: Having said that, a data=ordered mode for XFS would be a nice feature.
: It just hasn't reached the top of our priority list, and it's not been
: offered up as a patch by anyone yet. If anyone's interested in writing
: this, they should coordinate with hch and myself - there's a fair bit
: of I/O path work being done at the moment, which in the end will make
: a data=ordered mode a lot easier to implement.

OK, thanks! I wish I had time to do more kernel hacking ...

-Yenya

--
| Jan "Yenya" Kasprzak |
| GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E |
| http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ |
> Specs are a basis for _talking_about_ things. But they are _not_ a basis <
> for implementing software.
--Linus Torvalds < From owner-linux-xfs@oss.sgi.com Wed Nov 2 15:53:10 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 15:53:12 -0800 (PST) Received: from larry.melbourne.sgi.com (mverd138.asia.info.net [61.14.31.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA2Nr8O0019036 for ; Wed, 2 Nov 2005 15:53:09 -0800 Received: from wobbly.melbourne.sgi.com (wobbly.melbourne.sgi.com [134.14.55.135]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA16148; Thu, 3 Nov 2005 10:49:46 +1100 Received: from wobbly.melbourne.sgi.com (localhost [127.0.0.1]) by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id jA2Nnwkt6282574; Thu, 3 Nov 2005 10:49:59 +1100 (EST) Received: (from nathans@localhost) by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id jA2NnumE6186888; Thu, 3 Nov 2005 10:49:56 +1100 (EST) Date: Thu, 3 Nov 2005 10:49:56 +1100 From: Nathan Scott To: Jan Kasprzak Cc: linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com Subject: Re: XFS information leak during crash Message-ID: <20051103104956.B6081538@wobbly.melbourne.sgi.com> References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> <20051102233629.GD6759@fi.muni.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20051102233629.GD6759@fi.muni.cz>; from kas@fi.muni.cz on Thu, Nov 03, 2005 at 12:36:29AM +0100 X-archive-position: 6500 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: nathans@sgi.com Precedence: bulk X-list: linux-xfs Content-Length: 919 Lines: 24 On Thu, Nov 03, 2005 at 12:36:29AM +0100, Jan Kasprzak wrote: > ... > Yes, of course. However, the issue is probably much worse > on XFS, because on XFS it probably affects not only the files being > created/extended, but also the files being rewritten. 
Most other No, thats not correct - XFS behaves as most filesystems do and will write over the top of existing data. > filesystems rewrite the files in-place, so when you rewrite the file, > even with data=writeback you get only the mix of the old and new > contents. Not somebody else's random data. XFS also rewrites files in-place. You will never get someone else's current data (that would be metadata corruption...), it would only ever be uninitialised, previously-free space. But as I said, other filesystems have the same window in which this can happen (in the absence of stronger data ordering/journalling semantics, of course). cheers. -- Nathan From owner-linux-xfs@oss.sgi.com Wed Nov 2 16:06:33 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 16:06:37 -0800 (PST) Received: from tirith.ics.muni.cz (tirith.ics.muni.cz [147.251.4.36]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA306WO0020268 for ; Wed, 2 Nov 2005 16:06:33 -0800 Received: from anxur.fi.muni.cz (anxur.fi.muni.cz [147.251.48.3]) by tirith.ics.muni.cz (8.13.2/8.13.2) with ESMTP id jA303Hj9026783; Thu, 3 Nov 2005 01:03:18 +0100 Received: by anxur.fi.muni.cz (Postfix, from userid 11561) id 76C1D22AF74; Thu, 3 Nov 2005 01:03:17 +0100 (CET) Date: Thu, 3 Nov 2005 01:03:17 +0100 From: Jan Kasprzak To: Nathan Scott Cc: linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com Subject: Re: XFS information leak during crash Message-ID: <20051103000317.GE6759@fi.muni.cz> References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> <20051102233629.GD6759@fi.muni.cz> <20051103104956.B6081538@wobbly.melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20051103104956.B6081538@wobbly.melbourne.sgi.com> User-Agent: Mutt/1.4.1i X-Muni-Spam-TestIP: 147.251.48.3 X-Muni-Envelope-From: kas@fi.muni.cz X-Muni-Virus-Test: Clean X-archive-position: 6501 X-ecartis-version: Ecartis v1.0.0 
Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: kas@fi.muni.cz Precedence: bulk X-list: linux-xfs Content-Length: 882 Lines: 27 Nathan Scott wrote: : XFS behaves as most filesystems do and : will write over the top of existing data. OK, thanks for the clarification. : XFS also rewrites files in-place. You will never get someone else's : current data (that would be metadata corruption...), Of course. : it would only : ever be uninitialised, previously-free space. Yes, but an old data from previously deleted files (sendmail's temporary files, vim save files, etc) may contain a sensitive information. -Y. -- | Jan "Yenya" Kasprzak | | GPG: ID 1024/D3498839 Fingerprint 0D99A7FB206605D7 8B35FCDE05B18A5E | | http://www.fi.muni.cz/~kas/ Journal: http://www.fi.muni.cz/~kas/blog/ | > Specs are a basis for _talking_about_ things. But they are _not_ a basis < > for implementing software. --Linus Torvalds < From owner-linux-xfs@oss.sgi.com Wed Nov 2 16:14:27 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 16:14:29 -0800 (PST) Received: from larry.melbourne.sgi.com (mverd138.asia.info.net [61.14.31.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA30EPO0020976 for ; Wed, 2 Nov 2005 16:14:26 -0800 Received: from wobbly.melbourne.sgi.com (wobbly.melbourne.sgi.com [134.14.55.135]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA16671; Thu, 3 Nov 2005 11:11:06 +1100 Received: from wobbly.melbourne.sgi.com (localhost [127.0.0.1]) by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id jA30BHkt6264294; Thu, 3 Nov 2005 11:11:18 +1100 (EST) Received: (from nathans@localhost) by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id jA30BFJq6285589; Thu, 3 Nov 2005 11:11:15 +1100 (EST) Date: Thu, 3 Nov 2005 11:11:15 +1100 From: Nathan Scott To: Jan Kasprzak Cc: linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com Subject: Re: XFS information leak during crash 
Message-ID: <20051103111115.C6081538@wobbly.melbourne.sgi.com> References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> <20051102233629.GD6759@fi.muni.cz> <20051103104956.B6081538@wobbly.melbourne.sgi.com> <20051103000317.GE6759@fi.muni.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20051103000317.GE6759@fi.muni.cz>; from kas@fi.muni.cz on Thu, Nov 03, 2005 at 01:03:17AM +0100 X-archive-position: 6502 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: nathans@sgi.com Precedence: bulk X-list: linux-xfs Content-Length: 417 Lines: 15 On Thu, Nov 03, 2005 at 01:03:17AM +0100, Jan Kasprzak wrote: > : it would only ever be uninitialised, previously-free space. > > Yes, but an old data from previously deleted files > (sendmail's temporary files, vim save files, etc) may contain > a sensitive information. Indeed. But this is a generic issue affecting most filesystems; its not specific to XFS as your original mail claimed. cheers. 
-- Nathan From owner-linux-xfs@oss.sgi.com Wed Nov 2 16:22:25 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 16:22:29 -0800 (PST) Received: from artax.karlin.mff.cuni.cz (artax.karlin.mff.cuni.cz [195.113.31.125]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA30MOO0025778 for ; Wed, 2 Nov 2005 16:22:25 -0800 Received: by artax.karlin.mff.cuni.cz (Postfix, from userid 17421) id A96893FFE; Thu, 3 Nov 2005 01:19:10 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by artax.karlin.mff.cuni.cz (Postfix) with ESMTP id A8CA73FF3; Thu, 3 Nov 2005 01:19:10 +0100 (CET) Date: Thu, 3 Nov 2005 01:19:10 +0100 (CET) From: Mikulas Patocka To: Nathan Scott Cc: Jan Kasprzak , linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com Subject: Re: XFS information leak during crash In-Reply-To: <20051103101107.O6239737@wobbly.melbourne.sgi.com> Message-ID: References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-archive-position: 6503 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: mikulas@artax.karlin.mff.cuni.cz Precedence: bulk X-list: linux-xfs Content-Length: 292 Lines: 11 >> either). Does XFS support a something like ext3's "data=ordered" mount >> option? > > No, it doesn't. BTW. Why does it sometimes overwrite files with zeros after crash and journal replay then? I thought that this was because it tries to avoid users seeing uninitialized data. 
Mikulas From owner-linux-xfs@oss.sgi.com Wed Nov 2 16:41:10 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 16:41:13 -0800 (PST) Received: from larry.melbourne.sgi.com (mverd138.asia.info.net [61.14.31.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA30f8O0026989 for ; Wed, 2 Nov 2005 16:41:09 -0800 Received: from wobbly.melbourne.sgi.com (wobbly.melbourne.sgi.com [134.14.55.135]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA17405; Thu, 3 Nov 2005 11:37:52 +1100 Received: from wobbly.melbourne.sgi.com (localhost [127.0.0.1]) by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id jA30c4kt6289716; Thu, 3 Nov 2005 11:38:04 +1100 (EST) Received: (from nathans@localhost) by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id jA30c1S76281310; Thu, 3 Nov 2005 11:38:01 +1100 (EST) Date: Thu, 3 Nov 2005 11:38:01 +1100 From: Nathan Scott To: Mikulas Patocka Cc: Jan Kasprzak , linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com Subject: Re: XFS information leak during crash Message-ID: <20051103113801.E6081538@wobbly.melbourne.sgi.com> References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from mikulas@artax.karlin.mff.cuni.cz on Thu, Nov 03, 2005 at 01:19:10AM +0100 X-archive-position: 6504 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: nathans@sgi.com Precedence: bulk X-list: linux-xfs Content-Length: 804 Lines: 24 On Thu, Nov 03, 2005 at 01:19:10AM +0100, Mikulas Patocka wrote: > >> either). Does XFS support a something like ext3's "data=ordered" mount > >> option? > > > > No, it doesn't. > > BTW. Why does it sometimes overwrite files with zeros after crash and > journal replay then? 
I thought that this was because it tries to avoid
> users seeing uninitialized data.

No, that's kinda related but not the same issue, it's more to do with
a truncate (or open(O_TRUNC)) followed by buffered writes to an
existing file, which some applications do, and how that interacts
poorly with delayed allocation (nothing is "overwritten with zeroes",
it's actually just a "hole"). But I do intend to get _some_ work done
today, so you can google for a more detailed answer there if you're
interested.

cheers.

--
Nathan

From owner-linux-xfs@oss.sgi.com Wed Nov 2 16:45:31 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Wed, 02 Nov 2005 16:45:36 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA30jUO0027519 for ; Wed, 2 Nov 2005 16:45:31 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id jA31olMO032345 for ; Wed, 2 Nov 2005 17:50:47 -0800 Received: from daisy-e236.americas.sgi.com (daisy-e236.americas.sgi.com [128.162.236.214]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id jA30gGDN19082675; Wed, 2 Nov 2005 18:42:16 -0600 (CST) Received: (from overby@localhost) by daisy-e236.americas.sgi.com (8.12.9/SGI-server-1.8) id jA30fu3a18715941; Wed, 2 Nov 2005 18:41:56 -0600 (CST) Date: Wed, 2 Nov 2005 18:41:56 -0600 (CST) Message-Id: <200511030041.jA30fu3a18715941@daisy-e236.americas.sgi.com> From: Glen Overby To: mikulas@artax.karlin.mff.cuni.cz Cc: nathans@sgi.com, linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com Subject: Re: XFS information leak during crash In-Reply-To: message from Mikulas Patocka sent 3 November 2005 References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> X-archive-position: 6505 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to:
linux-xfs-bounce@oss.sgi.com X-original-sender: overby@sgi.com Precedence: bulk X-list: linux-xfs Content-Length: 921 Lines: 23

On November 3, Mikulas Patocka wrote:
> BTW. Why does it sometimes overwrite files with zeros after crash and
> journal replay then? I thought that this was because it tries to avoid
> users seeing uninitialized data.

It doesn't overwrite the file with zeros. You're getting an inode that
has a non-zero size, but no data in the file. That is, a file that is a
single hole. This happens because XFS logs metadata quickly, but the
data in the file gets written more slowly.

You'll see the same zeroing if you create a sparse file: write a
megabyte of data, lseek forward a megabyte, and write another megabyte
of data. When reading the area you lseeked over, it will read as zeros.
The same is done for files that were preallocated, but haven't been
written to (that is, the file has unwritten extents).

You can look at the files in question with xfs_bmap -v and see that
there are no extents there.
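
The sparse-file recipe above can be tried with a short Python sketch: write a megabyte, seek forward a megabyte, write another megabyte, then read back the skipped range. The file name sparse.bin and the helper names are made up for illustration. Whether the skipped range is actually left unallocated depends on the filesystem, so the allocated-bytes figure is printed rather than assumed.

```python
import os
import tempfile

MIB = 1024 * 1024


def write_with_hole(path):
    """Write 1 MiB, skip 1 MiB with a seek, then write another 1 MiB."""
    with open(path, "wb") as f:
        f.write(b"a" * MIB)
        f.seek(MIB, os.SEEK_CUR)  # nothing is written to the skipped range
        f.write(b"b" * MIB)


def read_range(path, offset, length):
    """Return length bytes of the file starting at offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sparse.bin")
    write_with_hole(path)
    st = os.stat(path)
    hole = read_range(path, MIB, MIB)
    print("length:", st.st_size)
    # Assumption: st_blocks counts 512-byte units (true on Linux).
    print("allocated bytes:", st.st_blocks * 512)
    print("hole reads as zeros:", hole == bytes(MIB))
```

On a filesystem that supports holes, the allocated figure would be expected to come out smaller than the file length, and on XFS running xfs_bmap -v on such a file should show the gap in the extent list.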
Glen Overby From owner-linux-xfs@oss.sgi.com Thu Nov 3 04:20:12 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Thu, 03 Nov 2005 04:20:35 -0800 (PST) Received: from lxorguk.ukuu.org.uk (clock-tower.bc.nu [81.2.110.250] (may be forged)) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA3CKAO0022809 for ; Thu, 3 Nov 2005 04:20:11 -0800 Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by lxorguk.ukuu.org.uk (8.13.4/8.13.4) with ESMTP id jA3CjooI020022; Thu, 3 Nov 2005 12:45:50 GMT Received: (from alan@localhost) by localhost.localdomain (8.13.4/8.13.4/Submit) id jA3CjoC4020021; Thu, 3 Nov 2005 12:45:50 GMT X-Authentication-Warning: localhost.localdomain: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: XFS information leak during crash From: Alan Cox To: Nathan Scott Cc: Jan Kasprzak , linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com In-Reply-To: <20051103111115.C6081538@wobbly.melbourne.sgi.com> References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> <20051102233629.GD6759@fi.muni.cz> <20051103104956.B6081538@wobbly.melbourne.sgi.com> <20051103000317.GE6759@fi.muni.cz> <20051103111115.C6081538@wobbly.melbourne.sgi.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit Date: Thu, 03 Nov 2005 12:45:49 +0000 Message-Id: <1131021949.18848.21.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) X-archive-position: 6507 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: linux-xfs Content-Length: 586 Lines: 14 On Iau, 2005-11-03 at 11:11 +1100, Nathan Scott wrote: > On Thu, Nov 03, 2005 at 01:03:17AM +0100, Jan Kasprzak wrote: > > : it would only ever be uninitialised, previously-free space. 
> > > > Yes, but an old data from previously deleted files > > (sendmail's temporary files, vim save files, etc) may contain > > a sensitive information. > > Indeed. But this is a generic issue affecting most filesystems; > its not specific to XFS as your original mail claimed. Very true. You can use ext3 in data journalling mode if this is a concern but that guarantee has a performance cost From owner-linux-xfs@oss.sgi.com Thu Nov 3 08:37:32 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Thu, 03 Nov 2005 08:37:37 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA3GbVO0025684 for ; Thu, 3 Nov 2005 08:37:32 -0800 Received: from ledzep.americas.sgi.com (ledzep.americas.sgi.com [198.149.16.14]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id jA3HgrRB023621 for ; Thu, 3 Nov 2005 09:42:53 -0800 Received: from [128.162.232.50] (stout.americas.sgi.com [128.162.232.50]) by ledzep.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id jA3GYGsL24723656; Thu, 3 Nov 2005 10:34:16 -0600 (CST) Message-ID: <436A3C07.20402@sgi.com> Date: Thu, 03 Nov 2005 10:34:15 -0600 From: Eric Sandeen User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Renaat Dumon CC: linux-xfs@oss.sgi.com Subject: Re: XFS corruption on 2.4.28 References: <20051103112237.5C045177612@postit.belbone.be> In-Reply-To: <20051103112237.5C045177612@postit.belbone.be> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 6508 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: sandeen@sgi.com Precedence: bulk X-list: linux-xfs Content-Length: 842 Lines: 25 Renaat Dumon wrote: > Hi Eric, > > I've put the parameters back, and had a contact with one the "platform" guys > of my vendor. 
>
> When I asked if he could think of something to explain the fact that
> apparently only the small .db files were affected, he wasn't sure. But he
> did mention these files were designed to be this small, so they could be
> stored on disk in meta-data only. While I don't know what this really means,
> I thought I'd run this by you to see if maybe it could isolate the issue a
> little bit further.

It means that for very small files, the file data can be stored in the disk
inode itself, rather than in an extent outside the inode. Not a bad idea.

If you're in touch w/ the vendor, perhaps you can work with him to try a stock
2.4.28 kernel, see if the problem persists.

-Eric

> Thanks,
>
> Renaat

From owner-linux-xfs@oss.sgi.com Thu Nov 3 09:08:59 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Thu, 03 Nov 2005 09:09:10 -0800 (PST)
Received: from thunker.thunk.org (thunk.org [69.25.196.29]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA3H8xO0027952 for ; Thu, 3 Nov 2005 09:08:59 -0800
Received: from root (helo=think.thunk.org) by thunker.thunk.org with local-esmtp (Exim 3.35 #1 (Debian)) id 1EXiWn-00050J-00; Thu, 03 Nov 2005 12:05:29 -0500
Received: from tytso by think.thunk.org with local (Exim 4.54) id 1EXiWl-0001s0-74; Thu, 03 Nov 2005 12:05:27 -0500
Date: Thu, 3 Nov 2005 12:05:27 -0500
From: "Theodore Ts'o"
To: Alan Cox
Cc: Nathan Scott , Jan Kasprzak , linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com
Subject: Re: XFS information leak during crash
Message-ID: <20051103170527.GA7113@thunk.org>
Mail-Followup-To: Theodore Ts'o , Alan Cox , Nathan Scott , Jan Kasprzak , linux-kernel@vger.kernel.org, linux-xfs@oss.sgi.com
References: <20051102212722.GC6759@fi.muni.cz> <20051103101107.O6239737@wobbly.melbourne.sgi.com> <20051102233629.GD6759@fi.muni.cz> <20051103104956.B6081538@wobbly.melbourne.sgi.com> <20051103000317.GE6759@fi.muni.cz> <20051103111115.C6081538@wobbly.melbourne.sgi.com> <1131021949.18848.21.camel@localhost.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1131021949.18848.21.camel@localhost.localdomain>
User-Agent: Mutt/1.5.11
X-archive-position: 6509
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: tytso@mit.edu
Precedence: bulk
X-list: linux-xfs
Content-Length: 763
Lines: 20

On Thu, Nov 03, 2005 at 12:45:49PM +0000, Alan Cox wrote:
> On Iau, 2005-11-03 at 11:11 +1100, Nathan Scott wrote:
> > On Thu, Nov 03, 2005 at 01:03:17AM +0100, Jan Kasprzak wrote:
> > > : it would only ever be uninitialised, previously-free space.
> > >
> > > Yes, but old data from previously deleted files
> > > (sendmail's temporary files, vim save files, etc) may contain
> > > sensitive information.
> >
> > Indeed. But this is a generic issue affecting most filesystems;
> > it's not specific to XFS as your original mail claimed.
>
> Very true. You can use ext3 in data journalling mode if this is a
> concern, but that guarantee has a performance cost.

The default ordered journalling mode solves this problem at a much lower
cost.
- Ted From owner-linux-xfs@oss.sgi.com Thu Nov 3 12:36:29 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Thu, 03 Nov 2005 12:36:31 -0800 (PST) Received: from science.horizon.com (science.horizon.com [192.35.100.1]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA3KaRO0013331 for ; Thu, 3 Nov 2005 12:36:28 -0800 Received: (qmail 27627 invoked by uid 1000); 3 Nov 2005 15:33:13 -0500 Date: 3 Nov 2005 15:33:13 -0500 Message-ID: <20051103203313.27622.qmail@science.horizon.com> From: linux@horizon.com To: sandeen@sgi.com Subject: Re: 2.6.13.2 amd64: XFS: xlog_recover_process_data: bad clientid Cc: linux@horizon.com, linux-xfs@oss.sgi.com In-Reply-To: <20051102221800.16064.qmail@science.horizon.com> X-archive-position: 6510 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: linux@horizon.com Precedence: bulk X-list: linux-xfs Content-Length: 6481 Lines: 164 Ah! The *second* xfs_repair run ended in a segfault.... (xfs_repair version 2.6.36, debian package xfsprogs 2.6.36-1) I'm trying a third... # xfs_repair /dev/md4 Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 imap claims a free inode 129571903 is in use, correcting imap and clearing inode cleared inode 129571903 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 no .. entry for directory 313303087 - agno = 10 - agno = 11 no .. entry for directory 386031650 mismatch between format (2) and size (0) in directory ino 386031671 cleared inode 386031671 - agno = 12 entry "/." at block 0 offset 32 in directory inode 424609817 references invalid inode 18374686479671623679 clearing inode number in entry at offset 32... 
entry at block 0 offset 32 in directory inode 424609817 has illegal name "/.": no .. entry for directory 424609817 entry "/." at block 0 offset 32 in directory inode 424609838 references invalid inode 18374686479671623679 clearing inode number in entry at offset 32... entry at block 0 offset 32 in directory inode 424609838 has illegal name "/.": entry "/alpha" at block 0 offset 72 in directory inode 424609838 references invalid inode 18374686479671623679 clearing inode number in entry at offset 72... entry at block 0 offset 72 in directory inode 424609838 has illegal name "/alpha": entry "/arm" at block 0 offset 112 in directory inode 424609838 references invalid inode 18374686479671623679 clearing inode number in entry at offset 112... entry at block 0 offset 112 in directory inode 424609838 has illegal name "/arm": entry "/c4x" at block 0 offset 144 in directory inode 424609838 references invalid inode 18374686479671623679 clearing inode number in entry at offset 144... entry at block 0 offset 144 in directory inode 424609838 has illegal name "/c4x": entry "/frv" at block 0 offset 648 in directory inode 424609838 references invalid inode 18374686479671623679 clearing inode number in entry at offset 648... entry at block 0 offset 648 in directory inode 424609838 has illegal name "/frv": entry "/i386" at block 0 offset 752 in directory inode 424609838 references invalid inode 18374686479671623679 clearing inode number in entry at offset 752... entry at block 0 offset 752 in directory inode 424609838 has illegal name "/i386": no .. entry for directory 424609838 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... - clearing existing "lost+found" inode - marking entry "lost+found" to be deleted - check for inodes claiming duplicate blocks... 
- agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 no .. entry for directory 313303087 - agno = 10 - agno = 11 - agno = 12 no .. entry for directory 424609817 entry "fcris" at block 0 offset 184 in directory inode 424609838 references free inode 129571903 clearing inode number in entry at offset 184... entry "fh8300" at block 0 offset 712 in directory inode 424609838 references free inode 386031671 clearing inode number in entry at offset 712... no .. entry for directory 424609838 - agno = 13 - agno = 14 - agno = 15 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... rebuilding directory inode 1024 - traversal finished ... - traversing all unattached subtrees ... rebuilding directory inode 424609817 rebuilding directory inode 424609838 corrupt block 0 in directory inode 424609839: junking block Segmentation fault Run number 3 actually finished... # xfs_repair /dev/md4 Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 no .. entry for directory 313303087 - agno = 10 - agno = 11 no .. entry for directory 386031650 - agno = 12 no .. entry for directory 424609817 no .. entry for directory 424609838 mismatch between format (2) and size (0) in directory ino 424609839 cleared inode 424609839 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... 
- clearing existing "lost+found" inode - marking entry "lost+found" to be deleted - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 no .. entry for directory 313303087 - agno = 10 [missed some here] resetting inode 278955071 nlinks from 9 to 8 resetting inode 278958082 nlinks from 9 to 8 resetting inode 278958083 nlinks from 7 to 6 resetting inode 278958084 nlinks from 7 to 6 resetting inode 278958085 nlinks from 7 to 6 resetting inode 278958086 nlinks from 7 to 6 resetting inode 278958087 nlinks from 7 to 6 [...] resetting inode 515270696 nlinks from 5 to 6 resetting inode 515270697 nlinks from 3 to 4 resetting inode 515270710 nlinks from 3 to 4 resetting inode 523361301 nlinks from 17 to 16 done From owner-linux-xfs@oss.sgi.com Thu Nov 3 13:34:37 2005 Received: with ECARTIS (v1.0.0; list linux-xfs); Thu, 03 Nov 2005 13:34:45 -0800 (PST) Received: from larry.melbourne.sgi.com (mverd138.asia.info.net [61.14.31.138]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA3LYZO0016787 for ; Thu, 3 Nov 2005 13:34:36 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id IAA13407; Fri, 4 Nov 2005 08:31:16 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16302) id 6ABB849E5EF0; Fri, 4 Nov 2005 08:31:15 +1100 (EST) To: linux-xfs@oss.sgi.com, sgi.bugs.xfs@engr.sgi.com Subject: PARTIAL TAKE 945242 - fix inode32 mode Message-Id: <20051103213115.6ABB849E5EF0@chook.melbourne.sgi.com> Date: Fri, 4 Nov 2005 08:31:15 +1100 (EST) From: nathans@sgi.com (Nathan Scott) X-archive-position: 6511 X-ecartis-version: Ecartis v1.0.0 Sender: linux-xfs-bounce@oss.sgi.com Errors-to: linux-xfs-bounce@oss.sgi.com X-original-sender: nathans@sgi.com Precedence: bulk X-list: linux-xfs Content-Length: 573 Lines: 15 Fix an inode32 regression - if no options are 
presented, must still set default flags.
Thanks to Chris Pascoe for finding and fixing.

Date: Fri Nov 4 08:30:01 AEDT 2005
Workarea: chook.melbourne.sgi.com:/build/nathans/xfs-linux
Inspected by: Christopher Pascoe

The following file(s) were checked into:
  longdrop.melbourne.sgi.com:/isms/xfs-kern/xfs-linux-melb

Modid: xfs-linux-melb:xfs-kern:24292a

xfs_vfsops.c - 1.485 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.485&r2=text&tr2=1.484&f=h

From owner-linux-xfs@oss.sgi.com Fri Nov 4 07:23:35 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 07:23:40 -0800 (PST)
Received: from omx1.americas.sgi.com (omx1-ext.sgi.com [192.48.179.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA4FNYO0016452 for ; Fri, 4 Nov 2005 07:23:34 -0800
Received: from ledzep.americas.sgi.com (ledzep.americas.sgi.com [198.149.16.14]) by omx1.americas.sgi.com (8.12.10/8.12.9/linux-outbound_gateway-1.1) with ESMTP id jA4FKIxT020004 for ; Fri, 4 Nov 2005 09:20:19 -0600
Received: from [128.162.232.50] (stout.americas.sgi.com [128.162.232.50]) by ledzep.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id jA4FKFsL24785165; Fri, 4 Nov 2005 09:20:15 -0600 (CST)
Message-ID: <436B7C2E.1000304@sgi.com>
Date: Fri, 04 Nov 2005 09:20:14 -0600
From: Eric Sandeen
User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Chris Elston
CC: linux-xfs@oss.sgi.com, nathans@sgi.com, jchapman@katalix.com
Subject: Re: [PATCH] Re: Files >4GB in XFS realtime partition
References: <200510311812.j9VICiO0030280@oss.sgi.com>
In-Reply-To: <200510311812.j9VICiO0030280@oss.sgi.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 6513
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: sandeen@sgi.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 423
Lines: 16

Chris Elston wrote:
> See
> http://marc.theaimsgroup.com/?t=111642565400004&r=1&w=2
> for original report.

Did you guys ever try the xfs_io sequence as suggested by Nathan in that
original report? It passed for Nathan on x86; maybe you guys can try it on
your mips rig too, just to satisfy our curiosity.

I'll go stare at code a bit today, too, try to convince myself that your
patch is correct. :)

Thanks,

-Eric

From owner-linux-xfs@oss.sgi.com Fri Nov 4 11:22:02 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 11:22:07 -0800 (PST)
Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA4JM2O0001346 for ; Fri, 4 Nov 2005 11:22:02 -0800
Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id jA4KRXLF006174 for ; Fri, 4 Nov 2005 12:27:33 -0800
Received: from [128.162.232.50] (stout.americas.sgi.com [128.162.232.50]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id jA4JHhDN19208798; Fri, 4 Nov 2005 13:17:43 -0600 (CST)
Message-ID: <436BB3D7.10601@sgi.com>
Date: Fri, 04 Nov 2005 13:17:43 -0600
From: Eric Sandeen
User-Agent: Mozilla Thunderbird 1.0.6-1.1.fc4 (X11/20050720)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Chris Elston
CC: linux-xfs@oss.sgi.com, nathans@sgi.com, jchapman@katalix.com
Subject: Re: [PATCH] Re: Files >4GB in XFS realtime partition
References: <200510311812.j9VICiO0030280@oss.sgi.com>
In-Reply-To: <200510311812.j9VICiO0030280@oss.sgi.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 6514
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: sandeen@sgi.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 809
Lines: 18

Chris Elston wrote:
--- TDC775-2.4.29/fs/xfs/xfs_iomap.h 2005-10-31 15:37:54.000000000 +0000
> +++ PATCHED/fs/xfs/xfs_iomap.h 2005-10-31 15:18:29.000000000 +0000
> @@ -86,7 +86,7 @@
>   xfs_buftarg_t *iomap_target;
>   loff_t iomap_offset; /* offset of mapping, bytes */
>   loff_t iomap_bsize; /* size of mapping, bytes */
> - size_t iomap_delta; /* offset into mapping, bytes */
> + loff_t iomap_delta; /* offset into mapping, bytes */
>   iomap_flags_t iomap_flags;
>  } xfs_iomap_t;

Yep, I agree that this is correct. The way I was trying to expose it wasn't
quite the right approach, but as I traced it through it's pretty obviously
correct.

Thanks!

-Eric

From owner-linux-xfs@oss.sgi.com Fri Nov 4 14:36:34 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 14:36:39 -0800 (PST)
Received: from omx1.americas.sgi.com (omx1-ext.sgi.com [192.48.179.11]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA4MaXO0015593 for ; Fri, 4 Nov 2005 14:36:34 -0800
Received: from internal-mail-relay.corp.sgi.com (internal-mail-relay.corp.sgi.com [198.149.32.51]) by omx1.americas.sgi.com (8.12.10/8.12.9/linux-outbound_gateway-1.1) with ESMTP id jA4MXIxT029891 for ; Fri, 4 Nov 2005 16:33:18 -0600
Received: from naboo.americas.sgi.com (naboo.americas.sgi.com [128.162.233.73]) by internal-mail-relay.corp.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id jA4MXH2Z282498900; Fri, 4 Nov 2005 14:33:18 -0800 (PST)
Received: from naboo.americas.sgi.com (localhost [127.0.0.1]) by naboo.americas.sgi.com (8.13.3/8.13.3) with ESMTP id jA4MXHAB031069; Fri, 4 Nov 2005 16:33:17 -0600
Received: (from hch@localhost) by naboo.americas.sgi.com (8.13.3/8.13.3/Submit) id jA4MXHGQ031068; Fri, 4 Nov 2005 16:33:17 -0600
Date: Fri, 4 Nov 2005 16:33:17 -0600
From: Christoph Hellwig
Message-Id: <200511042233.jA4MXHGQ031068@naboo.americas.sgi.com>
To: linux-xfs@oss.sgi.com, sgi.bugs.xfs@fido.engr.sgi.com
Subject: TAKE 941804 - remove over-eager assert
X-archive-position: 6515
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: hch@relay.sgi.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 520
Lines: 16

Date: Fri Nov 4 14:33:06 PST 2005
Workarea: naboo.americas.sgi.com:/go/space/hch/xfs-2.6.x
Inspected by: yingping

The following file(s) were checked into:
  bonnie.engr.sgi.com:/isms/linux/2.6.x-xfs

Modid: xfs-linux:xfs-kern:201702a

fs/xfs/xfs_vnodeops.c - 1.656 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.656&r2=text&tr2=1.655&f=h
  - i_mapping.nrpages may be non-zero for device inodes. the vfs already
    checks i_data.nrpages which is what we care about.

From owner-linux-xfs@oss.sgi.com Fri Nov 4 15:50:05 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 15:50:08 -0800 (PST)
Received: from science.horizon.com (science.horizon.com [192.35.100.1]) by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA4No3O0019248 for ; Fri, 4 Nov 2005 15:50:04 -0800
Received: (qmail 10452 invoked by uid 1000); 4 Nov 2005 18:46:48 -0500
Date: 4 Nov 2005 18:46:48 -0500
Message-ID: <20051104234648.10451.qmail@science.horizon.com>
From: linux@horizon.com
To: linux-xfs@oss.sgi.com
Subject: Should xfs_repair make xfs_check stop complaining?
Cc: linux@horizon.com
In-Reply-To: <20051103203313.27622.qmail@science.horizon.com>
X-archive-position: 6516
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: linux@horizon.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 17703
Lines: 378

Um, just wondering...

I have a file system, on which I have run xfs_repair six times, and
xfs_check still has complaints about it.
I understand the xfs_repair rebuilds lost+found every time, so it keeps finding unreferenced files, but xfs_repair keeps fixing things like > resetting inode 335565855 nlinks from 14 to 15 but leaves the subsequent inodes for xfs_check to complain about: > link count mismatch for inode 335565856 (name ?), nlink 14, counted 15 > link count mismatch for inode 335565857 (name ?), nlink 14, counted 15 Note that they're NOT the same inode number, so it's as if xfs_repair is missing some problems. Looking at the multiple runs, I see different inode numbers each time. xfs_repair and xfs_db are both version 2.6.36. Is this normal behaviour? I'm used to e2fsck which complains loudly if it leaves uncorrected errors. xfs_repair (#6) said: Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 - process newly discovered inodes... Phase 4 - check for duplicate blocks... - setting up duplicate extent list... - clear lost+found (if it exists) ... - clearing existing "lost+found" inode - marking entry "lost+found" to be deleted - check for inodes claiming duplicate blocks... - agno = 0 - agno = 1 - agno = 2 - agno = 3 - agno = 4 - agno = 5 - agno = 6 - agno = 7 - agno = 8 - agno = 9 - agno = 10 - agno = 11 - agno = 12 - agno = 13 - agno = 14 - agno = 15 Phase 5 - rebuild AG headers and trees... - reset superblock... Phase 6 - check inode connectivity... - resetting contents of realtime bitmap and summary inodes - ensuring existence of lost+found directory - traversing filesystem starting at / ... rebuilding directory inode 1024 - traversal finished ... 
- traversing all unattached subtrees ... - traversals finished ... - moving disconnected inodes to lost+found ... disconnected dir inode 15509532, moving to lost+found disconnected dir inode 15509538, moving to lost+found disconnected inode 33556494, moving to lost+found disconnected dir inode 51346432, moving to lost+found disconnected dir inode 51346437, moving to lost+found disconnected inode 67238927, moving to lost+found disconnected inode 67263492, moving to lost+found disconnected inode 67263493, moving to lost+found disconnected dir inode 84868125, moving to lost+found disconnected dir inode 129571895, moving to lost+found disconnected inode 167868474, moving to lost+found disconnected inode 167868477, moving to lost+found disconnected inode 167868479, moving to lost+found disconnected inode 167869440, moving to lost+found disconnected inode 167869442, moving to lost+found disconnected inode 167869444, moving to lost+found disconnected inode 167869445, moving to lost+found disconnected inode 167869449, moving to lost+found disconnected inode 167869454, moving to lost+found disconnected inode 193486865, moving to lost+found disconnected dir inode 287203344, moving to lost+found disconnected dir inode 313303087, moving to lost+found disconnected dir inode 349614088, moving to lost+found disconnected dir inode 386031650, moving to lost+found disconnected dir inode 386031670, moving to lost+found disconnected inode 402863163, moving to lost+found disconnected dir inode 424609813, moving to lost+found disconnected dir inode 424609817, moving to lost+found disconnected dir inode 424609838, moving to lost+found disconnected dir inode 486566967, moving to lost+found Phase 7 - verify and correct link counts... 
resetting inode 47615030 nlinks from 9 to 10 resetting inode 72902656 nlinks from 3 to 4 resetting inode 72902657 nlinks from 3 to 4 resetting inode 72902658 nlinks from 3 to 4 resetting inode 72902659 nlinks from 3 to 4 resetting inode 72902660 nlinks from 3 to 4 resetting inode 72902661 nlinks from 3 to 4 resetting inode 72902662 nlinks from 3 to 4 resetting inode 72902663 nlinks from 3 to 4 resetting inode 72902664 nlinks from 3 to 4 resetting inode 72902665 nlinks from 3 to 4 resetting inode 72902666 nlinks from 3 to 4 resetting inode 72902667 nlinks from 3 to 4 resetting inode 72902668 nlinks from 3 to 4 resetting inode 72902669 nlinks from 3 to 4 resetting inode 72902670 nlinks from 3 to 4 resetting inode 72902671 nlinks from 3 to 4 resetting inode 72902672 nlinks from 3 to 4 resetting inode 72902673 nlinks from 3 to 4 resetting inode 72902674 nlinks from 3 to 4 resetting inode 72902675 nlinks from 3 to 4 resetting inode 72902676 nlinks from 3 to 4 resetting inode 72902677 nlinks from 3 to 4 resetting inode 72902678 nlinks from 3 to 4 resetting inode 72902679 nlinks from 3 to 4 resetting inode 72902680 nlinks from 3 to 4 resetting inode 72902681 nlinks from 3 to 4 resetting inode 72902682 nlinks from 3 to 4 resetting inode 72902683 nlinks from 3 to 4 resetting inode 72902684 nlinks from 3 to 4 resetting inode 72902685 nlinks from 3 to 4 resetting inode 72902686 nlinks from 3 to 4 resetting inode 72902687 nlinks from 3 to 4 resetting inode 302496781 nlinks from 14 to 15 resetting inode 302496782 nlinks from 20 to 21 resetting inode 302496783 nlinks from 20 to 21 resetting inode 302496784 nlinks from 14 to 15 resetting inode 302496785 nlinks from 20 to 21 resetting inode 302496786 nlinks from 20 to 21 resetting inode 302496787 nlinks from 14 to 15 resetting inode 302496788 nlinks from 20 to 21 resetting inode 302496789 nlinks from 14 to 15 resetting inode 335565855 nlinks from 14 to 15 done and xfs_check then said: block 2/262 expected type unknown got free2 
block 2/263 expected type unknown got free2 block 2/264 expected type unknown got free2 block 2/265 expected type unknown got free2 block 2/266 expected type unknown got free2 block 2/158606 expected type unknown got free2 link count mismatch for inode 9521204 (name ?), nlink 18, counted 17 link count mismatch for inode 9521206 (name ?), nlink 18, counted 17 link count mismatch for inode 9521207 (name ?), nlink 18, counted 17 link count mismatch for inode 8159262 (name ?), nlink 2277, counted 2278 link count mismatch for inode 4295701 (name ?), nlink 13, counted 12 link count mismatch for inode 4295702 (name ?), nlink 13, counted 12 link count mismatch for inode 4295703 (name ?), nlink 11, counted 10 link count mismatch for inode 4295705 (name ?), nlink 13, counted 12 link count mismatch for inode 37956638 (name ?), nlink 3, counted 4 link count mismatch for inode 37956639 (name ?), nlink 3, counted 4 link count mismatch for inode 47756317 (name ?), nlink 57, counted 62 link count mismatch for inode 47756319 (name ?), nlink 14, counted 15 link count mismatch for inode 39545890 (name ?), nlink 149, counted 150 link count mismatch for inode 47281154 (name ?), nlink 16, counted 17 link count mismatch for inode 36780063 (name ?), nlink 2, counted 3 link count mismatch for inode 67423241 (name ?), nlink 445, counted 448 link count mismatch for inode 72395795 (name ?), nlink 6, counted 7 link count mismatch for inode 72395796 (name ?), nlink 10, counted 11 link count mismatch for inode 72395799 (name ?), nlink 6, counted 7 link count mismatch for inode 72395800 (name ?), nlink 6, counted 7 link count mismatch for inode 72395801 (name ?), nlink 6, counted 7 link count mismatch for inode 72395802 (name ?), nlink 6, counted 7 link count mismatch for inode 72395803 (name ?), nlink 8, counted 9 link count mismatch for inode 107157534 (name ?), nlink 8, counted 9 link count mismatch for inode 107157541 (name ?), nlink 3, counted 4 link count mismatch for inode 107157542 (name 
?), nlink 3, counted 4 link count mismatch for inode 107157543 (name ?), nlink 3, counted 4 link count mismatch for inode 107157544 (name ?), nlink 3, counted 4 link count mismatch for inode 107157545 (name ?), nlink 3, counted 4 link count mismatch for inode 107157546 (name ?), nlink 3, counted 4 link count mismatch for inode 107157547 (name ?), nlink 3, counted 4 link count mismatch for inode 107157548 (name ?), nlink 3, counted 4 link count mismatch for inode 107157550 (name ?), nlink 3, counted 4 link count mismatch for inode 109702154 (name ?), nlink 3, counted 4 link count mismatch for inode 109702156 (name ?), nlink 5, counted 6 link count mismatch for inode 109702157 (name ?), nlink 5, counted 6 link count mismatch for inode 109702158 (name ?), nlink 5, counted 6 link count mismatch for inode 109702159 (name ?), nlink 3, counted 4 link count mismatch for inode 144276493 (name ?), nlink 24, counted 25 link count mismatch for inode 144276494 (name ?), nlink 24, counted 25 link count mismatch for inode 144276495 (name ?), nlink 24, counted 25 link count mismatch for inode 144920600 (name ?), nlink 6, counted 7 link count mismatch for inode 144920601 (name ?), nlink 8, counted 9 link count mismatch for inode 144920607 (name ?), nlink 6, counted 7 link count mismatch for inode 144920608 (name ?), nlink 6, counted 7 link count mismatch for inode 144920609 (name ?), nlink 8, counted 9 link count mismatch for inode 144058403 (name ?), nlink 9, counted 10 link count mismatch for inode 241771561 (name ?), nlink 3, counted 4 link count mismatch for inode 240273428 (name ?), nlink 7, counted 6 link count mismatch for inode 240273430 (name ?), nlink 9, counted 8 link count mismatch for inode 240273431 (name ?), nlink 7, counted 6 link count mismatch for inode 240273432 (name ?), nlink 9, counted 8 link count mismatch for inode 240273433 (name ?), nlink 9, counted 8 link count mismatch for inode 240273434 (name ?), nlink 9, counted 8 link count mismatch for inode 
240273435 (name ?), nlink 7, counted 6 link count mismatch for inode 240273436 (name ?), nlink 9, counted 8 link count mismatch for inode 240273437 (name ?), nlink 7, counted 6 link count mismatch for inode 240273438 (name ?), nlink 9, counted 8 link count mismatch for inode 240273439 (name ?), nlink 9, counted 8 link count mismatch for inode 346406949 (name ?), nlink 3, counted 4 link count mismatch for inode 340941885 (name ?), nlink 10, counted 11 link count mismatch for inode 340941886 (name ?), nlink 10, counted 11 link count mismatch for inode 343417900 (name ?), nlink 18, counted 17 link count mismatch for inode 343417902 (name ?), nlink 16, counted 15 link count mismatch for inode 340957216 (name ?), nlink 10, counted 11 link count mismatch for inode 340957217 (name ?), nlink 10, counted 11 link count mismatch for inode 335565856 (name ?), nlink 14, counted 15 link count mismatch for inode 335565857 (name ?), nlink 14, counted 15 link count mismatch for inode 383432717 (name ?), nlink 18, counted 17 link count mismatch for inode 383432719 (name ?), nlink 18, counted 17 link count mismatch for inode 424606730 (name ?), nlink 16, counted 18 link count mismatch for inode 414889995 (name ?), nlink 17, counted 18 link count mismatch for inode 414889996 (name ?), nlink 17, counted 18 link count mismatch for inode 414889998 (name ?), nlink 17, counted 18 link count mismatch for inode 414889999 (name ?), nlink 17, counted 18 link count mismatch for inode 404029469 (name ?), nlink 15, counted 16 link count mismatch for inode 410588170 (name ?), nlink 4, counted 3 link count mismatch for inode 442877964 (name ?), nlink 94, counted 95 link count mismatch for inode 441138178 (name ?), nlink 4, counted 3 link count mismatch for inode 441138180 (name ?), nlink 4, counted 3 link count mismatch for inode 437784616 (name ?), nlink 17, counted 18 link count mismatch for inode 473682976 (name ?), nlink 13, counted 12 link count mismatch for inode 473682978 (name ?), nlink 13, 
counted 12 link count mismatch for inode 473682979 (name ?), nlink 13, counted 12 link count mismatch for inode 473682980 (name ?), nlink 13, counted 12 link count mismatch for inode 473682981 (name ?), nlink 13, counted 12 link count mismatch for inode 473682982 (name ?), nlink 13, counted 12 link count mismatch for inode 477552672 (name ?), nlink 4, counted 3 link count mismatch for inode 477552673 (name ?), nlink 4, counted 3 link count mismatch for inode 477552674 (name ?), nlink 4, counted 3 link count mismatch for inode 477552675 (name ?), nlink 4, counted 3 link count mismatch for inode 477552676 (name ?), nlink 4, counted 3 link count mismatch for inode 477552677 (name ?), nlink 4, counted 3 link count mismatch for inode 477552678 (name ?), nlink 4, counted 3 link count mismatch for inode 477552679 (name ?), nlink 4, counted 3 link count mismatch for inode 477552680 (name ?), nlink 4, counted 3 link count mismatch for inode 477552681 (name ?), nlink 4, counted 3 link count mismatch for inode 477552682 (name ?), nlink 4, counted 3 link count mismatch for inode 477552683 (name ?), nlink 4, counted 3 link count mismatch for inode 477580336 (name ?), nlink 4, counted 3 link count mismatch for inode 473673741 (name ?), nlink 7, counted 8 link count mismatch for inode 514965508 (name ?), nlink 16, counted 15 link count mismatch for inode 514610208 (name ?), nlink 3, counted 4 link count mismatch for inode 514610209 (name ?), nlink 3, counted 4 If it helps, the output of xfs_repair #5 was: Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Phase 3 - for each AG... - scan and clear agi unlinked lists... - process known inodes and perform inode discovery... 
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - clear lost+found (if it exists) ...
        - clearing existing "lost+found" inode
        - marking entry "lost+found" to be deleted
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - ensuring existence of lost+found directory
        - traversing filesystem starting at / ...
rebuilding directory inode 1024
        - traversal finished ...
        - traversing all unattached subtrees ...
        - traversals finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 15509532, moving to lost+found
disconnected dir inode 15509538, moving to lost+found
disconnected inode 33556494, moving to lost+found
disconnected dir inode 51346432, moving to lost+found
disconnected dir inode 51346437, moving to lost+found
disconnected inode 67238927, moving to lost+found
disconnected inode 67263492, moving to lost+found
disconnected inode 67263493, moving to lost+found
disconnected dir inode 84868125, moving to lost+found
disconnected dir inode 129571895, moving to lost+found
disconnected inode 167868474, moving to lost+found
disconnected inode 167868477, moving to lost+found
disconnected inode 167868479, moving to lost+found
disconnected inode 167869440, moving to lost+found
disconnected inode 167869442, moving to lost+found
disconnected inode 167869444, moving to lost+found
disconnected inode 167869445, moving to lost+found
disconnected inode 167869449, moving to lost+found
disconnected inode 167869454, moving to lost+found
disconnected inode 193486865, moving to lost+found
disconnected dir inode 287203344, moving to lost+found
disconnected dir inode 313303087, moving to lost+found
disconnected dir inode 349614088, moving to lost+found
disconnected dir inode 386031650, moving to lost+found
disconnected dir inode 386031670, moving to lost+found
disconnected inode 402863163, moving to lost+found
disconnected dir inode 424609813, moving to lost+found
disconnected dir inode 424609817, moving to lost+found
disconnected dir inode 424609838, moving to lost+found
disconnected dir inode 486566967, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 47756320 nlinks from 14 to 15
resetting inode 47756332 nlinks from 14 to 15
resetting inode 47756342 nlinks from 14 to 15
resetting inode 47756345 nlinks from 40 to 43
resetting inode 47756350 nlinks from 27 to 29
resetting inode 201775157 nlinks from 14 to 15
resetting inode 201775158 nlinks from 14 to 15
resetting inode 269022216 nlinks from 9 to 10
resetting inode 269022217 nlinks from 9 to 10
resetting inode 269099015 nlinks from 9 to 10
resetting inode 269099016 nlinks from 9 to 10
resetting inode 269415486 nlinks from 9 to 10
done

From owner-linux-xfs@oss.sgi.com Fri Nov 4 17:07:31 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 17:07:33 -0800 (PST)
Received: from smtp.well.com (smtp.well.com [206.80.4.7])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA517UO0027606
    for ; Fri, 4 Nov 2005 17:07:30 -0800
X-WELL-Auth: Yes
Received: from well.com (well.com [206.80.4.5])
    by smtp.well.com (8.13.5/8.13.5) with ESMTP id jA514Fco026014
    for ; Fri, 4 Nov 2005 17:04:15 -0800 (PST)
Date: Fri, 4 Nov 2005 17:04:15 -0800 (PST)
From: Neil Harkins
To: linux-xfs@oss.sgi.com
Subject: xfsdump, the journal, and incremental dumps of subtrees...
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-archive-position: 6517
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: nharkins@well.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 743
Lines: 21

Hi. New to list, and didn't find an answer in the FAQ, or
a quick search of the archive, sorry if this resurrects a horse:

Does xfsdump -l use the journal to determine what's changed
for an incremental dump?

Background: I'm currently using rsync to perform incremental updates
of a subtree with a significant number of files, when only about 5%
changes, and the filesystem walk to check each file's timestamp
is totally unnecessary if the journal can be consulted.

If xfsdump uses the journal to avoid the walk, then using
xfsdump | xfsrestore would be ideal, except according to
the man page, it doesn't allow incrementals of subtrees. :(
Could someone explain why that is the case?

Thanks in advance for any insight,
-neil

From owner-linux-xfs@oss.sgi.com Fri Nov 4 17:56:54 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 17:56:58 -0800 (PST)
Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with ESMTP id jA51usO0029755
    for ; Fri, 4 Nov 2005 17:56:54 -0800
Received: from spindle.corp.sgi.com (spindle.corp.sgi.com [198.29.75.13])
    by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id jA532R9L006679
    for ; Fri, 4 Nov 2005 19:02:27 -0800
Received: from [127.0.0.1] (sshgate.corp.sgi.com [198.149.36.12])
    by spindle.corp.sgi.com (SGI-8.12.5/8.12.9/generic_config-1.2) with ESMTP id jA51rcOS3188789;
    Fri, 4 Nov 2005 17:53:38 -0800 (PST)
Message-ID: <436C10A1.6000802@sgi.com>
Date: Fri, 04 Nov 2005 19:53:37 -0600
From: Eric Sandeen
User-Agent: Mozilla Thunderbird 1.0.6 (Macintosh/20050716)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: linux@horizon.com
CC: linux-xfs@oss.sgi.com
Subject: Re: Should xfs_repair make xfs_check stop complaining?
References: <20051104234648.10451.qmail@science.horizon.com>
In-Reply-To: <20051104234648.10451.qmail@science.horizon.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 6518
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: sandeen@sgi.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 1112
Lines: 32

linux@horizon.com wrote:
> Um, just wondering... I have a file system, on which I have run xfs_repair
> six times, and xfs_check still has complaints about it.
>
> I understand the xfs_repair rebuilds lost+found every time, so it
> keeps finding unreferenced files, but xfs_repair keeps fixing things like
>
>>resetting inode 335565855 nlinks from 14 to 15
>
>
> but leaves the subsequent inodes for xfs_check to complain about:
>
>>link count mismatch for inode 335565856 (name ?), nlink 14, counted 15
>>link count mismatch for inode 335565857 (name ?), nlink 14, counted 15
>
>
> Note that they're NOT the same inode number, so it's as if xfs_repair is missing
> some problems. Looking at the multiple runs, I see different inode numbers each
> time.
>
> xfs_repair and xfs_db are both version 2.6.36.
>
> Is this normal behaviour? I'm used to e2fsck which complains loudly if
> it leaves uncorrected errors.

Try moving lost+found to somewhere else, /lost+found2 or something, and
re-run xfs_repair. Do problems still persist?

Also that xfs_repair is not the -very- latest version....

-Eric

From owner-linux-xfs@oss.sgi.com Fri Nov 4 18:40:32 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 18:40:38 -0800 (PST)
Received: from larry.melbourne.sgi.com (mverd138.asia.info.net [61.14.31.138])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA52eUO0032351
    for ; Fri, 4 Nov 2005 18:40:31 -0800
Received: from wobbly.melbourne.sgi.com (wobbly.melbourne.sgi.com [134.14.55.135])
    by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA18015;
    Sat, 5 Nov 2005 13:37:10 +1100
Received: from wobbly.melbourne.sgi.com (localhost [127.0.0.1])
    by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id jA52bNkt6313791;
    Sat, 5 Nov 2005 13:37:24 +1100 (EST)
Received: (from nathans@localhost)
    by wobbly.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id jA52bMTb6350362;
    Sat, 5 Nov 2005 13:37:22 +1100 (EST)
Date: Sat, 5 Nov 2005 13:37:22 +1100
From: Nathan Scott
To: Neil Harkins
Cc: linux-xfs@oss.sgi.com
Subject: Re: xfsdump, the journal, and incremental dumps of subtrees...
Message-ID: <20051105133722.A6350193@wobbly.melbourne.sgi.com>
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: ; from nharkins@well.com on Fri, Nov 04, 2005 at 05:04:15PM -0800
X-archive-position: 6519
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: nathans@sgi.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 1024
Lines: 31

On Fri, Nov 04, 2005 at 05:04:15PM -0800, Neil Harkins wrote:
>
> Hi. New to list, and didn't find an answer in the FAQ, or
> a quick search of the archive, sorry if this resurrects a horse:
>
> Does xfsdump -l use the journal to determine what's changed
> for an incremental dump?

No, the journal doesn't hold the sort of information needed to
make decisions related to incremental dumps (it's a circular
log, not what you're thinking).

> Background: I'm currently using rsync to perform incremental updates
> of a subtree with a significant number of files, when only about 5%
> changes, and the filesystem walk to check each file's timestamp
> is totally unnecessary if the journal can be consulted.

It cannot.

> If xfsdump uses the journal to avoid the walk, then using
> xfsdump | xfsrestore would be ideal, except according to
> the man page, it doesn't allow incrementals of subtrees. :(
> Could someone explain why that is the case?

Not sure about that one off the top of my head.

cheers.

--
Nathan

From owner-linux-xfs@oss.sgi.com Fri Nov 4 19:31:03 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 19:31:06 -0800 (PST)
Received: from science.horizon.com (science.horizon.com [192.35.100.1])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA53V2O0002995
    for ; Fri, 4 Nov 2005 19:31:02 -0800
Received: (qmail 5543 invoked by uid 1000); 4 Nov 2005 22:27:47 -0500
Date: 4 Nov 2005 22:27:47 -0500
Message-ID: <20051105032747.5542.qmail@science.horizon.com>
From: linux@horizon.com
To: linux@horizon.com, sandeen@sgi.com
Subject: Re: Should xfs_repair make xfs_check stop complaining?
Cc: linux-xfs@oss.sgi.com
In-Reply-To: <436C10A1.6000802@sgi.com>
X-archive-position: 6520
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: linux@horizon.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 630
Lines: 19

> Try moving lost+found to somewhere else, /lost+found2 or something, and
> re-run xfs_repair. Do problems still persist?

Er... does this imply that I should try to mount the file system?
I haven't been doing that until it checks out clean.

Or is there some other way to rename the lost+found directory?

> Also that xfs_repair is not the -very- latest version....

Should I try 2.7.3 or something else? Confusingly, the "latest" and
"Release-1.3.1" directories hold 2.5.6, "testing" holds 2.6.4 and 2.6.13
and it's the plain ftp://oss.sgi.com/projects/xfs/cmd_tars directory
which holds 2.7.3.

Thanks a lot for your help!
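
The comparison made by hand in this thread (are the inodes that xfs_repair
resets in Phase 7 the same ones xfs_check reports?) can also be done
mechanically. What follows is only a minimal sketch in Python, assuming the
output of both tools has been saved as text and that the message wording
matches the lines quoted in this thread; other xfsprogs versions may word
them differently. The function names parse_check, parse_repair and compare
are hypothetical helpers, not part of xfsprogs.

```python
import re

# Patterns follow the message formats quoted in this thread (assumption:
# other xfsprogs releases may print these lines differently).
MISMATCH_RE = re.compile(
    r"link count mismatch for inode (\d+) \(name [^)]*\), nlink (\d+), counted (\d+)"
)
RESET_RE = re.compile(r"resetting inode (\d+) nlinks from (\d+) to (\d+)")


def parse_check(text):
    """Map inode -> (nlink, counted) from saved xfs_check output."""
    return {int(i): (int(n), int(c)) for i, n, c in MISMATCH_RE.findall(text)}


def parse_repair(text):
    """Map inode -> (old_nlinks, new_nlinks) from saved xfs_repair output."""
    return {int(i): (int(o), int(n)) for i, o, n in RESET_RE.findall(text)}


def compare(check_text, repair_text):
    """Split inode numbers into those seen by both tools or by only one."""
    checked = parse_check(check_text)
    repaired = parse_repair(repair_text)
    return {
        "both": sorted(checked.keys() & repaired.keys()),
        "check_only": sorted(checked.keys() - repaired.keys()),
        "repair_only": sorted(repaired.keys() - checked.keys()),
    }


if __name__ == "__main__":
    # Sample lines taken from the messages quoted earlier in the thread.
    check_out = (
        "link count mismatch for inode 335565856 (name ?), nlink 14, counted 15\n"
        "link count mismatch for inode 335565857 (name ?), nlink 14, counted 15\n"
    )
    repair_out = "resetting inode 335565855 nlinks from 14 to 15\n"
    print(compare(check_out, repair_out))
```

With the sample lines, the inode reset by xfs_repair lands in "repair_only"
and the two reported by xfs_check land in "check_only", which is the
non-overlap the original poster describes.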

From owner-linux-xfs@oss.sgi.com Fri Nov 4 22:25:06 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 22:25:09 -0800 (PST)
Received: from science.horizon.com (science.horizon.com [192.35.100.1])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA56P5O0014200
    for ; Fri, 4 Nov 2005 22:25:06 -0800
Received: (qmail 28466 invoked by uid 1000); 5 Nov 2005 01:21:49 -0500
Date: 5 Nov 2005 01:21:49 -0500
Message-ID: <20051105062149.28459.qmail@science.horizon.com>
From: linux@horizon.com
To: linux@horizon.com, sandeen@sgi.com
Subject: Re: Should xfs_repair make xfs_check stop complaining?
Cc: linux-xfs@oss.sgi.com
In-Reply-To: <436C10A1.6000802@sgi.com>
X-archive-position: 6521
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: linux@horizon.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 8245
Lines: 132

> Also that xfs_repair is not the -very- latest version....

Well, xfsprogs 2.7.3 produces a much cleaner xfs_repair output, with no
link count messages in phase 7, but xfs_check still thinks there's
something wrong:

# xfs_check /dev/md4
block 2/262 expected type unknown got free2
block 2/263 expected type unknown got free2
block 2/264 expected type unknown got free2
block 2/265 expected type unknown got free2
block 2/266 expected type unknown got free2
block 2/158606 expected type unknown got free2
link count mismatch for inode 9521204 (name ?), nlink 18, counted 17
link count mismatch for inode 9521206 (name ?), nlink 18, counted 17
link count mismatch for inode 9521207 (name ?), nlink 18, counted 17
link count mismatch for inode 8159262 (name ?), nlink 2277, counted 2278
link count mismatch for inode 4295701 (name ?), nlink 13, counted 12
link count mismatch for inode 4295702 (name ?), nlink 13, counted 12
link count mismatch for inode 4295703 (name ?), nlink 11, counted 10
link count mismatch for inode 4295705 (name ?), nlink 13, counted 12
link count mismatch for inode 37956638 (name ?), nlink 3, counted 4
link count mismatch for inode 37956639 (name ?), nlink 3, counted 4
link count mismatch for inode 47756317 (name ?), nlink 57, counted 62
link count mismatch for inode 47756319 (name ?), nlink 14, counted 15
link count mismatch for inode 39545890 (name ?), nlink 149, counted 150
link count mismatch for inode 47281154 (name ?), nlink 16, counted 17
link count mismatch for inode 36780063 (name ?), nlink 2, counted 3
link count mismatch for inode 67423241 (name ?), nlink 445, counted 448
link count mismatch for inode 72395795 (name ?), nlink 6, counted 7
link count mismatch for inode 72395796 (name ?), nlink 10, counted 11
link count mismatch for inode 72395799 (name ?), nlink 6, counted 7
link count mismatch for inode 72395800 (name ?), nlink 6, counted 7
link count mismatch for inode 72395801 (name ?), nlink 6, counted 7
link count mismatch for inode 72395802 (name ?), nlink 6, counted 7
link count mismatch for inode 72395803 (name ?), nlink 8, counted 9
link count mismatch for inode 107157534 (name ?), nlink 8, counted 9
link count mismatch for inode 107157541 (name ?), nlink 3, counted 4
link count mismatch for inode 107157542 (name ?), nlink 3, counted 4
link count mismatch for inode 107157543 (name ?), nlink 3, counted 4
link count mismatch for inode 107157544 (name ?), nlink 3, counted 4
link count mismatch for inode 107157545 (name ?), nlink 3, counted 4
link count mismatch for inode 107157546 (name ?), nlink 3, counted 4
link count mismatch for inode 107157547 (name ?), nlink 3, counted 4
link count mismatch for inode 107157548 (name ?), nlink 3, counted 4
link count mismatch for inode 107157550 (name ?), nlink 3, counted 4
link count mismatch for inode 109702154 (name ?), nlink 3, counted 4
link count mismatch for inode 109702156 (name ?), nlink 5, counted 6
link count mismatch for inode 109702157 (name ?), nlink 5, counted 6
link count mismatch for inode 109702158 (name ?), nlink 5, counted 6
link count mismatch for inode 109702159 (name ?), nlink 3, counted 4
link count mismatch for inode 144276493 (name ?), nlink 24, counted 25
link count mismatch for inode 144276494 (name ?), nlink 24, counted 25
link count mismatch for inode 144276495 (name ?), nlink 24, counted 25
link count mismatch for inode 144920600 (name ?), nlink 6, counted 7
link count mismatch for inode 144920601 (name ?), nlink 8, counted 9
link count mismatch for inode 144920607 (name ?), nlink 6, counted 7
link count mismatch for inode 144920608 (name ?), nlink 6, counted 7
link count mismatch for inode 144920609 (name ?), nlink 8, counted 9
link count mismatch for inode 144058403 (name ?), nlink 9, counted 10
link count mismatch for inode 241771561 (name ?), nlink 3, counted 4
link count mismatch for inode 240273428 (name ?), nlink 7, counted 6
link count mismatch for inode 240273430 (name ?), nlink 9, counted 8
link count mismatch for inode 240273431 (name ?), nlink 7, counted 6
link count mismatch for inode 240273432 (name ?), nlink 9, counted 8
link count mismatch for inode 240273433 (name ?), nlink 9, counted 8
link count mismatch for inode 240273434 (name ?), nlink 9, counted 8
link count mismatch for inode 240273435 (name ?), nlink 7, counted 6
link count mismatch for inode 240273436 (name ?), nlink 9, counted 8
link count mismatch for inode 240273437 (name ?), nlink 7, counted 6
link count mismatch for inode 240273438 (name ?), nlink 9, counted 8
link count mismatch for inode 240273439 (name ?), nlink 9, counted 8
link count mismatch for inode 346406949 (name ?), nlink 3, counted 4
link count mismatch for inode 340941885 (name ?), nlink 10, counted 11
link count mismatch for inode 340941886 (name ?), nlink 10, counted 11
link count mismatch for inode 343417900 (name ?), nlink 18, counted 17
link count mismatch for inode 343417902 (name ?), nlink 16, counted 15
link count mismatch for inode 340957216 (name ?), nlink 10, counted 11
link count mismatch for inode 340957217 (name ?), nlink 10, counted 11
link count mismatch for inode 335565856 (name ?), nlink 14, counted 15
link count mismatch for inode 335565857 (name ?), nlink 14, counted 15
link count mismatch for inode 383432717 (name ?), nlink 18, counted 17
link count mismatch for inode 383432719 (name ?), nlink 18, counted 17
link count mismatch for inode 424606730 (name ?), nlink 16, counted 18
link count mismatch for inode 414889995 (name ?), nlink 17, counted 18
link count mismatch for inode 414889996 (name ?), nlink 17, counted 18
link count mismatch for inode 414889998 (name ?), nlink 17, counted 18
link count mismatch for inode 414889999 (name ?), nlink 17, counted 18
link count mismatch for inode 404029469 (name ?), nlink 15, counted 16
link count mismatch for inode 410588170 (name ?), nlink 4, counted 3
link count mismatch for inode 442877964 (name ?), nlink 94, counted 95
link count mismatch for inode 441138178 (name ?), nlink 4, counted 3
link count mismatch for inode 441138180 (name ?), nlink 4, counted 3
link count mismatch for inode 437784616 (name ?), nlink 17, counted 18
link count mismatch for inode 473682976 (name ?), nlink 13, counted 12
link count mismatch for inode 473682978 (name ?), nlink 13, counted 12
link count mismatch for inode 473682979 (name ?), nlink 13, counted 12
link count mismatch for inode 473682980 (name ?), nlink 13, counted 12
link count mismatch for inode 473682981 (name ?), nlink 13, counted 12
link count mismatch for inode 473682982 (name ?), nlink 13, counted 12
link count mismatch for inode 477552672 (name ?), nlink 4, counted 3
link count mismatch for inode 477552673 (name ?), nlink 4, counted 3
link count mismatch for inode 477552674 (name ?), nlink 4, counted 3
link count mismatch for inode 477552675 (name ?), nlink 4, counted 3
link count mismatch for inode 477552676 (name ?), nlink 4, counted 3
link count mismatch for inode 477552677 (name ?), nlink 4, counted 3
link count mismatch for inode 477552678 (name ?), nlink 4, counted 3
link count mismatch for inode 477552679 (name ?), nlink 4, counted 3
link count mismatch for inode 477552680 (name ?), nlink 4, counted 3
link count mismatch for inode 477552681 (name ?), nlink 4, counted 3
link count mismatch for inode 477552682 (name ?), nlink 4, counted 3
link count mismatch for inode 477552683 (name ?), nlink 4, counted 3
link count mismatch for inode 477580336 (name ?), nlink 4, counted 3
link count mismatch for inode 473673741 (name ?), nlink 7, counted 8
link count mismatch for inode 514965508 (name ?), nlink 16, counted 15
link count mismatch for inode 514610208 (name ?), nlink 3, counted 4
link count mismatch for inode 514610209 (name ?), nlink 3, counted 4

Yet another xfs_repair run produces just one nlinks message:

disconnected inode 402863163, moving to lost+found
disconnected dir inode 424609813, moving to lost+found
disconnected dir inode 424609817, moving to lost+found
disconnected dir inode 424609838, moving to lost+found
disconnected dir inode 486566967, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 217079825 nlinks from 10 to 9
done

(Which, oddly, isn't on the xfs_check list. This is getting weird.)

From owner-linux-xfs@oss.sgi.com Fri Nov 4 23:22:32 2005
Received: with ECARTIS (v1.0.0; list linux-xfs); Fri, 04 Nov 2005 23:22:35 -0800 (PST)
Received: from science.horizon.com (science.horizon.com [192.35.100.1])
    by oss.sgi.com (8.12.10/8.12.10/SuSE Linux 0.7) with SMTP id jA57MVO0017157
    for ; Fri, 4 Nov 2005 23:22:31 -0800
Received: (qmail 4955 invoked by uid 1000); 5 Nov 2005 02:19:15 -0500
Date: 5 Nov 2005 02:19:15 -0500
Message-ID: <20051105071915.4954.qmail@science.horizon.com>
From: linux@horizon.com
To: sandeen@sgi.com
Subject: Re: Should xfs_repair make xfs_check stop complaining?
Cc: linux@horizon.com, linux-xfs@oss.sgi.com
In-Reply-To: <436C10A1.6000802@sgi.com>
X-archive-position: 6522
X-ecartis-version: Ecartis v1.0.0
Sender: linux-xfs-bounce@oss.sgi.com
Errors-to: linux-xfs-bounce@oss.sgi.com
X-original-sender: linux@horizon.com
Precedence: bulk
X-list: linux-xfs
Content-Length: 9679
Lines: 184

> Try moving lost+found to somewhere else, /lost+found2 or something, and
> re-run xfs_repair. Do problems still persist?
>
> Also that xfs_repair is not the -very- latest version....

Okay, with xfsprogs 2.7.3, I found one minor bug:

/# xfs_check -V
Usage: xfs_check [-fsvV] [-l logdev] [-i ino]... [-b bno]... special

and moving /lost+found to /lost+found2 produces a clean xfs_repair:

# xfs_repair /dev/md4 2>&1 | tee /tmp/repair10
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - clear lost+found (if it exists) ...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - ensuring existence of lost+found directory
        - traversing filesystem starting at / ...
        - traversal finished ...
        - traversing all unattached subtrees ...
        - traversals finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

...but xfs_check is still complaining:

/# xfs_check /dev/md4
block 2/262 expected type unknown got free2
block 2/263 expected type unknown got free2
block 2/264 expected type unknown got free2
block 2/265 expected type unknown got free2
block 2/266 expected type unknown got free2
block 2/158606 expected type unknown got free2
link count mismatch for inode 9521204 (name ?), nlink 18, counted 17
link count mismatch for inode 9521206 (name ?), nlink 18, counted 17
link count mismatch for inode 9521207 (name ?), nlink 18, counted 17
link count mismatch for inode 8159262 (name ?), nlink 2277, counted 2278
link count mismatch for inode 4295701 (name ?), nlink 13, counted 12
link count mismatch for inode 4295702 (name ?), nlink 13, counted 12
link count mismatch for inode 4295703 (name ?), nlink 11, counted 10
link count mismatch for inode 4295705 (name ?), nlink 13, counted 12
link count mismatch for inode 37956638 (name ?), nlink 3, counted 4
link count mismatch for inode 37956639 (name ?), nlink 3, counted 4
link count mismatch for inode 47756317 (name ?), nlink 57, counted 62
link count mismatch for inode 47756319 (name ?), nlink 14, counted 15
link count mismatch for inode 39545890 (name ?), nlink 149, counted 150
link count mismatch for inode 47281154 (name ?), nlink 16, counted 17
link count mismatch for inode 36780063 (name ?), nlink 2, counted 3
link count mismatch for inode 67423241 (name ?), nlink 445, counted 448
link count mismatch for inode 72395795 (name ?), nlink 6, counted 7
link count mismatch for inode 72395796 (name ?), nlink 10, counted 11
link count mismatch for inode 72395799 (name ?), nlink 6, counted 7
link count mismatch for inode 72395800 (name ?), nlink 6, counted 7
link count mismatch for inode 72395801 (name ?), nlink 6, counted 7
link count mismatch for inode 72395802 (name ?), nlink 6, counted 7
link count mismatch for inode 72395803 (name ?), nlink 8, counted 9
link count mismatch for inode 107157534 (name ?), nlink 8, counted 9
link count mismatch for inode 107157541 (name ?), nlink 3, counted 4
link count mismatch for inode 107157542 (name ?), nlink 3, counted 4
lin