From owner-xfs@oss.sgi.com Thu Nov 1 03:16:07 2007
Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 03:16:10 -0700 (PDT)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham
    version=3.3.0-r574664
Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40])
    by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA1AG6bR006267
    for ; Thu, 1 Nov 2007 03:16:07 -0700
Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux))
    id 1InWqS-0005F4-D3; Thu, 01 Nov 2007 10:00:12 +0000
Date: Thu, 1 Nov 2007 10:00:12 +0000
From: Christoph Hellwig
To: Lachlan McIlroy
Cc: xfs-dev, xfs-oss
Subject: Re: [PATCH] Turn off XBF_READ_AHEAD in io completion
Message-ID: <20071101100012.GA20065@infradead.org>
References: <47296FF7.8080607@sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <47296FF7.8080607@sgi.com>
User-Agent: Mutt/1.4.2.3i
X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org
    See http://www.infradead.org/rpr.html
X-Virus-Scanned: ClamAV 0.91.2/4655/Thu Nov 1 00:41:48 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13514
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: hch@infradead.org
Precedence: bulk
X-list: xfs

On Thu, Nov 01, 2007 at 05:19:35PM +1100, Lachlan McIlroy wrote:
> Read-ahead of an inode cluster will set XBF_READ_AHEAD in the buffer.
> If we don't remove the flag it will still be set when we flush the
> buffer back to disk.  Not sure if leaving this flag set causes any
> serious problems but it does trigger an assert.

It might be better if such temporary flags never actually make it to
bp->b_flags.
Just pass down a flags variable all the way to _xfs_buf_ioapply and keep
the flags just for this I/O separate from those that are permanent and in
bp->b_flags.

From owner-xfs@oss.sgi.com Thu Nov 1 11:58:26 2007
Date: Thu, 1 Nov 2007 19:58:12 +0100
From: Emmanuel Florac
To: Joshua Baker-LePain
Cc: "paul.lkw", xfs@oss.sgi.com
Subject: Re: 2.6TB Storage Size Problem
Message-ID: <20071101195812.7355aa92@galadriel.home>
In-Reply-To:
References: <13501909.post@talk.nabble.com>
Organization: Intellique
X-archive-position: 13515
X-original-sender: eflorac@intellique.com
X-list: xfs

On Tue, 30 Oct 2007 23:30:04 -0400 (EDT) you wrote:

> 3) You can't boot from such a device (as neither grub nor lilo
> support gpt disklabels).

lilo does support booting from gpt on Debian since Sarge at least. I'd be
surprised if the CentOS build doesn't.

--
--------------------------------------------------
Emmanuel Florac               www.intellique.com
--------------------------------------------------

From owner-xfs@oss.sgi.com Thu Nov 1 13:22:16 2007
Subject: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c
Date: Thu, 1 Nov 2007 16:06:35 -0400
Message-ID: <06CCEA2EB1B80A4A937ED59005FA855101AED1BE@svits26.main.ad.rit.edu>
From: "Jay Sullivan"
To:
X-archive-position: 13516
X-original-sender: jpspgd@rit.edu
X-list: xfs

I have an XFS filesystem that has had the following happen twice in 3
months, both times with an impossibly large block number being requested.
Unfortunately my logs don't go back far enough for me to know if it was
the _exact_ same block both times... I'm running xfsprogs 2.8.21.

Excerpt from syslog (hostname obfuscated to 'servername' to protect the
innocent):

##
Nov 1 14:06:32 servername dm-1: rw=0, want=39943195856896, limit=7759462400
Nov 1 14:06:32 servername I/O error in filesystem ("dm-1") meta-data dev dm-1 block 0x245400000ff8 ("xfs_trans_read_buf") error 5 buf count 4096
Nov 1 14:06:32 servername xfs_force_shutdown(dm-1,0x1) called from line 415 of file fs/xfs/xfs_trans_buf.c. Return address = 0xc02baa25
Nov 1 14:06:32 servername Filesystem "dm-1": I/O Error Detected. Shutting down filesystem: dm-1
Nov 1 14:06:32 servername Please umount the filesystem, and rectify the problem(s)
###

I ran xfs_repair -L on the FS and it could be mounted again, but how long
until it happens a third time? What concerns me is that this is a FS
smaller than 4TB and 39943195856896 (or 0x245400000ff8) seems like a block
that I would only have if my FS was muuuuuch larger.
The following is output from some pertinent programs:

###
servername ~ # xfs_info /mnt/san
meta-data=/dev/servername-sanvg01/servername-sanlv01 isize=256    agcount=5, agsize=203161600 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=969932800, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

servername ~ # mount
/dev/sda3 on / type ext3 (rw,noatime,acl)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec)
udev on /dev type tmpfs (rw,nosuid)
devpts on /dev/pts type devpts (rw,nosuid,noexec)
shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev)
usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/mapper/servername--sanvg01-servername--sanlv01 on /mnt/san type xfs (rw,noatime,nodiratime,logbufs=8,attr2)
/dev/mapper/servername--sanvg01-servername--rendersharelv01 on /mnt/san/rendershare type xfs (rw,noatime,nodiratime,logbufs=8,attr2)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

servername ~ # uname -a
Linux servername 2.6.20-gentoo-r8 #7 SMP Fri Jun 29 14:46:02 EDT 2007 i686 Intel(R) Xeon(TM) CPU 3.20GHz GenuineIntel GNU/Linux
###

Does anyone know if this points to a bad block on a disk or if something
is corrupted and can be fixed with some expert knowledge of xfs_db?
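[Archive note, not part of the original message.] As a rough cross-check of the numbers quoted in this message, the sketch below redoes the arithmetic. It assumes that the want= and limit= values in the dm-1 line are counted in 512-byte sectors and that want= is the end of the failed request. Under those assumptions, limit= matches the filesystem size reported by xfs_info, and the bad start block plus the 4096-byte buffer lands exactly on want=. All constant names are made up for this sketch.

```python
# Values copied from the syslog excerpt and the xfs_info output in the message.
SECTOR_SIZE = 512               # bytes per sector (assumption: 512-byte units)
FS_BLOCK_SIZE = 4096            # "bsize" from xfs_info
FS_BLOCKS = 969932800           # data "blocks" from xfs_info
LIMIT_SECTORS = 7759462400      # "limit=" in the dm-1 line
WANT_SECTORS = 39943195856896   # "want=" in the dm-1 line
BAD_START = 0x245400000ff8      # "block" in the XFS I/O error line
BUF_BYTES = 4096                # "buf count" in the XFS I/O error line

# Size of the filesystem expressed in sectors.
device_sectors = FS_BLOCKS * FS_BLOCK_SIZE // SECTOR_SIZE

# End of the failed request: start sector plus the buffer length in sectors.
request_end = BAD_START + BUF_BYTES // SECTOR_SIZE

print("device size in sectors:", device_sectors)
print("matches limit=:", device_sectors == LIMIT_SECTORS)
print("start sector (decimal):", BAD_START)
print("request end matches want=:", request_end == WANT_SECTORS)
print("device size in TiB:", round(device_sectors * SECTOR_SIZE / 2**40, 2))
print("request end / device size:", round(WANT_SECTORS / LIMIT_SECTORS))
```

If the assumptions hold, the request points thousands of times past the end of the device. That is more consistent with a corrupted block pointer than with an unreadable sector, though this is an inference and not something the log states.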
~Jay

[[HTML alternate version deleted]]

From owner-xfs@oss.sgi.com Thu Nov 1 15:47:49 2007
Date: Fri, 2 Nov 2007 09:47:44 +1100
From: David Chinner
To: Lachlan McIlroy
Cc: David Chinner, xfs@oss.sgi.com, xfs-dev@sgi.com
Subject: Re: [PATCH] Implement fallocate
Message-ID: <20071101224744.GE995458@sgi.com>
References: <20071029233841.GT995458@sgi.com> <472928C1.5080707@sgi.com>
In-Reply-To: <472928C1.5080707@sgi.com>
X-archive-position: 13517
X-original-sender: dgc@sgi.com
X-list: xfs

On Thu, Nov 01, 2007 at 12:15:45PM +1100, Lachlan McIlroy wrote:
> >+	xfs_ilock(XFS_I(inode), XFS_IOLOCK_EXCL);
> >+	error = xfs_change_file_space(XFS_I(inode), XFS_IOC_RESVSP, &bf,
> >+						0, NULL, ATTR_NOLOCK);
> >+	if (!error && !(mode & FALLOC_FL_KEEP_SIZE) &&
> >+	    offset + len > i_size_read(inode))
> >+		new_size = offset + len;
> >+
> >+	/* Change file size if needed */
> >+	if (new_size) {
> >+		bhv_vattr_t	va;
> >+
> >+		va.va_mask = XFS_AT_SIZE;
> >+		va.va_size = new_size;
> >+		error = xfs_setattr(XFS_I(inode), &va, ATTR_NOLOCK, NULL);
> >+	}
>
> Is it necessary to call xfs_setattr() here?  Could we just do an explicit
> call to xfs_zero_eof(), set the new size, set i_update_core/size and mark
> the inode dirty?  Hmmm, then again, that approach wouldn't be as clean as
> above.

And it also violates the atomicity that posix_fallocate is supposed
to provide. i.e. if it returns success, the change of file size
must be permanent. i.e. the change of size needs to be recorded in
a transaction. Hence we need to call xfs_setattr....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

From owner-xfs@oss.sgi.com Thu Nov 1 15:54:59 2007
Date: Fri, 2 Nov 2007 09:54:53 +1100
From: David Chinner
To: Lachlan McIlroy
Cc: David Chinner, xfs@oss.sgi.com, xfs-dev@sgi.com
Subject: Re: [PATCH] fix transaction overrun during writeback
Message-ID: <20071101225453.GF995458@sgi.com>
References: <20071029234010.GU995458@sgi.com> <4729304A.2010202@sgi.com>
In-Reply-To: <4729304A.2010202@sgi.com>
X-archive-position: 13518
X-original-sender: dgc@sgi.com
X-list: xfs

On Thu, Nov 01, 2007 at 12:47:54PM +1100, Lachlan McIlroy wrote:
> Looks good Dave.  Since this is a writeback path is there some way
> we can tell xfs_bmapi() that it should not convert anything but
> delayed allocs and have it assert/error out if it tries to - not
> that it will now with this change but just as defensive measure?

I looked at that, but it's not straightforward. In this case we are
simply asking for an allocation, assuming the range we ask for is
already delalloc. However, the same call could be used to allocate
the space if the transaction reservation took into account the
space needing to be allocated.

So there's not really any simple way to deal with this, esp. as it
is valid to allocate both delalloc and unreserved space in the one
xfs_bmapi() call as long as you do the right thing with the
transaction reservation...

We really need to fix the way xfs_iomap works so we don't have
the race condition in the first place....

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group

From owner-xfs@oss.sgi.com Thu Nov 1 17:30:23 2007
To: sgi.bugs.xfs@engr.sgi.com
Cc: xfs@oss.sgi.com
Subject: PARTIAL TAKE 971186 - Clean up bitops
Message-Id: <20071102003017.6D3EE58C38F7@chook.melbourne.sgi.com>
Date: Fri, 2 Nov 2007 11:30:17 +1100 (EST)
From: dgc@sgi.com (David Chinner)
X-archive-position: 13519
X-original-sender: dgc@sgi.com
X-list: xfs

Use the generic bitops rather than implementing them ourselves.

Patch inspired by Andi Kleen.

Date:  Fri Nov 2 11:29:35 AEDT 2007
Workarea:  chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs
Inspected by:  hch@infradead.org

The following file(s) were checked into:
  longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb

Modid:  xfs-linux-melb:xfs-kern:30000a

fs/xfs/xfs_bit.h - 1.21 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bit.h.diff?r1=text&tr1=1.21&r2=text&tr2=1.20&f=h
	- wrap xfs bitop functions around generic implementations.
fs/xfs/xfs_bit.c - 1.32 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bit.c.diff?r1=text&tr1=1.32&r2=text&tr2=1.31&f=h
	- Remove implementation of generic bitops.

fs/xfs/xfs_rtalloc.c - 1.108 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rtalloc.c.diff?r1=text&tr1=1.108&r2=text&tr2=1.107&f=h
	- remove implementation of generic bitops.

From owner-xfs@oss.sgi.com Thu Nov 1 18:11:26 2007
Message-ID: <472A7940.5070800@sgi.com>
Date: Fri, 02 Nov 2007 12:11:28 +1100
From: Timothy Shimmin
To: Roger Willcocks
CC: xfs@oss.sgi.com
Subject: Re: bug: truncate to zero + setuid
References: <47249E7A.7060709@filmlight.ltd.uk> <47252F62.6030503@sgi.com>
    <47262CD0.5010708@filmlight.ltd.uk> <4726ADAE.9070206@sgi.com>
    <472769A1.5090605@filmlight.ltd.uk>
In-Reply-To: <472769A1.5090605@filmlight.ltd.uk>
X-archive-position: 13520
X-original-sender: tes@sgi.com
X-list: xfs

Hi Roger,

Roger Willcocks wrote:
> Timothy Shimmin wrote:
>> I presume it was done where it was done so that the inode was locked
>> and we were under the XFS_AT_SIZE predicate.
>>
>> I was just thinking of something like...
>> but I'm probably missing something.
>>
>> Index: 2.6.x-xfs/fs/xfs/xfs_vnodeops.c
>> ===================================================================
>> --- 2.6.x-xfs.orig/fs/xfs/xfs_vnodeops.c 2007-10-12 16:06:15.000000000 +1000
>> +++ 2.6.x-xfs/fs/xfs/xfs_vnodeops.c 2007-10-30 14:59:46.418837757 +1100
>> @@ -304,6 +304,24 @@
>>  	}
>>
>>  	/*
>> +	 * Short circuit the truncate case for zero length files.
>> +	 * If more mask bits are set, then just remove the SIZE one
>> +	 * and keep going.
>> +	 */
>> +	if (mask & XFS_AT_SIZE) {
>> +		xfs_ilock(ip, XFS_ILOCK_SHARED);
>> +		if ((vap->va_size == 0) && (ip->i_size == 0) && (ip->i_d.di_nextents == 0)) {
>> +			if (mask & ~XFS_AT_SIZE) {
>> +				mask &= ~XFS_AT_SIZE;
>> +			} else {
>> +				xfs_iunlock(ip, XFS_ILOCK_SHARED);
>> +				return 0;
>> +			}
>> +		}
>> +		xfs_iunlock(ip, XFS_ILOCK_SHARED);
>> +	}
>> +
>> +	/*
>>  	 * For the other attributes, we acquire the inode lock and
>>  	 * first do an error checking pass.
>>  	 */
>> @@ -451,17 +469,6 @@
>>  	 * Truncate file.  Must have write permission and not be a directory.
>>  	 */
>>  	if (mask & XFS_AT_SIZE) {
>> -		/* Short circuit the truncate case for zero length files */
>> -		if ((vap->va_size == 0) &&
>> -		    (ip->i_size == 0) && (ip->i_d.di_nextents == 0)) {
>> -			xfs_iunlock(ip, XFS_ILOCK_EXCL);
>> -			lock_flags &= ~XFS_ILOCK_EXCL;
>> -			if (mask & XFS_AT_CTIME)
>> -				xfs_ichgtime(ip, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
>> -			code = 0;
>> -			goto error_return;
>> -		}
>> -
>>  		if (VN_ISDIR(vp)) {
>>  			code = XFS_ERROR(EISDIR);
>>  			goto error_return;
>
> This misses setting the access and changed times which still need to be
> touched even if the file's already zero bytes.
> How about this:
> (noting that open.c/do_truncate uses XFS_AT_SIZE | XFS_AT_CTIME)
>

Well, if XFS_AT_CTIME was set then my patch wouldn't return straight away
and would continue processing the mask fields.
And I was presuming that the times would be set doing this.

However, it doesn't look quite so simple and consistent:

* going down the normal (non-short-circuit) path for AT_SIZE,
  it doesn't test for CTIME but rather just does:
	/*
	 * Have to do this even if the file's size doesn't change.
	 */
	timeflags |= XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG;
  And yet our short-circuit case only does it if AT_CTIME is set.
  Doesn't look consistent to me.

* So what in the vfs would call it...

-----------------------
open(O_TRUNC)...

int may_open(struct nameidata *nd, int acc_mode, int flag)
{
	if (flag & O_TRUNC) {
...
		error = do_truncate(dentry, 0, ATTR_MTIME|ATTR_CTIME, NULL);
------------------------
static long do_sys_ftruncate(unsigned int fd, loff_t length, int small)
{
	if (!error)
		error = do_truncate(dentry, length, ATTR_MTIME|ATTR_CTIME, file);
------------------------

This is NOT the case for
* do_sys_truncate() and
* do_coredump(),
however, which send in zero for those bits.
Not to mention other ways to get to these calls.
Anyway, open(O_TRUNC) will be a good candidate for the optimization
by the looks of it as it always trunc's to zero.

So in the 2 truncate cases, it is setting MTIME and CTIME.
And how is MTIME handled in the xfs code...

	/*
	 * Change file access or modified times.
	 */
	if (mask & (XFS_AT_ATIME|XFS_AT_MTIME)) {
		if (mask & XFS_AT_ATIME) {
			ip->i_d.di_atime.t_sec = vap->va_atime.tv_sec;
			ip->i_d.di_atime.t_nsec = vap->va_atime.tv_nsec;
			ip->i_update_core = 1;
			timeflags &= ~XFS_ICHGTIME_ACC;
		}
		if (mask & XFS_AT_MTIME) {
			ip->i_d.di_mtime.t_sec = vap->va_mtime.tv_sec;
			ip->i_d.di_mtime.t_nsec = vap->va_mtime.tv_nsec;
			timeflags &= ~XFS_ICHGTIME_MOD;
			timeflags |= XFS_ICHGTIME_CHG;
		}
		if (tp && (flags & ATTR_UTIME))
			xfs_trans_log_inode (tp, ip, XFS_ILOG_CORE);
	}

	/*
	 * Send out timestamp changes that need to be set to the
	 * current time.  Not done when called by a DMI function.
	 */
	if (timeflags && !(flags & ATTR_DMI))
		xfs_ichgtime(ip, timeflags);

And note that MTIME will actually turn on the timeflags for XFS_ICHGTIME_CHG,
which is the time associated with XFS_AT_CTIME.

So for the 2 do_truncate call paths, it will set the mtime based
on the va_mtime and set the ctime based on current time (nanotime()).
Our shortcut is setting both to current time.
I don't know if anyone really cares.

I don't like all these inconsistencies.
One way to reduce inconsistencies is to allow code to go thru common paths
so we can do the same strange thing in the one spot ;-)
It looks like in the AT_SIZE, we should always set those timeflags
irrespective of AT_CTIME.

BTW, your locking looks wrong - it appears you don't unlock when the
file is non-zero size.
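[Archive note, not part of the original message.] The timestamp handling traced in this message can be modelled in a few lines. The sketch below is a plain Python model of the two paths as the message describes them, not kernel code. The flag names mirror the quoted source, but the bit values and both function names are assumptions made for illustration.

```python
# Bit values are arbitrary choices for this sketch; only the names come
# from the kernel code quoted in the message.
XFS_AT_ATIME = 0x1
XFS_AT_MTIME = 0x2
XFS_AT_CTIME = 0x4
XFS_AT_SIZE = 0x8

XFS_ICHGTIME_MOD = 0x1   # set mtime to the current time
XFS_ICHGTIME_ACC = 0x2   # set atime to the current time
XFS_ICHGTIME_CHG = 0x4   # set ctime to the current time


def normal_path_timeflags(mask):
    """Flags that reach xfs_ichgtime() on the non-short-circuit path."""
    timeflags = 0
    if mask & XFS_AT_SIZE:
        # "Have to do this even if the file's size doesn't change."
        timeflags |= XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG
    if mask & XFS_AT_ATIME:
        # atime is copied from va_atime, so it is not set to "now".
        timeflags &= ~XFS_ICHGTIME_ACC
    if mask & XFS_AT_MTIME:
        # mtime is copied from va_mtime; only ctime is set to "now".
        timeflags &= ~XFS_ICHGTIME_MOD
        timeflags |= XFS_ICHGTIME_CHG
    return timeflags


def short_circuit_timeflags(mask):
    """Flags passed to xfs_ichgtime() by the zero-length shortcut."""
    if mask & XFS_AT_CTIME:
        return XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG
    return 0


if __name__ == "__main__":
    # do_truncate() as called from open(O_TRUNC) and ftruncate().
    with_times = XFS_AT_SIZE | XFS_AT_MTIME | XFS_AT_CTIME
    # A caller that sends in zero for the time bits.
    size_only = XFS_AT_SIZE
    for name, mask in (("with time bits", with_times), ("size only", size_only)):
        print(name,
              "normal:", bin(normal_path_timeflags(mask)),
              "shortcut:", bin(short_circuit_timeflags(mask)))
```

In this model the two paths disagree for both callers: with the time bits set the shortcut also stamps mtime with the current time, and with only the size bit set the shortcut touches no timestamps at all.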
--Tim

> --- xfs_vnodeops.c      2007-09-04 15:57:40.000000000 +0100
> +++ /tmp/xfs_vnodeops.c 2007-10-30 17:11:32.000000000 +0000
> @@ -378,6 +378,24 @@
>  		return (code);
>  	}
>
> +
> +	if ((mask & XFS_AT_SIZE) && (vap->va_size == 0)) {
> +
> +		/* Short circuit the truncate case for zero length files */
> +
> +		xfs_ilock(ip, XFS_ILOCK_EXCL);
> +		if ((ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
> +			xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +			if (mask & XFS_AT_CTIME)
> +				xfs_ichgtime(ip, XFS_ICHGTIME_MOD|XFS_ICHGTIME_CHG);
> +			mask &= ~(XFS_AT_SIZE|XFS_AT_CTIME);
> +			if (mask == 0) {
> +				code = 0;
> +				goto error_return;
> +			}
> +		}
> +	}
> +
>  	/*
>  	 * For the other attributes, we acquire the inode lock and
>  	 * first do an error checking pass.
> @@ -528,17 +546,6 @@
>  	 * Truncate file.  Must have write permission and not be a directory.
>  	 */
>  	if (mask & XFS_AT_SIZE) {
> -		/* Short circuit the truncate case for zero length files */
> -		if ((vap->va_size == 0) &&
> -		    (ip->i_d.di_size == 0) && (ip->i_d.di_nextents == 0)) {
> -			xfs_iunlock(ip, XFS_ILOCK_EXCL);
> -			lock_flags &= ~XFS_ILOCK_EXCL;
> -			if (mask & XFS_AT_CTIME)
> -				xfs_ichgtime(ip, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
> -			code = 0;
> -			goto error_return;
> -		}
> -
>  		if (vp->v_type == VDIR) {
>  			code = XFS_ERROR(EISDIR);
>  			goto error_return;

From owner-xfs@oss.sgi.com Thu Nov 1 18:41:02 2007
To: sgi.bugs.xfs@engr.sgi.com
Cc: xfs@oss.sgi.com
Subject: PARTIAL TAKE 971186 - Fix up sparse warnings
Message-Id: <20071102014058.6F86B58C38F7@chook.melbourne.sgi.com>
Date: Fri, 2 Nov 2007 12:40:58 +1100 (EST)
From: dgc@sgi.com (David Chinner)
X-archive-position: 13521
X-original-sender: dgc@sgi.com
X-list: xfs

Fix up sparse warnings.

These are mostly locking annotations, marking things static, casts where
needed and declaring stuff in header files.

Date:  Fri Nov 2 12:40:25 AEDT 2007
Workarea:  chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs
Inspected by:  hch@infradead.org,lachlan@sgi.com

The following file(s) were checked into:
  longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb

Modid:  xfs-linux-melb:xfs-kern:30002a

fs/xfs/xfs_log.c - 1.343 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log.c.diff?r1=text&tr1=1.343&r2=text&tr2=1.342&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_buf_item.h - 1.47 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_buf_item.h.diff?r1=text&tr1=1.47&r2=text&tr2=1.46&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_da_btree.h - 1.67 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_da_btree.h.diff?r1=text&tr1=1.67&r2=text&tr2=1.66&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_log_recover.c - 1.331 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_recover.c.diff?r1=text&tr1=1.331&r2=text&tr2=1.330&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_trans_item.c - 1.46 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_trans_item.c.diff?r1=text&tr1=1.46&r2=text&tr2=1.45&f=h
	- Fix up sparse warnings.
fs/xfs/xfs_vfsops.c - 1.546 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.546&r2=text&tr2=1.545&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_mount.c - 1.416 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mount.c.diff?r1=text&tr1=1.416&r2=text&tr2=1.415&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_btree.h - 1.67 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_btree.h.diff?r1=text&tr1=1.67&r2=text&tr2=1.66&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_trans.h - 1.146 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_trans.h.diff?r1=text&tr1=1.146&r2=text&tr2=1.145&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_bmap.h - 1.102 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bmap.h.diff?r1=text&tr1=1.102&r2=text&tr2=1.101&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_bmap.c - 1.380 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bmap.c.diff?r1=text&tr1=1.380&r2=text&tr2=1.379&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_rename.c - 1.77 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rename.c.diff?r1=text&tr1=1.77&r2=text&tr2=1.76&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_attr.c - 1.146 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_attr.c.diff?r1=text&tr1=1.146&r2=text&tr2=1.145&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_dir2.c - 1.61 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dir2.c.diff?r1=text&tr1=1.61&r2=text&tr2=1.60&f=h
	- Fix up sparse warnings.

fs/xfs/linux-2.6/xfs_ioctl.c - 1.157 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_ioctl.c.diff?r1=text&tr1=1.157&r2=text&tr2=1.156&f=h
	- Fix up sparse warnings.

fs/xfs/linux-2.6/xfs_globals.c - 1.74 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_globals.c.diff?r1=text&tr1=1.74&r2=text&tr2=1.73&f=h
	- Fix up sparse warnings.
fs/xfs/linux-2.6/xfs_ioctl32.c - 1.23 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_ioctl32.c.diff?r1=text&tr1=1.23&r2=text&tr2=1.22&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_mru_cache.c - 1.5 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_mru_cache.c.diff?r1=text&tr1=1.5&r2=text&tr2=1.4&f=h
	- Fix up sparse warnings.

fs/xfs/xfs_filestream.c - 1.4 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_filestream.c.diff?r1=text&tr1=1.4&r2=text&tr2=1.3&f=h
	- Fix up sparse warnings.

From owner-xfs@oss.sgi.com Thu Nov 1 18:45:15 2007
To: sgi.bugs.xfs@engr.sgi.com
Cc: xfs@oss.sgi.com
Subject: TAKE 972755 - Fix sparse warning in xlog_recover_do_efd_trans.
Message-Id: <20071102014513.0567058C38F7@chook.melbourne.sgi.com>
Date: Fri, 2 Nov 2007 12:45:12 +1100 (EST)
From: dgc@sgi.com (David Chinner)
X-archive-position: 13522
X-original-sender: dgc@sgi.com
X-list: xfs

Fix sparse warning in xlog_recover_do_efd_trans.
Sparse trips over the locking order in xlog_recover_do_efd_trans() when xfs_trans_delete_ail() drops the ail lock. Because the unlock is conditional, we need to either annotate with a "fake unlock" or change the structure of the code so sparse thinks the function always unlocks. Reordering the code makes it simpler, so do that. Date: Fri Nov 2 12:44:49 AEDT 2007 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: hch@infradead.org, lachlan@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30003a fs/xfs/xfs_log_recover.c - 1.332 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_log_recover.c.diff?r1=text&tr1=1.332&r2=text&tr2=1.331&f=h - Fix sparse warning in xlog_recover_do_efd_trans. From owner-xfs@oss.sgi.com Thu Nov 1 18:49:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 18:49:19 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA21n3ug011063 for ; Thu, 1 Nov 2007 18:49:07 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA03947; Fri, 2 Nov 2007 12:48:57 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA21mudD89856169; Fri, 2 Nov 2007 12:48:57 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA21mt2V90549984; Fri, 2 Nov 2007 12:48:55 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 2 Nov 2007 12:48:55 +1100 From: 
David Chinner To: Christoph Hellwig Cc: David Chinner , xfs@oss.sgi.com, xfs-dev@sgi.com Subject: Re: [PATCH] show all mount args in /proc/mounts Message-ID: <20071102014855.GH995458@sgi.com> References: <20071029233543.GQ995458@sgi.com> <20071030100617.GB23489@infradead.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20071030100617.GB23489@infradead.org> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13523 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Tue, Oct 30, 2007 at 10:06:17AM +0000, Christoph Hellwig wrote: > On Tue, Oct 30, 2007 at 10:35:43AM +1100, David Chinner wrote: > > There are several mount options that don't show up in /proc/mounts. > > Add them in and clean up the showargs code at the same time. > > Looks good. Care to submit a patch on top of this to move all the mount > option handling to xfs_super.c as it's entirely linux-specific in this > form? Sure. Is this what you mean? ----- Mount option parsing is platform-specific. Move it out of core code into the platform-specific superblock operation file.
Signed-off-by: Dave Chinner --- fs/xfs/linux-2.6/xfs_super.c | 430 +++++++++++++++++++++++++++++++++++++++++++ fs/xfs/xfs_vfsops.c | 430 ------------------------------------------- fs/xfs/xfs_vfsops.h | 3 3 files changed, 430 insertions(+), 433 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c 2007-10-24 16:01:47.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c 2007-10-31 10:24:31.412771393 +1100 @@ -50,6 +50,7 @@ #include "xfs_vnodeops.h" #include "xfs_vfsops.h" #include "xfs_version.h" +#include "xfs_log_priv.h" #include #include @@ -88,6 +89,435 @@ xfs_args_allocate( return args; } +#define MNTOPT_LOGBUFS "logbufs" /* number of XFS log buffers */ +#define MNTOPT_LOGBSIZE "logbsize" /* size of XFS log buffers */ +#define MNTOPT_LOGDEV "logdev" /* log device */ +#define MNTOPT_RTDEV "rtdev" /* realtime I/O device */ +#define MNTOPT_BIOSIZE "biosize" /* log2 of preferred buffered io size */ +#define MNTOPT_WSYNC "wsync" /* safe-mode nfs compatible mount */ +#define MNTOPT_INO64 "ino64" /* force inodes into 64-bit range */ +#define MNTOPT_NOALIGN "noalign" /* turn off stripe alignment */ +#define MNTOPT_SWALLOC "swalloc" /* turn on stripe width allocation */ +#define MNTOPT_SUNIT "sunit" /* data volume stripe unit */ +#define MNTOPT_SWIDTH "swidth" /* data volume stripe width */ +#define MNTOPT_NOUUID "nouuid" /* ignore filesystem UUID */ +#define MNTOPT_MTPT "mtpt" /* filesystem mount point */ +#define MNTOPT_GRPID "grpid" /* group-ID from parent directory */ +#define MNTOPT_NOGRPID "nogrpid" /* group-ID from current process */ +#define MNTOPT_BSDGROUPS "bsdgroups" /* group-ID from parent directory */ +#define MNTOPT_SYSVGROUPS "sysvgroups" /* group-ID from current process */ +#define MNTOPT_ALLOCSIZE "allocsize" /* preferred allocation size */ +#define MNTOPT_NORECOVERY "norecovery" /* don't run XFS recovery */ +#define 
MNTOPT_BARRIER "barrier" /* use writer barriers for log write and + * unwritten extent conversion */ +#define MNTOPT_NOBARRIER "nobarrier" /* .. disable */ +#define MNTOPT_OSYNCISOSYNC "osyncisosync" /* o_sync is REALLY o_sync */ +#define MNTOPT_64BITINODE "inode64" /* inodes can be allocated anywhere */ +#define MNTOPT_IKEEP "ikeep" /* do not free empty inode clusters */ +#define MNTOPT_NOIKEEP "noikeep" /* free empty inode clusters */ +#define MNTOPT_LARGEIO "largeio" /* report large I/O sizes in stat() */ +#define MNTOPT_NOLARGEIO "nolargeio" /* do not report large I/O sizes + * in stat(). */ +#define MNTOPT_ATTR2 "attr2" /* do use attr2 attribute format */ +#define MNTOPT_NOATTR2 "noattr2" /* do not use attr2 attribute format */ +#define MNTOPT_FILESTREAM "filestreams" /* use filestreams allocator */ +#define MNTOPT_QUOTA "quota" /* disk quotas (user) */ +#define MNTOPT_NOQUOTA "noquota" /* no quotas */ +#define MNTOPT_USRQUOTA "usrquota" /* user quota enabled */ +#define MNTOPT_GRPQUOTA "grpquota" /* group quota enabled */ +#define MNTOPT_PRJQUOTA "prjquota" /* project quota enabled */ +#define MNTOPT_UQUOTA "uquota" /* user quota (IRIX variant) */ +#define MNTOPT_GQUOTA "gquota" /* group quota (IRIX variant) */ +#define MNTOPT_PQUOTA "pquota" /* project quota (IRIX variant) */ +#define MNTOPT_UQUOTANOENF "uqnoenforce"/* user quota limit enforcement */ +#define MNTOPT_GQUOTANOENF "gqnoenforce"/* group quota limit enforcement */ +#define MNTOPT_PQUOTANOENF "pqnoenforce"/* project quota limit enforcement */ +#define MNTOPT_QUOTANOENF "qnoenforce" /* same as uqnoenforce */ +#define MNTOPT_DMAPI "dmapi" /* DMI enabled (DMAPI / XDSM) */ +#define MNTOPT_XDSM "xdsm" /* DMI enabled (DMAPI / XDSM) */ +#define MNTOPT_DMI "dmi" /* DMI enabled (DMAPI / XDSM) */ + +STATIC unsigned long +suffix_strtoul(char *s, char **endp, unsigned int base) +{ + int last, shift_left_factor = 0; + char *value = s; + + last = strlen(value) - 1; + if (value[last] == 'K' || value[last] == 
'k') { + shift_left_factor = 10; + value[last] = '\0'; + } + if (value[last] == 'M' || value[last] == 'm') { + shift_left_factor = 20; + value[last] = '\0'; + } + if (value[last] == 'G' || value[last] == 'g') { + shift_left_factor = 30; + value[last] = '\0'; + } + + return simple_strtoul((const char *)s, endp, base) << shift_left_factor; +} + +STATIC int +xfs_parseargs( + struct xfs_mount *mp, + char *options, + struct xfs_mount_args *args, + int update) +{ + char *this_char, *value, *eov; + int dsunit, dswidth, vol_dsunit, vol_dswidth; + int iosize; + int ikeep = 0; + + args->flags |= XFSMNT_BARRIER; + args->flags2 |= XFSMNT2_COMPAT_IOSIZE; + + if (!options) + goto done; + + iosize = dsunit = dswidth = vol_dsunit = vol_dswidth = 0; + + while ((this_char = strsep(&options, ",")) != NULL) { + if (!*this_char) + continue; + if ((value = strchr(this_char, '=')) != NULL) + *value++ = 0; + + if (!strcmp(this_char, MNTOPT_LOGBUFS)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + args->logbufs = simple_strtoul(value, &eov, 10); + } else if (!strcmp(this_char, MNTOPT_LOGBSIZE)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + args->logbufsize = suffix_strtoul(value, &eov, 10); + } else if (!strcmp(this_char, MNTOPT_LOGDEV)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + strncpy(args->logname, value, MAXNAMELEN); + } else if (!strcmp(this_char, MNTOPT_MTPT)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + strncpy(args->mtpt, value, MAXNAMELEN); + } else if (!strcmp(this_char, MNTOPT_RTDEV)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + strncpy(args->rtname, value, MAXNAMELEN); + } else if 
(!strcmp(this_char, MNTOPT_BIOSIZE)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + iosize = simple_strtoul(value, &eov, 10); + args->flags |= XFSMNT_IOSIZE; + args->iosizelog = (uint8_t) iosize; + } else if (!strcmp(this_char, MNTOPT_ALLOCSIZE)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + iosize = suffix_strtoul(value, &eov, 10); + args->flags |= XFSMNT_IOSIZE; + args->iosizelog = ffs(iosize) - 1; + } else if (!strcmp(this_char, MNTOPT_GRPID) || + !strcmp(this_char, MNTOPT_BSDGROUPS)) { + mp->m_flags |= XFS_MOUNT_GRPID; + } else if (!strcmp(this_char, MNTOPT_NOGRPID) || + !strcmp(this_char, MNTOPT_SYSVGROUPS)) { + mp->m_flags &= ~XFS_MOUNT_GRPID; + } else if (!strcmp(this_char, MNTOPT_WSYNC)) { + args->flags |= XFSMNT_WSYNC; + } else if (!strcmp(this_char, MNTOPT_OSYNCISOSYNC)) { + args->flags |= XFSMNT_OSYNCISOSYNC; + } else if (!strcmp(this_char, MNTOPT_NORECOVERY)) { + args->flags |= XFSMNT_NORECOVERY; + } else if (!strcmp(this_char, MNTOPT_INO64)) { + args->flags |= XFSMNT_INO64; +#if !XFS_BIG_INUMS + cmn_err(CE_WARN, + "XFS: %s option not allowed on this system", + this_char); + return EINVAL; +#endif + } else if (!strcmp(this_char, MNTOPT_NOALIGN)) { + args->flags |= XFSMNT_NOALIGN; + } else if (!strcmp(this_char, MNTOPT_SWALLOC)) { + args->flags |= XFSMNT_SWALLOC; + } else if (!strcmp(this_char, MNTOPT_SUNIT)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + dsunit = simple_strtoul(value, &eov, 10); + } else if (!strcmp(this_char, MNTOPT_SWIDTH)) { + if (!value || !*value) { + cmn_err(CE_WARN, + "XFS: %s option requires an argument", + this_char); + return EINVAL; + } + dswidth = simple_strtoul(value, &eov, 10); + } else if (!strcmp(this_char, MNTOPT_64BITINODE)) { + args->flags &= ~XFSMNT_32BITINODES; +#if !XFS_BIG_INUMS + 
cmn_err(CE_WARN, + "XFS: %s option not allowed on this system", + this_char); + return EINVAL; +#endif + } else if (!strcmp(this_char, MNTOPT_NOUUID)) { + args->flags |= XFSMNT_NOUUID; + } else if (!strcmp(this_char, MNTOPT_BARRIER)) { + args->flags |= XFSMNT_BARRIER; + } else if (!strcmp(this_char, MNTOPT_NOBARRIER)) { + args->flags &= ~XFSMNT_BARRIER; + } else if (!strcmp(this_char, MNTOPT_IKEEP)) { + ikeep = 1; + args->flags &= ~XFSMNT_IDELETE; + } else if (!strcmp(this_char, MNTOPT_NOIKEEP)) { + args->flags |= XFSMNT_IDELETE; + } else if (!strcmp(this_char, MNTOPT_LARGEIO)) { + args->flags2 &= ~XFSMNT2_COMPAT_IOSIZE; + } else if (!strcmp(this_char, MNTOPT_NOLARGEIO)) { + args->flags2 |= XFSMNT2_COMPAT_IOSIZE; + } else if (!strcmp(this_char, MNTOPT_ATTR2)) { + args->flags |= XFSMNT_ATTR2; + } else if (!strcmp(this_char, MNTOPT_NOATTR2)) { + args->flags &= ~XFSMNT_ATTR2; + } else if (!strcmp(this_char, MNTOPT_FILESTREAM)) { + args->flags2 |= XFSMNT2_FILESTREAMS; + } else if (!strcmp(this_char, MNTOPT_NOQUOTA)) { + args->flags &= ~(XFSMNT_UQUOTAENF|XFSMNT_UQUOTA); + args->flags &= ~(XFSMNT_GQUOTAENF|XFSMNT_GQUOTA); + } else if (!strcmp(this_char, MNTOPT_QUOTA) || + !strcmp(this_char, MNTOPT_UQUOTA) || + !strcmp(this_char, MNTOPT_USRQUOTA)) { + args->flags |= XFSMNT_UQUOTA | XFSMNT_UQUOTAENF; + } else if (!strcmp(this_char, MNTOPT_QUOTANOENF) || + !strcmp(this_char, MNTOPT_UQUOTANOENF)) { + args->flags |= XFSMNT_UQUOTA; + args->flags &= ~XFSMNT_UQUOTAENF; + } else if (!strcmp(this_char, MNTOPT_PQUOTA) || + !strcmp(this_char, MNTOPT_PRJQUOTA)) { + args->flags |= XFSMNT_PQUOTA | XFSMNT_PQUOTAENF; + } else if (!strcmp(this_char, MNTOPT_PQUOTANOENF)) { + args->flags |= XFSMNT_PQUOTA; + args->flags &= ~XFSMNT_PQUOTAENF; + } else if (!strcmp(this_char, MNTOPT_GQUOTA) || + !strcmp(this_char, MNTOPT_GRPQUOTA)) { + args->flags |= XFSMNT_GQUOTA | XFSMNT_GQUOTAENF; + } else if (!strcmp(this_char, MNTOPT_GQUOTANOENF)) { + args->flags |= XFSMNT_GQUOTA; + args->flags &= 
~XFSMNT_GQUOTAENF; + } else if (!strcmp(this_char, MNTOPT_DMAPI)) { + args->flags |= XFSMNT_DMAPI; + } else if (!strcmp(this_char, MNTOPT_XDSM)) { + args->flags |= XFSMNT_DMAPI; + } else if (!strcmp(this_char, MNTOPT_DMI)) { + args->flags |= XFSMNT_DMAPI; + } else if (!strcmp(this_char, "ihashsize")) { + cmn_err(CE_WARN, + "XFS: ihashsize no longer used, option is deprecated."); + } else if (!strcmp(this_char, "osyncisdsync")) { + /* no-op, this is now the default */ + cmn_err(CE_WARN, + "XFS: osyncisdsync is now the default, option is deprecated."); + } else if (!strcmp(this_char, "irixsgid")) { + cmn_err(CE_WARN, + "XFS: irixsgid is now a sysctl(2) variable, option is deprecated."); + } else { + cmn_err(CE_WARN, + "XFS: unknown mount option [%s].", this_char); + return EINVAL; + } + } + + if (args->flags & XFSMNT_NORECOVERY) { + if ((mp->m_flags & XFS_MOUNT_RDONLY) == 0) { + cmn_err(CE_WARN, + "XFS: no-recovery mounts must be read-only."); + return EINVAL; + } + } + + if ((args->flags & XFSMNT_NOALIGN) && (dsunit || dswidth)) { + cmn_err(CE_WARN, + "XFS: sunit and swidth options incompatible with the noalign option"); + return EINVAL; + } + + if ((args->flags & XFSMNT_GQUOTA) && (args->flags & XFSMNT_PQUOTA)) { + cmn_err(CE_WARN, + "XFS: cannot mount with both project and group quota"); + return EINVAL; + } + + if ((args->flags & XFSMNT_DMAPI) && *args->mtpt == '\0') { + printk("XFS: %s option needs the mount point option as well\n", + MNTOPT_DMAPI); + return EINVAL; + } + + if ((dsunit && !dswidth) || (!dsunit && dswidth)) { + cmn_err(CE_WARN, + "XFS: sunit and swidth must be specified together"); + return EINVAL; + } + + if (dsunit && (dswidth % dsunit != 0)) { + cmn_err(CE_WARN, + "XFS: stripe width (%d) must be a multiple of the stripe unit (%d)", + dswidth, dsunit); + return EINVAL; + } + + /* + * Applications using DMI filesystems often expect the + * inode generation number to be monotonically increasing. 
+ * If we delete inode chunks we break this assumption, so + * keep unused inode chunks on disk for DMI filesystems + * until we come up with a better solution. + * Note that if "ikeep" or "noikeep" mount options are + * supplied, then they are honored. + */ + if (!(args->flags & XFSMNT_DMAPI) && !ikeep) + args->flags |= XFSMNT_IDELETE; + + if ((args->flags & XFSMNT_NOALIGN) != XFSMNT_NOALIGN) { + if (dsunit) { + args->sunit = dsunit; + args->flags |= XFSMNT_RETERR; + } else { + args->sunit = vol_dsunit; + } + dswidth ? (args->swidth = dswidth) : + (args->swidth = vol_dswidth); + } else { + args->sunit = args->swidth = 0; + } + +done: + if (args->flags & XFSMNT_32BITINODES) + mp->m_flags |= XFS_MOUNT_SMALL_INUMS; + if (args->flags2) + args->flags |= XFSMNT_FLAGS2; + return 0; +} + +struct proc_xfs_info { + int flag; + char *str; +}; + +STATIC int +xfs_showargs( + struct xfs_mount *mp, + struct seq_file *m) +{ + static struct proc_xfs_info xfs_info_set[] = { + /* the few simple ones we can get from the mount struct */ + { XFS_MOUNT_WSYNC, "," MNTOPT_WSYNC }, + { XFS_MOUNT_INO64, "," MNTOPT_INO64 }, + { XFS_MOUNT_NOALIGN, "," MNTOPT_NOALIGN }, + { XFS_MOUNT_SWALLOC, "," MNTOPT_SWALLOC }, + { XFS_MOUNT_NOUUID, "," MNTOPT_NOUUID }, + { XFS_MOUNT_NORECOVERY, "," MNTOPT_NORECOVERY }, + { XFS_MOUNT_OSYNCISOSYNC, "," MNTOPT_OSYNCISOSYNC }, + { XFS_MOUNT_ATTR2, "," MNTOPT_ATTR2 }, + { XFS_MOUNT_FILESTREAMS, "," MNTOPT_FILESTREAM }, + { XFS_MOUNT_DMAPI, "," MNTOPT_DMAPI }, + { XFS_MOUNT_GRPID, "," MNTOPT_GRPID }, + { 0, NULL } + }; + static struct proc_xfs_info xfs_info_unset[] = { + /* the few simple ones we can get from the mount struct */ + { XFS_MOUNT_IDELETE, "," MNTOPT_IKEEP }, + { XFS_MOUNT_COMPAT_IOSIZE, "," MNTOPT_LARGEIO }, + { XFS_MOUNT_BARRIER, "," MNTOPT_NOBARRIER }, + { XFS_MOUNT_SMALL_INUMS, "," MNTOPT_64BITINODE }, + { 0, NULL } + }; + struct proc_xfs_info *xfs_infop; + + for (xfs_infop = xfs_info_set; xfs_infop->flag; xfs_infop++) { + if (mp->m_flags & 
xfs_infop->flag) + seq_puts(m, xfs_infop->str); + } + for (xfs_infop = xfs_info_unset; xfs_infop->flag; xfs_infop++) { + if (!(mp->m_flags & xfs_infop->flag)) + seq_puts(m, xfs_infop->str); + } + + if (mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) + seq_printf(m, "," MNTOPT_ALLOCSIZE "=%dk", + (int)(1 << mp->m_writeio_log) >> 10); + + if (mp->m_logbufs > 0) + seq_printf(m, "," MNTOPT_LOGBUFS "=%d", mp->m_logbufs); + if (mp->m_logbsize > 0) + seq_printf(m, "," MNTOPT_LOGBSIZE "=%dk", mp->m_logbsize >> 10); + + if (mp->m_logname) + seq_printf(m, "," MNTOPT_LOGDEV "=%s", mp->m_logname); + if (mp->m_rtname) + seq_printf(m, "," MNTOPT_RTDEV "=%s", mp->m_rtname); + + if (mp->m_dalign > 0) + seq_printf(m, "," MNTOPT_SUNIT "=%d", + (int)XFS_FSB_TO_BB(mp, mp->m_dalign)); + if (mp->m_swidth > 0) + seq_printf(m, "," MNTOPT_SWIDTH "=%d", + (int)XFS_FSB_TO_BB(mp, mp->m_swidth)); + + if (mp->m_qflags & (XFS_UQUOTA_ACCT|XFS_UQUOTA_ENFD)) + seq_puts(m, "," MNTOPT_USRQUOTA); + else if (mp->m_qflags & XFS_UQUOTA_ACCT) + seq_puts(m, "," MNTOPT_UQUOTANOENF); + + if (mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_OQUOTA_ENFD)) + seq_puts(m, "," MNTOPT_PRJQUOTA); + else if (mp->m_qflags & XFS_PQUOTA_ACCT) + seq_puts(m, "," MNTOPT_PQUOTANOENF); + + if (mp->m_qflags & (XFS_GQUOTA_ACCT|XFS_OQUOTA_ENFD)) + seq_puts(m, "," MNTOPT_GRPQUOTA); + else if (mp->m_qflags & XFS_GQUOTA_ACCT) + seq_puts(m, "," MNTOPT_GQUOTANOENF); + + if (!(mp->m_qflags & XFS_ALL_QUOTA_ACCT)) + seq_puts(m, "," MNTOPT_NOQUOTA); + + return 0; +} __uint64_t xfs_max_file_offset( unsigned int blockshift) Index: 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_vfsops.c 2007-10-31 10:06:18.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_vfsops.c 2007-10-31 10:24:25.909483756 +1100 @@ -1482,433 +1482,3 @@ xfs_vget( return 0; } - -#define MNTOPT_LOGBUFS "logbufs" /* number of XFS log buffers */ -#define MNTOPT_LOGBSIZE "logbsize" /* size of XFS log buffers */ 
-#define MNTOPT_LOGDEV "logdev" /* log device */ -#define MNTOPT_RTDEV "rtdev" /* realtime I/O device */ -#define MNTOPT_BIOSIZE "biosize" /* log2 of preferred buffered io size */ -#define MNTOPT_WSYNC "wsync" /* safe-mode nfs compatible mount */ -#define MNTOPT_INO64 "ino64" /* force inodes into 64-bit range */ -#define MNTOPT_NOALIGN "noalign" /* turn off stripe alignment */ -#define MNTOPT_SWALLOC "swalloc" /* turn on stripe width allocation */ -#define MNTOPT_SUNIT "sunit" /* data volume stripe unit */ -#define MNTOPT_SWIDTH "swidth" /* data volume stripe width */ -#define MNTOPT_NOUUID "nouuid" /* ignore filesystem UUID */ -#define MNTOPT_MTPT "mtpt" /* filesystem mount point */ -#define MNTOPT_GRPID "grpid" /* group-ID from parent directory */ -#define MNTOPT_NOGRPID "nogrpid" /* group-ID from current process */ -#define MNTOPT_BSDGROUPS "bsdgroups" /* group-ID from parent directory */ -#define MNTOPT_SYSVGROUPS "sysvgroups" /* group-ID from current process */ -#define MNTOPT_ALLOCSIZE "allocsize" /* preferred allocation size */ -#define MNTOPT_NORECOVERY "norecovery" /* don't run XFS recovery */ -#define MNTOPT_BARRIER "barrier" /* use writer barriers for log write and - * unwritten extent conversion */ -#define MNTOPT_NOBARRIER "nobarrier" /* .. disable */ -#define MNTOPT_OSYNCISOSYNC "osyncisosync" /* o_sync is REALLY o_sync */ -#define MNTOPT_64BITINODE "inode64" /* inodes can be allocated anywhere */ -#define MNTOPT_IKEEP "ikeep" /* do not free empty inode clusters */ -#define MNTOPT_NOIKEEP "noikeep" /* free empty inode clusters */ -#define MNTOPT_LARGEIO "largeio" /* report large I/O sizes in stat() */ -#define MNTOPT_NOLARGEIO "nolargeio" /* do not report large I/O sizes - * in stat(). 
*/ -#define MNTOPT_ATTR2 "attr2" /* do use attr2 attribute format */ -#define MNTOPT_NOATTR2 "noattr2" /* do not use attr2 attribute format */ -#define MNTOPT_FILESTREAM "filestreams" /* use filestreams allocator */ -#define MNTOPT_QUOTA "quota" /* disk quotas (user) */ -#define MNTOPT_NOQUOTA "noquota" /* no quotas */ -#define MNTOPT_USRQUOTA "usrquota" /* user quota enabled */ -#define MNTOPT_GRPQUOTA "grpquota" /* group quota enabled */ -#define MNTOPT_PRJQUOTA "prjquota" /* project quota enabled */ -#define MNTOPT_UQUOTA "uquota" /* user quota (IRIX variant) */ -#define MNTOPT_GQUOTA "gquota" /* group quota (IRIX variant) */ -#define MNTOPT_PQUOTA "pquota" /* project quota (IRIX variant) */ -#define MNTOPT_UQUOTANOENF "uqnoenforce"/* user quota limit enforcement */ -#define MNTOPT_GQUOTANOENF "gqnoenforce"/* group quota limit enforcement */ -#define MNTOPT_PQUOTANOENF "pqnoenforce"/* project quota limit enforcement */ -#define MNTOPT_QUOTANOENF "qnoenforce" /* same as uqnoenforce */ -#define MNTOPT_DMAPI "dmapi" /* DMI enabled (DMAPI / XDSM) */ -#define MNTOPT_XDSM "xdsm" /* DMI enabled (DMAPI / XDSM) */ -#define MNTOPT_DMI "dmi" /* DMI enabled (DMAPI / XDSM) */ - -STATIC unsigned long -suffix_strtoul(char *s, char **endp, unsigned int base) -{ - int last, shift_left_factor = 0; - char *value = s; - - last = strlen(value) - 1; - if (value[last] == 'K' || value[last] == 'k') { - shift_left_factor = 10; - value[last] = '\0'; - } - if (value[last] == 'M' || value[last] == 'm') { - shift_left_factor = 20; - value[last] = '\0'; - } - if (value[last] == 'G' || value[last] == 'g') { - shift_left_factor = 30; - value[last] = '\0'; - } - - return simple_strtoul((const char *)s, endp, base) << shift_left_factor; -} - -int -xfs_parseargs( - struct xfs_mount *mp, - char *options, - struct xfs_mount_args *args, - int update) -{ - char *this_char, *value, *eov; - int dsunit, dswidth, vol_dsunit, vol_dswidth; - int iosize; - int ikeep = 0; - - args->flags |= XFSMNT_BARRIER; - 
args->flags2 |= XFSMNT2_COMPAT_IOSIZE; - - if (!options) - goto done; - - iosize = dsunit = dswidth = vol_dsunit = vol_dswidth = 0; - - while ((this_char = strsep(&options, ",")) != NULL) { - if (!*this_char) - continue; - if ((value = strchr(this_char, '=')) != NULL) - *value++ = 0; - - if (!strcmp(this_char, MNTOPT_LOGBUFS)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - args->logbufs = simple_strtoul(value, &eov, 10); - } else if (!strcmp(this_char, MNTOPT_LOGBSIZE)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - args->logbufsize = suffix_strtoul(value, &eov, 10); - } else if (!strcmp(this_char, MNTOPT_LOGDEV)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - strncpy(args->logname, value, MAXNAMELEN); - } else if (!strcmp(this_char, MNTOPT_MTPT)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - strncpy(args->mtpt, value, MAXNAMELEN); - } else if (!strcmp(this_char, MNTOPT_RTDEV)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - strncpy(args->rtname, value, MAXNAMELEN); - } else if (!strcmp(this_char, MNTOPT_BIOSIZE)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - iosize = simple_strtoul(value, &eov, 10); - args->flags |= XFSMNT_IOSIZE; - args->iosizelog = (uint8_t) iosize; - } else if (!strcmp(this_char, MNTOPT_ALLOCSIZE)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - iosize = suffix_strtoul(value, &eov, 10); - args->flags |= XFSMNT_IOSIZE; - args->iosizelog = ffs(iosize) - 1; - } else if (!strcmp(this_char, MNTOPT_GRPID) || 
- !strcmp(this_char, MNTOPT_BSDGROUPS)) { - mp->m_flags |= XFS_MOUNT_GRPID; - } else if (!strcmp(this_char, MNTOPT_NOGRPID) || - !strcmp(this_char, MNTOPT_SYSVGROUPS)) { - mp->m_flags &= ~XFS_MOUNT_GRPID; - } else if (!strcmp(this_char, MNTOPT_WSYNC)) { - args->flags |= XFSMNT_WSYNC; - } else if (!strcmp(this_char, MNTOPT_OSYNCISOSYNC)) { - args->flags |= XFSMNT_OSYNCISOSYNC; - } else if (!strcmp(this_char, MNTOPT_NORECOVERY)) { - args->flags |= XFSMNT_NORECOVERY; - } else if (!strcmp(this_char, MNTOPT_INO64)) { - args->flags |= XFSMNT_INO64; -#if !XFS_BIG_INUMS - cmn_err(CE_WARN, - "XFS: %s option not allowed on this system", - this_char); - return EINVAL; -#endif - } else if (!strcmp(this_char, MNTOPT_NOALIGN)) { - args->flags |= XFSMNT_NOALIGN; - } else if (!strcmp(this_char, MNTOPT_SWALLOC)) { - args->flags |= XFSMNT_SWALLOC; - } else if (!strcmp(this_char, MNTOPT_SUNIT)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - dsunit = simple_strtoul(value, &eov, 10); - } else if (!strcmp(this_char, MNTOPT_SWIDTH)) { - if (!value || !*value) { - cmn_err(CE_WARN, - "XFS: %s option requires an argument", - this_char); - return EINVAL; - } - dswidth = simple_strtoul(value, &eov, 10); - } else if (!strcmp(this_char, MNTOPT_64BITINODE)) { - args->flags &= ~XFSMNT_32BITINODES; -#if !XFS_BIG_INUMS - cmn_err(CE_WARN, - "XFS: %s option not allowed on this system", - this_char); - return EINVAL; -#endif - } else if (!strcmp(this_char, MNTOPT_NOUUID)) { - args->flags |= XFSMNT_NOUUID; - } else if (!strcmp(this_char, MNTOPT_BARRIER)) { - args->flags |= XFSMNT_BARRIER; - } else if (!strcmp(this_char, MNTOPT_NOBARRIER)) { - args->flags &= ~XFSMNT_BARRIER; - } else if (!strcmp(this_char, MNTOPT_IKEEP)) { - ikeep = 1; - args->flags &= ~XFSMNT_IDELETE; - } else if (!strcmp(this_char, MNTOPT_NOIKEEP)) { - args->flags |= XFSMNT_IDELETE; - } else if (!strcmp(this_char, MNTOPT_LARGEIO)) { - args->flags2 &= 
~XFSMNT2_COMPAT_IOSIZE; - } else if (!strcmp(this_char, MNTOPT_NOLARGEIO)) { - args->flags2 |= XFSMNT2_COMPAT_IOSIZE; - } else if (!strcmp(this_char, MNTOPT_ATTR2)) { - args->flags |= XFSMNT_ATTR2; - } else if (!strcmp(this_char, MNTOPT_NOATTR2)) { - args->flags &= ~XFSMNT_ATTR2; - } else if (!strcmp(this_char, MNTOPT_FILESTREAM)) { - args->flags2 |= XFSMNT2_FILESTREAMS; - } else if (!strcmp(this_char, MNTOPT_NOQUOTA)) { - args->flags &= ~(XFSMNT_UQUOTAENF|XFSMNT_UQUOTA); - args->flags &= ~(XFSMNT_GQUOTAENF|XFSMNT_GQUOTA); - } else if (!strcmp(this_char, MNTOPT_QUOTA) || - !strcmp(this_char, MNTOPT_UQUOTA) || - !strcmp(this_char, MNTOPT_USRQUOTA)) { - args->flags |= XFSMNT_UQUOTA | XFSMNT_UQUOTAENF; - } else if (!strcmp(this_char, MNTOPT_QUOTANOENF) || - !strcmp(this_char, MNTOPT_UQUOTANOENF)) { - args->flags |= XFSMNT_UQUOTA; - args->flags &= ~XFSMNT_UQUOTAENF; - } else if (!strcmp(this_char, MNTOPT_PQUOTA) || - !strcmp(this_char, MNTOPT_PRJQUOTA)) { - args->flags |= XFSMNT_PQUOTA | XFSMNT_PQUOTAENF; - } else if (!strcmp(this_char, MNTOPT_PQUOTANOENF)) { - args->flags |= XFSMNT_PQUOTA; - args->flags &= ~XFSMNT_PQUOTAENF; - } else if (!strcmp(this_char, MNTOPT_GQUOTA) || - !strcmp(this_char, MNTOPT_GRPQUOTA)) { - args->flags |= XFSMNT_GQUOTA | XFSMNT_GQUOTAENF; - } else if (!strcmp(this_char, MNTOPT_GQUOTANOENF)) { - args->flags |= XFSMNT_GQUOTA; - args->flags &= ~XFSMNT_GQUOTAENF; - } else if (!strcmp(this_char, MNTOPT_DMAPI)) { - args->flags |= XFSMNT_DMAPI; - } else if (!strcmp(this_char, MNTOPT_XDSM)) { - args->flags |= XFSMNT_DMAPI; - } else if (!strcmp(this_char, MNTOPT_DMI)) { - args->flags |= XFSMNT_DMAPI; - } else if (!strcmp(this_char, "ihashsize")) { - cmn_err(CE_WARN, - "XFS: ihashsize no longer used, option is deprecated."); - } else if (!strcmp(this_char, "osyncisdsync")) { - /* no-op, this is now the default */ - cmn_err(CE_WARN, - "XFS: osyncisdsync is now the default, option is deprecated."); - } else if (!strcmp(this_char, "irixsgid")) { - 
cmn_err(CE_WARN, - "XFS: irixsgid is now a sysctl(2) variable, option is deprecated."); - } else { - cmn_err(CE_WARN, - "XFS: unknown mount option [%s].", this_char); - return EINVAL; - } - } - - if (args->flags & XFSMNT_NORECOVERY) { - if ((mp->m_flags & XFS_MOUNT_RDONLY) == 0) { - cmn_err(CE_WARN, - "XFS: no-recovery mounts must be read-only."); - return EINVAL; - } - } - - if ((args->flags & XFSMNT_NOALIGN) && (dsunit || dswidth)) { - cmn_err(CE_WARN, - "XFS: sunit and swidth options incompatible with the noalign option"); - return EINVAL; - } - - if ((args->flags & XFSMNT_GQUOTA) && (args->flags & XFSMNT_PQUOTA)) { - cmn_err(CE_WARN, - "XFS: cannot mount with both project and group quota"); - return EINVAL; - } - - if ((args->flags & XFSMNT_DMAPI) && *args->mtpt == '\0') { - printk("XFS: %s option needs the mount point option as well\n", - MNTOPT_DMAPI); - return EINVAL; - } - - if ((dsunit && !dswidth) || (!dsunit && dswidth)) { - cmn_err(CE_WARN, - "XFS: sunit and swidth must be specified together"); - return EINVAL; - } - - if (dsunit && (dswidth % dsunit != 0)) { - cmn_err(CE_WARN, - "XFS: stripe width (%d) must be a multiple of the stripe unit (%d)", - dswidth, dsunit); - return EINVAL; - } - - /* - * Applications using DMI filesystems often expect the - * inode generation number to be monotonically increasing. - * If we delete inode chunks we break this assumption, so - * keep unused inode chunks on disk for DMI filesystems - * until we come up with a better solution. - * Note that if "ikeep" or "noikeep" mount options are - * supplied, then they are honored. - */ - if (!(args->flags & XFSMNT_DMAPI) && !ikeep) - args->flags |= XFSMNT_IDELETE; - - if ((args->flags & XFSMNT_NOALIGN) != XFSMNT_NOALIGN) { - if (dsunit) { - args->sunit = dsunit; - args->flags |= XFSMNT_RETERR; - } else { - args->sunit = vol_dsunit; - } - dswidth ? 
(args->swidth = dswidth) : - (args->swidth = vol_dswidth); - } else { - args->sunit = args->swidth = 0; - } - -done: - if (args->flags & XFSMNT_32BITINODES) - mp->m_flags |= XFS_MOUNT_SMALL_INUMS; - if (args->flags2) - args->flags |= XFSMNT_FLAGS2; - return 0; -} - -struct proc_xfs_info { - int flag; - char *str; -}; - -int -xfs_showargs( - struct xfs_mount *mp, - struct seq_file *m) -{ - static struct proc_xfs_info xfs_info_set[] = { - /* the few simple ones we can get from the mount struct */ - { XFS_MOUNT_WSYNC, "," MNTOPT_WSYNC }, - { XFS_MOUNT_INO64, "," MNTOPT_INO64 }, - { XFS_MOUNT_NOALIGN, "," MNTOPT_NOALIGN }, - { XFS_MOUNT_SWALLOC, "," MNTOPT_SWALLOC }, - { XFS_MOUNT_NOUUID, "," MNTOPT_NOUUID }, - { XFS_MOUNT_NORECOVERY, "," MNTOPT_NORECOVERY }, - { XFS_MOUNT_OSYNCISOSYNC, "," MNTOPT_OSYNCISOSYNC }, - { XFS_MOUNT_ATTR2, "," MNTOPT_ATTR2 }, - { XFS_MOUNT_FILESTREAMS, "," MNTOPT_FILESTREAM }, - { XFS_MOUNT_DMAPI, "," MNTOPT_DMAPI }, - { XFS_MOUNT_GRPID, "," MNTOPT_GRPID }, - { 0, NULL } - }; - static struct proc_xfs_info xfs_info_unset[] = { - /* the few simple ones we can get from the mount struct */ - { XFS_MOUNT_IDELETE, "," MNTOPT_IKEEP }, - { XFS_MOUNT_COMPAT_IOSIZE, "," MNTOPT_LARGEIO }, - { XFS_MOUNT_BARRIER, "," MNTOPT_NOBARRIER }, - { XFS_MOUNT_SMALL_INUMS, "," MNTOPT_64BITINODE }, - { 0, NULL } - }; - struct proc_xfs_info *xfs_infop; - - for (xfs_infop = xfs_info_set; xfs_infop->flag; xfs_infop++) { - if (mp->m_flags & xfs_infop->flag) - seq_puts(m, xfs_infop->str); - } - for (xfs_infop = xfs_info_unset; xfs_infop->flag; xfs_infop++) { - if (!(mp->m_flags & xfs_infop->flag)) - seq_puts(m, xfs_infop->str); - } - - if (mp->m_flags & XFS_MOUNT_DFLT_IOSIZE) - seq_printf(m, "," MNTOPT_ALLOCSIZE "=%dk", - (int)(1 << mp->m_writeio_log) >> 10); - - if (mp->m_logbufs > 0) - seq_printf(m, "," MNTOPT_LOGBUFS "=%d", mp->m_logbufs); - if (mp->m_logbsize > 0) - seq_printf(m, "," MNTOPT_LOGBSIZE "=%dk", mp->m_logbsize >> 10); - - if (mp->m_logname) - 
seq_printf(m, "," MNTOPT_LOGDEV "=%s", mp->m_logname); - if (mp->m_rtname) - seq_printf(m, "," MNTOPT_RTDEV "=%s", mp->m_rtname); - - if (mp->m_dalign > 0) - seq_printf(m, "," MNTOPT_SUNIT "=%d", - (int)XFS_FSB_TO_BB(mp, mp->m_dalign)); - if (mp->m_swidth > 0) - seq_printf(m, "," MNTOPT_SWIDTH "=%d", - (int)XFS_FSB_TO_BB(mp, mp->m_swidth)); - - if (mp->m_qflags & (XFS_UQUOTA_ACCT|XFS_UQUOTA_ENFD)) - seq_puts(m, "," MNTOPT_USRQUOTA); - else if (mp->m_qflags & XFS_UQUOTA_ACCT) - seq_puts(m, "," MNTOPT_UQUOTANOENF); - - if (mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_OQUOTA_ENFD)) - seq_puts(m, "," MNTOPT_PRJQUOTA); - else if (mp->m_qflags & XFS_PQUOTA_ACCT) - seq_puts(m, "," MNTOPT_PQUOTANOENF); - - if (mp->m_qflags & (XFS_GQUOTA_ACCT|XFS_OQUOTA_ENFD)) - seq_puts(m, "," MNTOPT_GRPQUOTA); - else if (mp->m_qflags & XFS_GQUOTA_ACCT) - seq_puts(m, "," MNTOPT_GQUOTANOENF); - - if (!(mp->m_qflags & XFS_ALL_QUOTA_ACCT)) - seq_puts(m, "," MNTOPT_NOQUOTA); - - return 0; -} Index: 2.6.x-xfs-new/fs/xfs/xfs_vfsops.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_vfsops.h 2007-10-02 16:01:48.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_vfsops.h 2007-10-31 10:24:35.088295625 +1100 @@ -16,9 +16,6 @@ int xfs_mntupdate(struct xfs_mount *mp, int xfs_root(struct xfs_mount *mp, bhv_vnode_t **vpp); int xfs_sync(struct xfs_mount *mp, int flags); int xfs_vget(struct xfs_mount *mp, bhv_vnode_t **vpp, struct xfs_fid *xfid); -int xfs_parseargs(struct xfs_mount *mp, char *options, - struct xfs_mount_args *args, int update); -int xfs_showargs(struct xfs_mount *mp, struct seq_file *m); void xfs_do_force_shutdown(struct xfs_mount *mp, int flags, char *fname, int lnnum); void xfs_attr_quiesce(struct xfs_mount *mp); From owner-xfs@oss.sgi.com Thu Nov 1 18:51:54 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 18:51:58 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: 
No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA21pns7011676 for ; Thu, 1 Nov 2007 18:51:53 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA04058; Fri, 2 Nov 2007 12:51:50 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 22C5658C38F7; Fri, 2 Nov 2007 12:51:50 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: PARTIAL TAKE 971186 - Show all mount args in /proc/mounts Message-Id: <20071102015150.22C5658C38F7@chook.melbourne.sgi.com> Date: Fri, 2 Nov 2007 12:51:50 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13524 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Show all mount args in /proc/mounts There are several mount options that don't show up in /proc/mounts. Add them in and clean up the showargs code at the same time. Date: Fri Nov 2 12:51:17 AEDT 2007 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: hch@infradead.org The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30004a fs/xfs/xfs_vfsops.c - 1.547 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.547&r2=text&tr2=1.546&f=h - Show all mount args in /proc/mounts. 
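(Editorial illustration, not part of the original message.) For readers who want to check which options a mounted filesystem actually reports, the following is a minimal sketch of reading one /proc/mounts-style line from user space. The sample line is hypothetical (device names and values are made up), and the helper name parse_mounts_line is an assumption introduced here; only the option names (logbufs, logdev, sunit, swidth, usrquota) come from XFS itself.

```python
def parse_mounts_line(line):
    """Split one /proc/mounts-style line into its fields and an options dict.

    The six whitespace-separated fields are: device, mount point,
    filesystem type, comma-separated options, dump flag, fsck pass.
    """
    device, mountpoint, fstype, options, _dump, _passno = line.split()
    opts = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        # Flag-style options (e.g. "noatime") carry no value; record them as True.
        opts[key] = value if sep else True
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": opts,
    }


# Hypothetical sample line; not taken from any system in this thread.
sample = ("/dev/sdb1 /mnt/data xfs "
          "rw,noatime,logbufs=8,logdev=/dev/sdc1,sunit=128,swidth=512,usrquota 0 0")

if __name__ == "__main__":
    info = parse_mounts_line(sample)
    for name, value in sorted(info["options"].items()):
        print(name, value)
```

On a live system the same function could be applied to each line of /proc/mounts. Note that the sunit and swidth values are reported in 512-byte units, which is what the XFS_FSB_TO_BB() conversion in the patch produces.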
From owner-xfs@oss.sgi.com Thu Nov 1 18:57:01 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 18:57:04 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA21uuWM012579 for ; Thu, 1 Nov 2007 18:56:59 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA04290; Fri, 2 Nov 2007 12:56:56 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id B31AC58C38F7; Fri, 2 Nov 2007 12:56:56 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 972757 - Fix transaction overrun during writeback. Message-Id: <20071102015656.B31AC58C38F7@chook.melbourne.sgi.com> Date: Fri, 2 Nov 2007 12:56:56 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13525 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Fix transaction overrun during writeback. Prevent transaction overrun in xfs_iomap_write_allocate() if we race with a truncate that overlaps the delalloc range we were planning to allocate. If we race, we may allocate into a hole and that requires block allocation. At this point in time we don't have a reservation for block allocation (apart from metadata blocks) and so allocating into a hole rather than a delalloc region results in overflowing the transaction block reservation. Fix it by only allowing a single extent to be allocated at a time. 
Date: Fri Nov 2 12:56:36 AEDT 2007 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: lachlan@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30005a fs/xfs/xfs_iomap.c - 1.60 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_iomap.c.diff?r1=text&tr1=1.60&r2=text&tr2=1.59&f=h - Only allow xfs_iomap_write_allocate to allocate a single extent at a time to prevent races with truncate from causing unreserved allocation and hence transaction overruns. From owner-xfs@oss.sgi.com Thu Nov 1 19:09:30 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 19:09:33 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA229SWT014054 for ; Thu, 1 Nov 2007 19:09:30 -0700 Received: from macmini.sandeen.net (macmini.sandeen.net [10.0.0.61]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id B3AC418004FD9; Thu, 1 Nov 2007 21:09:31 -0500 (CDT) Message-ID: <472A87FA.7000804@sandeen.net> Date: Thu, 01 Nov 2007 21:14:18 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Jay Sullivan CC: xfs@oss.sgi.com Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c References: <06CCEA2EB1B80A4A937ED59005FA855101AED1BE@svits26.main.ad.rit.edu> In-Reply-To: <06CCEA2EB1B80A4A937ED59005FA855101AED1BE@svits26.main.ad.rit.edu> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13526 
X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Jay Sullivan wrote: > I ran xfs_repair -L on the FS and it could be mounted again, Was it not even mountable before this, or why did you use the -L flag? If the log is corrupted that points to more problems... perhaps you've had some power loss & your write caches evaporated, and lvm doesn't do barriers? -eric From owner-xfs@oss.sgi.com Thu Nov 1 19:22:54 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 19:22:57 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: *** X-Spam-Status: No, score=3.0 required=5.0 tests=BAYES_50,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from sc3app27.rit.edu (sc3app27.rit.edu [129.21.35.56]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA22MpxU016149 for ; Thu, 1 Nov 2007 19:22:54 -0700 Received: from cias-jpspgd-macbook.jayps.home (cpe-72-230-182-205.rochester.res.rr.com [72.230.182.205]) by smtp-server.rit.edu (PMDF V6.3-x14 #31420) with ESMTPSA id <0JQU00675XA7QD@smtp-server.rit.edu> for xfs@oss.sgi.com; Thu, 01 Nov 2007 22:22:56 -0400 (EDT) Date: Thu, 01 Nov 2007 22:22:54 -0400 From: Jay Sullivan Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c In-reply-to: <472A87FA.7000804@sandeen.net> To: xfs@oss.sgi.com Message-id: <9489F071-7966-4230-9DAC-D783B6B9600A@rit.edu> MIME-version: 1.0 X-Mailer: Apple Mail (2.912) X-RIT-Received-From: 72.230.182.205 jpspgd@smtp-server.rit.edu References: <06CCEA2EB1B80A4A937ED59005FA855101AED1BE@svits26.main.ad.rit.edu> <472A87FA.7000804@sandeen.net> X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 659 X-archive-position: 13527 X-ecartis-version: Ecartis v1.0.0 
Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpspgd@rit.edu Precedence: bulk X-list: xfs Good eye: it wasn't mountable, thus the -L flag. No recent (unplanned) power outages. The machine and the array that holds the disks are both on serious batteries/UPS and the array's cache batteries are in good health. ~Jay On Nov 1, 2007, at 10:14 PM, Eric Sandeen wrote: > Jay Sullivan wrote: > > > I ran xfs_repair -L on the FS and it could be mounted again, > > Was it not even mountable before this, or why did you use the -L flag? > If the log is corrupted that points to more problems... perhaps you've > had some power loss & your write caches evaporated, and lvm doesn't do > barriers? > > -eric > > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Nov 1 19:30:16 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 19:30:18 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA22UDO9017576 for ; Thu, 1 Nov 2007 19:30:15 -0700 Received: from Liberator.local (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id F1DFB187B8C0F; Thu, 1 Nov 2007 21:30:17 -0500 (CDT) Message-ID: <472A8BB9.7040100@sandeen.net> Date: Thu, 01 Nov 2007 21:30:17 -0500 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Jay Sullivan CC: xfs@oss.sgi.com Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c References: <06CCEA2EB1B80A4A937ED59005FA855101AED1BE@svits26.main.ad.rit.edu> <472A87FA.7000804@sandeen.net> <9489F071-7966-4230-9DAC-D783B6B9600A@rit.edu> In-Reply-To: 
<9489F071-7966-4230-9DAC-D783B6B9600A@rit.edu> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13528 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Jay Sullivan wrote: > Good eye: it wasn't mountable, thus the -L flag. No recent > (unplanned) power outages. The machine and the array that holds the > disks are both on serious batteries/UPS and the array's cache > batteries are in good health. Did you have the xfs_repair output to see what it found? You might also grab the very latest xfsprogs (2.9.4) in case it's catching more cases. I hate it when people suggest running memtest86, but I might do that anyway. :) What controller are you using? If you say "areca" I might be on to something with some other bugs I've seen... -Eric From owner-xfs@oss.sgi.com Thu Nov 1 19:35:51 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 19:35:54 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA22Zmhd018421 for ; Thu, 1 Nov 2007 19:35:50 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA05273; Fri, 2 Nov 2007 13:35:48 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id E27C658C38F7; Fri, 2 Nov 2007 13:35:47 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 972753 - Fix inode allocation latency Message-Id: 
<20071102023547.E27C658C38F7@chook.melbourne.sgi.com> Date: Fri, 2 Nov 2007 13:35:47 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13529 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Fix inode allocation latency The log force added in xfs_iget_core() has been a performance issue since it was introduced for tight loops that allocate then unlink a single file. Under heavy writeback, this can introduce unnecessary latency due to the log I/O getting stuck behind bulk data writes. Fix this latency problem by avoiding the need for the log force by moving the place we mark linux inode dirty to the transaction commit rather than on transaction completion. This also closes a potential hole in the sync code where a linux inode is not dirty between the time it is modified and the time the log buffer has been written to disk. Date: Fri Nov 2 13:35:27 AEDT 2007 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: hch@infradead.org The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30007a fs/xfs/xfs_inode_item.c - 1.132 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode_item.c.diff?r1=text&tr1=1.132&r2=text&tr2=1.131&f=h - Remove the need to mark the linux inode dirty in xfs_iunpin by marking it dirty during transaction commit. fs/xfs/xfs_iget.c - 1.237 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_iget.c.diff?r1=text&tr1=1.237&r2=text&tr2=1.236&f=h - Remove the need to force the log on pinned inode reuse by making sure we never need to touch the linux inode during transaction completion. 
fs/xfs/xfs_inode.c - 1.485 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.485&r2=text&tr2=1.484&f=h - Remove the need to mark the linux inode dirty in xfs_iunpin by marking it dirty during transaction commit. fs/xfs/xfs_inode.h - 1.237 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.h.diff?r1=text&tr1=1.237&r2=text&tr2=1.236&f=h - Remove the need to mark the linux inode dirty in xfs_iunpin by marking it dirty during transaction commit. fs/xfs/linux-2.6/xfs_iops.c - 1.267 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.267&r2=text&tr2=1.266&f=h - Remove the need to mark the linux inode dirty in xfs_iunpin by marking it dirty during transaction commit. From owner-xfs@oss.sgi.com Thu Nov 1 19:43:20 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 19:43:24 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA22hFsG019415 for ; Thu, 1 Nov 2007 19:43:19 -0700 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id NAA05472; Fri, 2 Nov 2007 13:43:14 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 9BF3458C38F7; Fri, 2 Nov 2007 13:43:14 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 972756 - Implement fallocate. 
Message-Id: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> Date: Fri, 2 Nov 2007 13:43:14 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13530 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Implement fallocate. Implement the new generic callout for file preallocation. Atomically change the file size if requested. Date: Fri Nov 2 13:42:52 AEDT 2007 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: hch@infradead.org The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30009a fs/xfs/linux-2.6/xfs_iops.c - 1.268 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.268&r2=text&tr2=1.267&f=h - implement ->fallocate() From owner-xfs@oss.sgi.com Thu Nov 1 20:00:19 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 20:00:22 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: *** X-Spam-Status: No, score=3.0 required=5.0 tests=BAYES_50,HTML_MESSAGE autolearn=no version=3.3.0-r574664 Received: from sc3app27.rit.edu (sc3app27.rit.edu [129.21.35.56]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA230Ibt021435 for ; Thu, 1 Nov 2007 20:00:19 -0700 Received: from cias-jpspgd-macbook.jayps.home (cpe-72-230-182-205.rochester.res.rr.com [72.230.182.205]) by smtp-server.rit.edu (PMDF V6.3-x14 #31420) with ESMTPSA id <0JQU00IC1WLNI4@smtp-server.rit.edu> for xfs@oss.sgi.com; Thu, 01 Nov 2007 22:08:13 -0400 (EDT) Date: Thu, 01 Nov 2007 22:08:09 -0400 From: Jay Sullivan Subject: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c To: xfs@oss.sgi.com Message-id: <3A2120EF-3EB0-4CF1-8C4E-920B9688D51F@rit.edu> MIME-version: 
1.0 X-Mailer: Apple Mail (2.912) X-RIT-Received-From: 72.230.182.205 jpspgd@smtp-server.rit.edu X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Content-length: 3288 X-archive-position: 13531 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpspgd@rit.edu Precedence: bulk X-list: xfs

(Sorry if this is a dupe to the list; it has been a long day.)

I have an XFS filesystem that has had the following happen twice in 3 months, both times an impossibly large block number was requested. Unfortunately my logs don't go back far enough for me to know if it was the _exact_ same block both times... I'm running xfsprogs 2.8.21. Excerpt from syslog (hostname obfuscated to 'servername' to protect the innocent):

##
Nov 1 14:06:32 servername dm-1: rw=0, want=39943195856896, limit=7759462400
Nov 1 14:06:32 servername I/O error in filesystem ("dm-1") meta-data dev dm-1 block 0x245400000ff8 ("xfs_trans_read_buf") error 5 buf count 4096
Nov 1 14:06:32 servername xfs_force_shutdown(dm-1,0x1) called from line 415 of file fs/xfs/xfs_trans_buf.c. Return address = 0xc02baa25
Nov 1 14:06:32 servername Filesystem "dm-1": I/O Error Detected. Shutting down filesystem: dm-1
Nov 1 14:06:32 servername Please umount the filesystem, and rectify the problem(s)
###

I ran xfs_repair -L on the FS and it could be mounted again, but how long until it happens a third time? What concerns me is that this is a FS smaller than 4TB and 39943195856896 (or 0x245400000ff8) seems like a block that I would only have if my FS was muuuuuch larger. The following is output from some pertinent programs:

###
servername ~ # xfs_info /mnt/san
meta-data=/dev/servername-sanvg01/servername-sanlv01 isize=256    agcount=5, agsize=203161600 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=969932800, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

servername ~ # mount
/dev/sda3 on / type ext3 (rw,noatime,acl)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec)
udev on /dev type tmpfs (rw,nosuid)
devpts on /dev/pts type devpts (rw,nosuid,noexec)
shm on /dev/shm type tmpfs (rw,noexec,nosuid,nodev)
usbfs on /proc/bus/usb type usbfs (rw,noexec,nosuid,devmode=0664,devgid=85)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
nfsd on /proc/fs/nfsd type nfsd (rw)
/dev/mapper/servername--sanvg01-servername--sanlv01 on /mnt/san type xfs (rw,noatime,nodiratime,logbufs=8,attr2)
/dev/mapper/servername--sanvg01-servername--rendersharelv01 on /mnt/san/rendershare type xfs (rw,noatime,nodiratime,logbufs=8,attr2)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)

servername ~ # uname -a
Linux servername 2.6.20-gentoo-r8 #7 SMP Fri Jun 29 14:46:02 EDT 2007 i686 Intel(R) Xeon(TM) CPU 3.20GHz GenuineIntel GNU/Linux
###

Does anyone know if this points to a bad block on a disk or if something is corrupted and can be fixed with some expert knowledge of xfs_db?
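(Editorial illustration, not part of the original message.) The numbers in the syslog excerpt and the xfs_info output can be cross-checked with a little arithmetic. The sketch below assumes that "want", "limit" and the hexadecimal "block" are all counted in 512-byte sectors; that unit is an assumption, but the figures are consistent with it. The function names are introduced here purely for illustration.

```python
SECTOR_BYTES = 512  # assumed unit for "want", "limit" and "block"

# Values copied from the syslog excerpt and the xfs_info output in this message.
WANT = 39943195856896          # end of the requested range ("want=")
LIMIT = 7759462400             # size of dm-1 ("limit=")
BAD_BLOCK = 0x245400000ff8     # "block" in the XFS I/O error line
BUF_BYTES = 4096               # "buf count" in the same line
FS_BLOCKS = 969932800          # data "blocks=" from xfs_info
FS_BLOCK_BYTES = 4096          # data "bsize=" from xfs_info


def request_end(start_sector, nbytes):
    """Sector just past the end of a read starting at start_sector."""
    return start_sector + nbytes // SECTOR_BYTES


def fs_size_in_sectors(blocks, block_bytes):
    """Filesystem data size expressed in 512-byte sectors."""
    return blocks * (block_bytes // SECTOR_BYTES)


if __name__ == "__main__":
    # The failing read ends exactly at "want": bad block plus an 8-sector buffer.
    print(request_end(BAD_BLOCK, BUF_BYTES) == WANT)
    # The device limit matches the filesystem size reported by xfs_info.
    print(fs_size_in_sectors(FS_BLOCKS, FS_BLOCK_BYTES) == LIMIT)
    # How far past the end of the device the request landed, as a ratio.
    print(WANT / LIMIT)
```

Both comparisons hold, so the block layer rejected a single 4096-byte metadata read whose start address lies thousands of times beyond the end of the device. That fits a bad block pointer stored in metadata better than it fits an unreadable sector on the disk, although the arithmetic alone cannot say where the bad pointer came from.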
~Jay [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Thu Nov 1 21:37:40 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 21:37:45 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA24bcr0004066 for ; Thu, 1 Nov 2007 21:37:39 -0700 Received: from timothy-shimmins-power-mac-g5.local (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA07528; Fri, 2 Nov 2007 15:37:34 +1100 Message-ID: <472AA999.6090900@sgi.com> Date: Fri, 02 Nov 2007 15:37:45 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Eric Sandeen CC: Jay Sullivan , xfs@oss.sgi.com Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c References: <06CCEA2EB1B80A4A937ED59005FA855101AED1BE@svits26.main.ad.rit.edu> <472A87FA.7000804@sandeen.net> In-Reply-To: <472A87FA.7000804@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13532 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Eric Sandeen wrote: > Jay Sullivan wrote: > >> I ran xfs_repair -L on the FS and it could be mounted again, > > Was it not even mountable before this, or why did you use the -L flag? > If the log is corrupted that points to more problems... perhaps you've > had some power loss & your write caches evaporated, and lvm doesn't do > barriers? 
> > -eric > BTW, I occasionally wonder about the reason for log corruptions. If we have an "evaporated" write cache that would stop a write from going but it wouldn't do a partial sector (< 512 byte) write, would it? I have presumed that sector writes complete or not and that is what the log code is based on. OOI, Jay, how did it fail to mount - what was the log msg? I presume you couldn't mount such that even the log couldn't be replayed? Did it fail during replay? --Tim From owner-xfs@oss.sgi.com Thu Nov 1 22:18:42 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 01 Nov 2007 22:18:50 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA25IdK9012755 for ; Thu, 1 Nov 2007 22:18:41 -0700 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA08548; Fri, 2 Nov 2007 16:18:38 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA25IbdD90702948; Fri, 2 Nov 2007 16:18:37 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA25IYPj90442928; Fri, 2 Nov 2007 16:18:34 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 2 Nov 2007 16:18:34 +1100 From: David Chinner To: Jay Sullivan Cc: xfs@oss.sgi.com Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c Message-ID: <20071102051834.GM995458@sgi.com> References: <3A2120EF-3EB0-4CF1-8C4E-920B9688D51F@rit.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline 
Content-Transfer-Encoding: 8bit In-Reply-To: <3A2120EF-3EB0-4CF1-8C4E-920B9688D51F@rit.edu> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13533 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Thu, Nov 01, 2007 at 10:08:09PM -0400, Jay Sullivan wrote: > (Sorry if this is a dupe to the list; it has been a long day.) > > I have an XFS filesystem that has had the following happen twice in 3 > months, both times an impossibly large block number was requested. .... Sure sign of a corrupted btree. > I ran xfs_repair –L on the FS and it could be mounted again, but how > long until it happens a third time? What was the problem that xfs_repair fixed? BTW, why did you run xfs_repair -L? Also, when it happens next, what does xfs_check tell you is broken? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Fri Nov 2 02:23:19 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 02 Nov 2007 02:23:29 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from stlx01.stz-softwaretechnik.com (stz-softwaretechnik.de [217.160.223.211]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA29NFqC025821 for ; Fri, 2 Nov 2007 02:23:19 -0700 Received: from rg by stlx01.stz-softwaretechnik.com with local (Exim 3.36 #1 (Debian)) id 1InsVK-0005Cb-00 for ; Fri, 02 Nov 2007 10:07:50 +0100 Date: Fri, 2 Nov 2007 10:07:40 +0100 From: Ralf Gross To: xfs@oss.sgi.com Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c Message-ID: <20071102090740.GB23263@p15145560.pureserver.info> References: 
<06CCEA2EB1B80A4A937ED59005FA855101AED1BE@svits26.main.ad.rit.edu> <472A87FA.7000804@sandeen.net> <9489F071-7966-4230-9DAC-D783B6B9600A@rit.edu> <472A8BB9.7040100@sandeen.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <472A8BB9.7040100@sandeen.net> User-Agent: Mutt/1.5.9i X-Virus-Scanned: ClamAV 0.91.2/4659/Thu Nov 1 09:24:40 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13534 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: Ralf-Lists@ralfgross.de Precedence: bulk X-list: xfs Eric Sandeen schrieb: > ... > What controller are you using? If you say "areca" I might be on to > something with some other bugs I've seen... I use areca controllers with xfs, but had no problems yet. Can you explain what bugs might hit me? Ralf From owner-xfs@oss.sgi.com Fri Nov 2 07:00:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 02 Nov 2007 07:00:08 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from SVITS26.main.ad.rit.edu (svits26.main.ad.rit.edu [129.21.18.136]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA2E01B5002655 for ; Fri, 2 Nov 2007 07:00:04 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c Date: Fri, 2 Nov 2007 10:00:23 -0400 Message-ID: <06CCEA2EB1B80A4A937ED59005FA855101AED204@svits26.main.ad.rit.edu> In-Reply-To: <472A8BB9.7040100@sandeen.net> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c Thread-Index: Acgc+HCNQCFWOWFQTvaSJTeWO28rWgAWfz9Q From: "Jay Sullivan" To: X-Virus-Scanned: ClamAV 
0.91.2/4660/Fri Nov 2 05:13:54 2007 on oss.sgi.com X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id lA2E04B5002696 X-archive-position: 13535 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpspgd@rit.edu Precedence: bulk X-list: xfs I lost the xfs_repair output on an xterm with only four lines of scrollback... I'll definitely be more careful to preserve more 'evidence' next time. =( "Pics or it didn't happen", right? I just upgraded xfsprogs and will scan the disk during my next scheduled downtime (probably in about 2 weeks). I'm tempted to just wipe the volume and start over: I have enough 'spare' space lying around to copy everything out to a fresh XFS volume. Regarding "areca": I'm using hardware RAID built into Apple XServe RAIDs o'er LSI FC929X cards. Someone else offered the likely explanation that the btree is corrupted. Isn't this something xfs_repair should be able to fix? Would it be easier, safer, and faster to move the data to a new volume (and restore corrupted files if/as I find them from backup)? We're talking about just less than 4TB of data which used to take about 6 hours to fsck (one pass) with ext3. Restoring the whole shebang from backups would probably take the better part of 12 years (waiting for compression, resetting ACLs, etc.)... FWIW, another (way less important,) much busier and significantly larger logical volume on the same array has been totally fine. Murphy--go figure. Thanks! -----Original Message----- From: Eric Sandeen [mailto:sandeen@sandeen.net] Sent: Thursday, November 01, 2007 10:30 PM To: Jay Sullivan Cc: xfs@oss.sgi.com Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c Jay Sullivan wrote: > Good eye: it wasn't mountable, thus the -L flag. No recent > (unplanned) power outages. 
The machine and the array that holds the > disks are both on serious batteries/UPS and the array's cache > batteries are in good health. Did you have the xfs_repair output to see what it found? You might also grab the very latest xfsprogs (2.9.4) in case it's catching more cases. I hate it when people suggest running memtest86, but I might do that anyway. :) What controller are you using? If you say "areca" I might be on to something with some other bugs I've seen... -Eric From owner-xfs@oss.sgi.com Fri Nov 2 07:48:57 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 02 Nov 2007 07:49:01 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from SVITS26.main.ad.rit.edu (svits26.main.ad.rit.edu [129.21.18.136]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA2Emso4008131 for ; Fri, 2 Nov 2007 07:48:57 -0700 X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c Date: Fri, 2 Nov 2007 10:49:16 -0400 Message-ID: <06CCEA2EB1B80A4A937ED59005FA855101AED213@svits26.main.ad.rit.edu> In-Reply-To: <06CCEA2EB1B80A4A937ED59005FA855101AED204@svits26.main.ad.rit.edu> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c Thread-Index: Acgc+HCNQCFWOWFQTvaSJTeWO28rWgAWfz9QAAMsAYA= From: "Jay Sullivan" To: X-Virus-Scanned: ClamAV 0.91.2/4661/Fri Nov 2 06:48:31 2007 on oss.sgi.com X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id lA2Emvo4008137 X-archive-position: 13536 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpspgd@rit.edu 
Precedence: bulk X-list: xfs What can I say about Murphy and his silly laws? I just had a drive fail on my array. I wonder if this is the root of my problems... Yay parity. ~Jay -----Original Message----- From: xfs-bounce@oss.sgi.com [mailto:xfs-bounce@oss.sgi.com] On Behalf Of Jay Sullivan Sent: Friday, November 02, 2007 10:00 AM To: xfs@oss.sgi.com Subject: RE: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c I lost the xfs_repair output on an xterm with only four lines of scrollback... I'll definitely be more careful to preserve more 'evidence' next time. =( "Pics or it didn't happen", right? I just upgraded xfsprogs and will scan the disk during my next scheduled downtime (probably in about 2 weeks). I'm tempted to just wipe the volume and start over: I have enough 'spare' space lying around to copy everything out to a fresh XFS volume. Regarding "areca": I'm using hardware RAID built into Apple XServe RAIDs o'er LSI FC929X cards. Someone else offered the likely explanation that the btree is corrupted. Isn't this something xfs_repair should be able to fix? Would it be easier, safer, and faster to move the data to a new volume (and restore corrupted files if/as I find them from backup)? We're talking about just less than 4TB of data which used to take about 6 hours to fsck (one pass) with ext3. Restoring the whole shebang from backups would probably take the better part of 12 years (waiting for compression, resetting ACLs, etc.)... FWIW, another (way less important,) much busier and significantly larger logical volume on the same array has been totally fine. Murphy--go figure. Thanks! -----Original Message----- From: Eric Sandeen [mailto:sandeen@sandeen.net] Sent: Thursday, November 01, 2007 10:30 PM To: Jay Sullivan Cc: xfs@oss.sgi.com Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c Jay Sullivan wrote: > Good eye: it wasn't mountable, thus the -L flag. No recent > (unplanned) power outages. 
The machine and the array that holds the > disks are both on serious batteries/UPS and the array's cache > batteries are in good health. Did you have the xfs_repair output to see what it found? You might also grab the very latest xfsprogs (2.9.4) in case it's catching more cases. I hate it when people suggest running memtest86, but I might do that anyway. :) What controller are you using? If you say "areca" I might be on to something with some other bugs I've seen... -Eric From owner-xfs@oss.sgi.com Fri Nov 2 09:10:36 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 02 Nov 2007 09:10:39 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-4.6 required=5.0 tests=AWL,BAYES_00, RCVD_IN_DNSWL_MED,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664 Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA2GAWvv021419 for ; Fri, 2 Nov 2007 09:10:36 -0700 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.1) with ESMTP id lA2GAawa023026; Fri, 2 Nov 2007 12:10:37 -0400 Received: from lacrosse.corp.redhat.com (lacrosse.corp.redhat.com [172.16.52.154]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id lA2GAa6B023757; Fri, 2 Nov 2007 12:10:36 -0400 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by lacrosse.corp.redhat.com (8.12.11.20060308/8.11.6) with ESMTP id lA2GAXtK017600; Fri, 2 Nov 2007 12:10:35 -0400 Message-ID: <472B4BF8.2040500@sandeen.net> Date: Fri, 02 Nov 2007 11:10:32 -0500 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.12 (X11/20070530) MIME-Version: 1.0 To: Ralf Gross CC: xfs@oss.sgi.com Subject: Re: xfs_force_shutdown called from file fs/xfs/xfs_trans_buf.c References: <06CCEA2EB1B80A4A937ED59005FA855101AED1BE@svits26.main.ad.rit.edu> <472A87FA.7000804@sandeen.net> <9489F071-7966-4230-9DAC-D783B6B9600A@rit.edu> 
<472A8BB9.7040100@sandeen.net> <20071102090740.GB23263@p15145560.pureserver.info> In-Reply-To: <20071102090740.GB23263@p15145560.pureserver.info> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4661/Fri Nov 2 06:48:31 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13537 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Ralf Gross wrote: > Eric Sandeen schrieb: >> ... >> What controller are you using? If you say "areca" I might be on to >> something with some other bugs I've seen... > > I use areca controllers with xfs, but had no problems yet. Can you > explain what bugs might hit me? maybe none, it was just a wild guess. :) I've seen a bug on ext3, volumes > 2T corrupted, on an areca controller. Due to the 2T threshold, it seems more like a lower layer IO issue (2^32 x 512) than a filesystem issue... googling a bit I found others with problems on areca, but then that's what I googled for, so I might have self-selected. So, maybe nothing, I was just looking for a 3rd data point. 
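The "2^32 x 512" figure above is the largest byte range that a 32-bit count of 512-byte sectors can address, which is exactly 2 TiB. A minimal sketch of that arithmetic follows. The wrap-around helper only illustrates how a lower layer that truncated sector numbers to 32 bits would behave; it is an assumption made for illustration, not a description of what the areca driver or firmware does, and the names SECTOR_SIZE, LBA32_LIMIT, max_bytes_32bit_lba and wrap_lba32 are made up for this example.

```python
SECTOR_SIZE = 512        # bytes per sector, as in the "2^32 x 512" figure
LBA32_LIMIT = 2 ** 32    # number of sectors a 32-bit sector number can address


def max_bytes_32bit_lba(sector_size: int = SECTOR_SIZE) -> int:
    """Largest byte range addressable with a 32-bit sector number."""
    return LBA32_LIMIT * sector_size


def wrap_lba32(byte_offset: int, sector_size: int = SECTOR_SIZE) -> int:
    """Hypothetical: the byte offset actually hit if some layer kept only
    the low 32 bits of the sector number."""
    sector = byte_offset // sector_size
    within = byte_offset % sector_size
    return (sector % LBA32_LIMIT) * sector_size + within


if __name__ == "__main__":
    limit = max_bytes_32bit_lba()
    print(limit, limit / 2 ** 40)      # bytes, and the same value in TiB
    # An I/O 4 KiB past the limit would land 4 KiB into the volume instead.
    print(wrap_lba32(limit + 4096))
```

Offsets below the limit pass through unchanged, so a bug of this shape would stay invisible until a volume grows past 2T, which matches the pattern of reports described above.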
-Eric From owner-xfs@oss.sgi.com Fri Nov 2 14:26:58 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 02 Nov 2007 14:27:03 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.181]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA2LQukO027923 for ; Fri, 2 Nov 2007 14:26:58 -0700 Received: by py-out-1112.google.com with SMTP id u77so1825595pyb for ; Fri, 02 Nov 2007 14:27:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=DTKb1zBYQ3TjarPcNAj9UXy0ml4Bz5QWewqBR0273K8=; b=hXG0H2lmrzz43qNHUTZ7z543vFzvhnPxfGZMhYAocUOyvva91yQrW/JeqFT1phsyEIUCDYSk2u2TnFHb3fvGJ+ZslMQ8RcAh94BtlUmuWk1Ol94JekxF+PJM7n+TRzAmOEwJBmble8Yd+xib4bEs/GFaJwisxyPjfXB5hYYreLI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=Ed+heS0BoheRcPzOeLRxaBnYfh0mZ/GfjIwVlbZ66GTVnoHuA1ZK+TSms058aPJ5tlE/Emy7ZPlU5KhlR66KQTrkUYdfD4SV7N6jDlAVdYOFfO7OkanYx9rBBgmpe9PmYChwrlV2i7R4dMd7SyyOU5azLuk6MG92kMna+KKTH1g= Received: by 10.64.27.13 with SMTP id a13mr7234527qba.1194037343896; Fri, 02 Nov 2007 14:02:23 -0700 (PDT) Received: by 10.65.112.13 with HTTP; Fri, 2 Nov 2007 14:02:23 -0700 (PDT) Message-ID: <64bb37e0711021402g4961e474u75e48fa5a893ab7a@mail.gmail.com> Date: Fri, 2 Nov 2007 22:02:23 +0100 From: "Torsten Kaiser" To: "David Chinner" Subject: Re: writeout stalls in current -git Cc: "Peter Zijlstra" , "Fengguang Wu" , "Maxim Levitsky" , linux-kernel@vger.kernel.org, "Andrew Morton" , 
linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com In-Reply-To: <20071102204258.GR995458@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <200710221505.35397.maximlevitsky@gmail.com> <393060478.03650@ustc.edu.cn> <64bb37e0710310822r5ca6b793p8fd97db2f72a8655@mail.gmail.com> <393903856.06449@ustc.edu.cn> <64bb37e0711011120i63cdfe3ci18995d57b6649a8@mail.gmail.com> <64bb37e0711011200n228e708eg255640388f83da22@mail.gmail.com> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com> X-Virus-Scanned: ClamAV 0.91.2/4662/Fri Nov 2 10:28:34 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13539 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: just.for.lkml@googlemail.com Precedence: bulk X-list: xfs On 11/2/07, David Chinner wrote: > On Fri, Nov 02, 2007 at 08:22:10PM +0100, Torsten Kaiser wrote: > > [ 630.000000] SysRq : Emergency Sync > > [ 630.120000] Emergency Sync complete > > [ 632.850000] SysRq : Show Blocked State > > [ 632.850000] task PC stack pid father > > [ 632.850000] pdflush D ffff81000f091788 0 285 2 > > [ 632.850000] ffff810005d4da80 0000000000000046 0000000000000800 > > 0000007000000001 > > [ 632.850000] ffff81000fd52400 ffffffff8022d61c ffffffff80819b00 > > ffffffff80819b00 > > [ 632.850000] ffffffff80815f40 ffffffff80819b00 ffff810100316f98 > > 0000000000000000 > > [ 632.850000] Call Trace: > > [ 632.850000] [] task_rq_lock+0x4c/0x90 > > [ 632.850000] [] __wake_up_common+0x5a/0x90 > > [ 632.850000] [] __down+0xa7/0x11e > > [ 632.850000] [] default_wake_function+0x0/0x10 > > [ 632.850000] [] __down_failed+0x35/0x3a > > [ 632.850000] [] xfs_buf_lock+0x3e/0x40 > > [ 632.850000] [] _xfs_buf_find+0x13e/0x240 > > [ 632.850000] [] xfs_buf_get_flags+0x6f/0x190 > > [ 632.850000] [] xfs_buf_read_flags+0x12/0xa0 > > [ 
632.850000] [] xfs_trans_read_buf+0x64/0x340
> > [ 632.850000] [] xfs_itobp+0x81/0x1e0
> > [ 632.850000] [] write_cache_pages+0x123/0x330
> > [ 632.850000] [] xfs_iflush+0xfe/0x520
>
> That's stalled waiting on the inode cluster buffer lock. That implies
> that the inode cluster is already being written out and the inode has
> been redirtied during writeout.
>
> Does the kernel you are testing have the "flush inodes in ascending
> inode number order" patches applied? If so, can you remove that
> patch and see if the problem goes away?

It's 2.6.23-mm1 with only some small fixes.

In its broken-out directory I see:
git-xfs.patch

and

writeback-fix-periodic-superblock-dirty-inode-flushing.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-2.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-3.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-4.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-5.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-6.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists-7.patch
writeback-fix-time-ordering-of-the-per-superblock-dirty-inode-lists.patch
writeback-fix-time-ordering-of-the-per-superblock-inode-lists-8.patch
writeback-introduce-writeback_controlmore_io-to-indicate-more-io.patch

I don't know if the patch you mentioned is part of that version of the
mm-patchset.
Torsten From owner-xfs@oss.sgi.com Sun Nov 4 02:18:28 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 02:18:31 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA4AIP0a001629 for ; Sun, 4 Nov 2007 02:18:27 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1Ioc6b-0003m6-Pn; Sun, 04 Nov 2007 09:49:21 +0000 Date: Sun, 4 Nov 2007 09:49:21 +0000 From: Christoph Hellwig To: David Chinner Cc: Christoph Hellwig , xfs@oss.sgi.com, xfs-dev@sgi.com Subject: Re: [PATCH] show all mount args in /proc/mounts Message-ID: <20071104094921.GA14493@infradead.org> References: <20071029233543.GQ995458@sgi.com> <20071030100617.GB23489@infradead.org> <20071102014855.GH995458@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20071102014855.GH995458@sgi.com> User-Agent: Mutt/1.4.2.3i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-Virus-Scanned: ClamAV 0.91.2/4671/Sat Nov 3 18:21:59 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13540 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: xfs On Fri, Nov 02, 2007 at 12:48:55PM +1100, David Chinner wrote: > On Tue, Oct 30, 2007 at 10:06:17AM +0000, Christoph Hellwig wrote: > > On Tue, Oct 30, 2007 at 10:35:43AM +1100, David Chinner wrote: > > > There are several mount options that don't show up in /proc/mounts. > > > Add them in and clean up the showargs code at the same time. > > > > Looks good. 
Care to submit a patch on top of this to move all the mount
> > option handling to xfs_super.c as it's entirely linux-specific in this
> > form?
>
> Sure. This what you mean?

Yes, exactly. Looks good to me.

From owner-xfs@oss.sgi.com Sun Nov 4 03:19:19 2007
Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 03:19:29 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_50 autolearn=ham version=3.3.0-r574664
Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.178]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA4BJF5q008590 for ; Sun, 4 Nov 2007 03:19:19 -0800
Received: by py-out-1112.google.com with SMTP id u77so2534901pyb for ; Sun, 04 Nov 2007 03:19:19 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; bh=NZykAKLjtcuHzPXcDO65poQqxlxE3dMCg+zJZrvejS0=; b=rHYLZ8SaYwJqQoncetNIx4YSijxxofphfL00YRNTU6i8LdS+9uyBgii42Z3bKKVnU+HjrWVmlJNlh/y9QlQaiGMWg7DPQKBZUoPDvGN1i4z9m3uZnMkl9VyJ+4xg0Lyw6Ytv6geFqh101j+WUNj4by2zyop9UDyBYJIfeVP/SGM=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=BpWxF4kZt/GKotSn5NDhf1si6zuKIbTtlVRxenmCytnfcbkc6qTnpQsbJ441z90lqX/O5kgv671cKUG5JKECJOPEA0BEmGz/eOPnJKGtGkSvb4CGwjrkNlQ0fPSFeDPRJ+I72+0JaTCS1PtW6lvl3GjsJFBpF2P56ExPJLD6m00=
Received: by 10.65.211.16 with SMTP id n16mr10422027qbq.1194175159261; Sun, 04 Nov 2007 03:19:19 -0800 (PST)
Received: by 10.65.112.13 with HTTP; Sun, 4 Nov 2007 03:19:19 -0800 (PST)
Message-ID: <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com>
Date: Sun, 4 Nov 2007 12:19:19 +0100
From: "Torsten Kaiser"
To: "David Chinner"
Subject: Re: writeout stalls in current -git
Cc: "Peter Zijlstra" , "Fengguang Wu" , "Maxim Levitsky" , linux-kernel@vger.kernel.org, "Andrew Morton" , linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
In-Reply-To: <20071102204258.GR995458@sgi.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_16409_14543774.1194175159246"
References: <200710221505.35397.maximlevitsky@gmail.com> <393060478.03650@ustc.edu.cn> <64bb37e0710310822r5ca6b793p8fd97db2f72a8655@mail.gmail.com> <393903856.06449@ustc.edu.cn> <64bb37e0711011120i63cdfe3ci18995d57b6649a8@mail.gmail.com> <64bb37e0711011200n228e708eg255640388f83da22@mail.gmail.com> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com>
X-Virus-Scanned: ClamAV 0.91.2/4671/Sat Nov 3 18:21:59 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13541
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: just.for.lkml@googlemail.com
Precedence: bulk
X-list: xfs

------=_Part_16409_14543774.1194175159246
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

On 11/2/07, David Chinner wrote:
> That's stalled waiting on the inode cluster buffer lock. That implies
> that the inode cluster is already being written out and the inode has
> been redirtied during writeout.
>
> Does the kernel you are testing have the "flush inodes in ascending
> inode number order" patches applied? If so, can you remove that
> patch and see if the problem goes away?

I can now confirm that I also see this with the current mainline git
version. I used 2.6.24-rc1-git-b4f555081fdd27d13e6ff39d455d5aefae9d2c0c
plus the fix for the sg changes in ieee1394.

Bisecting would be troublesome, as the sg changes prevent mainline from
booting with my normal config / kill my network.
treogen ~ # vmstat 10
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
-> starting emerge
 1  0      0 3627072    332 157724    0    0    97    13   41  189  2  2 94  2
 0  0      0 3607240    332 163736    0    0   599    10  332  951  2  1 93  4
 0  0      0 3601920    332 167592    0    0   380     2  218  870  1  1 98  0
 0  0      0 3596356    332 171648    0    0   404    21  182  818  0  0 99  0
 0  0      0 3579328    332 180436    0    0   878    12  147  912  1  1 97  2
 0  0      0 3575376    332 182776    0    0   236     4  244  953  1  1 95  3
 2  1      0 3571792    332 185084    0    0   232     7  256 1003  2  1 95  2
 0  0      0 3564844    332 187364    0    0   228   605  246 1167  2  1 93  4
 0  0      0 3562128    332 189784    0    0   230     4  527 1238  2  1 93  4
 0  1      0 3558764    332 191964    0    0   216    24  438 1059  1  1 93  6
 0  0      0 3555120    332 193868    0    0   199    36  406  959  0  0 92  8
 0  0      0 3552008    332 195928    0    0   197    11  458 1023  1  1 90  8
 0  0      0 3548728    332 197660    0    0   183     7  496 1086  1  1 90  8
 0  0      0 3545560    332 199372    0    0   170     8  483 1017  1  1 90  9
 0  1      0 3542124    332 201256    0    0   190     1  544 1137  1  1 88 10
 1  0      0 3536924    332 203296    0    0   195     7  637 1209  2  1 89  8
 1  1      0 3485096    332 249184    0    0   101    16 10372 4537 13  3 76  8
 2  0      0 3442004    332 279728    0    0  1086    40  219 1349  7  3 87  4
-> emerge is done reading its package database
 1  0      0 3254796    332 448636    0    0     0    27  128 8360 24  6 70  0
 2  0      0 3143304    332 554016    0    0    47    33  213 4480 16 11 72  1
-> kernel unpacked
 1  0      0 3125700    332 560416    0    0     1    20  122 1675 24  1 75  0
 1  0      0 3117356    332 567968    0    0     0   674  157 2975 24  2 73  1
 2  0      0 3111636    332 573736    0    0     0  1143  151 1924 23  1 75  1
 2  0      0 3102836    332 581332    0    0     0   890  153 1330 24  1 75  0
 1  0      0 3097236    332 587360    0    0     0   656  194 1593 24  1 74  0
 1  0      0 3086824    332 595480    0    0     0   812  235 2657 25  1 74  0
-> tar.bz2 created, installing starts now
 0  0      0 3091612    332 601024    0    0    82   708  499 2397 17  4 78  1
 0  0      0 3086088    332 602180    0    0    69  2459  769 2237  3  4 88  6
 0  0      0 3085916    332 602236    0    0     2  1752  693  949  1  2 96  1
 0  0      0 3084544    332 603564    0    0    66  4057 1176 2850  3  6 91  0
 0  0      0 3078780    332 605572    0    0    98  3194 1169 3288  5  6 89  0
 0  0      0 3077940    332 605924    0    0    17  1139  823 1547  1  2 97  0
 0  0      0 3078268    332 605924    0    0     0   888  807 1329  0  1 99  0
-> first short stall
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 3077040    332 605924    0    0     0  1950  785 1495  0  2 89  8
 0  0      0 3076588    332 605896    0    0     2  3807  925 2046  1  4 95  0
 0  0      0 3076900    332 606052    0    0    11  2564  768 1471  1  3 95  1
 0  0      0 3071584    332 607928    0    0    87  2499 1108 3433  4  6 90  0
-> second longer stall
(emerge was not able to complete a single filemove until the 'resume' line)
 0  0      0 3071592    332 607928    0    0     0   693  692 1289  0  0 99  0
 0  0      0 3072584    332 607928    0    0     0   792  731 1507  0  1 99  0
 0  0      0 3072840    332 607928    0    0     0   806  707 1521  0  1 99  0
 0  0      0 3072724    332 607928    0    0     0   782  695 1372  0  0 99  0
 0  0      0 3072972    332 607928    0    0     0   677  612 1301  0  0 99  0
 0  0      0 3072772    332 607928    0    0     0   738  681 1352  1  1 99  0
 0  0      0 3073020    332 607928    0    0     0   785  708 1328  0  1 99  0
 0  0      0 3072896    332 607928    0    0     0   833  722 1383  0  0 99  0
-> emerge resumed
 0  0      0 3069476    332 607972    0    0     2  4885  812 2062  1  4 90  5
 1  0      0 3069648    332 608068    0    0     4  4658  833 2158  1  4 93  2
 0  0      0 3064972    332 610364    0    0   106  2494 1095 3620  5  7 88  0
 0  0      0 3057536    332 612444    0    0    86  2023 1012 3440  4  6 90  0
 1  0      0 3054572    332 612368    0    0   102  1526 1024 2277  6  5 87  2
-> emerge finished, but still >100Mb of dirty data according to /proc/meminfo
 0  0      0 3048548    332 615764    0    0   337   659  796 1000  3  1 96  0
 0  0      0 3092100    332 615860    0    0    15   616  606 1040  1  0 99  0
 0  0      0 3092148    332 615860    0    0     0   641  622 1085  0  0 99  0
 0  0      0 3092528    332 615860    0    0     0   766  654 1055  1  1 99  0
-> slow writeout until here, might be fixed with Peter's patch to
scale the background threshold
 2  0      0 3090828    332 615860    0    0     0  1804  707 1215  0  2 98  0
 0  0      0 3091056    332 615864    0    0     0  3877  831 2047  1  4 94  1
 3  0      0 3090780    332 615864    0    0     0  2048  784 1154  1  2 97  1
 0  0      0 3091096    332 615864    0    0     0  2690  751 1538  0  3 96  1
 0  1      0 3091056    332 615864    0    0     0  2018  748  866  0  2 95  2
 2  0      0 3092960    332 615864    0    0     0  2076  719 1118  0  2 97  0
-> writeout "done", /proc/meminfo showed 0kb of dirty data remaining
 0  0      0 3093072    332 615864    0    0     0   645  646 1104  0  0 99  0
 0  0      0 3093532    332 615864    0    0     0   726  658 1223  0  1 99  0
 0  0      0 3093540    332 615864    0    0     0   801  699 1314  0  1 99  0
 0  0      0 3093580    332 615864    0    0     0   783  738 1350  0  1 99  0
 0  0      0 3093284    332 615920    0    0     6   746  655 1381  1  1 98  0
 0  0      0 3092872    332 615920    0    0     0   862  703 1391  1  1 98  0
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 3093224    332 615920    0    0     0   799  676 1394  0  0 99  0
 0  0      0 3093304    332 615920    0    0     0   835  672 1514  1  1 98  0
 0  0      0 3093476    332 615920    0    0     0   784  641 1404  1  1 98  0
 0  0      0 3093264    332 615920    0    0     0   722  626 1483  1  1 99  0
 0  0      0 3093476    332 615920    0    0     0     7  328  350  0  0 99  0
 0  0      0 3093628    332 615920    0    0     0    11  332  407  0  0 99  0
-> disks finally go idle

Torsten

.config for 2.6.24-rc1+git attached

------=_Part_16409_14543774.1194175159246
Content-Type: application/x-gzip; name=config.gz
Content-Transfer-Encoding: base64
X-Attachment-Id: f_f8lgztwk
Content-Disposition: attachment; filename=config.gz

H4sIABOYK0cCA4w8XXMjp7Lv+RWqza26SdU5WUu2FTtVfmAYRiKaGVhg9LEv
U45XSXxjW3tkOSf597dhPgQMjLMPuzvdDTRNf9GAvv3m2wl6Ox2e70+PD/dP
T39Pftu/7I/3p/2XyfP9H/vJw+Hl18fffpp8Obz872my//J4ghb548vbX5M/
9seX/dPkz/3x9fHw8tNk9sP8h9nVv48PUyDJjo+T8vDnZDKbzGY/XV7/NL2e
zC4ufvzm228wKzO6qLc383p+dfd39z2/Sqg6fwL6/LEgJREU14oWJAyt13In
McrzIRoXTNYVT5Gy2uKc4ZVklcCk3iCFlylbBJpqKrImpZKjyDoRDKUYSWsC
n1lJ6rRAl7MzTLdICa9lxTkTFrFUCK+UQMDNEEcKxJdMAConhBNh8VIU1XDE
M0RsoHHPsOS01BwMp7LcELpYqsAcUU4TAZKrU5KjnbM8IFi+xcuA2KhELh89
gsFUzmAk8LIu0K5eojWpOa6zFHtYzniVw/iyLllKaqd5WlCLukqpMm2GwyaV
ZvLbid3xEsma5mwxq6vL2eTxdfJyOE1e96c42fzKJut4IFm3thRW/8PHp8df
Pj4fvrw97V8//k9VIlBNQXKCJPn4Q2NMH74BI/h2sjCG9qQ7e/t6NguyhRUG
hS4Vyl3FqVdElMQC0hJmTMo1cKqHL8B8LmdWC5SvQVkoK+8+fDhPy0aA1BQL
zEpubDmDZa0pxwOA/hcriyHOJN3WxaeKVJapJTKtuWCYSFkjjFUcU68v7XVS
SK7AMJQMLk8liajLMM4oQ2BedNX8x5LhquMd+LDlDdxVWS2XNFN305tz13gh
WMXDw2aIitrga4mXJA0SgdAyCXrDBcGg2GmATeHaWpKvgH5t1luk7voLVEBv
1WIKPsL0kYZ7Aq1YTMuPMH2kCzz5gCym9W2mdfSBmtbx4iM1faCf1ssPtCld +ftJSPQgGXfprbEYhNK7zv0EAQaeChDLCDGn1/DMwK5vAMKbzY1uctx+5fgm R3KTY3WTY32TI7j9MsHS07tXhtjuy0NF0q7x1izh1lNry7farYs662t5Gfzp bkjJIOI5yBL2/iMOPFsI7Oo+Qh8E7DMno6g5WD5QUlw4XN5eLl/v/np4/LfK HH/NOCAO9BAOa1ugHZua3ihTe6nZ1cTLEm5k5TFfvF0tBGDETTGj56AtE0dP CJ/l2twRZDAUW07z2dx7IagopEjcVIVTEyj1JKbIvseQ0pt5n6XKHCD7RjFN /M327nhgOeZCtpfOJiBjt2a8DjBnmc1gKHAppzozTVCwBu0lWy0wbtubInWe HVqBDRddy/HZMwgGWF5DzGY0ZzUpvYbwvUJq5kGKAyrohOxbefLINdURD8ow txwFFUBnfKCloo8wrm+0BrRuCILt4GILfTDX+kO1+SS+snfQiH+934mmh5JI H6LNqs8T6qpvHGH+JzEjXuNWHL3B+16q3dzaKwg6Bra5YqXwde0GDLP6STzJ ZjUOmKztuJj34AFvX12Nj1OKovKTPO96BEgEvoquES+jQfYtccxAjrIDuGVs i+o0NwqAr2uZz0Wunx5V7ssfS8QZuPVY+7HL48+35x+/NCM/PTavMwmyQ989 0Aa1w0wxCDuANmKl58RMoXRlAJWFweWJV9Gzgx1eXTi3m3PVqERQmvwug4Fb sSwVjWKa1fc29Wyk+IZcbZX9G4J/IT3X6kDeFIeM1HvcTCGxeGtnfkX0s1vH e0WkqEQ7FzNF2YR4zjbZ5IFlWxQTIs2XDlo8qZHtUeAihnEyKS/IcRBOyKfa ReW7JlhPybttEKa0nTYYZ5sJDd7Wblvt6oKiOU1oZz3gylAh/MkndeZ69rOe tlUZ/aZfF/JEsf30fU/VHH2iw7JgONHrudSGLsdNDQllvk1GHnPRhmom740Y j53U6VfmGE2rbrJppYc9+uIY/6hsN2TaOobxlHcI1T+ZFCTbI1zA3/8fljh0 NjbCIgysdHx8PFEWU8F9hLXugrk3NMTD1c8Ziz6PcIyS1iczqSixqFKhKL+0 JDMPefTKyTMkyBV0FoQrstOcQQdIhKDEn7OhGaoKzxBgHYYiZIySb5zNUa6D Qvd/FOhiEWCNjhqQYFGM4AU2iEKCXUGLy7GJxXsjX5CAJO7ki1XYLdgX+UYp z2DkzQKOwaAqC/kAZYgQ2p4Cx2DoNTEAiI2JuMO8AAA= ------=_Part_16409_14543774.1194175159246-- From owner-xfs@oss.sgi.com Sun Nov 4 04:39:10 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 04:39:15 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664 Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA4Cd9Vg021712 for ; Sun, 4 Nov 2007 04:39:10 -0800 Received: by lucidpixels.com (Postfix, from userid 1001) id EAF2B1C000262; Sun, 4 Nov 2007 07:39:13 -0500 
(EST)
Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id D363B4019581; Sun, 4 Nov 2007 07:39:13 -0500 (EST)
Date: Sun, 4 Nov 2007 07:39:13 -0500 (EST)
From: Justin Piszcz
X-X-Sender: jpiszcz@p34.internal.lan
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
cc: xfs@oss.sgi.com
Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state (md3_raid5 stuck in endless loop?)
In-Reply-To:
Message-ID:
References:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Virus-Scanned: ClamAV 0.91.2/4671/Sat Nov 3 18:21:59 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13542
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: jpiszcz@lucidpixels.com
Precedence: bulk
X-list: xfs

Time to reboot, before reboot:

top - 07:30:23 up 13 days, 13:33, 10 users, load average: 16.00, 15.99, 14.96
Tasks: 221 total, 7 running, 209 sleeping, 0 stopped, 5 zombie
Cpu(s): 0.0%us, 25.5%sy, 0.0%ni, 74.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8039432k total, 1744356k used, 6295076k free, 164k buffers
Swap: 16787768k total, 160k used, 16787608k free, 616960k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
  688 root      15  -5     0    0    0 R  100  0.0 121:21.43 md3_raid5
  273 root      20   0     0    0    0 D    0  0.0  14:40.68 pdflush
  274 root      20   0     0    0    0 D    0  0.0  13:00.93 pdflush

# cat /proc/fs/xfs/stat
extent_alloc 301974 256068291 310513 240764389
abt 1900173 15346352 738568 731314
blk_map 276979807 235589732 864002 211245834 591619 513439614 0
bmbt 50717 367726 14177 11846
dir 3818065 361561 359723 975628
trans 48452 2648064 570998
ig 6034530 2074424 43153 3960106 0 3869384 460831
log 282781 10454333 3028 399803 173488
push_ail 3267594 0 1620 2611 730365 0 4476 0 10269 0
xstrat 291940 0
rw 61423078 103732605
attr 0 0 0 0
icluster 312958 97323 419837
vnodes 90721 4019823 0 1926744 3929102 3929102 3929102 0
buf 14678900 11027087 3651843 25743 760449 0 0 15775888 280425
xpc 966925905920 1047628533165 1162276949815
debug 0

# cat meminfo
MemTotal:      8039432 kB
MemFree:       6287000 kB
Buffers:           164 kB
Cached:         617072 kB
SwapCached:          0 kB
Active:         178404 kB
Inactive:       589880 kB
SwapTotal:    16787768 kB
SwapFree:     16787608 kB
Dirty:          494280 kB
Writeback:       86004 kB
AnonPages:      151240 kB
Mapped:          17092 kB
Slab:           259696 kB
SReclaimable:   170876 kB
SUnreclaim:      88820 kB
PageTables:      11448 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  20807484 kB
Committed_AS:   353536 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     15468 kB
VmallocChunk: 34359722699 kB

# echo 3 > /proc/sys/vm/drop_caches
# cat /proc/meminfo
MemTotal:      8039432 kB
MemFree:       6418352 kB
Buffers:            32 kB
Cached:         597908 kB
SwapCached:          0 kB
Active:         172028 kB
Inactive:       579808 kB
SwapTotal:    16787768 kB
SwapFree:     16787608 kB
Dirty:          494312 kB
Writeback:       86004 kB
AnonPages:      154104 kB
Mapped:          17416 kB
Slab:           144072 kB
SReclaimable:    53100 kB
SUnreclaim:      90972 kB
PageTables:      11832 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  20807484 kB
Committed_AS:   360748 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     15468 kB
VmallocChunk: 34359722699 kB

Nothing is actually happening on the device itself however.

Device:    rrqm/s   wrqm/s      r/s      w/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await    svctm    %util
sda          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdb          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdc          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdd          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sde          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdf          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdg          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdh          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdi          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdj          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdk          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
sdl          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
md0          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
md3          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
md2          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00
md1          0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00     0.00

# vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 6  0    160 6420244     32 600092    0    0   221   227    5    1  1  1 98  0
 6  0    160 6420228     32 600120    0    0     0     0 1015  142  0 25 75  0
 6  0    160 6420228     32 600120    0    0     0     0 1005  127  0 25 75  0
 6  0    160 6420228     32 600120    0    0     0    41 1022  151  0 26 74  0
 6  0    160 6420228     32 600120    0    0     0     0 1011  131  0 25 75  0
 6  0    160 6420228     32 600120    0    0     0     0 1013  124  0 25 75  0
 6  0    160 6420228     32 600120    0    0     0     0 1042  129  0 25 75  0

# uname -mr
2.6.23.1 x86_64

# cat /proc/vmstat
nr_free_pages 1598911
nr_inactive 146381
nr_active 42724
nr_anon_pages 37181
nr_mapped 4097
nr_file_pages 151975
nr_dirty 123572
nr_writeback 21501
nr_slab_reclaimable 16152
nr_slab_unreclaimable 24284
nr_page_table_pages 2823
nr_unstable 0
nr_bounce 0
nr_vmscan_write 20712
pgpgin 1015377151
pgpgout 1043634578
pswpin 0
pswpout 40
pgalloc_dma 4
pgalloc_dma32 319052932
pgalloc_normal 621945603
pgalloc_movable 0
pgfree 942598566
pgactivate 31123819
pgdeactivate 18438560
pgfault 360236898
pgmajfault 16158
pgrefill_dma 0
pgrefill_dma32 11683348
pgrefill_normal 18799274
pgrefill_movable 0
pgsteal_dma 0
pgsteal_dma32 176658679
pgsteal_normal 233628315
pgsteal_movable 0
pgscan_kswapd_dma 0
pgscan_kswapd_dma32 164181746
pgscan_kswapd_normal 217338820
pgscan_kswapd_movable 0
pgscan_direct_dma 0
pgscan_direct_dma32 13074075
pgscan_direct_normal 17342937
pgscan_direct_movable 0
pginodesteal 332816
slabs_scanned 12368000
kswapd_steal 380216091
kswapd_inodesteal 9858653
pageoutrun 1167045
allocstall 68454
pgrotated 40

# cat /proc/zoneinfo
Node 0, zone DMA
  pages free 2601
        min 3
        low 3
        high 4
        scanned 0 (a: 11 i: 12)
        spanned 4096
        present 2486
    nr_free_pages 2601
    nr_inactive 0
    nr_active 0
    nr_anon_pages 0
    nr_mapped 1
    nr_file_pages 0
    nr_dirty 0
    nr_writeback 0
    nr_slab_reclaimable 0
    nr_slab_unreclaimable 4
    nr_page_table_pages 0
    nr_unstable 0
    nr_bounce 0
    nr_vmscan_write 0
        protection: (0, 3246, 7917, 7917)
  pagesets
    cpu: 0 pcp: 0 count: 0 high: 0 batch: 1
    cpu: 0 pcp: 1 count: 0 high: 0 batch: 1
  vm stats threshold: 6
    cpu: 1 pcp: 0 count: 0 high: 0 batch: 1
    cpu: 1 pcp: 1 count: 0 high: 0 batch: 1
  vm stats threshold: 6
    cpu: 2 pcp: 0 count: 0 high: 0 batch: 1
    cpu: 2 pcp: 1 count: 0 high: 0 batch: 1
  vm stats threshold: 6
    cpu: 3 pcp: 0 count: 0 high: 0 batch: 1
    cpu: 3 pcp: 1 count: 0 high: 0 batch: 1
  vm stats threshold: 6
  all_unreclaimable: 1
  prev_priority: 12
  start_pfn: 0
Node 0, zone DMA32
  pages free 699197
        min 1166
        low 1457
        high 1749
        scanned 0 (a: 14 i: 0)
        spanned 1044480
        present 831104
    nr_free_pages 699197
    nr_inactive 38507
    nr_active 11855
    nr_anon_pages 11228
    nr_mapped 612
    nr_file_pages 39127
    nr_dirty 38462
    nr_writeback 34
    nr_slab_reclaimable 8164
    nr_slab_unreclaimable 4747
    nr_page_table_pages 756
    nr_unstable 0
    nr_bounce 0
    nr_vmscan_write 6132
        protection: (0, 0, 4671, 4671)
  pagesets
    cpu: 0 pcp: 0 count: 183 high: 186 batch: 31
    cpu: 0 pcp: 1 count: 52 high: 62 batch: 15
  vm stats threshold: 36
    cpu: 1 pcp: 0 count: 23 high: 186 batch: 31
    cpu: 1 pcp: 1 count: 14 high: 62 batch: 15
  vm stats threshold: 36
    cpu: 2 pcp: 0 count: 173 high: 186 batch: 31
    cpu: 2 pcp: 1 count: 61 high: 62 batch: 15
  vm stats threshold: 36
    cpu: 3 pcp: 0 count: 95 high: 186 batch: 31
    cpu: 3 pcp: 1 count: 57 high: 62 batch: 15
  vm stats threshold: 36
  all_unreclaimable: 0
  prev_priority: 12
  start_pfn: 4096
Node 0, zone Normal
  pages free 897091
        min 1678
        low 2097
        high 2517
        scanned 0 (a: 29 i: 0)
        spanned 1212416
        present 1195840
    nr_free_pages 897091
    nr_inactive 107874
    nr_active 30878
    nr_anon_pages 25956
    nr_mapped 3484
    nr_file_pages 112857
    nr_dirty 85110
    nr_writeback 21467
    nr_slab_reclaimable 7988
    nr_slab_unreclaimable 19546
    nr_page_table_pages 2067
    nr_unstable 0
    nr_bounce 0
    nr_vmscan_write 14580
        protection: (0, 0, 0, 0)
  pagesets
    cpu: 0 pcp: 0 count: 124 high: 186 batch: 31
    cpu: 0 pcp: 1 count: 1 high: 62 batch: 15
  vm stats threshold: 42
    cpu: 1 pcp: 0 count: 68 high: 186 batch: 31
    cpu: 1 pcp: 1 count: 9 high: 62 batch: 15
  vm stats threshold: 42
    cpu: 2 pcp: 0 count: 79 high: 186 batch: 31
    cpu: 2 pcp: 1 count: 10 high: 62 batch: 15
  vm stats threshold: 42
    cpu: 3 pcp: 0 count: 47 high: 186 batch: 31
    cpu: 3 pcp: 1 count: 60 high: 62 batch: 15
  vm stats threshold: 42
  all_unreclaimable: 0
  prev_priority: 12
  start_pfn: 1048576

On Sun, 4 Nov 2007, Justin Piszcz wrote:

> # ps auxww | grep D
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root       273  0.0  0.0      0     0 ?        D    Oct21  14:40 [pdflush]
> root       274  0.0  0.0      0     0 ?        D    Oct21  13:00 [pdflush]
>
> After several days/weeks, this is the second time this has happened, while
> doing regular file I/O (decompressing a file), everything on the device went
> into D-state.
>
> # mdadm -D /dev/md3
> /dev/md3:
>         Version : 00.90.03
>   Creation Time : Wed Aug 22 10:38:53 2007
>      Raid Level : raid5
>      Array Size : 1318680576 (1257.59 GiB 1350.33 GB)
>   Used Dev Size : 146520064 (139.73 GiB 150.04 GB)
>    Raid Devices : 10
>   Total Devices : 10
> Preferred Minor : 3
>     Persistence : Superblock is persistent
>
>     Update Time : Sun Nov 4 06:38:29 2007
>           State : active
>  Active Devices : 10
> Working Devices : 10
>  Failed Devices : 0
>   Spare Devices : 0
>
>          Layout : left-symmetric
>      Chunk Size : 1024K
>
>            UUID : e37a12d1:1b0b989a:083fb634:68e9eb49
>          Events : 0.4309
>
>     Number   Major   Minor   RaidDevice State
>        0       8       33        0      active sync   /dev/sdc1
>        1       8       49        1      active sync   /dev/sdd1
>        2       8       65        2      active sync   /dev/sde1
>        3       8       81        3      active sync   /dev/sdf1
>        4       8       97        4      active sync   /dev/sdg1
>        5       8      113        5      active sync   /dev/sdh1
>        6       8      129        6      active sync   /dev/sdi1
>        7       8      145        7      active sync   /dev/sdj1
>        8       8      161        8      active sync   /dev/sdk1
>        9       8      177        9      active sync   /dev/sdl1
>
> If I wanted to find out what is causing this, what type of debugging would I
> have to enable to track it down? Any attempt to read/write files on the
> devices fails (also going into d-state). Is there any useful information I
> can get currently before rebooting the machine?
>
> # pwd
> /sys/block/md3/md
> # ls
> array_state      dev-sdj1/         rd2@              stripe_cache_active
> bitmap_set_bits  dev-sdk1/         rd3@              stripe_cache_size
> chunk_size       dev-sdl1/         rd4@              suspend_hi
> component_size   layout            rd5@              suspend_lo
> dev-sdc1/        level             rd6@              sync_action
> dev-sdd1/        metadata_version  rd7@              sync_completed
> dev-sde1/        mismatch_cnt      rd8@              sync_speed
> dev-sdf1/        new_dev           rd9@              sync_speed_max
> dev-sdg1/        raid_disks        reshape_position  sync_speed_min
> dev-sdh1/        rd0@              resync_start
> dev-sdi1/        rd1@              safe_mode_delay
> # cat array_state
> active-idle
> # cat mismatch_cnt
> 0
> # cat stripe_cache_active
> 1
> # cat stripe_cache_size
> 16384
> # cat sync_action
> idle
> # cat /proc/mdstat
> Personalities : [raid1] [raid6] [raid5] [raid4]
> md1 : active raid1 sdb2[1] sda2[0]
>       136448 blocks [2/2] [UU]
>
> md2 : active raid1 sdb3[1] sda3[0]
>       129596288 blocks [2/2] [UU]
>
> md3 : active raid5 sdl1[9] sdk1[8] sdj1[7] sdi1[6] sdh1[5] sdg1[4] sdf1[3]
> sde1[2] sdd1[1] sdc1[0]
>       1318680576 blocks level 5, 1024k chunk, algorithm 2 [10/10]
> [UUUUUUUUUU]
>
> md0 : active raid1 sdb1[1] sda1[0]
>       16787776 blocks [2/2] [UU]
>
> unused devices:
> #
>
> Justin.
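
On the question above of what can still be collected before a reboot: `ps auxww | grep D` matches every line that contains a capital D anywhere, not only tasks whose state is D. The following is a minimal sketch, not something taken from this thread, that lists only tasks in uninterruptible sleep along with the kernel symbol each one is waiting in. It assumes Python 3 and a Linux /proc that provides the per-process `stat` and `wchan` files; the function names `parse_stat` and `uninterruptible_tasks` are made up for illustration.

```python
import os


def parse_stat(text):
    """Split one /proc/<pid>/stat record into (pid, comm, state).

    The comm field is wrapped in parentheses and may itself contain
    spaces or ')' characters, so we locate the first '(' and the last ')'.
    """
    start = text.index("(")
    end = text.rindex(")")
    pid = int(text[:start].strip())
    comm = text[start + 1:end]
    state = text[end + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc="/proc"):
    """Yield (pid, comm, wchan) for every process currently in state D."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        base = os.path.join(proc, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were scanning
        if state != "D":
            continue
        try:
            # wchan holds the kernel symbol the task sleeps in, when the
            # kernel exposes it (assumption: symbol info is available).
            with open(os.path.join(base, "wchan")) as f:
                wchan = f.read().strip() or "?"
        except OSError:
            wchan = "?"
        yield pid, comm, wchan


if __name__ == "__main__":
    for pid, comm, wchan in uninterruptible_tasks():
        print(f"{pid:>7} {comm:<16} {wchan}")
```

Run as root so that every process directory is readable. This only scans the top-level entries under /proc and reports a single wait symbol per task, so it is a quick filter rather than a replacement for a full stack dump of the stuck tasks.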
>

From owner-xfs@oss.sgi.com Sun Nov 4 04:51:56 2007
Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 04:52:05 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664
Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA4CptTc023254 for ; Sun, 4 Nov 2007 04:51:56 -0800
Received: by lucidpixels.com (Postfix, from userid 1001) id 123691C000263; Sun, 4 Nov 2007 07:52:00 -0500 (EST)
Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id 0C91E4019581; Sun, 4 Nov 2007 07:52:00 -0500 (EST)
Date: Sun, 4 Nov 2007 07:52:00 -0500 (EST)
From: Justin Piszcz
X-X-Sender: jpiszcz@p34.internal.lan
To: Michael Tokarev
cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state
In-Reply-To: <472DBF8C.2060508@msgid.tls.msk.ru>
Message-ID:
References: <472DBF8C.2060508@msgid.tls.msk.ru>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Virus-Scanned: ClamAV 0.91.2/4672/Sun Nov 4 03:38:42 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13543
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: jpiszcz@lucidpixels.com
Precedence: bulk
X-list: xfs

On Sun, 4 Nov 2007, Michael Tokarev wrote:

> Justin Piszcz wrote:
>> # ps auxww | grep D
>> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>> root       273  0.0  0.0      0     0 ?        D    Oct21  14:40 [pdflush]
>> root       274  0.0  0.0      0     0 ?        D    Oct21  13:00 [pdflush]
>>
>> After several days/weeks, this is the second time this has happened,
>> while doing regular file I/O (decompressing a file), everything on the
>> device went into D-state.
>
> The next time you come across something like that, do a SysRq-T dump and
> post that. It shows a stack trace of all processes - and in particular,
> where exactly each task is stuck.
>
> /mjt
>

Yes I got it before I rebooted, ran that and then dmesg > file.

Here it is:

[1172609.665902] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172609.668768] ffffffff80747dc0 ffff81015c3aa918 ffff810091c899b4 ffff810091c899a8
[1172609.668871] Call Trace:
[1172609.674472] [] schedule_timeout+0x5f/0xd0
[1172609.677362] [] process_timeout+0x0/0x10
[1172609.680243] [] do_select+0x468/0x560
[1172609.683105] [] __pollwait+0x0/0x130
[1172609.685969] [] default_wake_function+0x0/0x10
[1172609.688851] [] default_wake_function+0x0/0x10
[1172609.691712] [] default_wake_function+0x0/0x10
[1172609.694534] [] default_wake_function+0x0/0x10
[1172609.697324] [] skb_copy_datagram_iovec+0x1a1/0x260
[1172609.700103] [] _spin_lock_bh+0x9/0x20
[1172609.702856] [] release_sock+0x13/0xb0
[1172609.705598] [] tcp_recvmsg+0x370/0x940
[1172609.708303] [] sock_common_recvmsg+0x30/0x50
[1172609.710999] [] sock_aio_read+0x11b/0x130
[1172609.713694] [] core_sys_select+0x209/0x300
[1172609.716397] [] autoremove_wake_function+0x0/0x30
[1172609.719112] [] default_wake_function+0x0/0x10
[1172609.721824] [] current_fs_time+0x1e/0x30
[1172609.724525] [] tty_ldisc_deref+0x52/0x80
[1172609.727215] [] sys_select+0xd1/0x1c0
[1172609.729880] [] system_call+0x7e/0x83
[1172609.732517]
[1172609.735115] bash S 0000000000000000 0 30959 30958
[1172609.737742] ffff810091c8be88 0000000000000086 0000000000000000 ffff8101ea172e20
[1172609.740404] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172609.743087] ffffffff80747dc0 ffff81015c3ab028 ffff810091c8be54 ffff810091c8be48
[1172609.743190] Call Trace:
[1172609.748404] [] do_wait+0x599/0xc90
[1172609.751071] [] __wake_up+0x43/0x70
[1172609.753714] [] default_wake_function+0x0/0x10
[1172609.756345] [] system_call+0x7e/0x83
[1172609.758967]
[1172609.761522] sr S 0000000000000000 0 30966 30959
[1172609.764123] ffff810122d7de88 0000000000000082 0000000000000000 ffff8101eab3ee20
[1172609.766769] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172609.769442] ffffffff80747dc0 ffff8101ea173028 ffff810122d7de54 ffff810122d7de48
[1172609.769545] Call Trace:
[1172609.774734] [] do_wait+0x599/0xc90
[1172609.777369] [] default_wake_function+0x0/0x10
[1172609.779999] [] system_call+0x7e/0x83
[1172609.782616]
[1172609.785168] screen S 0000000000000000 0 30972 30966
[1172609.787768] ffff810144597f68 0000000000000086 ffff810144597f30 00000000ffffffff
[1172609.790416] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172609.793085] ffffffff80747dc0 ffff8101eab3f028 ffff810144597f34 ffff810144597f28
[1172609.793188] Call Trace:
[1172609.798381] [] alarm_setitimer+0x35/0x70
[1172609.801049] [] sys_pause+0x19/0x30
[1172609.803705] [] system_call+0x7e/0x83
[1172609.806361]
[1172609.808980] sshd S 0000000000000000 0 30973 7582
[1172609.811659] ffff810084003bf8 0000000000000082 0000000000000000 ffffffff80508e74
[1172609.814376] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172609.817104] ffffffff80747dc0 ffff8101ea172208 ffff810084003bc4 ffff810084003bb8
[1172609.817207] Call Trace:
[1172609.822530] [] skb_queue_tail+0x24/0x60
[1172609.825292] [] schedule_timeout+0x95/0xd0
[1172609.828060] [] prepare_to_wait+0x23/0x80
[1172609.830820] [] unix_stream_recvmsg+0x386/0x550
[1172609.833587] [] autoremove_wake_function+0x0/0x30
[1172609.836344] [] link_path_walk+0x80/0xf0
[1172609.839074] [] sock_aio_read+0x11b/0x130
[1172609.841794] [] get_unused_fd_flags+0x79/0x120
[1172609.844488] [] do_sync_read+0xd9/0x120
[1172609.847161] [] autoremove_wake_function+0x0/0x30
[1172609.849848] [] __dentry_open+0x11f/0x1b0
[1172609.852541] [] do_filp_open+0x3a/0x50
[1172609.855235] [] vfs_read+0x157/0x160
[1172609.857922] [] sys_read+0x53/0x90
[1172609.860620] [] system_call+0x7e/0x83
[1172609.863343]
[1172609.866063] sshd S 0000000000000000 0 30975 30973
[1172609.868838] ffff810175c219e8 0000000000000086 ffff810175c219b0 0000000000000002
[1172609.871649] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172609.874490] ffffffff80747dc0 ffff81021b27d738 ffff810175c219b4 ffff810175c219a8
[1172609.874594] Call Trace:
[1172609.880153] [] schedule_timeout+0x5f/0xd0
[1172609.883020] [] process_timeout+0x0/0x10
[1172609.885890] [] do_select+0x468/0x560
[1172609.888742] [] __pollwait+0x0/0x130
[1172609.891581] [] default_wake_function+0x0/0x10
[1172609.894430] [] default_wake_function+0x0/0x10
[1172609.897258] [] default_wake_function+0x0/0x10
[1172609.900060] [] default_wake_function+0x0/0x10
[1172609.902841] [] add_partial+0x19/0x60
[1172609.905606] [] __slab_free+0x15d/0x310
[1172609.908363] [] _spin_lock_bh+0x9/0x20
[1172609.911093] [] release_sock+0x13/0xb0
[1172609.913795] [] tcp_recvmsg+0x370/0x940
[1172609.916486] [] sock_common_recvmsg+0x30/0x50
[1172609.919151] [] sock_aio_read+0x11b/0x130
[1172609.921799] [] core_sys_select+0x209/0x300
[1172609.924455] [] autoremove_wake_function+0x0/0x30
[1172609.927122] [] default_wake_function+0x0/0x10
[1172609.929786] [] current_fs_time+0x1e/0x30
[1172609.932438] [] tty_ldisc_deref+0x52/0x80
[1172609.935083] [] sys_select+0xd1/0x1c0
[1172609.937702] [] system_call+0x7e/0x83
[1172609.940292]
[1172609.942843] bash S 0000000000000000 0 30976 30975
[1172609.945423] ffff8101bf371e88 0000000000000082 0000000000000000 ffff81021e322710
[1172609.948037] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172609.950671] ffffffff80747dc0 ffff8101882bf738 ffff8101bf371e54 ffff8101bf371e48
[1172609.950774] Call Trace:
[1172609.955888] [] do_wait+0x599/0xc90
[1172609.958505] [] __wake_up+0x43/0x70
[1172609.961098] [] vfs_ioctl+0x220/0x2c0
[1172609.963662] [] default_wake_function+0x0/0x10
[1172609.966234] [] sys_ioctl+0x49/0x80
[1172609.968766] [] system_call+0x7e/0x83
[1172609.971279]
[1172609.973759] screen S 0000000000000000 0 30991 30976
[1172609.976308] ffff8101a8329f68 0000000000000086 0000000000000000 00000000ffffffff
[1172609.978892] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172609.981501] ffffffff80747dc0 ffff81021e322918 ffff8101a8329f34 ffff8101a8329f28
[1172609.981605] Call Trace:
[1172609.986634] [] alarm_setitimer+0x35/0x70
[1172609.989220] [] sys_pause+0x19/0x30
[1172609.991766] [] system_call+0x7e/0x83
[1172609.994292]
[1172609.996787] screen D ffff8100a18ff800 0 30992 30991
[1172609.999344] ffff8101a854dd28 0000000000000086 ffff81022854ddb7 ffff8101a854dcd8
[1172610.001953] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.004574] ffffffff80747dc0 ffff810170233028 ffffffff80656bcb ffffffff8021f8bc
[1172610.004677] Call Trace:
[1172610.009752] [] task_rq_lock+0x4c/0x90
[1172610.012366] [] try_to_wake_up+0x68/0x3b0
[1172610.014981] [] wait_for_completion+0x7d/0xc0
[1172610.017594] [] default_wake_function+0x0/0x10
[1172610.020208] [] flush_cpu_workqueue+0x6a/0x90
[1172610.022828] [] wq_barrier_func+0x0/0x10
[1172610.025447] [] flush_workqueue+0x33/0x50
[1172610.028076] [] release_dev+0x44f/0x750
[1172610.030710] [] mntput_no_expire+0x27/0xb0
[1172610.033339] [] tty_release+0x11/0x20
[1172610.035958] [] __fput+0xb1/0x1a0
[1172610.038547] [] filp_close+0x54/0x90
[1172610.041106] [] sys_close+0x96/0x100
[1172610.043652] [] system_call+0x7e/0x83
[1172610.046160]
[1172610.048618] bash ? 0000000000000000 0 30993 30992
[1172610.051135] ffff8101aa2a3ee8 0000000000000046 ffff8101aa2a3eb0 0000000000000011
[1172610.053708] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.056312] ffffffff80747dc0 ffff810170233738 ffff8101aa2a3eb4 ffff8101aa2a3ea8
[1172610.056415] Call Trace:
[1172610.061510] [] do_exit+0x5be/0x8a0
[1172610.064172] [] do_group_exit+0x2c/0x80
[1172610.066859] [] system_call+0x7e/0x83
[1172610.069537]
[1172610.072190] sshd S 0000000000000000 0 7001 7582
[1172610.074908] ffff8100792b1bf8 0000000000000082 0000000000000000 ffff8101e9c51b80
[1172610.077679] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.080477] ffffffff80747dc0 ffff8102234ff738 ffff8100792b1bc4 ffff8100792b1bb8
[1172610.080580] Call Trace:
[1172610.086042] [] schedule_timeout+0x95/0xd0
[1172610.088861] [] prepare_to_wait+0x23/0x80
[1172610.091673] [] unix_stream_recvmsg+0x386/0x550
[1172610.094492] [] autoremove_wake_function+0x0/0x30
[1172610.097318] [] link_path_walk+0x80/0xf0
[1172610.100148] [] sock_aio_read+0x11b/0x130
[1172610.102976] [] get_unused_fd_flags+0x79/0x120
[1172610.105822] [] do_sync_read+0xd9/0x120
[1172610.108651] [] autoremove_wake_function+0x0/0x30
[1172610.111495] [] __dentry_open+0x11f/0x1b0
[1172610.114319] [] do_filp_open+0x3a/0x50
[1172610.117118] [] vfs_read+0x157/0x160
[1172610.119902] [] sys_read+0x53/0x90
[1172610.122638] [] system_call+0x7e/0x83
[1172610.125360]
[1172610.128056] sshd S 0000000000000000 0 7003 7001
[1172610.130818] ffff8100675a39e8 0000000000000082 ffff8100675a39b0 0000000000000002
[1172610.133623] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.136446] ffffffff80747dc0 ffff810225459028 ffff8100675a39b4 ffff8100675a39a8
[1172610.136549] Call Trace:
[1172610.142064] [] schedule_timeout+0x5f/0xd0
[1172610.144899] [] process_timeout+0x0/0x10
[1172610.147716] [] do_select+0x468/0x560
[1172610.150495] [] __pollwait+0x0/0x130
[1172610.153260] [] default_wake_function+0x0/0x10
[1172610.156005] [] default_wake_function+0x0/0x10
[1172610.158707] [] default_wake_function+0x0/0x10
[1172610.161378] [] default_wake_function+0x0/0x10
[1172610.164026] [] skb_copy_datagram_iovec+0x1a1/0x260
[1172610.166675] [] _spin_lock_bh+0x9/0x20
[1172610.169315] [] release_sock+0x13/0xb0
[1172610.171917] [] tcp_recvmsg+0x370/0x940
[1172610.174494] [] sock_common_recvmsg+0x30/0x50
[1172610.177085] [] sock_aio_read+0x11b/0x130
[1172610.179638] [] core_sys_select+0x209/0x300
[1172610.182178] [] autoremove_wake_function+0x0/0x30
[1172610.184734] [] default_wake_function+0x0/0x10
[1172610.187290] [] current_fs_time+0x1e/0x30
[1172610.189837] [] tty_ldisc_deref+0x52/0x80
[1172610.192370] [] sys_select+0xd1/0x1c0
[1172610.194900] [] system_call+0x7e/0x83
[1172610.197426]
[1172610.199919] bash S 000000000000000e 0 7004 7003
[1172610.202470] ffff8100cc263e88 0000000000000082 80000000804ca065 ffff81022367f530
[1172610.205071] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.207699] ffffffff80747dc0 ffff8102234fe918 ffff8100cc263e38 ffff810035a16348
[1172610.207802] Call Trace:
[1172610.212949] [] do_page_fault+0x202/0x890
[1172610.215618] [] do_wait+0x599/0xc90
[1172610.218263] [] __wake_up+0x43/0x70
[1172610.220900] [] vfs_ioctl+0x220/0x2c0
[1172610.223509] [] default_wake_function+0x0/0x10
[1172610.226109] [] sys_ioctl+0x49/0x80
[1172610.228693] [] system_call+0x7e/0x83
[1172610.231240]
[1172610.233746] aur S 0000000000000000 0 7014 7004
[1172610.236319] ffff810098071e88 0000000000000086 ffff810098071e50 ffffffff80232c93
[1172610.238941] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.241566] ffffffff80747dc0 ffff81022367f738 ffff810098071e54 ffff810098071e48
[1172610.241669] Call Trace:
[1172610.246766] [] get_signal_to_deliver+0x73/0x470
[1172610.249380] [] do_wait+0x599/0xc90
[1172610.251983] [] default_wake_function+0x0/0x10
[1172610.254563] [] system_call+0x7e/0x83
[1172610.257122]
[1172610.259648] aur S 0000000000000004 0 7066 7014
[1172610.262226] ffff810085231e88 0000000000000086 ffff8101ea314ce8 ffffffff80232c93
[1172610.264844] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.267471] ffffffff80747dc0 ffff8101ea314918 ffffffff802302ce ffffffff8020b3d6
[1172610.267574] Call Trace:
[1172610.272674] [] get_signal_to_deliver+0x73/0x470
[1172610.275315] [] recalc_sigpending+0xe/0x30
[1172610.277948] [] do_notify_resume+0x536/0x7a0
[1172610.280577] [] do_wait+0x599/0xc90
[1172610.283199] [] default_wake_function+0x0/0x10
[1172610.285840] [] system_call+0x7e/0x83
[1172610.288491]
[1172610.291116] unrar D ffff8100aa785c80 0 7135 7066
[1172610.293792] ffff8101ecf4ddb8 0000000000000086 ffff8101ecf4dd80 0000000000000000
[1172610.296525] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.299256] ffffffff80747dc0 ffff81021e53f028 ffff8101ecf4dd84 ffff8101ecf4dd78
[1172610.299359] Call Trace:
[1172610.304629] [] vn_iowait+0x75/0xa0
[1172610.307301] [] autoremove_wake_function+0x0/0x30
[1172610.309979] [] xfs_trans_alloc+0x9c/0xb0
[1172610.312653] [] xfs_itruncate_start+0x35/0xe0
[1172610.315340] [] xfs_free_eofblocks+0x17a/0x280
[1172610.318032] [] xfs_release+0x134/0x1e0
[1172610.320711] [] xfs_file_release+0x1a/0x30
[1172610.323417] [] __fput+0xb1/0x1a0
[1172610.326144] [] filp_close+0x54/0x90
[1172610.328895] [] sys_close+0x96/0x100
[1172610.331631] [] system_call+0x7e/0x83
[1172610.334353]
[1172610.337050] sshd D 0000000000000000 0 7187 7582
[1172610.339811] ffff81002b62fd28 0000000000000086 ffff81002b62fcf0 ffff81002b62fcd8
[1172610.342618] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80
[1172610.345448] ffffffff80747dc0 ffff8101ccda3028 ffff81002b62fcf4 ffff81002b62fce8
[1172610.345551] Call Trace:
[1172610.351072] [] wait_for_completion+0x7d/0xc0
[1172610.353915] [] default_wake_function+0x0/0x10
[1172610.356765] [] flush_cpu_workqueue+0x6a/0x90
[1172610.359622] [] wq_barrier_func+0x0/0x10 [1172610.362477] [] flush_workqueue+0x33/0x50 [1172610.365337] [] release_dev+0x44f/0x750 [1172610.368184] [] sys_fchmodat+0x6a/0x120 [1172610.371026] [] tty_release+0x11/0x20 [1172610.373843] [] __fput+0xb1/0x1a0 [1172610.376628] [] filp_close+0x54/0x90 [1172610.379402] [] sys_close+0x96/0x100 [1172610.382135] [] system_call+0x7e/0x83 [1172610.384846] [1172610.387529] sshd ? 0000000000000000 0 7218 7187 [1172610.390280] ffff81013bd7bee8 0000000000000046 ffff81013bd7beb0 0000000000000011 [1172610.393084] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.395907] ffffffff80747dc0 ffff8101bf6ae918 ffff81013bd7beb4 ffff81013bd7bea8 [1172610.396010] Call Trace: [1172610.401528] [] __cond_resched+0x1c/0x50 [1172610.404362] [] do_exit+0x5be/0x8a0 [1172610.407192] [] do_group_exit+0x2c/0x80 [1172610.409993] [] system_call+0x7e/0x83 [1172610.412776] [1172610.415520] sshd S 0000000000000000 0 7236 7582 [1172610.418293] ffff8101e4a89bf8 0000000000000082 ffff8101e4a89bc0 ffff81013bf542c0 [1172610.421090] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.423898] ffffffff80747dc0 ffff810168684208 ffff8101e4a89bc4 ffff8101e4a89bb8 [1172610.424001] Call Trace: [1172610.429474] [] schedule_timeout+0x95/0xd0 [1172610.432284] [] prepare_to_wait+0x23/0x80 [1172610.435096] [] unix_stream_recvmsg+0x386/0x550 [1172610.437896] [] autoremove_wake_function+0x0/0x30 [1172610.440690] [] link_path_walk+0x80/0xf0 [1172610.443487] [] sock_aio_read+0x11b/0x130 [1172610.446249] [] get_unused_fd_flags+0x79/0x120 [1172610.448997] [] do_sync_read+0xd9/0x120 [1172610.451737] [] autoremove_wake_function+0x0/0x30 [1172610.454491] [] __dentry_open+0x11f/0x1b0 [1172610.457244] [] do_filp_open+0x3a/0x50 [1172610.459989] [] vfs_read+0x157/0x160 [1172610.462724] [] sys_read+0x53/0x90 [1172610.465430] [] system_call+0x7e/0x83 [1172610.468131] [1172610.470765] sshd S 0000000000000000 0 7238 7236 [1172610.473440] 
ffff810046e1f9e8 0000000000000082 ffff810046e1f9b0 0000000000000002 [1172610.476161] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.478873] ffffffff80747dc0 ffff810168685028 ffff810046e1f9b4 ffff810046e1f9a8 [1172610.478975] Call Trace: [1172610.484236] [] schedule_timeout+0x5f/0xd0 [1172610.486940] [] process_timeout+0x0/0x10 [1172610.489645] [] do_select+0x468/0x560 [1172610.492340] [] __pollwait+0x0/0x130 [1172610.495030] [] default_wake_function+0x0/0x10 [1172610.497738] [] default_wake_function+0x0/0x10 [1172610.500417] [] default_wake_function+0x0/0x10 [1172610.503076] [] default_wake_function+0x0/0x10 [1172610.505711] [] skb_copy_datagram_iovec+0x1a1/0x260 [1172610.508366] [] _spin_lock_bh+0x9/0x20 [1172610.511004] [] release_sock+0x13/0xb0 [1172610.513638] [] tcp_recvmsg+0x370/0x940 [1172610.516245] [] sock_common_recvmsg+0x30/0x50 [1172610.518841] [] sock_aio_read+0x11b/0x130 [1172610.521423] [] core_sys_select+0x209/0x300 [1172610.523974] [] autoremove_wake_function+0x0/0x30 [1172610.526518] [] default_wake_function+0x0/0x10 [1172610.529058] [] current_fs_time+0x1e/0x30 [1172610.531592] [] tty_ldisc_deref+0x52/0x80 [1172610.534118] [] sys_select+0xd1/0x1c0 [1172610.536645] [] system_call+0x7e/0x83 [1172610.539162] [1172610.541651] bash S 000000000000000e 0 7239 7238 [1172610.544203] ffff8100aae5fe88 0000000000000082 80000001bab2c065 ffff810145b6ae20 [1172610.546809] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.549446] ffffffff80747dc0 ffff810168685738 ffff8100aae5fe38 ffff810065785f18 [1172610.549550] Call Trace: [1172610.554709] [] do_page_fault+0x202/0x890 [1172610.557368] [] update_curr+0x109/0x120 [1172610.560022] [] do_wait+0x599/0xc90 [1172610.562647] [] __sched_text_start+0x166/0x23d [1172610.565267] [] __wake_up+0x43/0x70 [1172610.567871] [] vfs_ioctl+0x220/0x2c0 [1172610.570435] [] default_wake_function+0x0/0x10 [1172610.572998] [] sys_ioctl+0x49/0x80 [1172610.575555] [] 
system_call+0x7e/0x83 [1172610.578118] [1172610.580652] sshd S 0000000000000000 0 7248 7582 [1172610.583235] ffff8101120d5bf8 0000000000000082 ffff8101120d5bc0 ffff81001e998dc0 [1172610.585865] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.588489] ffffffff80747dc0 ffff810130906208 ffff8101120d5bc4 ffff8101120d5bb8 [1172610.588592] Call Trace: [1172610.593666] [] schedule_timeout+0x95/0xd0 [1172610.596253] [] prepare_to_wait+0x23/0x80 [1172610.598824] [] unix_stream_recvmsg+0x386/0x550 [1172610.601405] [] autoremove_wake_function+0x0/0x30 [1172610.603992] [] link_path_walk+0x80/0xf0 [1172610.606571] [] sock_aio_read+0x11b/0x130 [1172610.609138] [] get_unused_fd_flags+0x79/0x120 [1172610.611720] [] do_sync_read+0xd9/0x120 [1172610.614293] [] autoremove_wake_function+0x0/0x30 [1172610.616883] [] __dentry_open+0x11f/0x1b0 [1172610.619463] [] do_filp_open+0x3a/0x50 [1172610.622029] [] vfs_read+0x157/0x160 [1172610.624594] [] sys_read+0x53/0x90 [1172610.627144] [] system_call+0x7e/0x83 [1172610.629703] [1172610.632237] sshd S 0000000000000000 0 7250 7248 [1172610.634822] ffff810126f3d9e8 0000000000000086 ffff810126f3d9b0 0000000000000002 [1172610.637453] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.640086] ffffffff80747dc0 ffff810130907028 ffff810126f3d9b4 ffff810126f3d9a8 [1172610.640190] Call Trace: [1172610.645268] [] schedule_timeout+0x5f/0xd0 [1172610.647857] [] process_timeout+0x0/0x10 [1172610.650429] [] do_select+0x468/0x560 [1172610.652990] [] __pollwait+0x0/0x130 [1172610.655552] [] default_wake_function+0x0/0x10 [1172610.658131] [] default_wake_function+0x0/0x10 [1172610.660680] [] default_wake_function+0x0/0x10 [1172610.663230] [] default_wake_function+0x0/0x10 [1172610.665779] [] skb_copy_datagram_iovec+0x1a1/0x260 [1172610.668368] [] _spin_lock_bh+0x9/0x20 [1172610.670946] [] release_sock+0x13/0xb0 [1172610.673510] [] tcp_recvmsg+0x370/0x940 [1172610.676075] [] sock_common_recvmsg+0x30/0x50 
[1172610.678653] [] sock_aio_read+0x11b/0x130 [1172610.681222] [] core_sys_select+0x209/0x300 [1172610.683798] [] autoremove_wake_function+0x0/0x30 [1172610.686386] [] default_wake_function+0x0/0x10 [1172610.688970] [] current_fs_time+0x1e/0x30 [1172610.691546] [] tty_ldisc_deref+0x52/0x80 [1172610.694114] [] sys_select+0xd1/0x1c0 [1172610.696683] [] system_call+0x7e/0x83 [1172610.699244] [1172610.701782] bash S 000000000000000e 0 7251 7250 [1172610.704370] ffff810121e8de88 0000000000000086 800000008e47c065 ffff8101afbec710 [1172610.707005] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.709641] ffffffff80747dc0 ffff810130907738 ffff810121e8de38 ffff8101e65ef9d8 [1172610.709744] Call Trace: [1172610.714827] [] do_page_fault+0x202/0x890 [1172610.717419] [] do_wait+0x599/0xc90 [1172610.719979] [] __wake_up+0x43/0x70 [1172610.722535] [] vfs_ioctl+0x220/0x2c0 [1172610.725088] [] default_wake_function+0x0/0x10 [1172610.727650] [] sys_ioctl+0x49/0x80 [1172610.730203] [] system_call+0x7e/0x83 [1172610.732759] [1172610.735278] su S 0000000000000000 0 7269 7251 [1172610.737850] ffff8101a5007e88 0000000000000086 ffff8101a5007e50 ffff8100219c0e20 [1172610.740475] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.743107] ffffffff80747dc0 ffff8101afbec918 ffff8101a5007e54 ffff8101a5007e48 [1172610.743210] Call Trace: [1172610.748316] [] do_wait+0x599/0xc90 [1172610.750913] [] default_wake_function+0x0/0x10 [1172610.753518] [] system_call+0x7e/0x83 [1172610.756084] [1172610.758600] bash S 0000000000000000 0 7270 7269 [1172610.761175] ffff81014bc9be88 0000000000000086 ffff81014bc9be50 ffff810139e7c000 [1172610.763792] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.766419] ffffffff80747dc0 ffff8100219c1028 ffff81014bc9be54 ffff81014bc9be48 [1172610.766521] Call Trace: [1172610.771636] [] do_wait+0x599/0xc90 [1172610.774264] [] __wake_up+0x43/0x70 [1172610.776885] [] vfs_ioctl+0x220/0x2c0 
[1172610.779492] [] default_wake_function+0x0/0x10 [1172610.782107] [] sys_ioctl+0x49/0x80 [1172610.784719] [] system_call+0x7e/0x83 [1172610.787329] [1172610.789920] sshd S 0000000000000000 0 7278 7582 [1172610.792579] ffff810194cf5bf8 0000000000000086 ffff810194cf5bc0 ffff810010755600 [1172610.795276] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.797987] ffffffff80747dc0 ffff81002d667738 ffff810194cf5bc4 ffff810194cf5bb8 [1172610.798090] Call Trace: [1172610.803311] [] schedule_timeout+0x95/0xd0 [1172610.805992] [] prepare_to_wait+0x23/0x80 [1172610.808641] [] unix_stream_recvmsg+0x386/0x550 [1172610.811284] [] autoremove_wake_function+0x0/0x30 [1172610.813937] [] link_path_walk+0x80/0xf0 [1172610.816593] [] sock_aio_read+0x11b/0x130 [1172610.819250] [] get_unused_fd_flags+0x79/0x120 [1172610.821914] [] do_sync_read+0xd9/0x120 [1172610.824602] [] autoremove_wake_function+0x0/0x30 [1172610.827337] [] __dentry_open+0x11f/0x1b0 [1172610.830101] [] do_filp_open+0x3a/0x50 [1172610.832855] [] vfs_read+0x157/0x160 [1172610.835593] [] sys_read+0x53/0x90 [1172610.838321] [] system_call+0x7e/0x83 [1172610.841049] [1172610.843744] sshd S 0000000000000000 0 7280 7278 [1172610.846501] ffff81013acb39e8 0000000000000082 ffff81013acb39b0 0000000000000002 [1172610.849305] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.852125] ffffffff80747dc0 ffff81011060e918 ffff81013acb39b4 ffff81013acb39a8 [1172610.852228] Call Trace: [1172610.857719] [] schedule_timeout+0x5f/0xd0 [1172610.860553] [] process_timeout+0x0/0x10 [1172610.863393] [] do_select+0x468/0x560 [1172610.866228] [] __pollwait+0x0/0x130 [1172610.869046] [] default_wake_function+0x0/0x10 [1172610.871874] [] default_wake_function+0x0/0x10 [1172610.874652] [] default_wake_function+0x0/0x10 [1172610.877377] [] default_wake_function+0x0/0x10 [1172610.880068] [] add_partial+0x19/0x60 [1172610.882722] [] __slab_free+0x15d/0x310 [1172610.885354] [] _spin_lock_bh+0x9/0x20 
[1172610.887984] [] release_sock+0x13/0xb0 [1172610.890616] [] tcp_recvmsg+0x370/0x940 [1172610.893251] [] sock_common_recvmsg+0x30/0x50 [1172610.895887] [] sock_aio_read+0x11b/0x130 [1172610.898512] [] core_sys_select+0x209/0x300 [1172610.901144] [] autoremove_wake_function+0x0/0x30 [1172610.903780] [] default_wake_function+0x0/0x10 [1172610.906421] [] current_fs_time+0x1e/0x30 [1172610.909040] [] tty_ldisc_deref+0x52/0x80 [1172610.911632] [] sys_select+0xd1/0x1c0 [1172610.914215] [] system_call+0x7e/0x83 [1172610.916754] [1172610.919253] bash S 000000000000000e 0 7281 7280 [1172610.921808] ffff8101919e3e88 0000000000000082 80000001542be065 ffff8100867c7530 [1172610.924409] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.927021] ffffffff80747dc0 ffff81011060f028 ffff8101919e3e38 ffff8101aae8a930 [1172610.927124] Call Trace: [1172610.932186] [] do_page_fault+0x202/0x890 [1172610.934771] [] do_wait+0x599/0xc90 [1172610.937337] [] __wake_up+0x43/0x70 [1172610.939863] [] default_wake_function+0x0/0x10 [1172610.942391] [] system_call+0x7e/0x83 [1172610.944923] [1172610.947429] su S 0000000000000000 0 7288 7281 [1172610.949987] ffff81004e873e88 0000000000000086 ffff81004e873e50 ffff81011060f530 [1172610.952588] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.955214] ffffffff80747dc0 ffff8100867c7738 ffff81004e873e54 ffff81004e873e48 [1172610.955317] Call Trace: [1172610.960412] [] do_wait+0x599/0xc90 [1172610.963007] [] default_wake_function+0x0/0x10 [1172610.965602] [] system_call+0x7e/0x83 [1172610.968186] [1172610.970703] bash S 0000000000000000 0 7289 7288 [1172610.973262] ffff810043dbfdb8 0000000000000082 ffff810043dbfd80 0000000000000fee [1172610.975867] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172610.978487] ffffffff80747dc0 ffff81011060f738 ffff810043dbfd84 ffff810043dbfd78 [1172610.978590] Call Trace: [1172610.983666] [] schedule_timeout+0x95/0xd0 [1172610.986293] [] 
add_wait_queue+0x1c/0x60 [1172610.988916] [] read_chan+0x228/0x6f0 [1172610.991531] [] default_wake_function+0x0/0x10 [1172610.994156] [] tty_read+0xb0/0x100 [1172610.996766] [] vfs_read+0xc5/0x160 [1172610.999359] [] sys_read+0x53/0x90 [1172611.001936] [] system_call+0x7e/0x83 [1172611.004527] [1172611.007092] strace S 0000000000000000 0 7319 7270 [1172611.009707] ffff8101534a9e88 0000000000000086 ffff8101534a9e50 0000000000000092 [1172611.012368] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.015032] ffffffff80747dc0 ffff810139e7c208 ffff8101534a9e54 ffff8101534a9e48 [1172611.015135] Call Trace: [1172611.020281] [] __group_send_sig_info+0x75/0xa0 [1172611.022914] [] do_wait+0x599/0xc90 [1172611.025526] [] kill_pid_info+0x51/0x90 [1172611.028133] [] default_wake_function+0x0/0x10 [1172611.030760] [] system_call+0x7e/0x83 [1172611.033389] [1172611.035983] rm D 0000000000000000 0 7463 7239 [1172611.038664] ffff8101254a3b08 0000000000000086 ffff8101254a3ad0 ffffffff80592c6c [1172611.041422] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.044223] ffffffff80747dc0 ffff810145b6b028 ffff8101254a3ad4 ffff8101254a3ac8 [1172611.044326] Call Trace: [1172611.049792] [] __down+0x10c/0x11f [1172611.052604] [] __down+0xa7/0x11f [1172611.055405] [] default_wake_function+0x0/0x10 [1172611.058227] [] __down_failed+0x35/0x3a [1172611.061044] [] xfs_buf_lock+0x3e/0x40 [1172611.063866] [] xfs_getsb+0x15/0x40 [1172611.066676] [] xfs_trans_getsb+0x5a/0xb0 [1172611.069478] [] xfs_trans_apply_sb_deltas+0xf/0x370 [1172611.072281] [] _xfs_trans_commit+0x9e/0x3c0 [1172611.075085] [] __up_read+0x21/0xb0 [1172611.077884] [] xfs_free_extent+0xe2/0x110 [1172611.080690] [] kmem_zone_alloc+0x5c/0xd0 [1172611.083499] [] kmem_zone_alloc+0x5c/0xd0 [1172611.086267] [] kmem_zone_zalloc+0x32/0x50 [1172611.089024] [] xfs_itruncate_finish+0xdb/0x320 [1172611.091768] [] xfs_inactive+0x3f1/0x520 [1172611.094486] [] xfs_fs_clear_inode+0xa9/0x100 
[1172611.097203] [] clear_inode+0x58/0xf0 [1172611.099883] [] generic_delete_inode+0xe9/0xf0 [1172611.102557] [] do_unlinkat+0x14a/0x1c0 [1172611.105235] [] error_exit+0x0/0x84 [1172611.107916] [] system_call+0x7e/0x83 [1172611.110592] [1172611.113232] pickup S 0000000000000000 0 7573 30580 [1172611.115922] ffff81021d34be58 0000000000000086 ffff81021d34be20 0000000000000000 [1172611.118661] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.121398] ffffffff80747dc0 ffff8101964af028 ffff81021d34be24 ffff81021d34be18 [1172611.121501] Call Trace: [1172611.126801] [] schedule_timeout+0x5f/0xd0 [1172611.129501] [] process_timeout+0x0/0x10 [1172611.132189] [] sys_epoll_wait+0x1bd/0x4e0 [1172611.134877] [] default_wake_function+0x0/0x10 [1172611.137570] [] system_call+0x7e/0x83 [1172611.140247] [1172611.142890] bash D 0000000000000000 0 8896 1 [1172611.145570] ffff8101cdf07ac8 0000000000000046 ffff8101cdf07a90 ffff810226a79800 [1172611.148276] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.150995] ffffffff80747dc0 ffff810114417738 ffff8101cdf07a94 ffff8101cdf07a88 [1172611.151098] Call Trace: [1172611.156317] [] wait_for_completion+0x7d/0xc0 [1172611.159006] [] default_wake_function+0x0/0x10 [1172611.161706] [] flush_cpu_workqueue+0x6a/0x90 [1172611.164401] [] wq_barrier_func+0x0/0x10 [1172611.167086] [] flush_workqueue+0x33/0x50 [1172611.169785] [] release_dev+0x44f/0x750 [1172611.172499] [] __sched_text_start+0x166/0x23d [1172611.175232] [] tty_release+0x11/0x20 [1172611.177948] [] __fput+0xb1/0x1a0 [1172611.180652] [] filp_close+0x54/0x90 [1172611.183331] [] put_files_struct+0xb1/0xd0 [1172611.185991] [] do_exit+0x1a9/0x8a0 [1172611.188636] [] __dequeue_signal+0x165/0x1f0 [1172611.191258] [] do_group_exit+0x2c/0x80 [1172611.193857] [] get_signal_to_deliver+0x2c7/0x470 [1172611.196464] [] do_notify_resume+0xc5/0x7a0 [1172611.199077] [] send_signal+0x62/0x1f0 [1172611.201678] [] __group_send_sig_info+0x75/0xa0 
[1172611.204289] [] group_send_sig_info+0x6e/0x90 [1172611.206890] [] sys_rt_sigreturn+0x324/0x3d0 [1172611.209498] [] sys_rt_sigaction+0x8e/0xc0 [1172611.212068] [] int_signal+0x12/0x17 [1172611.214618] [1172611.217129] su ? 0000000000000000 0 8903 8896 [1172611.219666] ffff8101e685dee8 0000000000000046 ffff8101e685deb0 0000000000000011 [1172611.222241] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.224859] ffffffff80747dc0 ffff8101158e9028 ffff8101e685deb4 ffff8101e685dea8 [1172611.224962] Call Trace: [1172611.230051] [] do_exit+0x5be/0x8a0 [1172611.232666] [] do_group_exit+0x2c/0x80 [1172611.235284] [] system_call+0x7e/0x83 [1172611.237904] [1172611.240493] bash D ffff8101bfb7e600 0 8977 1 [1172611.243132] ffff810106e37ac8 0000000000000046 ffff810106e37c08 ffff810226a79800 [1172611.245831] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.248548] ffffffff80747dc0 ffff81018e79d738 0000000000000000 0000000000000000 [1172611.248652] Call Trace: [1172611.253996] [] wait_for_completion+0x7d/0xc0 [1172611.256787] [] default_wake_function+0x0/0x10 [1172611.259581] [] flush_cpu_workqueue+0x6a/0x90 [1172611.262374] [] wq_barrier_func+0x0/0x10 [1172611.265167] [] flush_workqueue+0x33/0x50 [1172611.267952] [] release_dev+0x44f/0x750 [1172611.270733] [] tty_release+0x11/0x20 [1172611.273502] [] __fput+0xb1/0x1a0 [1172611.276259] [] filp_close+0x54/0x90 [1172611.279016] [] put_files_struct+0xb1/0xd0 [1172611.281764] [] do_exit+0x1a9/0x8a0 [1172611.284504] [] __dequeue_signal+0x165/0x1f0 [1172611.287232] [] do_group_exit+0x2c/0x80 [1172611.289939] [] get_signal_to_deliver+0x2c7/0x470 [1172611.292654] [] do_notify_resume+0xc5/0x7a0 [1172611.295338] [] send_signal+0x62/0x1f0 [1172611.298000] [] __group_send_sig_info+0x75/0xa0 [1172611.300678] [] group_send_sig_info+0x6e/0x90 [1172611.303357] [] sys_rt_sigreturn+0x324/0x3d0 [1172611.306036] [] sys_rt_sigaction+0x8e/0xc0 [1172611.308696] [] int_signal+0x12/0x17 
[1172611.311338] [1172611.313951] su ? 0000000000000000 0 8984 8977 [1172611.316601] ffff810151203ee8 0000000000000046 ffff810151203eb0 0000000000000011 [1172611.319282] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.321981] ffffffff80747dc0 ffff81021a284918 ffff810151203eb4 ffff810151203ea8 [1172611.322083] Call Trace: [1172611.327263] [] do_exit+0x5be/0x8a0 [1172611.329910] [] do_group_exit+0x2c/0x80 [1172611.332547] [] system_call+0x7e/0x83 [1172611.335180] [1172611.337787] sshd S 0000000000000000 0 9072 7582 [1172611.340453] ffff81012ee91bf8 0000000000000082 ffff81012ee91bc0 ffff8101b0d95080 [1172611.343161] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.345879] ffffffff80747dc0 ffff8101cb862918 ffff81012ee91bc4 ffff81012ee91bb8 [1172611.345982] Call Trace: [1172611.351307] [] schedule_timeout+0x95/0xd0 [1172611.354060] [] prepare_to_wait+0x23/0x80 [1172611.356818] [] unix_stream_recvmsg+0x386/0x550 [1172611.359584] [] autoremove_wake_function+0x0/0x30 [1172611.362359] [] link_path_walk+0x80/0xf0 [1172611.365126] [] sock_aio_read+0x11b/0x130 [1172611.367882] [] get_unused_fd_flags+0x79/0x120 [1172611.370649] [] do_sync_read+0xd9/0x120 [1172611.373404] [] autoremove_wake_function+0x0/0x30 [1172611.376174] [] pick_next_task_fair+0x42/0x70 [1172611.378939] [] __sched_text_start+0x166/0x23d [1172611.381719] [] do_filp_open+0x3a/0x50 [1172611.384497] [] vfs_read+0x157/0x160 [1172611.387260] [] sys_read+0x53/0x90 [1172611.390008] [] system_call+0x7e/0x83 [1172611.392729] [1172611.395395] sshd S 0000000000000000 0 9074 9072 [1172611.398114] ffff8101677179e8 0000000000000086 ffff8101677179b0 0000000000000002 [1172611.400847] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.403591] ffffffff80747dc0 ffff8101cb863738 ffff8101677179b4 ffff8101677179a8 [1172611.403694] Call Trace: [1172611.409063] [] schedule_timeout+0x5f/0xd0 [1172611.411822] [] process_timeout+0x0/0x10 [1172611.414587] 
[] do_select+0x468/0x560 [1172611.417307] [] __pollwait+0x0/0x130 [1172611.420004] [] default_wake_function+0x0/0x10 [1172611.422704] [] default_wake_function+0x0/0x10 [1172611.425339] [] default_wake_function+0x0/0x10 [1172611.427923] [] default_wake_function+0x0/0x10 [1172611.430472] [] add_partial+0x19/0x60 [1172611.433013] [] __slab_free+0x15d/0x310 [1172611.435547] [] _spin_lock_bh+0x9/0x20 [1172611.438080] [] release_sock+0x13/0xb0 [1172611.440607] [] tcp_recvmsg+0x370/0x940 [1172611.443136] [] sock_common_recvmsg+0x30/0x50 [1172611.445679] [] sock_aio_read+0x11b/0x130 [1172611.448224] [] core_sys_select+0x209/0x300 [1172611.450769] [] autoremove_wake_function+0x0/0x30 [1172611.453330] [] default_wake_function+0x0/0x10 [1172611.455892] [] current_fs_time+0x1e/0x30 [1172611.458451] [] tty_ldisc_deref+0x52/0x80 [1172611.460996] [] sys_select+0xd1/0x1c0 [1172611.463530] [] system_call+0x7e/0x83 [1172611.466056] [1172611.468545] bash S 0000000000000000 0 9075 9074 [1172611.471088] ffff8101a8d01db8 0000000000000086 ffff8101a8d01d80 0000000000000ff5 [1172611.473676] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.476296] ffffffff80747dc0 ffff81010f26c208 ffff8101a8d01d84 ffff8101a8d01d78 [1172611.476398] Call Trace: [1172611.481491] [] schedule_timeout+0x95/0xd0 [1172611.484118] [] add_wait_queue+0x1c/0x60 [1172611.486724] [] read_chan+0x228/0x6f0 [1172611.489303] [] default_wake_function+0x0/0x10 [1172611.491890] [] tty_read+0xb0/0x100 [1172611.494437] [] vfs_read+0xc5/0x160 [1172611.496960] [] sys_read+0x53/0x90 [1172611.499471] [] system_call+0x7e/0x83 [1172611.501978] [1172611.504443] sshd S 0000000000000000 0 9477 7582 [1172611.506967] ffff810122bb5bf8 0000000000000082 ffff810122bb5bc0 ffff810102e23600 [1172611.509518] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.512071] ffffffff80747dc0 ffff81019b72e918 ffff810122bb5bc4 ffff810122bb5bb8 [1172611.512174] Call Trace: [1172611.517102] [] 
schedule_timeout+0x95/0xd0 [1172611.519615] [] prepare_to_wait+0x23/0x80 [1172611.522128] [] unix_stream_recvmsg+0x386/0x550 [1172611.524648] [] autoremove_wake_function+0x0/0x30 [1172611.527170] [] link_path_walk+0x80/0xf0 [1172611.529681] [] sock_aio_read+0x11b/0x130 [1172611.532193] [] get_unused_fd_flags+0x79/0x120 [1172611.534712] [] do_sync_read+0xd9/0x120 [1172611.537222] [] autoremove_wake_function+0x0/0x30 [1172611.539741] [] pick_next_task_fair+0x42/0x70 [1172611.542257] [] __sched_text_start+0x166/0x23d [1172611.544776] [] do_filp_open+0x3a/0x50 [1172611.547286] [] vfs_read+0x157/0x160 [1172611.549793] [] sys_read+0x53/0x90 [1172611.552298] [] system_call+0x7e/0x83 [1172611.554805] [1172611.557268] sshd S 0000000000000000 0 9479 9477 [1172611.559791] ffff8101d7f7b9e8 0000000000000082 ffff8101d7f7b9b0 0000000000000002 [1172611.562340] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.564892] ffffffff80747dc0 ffff81019b72f738 ffff8101d7f7b9b4 ffff8101d7f7b9a8 [1172611.564995] Call Trace: [1172611.569926] [] schedule_timeout+0x5f/0xd0 [1172611.572443] [] process_timeout+0x0/0x10 [1172611.574957] [] do_select+0x468/0x560 [1172611.577469] [] __pollwait+0x0/0x130 [1172611.579979] [] default_wake_function+0x0/0x10 [1172611.582500] [] default_wake_function+0x0/0x10 [1172611.585023] [] default_wake_function+0x0/0x10 [1172611.587546] [] default_wake_function+0x0/0x10 [1172611.590069] [] skb_copy_datagram_iovec+0x1a1/0x260 [1172611.592602] [] _spin_lock_bh+0x9/0x20 [1172611.595136] [] release_sock+0x13/0xb0 [1172611.597669] [] tcp_recvmsg+0x370/0x940 [1172611.600206] [] sock_common_recvmsg+0x30/0x50 [1172611.602755] [] sock_aio_read+0x11b/0x130 [1172611.605295] [] core_sys_select+0x209/0x300 [1172611.607838] [] autoremove_wake_function+0x0/0x30 [1172611.610396] [] default_wake_function+0x0/0x10 [1172611.612949] [] current_fs_time+0x1e/0x30 [1172611.615496] [] tty_ldisc_deref+0x52/0x80 [1172611.618033] [] sys_select+0xd1/0x1c0 
[1172611.620569] [] system_call+0x7e/0x83 [1172611.623100] [1172611.625606] bash S 7fffffffffffffff 0 9480 9479 [1172611.628160] ffff8101d7ed1db8 0000000000000086 000000000000000b 0000000000000ff5 [1172611.630773] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.633395] ffffffff80747dc0 ffff8101cb7ee208 0000000000000000 ffff8101657c8018 [1172611.633497] Call Trace: [1172611.638557] [] schedule_timeout+0x95/0xd0 [1172611.641136] [] add_wait_queue+0x1c/0x60 [1172611.643699] [] read_chan+0x228/0x6f0 [1172611.646256] [] default_wake_function+0x0/0x10 [1172611.648829] [] tty_read+0xb0/0x100 [1172611.651389] [] vfs_read+0xc5/0x160 [1172611.653928] [] sys_read+0x53/0x90 [1172611.656463] [] system_call+0x7e/0x83 [1172611.659013] [1172611.661536] su S 0000000000000000 0 9613 1 [1172611.664103] ffff8101c3c57e88 0000000000000086 ffff8101c3c57e50 ffff810117ac0000 [1172611.666717] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.669327] ffffffff80747dc0 ffff810106581028 ffff8101c3c57e54 ffff8101c3c57e48 [1172611.669430] Call Trace: [1172611.674472] [] do_wait+0x599/0xc90 [1172611.677029] [] default_wake_function+0x0/0x10 [1172611.679584] [] system_call+0x7e/0x83 [1172611.682132] [1172611.684643] bash S 000000000000000e 0 9614 9613 [1172611.687205] ffff8101ebc27e88 0000000000000082 80000001df3f8065 ffff8101a86f6710 [1172611.689809] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.692439] ffffffff80747dc0 ffff810117ac0208 ffff8101ebc27e38 ffff8101786d7a80 [1172611.692542] Call Trace: [1172611.697656] [] do_page_fault+0x202/0x890 [1172611.700289] [] update_curr+0x109/0x120 [1172611.702917] [] do_wait+0x599/0xc90 [1172611.705533] [] __sched_text_start+0x166/0x23d [1172611.708163] [] __wake_up+0x43/0x70 [1172611.710787] [] vfs_ioctl+0x220/0x2c0 [1172611.713422] [] default_wake_function+0x0/0x10 [1172611.716078] [] sys_ioctl+0x49/0x80 [1172611.718716] [] system_call+0x7e/0x83 [1172611.721349] 
[1172611.723928] bash D ffff81017bb82900 0 9632 1 [1172611.726540] ffff8101514abac8 0000000000000046 ffff8101514abc08 ffff810226a79800 [1172611.729205] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.731857] ffffffff80747dc0 ffff810136f4c918 ffff810007139a50 ffff810004cd4a50 [1172611.731960] Call Trace: [1172611.737092] [] wait_for_completion+0x7d/0xc0 [1172611.739754] [] default_wake_function+0x0/0x10 [1172611.742431] [] flush_cpu_workqueue+0x6a/0x90 [1172611.745109] [] wq_barrier_func+0x0/0x10 [1172611.747812] [] flush_workqueue+0x33/0x50 [1172611.750540] [] release_dev+0x44f/0x750 [1172611.753296] [] __sched_text_start+0x166/0x23d [1172611.756058] [] tty_release+0x11/0x20 [1172611.758805] [] __fput+0xb1/0x1a0 [1172611.761549] [] filp_close+0x54/0x90 [1172611.764295] [] put_files_struct+0xb1/0xd0 [1172611.767039] [] do_exit+0x1a9/0x8a0 [1172611.769781] [] __dequeue_signal+0x165/0x1f0 [1172611.772533] [] do_group_exit+0x2c/0x80 [1172611.775279] [] get_signal_to_deliver+0x2c7/0x470 [1172611.778032] [] do_notify_resume+0xc5/0x7a0 [1172611.780784] [] send_signal+0x62/0x1f0 [1172611.783537] [] __group_send_sig_info+0x75/0xa0 [1172611.786308] [] group_send_sig_info+0x6e/0x90 [1172611.789086] [] sys_rt_sigreturn+0x324/0x3d0 [1172611.791858] [] sys_rt_sigaction+0x8e/0xc0 [1172611.794615] [] int_signal+0x12/0x17 [1172611.797334] [1172611.799993] su ? 
0000000000000000 0 9639 9632 [1172611.802704] ffff8101b98afee8 0000000000000046 ffff8101b98afeb0 0000000000000011 [1172611.805431] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.808166] ffffffff80747dc0 ffff8101243a7028 ffff8101b98afeb4 ffff8101b98afea8 [1172611.808269] Call Trace: [1172611.813600] [] do_exit+0x5be/0x8a0 [1172611.816333] [] do_group_exit+0x2c/0x80 [1172611.819057] [] system_call+0x7e/0x83 [1172611.821794] [1172611.824519] mdadm D 0000000000000000 0 9783 9614 [1172611.827312] ffff8101aea09a18 0000000000000082 ffff8101aea099e0 ffff8101aea09998 [1172611.830142] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.832994] ffffffff80747dc0 ffff8101a86f6918 ffff8101aea099e4 ffff8101aea099d8 [1172611.833098] Call Trace: [1172611.838611] [] __wake_up+0x43/0x70 [1172611.841427] [] sync_page+0x0/0x50 [1172611.844200] [] io_schedule+0x28/0x40 [1172611.846961] [] sync_page+0x3b/0x50 [1172611.849716] [] __wait_on_bit_lock+0x4a/0x80 [1172611.852476] [] __lock_page+0x5f/0x70 [1172611.855220] [] wake_bit_function+0x0/0x30 [1172611.857968] [] pagevec_lookup_tag+0x1a/0x30 [1172611.860699] [] write_cache_pages+0x191/0x340 [1172611.863407] [] __writepage+0x0/0x30 [1172611.866104] [] do_writepages+0x20/0x40 [1172611.868763] [] __writeback_single_inode+0x2d9/0x400 [1172611.871430] [] __wake_up+0x43/0x70 [1172611.874087] [] sync_sb_inodes+0x21a/0x300 [1172611.876755] [] sync_inodes_sb+0xa1/0xc0 [1172611.879405] [] __fsync_super+0xb/0x70 [1172611.882049] [] fsync_super+0x9/0x20 [1172611.884692] [] fsync_bdev+0x26/0x60 [1172611.887318] [] blkdev_ioctl+0x1c7/0x7a0 [1172611.889939] [] handle_mm_fault+0x1a1/0x8a0 [1172611.892573] [] md_open+0x6a/0x90 [1172611.895186] [] blkdev_open+0x0/0x90 [1172611.897799] [] __up_read+0x21/0xb0 [1172611.900374] [] do_page_fault+0x202/0x890 [1172611.902936] [] blkdev_open+0x3c/0x90 [1172611.905489] [] block_ioctl+0x1b/0x30 [1172611.907994] [] do_ioctl+0x2f/0xa0 [1172611.910470] [] 
vfs_ioctl+0x220/0x2c0 [1172611.912938] [] sys_ioctl+0x49/0x80 [1172611.915381] [] error_exit+0x0/0x84 [1172611.917816] [] system_call+0x7e/0x83 [1172611.920256] [1172611.922661] sshd S 0000000000000000 0 9793 7582 [1172611.925122] ffff8101a7fabbf8 0000000000000086 ffff8101a7fabbc0 ffff8101cd1f1600 [1172611.927626] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.930154] ffffffff80747dc0 ffff8101536d0918 ffff8101a7fabbc4 ffff8101a7fabbb8 [1172611.930258] Call Trace: [1172611.935198] [] schedule_timeout+0x95/0xd0 [1172611.937753] [] prepare_to_wait+0x23/0x80 [1172611.940308] [] unix_stream_recvmsg+0x386/0x550 [1172611.942871] [] autoremove_wake_function+0x0/0x30 [1172611.945439] [] link_path_walk+0x80/0xf0 [1172611.947997] [] sock_aio_read+0x11b/0x130 [1172611.950553] [] get_unused_fd_flags+0x79/0x120 [1172611.953111] [] do_sync_read+0xd9/0x120 [1172611.955662] [] autoremove_wake_function+0x0/0x30 [1172611.958225] [] pick_next_task_fair+0x42/0x70 [1172611.960800] [] __sched_text_start+0x166/0x23d [1172611.963379] [] do_filp_open+0x3a/0x50 [1172611.965946] [] vfs_read+0x157/0x160 [1172611.968507] [] sys_read+0x53/0x90 [1172611.971029] [] system_call+0x7e/0x83 [1172611.973523] [1172611.975978] sshd S 0000000000000000 0 9795 9793 [1172611.978461] ffff81021f41d9e8 0000000000000082 ffff81021f41d9b0 0000000000000002 [1172611.980981] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172611.983533] ffffffff80747dc0 ffff8101536d1738 ffff81021f41d9b4 ffff81021f41d9a8 [1172611.983636] Call Trace: [1172611.988598] [] schedule_timeout+0x5f/0xd0 [1172611.991157] [] process_timeout+0x0/0x10 [1172611.993698] [] do_select+0x468/0x560 [1172611.996208] [] __pollwait+0x0/0x130 [1172611.998705] [] default_wake_function+0x0/0x10 [1172612.001178] [] default_wake_function+0x0/0x10 [1172612.003617] [] default_wake_function+0x0/0x10 [1172612.006034] [] default_wake_function+0x0/0x10 [1172612.008424] [] add_partial+0x19/0x60 [1172612.010813] [] 
__slab_free+0x15d/0x310 [1172612.013194] [] _spin_lock_bh+0x9/0x20 [1172612.015567] [] release_sock+0x13/0xb0 [1172612.017935] [] tcp_recvmsg+0x370/0x940 [1172612.020296] [] sock_common_recvmsg+0x30/0x50 [1172612.022667] [] sock_aio_read+0x11b/0x130 [1172612.025029] [] core_sys_select+0x209/0x300 [1172612.027401] [] autoremove_wake_function+0x0/0x30 [1172612.029774] [] default_wake_function+0x0/0x10 [1172612.032143] [] current_fs_time+0x1e/0x30 [1172612.034510] [] tty_ldisc_deref+0x52/0x80 [1172612.036880] [] sys_select+0xd1/0x1c0 [1172612.039245] [] system_call+0x7e/0x83 [1172612.041607] [1172612.043951] bash S 000000000000000e 0 9796 9795 [1172612.046358] ffff81013de09e88 0000000000000086 8000000104441065 ffff8101125a2710 [1172612.048809] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172612.051282] ffffffff80747dc0 ffff81014da80208 ffff81013de09e38 ffff8101eab7d5e8 [1172612.051385] Call Trace: [1172612.056215] [] do_page_fault+0x202/0x890 [1172612.058715] [] update_curr+0x109/0x120 [1172612.061212] [] do_wait+0x599/0xc90 [1172612.063714] [] __sched_text_start+0x166/0x23d [1172612.066238] [] __wake_up+0x43/0x70 [1172612.068760] [] vfs_ioctl+0x220/0x2c0 [1172612.071291] [] default_wake_function+0x0/0x10 [1172612.073847] [] sys_ioctl+0x49/0x80 [1172612.076399] [] system_call+0x7e/0x83 [1172612.078935] [1172612.081438] su S 0000000000000000 0 9804 9796 [1172612.083976] ffff810184fdbe88 0000000000000082 ffff810184fdbe50 ffff810120808000 [1172612.086547] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172612.089126] ffffffff80747dc0 ffff8101125a2918 ffff810184fdbe54 ffff810184fdbe48 [1172612.089229] Call Trace: [1172612.094170] [] do_wait+0x599/0xc90 [1172612.096703] [] default_wake_function+0x0/0x10 [1172612.099244] [] system_call+0x7e/0x83 [1172612.101772] [1172612.104264] bash S 0000000000000000 0 9805 9804 [1172612.106820] ffff8101e88f7db8 0000000000000082 ffff8101e88f7d80 0000000000000ff9 [1172612.109419] 
ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172612.112036] ffffffff80747dc0 ffff810120808208 ffff8101e88f7d84 ffff8101e88f7d78 [1172612.112139] Call Trace: [1172612.117216] [] schedule_timeout+0x95/0xd0 [1172612.119833] [] add_wait_queue+0x1c/0x60 [1172612.122446] [] read_chan+0x228/0x6f0 [1172612.125058] [] default_wake_function+0x0/0x10 [1172612.127700] [] tty_read+0xb0/0x100 [1172612.130342] [] vfs_read+0xc5/0x160 [1172612.132958] [] sys_read+0x53/0x90 [1172612.135554] [] system_call+0x7e/0x83 [1172612.138121] [1172612.140634] smtpd S 0000000000000000 0 9847 30580 [1172612.143203] ffff8101a6e25e58 0000000000000086 ffff8101a6e25e20 ffff81022583d318 [1172612.145786] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172612.148371] ffffffff80747dc0 ffff8101d859a208 ffff8101a6e25e24 ffff8101a6e25e18 [1172612.148474] Call Trace: [1172612.153514] [] schedule_timeout+0x5f/0xd0 [1172612.156129] [] process_timeout+0x0/0x10 [1172612.158743] [] sys_epoll_wait+0x1bd/0x4e0 [1172612.161385] [] default_wake_function+0x0/0x10 [1172612.164066] [] system_call+0x7e/0x83 [1172612.166774] [1172612.169446] smtpd S ffff81022583d318 0 9963 30580 [1172612.172187] ffff8101c5f69eb8 0000000000000082 0000000000000000 ffffffff00000001 [1172612.174990] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172612.177819] ffffffff80747dc0 ffff810105b04918 0000000000000000 000000008bcec672 [1172612.177922] Call Trace: [1172612.183467] [] ns_to_timeval+0x9/0x40 [1172612.186321] [] flock_lock_file_wait+0x14d/0x300 [1172612.189190] [] autoremove_wake_function+0x0/0x30 [1172612.192057] [] sys_flock+0x16b/0x180 [1172612.194913] [] system_call+0x7e/0x83 [1172612.197759] [1172612.200578] cleanup S 0000000000000000 0 9966 30580 [1172612.203466] ffff8101b50b7e58 0000000000000082 ffff8101b50b7e20 ffff8101a496a828 [1172612.206409] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172612.209358] ffffffff80747dc0 
ffff810132074208 ffff8101b50b7e24 ffff8101b50b7e18 [1172612.209460] Call Trace: [1172612.215203] [] schedule_timeout+0x5f/0xd0 [1172612.218127] [] process_timeout+0x0/0x10 [1172612.221046] [] sys_epoll_wait+0x1bd/0x4e0 [1172612.223934] [] default_wake_function+0x0/0x10 [1172612.226813] [] system_call+0x7e/0x83 [1172612.229693] [1172612.232543] local S 0000000000000000 0 9967 30580 [1172612.235450] ffff8101c7bf9e58 0000000000000086 ffff8101c7bf9e20 0000000000000000 [1172612.238401] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 [1172612.241380] ffffffff80747dc0 ffff8101b11bd738 ffff8101c7bf9e24 ffff8101c7bf9e18 [1172612.241483] Call Trace: [1172612.247296] [] schedule_timeout+0x5f/0xd0 [1172612.250292] [] process_timeout+0x0/0x10 [1172612.253278] [] sys_epoll_wait+0x1bd/0x4e0 [1172612.256265] [] default_wake_function+0x0/0x10 [1172612.259237] [] system_call+0x7e/0x83 [1172612.262188] From owner-xfs@oss.sgi.com Sun Nov 4 06:59:29 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 06:59:32 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.7 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_33, SPF_HELO_PASS autolearn=no version=3.3.0-r574664 Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA4ExSPN016066 for ; Sun, 4 Nov 2007 06:59:28 -0800 Received: by lucidpixels.com (Postfix, from userid 1001) id 2D7B51C00026A; Sun, 4 Nov 2007 09:59:32 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id 2822F4019B27; Sun, 4 Nov 2007 09:59:32 -0500 (EST) Date: Sun, 4 Nov 2007 09:59:32 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: Michael Tokarev cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state In-Reply-To: 
<472DDD78.7040002@msgid.tls.msk.ru> Message-ID: References: <472DBF8C.2060508@msgid.tls.msk.ru> <472DDD78.7040002@msgid.tls.msk.ru> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.91.2/4672/Sun Nov 4 03:38:42 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13544 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs On Sun, 4 Nov 2007, Michael Tokarev wrote: > Justin Piszcz wrote: >> On Sun, 4 Nov 2007, Michael Tokarev wrote: > [] >>> The next time you come across something like that, do a SysRq-T dump and >>> post that. It shows a stack trace of all processes - and in particular, >>> where exactly each task is stuck. > >> Yes I got it before I rebooted, ran that and then dmesg > file. >> >> Here it is: >> >> [1172609.665902] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 >> [1172609.668768] ffffffff80747dc0 ffff81015c3aa918 ffff810091c899b4 ffff810091c899a8 > > That's only partial list. All the kernel threads - which are most important > in this context - aren't shown. You ran out of dmesg buffer, and the most > interesting entries was at the beginning. If your /var/log partition is > working, the stuff should be in /var/log/kern.log or equivalent. If it's > not working, there is a way to capture the info still, by stopping syslogd, > cat'ing /proc/kmsg to some tmpfs file and scp'ing it elsewhere. > > /mjt > Will do that the next time it happens, thanks. 
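For reference, once a complete SysRq-T dump has been captured, a short script can sort the task header lines by state, so that any tasks in D state (the interesting ones for a hang like this) stand out. A minimal sketch in Python, assuming the dump still has one task header per line in the "task PC stack pid father" layout seen in this thread; the regex and the function name tasks_by_state are illustrative only and not part of any kernel tool:

```python
import re
from collections import defaultdict

# One task header line looks like (optionally prefixed by a dmesg
# timestamp or a syslog prefix):
#   sshd S 0000000000000000 0 9793 7582
#   migration/3 R running task 0 12 2
# Assumption: task names contain no whitespace.
TASK_RE = re.compile(
    r"(?P<comm>\S+)\s+"
    r"(?P<state>[RSDTZX])\s+"
    r"(?:running task|[0-9a-fA-F]{8,16})\s+"
    r"(?P<stack>\d+)\s+(?P<pid>\d+)\s+(?P<ppid>\d+)\s*$"
)


def tasks_by_state(lines):
    """Group (pid, name) pairs by scheduler state letter."""
    by_state = defaultdict(list)
    for line in lines:
        m = TASK_RE.search(line)
        if m:
            by_state[m.group("state")].append(
                (int(m.group("pid")), m.group("comm"))
            )
    return dict(by_state)


# Sample header lines copied from the dumps posted in this thread.
SAMPLE = """\
[1172611.922661] sshd S 0000000000000000 0 9793 7582
[1172611.930258] Call Trace:
Nov 4 18:55:57 poulenc kernel: migration/3 R running task 0 12 2
Nov 4 18:55:56 poulenc kernel: init S 00000000004c7d68 0 1 0
""".splitlines()

if __name__ == "__main__":
    for state, tasks in sorted(tasks_by_state(SAMPLE).items()):
        print(state, tasks)
```

Feeding it the saved kern.log (or the file captured from /proc/kmsg) instead of SAMPLE and looking at the "D" entry would show which tasks are stuck in uninterruptible sleep; their call traces follow the matching header lines in the dump.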
From owner-xfs@oss.sgi.com Sun Nov 4 07:21:03 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 07:21:07 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-6.5 required=5.0 tests=BAYES_00,J_CHICKENPOX_33, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.0-r574664 Received: from hobbit.corpit.ru (hobbit.corpit.ru [81.13.94.6]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA4FL0Fc021016 for ; Sun, 4 Nov 2007 07:21:03 -0800 Received: from [192.168.1.200] (mjt.ppp.tls.msk.ru [192.168.1.200]) by hobbit.corpit.ru (Postfix) with ESMTP id 3593335610; Sun, 4 Nov 2007 17:55:53 +0300 (MSK) (envelope-from mjt@tls.msk.ru) Message-ID: <472DDD78.7040002@msgid.tls.msk.ru> Date: Sun, 04 Nov 2007 17:55:52 +0300 From: Michael Tokarev Organization: Telecom Service, JSC User-Agent: Icedove 1.5.0.12 (X11/20070607) MIME-Version: 1.0 To: Justin Piszcz CC: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state References: <472DBF8C.2060508@msgid.tls.msk.ru> In-Reply-To: X-Enigmail-Version: 0.94.2.0 OpenPGP: id=4F9CF57E Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4672/Sun Nov 4 03:38:42 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13545 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: mjt@tls.msk.ru Precedence: bulk X-list: xfs Justin Piszcz wrote: > On Sun, 4 Nov 2007, Michael Tokarev wrote: [] >> The next time you come across something like that, do a SysRq-T dump and >> post that. It shows a stack trace of all processes - and in particular, >> where exactly each task is stuck. > Yes I got it before I rebooted, ran that and then dmesg > file. 
> > Here it is: > > [1172609.665902] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 > [1172609.668768] ffffffff80747dc0 ffff81015c3aa918 ffff810091c899b4 ffff810091c899a8 That's only partial list. All the kernel threads - which are most important in this context - aren't shown. You ran out of dmesg buffer, and the most interesting entries was at the beginning. If your /var/log partition is working, the stuff should be in /var/log/kern.log or equivalent. If it's not working, there is a way to capture the info still, by stopping syslogd, cat'ing /proc/kmsg to some tmpfs file and scp'ing it elsewhere. /mjt From owner-xfs@oss.sgi.com Sun Nov 4 10:35:35 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 10:35:43 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.7 required=5.0 tests=BAYES_50,J_CHICKENPOX_33, J_CHICKENPOX_35,J_CHICKENPOX_36,J_CHICKENPOX_39,MIME_8BIT_HEADER autolearn=no version=3.3.0-r574664 Received: from rayleigh.systella.fr (rayleigh.systella.fr [213.41.184.253]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA4IZT0g020746 for ; Sun, 4 Nov 2007 10:35:32 -0800 Received: from [192.168.0.83] (fermat.systella.fr [192.168.0.83]) (authenticated bits=0) by rayleigh.systella.fr (8.14.1/8.14.1/Debian-9) with ESMTP id lA4IHsMT029212 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Sun, 4 Nov 2007 19:18:00 +0100 Message-ID: <472E0CD2.7050702@systella.fr> Date: Sun, 04 Nov 2007 19:17:54 +0100 From: =?ISO-8859-1?Q?BERTRAND_Jo=EBl?= User-Agent: Mozilla/5.0 (X11; U; Linux sparc64; fr-FR; rv:1.8.1.6) Gecko/20070802 Iceape/1.1.4 (Debian-1.1.4-1) MIME-Version: 1.0 To: Michael Tokarev CC: Justin Piszcz , linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state References: <472DBF8C.2060508@msgid.tls.msk.ru> <472DDD78.7040002@msgid.tls.msk.ru> 
In-Reply-To: <472DDD78.7040002@msgid.tls.msk.ru> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-3.1.8 (rayleigh.systella.fr [192.168.254.1]); Sun, 04 Nov 2007 19:18:02 +0100 (CET) X-Scanned-By: MIMEDefang on 192.168.254.1 X-Virus-Scanned: ClamAV 0.91.2/4672/Sun Nov 4 03:38:42 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13546 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: joel.bertrand@systella.fr Precedence: bulk X-list: xfs Michael Tokarev wrote: > Justin Piszcz wrote: >> On Sun, 4 Nov 2007, Michael Tokarev wrote: > [] >>> The next time you come across something like that, do a SysRq-T dump and >>> post that. It shows a stack trace of all processes - and in particular, >>> where exactly each task is stuck. > >> Yes I got it before I rebooted, ran that and then dmesg > file. >> >> Here it is: >> >> [1172609.665902] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 >> [1172609.668768] ffffffff80747dc0 ffff81015c3aa918 ffff810091c899b4 ffff810091c899a8 > > That's only partial list. All the kernel threads - which are most important > in this context - aren't shown. You ran out of dmesg buffer, and the most > interesting entries was at the beginning. If your /var/log partition is > working, the stuff should be in /var/log/kern.log or equivalent. If it's > not working, there is a way to capture the info still, by stopping syslogd, > cat'ing /proc/kmsg to some tmpfs file and scp'ing it elsewhere. I reported the same bug some days ago. I can reproduce it without any trouble :-(. Configuration: 2.6.23 Linux kernel with iscsi-target on sparc64/smp (sun4v). The following output was created by echo t > /proc/sysrq-trigger and echo x > /proc/sysrq-trigger. It is a cut and paste from /var/log/syslog and I hope I haven't made any mistakes...
Nov 4 18:55:56 poulenc kernel: SysRq : Show State Nov 4 18:55:56 poulenc kernel: task PC stack pid father Nov 4 18:55:56 poulenc kernel: init S 00000000004c7d68 0 1 0 Nov 4 18:55:56 poulenc kernel: Call Trace: Nov 4 18:55:56 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:55:56 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:55:56 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:55:56 poulenc kernel: [00000000004eef74] compat_sys_select+0xbc/0x1a0 Nov 4 18:55:56 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:55:56 poulenc kernel: [00000000000150b8] 0x150c0 Nov 4 18:55:56 poulenc kernel: kthreadd S 00000000004273d0 0 2 0 Nov 4 18:55:56 poulenc kernel: Call Trace: Nov 4 18:55:56 poulenc kernel: [0000000000478fe8] kthreadd+0x1b0/0x1c0 Nov 4 18:55:56 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:56 poulenc kernel: [000000000067d404] rest_init+0x2c/0x60 Nov 4 18:55:56 poulenc kernel: migration/0 S 0000000000478ce0 0 3 2 Nov 4 18:55:56 poulenc kernel: Call Trace: Nov 4 18:55:56 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:55:56 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:56 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:56 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:56 poulenc kernel: ksoftirqd/0 S 0000000000478ce0 0 4 2 Nov 4 18:55:56 poulenc kernel: Call Trace: Nov 4 18:55:56 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:55:56 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:56 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:57 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:57 poulenc kernel: watchdog/0 S 0000000000478ce0 0 5 2 Nov 4 18:55:57 poulenc kernel: Call Trace: Nov 4 18:55:57 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:55:57 poulenc kernel: 
[0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:57 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:57 poulenc kernel: migration/1 S 0000000000478ce0 0 6 2 Nov 4 18:55:57 poulenc kernel: Call Trace: Nov 4 18:55:57 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:55:57 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:57 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:57 poulenc kernel: ksoftirqd/1 S 0000000000478ce0 0 7 2 Nov 4 18:55:57 poulenc kernel: Call Trace: Nov 4 18:55:57 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:55:57 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:57 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:57 poulenc kernel: watchdog/1 S 0000000000478ce0 0 8 2 Nov 4 18:55:57 poulenc kernel: Call Trace: Nov 4 18:55:57 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:57 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:57 poulenc kernel: migration/2 S 0000000000478ce0 0 9 2 Nov 4 18:55:57 poulenc kernel: Call Trace: Nov 4 18:55:57 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:55:57 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:57 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:57 poulenc kernel: ksoftirqd/2 S 0000000000478ce0 0 10 2 Nov 4 18:55:57 poulenc kernel: Call Trace: Nov 4 18:55:57 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:55:57 poulenc kernel: 
[0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:57 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:57 poulenc kernel: watchdog/2 S 0000000000478ce0 0 11 2 Nov 4 18:55:57 poulenc kernel: Call Trace: Nov 4 18:55:57 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:57 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:57 poulenc kernel: migration/3 R running task 0 12 2 Nov 4 18:55:57 poulenc kernel: ksoftirqd/3 S 0000000000478ce0 0 13 2 Nov 4 18:55:57 poulenc kernel: Call Trace: Nov 4 18:55:57 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:55:57 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:57 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:58 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:58 poulenc kernel: watchdog/3 S 0000000000478ce0 0 14 2 Nov 4 18:55:58 poulenc kernel: Call Trace: Nov 4 18:55:58 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:58 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:58 poulenc kernel: migration/4 S 0000000000478ce0 0 15 2 Nov 4 18:55:58 poulenc kernel: Call Trace: Nov 4 18:55:58 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:55:58 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:58 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:58 poulenc kernel: ksoftirqd/4 S 0000000000478ce0 0 16 2 Nov 4 18:55:58 poulenc kernel: Call Trace: Nov 4 18:55:58 poulenc kernel: [00000000004683c0] 
ksoftirqd+0xa8/0xc0 Nov 4 18:55:58 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:58 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:58 poulenc kernel: watchdog/4 S 0000000000478ce0 0 17 2 Nov 4 18:55:58 poulenc kernel: Call Trace: Nov 4 18:55:58 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:58 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:58 poulenc kernel: migration/5 S 0000000000478ce0 0 18 2 Nov 4 18:55:58 poulenc kernel: Call Trace: Nov 4 18:55:58 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:55:58 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:58 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:58 poulenc kernel: ksoftirqd/5 S 0000000000478ce0 0 19 2 Nov 4 18:55:58 poulenc kernel: Call Trace: Nov 4 18:55:58 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:55:58 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:58 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:58 poulenc kernel: watchdog/5 S 0000000000478ce0 0 20 2 Nov 4 18:55:58 poulenc kernel: Call Trace: Nov 4 18:55:58 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:58 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:58 poulenc kernel: migration/6 S 0000000000478ce0 0 21 2 Nov 4 18:55:58 poulenc kernel: Call Trace: Nov 4 18:55:58 poulenc kernel: [000000000045e60c] 
migration_thread+0x174/0x360 Nov 4 18:55:58 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:58 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:59 poulenc kernel: ksoftirqd/6 S 0000000000478ce0 0 22 2 Nov 4 18:55:59 poulenc kernel: Call Trace: Nov 4 18:55:59 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:55:59 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:59 poulenc kernel: watchdog/6 S 0000000000478ce0 0 23 2 Nov 4 18:55:59 poulenc kernel: Call Trace: Nov 4 18:55:59 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:59 poulenc kernel: migration/7 S 0000000000478ce0 0 24 2 Nov 4 18:55:59 poulenc kernel: Call Trace: Nov 4 18:55:59 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:55:59 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:59 poulenc kernel: ksoftirqd/7 S 0000000000478ce0 0 25 2 Nov 4 18:55:59 poulenc kernel: Call Trace: Nov 4 18:55:59 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:55:59 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:59 poulenc kernel: watchdog/7 S 0000000000478ce0 0 26 2 Nov 4 18:55:59 poulenc kernel: Call Trace: Nov 4 18:55:59 poulenc kernel: [000000000048f9e0] 
watchdog+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:59 poulenc kernel: migration/8 S 0000000000478ce0 0 27 2 Nov 4 18:55:59 poulenc kernel: Call Trace: Nov 4 18:55:59 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:55:59 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:59 poulenc kernel: ksoftirqd/8 S 0000000000478ce0 0 28 2 Nov 4 18:55:59 poulenc kernel: Call Trace: Nov 4 18:55:59 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:55:59 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:55:59 poulenc kernel: watchdog/8 S 0000000000478ce0 0 29 2 Nov 4 18:55:59 poulenc kernel: Call Trace: Nov 4 18:55:59 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:55:59 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:55:59 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:00 poulenc kernel: migration/9 S 0000000000478ce0 0 30 2 Nov 4 18:56:00 poulenc kernel: Call Trace: Nov 4 18:56:00 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:00 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:00 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:00 poulenc kernel: ksoftirqd/9 S 0000000000478ce0 0 31 2 Nov 4 18:56:00 poulenc kernel: Call Trace: Nov 4 18:56:00 poulenc kernel: [00000000004683c0] 
ksoftirqd+0xa8/0xc0 Nov 4 18:56:00 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:00 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:00 poulenc kernel: watchdog/9 S 0000000000478ce0 0 32 2 Nov 4 18:56:00 poulenc kernel: Call Trace: Nov 4 18:56:00 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:00 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:00 poulenc kernel: migration/10 S 0000000000478ce0 0 33 2 Nov 4 18:56:00 poulenc kernel: Call Trace: Nov 4 18:56:00 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:00 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:00 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:00 poulenc kernel: ksoftirqd/10 S 0000000000478ce0 0 34 2 Nov 4 18:56:00 poulenc kernel: Call Trace: Nov 4 18:56:00 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:00 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:00 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:00 poulenc kernel: watchdog/10 S 0000000000478ce0 0 35 2 Nov 4 18:56:00 poulenc kernel: Call Trace: Nov 4 18:56:00 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:00 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:00 poulenc kernel: migration/11 S 0000000000478ce0 0 36 2 Nov 4 18:56:00 poulenc kernel: Call Trace: Nov 4 18:56:00 poulenc kernel: [000000000045e60c] 
migration_thread+0x174/0x360 Nov 4 18:56:00 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:00 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:00 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:00 poulenc kernel: ksoftirqd/11 S 0000000000478ce0 0 37 2 Nov 4 18:56:00 poulenc kernel: Call Trace: Nov 4 18:56:00 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:00 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:01 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:01 poulenc kernel: watchdog/11 S 0000000000478ce0 0 38 2 Nov 4 18:56:01 poulenc kernel: Call Trace: Nov 4 18:56:01 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:01 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:01 poulenc kernel: migration/12 S 0000000000478ce0 0 39 2 Nov 4 18:56:01 poulenc kernel: Call Trace: Nov 4 18:56:01 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:01 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:01 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:01 poulenc kernel: ksoftirqd/12 S 0000000000478ce0 0 40 2 Nov 4 18:56:01 poulenc kernel: Call Trace: Nov 4 18:56:01 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:01 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:01 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:01 poulenc kernel: watchdog/12 S 0000000000478ce0 0 41 2 Nov 4 18:56:01 poulenc kernel: Call Trace: Nov 4 18:56:01 poulenc kernel: [000000000048f9e0] 
watchdog+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:01 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:01 poulenc kernel: migration/13 S 0000000000478ce0 0 42 2 Nov 4 18:56:01 poulenc kernel: Call Trace: Nov 4 18:56:01 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:01 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:01 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:01 poulenc kernel: ksoftirqd/13 S 0000000000478ce0 0 43 2 Nov 4 18:56:01 poulenc kernel: Call Trace: Nov 4 18:56:01 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:01 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:01 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:01 poulenc kernel: watchdog/13 S 0000000000478ce0 0 44 2 Nov 4 18:56:01 poulenc kernel: Call Trace: Nov 4 18:56:01 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:01 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:01 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:02 poulenc kernel: migration/14 S 0000000000478ce0 0 45 2 Nov 4 18:56:02 poulenc kernel: Call Trace: Nov 4 18:56:02 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:02 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:02 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:02 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:02 poulenc kernel: ksoftirqd/14 S 0000000000478ce0 0 46 2 Nov 4 18:56:02 poulenc kernel: Call Trace: Nov 4 18:56:02 poulenc kernel: [00000000004683c0] 
ksoftirqd+0xa8/0xc0 Nov 4 18:56:02 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:02 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:02 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:02 poulenc kernel: watchdog/14 S 0000000000478ce0 0 47 2 Nov 4 18:56:02 poulenc kernel: Call Trace: Nov 4 18:56:02 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:02 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:02 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:02 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:02 poulenc kernel: migration/15 S 0000000000478ce0 0 48 2 Nov 4 18:56:02 poulenc kernel: Call Trace: Nov 4 18:56:02 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:02 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:02 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:02 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:02 poulenc kernel: ksoftirqd/15 S 0000000000478ce0 0 49 2 Nov 4 18:56:02 poulenc kernel: Call Trace: Nov 4 18:56:02 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:02 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:02 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:02 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:02 poulenc kernel: watchdog/15 S 0000000000478ce0 0 50 2 Nov 4 18:56:02 poulenc kernel: Call Trace: Nov 4 18:56:02 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:02 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:02 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:02 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:02 poulenc kernel: migration/16 S 0000000000478ce0 0 51 2 Nov 4 18:56:03 poulenc kernel: Call Trace: Nov 4 18:56:03 poulenc kernel: [000000000045e60c] 
migration_thread+0x174/0x360 Nov 4 18:56:03 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:03 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:03 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:03 poulenc kernel: ksoftirqd/16 S 0000000000478ce0 0 52 2 Nov 4 18:56:03 poulenc kernel: Call Trace: Nov 4 18:56:03 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:03 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:03 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:03 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:03 poulenc kernel: watchdog/16 S 0000000000478ce0 0 53 2 Nov 4 18:56:03 poulenc kernel: Call Trace: Nov 4 18:56:03 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:03 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:03 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:03 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:03 poulenc kernel: migration/17 R running task 0 54 2 Nov 4 18:56:03 poulenc kernel: ksoftirqd/17 R running task 0 55 2 Nov 4 18:56:03 poulenc kernel: watchdog/17 S 0000000000478ce0 0 56 2 Nov 4 18:56:03 poulenc kernel: Call Trace: Nov 4 18:56:03 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:03 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:03 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:03 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:03 poulenc kernel: migration/18 S 0000000000478ce0 0 57 2 Nov 4 18:56:03 poulenc kernel: Call Trace: Nov 4 18:56:03 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:03 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:03 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:03 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:03 poulenc kernel: 
ksoftirqd/18 S 0000000000478ce0 0 58 2 Nov 4 18:56:03 poulenc kernel: Call Trace: Nov 4 18:56:03 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:03 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:04 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:04 poulenc kernel: watchdog/18 S 0000000000478ce0 0 59 2 Nov 4 18:56:04 poulenc kernel: Call Trace: Nov 4 18:56:04 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:04 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:04 poulenc kernel: migration/19 S 0000000000478ce0 0 60 2 Nov 4 18:56:04 poulenc kernel: Call Trace: Nov 4 18:56:04 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:04 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:04 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:04 poulenc kernel: ksoftirqd/19 S 0000000000478ce0 0 61 2 Nov 4 18:56:04 poulenc kernel: Call Trace: Nov 4 18:56:04 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:04 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:04 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:04 poulenc kernel: watchdog/19 S 0000000000478ce0 0 62 2 Nov 4 18:56:04 poulenc kernel: Call Trace: Nov 4 18:56:04 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:04 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:04 poulenc kernel: migration/20 S 
0000000000478ce0 0 63 2 Nov 4 18:56:04 poulenc kernel: Call Trace: Nov 4 18:56:04 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:04 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:04 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:04 poulenc kernel: ksoftirqd/20 S 0000000000478ce0 0 64 2 Nov 4 18:56:04 poulenc kernel: Call Trace: Nov 4 18:56:04 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:04 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:04 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:04 poulenc kernel: watchdog/20 S 0000000000478ce0 0 65 2 Nov 4 18:56:04 poulenc kernel: Call Trace: Nov 4 18:56:04 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:04 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:04 poulenc kernel: migration/21 S 0000000000478ce0 0 66 2 Nov 4 18:56:04 poulenc kernel: Call Trace: Nov 4 18:56:04 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:04 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:04 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:05 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:05 poulenc kernel: ksoftirqd/21 S 0000000000478ce0 0 67 2 Nov 4 18:56:05 poulenc kernel: Call Trace: Nov 4 18:56:05 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:05 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:05 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:05 poulenc kernel: watchdog/21 S 
0000000000478ce0 0 68 2 Nov 4 18:56:05 poulenc kernel: Call Trace: Nov 4 18:56:05 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:05 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:05 poulenc kernel: migration/22 S 0000000000478ce0 0 69 2 Nov 4 18:56:05 poulenc kernel: Call Trace: Nov 4 18:56:05 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:05 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:05 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:05 poulenc kernel: ksoftirqd/22 S 0000000000478ce0 0 70 2 Nov 4 18:56:05 poulenc kernel: Call Trace: Nov 4 18:56:05 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:05 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:05 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:05 poulenc kernel: watchdog/22 S 0000000000478ce0 0 71 2 Nov 4 18:56:05 poulenc kernel: Call Trace: Nov 4 18:56:05 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:05 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:05 poulenc kernel: migration/23 S 0000000000478ce0 0 72 2 Nov 4 18:56:05 poulenc kernel: Call Trace: Nov 4 18:56:05 poulenc kernel: [000000000045e60c] migration_thread+0x174/0x360 Nov 4 18:56:05 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:05 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:05 poulenc kernel: ksoftirqd/23 S 
0000000000478ce0 0 73 2 Nov 4 18:56:05 poulenc kernel: Call Trace: Nov 4 18:56:05 poulenc kernel: [00000000004683c0] ksoftirqd+0xa8/0xc0 Nov 4 18:56:05 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:05 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:06 poulenc kernel: watchdog/23 S 0000000000478ce0 0 74 2 Nov 4 18:56:06 poulenc kernel: Call Trace: Nov 4 18:56:06 poulenc kernel: [000000000048f9e0] watchdog+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:06 poulenc kernel: events/0 S 0000000000478ce0 0 75 2 Nov 4 18:56:06 poulenc kernel: Call Trace: Nov 4 18:56:06 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:06 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:06 poulenc kernel: events/1 S 0000000000478ce0 0 76 2 Nov 4 18:56:06 poulenc kernel: Call Trace: Nov 4 18:56:06 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:06 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:06 poulenc kernel: events/2 S 0000000000478ce0 0 77 2 Nov 4 18:56:06 poulenc kernel: Call Trace: Nov 4 18:56:06 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:06 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:06 poulenc kernel: events/3 R running task 0 78 2 Nov 4 
18:56:06 poulenc kernel: events/4 S 0000000000478ce0 0 79 2 Nov 4 18:56:06 poulenc kernel: Call Trace: Nov 4 18:56:06 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:06 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:06 poulenc kernel: events/5 S 0000000000478ce0 0 80 2 Nov 4 18:56:06 poulenc kernel: Call Trace: Nov 4 18:56:06 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:06 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:06 poulenc kernel: events/6 S 0000000000478ce0 0 81 2 Nov 4 18:56:06 poulenc kernel: Call Trace: Nov 4 18:56:06 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:06 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:06 poulenc kernel: events/7 S 0000000000478ce0 0 82 2 Nov 4 18:56:06 poulenc kernel: Call Trace: Nov 4 18:56:06 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:06 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:06 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:06 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc kernel: events/8 S 0000000000478ce0 0 83 2 Nov 4 18:56:07 poulenc kernel: Call Trace: Nov 4 18:56:07 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:07 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:07 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:07 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc 
kernel: events/9 S 0000000000478ce0 0 84 2 Nov 4 18:56:07 poulenc kernel: Call Trace: Nov 4 18:56:07 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:07 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:07 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:07 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc kernel: events/10 S 0000000000478ce0 0 85 2 Nov 4 18:56:07 poulenc kernel: Call Trace: Nov 4 18:56:07 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:07 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:07 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:07 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc kernel: events/11 S 0000000000478ce0 0 86 2 Nov 4 18:56:07 poulenc kernel: Call Trace: Nov 4 18:56:07 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:07 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:07 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:07 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc kernel: events/12 S 0000000000478ce0 0 87 2 Nov 4 18:56:07 poulenc kernel: Call Trace: Nov 4 18:56:07 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:07 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:07 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:07 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc kernel: events/13 S 0000000000478ce0 0 88 2 Nov 4 18:56:07 poulenc kernel: Call Trace: Nov 4 18:56:07 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:07 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:07 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:07 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc kernel: 
events/14 S 0000000000478ce0 0 89 2 Nov 4 18:56:07 poulenc kernel: Call Trace: Nov 4 18:56:07 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:07 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:07 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:07 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc kernel: events/15 S 0000000000478ce0 0 90 2 Nov 4 18:56:07 poulenc kernel: Call Trace: Nov 4 18:56:07 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:07 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:07 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:07 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:07 poulenc kernel: events/16 S 0000000000478ce0 0 91 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:08 poulenc kernel: events/17 S 0000000000478ce0 0 92 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:08 poulenc kernel: events/18 S 0000000000478ce0 0 93 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:08 poulenc kernel: events/19 S 
0000000000478ce0 0 94 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:08 poulenc kernel: events/20 S 0000000000478ce0 0 95 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:08 poulenc kernel: events/21 S 0000000000478ce0 0 96 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:08 poulenc kernel: events/22 S 0000000000478ce0 0 97 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:08 poulenc kernel: events/23 S 0000000000478ce0 0 98 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:08 poulenc kernel: khelper S 0000000000478ce0 
0 99 2 Nov 4 18:56:08 poulenc kernel: Call Trace: Nov 4 18:56:08 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:08 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:08 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:08 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:09 poulenc kernel: kblockd/0 S 0000000000478ce0 0 247 2 Nov 4 18:56:09 poulenc kernel: Call Trace: Nov 4 18:56:09 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:09 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:09 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:09 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:09 poulenc kernel: kblockd/1 S 0000000000478ce0 0 248 2 Nov 4 18:56:09 poulenc kernel: Call Trace: Nov 4 18:56:09 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:09 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:09 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:09 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:09 poulenc kernel: kblockd/2 S 0000000000478ce0 0 249 2 Nov 4 18:56:09 poulenc kernel: Call Trace: Nov 4 18:56:09 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:09 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:09 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:09 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:09 poulenc kernel: kblockd/3 R running task 0 250 2 Nov 4 18:56:09 poulenc kernel: kblockd/4 S 0000000000478ce0 0 251 2 Nov 4 18:56:09 poulenc kernel: Call Trace: Nov 4 18:56:09 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:09 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:09 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:09 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 
18:56:09 poulenc kernel: kblockd/5 S 0000000000478ce0 0 252 2 Nov 4 18:56:09 poulenc kernel: Call Trace: Nov 4 18:56:09 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:09 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:09 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:09 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:09 poulenc kernel: kblockd/6 S 0000000000478ce0 0 253 2 Nov 4 18:56:09 poulenc kernel: Call Trace: Nov 4 18:56:09 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:09 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:09 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:10 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:10 poulenc kernel: kblockd/7 S 0000000000478ce0 0 254 2 Nov 4 18:56:10 poulenc kernel: Call Trace: Nov 4 18:56:10 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:10 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:10 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:10 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:10 poulenc kernel: kblockd/8 S 0000000000478ce0 0 255 2 Nov 4 18:56:10 poulenc kernel: Call Trace: Nov 4 18:56:10 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:10 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:10 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:10 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:10 poulenc kernel: kblockd/9 S 0000000000478ce0 0 256 2 Nov 4 18:56:10 poulenc kernel: Call Trace: Nov 4 18:56:10 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:10 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:10 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:10 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:10 
poulenc kernel: kblockd/10 S 0000000000478ce0 0 257 2 Nov 4 18:56:10 poulenc kernel: Call Trace: Nov 4 18:56:10 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:10 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:10 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:10 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:10 poulenc kernel: kblockd/11 S 0000000000478ce0 0 258 2 Nov 4 18:56:10 poulenc kernel: Call Trace: Nov 4 18:56:10 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:10 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:10 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:10 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:10 poulenc kernel: kblockd/12 S 0000000000478ce0 0 259 2 Nov 4 18:56:10 poulenc kernel: Call Trace: Nov 4 18:56:10 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:10 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:10 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:10 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:10 poulenc kernel: kblockd/13 S 0000000000478ce0 0 260 2 Nov 4 18:56:10 poulenc kernel: Call Trace: Nov 4 18:56:10 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:10 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:10 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:10 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:10 poulenc kernel: kblockd/14 S 0000000000478ce0 0 261 2 Nov 4 18:56:10 poulenc kernel: Call Trace: Nov 4 18:56:10 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:10 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:10 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:11 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:11 
poulenc kernel: kblockd/15 S 0000000000478ce0 0 262 2 Nov 4 18:56:11 poulenc kernel: Call Trace: Nov 4 18:56:11 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:11 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:11 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:11 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:11 poulenc kernel: kblockd/16 S 0000000000478ce0 0 263 2 Nov 4 18:56:11 poulenc kernel: Call Trace: Nov 4 18:56:11 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:11 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:11 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:11 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:11 poulenc kernel: kblockd/17 S 0000000000478ce0 0 264 2 Nov 4 18:56:11 poulenc kernel: Call Trace: Nov 4 18:56:11 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:11 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:11 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:11 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:11 poulenc kernel: kblockd/18 S 0000000000478ce0 0 265 2 Nov 4 18:56:11 poulenc kernel: Call Trace: Nov 4 18:56:11 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:11 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:11 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:11 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:11 poulenc kernel: kblockd/19 S 0000000000478ce0 0 266 2 Nov 4 18:56:11 poulenc kernel: Call Trace: Nov 4 18:56:11 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:11 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:11 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:11 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:11 
poulenc kernel: kblockd/20 S 0000000000478ce0 0 267 2 Nov 4 18:56:11 poulenc kernel: Call Trace: Nov 4 18:56:11 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:11 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:11 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:11 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:11 poulenc kernel: kblockd/21 S 0000000000478ce0 0 268 2 Nov 4 18:56:11 poulenc kernel: Call Trace: Nov 4 18:56:11 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:12 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:12 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:12 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:12 poulenc kernel: kblockd/22 S 0000000000478ce0 0 269 2 Nov 4 18:56:12 poulenc kernel: Call Trace: Nov 4 18:56:12 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:12 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:12 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:12 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:12 poulenc kernel: kblockd/23 S 0000000000478ce0 0 270 2 Nov 4 18:56:12 poulenc kernel: Call Trace: Nov 4 18:56:12 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:12 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:12 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:12 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:12 poulenc kernel: pdflush S 0000000000478ce0 0 294 2 Nov 4 18:56:12 poulenc kernel: Call Trace: Nov 4 18:56:12 poulenc kernel: [000000000049a420] pdflush+0xc8/0x200 Nov 4 18:56:12 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:12 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:12 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:12 poulenc 
kernel: pdflush S 0000000000478ce0 0 295 2 Nov 4 18:56:12 poulenc kernel: Call Trace: Nov 4 18:56:12 poulenc kernel: [000000000049a420] pdflush+0xc8/0x200 Nov 4 18:56:12 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:12 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:12 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:12 poulenc kernel: kswapd0 S 0000000000478ce0 0 296 2 Nov 4 18:56:12 poulenc kernel: Call Trace: Nov 4 18:56:12 poulenc kernel: [000000000049e778] kswapd+0x540/0x560 Nov 4 18:56:12 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:12 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:12 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:12 poulenc kernel: aio/0 S 0000000000478ce0 0 297 2 Nov 4 18:56:12 poulenc kernel: Call Trace: Nov 4 18:56:12 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:12 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:12 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:12 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:12 poulenc kernel: aio/1 S 0000000000478ce0 0 298 2 Nov 4 18:56:12 poulenc kernel: Call Trace: Nov 4 18:56:12 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:13 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:13 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:13 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:13 poulenc kernel: aio/2 S 0000000000478ce0 0 299 2 Nov 4 18:56:13 poulenc kernel: Call Trace: Nov 4 18:56:13 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:13 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:13 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:13 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:13 poulenc kernel: aio/3 S 0000000000478ce0 0 
300 2 Nov 4 18:56:13 poulenc kernel: Call Trace: Nov 4 18:56:13 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:13 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:13 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:13 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:13 poulenc kernel: aio/4 S 0000000000478ce0 0 301 2 Nov 4 18:56:13 poulenc kernel: Call Trace: Nov 4 18:56:13 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:13 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:13 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:13 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:13 poulenc kernel: aio/5 S 0000000000478ce0 0 302 2 Nov 4 18:56:13 poulenc kernel: Call Trace: Nov 4 18:56:13 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:13 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:13 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:13 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:13 poulenc kernel: aio/6 S 0000000000478ce0 0 303 2 Nov 4 18:56:13 poulenc kernel: Call Trace: Nov 4 18:56:13 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:13 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:13 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:13 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:13 poulenc kernel: aio/7 S 0000000000478ce0 0 304 2 Nov 4 18:56:13 poulenc kernel: Call Trace: Nov 4 18:56:13 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:13 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:13 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:13 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:14 poulenc kernel: aio/8 S 0000000000478ce0 0 305 2 Nov 4 18:56:14 poulenc 
kernel: Call Trace: Nov 4 18:56:14 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:14 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:14 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:14 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:14 poulenc kernel: aio/9 S 0000000000478ce0 0 306 2 Nov 4 18:56:14 poulenc kernel: Call Trace: Nov 4 18:56:14 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:14 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:14 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:14 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:14 poulenc kernel: aio/10 S 0000000000478ce0 0 307 2 Nov 4 18:56:14 poulenc kernel: Call Trace: Nov 4 18:56:14 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:14 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:14 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:14 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:14 poulenc kernel: aio/11 S 0000000000478ce0 0 308 2 Nov 4 18:56:14 poulenc kernel: Call Trace: Nov 4 18:56:14 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:14 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:14 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:14 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:14 poulenc kernel: aio/12 S 0000000000478ce0 0 309 2 Nov 4 18:56:14 poulenc kernel: Call Trace: Nov 4 18:56:15 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:15 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:15 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:15 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:15 poulenc kernel: aio/13 S 0000000000478ce0 0 310 2 Nov 4 18:56:15 poulenc kernel: Call Trace: Nov 4 
18:56:15 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:15 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:15 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:15 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:15 poulenc kernel: aio/14 S 0000000000478ce0 0 311 2 Nov 4 18:56:15 poulenc kernel: Call Trace: Nov 4 18:56:15 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:15 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:15 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:15 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:15 poulenc kernel: aio/15 S 0000000000478ce0 0 312 2 Nov 4 18:56:15 poulenc kernel: Call Trace: Nov 4 18:56:15 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:15 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:15 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:15 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:15 poulenc kernel: aio/16 S 0000000000478ce0 0 313 2 Nov 4 18:56:15 poulenc kernel: Call Trace: Nov 4 18:56:15 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:15 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:15 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:15 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:15 poulenc kernel: aio/17 S 0000000000478ce0 0 314 2 Nov 4 18:56:15 poulenc kernel: Call Trace: Nov 4 18:56:15 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:15 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:15 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:15 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:15 poulenc kernel: aio/18 S 0000000000478ce0 0 315 2 Nov 4 18:56:15 poulenc kernel: Call Trace: Nov 4 18:56:15 poulenc kernel: 
[0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:15 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:15 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:16 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:16 poulenc kernel: aio/19 S 0000000000478ce0 0 316 2 Nov 4 18:56:16 poulenc kernel: Call Trace: Nov 4 18:56:16 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:16 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:16 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:16 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:16 poulenc kernel: aio/20 S 0000000000478ce0 0 317 2 Nov 4 18:56:16 poulenc kernel: Call Trace: Nov 4 18:56:16 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:16 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:16 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:16 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:16 poulenc kernel: aio/21 S 0000000000478ce0 0 318 2 Nov 4 18:56:16 poulenc kernel: Call Trace: Nov 4 18:56:16 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:16 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:16 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:16 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:16 poulenc kernel: aio/22 S 0000000000478ce0 0 319 2 Nov 4 18:56:16 poulenc kernel: Call Trace: Nov 4 18:56:16 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:16 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:16 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:16 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:16 poulenc kernel: aio/23 S 0000000000478ce0 0 320 2 Nov 4 18:56:16 poulenc kernel: Call Trace: Nov 4 18:56:16 poulenc kernel: [0000000000474c5c] 
worker_thread+0xa4/0xe0 Nov 4 18:56:16 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:16 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:16 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:16 poulenc kernel: scsi_tgtd/0 S 0000000000478ce0 0 911 2 Nov 4 18:56:16 poulenc kernel: Call Trace: Nov 4 18:56:16 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:16 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:16 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:17 poulenc kernel: scsi_tgtd/1 S 0000000000478ce0 0 912 2 Nov 4 18:56:17 poulenc kernel: Call Trace: Nov 4 18:56:17 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:17 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:17 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:17 poulenc kernel: scsi_tgtd/2 S 0000000000478ce0 0 913 2 Nov 4 18:56:17 poulenc kernel: Call Trace: Nov 4 18:56:17 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:17 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:17 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:17 poulenc kernel: scsi_tgtd/3 S 0000000000478ce0 0 914 2 Nov 4 18:56:17 poulenc kernel: Call Trace: Nov 4 18:56:17 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:17 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:17 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:17 poulenc kernel: scsi_tgtd/4 S 0000000000478ce0 0 915 2 Nov 4 18:56:17 poulenc kernel: Call Trace: Nov 4 18:56:17 poulenc kernel: [0000000000474c5c] 
worker_thread+0xa4/0xe0 Nov 4 18:56:17 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:17 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:17 poulenc kernel: scsi_tgtd/5 S 0000000000478ce0 0 916 2 Nov 4 18:56:17 poulenc kernel: Call Trace: Nov 4 18:56:17 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:17 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:17 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:17 poulenc kernel: scsi_tgtd/6 S 0000000000478ce0 0 917 2 Nov 4 18:56:17 poulenc kernel: Call Trace: Nov 4 18:56:17 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:17 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:17 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:17 poulenc kernel: scsi_tgtd/7 S 0000000000478ce0 0 918 2 Nov 4 18:56:17 poulenc kernel: Call Trace: Nov 4 18:56:17 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:17 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:17 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:17 poulenc kernel: scsi_tgtd/8 S 0000000000478ce0 0 919 2 Nov 4 18:56:17 poulenc kernel: Call Trace: Nov 4 18:56:17 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:17 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:17 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:17 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:18 poulenc kernel: scsi_tgtd/9 S 0000000000478ce0 0 920 2 Nov 4 18:56:18 poulenc kernel: Call Trace: Nov 4 18:56:18 poulenc kernel: [0000000000474c5c] 
worker_thread+0xa4/0xe0 Nov 4 18:56:18 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:18 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:18 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:18 poulenc kernel: scsi_tgtd/10 S 0000000000478ce0 0 921 2 Nov 4 18:56:18 poulenc kernel: Call Trace: Nov 4 18:56:18 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:18 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:18 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:18 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:18 poulenc kernel: scsi_tgtd/11 S 0000000000478ce0 0 922 2 Nov 4 18:56:18 poulenc kernel: Call Trace: Nov 4 18:56:18 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:18 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:18 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:18 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:18 poulenc kernel: scsi_tgtd/12 S 0000000000478ce0 0 923 2 Nov 4 18:56:18 poulenc kernel: Call Trace: Nov 4 18:56:18 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:18 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:18 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:18 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:18 poulenc kernel: scsi_tgtd/13 S 0000000000478ce0 0 924 2 Nov 4 18:56:18 poulenc kernel: Call Trace: Nov 4 18:56:18 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:18 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:18 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:18 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:18 poulenc kernel: scsi_tgtd/14 S 0000000000478ce0 0 925 2 Nov 4 18:56:18 poulenc kernel: Call Trace: Nov 4 18:56:18 poulenc kernel: 
[0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:18 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:18 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:18 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:18 poulenc kernel: scsi_tgtd/15 S 0000000000478ce0 0 926 2 Nov 4 18:56:18 poulenc kernel: Call Trace: Nov 4 18:56:18 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:18 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:18 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:18 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:18 poulenc kernel: scsi_tgtd/16 S 0000000000478ce0 0 927 2 Nov 4 18:56:19 poulenc kernel: Call Trace: Nov 4 18:56:19 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:19 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:19 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:19 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:19 poulenc kernel: scsi_tgtd/17 S 0000000000478ce0 0 928 2 Nov 4 18:56:19 poulenc kernel: Call Trace: Nov 4 18:56:19 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:19 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:19 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:19 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:19 poulenc kernel: scsi_tgtd/18 S 0000000000478ce0 0 929 2 Nov 4 18:56:19 poulenc kernel: Call Trace: Nov 4 18:56:19 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:19 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:19 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:19 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:19 poulenc kernel: scsi_tgtd/19 S 0000000000478ce0 0 930 2 Nov 4 18:56:19 poulenc kernel: Call Trace: Nov 4 18:56:19 poulenc 
kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:19 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:19 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:19 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:19 poulenc kernel: scsi_tgtd/20 S 0000000000478ce0 0 931 2 Nov 4 18:56:19 poulenc kernel: Call Trace: Nov 4 18:56:19 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:19 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:19 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:19 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:19 poulenc kernel: scsi_tgtd/21 S 0000000000478ce0 0 932 2 Nov 4 18:56:19 poulenc kernel: Call Trace: Nov 4 18:56:19 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:19 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:19 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:19 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:19 poulenc kernel: scsi_tgtd/22 S 0000000000478ce0 0 933 2 Nov 4 18:56:19 poulenc kernel: Call Trace: Nov 4 18:56:20 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:20 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:20 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:20 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:20 poulenc kernel: scsi_tgtd/23 S 0000000000478ce0 0 934 2 Nov 4 18:56:20 poulenc kernel: Call Trace: Nov 4 18:56:20 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:20 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:20 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:20 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:20 poulenc kernel: scsi_eh_0 S 0000000000478ce0 0 947 2 Nov 4 18:56:20 poulenc kernel: Call Trace: Nov 4 18:56:20 
poulenc kernel: [00000000005ae0a0] scsi_error_handler+0x48/0x5a0 Nov 4 18:56:20 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:20 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:20 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:20 poulenc kernel: md0_raid1 S 00000000005f2ee8 0 991 2 Nov 4 18:56:20 poulenc kernel: Call Trace: Nov 4 18:56:20 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:20 poulenc kernel: [00000000005f2ee8] md_thread+0xf0/0x140 Nov 4 18:56:20 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:20 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:20 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:20 poulenc kernel: kjournald S 0000000000478ce0 0 993 2 Nov 4 18:56:20 poulenc kernel: Call Trace: Nov 4 18:56:20 poulenc kernel: [000000000052da18] kjournald+0x1c0/0x1e0 Nov 4 18:56:20 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:20 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:20 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:20 poulenc kernel: udevd S 00000000004c7d68 0 1091 1 Nov 4 18:56:20 poulenc kernel: Call Trace: Nov 4 18:56:20 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:20 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:20 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:20 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:20 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:20 poulenc kernel: [0000000000013590] 0x13598 Nov 4 18:56:20 poulenc kernel: scsi_eh_1 S 0000000000478ce0 0 1985 2 Nov 4 18:56:20 poulenc kernel: Call Trace: Nov 4 18:56:20 poulenc kernel: [00000000005ae0a0] scsi_error_handler+0x48/0x5a0 Nov 4 18:56:20 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:20 poulenc kernel: 
[00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:20 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:20 poulenc kernel: scsi_eh_2 S 0000000000478ce0 0 2093 2 Nov 4 18:56:20 poulenc kernel: Call Trace: Nov 4 18:56:21 poulenc kernel: [00000000005ae0a0] scsi_error_handler+0x48/0x5a0 Nov 4 18:56:21 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:21 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:21 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:21 poulenc kernel: ksnapd S 0000000000478ce0 0 2718 2 Nov 4 18:56:21 poulenc kernel: Call Trace: Nov 4 18:56:21 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:21 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:21 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:21 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:21 poulenc kernel: md6_raid1 S 00000000005f2ee8 0 2731 2 Nov 4 18:56:21 poulenc kernel: Call Trace: Nov 4 18:56:21 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:21 poulenc kernel: [00000000005f2ee8] md_thread+0xf0/0x140 Nov 4 18:56:21 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:21 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:21 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:21 poulenc kernel: md1_raid1 S 00000000005f2ee8 0 2748 2 Nov 4 18:56:21 poulenc kernel: Call Trace: Nov 4 18:56:21 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:21 poulenc kernel: [00000000005f2ee8] md_thread+0xf0/0x140 Nov 4 18:56:21 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:21 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:21 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:21 poulenc kernel: md2_raid1 S 00000000005f2ee8 0 2753 2 Nov 4 18:56:21 poulenc kernel: Call Trace: Nov 4 18:56:21 poulenc 
kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:21 poulenc kernel: [00000000005f2ee8] md_thread+0xf0/0x140 Nov 4 18:56:21 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:21 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:21 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:21 poulenc kernel: md3_raid1 S 00000000005f2ee8 0 2758 2 Nov 4 18:56:21 poulenc kernel: Call Trace: Nov 4 18:56:21 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:21 poulenc kernel: [00000000005f2ee8] md_thread+0xf0/0x140 Nov 4 18:56:21 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:21 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:21 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:22 poulenc kernel: md4_raid1 S 00000000005f2ee8 0 2763 2 Nov 4 18:56:22 poulenc kernel: Call Trace: Nov 4 18:56:22 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:22 poulenc kernel: [00000000005f2ee8] md_thread+0xf0/0x140 Nov 4 18:56:22 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:22 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:22 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:22 poulenc kernel: md5_raid1 S 00000000005f2ee8 0 2768 2 Nov 4 18:56:22 poulenc kernel: Call Trace: Nov 4 18:56:22 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:22 poulenc kernel: [00000000005f2ee8] md_thread+0xf0/0x140 Nov 4 18:56:22 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:22 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:22 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:22 poulenc kernel: kjournald S 0000000000478ce0 0 2857 2 Nov 4 18:56:22 poulenc kernel: Call Trace: Nov 4 18:56:22 poulenc kernel: [000000000052da18] kjournald+0x1c0/0x1e0 Nov 4 18:56:22 poulenc kernel: [0000000000478ce0] 
kthread+0x48/0x80 Nov 4 18:56:22 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:22 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:22 poulenc kernel: kjournald S 0000000000478ce0 0 2870 2 Nov 4 18:56:22 poulenc kernel: Call Trace: Nov 4 18:56:22 poulenc kernel: [000000000052da18] kjournald+0x1c0/0x1e0 Nov 4 18:56:22 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:22 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:22 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:22 poulenc kernel: kjournald S 0000000000478ce0 0 2871 2 Nov 4 18:56:22 poulenc kernel: Call Trace: Nov 4 18:56:22 poulenc kernel: [000000000052da18] kjournald+0x1c0/0x1e0 Nov 4 18:56:22 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:22 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:22 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:22 poulenc kernel: kjournald S 0000000000478ce0 0 2872 2 Nov 4 18:56:22 poulenc kernel: Call Trace: Nov 4 18:56:22 poulenc kernel: [000000000052da18] kjournald+0x1c0/0x1e0 Nov 4 18:56:22 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:22 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:22 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:23 poulenc kernel: kjournald S 0000000000478ce0 0 2873 2 Nov 4 18:56:23 poulenc kernel: Call Trace: Nov 4 18:56:23 poulenc kernel: [000000000052da18] kjournald+0x1c0/0x1e0 Nov 4 18:56:23 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:23 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:23 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:23 poulenc kernel: portmap S 00000000004c776c 0 2994 1 Nov 4 18:56:23 poulenc kernel: Call Trace: Nov 4 18:56:23 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:23 poulenc kernel: [00000000004c776c] 
do_sys_poll+0x234/0x400 Nov 4 18:56:23 poulenc kernel: [00000000004c7960] sys_poll+0x28/0x60 Nov 4 18:56:23 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:23 poulenc kernel: [00000000700025b8] 0x700025c0 Nov 4 18:56:23 poulenc kernel: rpc.statd S 00000000004c7d68 0 3006 1 Nov 4 18:56:23 poulenc kernel: Call Trace: Nov 4 18:56:23 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:23 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:23 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:23 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:23 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:23 poulenc kernel: [00000000000143f4] 0x143fc Nov 4 18:56:23 poulenc kernel: rpciod/0 S 0000000000478ce0 0 3035 2 Nov 4 18:56:23 poulenc kernel: Call Trace: Nov 4 18:56:23 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:23 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:23 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:23 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:23 poulenc kernel: rpciod/1 S 0000000000478ce0 0 3036 2 Nov 4 18:56:23 poulenc kernel: Call Trace: Nov 4 18:56:23 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:23 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:23 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:23 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:23 poulenc kernel: rpciod/2 S 0000000000478ce0 0 3037 2 Nov 4 18:56:23 poulenc kernel: Call Trace: Nov 4 18:56:23 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:23 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:23 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:23 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 
4 18:56:23 poulenc kernel: rpciod/3 S 0000000000478ce0 0 3038 2 Nov 4 18:56:23 poulenc kernel: Call Trace: Nov 4 18:56:23 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:23 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:23 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:23 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:24 poulenc kernel: rpciod/4 S 0000000000478ce0 0 3039 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:24 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:24 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:24 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:24 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:24 poulenc kernel: rpciod/5 S 0000000000478ce0 0 3040 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:24 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:24 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:24 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:24 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:24 poulenc kernel: rpciod/6 S 0000000000478ce0 0 3041 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:24 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:24 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:24 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:24 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:24 poulenc kernel: rpciod/7 S 0000000000478ce0 0 3042 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:24 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:24 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:24 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:24 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 
18:56:24 poulenc kernel: rpciod/8 S 0000000000478ce0 0 3043 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:24 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:24 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:24 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:24 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:24 poulenc kernel: rpciod/9 S 0000000000478ce0 0 3044 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:24 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:24 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:24 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:24 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:24 poulenc kernel: rpciod/10 S 0000000000478ce0 0 3045 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:24 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:24 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:24 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:24 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:24 poulenc kernel: rpciod/11 S 0000000000478ce0 0 3046 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:24 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:24 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:24 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:24 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:24 poulenc kernel: rpciod/12 S 0000000000478ce0 0 3047 2 Nov 4 18:56:24 poulenc kernel: Call Trace: Nov 4 18:56:25 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:25 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:25 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:25 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 
18:56:25 poulenc kernel: rpciod/13 S 0000000000478ce0 0 3048 2 Nov 4 18:56:25 poulenc kernel: Call Trace: Nov 4 18:56:25 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:25 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:25 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:25 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:25 poulenc kernel: rpciod/14 S 0000000000478ce0 0 3049 2 Nov 4 18:56:25 poulenc kernel: Call Trace: Nov 4 18:56:25 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:25 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:25 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:25 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:25 poulenc kernel: rpciod/15 S 0000000000478ce0 0 3050 2 Nov 4 18:56:25 poulenc kernel: Call Trace: Nov 4 18:56:25 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:25 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:25 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:25 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:25 poulenc kernel: rpciod/16 S 0000000000478ce0 0 3051 2 Nov 4 18:56:25 poulenc kernel: Call Trace: Nov 4 18:56:25 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:25 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:25 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:25 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:25 poulenc kernel: rpciod/17 S 0000000000478ce0 0 3052 2 Nov 4 18:56:25 poulenc kernel: Call Trace: Nov 4 18:56:25 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:25 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:25 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 
18:56:25 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:25 poulenc kernel: rpciod/18 S 0000000000478ce0 0 3053 2 Nov 4 18:56:25 poulenc kernel: Call Trace: Nov 4 18:56:25 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:25 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:25 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:25 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:25 poulenc kernel: rpciod/19 S 0000000000478ce0 0 3054 2 Nov 4 18:56:25 poulenc kernel: Call Trace: Nov 4 18:56:25 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:25 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:25 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:25 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:26 poulenc kernel: rpciod/20 S 0000000000478ce0 0 3055 2 Nov 4 18:56:26 poulenc kernel: Call Trace: Nov 4 18:56:26 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:26 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:26 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:26 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:26 poulenc kernel: rpciod/21 S 0000000000478ce0 0 3056 2 Nov 4 18:56:26 poulenc kernel: Call Trace: Nov 4 18:56:26 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:26 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:26 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:26 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:26 poulenc kernel: rpciod/22 S 0000000000478ce0 0 3057 2 Nov 4 18:56:26 poulenc kernel: Call Trace: Nov 4 18:56:26 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:26 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:26 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 
18:56:26 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:26 poulenc kernel: rpciod/23 S 0000000000478ce0 0 3058 2 Nov 4 18:56:26 poulenc kernel: Call Trace: Nov 4 18:56:26 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:26 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:26 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:26 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:26 poulenc kernel: rpc.idmapd S 00000000004ea8fc 0 3139 1 Nov 4 18:56:26 poulenc kernel: Call Trace: Nov 4 18:56:26 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:26 poulenc kernel: [00000000004ea8fc] sys_epoll_wait+0x144/0x480 Nov 4 18:56:26 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:26 poulenc kernel: [00000000f7f2515c] 0xf7f25164 Nov 4 18:56:26 poulenc kernel: syslogd S 00000000004c7d68 0 3244 1 Nov 4 18:56:26 poulenc kernel: Call Trace: Nov 4 18:56:26 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:26 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:26 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:26 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:26 poulenc kernel: [000000000002a32c] 0x2a334 Nov 4 18:56:26 poulenc kernel: [0000000000014910] 0x14918 Nov 4 18:56:26 poulenc kernel: klogd R running task 0 3254 1 Nov 4 18:56:26 poulenc kernel: named S 00000000004061d4 0 3270 1 Nov 4 18:56:26 poulenc kernel: Call Trace: Nov 4 18:56:26 poulenc kernel: [000000000048b6d0] compat_sys_rt_sigsuspend+0x98/0xe0 Nov 4 18:56:26 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:26 poulenc kernel: [00000000f79bb0b0] 0xf79bb0b8 Nov 4 18:56:26 poulenc kernel: named S 00000000004833bc 0 3271 1 Nov 4 18:56:26 poulenc kernel: Call Trace: Nov 4 18:56:26 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 
18:56:27 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:27 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:27 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:27 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:27 poulenc kernel: named S 00000000004833bc 0 3272 1 Nov 4 18:56:27 poulenc kernel: Call Trace: Nov 4 18:56:27 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:27 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:27 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:27 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:27 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:27 poulenc kernel: named S 00000000004833bc 0 3273 1 Nov 4 18:56:27 poulenc kernel: Call Trace: Nov 4 18:56:27 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:27 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:27 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:27 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:27 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:27 poulenc kernel: named S 00000000004833bc 0 3274 1 Nov 4 18:56:27 poulenc kernel: Call Trace: Nov 4 18:56:27 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:27 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:27 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:27 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:27 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:27 poulenc kernel: named S 00000000004833bc 0 3275 1 Nov 4 18:56:27 poulenc kernel: Call Trace: Nov 4 18:56:27 poulenc kernel: 
[0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:27 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:27 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:27 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:27 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:27 poulenc kernel: named S 00000000004833bc 0 3276 1 Nov 4 18:56:27 poulenc kernel: Call Trace: Nov 4 18:56:27 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:27 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:27 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:27 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:27 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:27 poulenc kernel: named S 00000000004833bc 0 3277 1 Nov 4 18:56:27 poulenc kernel: Call Trace: Nov 4 18:56:27 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:28 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:28 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:28 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:28 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:28 poulenc kernel: named S 00000000004833bc 0 3278 1 Nov 4 18:56:28 poulenc kernel: Call Trace: Nov 4 18:56:28 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:28 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:28 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:28 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:28 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:28 poulenc kernel: named S 00000000004833bc 0 3279 1 Nov 4 18:56:28 poulenc kernel: Call Trace: Nov 4 18:56:28 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:28 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 
Nov 4 18:56:28 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:28 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:28 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:28 poulenc kernel: named S 00000000004833bc 0 3280 1 Nov 4 18:56:28 poulenc kernel: Call Trace: Nov 4 18:56:28 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:28 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:28 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:28 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:28 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:28 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:28 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:28 poulenc kernel: named S 00000000004833bc 0 3281 1 Nov 4 18:56:28 poulenc kernel: Call Trace: Nov 4 18:56:28 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:28 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:28 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:28 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:28 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:28 poulenc kernel: named S 00000000004833bc 0 3282 1 Nov 4 18:56:28 poulenc kernel: Call Trace: Nov 4 18:56:28 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:28 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:28 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:29 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:29 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:29 poulenc kernel: named S 00000000004833bc 0 3283 1 Nov 4 18:56:29 poulenc kernel: Call Trace: Nov 4 18:56:29 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:29 poulenc kernel: 
[00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:29 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:29 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:29 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:29 poulenc kernel: named S 00000000004833bc 0 3284 1 Nov 4 18:56:29 poulenc kernel: Call Trace: Nov 4 18:56:29 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:29 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:29 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:29 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:29 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:29 poulenc kernel: named S 00000000004833bc 0 3285 1 Nov 4 18:56:29 poulenc kernel: Call Trace: Nov 4 18:56:29 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:29 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:29 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:29 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:29 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:29 poulenc kernel: named S 00000000004833bc 0 3286 1 Nov 4 18:56:29 poulenc kernel: Call Trace: Nov 4 18:56:29 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:29 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:29 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:29 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:29 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:29 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:29 poulenc kernel: named S 00000000004833bc 0 3287 1 Nov 4 18:56:29 poulenc kernel: Call Trace: Nov 4 18:56:29 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:29 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 
18:56:29 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:29 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:29 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:29 poulenc kernel: named S 00000000004833bc 0 3288 1 Nov 4 18:56:29 poulenc kernel: Call Trace: Nov 4 18:56:29 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:29 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:29 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:29 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:30 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:30 poulenc kernel: named S 00000000004833bc 0 3289 1 Nov 4 18:56:30 poulenc kernel: Call Trace: Nov 4 18:56:30 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:30 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:30 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:30 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:30 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:30 poulenc kernel: named S 00000000004833bc 0 3290 1 Nov 4 18:56:30 poulenc kernel: Call Trace: Nov 4 18:56:30 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:30 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:30 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:30 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:30 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:30 poulenc kernel: named S 00000000004833bc 0 3291 1 Nov 4 18:56:30 poulenc kernel: Call Trace: Nov 4 18:56:30 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:30 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:30 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:30 poulenc kernel: 
[00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:30 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:30 poulenc kernel: named S 00000000004833bc 0 3292 1 Nov 4 18:56:30 poulenc kernel: Call Trace: Nov 4 18:56:30 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:30 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:30 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:30 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:30 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:30 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:30 poulenc kernel: named S 00000000004833bc 0 3293 1 Nov 4 18:56:30 poulenc kernel: Call Trace: Nov 4 18:56:30 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:30 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:30 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:30 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:30 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:30 poulenc kernel: named S 00000000004833bc 0 3294 1 Nov 4 18:56:30 poulenc kernel: Call Trace: Nov 4 18:56:30 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:30 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:31 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:31 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:31 poulenc kernel: [00000000f7afb2e4] 0xf7afb2ec Nov 4 18:56:31 poulenc kernel: named S 00000000004833bc 0 3295 1 Nov 4 18:56:31 poulenc kernel: Call Trace: Nov 4 18:56:31 poulenc kernel: [0000000000482e1c] futex_wait+0x1c4/0x2c0 Nov 4 18:56:31 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:31 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:31 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 
Nov 4 18:56:31 poulenc kernel: [00000000f7afb638] 0xf7afb640 Nov 4 18:56:31 poulenc kernel: named S 00000000004c7d68 0 3296 1 Nov 4 18:56:31 poulenc kernel: Call Trace: Nov 4 18:56:31 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:31 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:31 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:31 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:31 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:31 poulenc kernel: [00000000f7a60070] 0xf7a60078 Nov 4 18:56:31 poulenc kernel: rpc.bootparam S 00000000004c776c 0 3518 1 Nov 4 18:56:31 poulenc kernel: Call Trace: Nov 4 18:56:31 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:31 poulenc kernel: [00000000004c776c] do_sys_poll+0x234/0x400 Nov 4 18:56:31 poulenc kernel: [00000000004c7960] sys_poll+0x28/0x60 Nov 4 18:56:31 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:31 poulenc kernel: [0000000000011bbc] 0x11bc4 Nov 4 18:56:31 poulenc kernel: hddtemp S 00000000004c7d68 0 3576 1 Nov 4 18:56:31 poulenc kernel: Call Trace: Nov 4 18:56:31 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:31 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:31 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:31 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:31 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:31 poulenc kernel: [000000000001223c] 0x12244 Nov 4 18:56:31 poulenc kernel: lockd S 000000001008aff4 0 3681 2 Nov 4 18:56:31 poulenc kernel: Call Trace: Nov 4 18:56:31 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:31 poulenc kernel: [000000001008aff4] svc_recv+0x21c/0x4e0 [sunrpc] Nov 4 18:56:31 poulenc kernel: [00000000100baef8] lockd+0x120/0x300 
[lockd] Nov 4 18:56:31 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:32 poulenc kernel: [0000000010087e10] __svc_create_thread+0x118/0x220 [sunrpc] Nov 4 18:56:32 poulenc kernel: nfsd4 S 0000000000478ce0 0 3682 2 Nov 4 18:56:32 poulenc kernel: Call Trace: Nov 4 18:56:32 poulenc kernel: [0000000000474c5c] worker_thread+0xa4/0xe0 Nov 4 18:56:32 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:32 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:32 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:32 poulenc kernel: nfsd S 000000001008aff4 0 3683 2 Nov 4 18:56:32 poulenc kernel: Call Trace: Nov 4 18:56:32 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:32 poulenc kernel: [000000001008aff4] svc_recv+0x21c/0x4e0 [sunrpc] Nov 4 18:56:32 poulenc kernel: [0000000010150aac] nfsd+0xb4/0x300 [nfsd] Nov 4 18:56:32 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:32 poulenc kernel: [0000000010087e10] __svc_create_thread+0x118/0x220 [sunrpc] Nov 4 18:56:32 poulenc kernel: rpc.mountd S 00000000004c7d68 0 3694 1 Nov 4 18:56:32 poulenc kernel: Call Trace: Nov 4 18:56:32 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:32 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:32 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:32 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:32 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:32 poulenc kernel: [0000000000016708] 0x16710 Nov 4 18:56:32 poulenc kernel: iscsid S 000000000048c0d4 0 3785 1 Nov 4 18:56:32 poulenc kernel: Call Trace: Nov 4 18:56:32 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:32 poulenc kernel: [000000000048c0d4] compat_sys_nanosleep+0x7c/0xe0 Nov 4 18:56:32 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:32 
poulenc kernel: [00000000f7ec168c] 0xf7ec1694 Nov 4 18:56:32 poulenc kernel: iscsid S 00000000004c776c 0 3786 1 Nov 4 18:56:32 poulenc kernel: Call Trace: Nov 4 18:56:32 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:32 poulenc kernel: [00000000004c776c] do_sys_poll+0x234/0x400 Nov 4 18:56:32 poulenc kernel: [00000000004c7960] sys_poll+0x28/0x60 Nov 4 18:56:32 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:32 poulenc kernel: [0000000000045bf0] 0x45bf8 Nov 4 18:56:32 poulenc kernel: inetd S 00000000004c7d68 0 3802 1 Nov 4 18:56:32 poulenc kernel: Call Trace: Nov 4 18:56:32 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:32 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:32 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:32 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:32 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:32 poulenc kernel: [00000000000155d8] 0x155e0 Nov 4 18:56:32 poulenc kernel: rarpd S 00000000004c776c 0 3809 1 Nov 4 18:56:32 poulenc kernel: Call Trace: Nov 4 18:56:32 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:33 poulenc kernel: [00000000004c776c] do_sys_poll+0x234/0x400 Nov 4 18:56:33 poulenc kernel: [00000000004c7960] sys_poll+0x28/0x60 Nov 4 18:56:33 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:33 poulenc kernel: [00000000f7da7234] 0xf7da723c Nov 4 18:56:33 poulenc kernel: smartd S 000000000048c0d4 0 3814 1 Nov 4 18:56:33 poulenc kernel: Call Trace: Nov 4 18:56:33 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:33 poulenc kernel: [000000000048c0d4] compat_sys_nanosleep+0x7c/0xe0 Nov 4 18:56:33 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:33 poulenc kernel: [00000000f7c5d68c] 0xf7c5d694 Nov 4 18:56:33 poulenc kernel: snmpd S 
00000000004c7d68 0 3823 1 Nov 4 18:56:33 poulenc kernel: Call Trace: Nov 4 18:56:33 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:33 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:33 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:33 poulenc kernel: [00000000004eef74] compat_sys_select+0xbc/0x1a0 Nov 4 18:56:33 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:33 poulenc kernel: [0000000000012e64] 0x12e6c Nov 4 18:56:33 poulenc kernel: sendmail-mta S 00000000004c7d68 0 3880 1 Nov 4 18:56:33 poulenc kernel: Call Trace: Nov 4 18:56:33 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:33 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:33 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:33 poulenc kernel: [00000000004eef74] compat_sys_select+0xbc/0x1a0 Nov 4 18:56:33 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:33 poulenc kernel: [000000007003b1d8] 0x7003b1e0 Nov 4 18:56:33 poulenc kernel: ntpd S 00000000004c7d68 0 3909 1 Nov 4 18:56:33 poulenc kernel: Call Trace: Nov 4 18:56:33 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:33 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:33 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:33 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:33 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:33 poulenc kernel: [000000000001aea8] 0x1aeb0 Nov 4 18:56:33 poulenc kernel: mdadm S 00000000004c7d68 0 3922 1 Nov 4 18:56:33 poulenc kernel: Call Trace: Nov 4 18:56:33 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:33 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:33 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 
4 18:56:33 poulenc kernel: [00000000004eef74] compat_sys_select+0xbc/0x1a0 Nov 4 18:56:33 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:33 poulenc kernel: [0000000000016cc0] 0x16cc8 Nov 4 18:56:33 poulenc kernel: rsync S 00000000004c7d68 0 3943 1 Nov 4 18:56:34 poulenc kernel: Call Trace: Nov 4 18:56:34 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:34 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:34 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:34 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:34 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:34 poulenc kernel: [0000000000035418] 0x35420 Nov 4 18:56:34 poulenc kernel: atd S 000000000048c0d4 0 3959 1 Nov 4 18:56:34 poulenc kernel: Call Trace: Nov 4 18:56:34 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:34 poulenc kernel: [000000000048c0d4] compat_sys_nanosleep+0x7c/0xe0 Nov 4 18:56:34 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:34 poulenc kernel: [00000000f7e6d68c] 0xf7e6d694 Nov 4 18:56:34 poulenc kernel: cron S 000000000048c0d4 0 3966 1 Nov 4 18:56:34 poulenc kernel: Call Trace: Nov 4 18:56:34 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:34 poulenc kernel: [000000000048c0d4] compat_sys_nanosleep+0x7c/0xe0 Nov 4 18:56:34 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:34 poulenc kernel: [00000000f7df968c] 0xf7df9694 Nov 4 18:56:34 poulenc kernel: watchdog S 000000000048c0d4 0 3976 1 Nov 4 18:56:34 poulenc kernel: Call Trace: Nov 4 18:56:34 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:34 poulenc kernel: [000000000048c0d4] compat_sys_nanosleep+0x7c/0xe0 Nov 4 18:56:34 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:34 poulenc kernel: [00000000f7e4568c] 
0xf7e45694 Nov 4 18:56:34 poulenc kernel: apache2 S 00000000004c7d68 0 3994 1 Nov 4 18:56:34 poulenc kernel: Call Trace: Nov 4 18:56:34 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:34 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:34 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:34 poulenc kernel: [00000000004eef74] compat_sys_select+0xbc/0x1a0 Nov 4 18:56:34 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:34 poulenc kernel: [00000000f7be6c74] 0xf7be6c7c Nov 4 18:56:34 poulenc kernel: fail2ban-serv S 00000000004c7d68 0 4011 1 Nov 4 18:56:34 poulenc kernel: Call Trace: Nov 4 18:56:34 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:34 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:34 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:34 poulenc kernel: [00000000004eef74] compat_sys_select+0xbc/0x1a0 Nov 4 18:56:34 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:34 poulenc kernel: [00000000f7dc0070] 0xf7dc0078 Nov 4 18:56:34 poulenc kernel: fail2ban-serv S 00000000004c776c 0 4012 1 Nov 4 18:56:34 poulenc kernel: Call Trace: Nov 4 18:56:34 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:34 poulenc kernel: [00000000004c776c] do_sys_poll+0x234/0x400 Nov 4 18:56:34 poulenc kernel: [00000000004c7960] sys_poll+0x28/0x60 Nov 4 18:56:34 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:34 poulenc kernel: [00000000f7dbd0d4] 0xf7dbd0dc Nov 4 18:56:34 poulenc kernel: fail2ban-serv S 00000000004c7d68 0 4038 1 Nov 4 18:56:34 poulenc kernel: Call Trace: Nov 4 18:56:34 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:34 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:35 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:35 
poulenc kernel: [00000000004eef74] compat_sys_select+0xbc/0x1a0 Nov 4 18:56:35 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:35 poulenc kernel: [00000000f7dc0070] 0xf7dc0078 Nov 4 18:56:35 poulenc kernel: fail2ban-serv S 00000000004c7d68 0 4039 1 Nov 4 18:56:35 poulenc kernel: Call Trace: Nov 4 18:56:35 poulenc kernel: [000000000067ec0c] schedule_timeout+0x54/0xc0 Nov 4 18:56:35 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:35 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:35 poulenc kernel: [00000000004eef74] compat_sys_select+0xbc/0x1a0 Nov 4 18:56:35 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:35 poulenc kernel: [00000000f7dc0070] 0xf7dc0078 Nov 4 18:56:35 poulenc kernel: apache2 S 00000000004ea8fc 0 4071 3994 Nov 4 18:56:35 poulenc kernel: Call Trace: Nov 4 18:56:35 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:35 poulenc kernel: [00000000004ea8fc] sys_epoll_wait+0x144/0x480 Nov 4 18:56:35 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:35 poulenc kernel: [00000000f7be2710] 0xf7be2718 Nov 4 18:56:35 poulenc kernel: apache2 S 00000000004061d4 0 4072 3994 Nov 4 18:56:35 poulenc kernel: Call Trace: Nov 4 18:56:35 poulenc kernel: [00000000005333bc] sys_semtimedop+0x564/0x660 Nov 4 18:56:35 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:35 poulenc kernel: [00000000f7bdaea8] 0xf7bdaeb0 Nov 4 18:56:35 poulenc kernel: apache2 S 00000000004061d4 0 4073 3994 Nov 4 18:56:35 poulenc kernel: Call Trace: Nov 4 18:56:35 poulenc kernel: [00000000005333bc] sys_semtimedop+0x564/0x660 Nov 4 18:56:35 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:35 poulenc kernel: [00000000f7bdaea8] 0xf7bdaeb0 Nov 4 18:56:35 poulenc kernel: apache2 S 00000000004061d4 0 4074 3994 Nov 4 18:56:35 poulenc kernel: Call Trace: Nov 4 18:56:35 poulenc kernel: 
[00000000005333bc] sys_semtimedop+0x564/0x660 Nov 4 18:56:35 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:35 poulenc kernel: [00000000f7bdaea8] 0xf7bdaeb0 Nov 4 18:56:35 poulenc kernel: apache2 S 00000000004061d4 0 4075 3994 Nov 4 18:56:35 poulenc kernel: Call Trace: Nov 4 18:56:35 poulenc kernel: [00000000005333bc] sys_semtimedop+0x564/0x660 Nov 4 18:56:35 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:35 poulenc kernel: [00000000f7bdaea8] 0xf7bdaeb0 Nov 4 18:56:35 poulenc kernel: login S 000000000048bda8 0 4146 1 Nov 4 18:56:35 poulenc kernel: Call Trace: Nov 4 18:56:35 poulenc kernel: [00000000004656dc] do_wait+0x264/0xda0 Nov 4 18:56:36 poulenc kernel: [000000000048bda8] compat_sys_wait4+0xb0/0xc0 Nov 4 18:56:36 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:36 poulenc kernel: [00000000f7df3234] 0xf7df323c Nov 4 18:56:36 poulenc kernel: bash R running task 0 4173 4146 Nov 4 18:56:36 poulenc kernel: sshd S 00000000004c7d68 0 4330 1 Nov 4 18:56:36 poulenc kernel: Call Trace: Nov 4 18:56:36 poulenc kernel: [000000000067ec30] schedule_timeout+0x78/0xc0 Nov 4 18:56:36 poulenc kernel: [00000000004c7d68] do_select+0x3d0/0x420 Nov 4 18:56:36 poulenc kernel: [00000000004eccb8] compat_core_sys_select+0x160/0x200 Nov 4 18:56:36 poulenc kernel: [00000000004eeee4] compat_sys_select+0x2c/0x1a0 Nov 4 18:56:36 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:36 poulenc kernel: [0000000070017e1c] 0x70017e24 Nov 4 18:56:36 poulenc kernel: md_d0_raid5 R running task 0 4462 2 Nov 4 18:56:36 poulenc kernel: ietd S 000000000067dac4 0 8227 1 Nov 4 18:56:36 poulenc kernel: Call Trace: Nov 4 18:56:36 poulenc kernel: [000000000067d9dc] __down_interruptible+0xa4/0x1c0 Nov 4 18:56:36 poulenc kernel: [000000000067dac4] __down_interruptible+0x18c/0x1c0 Nov 4 18:56:36 poulenc kernel: [00000000102204d0] ioctl+0x58/0x5e0 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: 
[00000000004f0484] compat_sys_ioctl+0x14c/0x460 Nov 4 18:56:36 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:36 poulenc kernel: [000000000001532c] 0x15334 Nov 4 18:56:36 poulenc kernel: istd1 R running task 0 8228 2 Nov 4 18:56:36 poulenc kernel: istiod1 D 00000000102262c8 0 8229 2 Nov 4 18:56:36 poulenc kernel: Call Trace: Nov 4 18:56:36 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:36 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:36 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:36 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:36 poulenc kernel: istiod1 D 00000000102262c8 0 8230 2 Nov 4 18:56:36 poulenc kernel: Call Trace: Nov 4 18:56:36 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:36 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:36 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 
18:56:37 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:37 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:37 poulenc kernel: istiod1 D 00000000102262c8 0 8231 2 Nov 4 18:56:37 poulenc kernel: Call Trace: Nov 4 18:56:37 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:37 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:37 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:37 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:37 poulenc kernel: istiod1 D 00000000102262c8 0 8232 2 Nov 4 18:56:37 poulenc kernel: Call Trace: Nov 4 18:56:37 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:37 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:37 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:37 poulenc kernel: 
[0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:37 poulenc kernel: istiod1 D 00000000102262c8 0 8233 2 Nov 4 18:56:37 poulenc kernel: Call Trace: Nov 4 18:56:37 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:37 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:37 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:37 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:37 poulenc kernel: istiod1 D 00000000102262c8 0 8234 2 Nov 4 18:56:37 poulenc kernel: Call Trace: Nov 4 18:56:37 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:37 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:37 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:38 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:38 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:38 poulenc kernel: istiod1 D 00000000102262c8 0 8235 2 
Nov 4 18:56:38 poulenc kernel: Call Trace: Nov 4 18:56:38 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:38 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:38 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:38 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:38 poulenc kernel: istiod1 D 00000000102262c8 0 8236 2 Nov 4 18:56:38 poulenc kernel: Call Trace: Nov 4 18:56:38 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:38 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:38 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:38 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:38 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:38 poulenc kernel: identd S 00000000004c776c 0 8394 3802 Nov 4 18:56:38 poulenc kernel: Call Trace: Nov 4 18:56:38 poulenc kernel: [000000000067ec0c] 
schedule_timeout+0x54/0xc0 Nov 4 18:56:38 poulenc kernel: [00000000004c776c] do_sys_poll+0x234/0x400 Nov 4 18:56:38 poulenc kernel: [00000000004c7960] sys_poll+0x28/0x60 Nov 4 18:56:38 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:38 poulenc kernel: [00000000f7ce50d4] 0xf7ce50dc Nov 4 18:56:38 poulenc kernel: identd S 00000000004833bc 0 8395 3802 Nov 4 18:56:38 poulenc kernel: Call Trace: Nov 4 18:56:38 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:38 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:38 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:38 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:38 poulenc kernel: [00000000f7ee72e4] 0xf7ee72ec Nov 4 18:56:38 poulenc kernel: identd S 00000000004833bc 0 8396 3802 Nov 4 18:56:38 poulenc kernel: Call Trace: Nov 4 18:56:38 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:38 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:38 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:38 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:38 poulenc kernel: [00000000f7ee72e4] 0xf7ee72ec Nov 4 18:56:38 poulenc kernel: identd S 00000000004833bc 0 8397 3802 Nov 4 18:56:38 poulenc kernel: Call Trace: Nov 4 18:56:39 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:39 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:39 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:39 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:39 poulenc kernel: [00000000f7ee72e4] 0xf7ee72ec Nov 4 18:56:39 poulenc kernel: identd S 00000000004833bc 0 8398 3802 Nov 4 18:56:39 poulenc kernel: Call Trace: Nov 4 18:56:39 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:39 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 
4 18:56:39 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:39 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:39 poulenc kernel: [00000000f7ee72e4] 0xf7ee72ec Nov 4 18:56:39 poulenc kernel: identd S 00000000004833bc 0 8399 3802 Nov 4 18:56:39 poulenc kernel: Call Trace: Nov 4 18:56:39 poulenc kernel: [0000000000482ec0] futex_wait+0x268/0x2c0 Nov 4 18:56:39 poulenc kernel: [00000000004833bc] do_futex+0x64/0xbc0 Nov 4 18:56:39 poulenc kernel: [00000000004843fc] compat_sys_futex+0x64/0x120 Nov 4 18:56:39 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:39 poulenc kernel: [00000000f7ee72e4] 0xf7ee72ec Nov 4 18:56:39 poulenc kernel: watchdog ? 0000000000466d4c 0 8412 3976 Nov 4 18:56:39 poulenc kernel: Call Trace: Nov 4 18:56:39 poulenc kernel: [00000000004669f0] do_exit+0x6d8/0xa00 Nov 4 18:56:39 poulenc kernel: [0000000000466d4c] do_group_exit+0x34/0xa0 Nov 4 18:56:39 poulenc kernel: [00000000004061d4] linux_sparc_syscall32+0x3c/0x40 Nov 4 18:56:39 poulenc kernel: [000000000001b264] 0x1b26c Nov 4 18:56:46 poulenc kernel: SysRq : Show Blocked State Nov 4 18:56:46 poulenc kernel: task PC stack pid father Nov 4 18:56:46 poulenc kernel: istiod1 D 00000000102262c8 0 8229 2 Nov 4 18:56:46 poulenc kernel: Call Trace: Nov 4 18:56:46 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:46 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:46 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:46 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000000478ce0] 
kthread+0x48/0x80 Nov 4 18:56:47 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:47 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:47 poulenc kernel: istiod1 D 00000000102262c8 0 8230 2 Nov 4 18:56:47 poulenc kernel: Call Trace: Nov 4 18:56:47 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:47 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:47 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:47 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:47 poulenc kernel: istiod1 D 00000000102262c8 0 8231 2 Nov 4 18:56:47 poulenc kernel: Call Trace: Nov 4 18:56:47 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:47 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:47 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:47 
poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:47 poulenc kernel: istiod1 D 00000000102262c8 0 8232 2 Nov 4 18:56:47 poulenc kernel: Call Trace: Nov 4 18:56:47 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:47 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:47 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:48 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:48 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:48 poulenc kernel: istiod1 D 00000000102262c8 0 8233 2 Nov 4 18:56:48 poulenc kernel: Call Trace: Nov 4 18:56:48 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:48 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:48 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:48 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:48 poulenc kernel: istiod1 D 
00000000102262c8 0 8234 2 Nov 4 18:56:48 poulenc kernel: Call Trace: Nov 4 18:56:48 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:48 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:48 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:48 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:48 poulenc kernel: istiod1 D 00000000102262c8 0 8235 2 Nov 4 18:56:48 poulenc kernel: Call Trace: Nov 4 18:56:48 poulenc kernel: [000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:48 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:48 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:48 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:48 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 Nov 4 18:56:49 poulenc kernel: istiod1 D 00000000102262c8 0 8236 2 Nov 4 18:56:49 poulenc kernel: Call Trace: Nov 4 18:56:49 poulenc kernel: 
[000000000067e4cc] wait_for_completion+0x74/0xe0 Nov 4 18:56:49 poulenc kernel: [00000000102262c8] blockio_make_request+0x1d0/0x24c [iscsi_trgt] Nov 4 18:56:49 poulenc kernel: [000000001021a140] tio_write+0x28/0x80 [iscsi_trgt] Nov 4 18:56:49 poulenc kernel: [0000000010223df8] build_write_response+0x40/0xe0 [iscsi_trgt] Nov 4 18:56:49 poulenc kernel: [000000001021e444] send_scsi_rsp+0xc/0x120 [iscsi_trgt] Nov 4 18:56:49 poulenc kernel: [0000000010223c30] disk_execute_cmnd+0x158/0x220 [iscsi_trgt] Nov 4 18:56:49 poulenc kernel: [0000000010220330] worker_thread+0x118/0x1a0 [iscsi_trgt] Nov 4 18:56:49 poulenc kernel: [0000000000478ce0] kthread+0x48/0x80 Nov 4 18:56:49 poulenc kernel: [00000000004273d0] kernel_thread+0x38/0x60 Nov 4 18:56:49 poulenc kernel: [0000000000478f80] kthreadd+0x148/0x1c0 From owner-xfs@oss.sgi.com Sun Nov 4 14:01:50 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 14:01:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.2 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_33 autolearn=no version=3.3.0-r574664 Received: from mail.ukfsn.org (s2.ukfsn.org [217.158.120.143]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA4M1nsb022151 for ; Sun, 4 Nov 2007 14:01:50 -0800 Received: from localhost (localhost [127.0.0.1]) by mail.ukfsn.org (Postfix) with ESMTP id 25A19DECC0; Sun, 4 Nov 2007 21:44:59 +0000 (GMT) Received: from mail.ukfsn.org ([127.0.0.1]) by localhost (smtp-filter.ukfsn.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PycALoFNZBT8; Sun, 4 Nov 2007 21:44:58 +0000 (GMT) Received: from elm.dgreaves.com (i-83-67-36-194.freedom2surf.net [83.67.36.194]) by mail.ukfsn.org (Postfix) with ESMTP id D4000DED3E; Sun, 4 Nov 2007 21:44:09 +0000 (GMT) Received: from ash.dgreaves.com ([10.0.0.90]) by elm.dgreaves.com with esmtp (Exim 4.62) (envelope-from ) id 1IonCp-0002DL-4r; Sun, 04 Nov 2007 21:40:31 +0000 Message-ID: 
<472E3C4B.5010904@dgreaves.com> Date: Sun, 04 Nov 2007 21:40:27 +0000 From: David Greaves User-Agent: Mozilla-Thunderbird 2.0.0.6 (X11/20071009) MIME-Version: 1.0 To: Michael Tokarev CC: Justin Piszcz , linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state References: <472DBF8C.2060508@msgid.tls.msk.ru> <472DDD78.7040002@msgid.tls.msk.ru> In-Reply-To: <472DDD78.7040002@msgid.tls.msk.ru> X-Enigmail-Version: 0.95.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4672/Sun Nov 4 03:38:42 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13547 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: david@dgreaves.com Precedence: bulk X-list: xfs Michael Tokarev wrote: > Justin Piszcz wrote: >> On Sun, 4 Nov 2007, Michael Tokarev wrote: > [] >>> The next time you come across something like that, do a SysRq-T dump and >>> post that. It shows a stack trace of all processes - and in particular, >>> where exactly each task is stuck. > >> Yes I got it before I rebooted, ran that and then dmesg > file. >> >> Here it is: >> >> [1172609.665902] ffffffff80747dc0 ffffffff80747dc0 ffffffff80747dc0 ffffffff80744d80 >> [1172609.668768] ffffffff80747dc0 ffff81015c3aa918 ffff810091c899b4 ffff810091c899a8 > > That's only a partial list. All the kernel threads - which are most important > in this context - aren't shown. You ran out of dmesg buffer, and the most > interesting entries were at the beginning. If your /var/log partition is > working, the stuff should be in /var/log/kern.log or equivalent. If it's > not working, there is a way to capture the info still, by stopping syslogd, > cat'ing /proc/kmsg to some tmpfs file and scp'ing it elsewhere.
or netconsole is actually pretty easy and incredibly useful in this kind of situation even if there's no disk at all :) David From owner-xfs@oss.sgi.com Sun Nov 4 16:01:14 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 16:01:21 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=BAYES_50,DATE_IN_PAST_12_24, MIME_QP_LONG_LINE,RCVD_IN_DNSWL_MED,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664 Received: from b.mx.filmlight.ltd.uk (bongo.filmlight.ltd.uk [217.40.27.26]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA501Cbf001302 for ; Sun, 4 Nov 2007 16:01:13 -0800 Received: (dqd 14871 invoked from network); 5 Nov 2007 00:01:16 -0000 Received: from unknown (HELO BODDINGTON) (roger@62.49.60.134) by b.mx.filmlight.ltd.uk with SMTP; 5 Nov 2007 00:01:16 -0000 Message-ID: <000001c81f3e$eff344b0$6501a8c0@BODDINGTON> From: "Roger Willcocks" To: "Timothy Shimmin" Cc: References: <47249E7A.7060709@filmlight.ltd.uk> <47252F62.6030503@sgi.com> <47262CD0.5010708@filmlight.ltd.uk> <4726ADAE.9070206@sgi.com> <472769A1.5090605@filmlight.ltd.uk> <472A7940.5070800@sgi.com> Subject: Re: bug: truncate to zero + setuid Date: Sun, 4 Nov 2007 11:59:51 -0000 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_0645_01C81EDA.353308E0" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.3138 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198 X-Virus-Scanned: ClamAV 0.91.2/4673/Sun Nov 4 14:22:25 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13548 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: roger@filmlight.ltd.uk Precedence: bulk X-list: xfs This is a multi-part message in MIME format. 
------=_NextPart_000_0645_01C81EDA.353308E0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=response Content-Transfer-Encoding: 7bit Timothy Shimmin wrote: > Hi Roger, > ... > I don't like all these inconsistencies. Take a look at the attached patch relative to the current cvs (it's a bit big to put inline). The basic problem is it's currently unclear when to set the times from va_atime etc. and when to set them to the current time. So I've used the already defined XFS_AT_UPDxTIME flags to indicate that a time should be set to 'now' and XFS_AT_xTIME to mean set it using va_xtime. This seems to fit well with the current code and I wonder if that's how it was meant to work in the first place. I've also removed the now redundant ATTR_UTIME flag and pulled the null truncate to the top, which simplifies things. One query: in both xfs_iops.c/xfs_vn_setattr and xfs_dm.c/xfs_dm_set_fileattr the ATIME branch sets the inode's atime directly. This is probably something to do with the comment above xfs_iops.c/xfs_ichgtime ('to make sure the access time update will take') but it could probably be handled better. > BTW, your locking looks wrong - it appears you don't unlock when the > file is non-zero size. Oops... 
-- Roger ------=_NextPart_000_0645_01C81EDA.353308E0 Content-Type: application/octet-stream; name="xfs_setattr.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="xfs_setattr.patch" diff -ur xfs.orig/linux-2.6/xfs_iops.c xfs/linux-2.6/xfs_iops.c --- xfs.orig/linux-2.6/xfs_iops.c 2007-11-04 10:59:00.923480296 +0000 +++ xfs/linux-2.6/xfs_iops.c 2007-11-04 12:43:15.702609336 +0000 @@ -655,13 +655,21 @@ vattr.va_size = attr->ia_size; } if (ia_valid & ATTR_ATIME) { - vattr.va_mask |= XFS_AT_ATIME; - vattr.va_atime = attr->ia_atime; - inode->i_atime = attr->ia_atime; + if (ia_valid & ATTR_ATIME_SET) { + vattr.va_mask |= XFS_AT_ATIME; + vattr.va_atime = attr->ia_atime; + inode->i_atime = attr->ia_atime; + } else { + vattr.va_mask |= XFS_AT_UPDATIME; + } } if (ia_valid & ATTR_MTIME) { - vattr.va_mask |= XFS_AT_MTIME; - vattr.va_mtime = attr->ia_mtime; + if (ia_valid & ATTR_MTIME_SET) { + vattr.va_mask |= XFS_AT_MTIME; + vattr.va_mtime = attr->ia_mtime; + } else { + vattr.va_mask |= XFS_AT_UPDMTIME; + } } if (ia_valid & ATTR_CTIME) { vattr.va_mask |= XFS_AT_CTIME; @@ -674,8 +682,6 @@ inode->i_mode &= ~S_ISGID; } - if (ia_valid & (ATTR_MTIME_SET | ATTR_ATIME_SET)) - flags |= ATTR_UTIME; #ifdef ATTR_NO_BLOCK if ((ia_valid & ATTR_NO_BLOCK)) flags |= ATTR_NONBLOCK; diff -ur xfs.orig/linux-2.6/xfs_vnode.h xfs/linux-2.6/xfs_vnode.h --- xfs.orig/linux-2.6/xfs_vnode.h 2007-11-04 10:59:00.923480296 +0000 +++ xfs/linux-2.6/xfs_vnode.h 2007-11-04 11:01:33.338309720 +0000 @@ -270,7 +270,6 @@ /* * Flags to vop_setattr/getattr.
*/ -#define ATTR_UTIME 0x01 /* non-default utime(2) request */ #define ATTR_DMI 0x08 /* invocation from a DMI function */ #define ATTR_LAZY 0x80 /* set/get attributes lazily */ #define ATTR_NONBLOCK 0x100 /* return EAGAIN if operation would block */ diff -ur xfs.orig/xfs_vnodeops.c xfs/xfs_vnodeops.c --- xfs.orig/xfs_vnodeops.c 2007-11-04 10:59:00.917481208 +0000 +++ xfs/xfs_vnodeops.c 2007-11-04 12:07:44.917537904 +0000 @@ -214,10 +214,10 @@ { bhv_vnode_t *vp = XFS_ITOV(ip); xfs_mount_t *mp = ip->i_mount; - xfs_trans_t *tp; + xfs_trans_t *tp = NULL; int mask; int code; - uint lock_flags; + uint lock_flags=0; uint commit_flags=0; uid_t uid=0, iuid=0; gid_t gid=0, igid=0; @@ -244,21 +244,51 @@ if (XFS_FORCED_SHUTDOWN(mp)) return XFS_ERROR(EIO); + olddquot1 = olddquot2 = NULL; + udqp = gdqp = NULL; + + /* + * Truncate is special because it changes the file as well as + * the attributes. + */ + if (mask & XFS_AT_SIZE) { + /* Must have write permission and not be a directory. */ + if (VN_ISDIR(vp)) { + code = XFS_ERROR(EISDIR); + goto error_return; + } else if (!VN_ISREG(vp)) { + code = XFS_ERROR(EINVAL); + goto error_return; + } + /* + * Short circuit the truncate case for zero length files. + */ + if (vap->va_size == 0) { + xfs_ilock(ip, XFS_ILOCK_EXCL); + if ((ip->i_size == 0) && (ip->i_d.di_nextents == 0)) { + mask |= XFS_AT_UPDCTIME|XFS_AT_UPDMTIME; + mask &= ~XFS_AT_SIZE; + } + xfs_iunlock(ip, XFS_ILOCK_EXCL); + } + } + /* * Timestamps do not need to be logged and hence do not * need to be done within a transaction. */ if (mask & XFS_AT_UPDTIMES) { - ASSERT((mask & ~XFS_AT_UPDTIMES) == 0); timeflags = ((mask & XFS_AT_UPDATIME) ? XFS_ICHGTIME_ACC : 0) | ((mask & XFS_AT_UPDCTIME) ? XFS_ICHGTIME_CHG : 0) | ((mask & XFS_AT_UPDMTIME) ?
XFS_ICHGTIME_MOD : 0); - xfs_ichgtime(ip, timeflags); - return 0; + mask &= ~XFS_AT_UPDTIMES; } - olddquot1 = olddquot2 = NULL; - udqp = gdqp = NULL; + if (mask == 0) { + if (timeflags && !(flags & ATTR_DMI)) + xfs_ichgtime(ip, timeflags); + return 0; + } /* * If disk quotas is on, we make sure that the dquots do exist on disk, @@ -307,12 +337,11 @@ * For the other attributes, we acquire the inode lock and * first do an error checking pass. */ - tp = NULL; lock_flags = XFS_ILOCK_EXCL; if (flags & ATTR_NOLOCK) need_iolock = 0; if (!(mask & XFS_AT_SIZE)) { - if ((mask != (XFS_AT_CTIME|XFS_AT_ATIME|XFS_AT_MTIME)) || + if ((mask & ~(XFS_AT_CTIME|XFS_AT_ATIME|XFS_AT_MTIME)) != 0 || (mp->m_flags & XFS_MOUNT_WSYNC)) { tp = xfs_trans_alloc(mp, XFS_TRANS_SETATTR_NOT_SIZE); commit_flags = 0; @@ -451,24 +480,6 @@ * Truncate file. Must have write permission and not be a directory. */ if (mask & XFS_AT_SIZE) { - /* Short circuit the truncate case for zero length files */ - if ((vap->va_size == 0) && - (ip->i_size == 0) && (ip->i_d.di_nextents == 0)) { - xfs_iunlock(ip, XFS_ILOCK_EXCL); - lock_flags &= ~XFS_ILOCK_EXCL; - if (mask & XFS_AT_CTIME) - xfs_ichgtime(ip, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG); - code = 0; - goto error_return; - } - - if (VN_ISDIR(vp)) { - code = XFS_ERROR(EISDIR); - goto error_return; - } else if (!VN_ISREG(vp)) { - code = XFS_ERROR(EINVAL); - goto error_return; - } /* * Make sure that the dquots are attached to the inode.
*/ @@ -481,8 +492,7 @@ */ if (mask & (XFS_AT_ATIME|XFS_AT_MTIME)) { if (!file_owner) { - if ((flags & ATTR_UTIME) && - !capable(CAP_FOWNER)) { + if (!capable(CAP_FOWNER)) { code = XFS_ERROR(EPERM); goto error_return; } @@ -760,7 +770,7 @@ timeflags &= ~XFS_ICHGTIME_MOD; timeflags |= XFS_ICHGTIME_CHG; } - if (tp && (flags & ATTR_UTIME)) + if (tp) xfs_trans_log_inode (tp, ip, XFS_ILOG_CORE); } ------=_NextPart_000_0645_01C81EDA.353308E0-- From owner-xfs@oss.sgi.com Sun Nov 4 17:45:44 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 17:45:49 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_62, J_CHICKENPOX_63 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA51jb0s013123 for ; Sun, 4 Nov 2007 17:45:41 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA14744; Mon, 5 Nov 2007 12:45:24 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA51jKdD94077391; Mon, 5 Nov 2007 12:45:21 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA51jAW894285231; Mon, 5 Nov 2007 12:45:10 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Mon, 5 Nov 2007 12:45:10 +1100 From: David Chinner To: Torsten Kaiser Cc: David Chinner , Peter Zijlstra , Fengguang Wu , Maxim Levitsky , linux-kernel@vger.kernel.org, Andrew Morton , linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com Subject: Re: writeout stalls in current -git Message-ID: <20071105014510.GU66820511@sgi.com> References: <393060478.03650@ustc.edu.cn>
<64bb37e0710310822r5ca6b793p8fd97db2f72a8655@mail.gmail.com> <393903856.06449@ustc.edu.cn> <64bb37e0711011120i63cdfe3ci18995d57b6649a8@mail.gmail.com> <64bb37e0711011200n228e708eg255640388f83da22@mail.gmail.com> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com> <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="7ZAtKRhVyVSsbBD2" Content-Disposition: inline In-Reply-To: <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4673/Sun Nov 4 14:22:25 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13549 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs --7ZAtKRhVyVSsbBD2 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Sun, Nov 04, 2007 at 12:19:19PM +0100, Torsten Kaiser wrote: > On 11/2/07, David Chinner wrote: > > That's stalled waiting on the inode cluster buffer lock. That implies > > that the inode cluster is already being written out and the inode has > > been redirtied during writeout. > > > > Does the kernel you are testing have the "flush inodes in ascending > > inode number order" patches applied? If so, can you remove that > > patch and see if the problem goes away? > > I can now confirm, that I see this also with the current mainline-git-version > I used 2.6.24-rc1-git-b4f555081fdd27d13e6ff39d455d5aefae9d2c0c > plus the fix for the sg changes in ieee1394. Ok, so it's probably a side effect of the writeback changes. Attached are two patches (two because one was in a separate patchset as a standalone change) that should prevent async writeback from blocking on locked inode cluster buffers. Apply the xfs-factor-inotobp patch first. Can you see if this fixes the problem? Cheers, Dave.
-- Dave Chinner Principal Engineer SGI Australian Software Group --7ZAtKRhVyVSsbBD2 Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=xfs-factor-inotobp --- fs/xfs/xfs_inode.c | 283 ++++++++++++++++++++++++----------------------------- 1 file changed, 129 insertions(+), 154 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.c 2007-09-12 15:41:22.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.c 2007-09-13 08:57:06.395641940 +1000 @@ -124,6 +124,126 @@ xfs_inobp_check( #endif /* + * Simple wrapper for calling xfs_imap() that includes error + * and bounds checking + */ +STATIC int +xfs_ino_to_imap( + xfs_mount_t *mp, + xfs_trans_t *tp, + xfs_ino_t ino, + xfs_imap_t *imap, + uint imap_flags) +{ + int error; + + error = xfs_imap(mp, tp, ino, imap, imap_flags); + if (error) { + cmn_err(CE_WARN, "xfs_ino_to_imap: xfs_imap() returned an " + "error %d on %s. Returning error.", + error, mp->m_fsname); + return error; + } + + /* + * If the inode number maps to a block outside the bounds + * of the file system then return NULL rather than calling + * read_buf and panicing when we get an error from the + * driver. + */ + if ((imap->im_blkno + imap->im_len) > + XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks)) { + xfs_fs_cmn_err(CE_ALERT, mp, "xfs_ino_to_imap: " + "(imap->im_blkno (0x%llx) + imap->im_len (0x%llx)) > " + " XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks) (0x%llx)", + (unsigned long long) imap->im_blkno, + (unsigned long long) imap->im_len, + XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks)); + return XFS_ERROR(EINVAL); + } + return 0; +} + +/* + * Find the buffer associated with the given inode map + * We do basic validation checks on the buffer once it has been + * retrieved from disk. 
+ */ +STATIC int +xfs_imap_to_bp( + xfs_mount_t *mp, + xfs_trans_t *tp, + xfs_imap_t *imap, + xfs_buf_t **bpp, + uint buf_flags, + uint imap_flags) +{ + int error; + int i; + int ni; + xfs_buf_t *bp; + + error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, imap->im_blkno, + (int)imap->im_len, XFS_BUF_LOCK, &bp); + if (error) { + cmn_err(CE_WARN, "xfs_imap_to_bp: xfs_trans_read_buf()returned " + "an error %d on %s. Returning error.", + error, mp->m_fsname); + return error; + } + + /* + * Validate the magic number and version of every inode in the buffer + * (if DEBUG kernel) or the first inode in the buffer, otherwise. + */ +#ifdef DEBUG + ni = BBTOB(imap->im_len) >> mp->m_sb.sb_inodelog; +#else /* usual case */ + ni = 1; +#endif + + for (i = 0; i < ni; i++) { + int di_ok; + xfs_dinode_t *dip; + + dip = (xfs_dinode_t *)xfs_buf_offset(bp, + (i << mp->m_sb.sb_inodelog)); + di_ok = be16_to_cpu(dip->di_core.di_magic) == XFS_DINODE_MAGIC && + XFS_DINODE_GOOD_VERSION(dip->di_core.di_version); + if (unlikely(XFS_TEST_ERROR(!di_ok, mp, + XFS_ERRTAG_ITOBP_INOTOBP, + XFS_RANDOM_ITOBP_INOTOBP))) { + if (imap_flags & XFS_IMAP_BULKSTAT) { + xfs_trans_brelse(tp, bp); + return XFS_ERROR(EINVAL); + } + XFS_CORRUPTION_ERROR("xfs_imap_to_bp", + XFS_ERRLEVEL_HIGH, mp, dip); +#ifdef DEBUG + cmn_err(CE_PANIC, + "Device %s - bad inode magic/vsn " + "daddr %lld #%d (magic=%x)", + XFS_BUFTARG_NAME(mp->m_ddev_targp), + (unsigned long long)imap->im_blkno, i, + be16_to_cpu(dip->di_core.di_magic)); +#endif + xfs_trans_brelse(tp, bp); + return XFS_ERROR(EFSCORRUPTED); + } + } + + xfs_inobp_check(mp, bp); + + /* + * Mark the buffer as an inode buffer now that it looks good + */ + XFS_BUF_SET_VTYPE(bp, B_FS_INO); + + *bpp = bp; + return 0; +} + +/* * This routine is called to map an inode number within a file * system to the buffer containing the on-disk version of the * inode. 
It returns a pointer to the buffer containing the @@ -145,72 +265,19 @@ xfs_inotobp( xfs_buf_t **bpp, int *offset) { - int di_ok; xfs_imap_t imap; xfs_buf_t *bp; int error; - xfs_dinode_t *dip; - /* - * Call the space management code to find the location of the - * inode on disk. - */ imap.im_blkno = 0; - error = xfs_imap(mp, tp, ino, &imap, XFS_IMAP_LOOKUP); - if (error != 0) { - cmn_err(CE_WARN, - "xfs_inotobp: xfs_imap() returned an " - "error %d on %s. Returning error.", error, mp->m_fsname); + error = xfs_ino_to_imap(mp, tp, ino, &imap, XFS_IMAP_LOOKUP); + if (error) return error; - } - - /* - * If the inode number maps to a block outside the bounds of the - * file system then return NULL rather than calling read_buf - * and panicing when we get an error from the driver. - */ - if ((imap.im_blkno + imap.im_len) > - XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks)) { - cmn_err(CE_WARN, - "xfs_inotobp: inode number (%llu + %d) maps to a block outside the bounds " - "of the file system %s. Returning EINVAL.", - (unsigned long long)imap.im_blkno, - imap.im_len, mp->m_fsname); - return XFS_ERROR(EINVAL); - } - - /* - * Read in the buffer. If tp is NULL, xfs_trans_read_buf() will - * default to just a read_buf() call. - */ - error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, imap.im_blkno, - (int)imap.im_len, XFS_BUF_LOCK, &bp); - if (error) { - cmn_err(CE_WARN, - "xfs_inotobp: xfs_trans_read_buf() returned an " - "error %d on %s. 
Returning error.", error, mp->m_fsname); + error = xfs_imap_to_bp(mp, tp, &imap, &bp, XFS_BUF_LOCK, 0); + if (error) return error; - } - dip = (xfs_dinode_t *)xfs_buf_offset(bp, 0); - di_ok = - be16_to_cpu(dip->di_core.di_magic) == XFS_DINODE_MAGIC && - XFS_DINODE_GOOD_VERSION(dip->di_core.di_version); - if (unlikely(XFS_TEST_ERROR(!di_ok, mp, XFS_ERRTAG_ITOBP_INOTOBP, - XFS_RANDOM_ITOBP_INOTOBP))) { - XFS_CORRUPTION_ERROR("xfs_inotobp", XFS_ERRLEVEL_LOW, mp, dip); - xfs_trans_brelse(tp, bp); - cmn_err(CE_WARN, - "xfs_inotobp: XFS_TEST_ERROR() returned an " - "error on %s. Returning EFSCORRUPTED.", mp->m_fsname); - return XFS_ERROR(EFSCORRUPTED); - } - - xfs_inobp_check(mp, bp); - /* - * Set *dipp to point to the on-disk inode in the buffer. - */ *dipp = (xfs_dinode_t *)xfs_buf_offset(bp, imap.im_boffset); *bpp = bp; *offset = imap.im_boffset; @@ -251,41 +318,15 @@ xfs_itobp( xfs_imap_t imap; xfs_buf_t *bp; int error; - int i; - int ni; if (ip->i_blkno == (xfs_daddr_t)0) { - /* - * Call the space management code to find the location of the - * inode on disk. - */ imap.im_blkno = bno; - if ((error = xfs_imap(mp, tp, ip->i_ino, &imap, - XFS_IMAP_LOOKUP | imap_flags))) + error = xfs_ino_to_imap(mp, tp, ip->i_ino, &imap, + XFS_IMAP_LOOKUP | imap_flags); + if (error) return error; /* - * If the inode number maps to a block outside the bounds - * of the file system then return NULL rather than calling - * read_buf and panicing when we get an error from the - * driver. 
- */ - if ((imap.im_blkno + imap.im_len) > - XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks)) { -#ifdef DEBUG - xfs_fs_cmn_err(CE_ALERT, mp, "xfs_itobp: " - "(imap.im_blkno (0x%llx) " - "+ imap.im_len (0x%llx)) > " - " XFS_FSB_TO_BB(mp, " - "mp->m_sb.sb_dblocks) (0x%llx)", - (unsigned long long) imap.im_blkno, - (unsigned long long) imap.im_len, - XFS_FSB_TO_BB(mp, mp->m_sb.sb_dblocks)); -#endif /* DEBUG */ - return XFS_ERROR(EINVAL); - } - - /* * Fill in the fields in the inode that will be used to * map the inode to its buffer from now on. */ @@ -303,76 +344,10 @@ xfs_itobp( } ASSERT(bno == 0 || bno == imap.im_blkno); - /* - * Read in the buffer. If tp is NULL, xfs_trans_read_buf() will - * default to just a read_buf() call. - */ - error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, imap.im_blkno, - (int)imap.im_len, XFS_BUF_LOCK, &bp); - if (error) { -#ifdef DEBUG - xfs_fs_cmn_err(CE_ALERT, mp, "xfs_itobp: " - "xfs_trans_read_buf() returned error %d, " - "imap.im_blkno 0x%llx, imap.im_len 0x%llx", - error, (unsigned long long) imap.im_blkno, - (unsigned long long) imap.im_len); -#endif /* DEBUG */ + error = xfs_imap_to_bp(mp, tp, &imap, &bp, XFS_BUF_LOCK, imap_flags); + if (error) return error; - } - - /* - * Validate the magic number and version of every inode in the buffer - * (if DEBUG kernel) or the first inode in the buffer, otherwise. - * No validation is done here in userspace (xfs_repair). 
- */ -#if !defined(__KERNEL__) - ni = 0; -#elif defined(DEBUG) - ni = BBTOB(imap.im_len) >> mp->m_sb.sb_inodelog; -#else /* usual case */ - ni = 1; -#endif - - for (i = 0; i < ni; i++) { - int di_ok; - xfs_dinode_t *dip; - - dip = (xfs_dinode_t *)xfs_buf_offset(bp, - (i << mp->m_sb.sb_inodelog)); - di_ok = be16_to_cpu(dip->di_core.di_magic) == XFS_DINODE_MAGIC && - XFS_DINODE_GOOD_VERSION(dip->di_core.di_version); - if (unlikely(XFS_TEST_ERROR(!di_ok, mp, - XFS_ERRTAG_ITOBP_INOTOBP, - XFS_RANDOM_ITOBP_INOTOBP))) { - if (imap_flags & XFS_IMAP_BULKSTAT) { - xfs_trans_brelse(tp, bp); - return XFS_ERROR(EINVAL); - } -#ifdef DEBUG - cmn_err(CE_ALERT, - "Device %s - bad inode magic/vsn " - "daddr %lld #%d (magic=%x)", - XFS_BUFTARG_NAME(mp->m_ddev_targp), - (unsigned long long)imap.im_blkno, i, - be16_to_cpu(dip->di_core.di_magic)); -#endif - XFS_CORRUPTION_ERROR("xfs_itobp", XFS_ERRLEVEL_HIGH, - mp, dip); - xfs_trans_brelse(tp, bp); - return XFS_ERROR(EFSCORRUPTED); - } - } - - xfs_inobp_check(mp, bp); - /* - * Mark the buffer as an inode buffer now that it looks good - */ - XFS_BUF_SET_VTYPE(bp, B_FS_INO); - - /* - * Set *dipp to point to the on-disk inode in the buffer. 
- */ *dipp = (xfs_dinode_t *)xfs_buf_offset(bp, imap.im_boffset); *bpp = bp; return 0; --7ZAtKRhVyVSsbBD2 Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=xfs-iflush-blocking-fix --- fs/xfs/linux-2.6/xfs_super.c | 3 +- fs/xfs/linux-2.6/xfs_vnode.h | 5 --- fs/xfs/xfs_inode.c | 33 ++++++++++++++++--------- fs/xfs/xfs_inode.h | 7 +++-- fs/xfs/xfs_vnodeops.c | 55 +++++++++---------------------------------- 5 files changed, 41 insertions(+), 62 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.c 2007-11-05 10:17:36.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.c 2007-11-05 10:33:49.590268027 +1100 @@ -306,14 +306,15 @@ xfs_inotobp( * 0 for the disk block address. */ int -xfs_itobp( +xfs_itobp_flags( xfs_mount_t *mp, xfs_trans_t *tp, xfs_inode_t *ip, xfs_dinode_t **dipp, xfs_buf_t **bpp, xfs_daddr_t bno, - uint imap_flags) + uint imap_flags, + uint buf_flags) { xfs_imap_t imap; xfs_buf_t *bp; @@ -344,10 +345,17 @@ xfs_itobp( } ASSERT(bno == 0 || bno == imap.im_blkno); - error = xfs_imap_to_bp(mp, tp, &imap, &bp, XFS_BUF_LOCK, imap_flags); + error = xfs_imap_to_bp(mp, tp, &imap, &bp, buf_flags, imap_flags); if (error) return error; + if (!bp) { + ASSERT(buf_flags & XFS_BUF_TRYLOCK); + ASSERT(tp == NULL); + *bpp = NULL; + return EAGAIN; + } + *dipp = (xfs_dinode_t *)xfs_buf_offset(bp, imap.im_boffset); *bpp = bp; return 0; @@ -3068,15 +3076,6 @@ xfs_iflush( } /* - * Get the buffer containing the on-disk inode. - */ - error = xfs_itobp(mp, NULL, ip, &dip, &bp, 0, 0); - if (error) { - xfs_ifunlock(ip); - return error; - } - - /* * Decide how buffer will be flushed out. This is done before * the call to xfs_iflush_int because this field is zeroed by it. */ @@ -3125,6 +3124,16 @@ xfs_iflush( } /* + * Get the buffer containing the on-disk inode. 
+ */ + error = xfs_itobp_flags(mp, NULL, ip, &dip, &bp, 0, 0, + (flags == INT_ASYNC) ? XFS_BUF_TRYLOCK : XFS_BUF_LOCK); + if (error ||!bp) { + xfs_ifunlock(ip); + return error; + } + + /* * First flush out the inode that xfs_iflush was called with. */ error = xfs_iflush_int(ip, bp); Index: 2.6.x-xfs-new/fs/xfs/xfs_inode.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_inode.h 2007-11-02 13:44:46.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_inode.h 2007-11-05 10:25:44.885153248 +1100 @@ -488,9 +488,12 @@ int xfs_finish_reclaim_all(struct xfs_m /* * xfs_inode.c prototypes. */ -int xfs_itobp(struct xfs_mount *, struct xfs_trans *, +int xfs_itobp_flags(struct xfs_mount *, struct xfs_trans *, xfs_inode_t *, struct xfs_dinode **, struct xfs_buf **, - xfs_daddr_t, uint); + xfs_daddr_t, uint, uint); +#define xfs_itobp(mp, tp, ip, dipp, bpp, bno, iflags) \ + xfs_itobp_flags(mp, tp, ip, dipp, bpp, bno, iflags, XFS_BUF_LOCK) + int xfs_iread(struct xfs_mount *, struct xfs_trans *, xfs_ino_t, xfs_inode_t **, xfs_daddr_t, uint); int xfs_iread_extents(struct xfs_trans *, xfs_inode_t *, int); Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c 2007-11-02 13:44:50.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c 2007-11-05 10:39:05.969204451 +1100 @@ -840,7 +840,8 @@ xfs_fs_write_inode( struct inode *inode, int sync) { - int error = 0, flags = FLUSH_INODE; + int error = 0; + int flags = 0; xfs_itrace_entry(XFS_I(inode)); if (sync) { Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vnode.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_vnode.h 2007-10-02 16:01:47.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_vnode.h 2007-11-05 10:40:49.103817818 +1100 @@ -73,12 +73,9 @@ typedef enum bhv_vrwlock { #define IO_INVIS 0x00020 /* don't 
update inode timestamps */ /* - * Flags for vop_iflush call + * Flags for xfs_inode_flush */ #define FLUSH_SYNC 1 /* wait for flush to complete */ -#define FLUSH_INODE 2 /* flush the inode itself */ -#define FLUSH_LOG 4 /* force the last log entry for - * this inode out to disk */ /* * Flush/Invalidate options for vop_toss/flush/flushinval_pages. Index: 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_vnodeops.c 2007-11-05 10:02:05.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_vnodeops.c 2007-11-05 10:37:53.398623943 +1100 @@ -3556,29 +3556,6 @@ xfs_inode_flush( ((iip == NULL) || !(iip->ili_format.ilf_fields & XFS_ILOG_ALL))) return 0; - if (flags & FLUSH_LOG) { - if (iip && iip->ili_last_lsn) { - xlog_t *log = mp->m_log; - xfs_lsn_t sync_lsn; - int s, log_flags = XFS_LOG_FORCE; - - s = GRANT_LOCK(log); - sync_lsn = log->l_last_sync_lsn; - GRANT_UNLOCK(log, s); - - if ((XFS_LSN_CMP(iip->ili_last_lsn, sync_lsn) > 0)) { - if (flags & FLUSH_SYNC) - log_flags |= XFS_LOG_SYNC; - error = xfs_log_force(mp, iip->ili_last_lsn, log_flags); - if (error) - return error; - } - - if (ip->i_update_core == 0) - return 0; - } - } - /* * We make this non-blocking if the inode is contended, * return EAGAIN to indicate to the caller that they @@ -3586,30 +3563,22 @@ xfs_inode_flush( * blocking on inodes inside another operation right * now, they get caught later by xfs_sync. 
*/ - if (flags & FLUSH_INODE) { - int flush_flags; - - if (flags & FLUSH_SYNC) { - xfs_ilock(ip, XFS_ILOCK_SHARED); - xfs_iflock(ip); - } else if (xfs_ilock_nowait(ip, XFS_ILOCK_SHARED)) { - if (xfs_ipincount(ip) || !xfs_iflock_nowait(ip)) { - xfs_iunlock(ip, XFS_ILOCK_SHARED); - return EAGAIN; - } - } else { + if (flags & FLUSH_SYNC) { + xfs_ilock(ip, XFS_ILOCK_SHARED); + xfs_iflock(ip); + } else if (xfs_ilock_nowait(ip, XFS_ILOCK_SHARED)) { + if (xfs_ipincount(ip) || !xfs_iflock_nowait(ip)) { + xfs_iunlock(ip, XFS_ILOCK_SHARED); return EAGAIN; } - - if (flags & FLUSH_SYNC) - flush_flags = XFS_IFLUSH_SYNC; - else - flush_flags = XFS_IFLUSH_ASYNC; - - error = xfs_iflush(ip, flush_flags); - xfs_iunlock(ip, XFS_ILOCK_SHARED); + } else { + return EAGAIN; } + error = xfs_iflush(ip, (flags & FLUSH_SYNC) ? XFS_IFLUSH_SYNC + : XFS_IFLUSH_ASYNC); + xfs_iunlock(ip, XFS_ILOCK_SHARED); + return error; } --7ZAtKRhVyVSsbBD2-- From owner-xfs@oss.sgi.com Sun Nov 4 20:15:49 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 20:16:11 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA54FjhE026281 for ; Sun, 4 Nov 2007 20:15:48 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA16991; Mon, 5 Nov 2007 15:15:44 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id C86DB58C38F7; Mon, 5 Nov 2007 15:15:44 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: PARTIAL TAKE 971186 - Move platform specific mount option parsing out of core XFS code Message-Id: <20071105041544.C86DB58C38F7@chook.melbourne.sgi.com> Date: Mon, 5 
Nov 2007 15:15:44 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/4673/Sun Nov 4 14:22:25 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13550 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Move platform specific mount option parsing out of core XFS code Mount option parsing is platform specific. Move it out of core code into the platform specific superblock operation file. Date: Mon Nov 5 15:14:58 AEDT 2007 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: hch@infradead.org The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30012a fs/xfs/xfs_vfsops.c - 1.548 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.c.diff?r1=text&tr1=1.548&r2=text&tr2=1.547&f=h - move linux specific mount option parsing into linux specific code. fs/xfs/linux-2.6/xfs_super.c - 1.403 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_super.c.diff?r1=text&tr1=1.403&r2=text&tr2=1.402&f=h - move linux specific mount option parsing into linux specific code. fs/xfs/xfs_vfsops.h - 1.6 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vfsops.h.diff?r1=text&tr1=1.6&r2=text&tr2=1.5&f=h - move linux specific mount option parsing into linux specific code. 
From owner-xfs@oss.sgi.com Sun Nov 4 21:07:12 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 21:07:20 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.5 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_62, J_CHICKENPOX_64 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA5574Qj006251 for ; Sun, 4 Nov 2007 21:07:09 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA18323; Mon, 5 Nov 2007 16:07:08 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA5577dD94343975; Mon, 5 Nov 2007 16:07:08 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA5576JN94246852; Mon, 5 Nov 2007 16:07:06 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Mon, 5 Nov 2007 16:07:06 +1100 From: David Chinner To: xfs-oss Cc: xfs-dev Subject: [PATCH, RFC] Move AIL pushing into a separate thread Message-ID: <20071105050706.GW66820511@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4673/Sun Nov 4 14:22:25 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13551 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs When many hundreds to thousands of threads all try to do simultaneous transactions and the log is in a tail-pushing situation (i.e. full), we can get multiple threads walking the AIL list and contending on the AIL lock. 
Recently we've had two cases of machines basically locking up because most of the CPUs in the system are trying to obtain the AIL lock. The first was an 8p machine with ~2,500 kernel threads trying to do transactions, and the latest is a 2048p Altix closing a file per MPI rank in a synchronised fashion, resulting in > 400 processes all trying to walk and push the AIL at the same time.

The AIL push is, in effect, a simple I/O dispatch algorithm complicated by the ordering constraints placed on it by the transaction subsystem. It really does not need multiple threads to push on it - even when only a single CPU is pushing the AIL, it can push the I/O out far faster than pretty much any disk subsystem can handle.

So, to avoid contention problems stemming from multiple list walkers, move the list walk off into another thread and simply provide a "target" to push to. When a thread requires a push, it sets the target and wakes the push thread, then goes to sleep waiting for the required amount of space to become available in the log.

This mechanism should also be a lot fairer under heavy load as the waiters will queue in arrival order, rather than queuing in "who completed a push first" order.

Also, by moving the pushing to a separate thread we can do more effective overload detection and prevention as we can keep context from loop iteration to loop iteration. That is, we can push only part of the list each loop and not have to loop back to the start of the list every time we run. This should also help by reducing the number of items we try to lock and/or push that we cannot move.

Note that this patch is not intended to solve the inefficiencies in the AIL structure and the associated issues with extremely large list contents. That needs to be addressed separately; parallel access would cause problems to any new structure as well, so I'm only aiming to isolate the structure from unbounded parallelism here.
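As a rough illustration of the target-and-wake scheme described above, here is a userspace sketch. It is not the code from this patch: the names pusher, pusher_request and pusher_step are hypothetical, LSNs are modelled as plain integers rather than compared with XFS_LSN_CMP, a counter stands in for wake_up_process(), and the walk resumes from a saved array index where the patch itself keeps a last pushed LSN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t lsn_t;	/* hypothetical stand-in for xfs_lsn_t */

struct pusher {
	lsn_t	target;		/* highest LSN anyone has asked us to push to */
	int	wakeups;	/* how often the push thread was kicked */
};

/*
 * Called by any thread that needs log space.  Only a request beyond the
 * current target moves the target and wakes the pusher; anything lower is
 * already covered by the push in progress.
 * Returns 1 if the pusher was woken, 0 otherwise.
 */
static int
pusher_request(struct pusher *p, lsn_t threshold)
{
	assert(p != NULL);
	if (threshold <= p->target)
		return 0;
	p->target = threshold;
	p->wakeups++;		/* stands in for waking the push thread */
	return 1;
}

/*
 * One iteration of the push loop: walk an LSN-sorted array starting at
 * *pos, "pushing" at most batch items below the target.  *pos is left
 * where we stopped so the next iteration resumes there instead of
 * rescanning from the head of the list.
 * Returns the number of items pushed in this iteration.
 */
static size_t
pusher_step(const struct pusher *p, const lsn_t *items, size_t nitems,
	    size_t *pos, size_t batch)
{
	size_t	pushed = 0;

	assert(p != NULL && pos != NULL);
	while (*pos < nitems && pushed < batch && items[*pos] < p->target) {
		(*pos)++;
		pushed++;
	}
	return pushed;
}
```

In this model any number of callers can invoke pusher_request(), but only the single push thread ever runs pusher_step(), so only one walker touches the list however many threads are waiting for space.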
Signed-Off-By: Dave Chinner --- fs/xfs/linux-2.6/xfs_super.c | 60 +++++++++++ fs/xfs/xfs_log.c | 12 ++ fs/xfs/xfs_mount.c | 6 - fs/xfs/xfs_mount.h | 10 + fs/xfs/xfs_trans.h | 1 fs/xfs/xfs_trans_ail.c | 231 ++++++++++++++++++++++++++++--------------- fs/xfs/xfs_trans_priv.h | 8 + fs/xfs/xfsidbg.c | 12 +- 8 files changed, 247 insertions(+), 93 deletions(-) Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c 2007-11-05 10:39:05.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c 2007-11-05 14:48:39.871177707 +1100 @@ -51,6 +51,7 @@ #include "xfs_vfsops.h" #include "xfs_version.h" #include "xfs_log_priv.h" +#include "xfs_trans_priv.h" #include #include @@ -765,6 +766,65 @@ xfs_blkdev_issue_flush( blkdev_issue_flush(buftarg->bt_bdev, NULL); } +/* + * XFS AIL push thread support + */ +void +xfsaild_wakeup( + xfs_mount_t *mp, + xfs_lsn_t threshold_lsn) +{ + + if (XFS_LSN_CMP(threshold_lsn, mp->m_ail.xa_target) > 0) { + mp->m_ail.xa_target = threshold_lsn; + wake_up_process(mp->m_ail.xa_task); + } +} + +int +xfsaild( + void *data) +{ + xfs_mount_t *mp = (xfs_mount_t *)data; + xfs_lsn_t last_pushed_lsn = 0; + long tout = 0; + + while (!kthread_should_stop()) { + if (tout) + schedule_timeout_interruptible(msecs_to_jiffies(tout)); + + /* swsusp */ + try_to_freeze(); + + /* we're either starting or stopping if there is no log */ + if (!mp->m_log) + continue; + + tout = xfsaild_push(mp, &last_pushed_lsn); + } + + return 0; +} /* xfsaild */ + +void +xfsaild_start( + xfs_mount_t *mp) +{ + mp->m_ail.xa_target = 0; + mp->m_ail.xa_task = kthread_run(xfsaild, mp, "xfsaild"); + ASSERT(!IS_ERR(mp->m_ail.xa_task)); + /* XXX: should return error but nowhere to do it */ +} + +void +xfsaild_stop( + xfs_mount_t *mp) +{ + kthread_stop(mp->m_ail.xa_task); +} + + + STATIC struct inode * xfs_fs_alloc_inode( struct super_block *sb) Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c 
=================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2007-11-02 18:00:19.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2007-11-05 14:07:16.850189316 +1100 @@ -515,6 +515,12 @@ xfs_log_mount(xfs_mount_t *mp, mp->m_log = xlog_alloc_log(mp, log_target, blk_offset, num_bblks); /* + * Initialize the AIL now we have a log. + */ + spin_lock_init(&mp->m_ail_lock); + xfs_trans_ail_init(mp); + + /* * skip log recovery on a norecovery mount. pretend it all * just worked. */ @@ -530,7 +536,7 @@ xfs_log_mount(xfs_mount_t *mp, mp->m_flags |= XFS_MOUNT_RDONLY; if (error) { cmn_err(CE_WARN, "XFS: log mount/recovery failed: error %d", error); - xlog_dealloc_log(mp->m_log); + xfs_log_unmount_dealloc(mp); return error; } } @@ -722,10 +728,14 @@ xfs_log_unmount_write(xfs_mount_t *mp) /* * Deallocate log structures for unmount/relocation. + * + * We need to stop the aild from running before we destroy + * and deallocate the log as the aild references the log. */ void xfs_log_unmount_dealloc(xfs_mount_t *mp) { + xfs_trans_ail_destroy(mp); xlog_dealloc_log(mp->m_log); } Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.c 2007-11-02 13:44:50.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.c 2007-11-05 14:12:22.554601173 +1100 @@ -137,15 +137,9 @@ xfs_mount_init(void) mp->m_flags |= XFS_MOUNT_NO_PERCPU_SB; } - spin_lock_init(&mp->m_ail_lock); spin_lock_init(&mp->m_sb_lock); mutex_init(&mp->m_ilock); mutex_init(&mp->m_growlock); - /* - * Initialize the AIL. 
- */ - xfs_trans_ail_init(mp); - atomic_set(&mp->m_active_trans, 0); return mp; Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.h 2007-10-16 08:52:58.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.h 2007-11-05 14:14:42.652456849 +1100 @@ -219,12 +219,18 @@ extern void xfs_icsb_sync_counters_flags #define xfs_icsb_sync_counters_flags(mp, flags) do { } while (0) #endif +typedef struct xfs_ail { + xfs_ail_entry_t xa_ail; + uint xa_gen; + struct task_struct *xa_task; + xfs_lsn_t xa_target; +} xfs_ail_t; + typedef struct xfs_mount { struct super_block *m_super; xfs_tid_t m_tid; /* next unused tid for fs */ spinlock_t m_ail_lock; /* fs AIL mutex */ - xfs_ail_entry_t m_ail; /* fs active log item list */ - uint m_ail_gen; /* fs AIL generation count */ + xfs_ail_t m_ail; /* fs active log item list */ xfs_sb_t m_sb; /* copy of fs superblock */ spinlock_t m_sb_lock; /* sb counter lock */ struct xfs_buf *m_sb_bp; /* buffer for superblock */ Index: 2.6.x-xfs-new/fs/xfs/xfs_trans.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_trans.h 2007-11-02 13:44:46.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_trans.h 2007-11-05 14:01:13.205272667 +1100 @@ -993,6 +993,7 @@ int _xfs_trans_commit(xfs_trans_t *, #define xfs_trans_commit(tp, flags) _xfs_trans_commit(tp, flags, NULL) void xfs_trans_cancel(xfs_trans_t *, int); void xfs_trans_ail_init(struct xfs_mount *); +void xfs_trans_ail_destroy(struct xfs_mount *); xfs_lsn_t xfs_trans_push_ail(struct xfs_mount *, xfs_lsn_t); xfs_lsn_t xfs_trans_tail_ail(struct xfs_mount *); void xfs_trans_unlocked_item(struct xfs_mount *, Index: 2.6.x-xfs-new/fs/xfs/xfs_trans_ail.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_trans_ail.c 2007-10-02 16:01:48.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_trans_ail.c 2007-11-05 14:46:44.206327966 
+1100 @@ -57,7 +57,7 @@ xfs_trans_tail_ail( xfs_log_item_t *lip; spin_lock(&mp->m_ail_lock); - lip = xfs_ail_min(&(mp->m_ail)); + lip = xfs_ail_min(&(mp->m_ail.xa_ail)); if (lip == NULL) { lsn = (xfs_lsn_t)0; } else { @@ -71,25 +71,22 @@ xfs_trans_tail_ail( /* * xfs_trans_push_ail * - * This routine is called to move the tail of the AIL - * forward. It does this by trying to flush items in the AIL - * whose lsns are below the given threshold_lsn. + * This routine is called to move the tail of the AIL forward. It does this by + * trying to flush items in the AIL whose lsns are below the given + * threshold_lsn. * - * The routine returns the lsn of the tail of the log. + * the push is run asynchronously in a separate thread, so we return the tail + * of the log right now instead of the tail after the push. This means we will + * either continue right away, or we will sleep waiting on the async thread to + * do it's work. */ xfs_lsn_t xfs_trans_push_ail( xfs_mount_t *mp, xfs_lsn_t threshold_lsn) { - xfs_lsn_t lsn; xfs_log_item_t *lip; int gen; - int restarts; - int lock_result; - int flush_log; - -#define XFS_TRANS_PUSH_AIL_RESTARTS 1000 spin_lock(&mp->m_ail_lock); lip = xfs_trans_first_ail(mp, &gen); @@ -100,57 +97,105 @@ xfs_trans_push_ail( spin_unlock(&mp->m_ail_lock); return (xfs_lsn_t)0; } + if (XFS_LSN_CMP(threshold_lsn, mp->m_ail.xa_target) > 0) + xfsaild_wakeup(mp, threshold_lsn); + spin_unlock(&mp->m_ail_lock); + return (xfs_lsn_t)lip->li_lsn; +} + +/* + * Return the item in the AIL with the current lsn. + * Return the current tree generation number for use + * in calls to xfs_trans_next_ail(). 
+ */ +STATIC xfs_log_item_t * +xfs_trans_first_push_ail( + xfs_mount_t *mp, + int *gen, + xfs_lsn_t lsn) +{ + xfs_log_item_t *lip; + + lip = xfs_ail_min(&(mp->m_ail.xa_ail)); + *gen = (int)mp->m_ail.xa_gen; + while (lip && (XFS_LSN_CMP(lip->li_lsn, lsn) < 0)) + lip = lip->li_ail.ail_forw; + + return (lip); +} + +/* + * Function that does the work of pushing on the AIL + */ +long +xfsaild_push( + xfs_mount_t *mp, + xfs_lsn_t *last_lsn) +{ + long tout = 100; /* milliseconds */ + xfs_lsn_t last_pushed_lsn = *last_lsn; + xfs_lsn_t target = mp->m_ail.xa_target; + xfs_lsn_t lsn; + xfs_log_item_t *lip; + int lock_result; + int gen; + int restarts; + int flush_log, count, stuck; + +#define XFS_TRANS_PUSH_AIL_RESTARTS 10 + + spin_lock(&mp->m_ail_lock); + lip = xfs_trans_first_push_ail(mp, &gen, *last_lsn); + if (lip == NULL || XFS_FORCED_SHUTDOWN(mp)) { + /* + * AIL is empty or our push has reached the end. + */ + spin_unlock(&mp->m_ail_lock); + last_pushed_lsn = 0; + goto out; + } XFS_STATS_INC(xs_push_ail); /* * While the item we are looking at is below the given threshold - * try to flush it out. Make sure to limit the number of times - * we allow xfs_trans_next_ail() to restart scanning from the - * beginning of the list. We'd like not to stop until we've at least + * try to flush it out. We'd like not to stop until we've at least * tried to push on everything in the AIL with an LSN less than - * the given threshold. However, we may give up before that if - * we realize that we've been holding the AIL lock for 'too long', - * blocking interrupts. Currently, too long is < 500us roughly. + * the given threshold. + * + * However, we will stop after a certain number of pushes and wait + * for a reduced timeout to fire before pushing further. This + * prevents use from spinning when we can't do anything or there is + * lots of contention on the AIL lists. 
*/ - flush_log = 0; - restarts = 0; - while (((restarts < XFS_TRANS_PUSH_AIL_RESTARTS) && - (XFS_LSN_CMP(lip->li_lsn, threshold_lsn) < 0))) { + tout = 10; + lsn = lip->li_lsn; + flush_log = stuck = count = 0; + while ((XFS_LSN_CMP(lip->li_lsn, target) < 0)) { /* - * If we can lock the item without sleeping, unlock - * the AIL lock and flush the item. Then re-grab the - * AIL lock so we can look for the next item on the - * AIL. Since we unlock the AIL while we flush the - * item, the next routine may start over again at the - * the beginning of the list if anything has changed. - * That is what the generation count is for. + * If we can lock the item without sleeping, unlock the AIL + * lock and flush the item. Then re-grab the AIL lock so we + * can look for the next item on the AIL. List changes are + * handled by the AIL lookup functions internally * - * If we can't lock the item, either its holder will flush - * it or it is already being flushed or it is being relogged. - * In any of these case it is being taken care of and we - * can just skip to the next item in the list. + * If we can't lock the item, either its holder will flush it + * or it is already being flushed or it is being relogged. In + * any of these case it is being taken care of and we can just + * skip to the next item in the list. 
*/ lock_result = IOP_TRYLOCK(lip); + spin_unlock(&mp->m_ail_lock); switch (lock_result) { case XFS_ITEM_SUCCESS: - spin_unlock(&mp->m_ail_lock); XFS_STATS_INC(xs_push_ail_success); IOP_PUSH(lip); - spin_lock(&mp->m_ail_lock); + last_pushed_lsn = lsn; break; case XFS_ITEM_PUSHBUF: - spin_unlock(&mp->m_ail_lock); XFS_STATS_INC(xs_push_ail_pushbuf); -#ifdef XFSRACEDEBUG - delay_for_intr(); - delay(300); -#endif - ASSERT(lip->li_ops->iop_pushbuf); - ASSERT(lip); IOP_PUSHBUF(lip); - spin_lock(&mp->m_ail_lock); + last_pushed_lsn = lsn; break; case XFS_ITEM_PINNED: @@ -160,10 +205,14 @@ xfs_trans_push_ail( case XFS_ITEM_LOCKED: XFS_STATS_INC(xs_push_ail_locked); + last_pushed_lsn = lsn; + stuck++; break; case XFS_ITEM_FLUSHING: XFS_STATS_INC(xs_push_ail_flushing); + last_pushed_lsn = lsn; + stuck++; break; default: @@ -171,19 +220,26 @@ xfs_trans_push_ail( break; } + spin_lock(&mp->m_ail_lock); + count++; + /* Too many items we can't do anything with? */ + if (stuck > 100) + break; + /* we're either starting or stopping if there is no log */ + if (!mp->m_log) + break; + /* should we bother continuing? */ + if (XFS_FORCED_SHUTDOWN(mp)) + break; + /* get the next item */ lip = xfs_trans_next_ail(mp, lip, &gen, &restarts); - if (lip == NULL) { + if (lip == NULL) break; - } - if (XFS_FORCED_SHUTDOWN(mp)) { - /* - * Just return if we shut down during the last try. - */ - spin_unlock(&mp->m_ail_lock); - return (xfs_lsn_t)0; - } - + if (restarts > XFS_TRANS_PUSH_AIL_RESTARTS) + break; + lsn = lip->li_lsn; } + spin_unlock(&mp->m_ail_lock); if (flush_log) { /* @@ -191,22 +247,33 @@ xfs_trans_push_ail( * push out the log so it will become unpinned and * move forward in the AIL. 
*/ - spin_unlock(&mp->m_ail_lock); XFS_STATS_INC(xs_push_ail_flush); xfs_log_force(mp, (xfs_lsn_t)0, XFS_LOG_FORCE); - spin_lock(&mp->m_ail_lock); } - lip = xfs_ail_min(&(mp->m_ail)); - if (lip == NULL) { - lsn = (xfs_lsn_t)0; - } else { - lsn = lip->li_lsn; + /* + * We reached the target so wait a bit longer for I/O to complete and + * remove pushed items from the AIL before we start the next scan from + * the start of the AIL. + */ + if ((XFS_LSN_CMP(lsn, target) >= 0)) { + tout += 20; + last_pushed_lsn = 0; + } else if ((restarts > XFS_TRANS_PUSH_AIL_RESTARTS) || + (count && (count < (stuck + 10)))) { + /* + * Either there is a lot of contention on the AIL or we + * found a lot of items we couldn't do anything with. + * Backoff a bit more to allow some I/O to complete before + * continuing from where we were. + */ + tout += 10; } - spin_unlock(&mp->m_ail_lock); - return lsn; -} /* xfs_trans_push_ail */ +out: + *last_lsn = last_pushed_lsn; + return tout; +} /* xfsaild_push */ /* @@ -247,7 +314,7 @@ xfs_trans_unlocked_item( * the call to xfs_log_move_tail() doesn't do anything if there's * not enough free space to wake people up so we're safe calling it. 
*/ - min_lip = xfs_ail_min(&mp->m_ail); + min_lip = xfs_ail_min(&mp->m_ail.xa_ail); if (min_lip == lip) xfs_log_move_tail(mp, 1); @@ -279,7 +346,7 @@ xfs_trans_update_ail( xfs_log_item_t *dlip=NULL; xfs_log_item_t *mlip; /* ptr to minimum lip */ - ailp = &(mp->m_ail); + ailp = &(mp->m_ail.xa_ail); mlip = xfs_ail_min(ailp); if (lip->li_flags & XFS_LI_IN_AIL) { @@ -292,10 +359,10 @@ xfs_trans_update_ail( lip->li_lsn = lsn; xfs_ail_insert(ailp, lip); - mp->m_ail_gen++; + mp->m_ail.xa_gen++; if (mlip == dlip) { - mlip = xfs_ail_min(&(mp->m_ail)); + mlip = xfs_ail_min(&(mp->m_ail.xa_ail)); spin_unlock(&mp->m_ail_lock); xfs_log_move_tail(mp, mlip->li_lsn); } else { @@ -330,7 +397,7 @@ xfs_trans_delete_ail( xfs_log_item_t *mlip; if (lip->li_flags & XFS_LI_IN_AIL) { - ailp = &(mp->m_ail); + ailp = &(mp->m_ail.xa_ail); mlip = xfs_ail_min(ailp); dlip = xfs_ail_delete(ailp, lip); ASSERT(dlip == lip); @@ -338,10 +405,10 @@ xfs_trans_delete_ail( lip->li_flags &= ~XFS_LI_IN_AIL; lip->li_lsn = 0; - mp->m_ail_gen++; + mp->m_ail.xa_gen++; if (mlip == dlip) { - mlip = xfs_ail_min(&(mp->m_ail)); + mlip = xfs_ail_min(&(mp->m_ail.xa_ail)); spin_unlock(&mp->m_ail_lock); xfs_log_move_tail(mp, (mlip ? 
mlip->li_lsn : 0)); } else { @@ -379,10 +446,10 @@ xfs_trans_first_ail( { xfs_log_item_t *lip; - lip = xfs_ail_min(&(mp->m_ail)); - *gen = (int)mp->m_ail_gen; + lip = xfs_ail_min(&(mp->m_ail.xa_ail)); + *gen = (int)mp->m_ail.xa_gen; - return (lip); + return lip; } /* @@ -402,11 +469,11 @@ xfs_trans_next_ail( xfs_log_item_t *nlip; ASSERT(mp && lip && gen); - if (mp->m_ail_gen == *gen) { - nlip = xfs_ail_next(&(mp->m_ail), lip); + if (mp->m_ail.xa_gen == *gen) { + nlip = xfs_ail_next(&(mp->m_ail.xa_ail), lip); } else { - nlip = xfs_ail_min(&(mp->m_ail)); - *gen = (int)mp->m_ail_gen; + nlip = xfs_ail_min(&(mp->m_ail).xa_ail); + *gen = (int)mp->m_ail.xa_gen; if (restarts != NULL) { XFS_STATS_INC(xs_push_ail_restarts); (*restarts)++; @@ -435,8 +502,16 @@ void xfs_trans_ail_init( xfs_mount_t *mp) { - mp->m_ail.ail_forw = (xfs_log_item_t*)&(mp->m_ail); - mp->m_ail.ail_back = (xfs_log_item_t*)&(mp->m_ail); + mp->m_ail.xa_ail.ail_forw = (xfs_log_item_t*)&mp->m_ail.xa_ail; + mp->m_ail.xa_ail.ail_back = (xfs_log_item_t*)&mp->m_ail.xa_ail; + xfsaild_start(mp); +} + +void +xfs_trans_ail_destroy( + xfs_mount_t *mp) +{ + xfsaild_stop(mp); } /* Index: 2.6.x-xfs-new/fs/xfs/xfs_trans_priv.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_trans_priv.h 2007-10-02 16:01:48.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_trans_priv.h 2007-11-05 14:02:18.784782356 +1100 @@ -57,4 +57,12 @@ struct xfs_log_item *xfs_trans_next_ail( struct xfs_log_item *, int *, int *); +/* + * AIL push thread support + */ +long xfsaild_push(struct xfs_mount *, xfs_lsn_t *); +void xfsaild_wakeup(struct xfs_mount *, xfs_lsn_t); +void xfsaild_start(struct xfs_mount *); +void xfsaild_stop(struct xfs_mount *); + #endif /* __XFS_TRANS_PRIV_H__ */ Index: 2.6.x-xfs-new/fs/xfs/xfsidbg.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfsidbg.c 2007-11-02 13:44:50.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfsidbg.c 
2007-11-05 14:50:43.099049624 +1100 @@ -6220,13 +6220,13 @@ xfsidbg_xaildump(xfs_mount_t *mp) }; int count; - if ((mp->m_ail.ail_forw == NULL) || - (mp->m_ail.ail_forw == (xfs_log_item_t *)&mp->m_ail)) { + if ((mp->m_ail.xa_ail.ail_forw == NULL) || + (mp->m_ail.xa_ail.ail_forw == (xfs_log_item_t *)&mp->m_ail.xa_ail)) { kdb_printf("AIL is empty\n"); return; } kdb_printf("AIL for mp 0x%p, oldest first\n", mp); - lip = (xfs_log_item_t*)mp->m_ail.ail_forw; + lip = (xfs_log_item_t*)mp->m_ail.xa_ail.ail_forw; for (count = 0; lip; count++) { kdb_printf("[%d] type %s ", count, xfsidbg_item_type_str(lip)); printflags((uint)(lip->li_flags), li_flags, "flags:"); @@ -6255,7 +6255,7 @@ xfsidbg_xaildump(xfs_mount_t *mp) break; } - if (lip->li_ail.ail_forw == (xfs_log_item_t*)&mp->m_ail) { + if (lip->li_ail.ail_forw == (xfs_log_item_t*)&mp->m_ail.xa_ail) { lip = NULL; } else { lip = lip->li_ail.ail_forw; @@ -6312,9 +6312,9 @@ xfsidbg_xmount(xfs_mount_t *mp) kdb_printf("xfs_mount at 0x%p\n", mp); kdb_printf("tid 0x%x ail_lock 0x%p &ail 0x%p\n", - mp->m_tid, &mp->m_ail_lock, &mp->m_ail); + mp->m_tid, &mp->m_ail_lock, &mp->m_ail.xa_ail); kdb_printf("ail_gen 0x%x &sb 0x%p\n", - mp->m_ail_gen, &mp->m_sb); + mp->m_ail.xa_gen, &mp->m_sb); kdb_printf("sb_lock 0x%p sb_bp 0x%p dev 0x%x logdev 0x%x rtdev 0x%x\n", &mp->m_sb_lock, mp->m_sb_bp, mp->m_ddev_targp ? 
mp->m_ddev_targp->bt_dev : 0, From owner-xfs@oss.sgi.com Sun Nov 4 21:32:03 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 21:32:09 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA55Vwg1009794 for ; Sun, 4 Nov 2007 21:32:02 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA18760; Mon, 5 Nov 2007 16:31:57 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 9887158C38F7; Mon, 5 Nov 2007 16:31:57 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: PARTITAL TAKE 971186 - optimize XFS_IS_REALTIME_INODE w/o realtime config Message-Id: <20071105053157.9887158C38F7@chook.melbourne.sgi.com> Date: Mon, 5 Nov 2007 16:31:57 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/4673/Sun Nov 4 14:22:25 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13552 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs optimize XFS_IS_REALTIME_INODE w/o realtime config Use XFS_IS_REALTIME_INODE in more places, and #define it to 0 if CONFIG_XFS_RT is off. This should be safe because mount checks in xfs_rtmount_init: # define xfs_rtmount_init(m) (((mp)->m_sb.sb_rblocks == 0)? 0 : (ENOSYS)) so if we get mounted w/o CONFIG_XFS_RT, no realtime inodes should be encountered after that. 
Defining XFS_IS_REALTIME_INODE to 0 saves a bit of stack space; presumably gcc can optimize around the various "if (0)" type checks:

xfs_alloc_file_space -8
xfs_bmap_adjacent -16
xfs_bmapi -8
xfs_bmap_rtalloc -16
xfs_bunmapi -28
xfs_free_file_space -64
xfs_imap +8 <-- ? hmm.
xfs_iomap_write_direct -12
xfs_qm_dqusage_adjust -4
xfs_qm_vop_chown_reserve -4

Signed-off-by: Eric Sandeen
Date: Mon Nov 5 16:31:21 AEDT 2007
Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs
Inspected by: sandeen@sandeen.net

The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb

Modid: xfs-linux-melb:xfs-kern:30014a

fs/xfs/xfs_rw.h - 1.86 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rw.h.diff?r1=text&tr1=1.86&r2=text&tr2=1.85&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/xfs_vnodeops.c - 1.725 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_vnodeops.c.diff?r1=text&tr1=1.725&r2=text&tr2=1.724&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/xfs_rtalloc.h - 1.30 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_rtalloc.h.diff?r1=text&tr1=1.30&r2=text&tr2=1.29&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/xfs_dfrag.c - 1.63 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dfrag.c.diff?r1=text&tr1=1.63&r2=text&tr2=1.62&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/xfs_bmap_btree.c - 1.167 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bmap_btree.c.diff?r1=text&tr1=1.167&r2=text&tr2=1.166&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/xfs_inode.c - 1.486 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_inode.c.diff?r1=text&tr1=1.486&r2=text&tr2=1.485&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.
fs/xfs/xfs_bmap.c - 1.381 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_bmap.c.diff?r1=text&tr1=1.381&r2=text&tr2=1.380&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/xfs_dinode.h - 1.83 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_dinode.h.diff?r1=text&tr1=1.83&r2=text&tr2=1.82&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/xfs_iomap.c - 1.61 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_iomap.c.diff?r1=text&tr1=1.61&r2=text&tr2=1.60&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/linux-2.6/xfs_lrw.c - 1.271 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_lrw.c.diff?r1=text&tr1=1.271&r2=text&tr2=1.270&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/linux-2.6/xfs_ioctl.c - 1.158 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_ioctl.c.diff?r1=text&tr1=1.158&r2=text&tr2=1.157&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/linux-2.6/xfs_iops.c - 1.269 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.269&r2=text&tr2=1.268&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/linux-2.6/xfs_aops.c - 1.158 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_aops.c.diff?r1=text&tr1=1.158&r2=text&tr2=1.157&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.

fs/xfs/dmapi/xfs_dm.c - 1.59 - changed
http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/dmapi/xfs_dm.c.diff?r1=text&tr1=1.59&r2=text&tr2=1.58&f=h
- Use XFS_IS_REALTIME_INODE() rather than open coding the check.
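
[Archive note] The idea in the checkin above, making the realtime test a compile-time constant 0 when realtime support is not configured so that the compiler can drop the guarded branches, can be sketched in plain C. This is only an illustration under stated assumptions: DEMO_CONFIG_RT, DEMO_DIFLAG_REALTIME, struct demo_inode and demo_pick_allocator() are hypothetical stand-ins, not the real XFS definitions, and how much stack or code is actually saved depends on the compiler and its options.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bit and inode type, standing in for the XFS ones. */
#define DEMO_DIFLAG_REALTIME 0x1u

struct demo_inode {
	unsigned int di_flags;
};

/*
 * With the (hypothetical) config symbol defined, test the flag at run time.
 * Without it, the macro is the constant 0, so every "if (...)" using it
 * becomes "if (0)" and the guarded code is dead.
 */
#ifdef DEMO_CONFIG_RT
#define DEMO_IS_REALTIME_INODE(ip) \
	(((ip)->di_flags & DEMO_DIFLAG_REALTIME) != 0)
#else
#define DEMO_IS_REALTIME_INODE(ip) (0)
#endif

/* Returns the name of the allocator a caller would pick for this inode. */
static const char *demo_pick_allocator(const struct demo_inode *ip)
{
	(void)ip;	/* unused when the macro expands to 0 */

	if (DEMO_IS_REALTIME_INODE(ip))
		return "realtime";	/* unreachable without DEMO_CONFIG_RT */
	return "data";
}
```

Built without -DDEMO_CONFIG_RT, demo_pick_allocator() returns "data" even for an inode whose flag bit is set, which is why the checkin message leans on the mount-time check to keep realtime filesystems from being mounted in that configuration.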
From owner-xfs@oss.sgi.com Sun Nov 4 23:01:41 2007 Received: with ECARTIS (v1.0.0; list xfs); Sun, 04 Nov 2007 23:01:51 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.183]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA571cYT019277 for ; Sun, 4 Nov 2007 23:01:41 -0800 Received: by py-out-1112.google.com with SMTP id u77so3004464pyb for ; Sun, 04 Nov 2007 23:01:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=tph1GPI9/q/B4mKi93htGjseE4k+yn0OOX1KqcZ5qRk=; b=poX06C9iUGzSK3q9s+cmKjJrs5shzGvdxX5PahVPVnyqnUex2SMqQ8VTiwCH6I45UvGiMToac4QF9bLNfVUQ9diL9XNDnS6z+xQfCcGmUzTKZP/xBOJkxz4WSLASIscVNzLFnFuZQxmzPV2HOs8+RJ0Typaf/FQmsRnBOnP3xk4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=TyUw72JHP3WNrEjNA1Shgwnf2+/8KJB3ZoncyKvWMobvWLDuJD30/DFkysdbb215Vu+JZ59AkjI/7BaDSajaBw5IjEl4IaBQuKN6CIPfqc7LGyFRNr51hmAe1GgWnWaOqcK+hsesWfNXr52Oa01oXo6Fx9iZs4nh3b3f/F4TafA= Received: by 10.64.27.13 with SMTP id a13mr12458999qba.1194246102047; Sun, 04 Nov 2007 23:01:42 -0800 (PST) Received: by 10.65.112.13 with HTTP; Sun, 4 Nov 2007 23:01:41 -0800 (PST) Message-ID: <64bb37e0711042301l54f1aca4qc36b184be5caa12b@mail.gmail.com> Date: Mon, 5 Nov 2007 08:01:41 +0100 From: "Torsten Kaiser" To: "David Chinner" Subject: Re: writeout stalls in current -git Cc: "Peter Zijlstra" , "Fengguang Wu" , "Maxim Levitsky" , linux-kernel@vger.kernel.org, "Andrew Morton" , 
linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com In-Reply-To: <20071105014510.GU66820511@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <393060478.03650@ustc.edu.cn> <393903856.06449@ustc.edu.cn> <64bb37e0711011120i63cdfe3ci18995d57b6649a8@mail.gmail.com> <64bb37e0711011200n228e708eg255640388f83da22@mail.gmail.com> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com> <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com> <20071105014510.GU66820511@sgi.com> X-Virus-Scanned: ClamAV 0.91.2/4673/Sun Nov 4 14:22:25 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13553 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: just.for.lkml@googlemail.com Precedence: bulk X-list: xfs On 11/5/07, David Chinner wrote: > On Sun, Nov 04, 2007 at 12:19:19PM +0100, Torsten Kaiser wrote: > > I can now confirm, that I see this also with the current mainline-git-version > > I used 2.6.24-rc1-git-b4f555081fdd27d13e6ff39d455d5aefae9d2c0c > > plus the fix for the sg changes in ieee1394. > > Ok, so it's probably a side effect of the writeback changes. > > Attached are two patches (two because one was in a separate patchset as > a standalone change) that should prevent async writeback from blocking > on locked inode cluster buffers. Apply the xfs-factor-inotobp patch first. > Can you see if this fixes the problem? Applied both patches against the kernel mentioned above. This blows up at boot: [ 80.807589] Filesystem "dm-0": Disabling barriers, not supported by the underlying device [ 80.820241] XFS mounting filesystem dm-0 [ 80.913144] ------------[ cut here ]------------ [ 80.914932] kernel BUG at drivers/md/raid5.c:143! 
[ 80.916751] invalid opcode: 0000 [1] SMP [ 80.918338] CPU 3 [ 80.919142] Modules linked in: [ 80.920345] Pid: 974, comm: md1_raid5 Not tainted 2.6.24-rc1 #3 [ 80.922628] RIP: 0010:[] [] __release_stripe+0x164/0x170 [ 80.925935] RSP: 0018:ffff8100060e7dd0 EFLAGS: 00010002 [ 80.927987] RAX: 0000000000000000 RBX: ffff81010141c288 RCX: 0000000000000000 [ 80.930738] RDX: 0000000000000000 RSI: ffff81010141c288 RDI: ffff810004fb3200 [ 80.933488] RBP: ffff810004fb3200 R08: 0000000000000000 R09: 0000000000000005 [ 80.936240] R10: 0000000000000e00 R11: ffffe200038465e8 R12: ffff81010141c298 [ 80.938990] R13: 0000000000000286 R14: ffff810004fb3330 R15: 0000000000000000 [ 80.941741] FS: 000000000060c870(0000) GS:ffff810100313700(0000) knlGS:0000000000000000 [ 80.944861] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 80.947080] CR2: 00007fff7b295000 CR3: 0000000101842000 CR4: 00000000000006e0 [ 80.949830] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 80.952580] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 80.955332] Process md1_raid5 (pid: 974, threadinfo ffff8100060e6000, task ffff81000645c730) [ 80.958584] Stack: ffff81010141c288 00000000000001f4 ffff810004fb3200 ffffffff804b6f2d [ 80.961761] 00000000000001f4 ffff81010141c288 ffffffff804c8bd0 0000000000000000 [ 80.964681] ffff8100060e7ee8 ffffffff804bd094 ffff81000645c730 ffff8100060e7e70 [ 80.967518] Call Trace: [ 80.968558] [] release_stripe+0x3d/0x60 [ 80.970677] [] md_thread+0x0/0x100 [ 80.972629] [] raid5d+0x344/0x450 [ 80.974549] [] process_timeout+0x0/0x10 [ 80.976668] [] schedule_timeout+0x5a/0xd0 [ 80.978855] [] md_thread+0x0/0x100 [ 80.980807] [] md_thread+0x30/0x100 [ 80.982794] [] autoremove_wake_function+0x0/0x30 [ 80.985214] [] md_thread+0x0/0x100 [ 80.987167] [] kthread+0x4b/0x80 [ 80.989054] [] child_rip+0xa/0x12 [ 80.990972] [] kthread+0x0/0x80 [ 80.992824] [] child_rip+0x0/0x12 [ 80.994743] [ 80.995588] [ 80.995588] Code: 0f 0b eb fe 0f 1f 84 00 00 00 00 00 
48 83 ec 28 48 89 5c 24 [ 80.999307] RIP [] __release_stripe+0x164/0x170 [ 81.001711] RSP Switching back to unpatched 2.6.23-mm1 boots successfully... Torsten From owner-xfs@oss.sgi.com Mon Nov 5 10:56:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Nov 2007 10:56:17 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from el-out-1112.google.com (el-out-1112.google.com [209.85.162.178]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA5Iu8EY013384 for ; Mon, 5 Nov 2007 10:56:10 -0800 Received: by el-out-1112.google.com with SMTP id v27so338910ele for ; Mon, 05 Nov 2007 10:56:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=+8MSj2380Q4bne6A0aYLYo6tXpUEmdg3tUONwknfKXE=; b=l4kdhEpjrLp+yVSA0V70Kvdi85cG45TgsFKhUyVwpFYgqz7yaJ6rg89fFmQ4dSmCreLDolmBd8BK/wBIjVMe9ZiltrhkSfGNYrRqhVKU+7begyyoP32vnSCkfjiVNMmyfLFWE7eCL+wSL0uE7p9Qj/wPpDanmoXXtPIGJfVkTts= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=YRi4YBQeIbdpFNafcL72iTJxvnB6Cskq1NRmOBYQQSHVqyGLlAXZ02vzLu5O4LY2X0TWpUkjwgFtmgGIa2aBV+N3eLP1i4f0RmZ8n8bGjFdtpYeW/lupdiADHPl3JKzWn1xcHz0OP+LmqVeloErmZNL6d6PcYIq5kwpyQn+zhd0= Received: by 10.142.201.3 with SMTP id y3mr831842wff.1194287236809; Mon, 05 Nov 2007 10:27:16 -0800 (PST) Received: by 10.65.112.13 with HTTP; Mon, 5 Nov 2007 10:27:16 -0800 (PST) Message-ID: <64bb37e0711051027v49869699s9593ea54713b15ff@mail.gmail.com> Date: Mon, 5 Nov 2007 19:27:16 +0100 From: "Torsten Kaiser" To: "David Chinner" Subject: Re:
writeout stalls in current -git Cc: "Peter Zijlstra" , "Fengguang Wu" , "Maxim Levitsky" , linux-kernel@vger.kernel.org, "Andrew Morton" , linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com In-Reply-To: <20071105014510.GU66820511@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <393060478.03650@ustc.edu.cn> <393903856.06449@ustc.edu.cn> <64bb37e0711011120i63cdfe3ci18995d57b6649a8@mail.gmail.com> <64bb37e0711011200n228e708eg255640388f83da22@mail.gmail.com> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com> <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com> <20071105014510.GU66820511@sgi.com> X-Virus-Scanned: ClamAV 0.91.2/4675/Mon Nov 5 08:20:43 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13554 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: just.for.lkml@googlemail.com Precedence: bulk X-list: xfs

On 11/5/07, David Chinner wrote:
> Ok, so it's probably a side effect of the writeback changes.
>
> Attached are two patches (two because one was in a separate patchset as
> a standalone change) that should prevent async writeback from blocking
> on locked inode cluster buffers. Apply the xfs-factor-inotobp patch first.
> Can you see if this fixes the problem?

Now testing v2.6.24-rc1-650-gb55d1b1 plus the fix for the misapplied raid5 patch.
Applying your two patches on top of that does not fix the stalls.
vmstat 10 output from unmerging (uninstalling) a kernel:
1 0 0 3512188 332 192644 0 0 185 12 368 735 10 3 85 1
-> emerge starts to remove the kernel source files
3 0 0 3506624 332 192836 0 0 15 9825 2458 8307 7 12 81 0
0 0 0 3507212 332 192836 0 0 0 554 630 1233 0 1 99 0
0 0 0 3507292 332 192836 0 0 0 537 580 1328 0 1 99 0
0 0 0 3507168 332 192836 0 0 0 633 626 1380 0 1 99 0
0 0 0 3507116 332 192836 0 0 0 1510 768 2030 1 2 97 0
0 0 0 3507596 332 192836 0 0 0 524 540 1544 0 0 99 0
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 3507540 332 192836 0 0 0 489 551 1293 0 0 99 0
0 0 0 3507528 332 192836 0 0 0 527 510 1432 1 1 99 0
0 0 0 3508052 332 192840 0 0 0 2088 910 2964 2 3 95 0
0 0 0 3507888 332 192840 0 0 0 442 565 1383 1 1 99 0
0 0 0 3508704 332 192840 0 0 0 497 529 1479 0 0 99 0
0 0 0 3508704 332 192840 0 0 0 594 595 1458 0 0 99 0
0 0 0 3511492 332 192840 0 0 0 2381 1028 2941 2 3 95 0
0 0 0 3510684 332 192840 0 0 0 699 600 1390 0 0 99 0
0 0 0 3511636 332 192840 0 0 0 741 661 1641 0 0 100 0
0 0 0 3524020 332 192840 0 0 0 2452 1080 3910 2 3 95 0
0 0 0 3524040 332 192844 0 0 0 530 617 1297 0 0 99 0
0 0 0 3524128 332 192844 0 0 0 812 674 1667 0 1 99 0
0 0 0 3527000 332 193672 0 0 339 721 754 1681 3 2 93 1
-> emerge is finished, no dirty or writeback data in /proc/meminfo
0 0 0 3571056 332 194768 0 0 111 639 632 1344 0 1 99 0
0 0 0 3571260 332 194768 0 0 0 757 688 1405 1 0 99 0
0 0 0 3571156 332 194768 0 0 0 753 641 1361 0 0 99 0
0 0 0 3571404 332 194768 0 0 0 766 653 1389 0 0 99 0
1 0 0 3571136 332 194768 0 0 6 764 669 1488 0 0 99 0
0 0 0 3571668 332 194824 0 0 0 764 657 1482 0 0 99 0
0 0 0 3571848 332 194824 0 0 0 673 659 1406 0 0 99 0
0 0 0 3571908 332 195052 0 0 22 753 638 1500 0 1 99 0
0 0 0 3573052 332 195052 0 0 0 765 631 1482 0 1 99 0
0 0 0 3574144 332 195052 0 0 0 771 640 1497 0 0 99 0
0 0 0 3573468 332 195052 0 0 0 458 485 1251 0 0 99 0
0 0 0 3574184 332 195052 0 0 0 427 474 1192 0 0 100 0
0 0 0 3575092 332 195052 0 0 0 461 482 1235 0 0 99 0
0 0 0 3576368 332 195056 0 0 0 582 556 1310 0 0 99 0
0 0 0 3579300 332 195056 0 0 0 695 571 1402 0 0 99 0
0 0 0 3580376 332 195056 0 0 0 417 568 906 0 0 99 0
0 0 0 3581212 332 195056 0 0 0 421 559 977 0 1 99 0
0 0 0 3583780 332 195060 0 0 0 494 555 1080 0 1 99 0
0 0 0 3584352 332 195060 0 0 0 99 347 559 0 0 99 0
0 0 0 3585232 332 195060 0 0 0 11 301 621 0 0 99 0
-> disks go idle.

So these patches do not seem to be the source of these excessive disk writes...

Torsten

From owner-xfs@oss.sgi.com Mon Nov 5 11:41:17 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Nov 2007 11:41:25 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.9 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE, J_CHICKENPOX_42 autolearn=no version=3.3.0-r574664 Received: from nz-out-0506.google.com (nz-out-0506.google.com [64.233.162.232]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA5JfEdg019444 for ; Mon, 5 Nov 2007 11:41:16 -0800 Received: by nz-out-0506.google.com with SMTP id x3so1002294nzd for ; Mon, 05 Nov 2007 11:41:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; bh=ullirRnf7fsRwZBlH1YzvyFmG4YErAL/OUOqVi6uLS8=; b=IvQ36/C4C6tAVQ2rKuKSRGhgY93wMsz+WeKAR5lk4vtZc683+c/CVz4PTyYNHGteVzJTEMDSFmDELi+NSDmbliSJZUbgBriag9f40qUUB46i7EWhgLPtmlftSv+e2k+nWnUtfiL0iFWCmk0QCPZOcr36EfX7yHYT2BO1rqUw7ug= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=qogWKHDaP0qAOLEfPvv/Rb+hXm/gur0LikbahJvvscOwTK+pJQiKjuJpikQ9d+oLGMiBaY7wncddsBV3OOhmSVvNN3qpenqZ7Tk+k40+cv0TjIpZtfSx5RLJ6iVsOztlivtyH9VFpclbxqAq5pNjA6JUckc/1owbi+Q7eNEhZ7U= Received: by 10.142.229.4 with
SMTP id b4mr1177219wfh.1194288172123; Mon, 05 Nov 2007 10:42:52 -0800 (PST) Received: by 10.142.162.19 with HTTP; Mon, 5 Nov 2007 10:42:52 -0800 (PST) Message-ID: Date: Tue, 6 Nov 2007 00:12:52 +0530 From: "Bhagi rathi" To: "David Chinner" Subject: Re: TAKE 972756 - Implement fallocate. Cc: sgi.bugs.xfs@engr.sgi.com, xfs@oss.sgi.com In-Reply-To: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> MIME-Version: 1.0 References: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> X-Virus-Scanned: ClamAV 0.91.2/4675/Mon Nov 5 08:20:43 2007 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1134 X-archive-position: 13555 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs David, What happens if offset is not aligned to 4k? Let's say we have a file whose size is not aligned to 4k. It could have blocks beyond the eof which haven't been zero'ed out. fallocate may increase the size and we can read garbage from disk-block if it hasn't been zero'ed out. -Thanks, Bhagi. On 11/2/07, David Chinner wrote: > > Implement fallocate. > > Implement the new generic callout for file preallocation. > Atomically change the file size if requested. 
> > > Date: Fri Nov 2 13:42:52 AEDT 2007 > Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs > Inspected by: hch@infradead.org > > The following file(s) were checked into: > longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb > > > Modid: xfs-linux-melb:xfs-kern:30009a > fs/xfs/linux-2.6/xfs_iops.c - 1.268 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/> linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.268&r2=text&tr2=1.267&f=h > > http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/linux-2.6/xfs_iops.c.diff?r1=text&tr1=1.268&r2=text&tr2=1.267&f=h > - implement ->fallocate() > > > > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Mon Nov 5 14:20:27 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Nov 2007 14:20:32 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from spike.grumly.eu.org (spike.grumly.eu.org [195.5.253.226]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA5MKOe3026420 for ; Mon, 5 Nov 2007 14:20:27 -0800 Received: by spike.grumly.eu.org (Postfix, from userid 1001) id E1FD6118F0; Mon, 5 Nov 2007 22:51:35 +0100 (CET) Date: Mon, 5 Nov 2007 22:51:35 +0100 From: Cedric - Equinoxe Media To: xfs@oss.sgi.com Subject: xfs crash Message-ID: <20071105215135.GA12238@e-m.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV 0.91.2/4675/Mon Nov 5 08:20:43 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13556 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cedric@e-m.fr Precedence: bulk X-list: xfs Hi, I got a crash with xfs serving nfs : Hardware is Dell Poweredge 2950 with RAID5 SAS Linux fng2 2.6.22-3-amd64 #1 SMP Wed Oct 31 13:43:07 UTC 2007 x86_64 GNU/Linux Here is the 
dmesg : NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory NFSD: starting 90-second grace period XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1563 of file fs/xfs/xfs_alloc.c. Caller 0xffffffff88113b35 Call Trace: [] :xfs:xfs_free_ag_extent+0x1a6/0x6b5 [] :xfs:xfs_free_extent+0xa9/0xc9 [] :xfs:xfs_bmap_finish+0xee/0x167 [] :xfs:xfs_itruncate_finish+0x19b/0x2e0 [] :xfs:xfs_setattr+0x841/0xe57 [] __mod_timer+0xc3/0xd3 [] task_rq_lock+0x3d/0x6f [] __activate_task+0x26/0x38 [] :xfs:xfs_vn_setattr+0x121/0x144 [] notify_change+0x156/0x2f1 [] :nfsd:nfsd_setattr+0x334/0x4b1 [] :nfsd:nfsd3_proc_setattr+0xa2/0xae [] :nfsd:nfsd_dispatch+0xdd/0x19e [] :sunrpc:svc_process+0x3df/0x6ef [] __down_read+0x12/0x9a [] :nfsd:nfsd+0x191/0x2ac [] child_rip+0xa/0x12 [] :nfsd:nfsd+0x0/0x2ac [] child_rip+0x0/0x12 xfs_force_shutdown(sda4,0x8) called from line 4258 of file fs/xfs/xfs_bmap.c. Return address = 0xffffffff8811cfb4 Filesystem "sda4": Corruption of in-memory data detected. Shutting down filesystem: sda4 Please umount the filesystem, and rectify the problem(s) nfsd: non-standard errno: -117 ---------------- Here I stopped nfs, umount -f /dev/sda4, mount /dev/sda4 then start nfs again. ---------------- nfsd: last server has exited nfsd: unexporting all filesystems xfs_force_shutdown(sda4,0x1) called from line 423 of file fs/xfs/xfs_rw.c. Return address = 0xffffffff88158289 xfs_force_shutdown(sda4,0x1) called from line 423 of file fs/xfs/xfs_rw.c. Return address = 0xffffffff88158289 XFS mounting filesystem sda4 Starting XFS recovery on filesystem: sda4 (logdev: internal) XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1563 of file fs/xfs/xfs_alloc.c. 
Caller 0xffffffff88113b35 Call Trace: [] :xfs:xfs_free_ag_extent+0x1a6/0x6b5 [] :xfs:xfs_free_extent+0xa9/0xc9 [] :xfs:xlog_recover_process_efi+0xf7/0x12a [] :xfs:xlog_recover_process_efis+0x4f/0x81 [] :xfs:xlog_recover_finish+0x19/0x9a [] :xfs:xfs_mountfs+0x83d/0x91b [] _atomic_dec_and_lock+0x39/0x58 [] :xfs:xfs_mount+0x317/0x39d [] :xfs:xfs_fs_fill_super+0x0/0x1a7 [] :xfs:xfs_fs_fill_super+0x7e/0x1a7 [] __down_write_nested+0x12/0x9a [] get_filesystem+0x12/0x35 [] sget+0x39d/0x3af [] set_bdev_super+0x0/0xf [] test_bdev_super+0x0/0xd [] get_sb_bdev+0x105/0x152 [] vfs_kern_mount+0x93/0x11a [] do_kern_mount+0x43/0xdd [] do_mount+0x691/0x708 [] mntput_no_expire+0x1c/0x94 [] link_path_walk+0xce/0xe0 [] activate_page+0xad/0xd4 [] find_get_page+0x21/0x50 [] filemap_nopage+0x180/0x2ab [] __handle_mm_fault+0x3e6/0x9d9 [] zone_statistics+0x3f/0x60 [] __up_read+0x13/0x8a [] __alloc_pages+0x5a/0x2bc [] sys_mount+0x8a/0xd7 [] system_call+0x7e/0x83 Ending XFS recovery on filesystem: sda4 (logdev: internal) NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory NFSD: starting 90-second grace period I have no other message in the dmesg, the server has latest RAID firmware from dell. Cédric. 
From owner-xfs@oss.sgi.com Mon Nov 5 16:12:34 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Nov 2007 16:12:36 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_42 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA60CTqv007559 for ; Mon, 5 Nov 2007 16:12:33 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA12377; Tue, 6 Nov 2007 11:12:26 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA60COdD94499571; Tue, 6 Nov 2007 11:12:25 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA60CNUj95451472; Tue, 6 Nov 2007 11:12:23 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Tue, 6 Nov 2007 11:12:23 +1100 From: David Chinner To: Bhagi rathi Cc: David Chinner , xfs@oss.sgi.com Subject: Re: TAKE 972756 - Implement fallocate. Message-ID: <20071106001223.GY66820511@sgi.com> References: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4676/Mon Nov 5 13:20:22 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13557 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Tue, Nov 06, 2007 at 12:12:52AM +0530, Bhagi rathi wrote: > David, What happens if offset is not aligned to 4k? 
Let's say we have a file > whose size is > not aligned to 4k. It could have blocks beyond the eof which haven't been > zero'ed out. No it won't. They are *preallocated* blocks, which by definition are zero-filled. Preallocated blocks are marked as unwritten on disk, so it is known that they contain zeros, even if they lie beyond EOF. Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Mon Nov 5 20:25:51 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Nov 2007 20:25:55 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA64PmG0016534 for ; Mon, 5 Nov 2007 20:25:50 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id PAA16934; Tue, 6 Nov 2007 15:25:39 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA64PZdD95507918; Tue, 6 Nov 2007 15:25:36 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA64PRU691691920; Tue, 6 Nov 2007 15:25:27 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Tue, 6 Nov 2007 15:25:27 +1100 From: David Chinner To: Torsten Kaiser Cc: David Chinner , Peter Zijlstra , Fengguang Wu , Maxim Levitsky , linux-kernel@vger.kernel.org, Andrew Morton , linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com Subject: Re: writeout stalls in current -git Message-ID: <20071106042527.GT995458@sgi.com> References: <393903856.06449@ustc.edu.cn> <64bb37e0711011120i63cdfe3ci18995d57b6649a8@mail.gmail.com> 
<64bb37e0711011200n228e708eg255640388f83da22@mail.gmail.com> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com> <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com> <20071105014510.GU66820511@sgi.com> <64bb37e0711051027v49869699s9593ea54713b15ff@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <64bb37e0711051027v49869699s9593ea54713b15ff@mail.gmail.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4678/Mon Nov 5 17:20:26 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13558 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs

On Mon, Nov 05, 2007 at 07:27:16PM +0100, Torsten Kaiser wrote:
> On 11/5/07, David Chinner wrote:
> > Ok, so it's probably a side effect of the writeback changes.
> >
> > Attached are two patches (two because one was in a separate patchset as
> > a standalone change) that should prevent async writeback from blocking
> > on locked inode cluster buffers. Apply the xfs-factor-inotobp patch first.
> > Can you see if this fixes the problem?
>
> Now testing v2.6.24-rc1-650-gb55d1b1+ the fix for the missapplied raid5-patch
> Applying your two patches ontop of that does not fix the stalls.

So you are having RAID5 problems as well?

I'm struggling to understand what could possibly have changed in XFS or writeback that would lead to stalls like this, esp. as you appear to be removing files when the stalls occur.

Rather than vmstat, can you use something like iostat to show how busy your disks are? i.e. are we seeing RMW cycles in the raid5 or some such issue?

OOC, what is the 'xfs_info ' output for your filesystem?
> vmstat 10 output from unmerging (uninstalling) a kernel:
> 1 0 0 3512188 332 192644 0 0 185 12 368 735 10 3 85 1
> -> emerge starts to remove the kernel source files
> 3 0 0 3506624 332 192836 0 0 15 9825 2458 8307 7 12 81 0
> 0 0 0 3507212 332 192836 0 0 0 554 630 1233 0 1 99 0
> 0 0 0 3507292 332 192836 0 0 0 537 580 1328 0 1 99 0
> 0 0 0 3507168 332 192836 0 0 0 633 626 1380 0 1 99 0
> 0 0 0 3507116 332 192836 0 0 0 1510 768 2030 1 2 97 0
> 0 0 0 3507596 332 192836 0 0 0 524 540 1544 0 0 99 0
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 0 3507540 332 192836 0 0 0 489 551 1293 0 0 99 0
> 0 0 0 3507528 332 192836 0 0 0 527 510 1432 1 1 99 0
> 0 0 0 3508052 332 192840 0 0 0 2088 910 2964 2 3 95 0
> 0 0 0 3507888 332 192840 0 0 0 442 565 1383 1 1 99 0
> 0 0 0 3508704 332 192840 0 0 0 497 529 1479 0 0 99 0
> 0 0 0 3508704 332 192840 0 0 0 594 595 1458 0 0 99 0
> 0 0 0 3511492 332 192840 0 0 0 2381 1028 2941 2 3 95 0
> 0 0 0 3510684 332 192840 0 0 0 699 600 1390 0 0 99 0
> 0 0 0 3511636 332 192840 0 0 0 741 661 1641 0 0 100 0
> 0 0 0 3524020 332 192840 0 0 0 2452 1080 3910 2 3 95 0
> 0 0 0 3524040 332 192844 0 0 0 530 617 1297 0 0 99 0
> 0 0 0 3524128 332 192844 0 0 0 812 674 1667 0 1 99 0
> 0 0 0 3527000 332 193672 0 0 339 721 754 1681 3 2 93 1
> -> emerge is finished, no dirty or writeback data in /proc/meminfo

At this point, can you run a "sync" and see how long that takes to complete? The only thing I can think of that would be written out after this point is inodes, but even then it seems to go on for a long, long time and it really doesn't seem like XFS is holding up the inode writes.

Another option is to use blktrace/blkparse to determine which process is issuing this I/O.

> 0 0 0 3583780 332 195060 0 0 0 494 555 1080 0 1 99 0
> 0 0 0 3584352 332 195060 0 0 0 99 347 559 0 0 99 0
> 0 0 0 3585232 332 195060 0 0 0 11 301 621 0 0 99 0
> -> disks go idle.
> > So these patches do not seem to be the source of these excessive disk writes... Well, the patches I posted should prevent blocking in the places that it was seen, so if that does not stop the slowdowns then either the writeback code is not feeding us inodes fast enough or the block device below is having some kind of problem.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group
From owner-xfs@oss.sgi.com Mon Nov 5 23:10:20 2007 Received: with ECARTIS (v1.0.0; list xfs); Mon, 05 Nov 2007 23:10:25 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_51 autolearn=no version=3.3.0-r574664 Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.176]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA67AIrt002874 for ; Mon, 5 Nov 2007 23:10:20 -0800 Received: by py-out-1112.google.com with SMTP id u77so3727623pyb for ; Mon, 05 Nov 2007 23:10:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=googlemail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=EW1c170Cm1YvLTbkhmS2rWCfzevYC/msaicm3fzL84g=; b=WfqbkELWbx8eW7EW37gMiSpe89OHtBWvLgpdhuZNQM3VB1UCWI/2oIwv+AuPCeWoMun6jVxj1T3N1EumlFMF/T5pQivS1BMGr02htqeu9eLJw0qoWkN50hvncQXQffdRGa5YufUTgF/ZViVGh1/D2HpF1fZP2jTr3WwL7yzRYLQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=GuKXd6gsRcXZUYdU3CtPQCkSJj73Jcl51yEV8Y4SGrwjFmZp2f64OKNvd6ra+B+xXqRNR165tNLLVCk59SxmKI0G3t/gw5Cqzansdlrw1+AKUTOOk2Czetu+omd3xzBBguIKriWUNOiUo3cDv/yeXRDeiAPJlf3Qnj57xVVKkK4= Received: by 10.65.153.10 with SMTP id f10mr9442879qbo.1194333021770; Mon, 05 Nov 2007 23:10:21 -0800 (PST) Received: by 10.65.112.13 with HTTP; Mon, 5 Nov 2007 23:10:21 -0800 (PST) Message-ID: <64bb37e0711052310r5214cf50nd148d989524490ea@mail.gmail.com> Date: Tue, 6 Nov 2007 08:10:21 +0100 From: "Torsten Kaiser" To: "David Chinner" Subject: Re: writeout stalls in current -git Cc: "Peter Zijlstra" , "Fengguang Wu" , "Maxim Levitsky" , linux-kernel@vger.kernel.org, "Andrew Morton" , linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com In-Reply-To: <20071106042527.GT995458@sgi.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <393903856.06449@ustc.edu.cn> <64bb37e0711011200n228e708eg255640388f83da22@mail.gmail.com> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com> <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com> <20071105014510.GU66820511@sgi.com> <64bb37e0711051027v49869699s9593ea54713b15ff@mail.gmail.com> <20071106042527.GT995458@sgi.com> X-Virus-Scanned: ClamAV 0.91.2/4680/Mon Nov 5 20:49:40 2007 on 
oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13560
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: just.for.lkml@googlemail.com
Precedence: bulk
X-list: xfs

On 11/6/07, David Chinner wrote:
> On Mon, Nov 05, 2007 at 07:27:16PM +0100, Torsten Kaiser wrote:
> > On 11/5/07, David Chinner wrote:
> > > Ok, so it's probably a side effect of the writeback changes.
> > >
> > > Attached are two patches (two because one was in a separate patchset as
> > > a standalone change) that should prevent async writeback from blocking
> > > on locked inode cluster buffers. Apply the xfs-factor-inotobp patch first.
> > > Can you see if this fixes the problem?
> >
> > Now testing v2.6.24-rc1-650-gb55d1b1+ the fix for the misapplied raid5-patch
> > Applying your two patches on top of that does not fix the stalls.
>
> So you are having RAID5 problems as well?

The first 2.6.24-rc1-git-kernel that I patched with your patches did
not boot for me. (Oops sent in one of my previous mails.) But given
that the stacktrace was not xfs related and I had seen this patch on
the lkml, I tried to fix this Oops this way.
I did not have trouble with the RAID5 otherwise.

> I'm struggling to understand what possibly changed in XFS or writeback that
> would lead to stalls like this, esp. as you appear to be removing files when
> the stalls occur. Rather than vmstat, can you use something like iostat to
> show how busy your disks are? i.e. are we seeing RMW cycles in the raid5 or
> some such issue.

Will do this this evening.

> OOC, what is the 'xfs_info ' output for your filesystem?

meta-data=/dev/mapper/root       isize=256    agcount=32, agsize=4731132 blks
         =                       sectsz=512   attr=1
data     =                       bsize=4096   blocks=151396224, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

> > vmstat 10 output from unmerging (uninstalling) a kernel:
> > 1 0 0 3512188 332 192644 0 0 185 12 368 735 10 3 85 1
> > -> emerge starts to remove the kernel source files
> > 3 0 0 3506624 332 192836 0 0 15 9825 2458 8307 7 12 81 0
> > 0 0 0 3507212 332 192836 0 0 0 554 630 1233 0 1 99 0
> > 0 0 0 3507292 332 192836 0 0 0 537 580 1328 0 1 99 0
> > 0 0 0 3507168 332 192836 0 0 0 633 626 1380 0 1 99 0
> > 0 0 0 3507116 332 192836 0 0 0 1510 768 2030 1 2 97 0
> > 0 0 0 3507596 332 192836 0 0 0 524 540 1544 0 0 99 0
> > procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> > r b swpd free buff cache si so bi bo in cs us sy id wa
> > 0 0 0 3507540 332 192836 0 0 0 489 551 1293 0 0 99 0
> > 0 0 0 3507528 332 192836 0 0 0 527 510 1432 1 1 99 0
> > 0 0 0 3508052 332 192840 0 0 0 2088 910 2964 2 3 95 0
> > 0 0 0 3507888 332 192840 0 0 0 442 565 1383 1 1 99 0
> > 0 0 0 3508704 332 192840 0 0 0 497 529 1479 0 0 99 0
> > 0 0 0 3508704 332 192840 0 0 0 594 595 1458 0 0 99 0
> > 0 0 0 3511492 332 192840 0 0 0 2381 1028 2941 2 3 95 0
> > 0 0 0 3510684 332 192840 0 0 0 699 600 1390 0 0 99 0
> > 0 0 0 3511636 332 192840 0 0 0 741 661 1641 0 0 100 0
> > 0 0 0 3524020 332 192840 0 0 0 2452 1080 3910 2 3 95 0
> > 0 0 0 3524040 332 192844 0 0 0 530 617 1297 0 0 99 0
> > 0 0 0 3524128 332 192844 0 0 0 812 674 1667 0 1 99 0
> > 0 0 0 3527000 332 193672 0 0 339 721 754 1681 3 2 93 1
> > -> emerge is finished, no dirty or writeback data in /proc/meminfo
>
> At this point, can you run a "sync" and see how long that takes to
> complete?

Already tried that: http://lkml.org/lkml/2007/11/2/178
See the logs from the second unmerge in the second half of the mail.
The sync did not stop this writeout, but returned immediately.

> The only thing I can think that would be written out after
> this point is inodes, but even then it seems to go on for a long,
> long time and it really doesn't seem like XFS is holding up the
> inode writes.

Yes, I completely agree that this is much too long. That's why I
included the after-emerge-finished parts of the logs.
But I still partly suspect xfs, because the xfssyncd shows up when I
hit SysRq+W.

> Another option is to use blktrace/blkparse to determine which process is
> issuing this I/O.
>
> > 0 0 0 3583780 332 195060 0 0 0 494 555 1080 0 1 99 0
> > 0 0 0 3584352 332 195060 0 0 0 99 347 559 0 0 99 0
> > 0 0 0 3585232 332 195060 0 0 0 11 301 621 0 0 99 0
> > -> disks go idle.
> >
> > So these patches do not seem to be the source of these excessive disk writes...
>
> Well, the patches I posted should prevent blocking in the places that it
> was seen, so if that does not stop the slowdowns then either the writeback
> code is not feeding us inodes fast enough or the block device below is
> having some kind of problem....

I don't think it's the block device, because reading/writing larger
files does not seem to be troubled. It looks much more like an inode
problem. For example both installing and uninstalling kernel source
trees show these stalls, but during uninstalling this is much more
noticeable.

But I agree that this might not be xfs specific, as this showed up at
the same time as other people started reporting about the 100% iowait
bug. Could be that this is the same bug and the differences between
reiserfs and xfs might explain the iowait vs. idle. Or the reason I
don't see the 100% iowait is something else on my system...
Torsten From owner-xfs@oss.sgi.com Tue Nov 6 01:18:53 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 01:19:17 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.4 required=5.0 tests=AWL,BAYES_50,J_CHICKENPOX_21, J_CHICKENPOX_23,J_CHICKENPOX_31,J_CHICKENPOX_42,J_CHICKENPOX_43, J_CHICKENPOX_44,J_CHICKENPOX_45,J_CHICKENPOX_46,J_CHICKENPOX_47, J_CHICKENPOX_48,J_CHICKENPOX_61,J_CHICKENPOX_62,J_CHICKENPOX_63, J_CHICKENPOX_64,J_CHICKENPOX_65,J_CHICKENPOX_73 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA69Ihsn002714 for ; Tue, 6 Nov 2007 01:18:45 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id UAA21882; Tue, 6 Nov 2007 20:18:38 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA69IbdD95551378; Tue, 6 Nov 2007 20:18:38 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA69Ia1s95104290; Tue, 6 Nov 2007 20:18:36 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Tue, 6 Nov 2007 20:18:36 +1100 From: David Chinner To: xfs-dev Cc: xfs-oss Subject: [PATCH,RFC] Factor some btree code.... 
Message-ID: <20071106091836.GV995458@sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
User-Agent: Mutt/1.4.2.1i
X-Virus-Scanned: ClamAV 0.91.2/4680/Mon Nov 5 20:49:40 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13562
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: dgc@sgi.com
Precedence: bulk
X-list: xfs

Only a small patch. ;)

Basically, I need to introduce new formats to some of the btree block
fields (crc, uuids) for resilience and recovery purposes. Rather than
have to copy large chunks of three separate btree implementations, I
decided that I'd factor them into one implementation first.

The approach I took was to build a bunch of ops structures that each
different btree structure could implement. Basically, all the btrees
do the same fundamental operations, so it should be easy to do. Right?

I've formed a "btree core" set of functions that operate on:

	xfs_btree_block_t - a generic btree block
	xfs_btree_key_t - a union of the different key types
	xfs_btree_ptr_t - a union of the different pointer types
	xfs_btree_rec_t - a union of the different record types

These are passed around the core btree code in disk endian format and
the callouts convert to/from disk endian format as needed. There are
operations for initialising keys, ptrs and records from either the
cursor or other keys or records. There are operations for moving them,
getting the address within the block of a given index within a block,
logging the changes made etc.

There are various block operations, e.g. allocating and freeing blocks,
logging block headers, etc., in a separate ops structure.
Some of the remaining operations are lumped into a "cursor ops"
structure - I think I'll probably fold them back into the block ops
structure, or even just make it one large ops structure for everything
- there's really no need for multiple ops structures, except for....

... the btree tracing code. I haven't completed that yet, but the btree
core inherits the tracing code from the bmap btree code, so we'll have
fine-grained tracing on all btree operations once this is complete.

The core btree code also got factored and commenting was improved; the
result is that the code is now readable and understandable, which it
certainly wasn't before I began this.

A further feature is that the core btree code now supports the btree
root being placed in an inode. I still need to move the extent format
code into the core as well as some of the root manipulation code, but
in future the only difference between a pointer rooted btree (e.g.
freespace trees) and an inode rooted btree (inode extent btree) will be
a single flag being set during the btree cursor initialisation.

The result of all this is a massive patch that cleans up a lot of
stuff, introduces new functionality into the btree code and reduces
each btree implementation down to a relatively simple set of operations
to write. The freespace btrees (xfs_alloc_btree.c) have gone from 2200
lines to 900, the inode btree (xfs_ialloc_btree.c) has gone from 2000
lines to less than 800, and the bmap btree has gone from ~2600 lines to
~1400. There's probably more that can be trimmed as well.

On top of this, modifying the btree structures will now involve writing
only a handful of new functions instead of duplicating most of those
three files mentioned above.

The next question - does it work? Well, apart from test 042 (massively
fragmented file and freespace btree) and occasional 013 (fstress) and
083 (fstress @ ENOSPC) corruptions, it runs fine.
Indeed, I just did an apt-get update that replaced about 500MB of the
binaries on the root drive of my test box, updated a git tree and
rebuilt a kernel and the filesystem survived that just fine. So, while
I would not recommend it for production yet, it's definitely usable.

The problems remaining stem from level 3 btrees and larger, and I need
the btree tracing code working to trace those problems (it doesn't work
yet).

There's plenty still to clean up in the patch, but I thought that
pushing it out early for comment would be better than leaving it until
I had everything working.

Thoughts, comments, flames? (Eric, I'm looking at you and your 3-way
diffstats ;)

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---
 fs/xfs/xfs.h              |    2
 fs/xfs/xfs_alloc.c        |   48
 fs/xfs/xfs_alloc_btree.c  | 2611 +++++++++--------------------------
 fs/xfs/xfs_alloc_btree.h  |    2
 fs/xfs/xfs_bmap.c         |   58
 fs/xfs/xfs_bmap_btree.c   | 3307 ++++++++++++++------------------------
 fs/xfs/xfs_bmap_btree.h   |    8
 fs/xfs/xfs_btree.c        |  351 +++-
 fs/xfs/xfs_btree.h        |  419 +++++
 fs/xfs/xfs_btree_core.c   | 2299 +++++++++++++++++++++++++++++++
 fs/xfs/xfs_btree_trace.c  |  202 ++
 fs/xfs/xfs_ialloc.c       |   24
 fs/xfs/xfs_ialloc_btree.c | 2399 +++++++-------------------------
 fs/xfs/xfs_ialloc_btree.h |    2
 fs/xfs/xfs_itable.c       |    6
 15 files changed, 5497 insertions(+), 6241 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/xfs.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs.h 2007-09-12 15:41:22.000000000 +1000
+++ 2.6.x-xfs-new/fs/xfs/xfs.h 2007-11-06 19:40:29.694676106 +1100
@@ -30,7 +30,7 @@
 #define XFS_ATTR_TRACE 1
 #define XFS_BLI_TRACE 1
 #define XFS_BMAP_TRACE 1
-#define XFS_BMBT_TRACE 1
+#define XFS_BTREE_TRACE 1
 #define XFS_DIR2_TRACE 1
 #define XFS_DQUOT_TRACE 1
 #define XFS_ILOCK_TRACE 1
Index: 2.6.x-xfs-new/fs/xfs/xfs_alloc.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_alloc.c
2007-10-16 08:52:58.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_alloc.c 2007-11-06 19:40:29.694676106 +1100 @@ -334,7 +334,7 @@ xfs_alloc_fixup_trees( /* * Delete the entry from the by-size btree. */ - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 1); /* @@ -344,7 +344,7 @@ xfs_alloc_fixup_trees( if ((error = xfs_alloc_lookup_eq(cnt_cur, nfbno1, nflen1, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 0); - if ((error = xfs_alloc_insert(cnt_cur, &i))) + if ((error = xfs_btree_insert(cnt_cur, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 1); } @@ -352,7 +352,7 @@ xfs_alloc_fixup_trees( if ((error = xfs_alloc_lookup_eq(cnt_cur, nfbno2, nflen2, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 0); - if ((error = xfs_alloc_insert(cnt_cur, &i))) + if ((error = xfs_btree_insert(cnt_cur, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 1); } @@ -363,7 +363,7 @@ xfs_alloc_fixup_trees( /* * No remaining freespace, just delete the by-block tree entry. 
*/ - if ((error = xfs_alloc_delete(bno_cur, &i))) + if ((error = xfs_btree_delete(bno_cur, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 1); } else { @@ -380,7 +380,7 @@ xfs_alloc_fixup_trees( if ((error = xfs_alloc_lookup_eq(bno_cur, nfbno2, nflen2, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 0); - if ((error = xfs_alloc_insert(bno_cur, &i))) + if ((error = xfs_btree_insert(bno_cur, &i))) return error; XFS_WANT_CORRUPTED_RETURN(i == 1); } @@ -819,7 +819,7 @@ xfs_alloc_ag_vextent_near( XFS_WANT_CORRUPTED_GOTO(i == 1, error0); if (ltlen >= args->minlen) break; - if ((error = xfs_alloc_increment(cnt_cur, 0, &i))) + if ((error = xfs_btree_increment(cnt_cur, 0, &i))) goto error0; } while (i); ASSERT(ltlen >= args->minlen); @@ -829,7 +829,7 @@ xfs_alloc_ag_vextent_near( i = cnt_cur->bc_ptrs[0]; for (j = 1, blen = 0, bdiff = 0; !error && j && (blen < args->maxlen || bdiff > 0); - error = xfs_alloc_increment(cnt_cur, 0, &j)) { + error = xfs_btree_increment(cnt_cur, 0, &j)) { /* * For each entry, decide if it's better than * the previous best entry. @@ -939,7 +939,7 @@ xfs_alloc_ag_vextent_near( * Increment the cursor, so we will point at the entry just right * of the leftward entry if any, or to the leftmost entry. */ - if ((error = xfs_alloc_increment(bno_cur_gt, 0, &i))) + if ((error = xfs_btree_increment(bno_cur_gt, 0, &i))) goto error0; if (!i) { /* @@ -962,7 +962,7 @@ xfs_alloc_ag_vextent_near( args->alignment, args->minlen, &ltbnoa, &ltlena)) break; - if ((error = xfs_alloc_decrement(bno_cur_lt, 0, &i))) + if ((error = xfs_btree_decrement(bno_cur_lt, 0, &i))) goto error0; if (!i) { xfs_btree_del_cursor(bno_cur_lt, @@ -978,7 +978,7 @@ xfs_alloc_ag_vextent_near( args->alignment, args->minlen, &gtbnoa, &gtlena)) break; - if ((error = xfs_alloc_increment(bno_cur_gt, 0, &i))) + if ((error = xfs_btree_increment(bno_cur_gt, 0, &i))) goto error0; if (!i) { xfs_btree_del_cursor(bno_cur_gt, @@ -1067,7 +1067,7 @@ xfs_alloc_ag_vextent_near( /* * Fell off the right end.
*/ - if ((error = xfs_alloc_increment( + if ((error = xfs_btree_increment( bno_cur_gt, 0, &i))) goto error0; if (!i) { @@ -1163,7 +1163,7 @@ xfs_alloc_ag_vextent_near( /* * Fell off the left end. */ - if ((error = xfs_alloc_decrement( + if ((error = xfs_btree_decrement( bno_cur_lt, 0, &i))) goto error0; if (!i) { @@ -1322,7 +1322,7 @@ xfs_alloc_ag_vextent_size( bestflen = flen; bestfbno = fbno; for (;;) { - if ((error = xfs_alloc_decrement(cnt_cur, 0, &i))) + if ((error = xfs_btree_decrement(cnt_cur, 0, &i))) goto error0; if (i == 0) break; @@ -1417,7 +1417,7 @@ xfs_alloc_ag_vextent_small( xfs_extlen_t flen; int i; - if ((error = xfs_alloc_decrement(ccur, 0, &i))) + if ((error = xfs_btree_decrement(ccur, 0, &i))) goto error0; if (i) { if ((error = xfs_alloc_get_rec(ccur, &fbno, &flen, &i))) @@ -1550,7 +1550,7 @@ xfs_free_ag_extent( * Look for a neighboring block on the right (higher block numbers) * that is contiguous with this space. */ - if ((error = xfs_alloc_increment(bno_cur, 0, &haveright))) + if ((error = xfs_btree_increment(bno_cur, 0, &haveright))) goto error0; if (haveright) { /* @@ -1589,7 +1589,7 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, ltbno, ltlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* @@ -1598,19 +1598,19 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, gtbno, gtlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* * Delete the old by-block entry for the right block. */ - if ((error = xfs_alloc_delete(bno_cur, &i))) + if ((error = xfs_btree_delete(bno_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* * Move the by-block cursor back to the left neighbor. 
*/ - if ((error = xfs_alloc_decrement(bno_cur, 0, &i))) + if ((error = xfs_btree_decrement(bno_cur, 0, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); #ifdef DEBUG @@ -1649,14 +1649,14 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, ltbno, ltlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* * Back up the by-block cursor to the left neighbor, and * update its length. */ - if ((error = xfs_alloc_decrement(bno_cur, 0, &i))) + if ((error = xfs_btree_decrement(bno_cur, 0, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); nbno = ltbno; @@ -1675,7 +1675,7 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, gtbno, gtlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_delete(cnt_cur, &i))) + if ((error = xfs_btree_delete(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); /* @@ -1694,7 +1694,7 @@ xfs_free_ag_extent( else { nbno = bno; nlen = len; - if ((error = xfs_alloc_insert(bno_cur, &i))) + if ((error = xfs_btree_insert(bno_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); } @@ -1706,7 +1706,7 @@ xfs_free_ag_extent( if ((error = xfs_alloc_lookup_eq(cnt_cur, nbno, nlen, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 0, error0); - if ((error = xfs_alloc_insert(cnt_cur, &i))) + if ((error = xfs_btree_insert(cnt_cur, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); Index: 2.6.x-xfs-new/fs/xfs/xfs_alloc_btree.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_alloc_btree.c 2007-05-22 19:04:51.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_alloc_btree.c 2007-11-06 19:40:29.702675076 +1100 @@ -39,519 +39,119 @@ #include "xfs_alloc.h" #include "xfs_error.h" + /* - * Prototypes for 
internal functions. + * Get the block pointer for the given level of the cursor. + * Fill in the buffer pointer, if applicable. */ +STATIC xfs_btree_block_t * +xfs_alloc_get_block( + xfs_btree_cur_t *cur, + int level, + xfs_buf_t **bpp) +{ + ASSERT(level < cur->bc_nlevels); + *bpp = cur->bc_bufs[level]; + return (xfs_btree_block_t *)XFS_BUF_TO_ALLOC_BLOCK(*bpp); +} -STATIC void xfs_alloc_log_block(xfs_trans_t *, xfs_buf_t *, int); -STATIC void xfs_alloc_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_alloc_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_alloc_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC int xfs_alloc_lshift(xfs_btree_cur_t *, int, int *); -STATIC int xfs_alloc_newroot(xfs_btree_cur_t *, int *); -STATIC int xfs_alloc_rshift(xfs_btree_cur_t *, int, int *); -STATIC int xfs_alloc_split(xfs_btree_cur_t *, int, xfs_agblock_t *, - xfs_alloc_key_t *, xfs_btree_cur_t **, int *); -STATIC int xfs_alloc_updkey(xfs_btree_cur_t *, xfs_alloc_key_t *, int); -/* - * Internal functions. - */ +STATIC int +xfs_alloc_get_buf( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr, + int flags, + xfs_buf_t **bpp) +{ + xfs_buf_t *bp; -/* - * Single level of the xfs_alloc_delete record deletion routine. - * Delete record pointed to by cur/level. - * Remove the record from its block then rebalance the tree. - * Return 0 for error, 1 for done, 2 to go on to the next level. 
- */ -STATIC int /* error */ -xfs_alloc_delrec( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level removing record from */ - int *stat) /* fail/done/go-on */ + bp = xfs_btree_get_bufs(cur->bc_mp, cur->bc_tp, cur->bc_private.a.agno, + be32_to_cpu(ptr->u.alloc), flags); + *bpp = bp; + return 0; + +} + +STATIC int +xfs_alloc_read_buf( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr, + int flags, + xfs_buf_t **bpp) { - xfs_agf_t *agf; /* allocation group freelist header */ - xfs_alloc_block_t *block; /* btree block record/key lives in */ - xfs_agblock_t bno; /* btree block number */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop index */ - xfs_alloc_key_t key; /* kp points here if block is level 0 */ - xfs_agblock_t lbno; /* left block's block number */ - xfs_buf_t *lbp; /* left block's buffer pointer */ - xfs_alloc_block_t *left; /* left btree block */ - xfs_alloc_key_t *lkp=NULL; /* left block key pointer */ - xfs_alloc_ptr_t *lpp=NULL; /* left block address pointer */ - int lrecs=0; /* number of records in left block */ - xfs_alloc_rec_t *lrp; /* left block record pointer */ - xfs_mount_t *mp; /* mount structure */ - int ptr; /* index in btree block for this rec */ - xfs_agblock_t rbno; /* right block's block number */ - xfs_buf_t *rbp; /* right block's buffer pointer */ - xfs_alloc_block_t *right; /* right btree block */ - xfs_alloc_key_t *rkp; /* right block key pointer */ - xfs_alloc_ptr_t *rpp; /* right block address pointer */ - int rrecs=0; /* number of records in right block */ - int numrecs; - xfs_alloc_rec_t *rrp; /* right block record pointer */ - xfs_btree_cur_t *tcur; /* temporary btree cursor */ + return xfs_btree_read_bufs(cur->bc_mp, + cur->bc_tp, cur->bc_private.a.agno, + be32_to_cpu(ptr->u.alloc), flags, + bpp, XFS_ALLOC_BTREE_REF); +} +STATIC xfs_btree_block_t * +xfs_alloc_buf_to_block( + xfs_btree_cur_t *cur, + xfs_buf_t *bp) +{ + /* XFS_BUF_TO_ALLOC_BLOCK(rbp); */ + return 
XFS_BUF_TO_BLOCK(bp); +} + +STATIC void +xfs_alloc_buf_to_ptr( + xfs_btree_cur_t *cur, + xfs_buf_t *bp, + xfs_btree_ptr_t *ptr) +{ + ptr->u.alloc = cpu_to_be32(XFS_DADDR_TO_AGBNO(cur->bc_mp, XFS_BUF_ADDR(bp))); +} + +STATIC int +xfs_alloc_alloc_block( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *start, + xfs_btree_ptr_t *new, + int length, + int *stat) +{ + int error; + xfs_agblock_t bno; + + XFS_BTREE_TRACE_CURSOR(cur, ENTER); /* - * Get the index of the entry being deleted, check for nothing there. - */ - ptr = cur->bc_ptrs[level]; - if (ptr == 0) { - *stat = 0; - return 0; - } - /* - * Get the buffer & block containing the record or key/ptr. + * Allocate the new block from the freelist. + * If we can't do it, we're toast. Give up. */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) + error = xfs_alloc_get_freelist(cur->bc_tp, + cur->bc_private.a.agbp, &bno, 1); + if (error) { + XFS_BTREE_TRACE_CURSOR(cur, ERROR); return error; -#endif - /* - * Fail if we're off the end of the block. - */ - numrecs = be16_to_cpu(block->bb_numrecs); - if (ptr > numrecs) { + } + if (bno == NULLAGBLOCK) { + XFS_BTREE_TRACE_CURSOR(cur, EXIT); *stat = 0; return 0; } - XFS_STATS_INC(xs_abt_delrec); - /* - * It's a nonleaf. Excise the key and ptr being deleted, by - * sliding the entries past them down one. - * Log the changed areas of the block. - */ - if (level > 0) { - lkp = XFS_ALLOC_KEY_ADDR(block, 1, cur); - lpp = XFS_ALLOC_PTR_ADDR(block, 1, cur); -#ifdef DEBUG - for (i = ptr; i < numrecs; i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(lpp[i]), level))) - return error; - } -#endif - if (ptr < numrecs) { - memmove(&lkp[ptr - 1], &lkp[ptr], - (numrecs - ptr) * sizeof(*lkp)); - memmove(&lpp[ptr - 1], &lpp[ptr], - (numrecs - ptr) * sizeof(*lpp)); - xfs_alloc_log_ptrs(cur, bp, ptr, numrecs - 1); - xfs_alloc_log_keys(cur, bp, ptr, numrecs - 1); - } - } - /* - * It's a leaf. 
Excise the record being deleted, by sliding the - * entries past it down one. Log the changed areas of the block. - */ - else { - lrp = XFS_ALLOC_REC_ADDR(block, 1, cur); - if (ptr < numrecs) { - memmove(&lrp[ptr - 1], &lrp[ptr], - (numrecs - ptr) * sizeof(*lrp)); - xfs_alloc_log_recs(cur, bp, ptr, numrecs - 1); - } - /* - * If it's the first record in the block, we'll need a key - * structure to pass up to the next level (updkey). - */ - if (ptr == 1) { - key.ar_startblock = lrp->ar_startblock; - key.ar_blockcount = lrp->ar_blockcount; - lkp = &key; - } - } - /* - * Decrement and log the number of entries in the block. - */ - numrecs--; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_alloc_log_block(cur->bc_tp, bp, XFS_BB_NUMRECS); - /* - * See if the longest free extent in the allocation group was - * changed by this operation. True if it's the by-size btree, and - * this is the leaf level, and there is no right sibling block, - * and this was the last record. - */ + xfs_trans_agbtree_delta(cur->bc_tp, 1); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); + new->u.alloc = cpu_to_be32(bno); + *stat = 1; + return 0; +} + +STATIC int +xfs_alloc_free_block( + xfs_btree_cur_t *cur, + xfs_buf_t *bp, + int size) +{ + xfs_agf_t *agf; /* allocation group freelist header */ + int error; + xfs_agblock_t bno; + + bno = XFS_DADDR_TO_AGBNO(cur->bc_mp, XFS_BUF_ADDR(bp)); agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - mp = cur->bc_mp; - if (level == 0 && - cur->bc_btnum == XFS_BTNUM_CNT && - be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK && - ptr > numrecs) { - ASSERT(ptr == numrecs + 1); - /* - * There are still records in the block. Grab the size - * from the last one. - */ - if (numrecs) { - rrp = XFS_ALLOC_REC_ADDR(block, numrecs, cur); - agf->agf_longest = rrp->ar_blockcount; - } - /* - * No free extents left. 
- */ - else - agf->agf_longest = 0; - mp->m_perag[be32_to_cpu(agf->agf_seqno)].pagf_longest = - be32_to_cpu(agf->agf_longest); - xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, - XFS_AGF_LONGEST); - } - /* - * Is this the root level? If so, we're almost done. - */ - if (level == cur->bc_nlevels - 1) { - /* - * If this is the root level, - * and there's only one entry left, - * and it's NOT the leaf level, - * then we can get rid of this level. - */ - if (numrecs == 1 && level > 0) { - /* - * lpp is still set to the first pointer in the block. - * Make it the new root of the btree. - */ - bno = be32_to_cpu(agf->agf_roots[cur->bc_btnum]); - agf->agf_roots[cur->bc_btnum] = *lpp; - be32_add(&agf->agf_levels[cur->bc_btnum], -1); - mp->m_perag[be32_to_cpu(agf->agf_seqno)].pagf_levels[cur->bc_btnum]--; - /* - * Put this buffer/block on the ag's freelist. - */ - error = xfs_alloc_put_freelist(cur->bc_tp, - cur->bc_private.a.agbp, NULL, bno, 1); - if (error) - return error; - /* - * Since blocks move to the free list without the - * coordination used in xfs_bmap_finish, we can't allow - * block to be available for reallocation and - * non-transaction writing (user data) until we know - * that the transaction that moved it to the free list - * is permanently on disk. We track the blocks by - * declaring these blocks as "busy"; the busy list is - * maintained on a per-ag basis and each transaction - * records which entries should be removed when the - * iclog commits to disk. If a busy block is - * allocated, the iclog is pushed up to the LSN - * that freed the block. - */ - xfs_alloc_mark_busy(cur->bc_tp, - be32_to_cpu(agf->agf_seqno), bno, 1); - - xfs_trans_agbtree_delta(cur->bc_tp, -1); - xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, - XFS_AGF_ROOTS | XFS_AGF_LEVELS); - /* - * Update the cursor so there's one fewer level. 
- */ - xfs_btree_setbuf(cur, level, NULL); - cur->bc_nlevels--; - } else if (level > 0 && - (error = xfs_alloc_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * If we deleted the leftmost entry in the block, update the - * key values above us in the tree. - */ - if (ptr == 1 && (error = xfs_alloc_updkey(cur, lkp, level + 1))) - return error; - /* - * If the number of records remaining in the block is at least - * the minimum, we're done. - */ - if (numrecs >= XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if (level > 0 && (error = xfs_alloc_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * Otherwise, we have to move some records around to keep the - * tree balanced. Look at the left and right sibling blocks to - * see if we can re-balance by moving only one record. - */ - rbno = be32_to_cpu(block->bb_rightsib); - lbno = be32_to_cpu(block->bb_leftsib); - bno = NULLAGBLOCK; - ASSERT(rbno != NULLAGBLOCK || lbno != NULLAGBLOCK); - /* - * Duplicate the cursor so our btree manipulations here won't - * disrupt the next level up. - */ - if ((error = xfs_btree_dup_cursor(cur, &tcur))) - return error; - /* - * If there's a right sibling, see if it's ok to shift an entry - * out of it. - */ - if (rbno != NULLAGBLOCK) { - /* - * Move the temp cursor to the last entry in the next block. - * Actually any entry but the first would suffice. - */ - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_increment(tcur, level, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - /* - * Grab a pointer to the block. - */ - rbp = tcur->bc_bufs[level]; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - goto error0; -#endif - /* - * Grab the current block number, for future use. 
- */ - bno = be32_to_cpu(right->bb_leftsib); - /* - * If right block is full enough so that removing one entry - * won't make it too empty, and left-shifting an entry out - * of right to us works, we're done. - */ - if (be16_to_cpu(right->bb_numrecs) - 1 >= - XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_alloc_lshift(tcur, level, &i))) - goto error0; - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_ALLOC_BLOCK_MINRECS(level, cur)); - xfs_btree_del_cursor(tcur, - XFS_BTREE_NOERROR); - if (level > 0 && - (error = xfs_alloc_decrement(cur, level, - &i))) - return error; - *stat = 1; - return 0; - } - } - /* - * Otherwise, grab the number of records in right for - * future reference, and fix up the temp cursor to point - * to our block again (last record). - */ - rrecs = be16_to_cpu(right->bb_numrecs); - if (lbno != NULLAGBLOCK) { - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_decrement(tcur, level, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - } - } - /* - * If there's a left sibling, see if it's ok to shift an entry - * out of it. - */ - if (lbno != NULLAGBLOCK) { - /* - * Move the temp cursor to the first entry in the - * previous block. - */ - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_decrement(tcur, level, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - xfs_btree_firstrec(tcur, level); - /* - * Grab a pointer to the block. - */ - lbp = tcur->bc_bufs[level]; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - goto error0; -#endif - /* - * Grab the current block number, for future use. - */ - bno = be32_to_cpu(left->bb_rightsib); - /* - * If left block is full enough so that removing one entry - * won't make it too empty, and right-shifting an entry out - * of left to us works, we're done. 
- */ - if (be16_to_cpu(left->bb_numrecs) - 1 >= - XFS_ALLOC_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_alloc_rshift(tcur, level, &i))) - goto error0; - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_ALLOC_BLOCK_MINRECS(level, cur)); - xfs_btree_del_cursor(tcur, - XFS_BTREE_NOERROR); - if (level == 0) - cur->bc_ptrs[0]++; - *stat = 1; - return 0; - } - } - /* - * Otherwise, grab the number of records in right for - * future reference. - */ - lrecs = be16_to_cpu(left->bb_numrecs); - } - /* - * Delete the temp cursor, we're done with it. - */ - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - /* - * If here, we need to do a join to keep the tree balanced. - */ - ASSERT(bno != NULLAGBLOCK); - /* - * See if we can join with the left neighbor block. - */ - if (lbno != NULLAGBLOCK && - lrecs + numrecs <= XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - /* - * Set "right" to be the starting block, - * "left" to be the left neighbor. - */ - rbno = bno; - right = block; - rrecs = be16_to_cpu(right->bb_numrecs); - rbp = bp; - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, lbno, 0, &lbp, - XFS_ALLOC_BTREE_REF))) - return error; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); - lrecs = be16_to_cpu(left->bb_numrecs); - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; - } - /* - * If that won't work, see if we can join with the right neighbor block. - */ - else if (rbno != NULLAGBLOCK && - rrecs + numrecs <= XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - /* - * Set "left" to be the starting block, - * "right" to be the right neighbor. 
- */ - lbno = bno; - left = block; - lrecs = be16_to_cpu(left->bb_numrecs); - lbp = bp; - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, rbno, 0, &rbp, - XFS_ALLOC_BTREE_REF))) - return error; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); - rrecs = be16_to_cpu(right->bb_numrecs); - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; - } - /* - * Otherwise, we can't fix the imbalance. - * Just return. This is probably a logic error, but it's not fatal. - */ - else { - if (level > 0 && (error = xfs_alloc_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * We're now going to join "left" and "right" by moving all the stuff - * in "right" to "left" and deleting "right". - */ - if (level > 0) { - /* - * It's a non-leaf. Move keys and pointers. - */ - lkp = XFS_ALLOC_KEY_ADDR(left, lrecs + 1, cur); - lpp = XFS_ALLOC_PTR_ADDR(left, lrecs + 1, cur); - rkp = XFS_ALLOC_KEY_ADDR(right, 1, cur); - rpp = XFS_ALLOC_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < rrecs; i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i]), level))) - return error; - } -#endif - memcpy(lkp, rkp, rrecs * sizeof(*lkp)); - memcpy(lpp, rpp, rrecs * sizeof(*lpp)); - xfs_alloc_log_keys(cur, lbp, lrecs + 1, lrecs + rrecs); - xfs_alloc_log_ptrs(cur, lbp, lrecs + 1, lrecs + rrecs); - } else { - /* - * It's a leaf. Move records. - */ - lrp = XFS_ALLOC_REC_ADDR(left, lrecs + 1, cur); - rrp = XFS_ALLOC_REC_ADDR(right, 1, cur); - memcpy(lrp, rrp, rrecs * sizeof(*lrp)); - xfs_alloc_log_recs(cur, lbp, lrecs + 1, lrecs + rrecs); - } - /* - * If we joined with the left neighbor, set the buffer in the - * cursor to the left block, and fix up the index. - */ - if (bp != lbp) { - xfs_btree_setbuf(cur, level, lbp); - cur->bc_ptrs[level] += lrecs; - } - /* - * If we joined with the right neighbor and there's a level above - * us, increment the cursor at that level. 
- */ - else if (level + 1 < cur->bc_nlevels && - (error = xfs_alloc_increment(cur, level + 1, &i))) - return error; - /* - * Fix up the number of records in the surviving block. - */ - lrecs += rrecs; - left->bb_numrecs = cpu_to_be16(lrecs); - /* - * Fix up the right block pointer in the surviving block, and log it. - */ - left->bb_rightsib = right->bb_rightsib; - xfs_alloc_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); - /* - * If there is a right sibling now, make it point to the - * remaining block. - */ - if (be32_to_cpu(left->bb_rightsib) != NULLAGBLOCK) { - xfs_alloc_block_t *rrblock; - xfs_buf_t *rrbp; - - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(left->bb_rightsib), 0, - &rrbp, XFS_ALLOC_BTREE_REF))) - return error; - rrblock = XFS_BUF_TO_ALLOC_BLOCK(rrbp); - if ((error = xfs_btree_check_sblock(cur, rrblock, level, rrbp))) - return error; - rrblock->bb_leftsib = cpu_to_be32(lbno); - xfs_alloc_log_block(cur->bc_tp, rrbp, XFS_BB_LEFTSIB); - } - /* - * Free the deleting block by putting it on the freelist. - */ - error = xfs_alloc_put_freelist(cur->bc_tp, - cur->bc_private.a.agbp, NULL, rbno, 1); + error = xfs_alloc_put_freelist(cur->bc_tp, cur->bc_private.a.agbp, + NULL, bno, size); if (error) return error; /* @@ -568,278 +168,15 @@ xfs_alloc_delrec( */ xfs_alloc_mark_busy(cur->bc_tp, be32_to_cpu(agf->agf_seqno), bno, 1); xfs_trans_agbtree_delta(cur->bc_tp, -1); - - /* - * Adjust the current level's cursor so that we're left referring - * to the right node, after we're done. - * If this leaves the ptr value 0 our caller will fix it up. - */ - if (level > 0) - cur->bc_ptrs[level]--; - /* - * Return value means the next level up has something to do. - */ - *stat = 2; return 0; - -error0: - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; } /* - * Insert one record/level. Return information to the caller - * allowing the next level up to proceed if necessary. 
- */ -STATIC int /* error */ -xfs_alloc_insrec( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to insert record at */ - xfs_agblock_t *bnop, /* i/o: block number inserted */ - xfs_alloc_rec_t *recp, /* i/o: record data inserted */ - xfs_btree_cur_t **curp, /* output: new cursor replacing cur */ - int *stat) /* output: success/failure */ -{ - xfs_agf_t *agf; /* allocation group freelist header */ - xfs_alloc_block_t *block; /* btree block record/key lives in */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop index */ - xfs_alloc_key_t key; /* key value being inserted */ - xfs_alloc_key_t *kp; /* pointer to btree keys */ - xfs_agblock_t nbno; /* block number of allocated block */ - xfs_btree_cur_t *ncur; /* new cursor to be used at next lvl */ - xfs_alloc_key_t nkey; /* new key value, from split */ - xfs_alloc_rec_t nrec; /* new record value, for caller */ - int numrecs; - int optr; /* old ptr value */ - xfs_alloc_ptr_t *pp; /* pointer to btree addresses */ - int ptr; /* index in btree block for this rec */ - xfs_alloc_rec_t *rp; /* pointer to btree records */ - - ASSERT(be32_to_cpu(recp->ar_blockcount) > 0); - - /* - * GCC doesn't understand the (arguably complex) control flow in - * this function and complains about uninitialized structure fields - * without this. - */ - memset(&nrec, 0, sizeof(nrec)); - - /* - * If we made it to the root level, allocate a new root block - * and we're done. - */ - if (level >= cur->bc_nlevels) { - XFS_STATS_INC(xs_abt_insrec); - if ((error = xfs_alloc_newroot(cur, &i))) - return error; - *bnop = NULLAGBLOCK; - *stat = i; - return 0; - } - /* - * Make a key out of the record data to be inserted, and save it. - */ - key.ar_startblock = recp->ar_startblock; - key.ar_blockcount = recp->ar_blockcount; - optr = ptr = cur->bc_ptrs[level]; - /* - * If we're off the left edge, return failure. 
- */ - if (ptr == 0) { - *stat = 0; - return 0; - } - XFS_STATS_INC(xs_abt_insrec); - /* - * Get pointers to the btree buffer and block. - */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - numrecs = be16_to_cpu(block->bb_numrecs); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; - /* - * Check that the new entry is being inserted in the right place. - */ - if (ptr <= numrecs) { - if (level == 0) { - rp = XFS_ALLOC_REC_ADDR(block, ptr, cur); - xfs_btree_check_rec(cur->bc_btnum, recp, rp); - } else { - kp = XFS_ALLOC_KEY_ADDR(block, ptr, cur); - xfs_btree_check_key(cur->bc_btnum, &key, kp); - } - } -#endif - nbno = NULLAGBLOCK; - ncur = NULL; - /* - * If the block is full, we can't insert the new entry until we - * make the block un-full. - */ - if (numrecs == XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - /* - * First, try shifting an entry to the right neighbor. - */ - if ((error = xfs_alloc_rshift(cur, level, &i))) - return error; - if (i) { - /* nothing */ - } - /* - * Next, try shifting an entry to the left neighbor. - */ - else { - if ((error = xfs_alloc_lshift(cur, level, &i))) - return error; - if (i) - optr = ptr = cur->bc_ptrs[level]; - else { - /* - * Next, try splitting the current block in - * half. If this works we have to re-set our - * variables because we could be in a - * different block now. - */ - if ((error = xfs_alloc_split(cur, level, &nbno, - &nkey, &ncur, &i))) - return error; - if (i) { - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = - xfs_btree_check_sblock(cur, - block, level, bp))) - return error; -#endif - ptr = cur->bc_ptrs[level]; - nrec.ar_startblock = nkey.ar_startblock; - nrec.ar_blockcount = nkey.ar_blockcount; - } - /* - * Otherwise the insert fails. - */ - else { - *stat = 0; - return 0; - } - } - } - } - /* - * At this point we know there's room for our new entry in the block - * we're pointing at. 
- */ - numrecs = be16_to_cpu(block->bb_numrecs); - if (level > 0) { - /* - * It's a non-leaf entry. Make a hole for the new data - * in the key and ptr regions of the block. - */ - kp = XFS_ALLOC_KEY_ADDR(block, 1, cur); - pp = XFS_ALLOC_PTR_ADDR(block, 1, cur); -#ifdef DEBUG - for (i = numrecs; i >= ptr; i--) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(pp[i - 1]), level))) - return error; - } -#endif - memmove(&kp[ptr], &kp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*kp)); - memmove(&pp[ptr], &pp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*pp)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, *bnop, level))) - return error; -#endif - /* - * Now stuff the new data in, bump numrecs and log the new data. - */ - kp[ptr - 1] = key; - pp[ptr - 1] = cpu_to_be32(*bnop); - numrecs++; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_alloc_log_keys(cur, bp, ptr, numrecs); - xfs_alloc_log_ptrs(cur, bp, ptr, numrecs); -#ifdef DEBUG - if (ptr < numrecs) - xfs_btree_check_key(cur->bc_btnum, kp + ptr - 1, - kp + ptr); -#endif - } else { - /* - * It's a leaf entry. Make a hole for the new record. - */ - rp = XFS_ALLOC_REC_ADDR(block, 1, cur); - memmove(&rp[ptr], &rp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*rp)); - /* - * Now stuff the new record in, bump numrecs - * and log the new data. - */ - rp[ptr - 1] = *recp; - numrecs++; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_alloc_log_recs(cur, bp, ptr, numrecs); -#ifdef DEBUG - if (ptr < numrecs) - xfs_btree_check_rec(cur->bc_btnum, rp + ptr - 1, - rp + ptr); -#endif - } - /* - * Log the new number of records in the btree header. - */ - xfs_alloc_log_block(cur->bc_tp, bp, XFS_BB_NUMRECS); - /* - * If we inserted at the start of a block, update the parents' keys. - */ - if (optr == 1 && (error = xfs_alloc_updkey(cur, &key, level + 1))) - return error; - /* - * Look to see if the longest extent in the allocation group - * needs to be updated. 
- */ - - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - if (level == 0 && - cur->bc_btnum == XFS_BTNUM_CNT && - be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK && - be32_to_cpu(recp->ar_blockcount) > be32_to_cpu(agf->agf_longest)) { - /* - * If this is a leaf in the by-size btree and there - * is no right sibling block and this block is bigger - * than the previous longest block, update it. - */ - agf->agf_longest = recp->ar_blockcount; - cur->bc_mp->m_perag[be32_to_cpu(agf->agf_seqno)].pagf_longest - = be32_to_cpu(recp->ar_blockcount); - xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, - XFS_AGF_LONGEST); - } - /* - * Return the new block number, if any. - * If there is one, give back a record value and a cursor too. - */ - *bnop = nbno; - if (nbno != NULLAGBLOCK) { - *recp = nrec; - *curp = ncur; - } - *stat = 1; - return 0; -} - -/* - * Log header fields from a btree block. + * Log fields from the btree block header. */ STATIC void xfs_alloc_log_block( - xfs_trans_t *tp, /* transaction pointer */ + xfs_btree_cur_t *cur, /* btree cursor */ xfs_buf_t *bp, /* buffer containing btree block */ int fields) /* mask of fields: XFS_BB_... */ { @@ -854,1243 +191,629 @@ xfs_alloc_log_block( sizeof(xfs_alloc_block_t) }; + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGBI(cur, bp, fields); xfs_btree_offsets(fields, offsets, XFS_BB_NUM_BITS, &first, &last); - xfs_trans_log_buf(tp, bp, first, last); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); } -/* - * Log keys from a btree block (nonleaf). 
- */ -STATIC void -xfs_alloc_log_keys( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int kfirst, /* index of first key to log */ - int klast) /* index of last key to log */ +static const struct xfs_btree_block_ops xfs_alloc_blkops = { + .get_buf = xfs_alloc_get_buf, + .read_buf = xfs_alloc_read_buf, + .get_block = xfs_alloc_get_block, + .buf_to_block = xfs_alloc_buf_to_block, + .buf_to_ptr = xfs_alloc_buf_to_ptr, + .log_block = xfs_alloc_log_block, + .check_block = xfs_btree_check_sblock, + + .alloc_block = xfs_alloc_alloc_block, + .free_block = xfs_alloc_free_block, + + .get_sibling = xfs_btree_get_ssibling, + .set_sibling = xfs_btree_set_ssibling, + .init_sibling = xfs_btree_init_sibling, +}; + +STATIC int +xfs_alloc_get_minrecs( + xfs_btree_cur_t *cur, + int lev) { - xfs_alloc_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - xfs_alloc_key_t *kp; /* key pointer in btree block */ - int last; /* last byte offset logged */ + return cur->bc_mp->m_alloc_mnr[lev != 0]; +} - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - kp = XFS_ALLOC_KEY_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); +STATIC int +xfs_alloc_get_maxrecs( + xfs_btree_cur_t *cur, + int lev) +{ + return cur->bc_mp->m_alloc_mxr[lev != 0]; } -/* - * Log block pointer fields from a btree block (nonleaf). 
- */ -STATIC void -xfs_alloc_log_ptrs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int pfirst, /* index of first pointer to log */ - int plast) /* index of last pointer to log */ +STATIC int +xfs_btree_get_numrecs( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block) { - xfs_alloc_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - xfs_alloc_ptr_t *pp; /* block-pointer pointer in btree blk */ + BUG_ON(be16_to_cpu(block->bb_h.bb_numrecs) < 0); + BUG_ON(be16_to_cpu(block->bb_h.bb_numrecs) > 1000); + return be16_to_cpu(block->bb_h.bb_numrecs); +} - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - pp = XFS_ALLOC_PTR_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); +STATIC void +xfs_btree_set_numrecs( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + int numrecs) +{ + BUG_ON(numrecs < 0); + BUG_ON(numrecs > 1000); + block->bb_h.bb_numrecs = cpu_to_be16(numrecs); } -/* - * Log records from a btree block (leaf). 
- */ STATIC void -xfs_alloc_log_recs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int rfirst, /* index of first record to log */ - int rlast) /* index of last record to log */ +xfs_alloc_init_key_from_rec( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key, + xfs_btree_rec_t *rec) { - xfs_alloc_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - xfs_alloc_rec_t *rp; /* record pointer for btree block */ - - - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - rp = XFS_ALLOC_REC_ADDR(block, 1, cur); -#ifdef DEBUG - { - xfs_agf_t *agf; - xfs_alloc_rec_t *p; - - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - for (p = &rp[rfirst - 1]; p <= &rp[rlast - 1]; p++) - ASSERT(be32_to_cpu(p->ar_startblock) + - be32_to_cpu(p->ar_blockcount) <= - be32_to_cpu(agf->agf_length)); - } -#endif - first = (int)((xfs_caddr_t)&rp[rfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&rp[rlast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); + key->u.alloc.ar_startblock = rec->u.alloc.ar_startblock; + key->u.alloc.ar_blockcount = rec->u.alloc.ar_blockcount; + BUG_ON(key->u.alloc.ar_startblock == 0); } /* - * Lookup the record. The cursor is made to point to it, based on dir. - * Return 0 if can't find any such record, 1 for success. + * initial value of ptr for lookup */ -STATIC int /* error */ -xfs_alloc_lookup( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_lookup_t dir, /* <=, ==, or >= */ - int *stat) /* success/failure */ +STATIC void +xfs_alloc_init_ptr_from_cur( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr) { - xfs_agblock_t agbno; /* a.g.
relative btree block number */ - xfs_agnumber_t agno; /* allocation group number */ - xfs_alloc_block_t *block=NULL; /* current btree block */ - int diff; /* difference for the current key */ - int error; /* error return value */ - int keyno=0; /* current key number */ - int level; /* level in the btree */ - xfs_mount_t *mp; /* file system mount point */ + xfs_agf_t *agf; /* a.g. freespace header */ - XFS_STATS_INC(xs_abt_lookup); - /* - * Get the allocation group header, and the root block number. - */ - mp = cur->bc_mp; + agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); + ASSERT(cur->bc_private.a.agno == be32_to_cpu(agf->agf_seqno)); + ptr->u.alloc = agf->agf_roots[cur->bc_btnum]; + BUG_ON(ptr->u.alloc == 0); +} - { - xfs_agf_t *agf; /* a.g. freespace header */ +STATIC void +xfs_alloc_init_rec_from_key( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key, + xfs_btree_rec_t *rec) +{ + BUG_ON(key->u.alloc.ar_startblock == 0); + rec->u.alloc.ar_startblock = key->u.alloc.ar_startblock; + rec->u.alloc.ar_blockcount = key->u.alloc.ar_blockcount; +} - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - agno = be32_to_cpu(agf->agf_seqno); - agbno = be32_to_cpu(agf->agf_roots[cur->bc_btnum]); - } - /* - * Iterate over each level in the btree, starting at the root. - * For each level above the leaves, find the key we need, based - * on the lookup record, then follow the corresponding block - * pointer down to the next level. - */ - for (level = cur->bc_nlevels - 1, diff = 1; level >= 0; level--) { - xfs_buf_t *bp; /* buffer pointer for btree block */ - xfs_daddr_t d; /* disk address of btree block */ - - /* - * Get the disk address we're looking for. - */ - d = XFS_AGB_TO_DADDR(mp, agno, agbno); - /* - * If the old buffer at this level is for a different block, - * throw it away, otherwise just use it. - */ - bp = cur->bc_bufs[level]; - if (bp && XFS_BUF_ADDR(bp) != d) - bp = NULL; - if (!bp) { - /* - * Need to get a new buffer. 
Read it, then - * set it in the cursor, releasing the old one. - */ - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, agno, - agbno, 0, &bp, XFS_ALLOC_BTREE_REF))) - return error; - xfs_btree_setbuf(cur, level, bp); - /* - * Point to the btree block, now that we have the buffer - */ - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, level, - bp))) - return error; - } else - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - /* - * If we already had a key match at a higher level, we know - * we need to use the first entry in this block. - */ - if (diff == 0) - keyno = 1; - /* - * Otherwise we need to search this block. Do a binary search. - */ - else { - int high; /* high entry number */ - xfs_alloc_key_t *kkbase=NULL;/* base of keys in block */ - xfs_alloc_rec_t *krbase=NULL;/* base of records in block */ - int low; /* low entry number */ - - /* - * Get a pointer to keys or records. - */ - if (level > 0) - kkbase = XFS_ALLOC_KEY_ADDR(block, 1, cur); - else - krbase = XFS_ALLOC_REC_ADDR(block, 1, cur); - /* - * Set low and high entry numbers, 1-based. - */ - low = 1; - if (!(high = be16_to_cpu(block->bb_numrecs))) { - /* - * If the block is empty, the tree must - * be an empty leaf. - */ - ASSERT(level == 0 && cur->bc_nlevels == 1); - cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; - *stat = 0; - return 0; - } - /* - * Binary search the block. - */ - while (low <= high) { - xfs_extlen_t blockcount; /* key value */ - xfs_agblock_t startblock; /* key value */ - - XFS_STATS_INC(xs_abt_compare); - /* - * keyno is average of low and high. - */ - keyno = (low + high) >> 1; - /* - * Get startblock & blockcount. 
- */ - if (level > 0) { - xfs_alloc_key_t *kkp; - - kkp = kkbase + keyno - 1; - startblock = be32_to_cpu(kkp->ar_startblock); - blockcount = be32_to_cpu(kkp->ar_blockcount); - } else { - xfs_alloc_rec_t *krp; - - krp = krbase + keyno - 1; - startblock = be32_to_cpu(krp->ar_startblock); - blockcount = be32_to_cpu(krp->ar_blockcount); - } - /* - * Compute difference to get next direction. - */ - if (cur->bc_btnum == XFS_BTNUM_BNO) - diff = (int)startblock - - (int)cur->bc_rec.a.ar_startblock; - else if (!(diff = (int)blockcount - - (int)cur->bc_rec.a.ar_blockcount)) - diff = (int)startblock - - (int)cur->bc_rec.a.ar_startblock; - /* - * Less than, move right. - */ - if (diff < 0) - low = keyno + 1; - /* - * Greater than, move left. - */ - else if (diff > 0) - high = keyno - 1; - /* - * Equal, we're done. - */ - else - break; - } - } - /* - * If there are more levels, set up for the next level - * by getting the block number and filling in the cursor. - */ - if (level > 0) { - /* - * If we moved left, need the previous key number, - * unless there isn't one. - */ - if (diff > 0 && --keyno < 1) - keyno = 1; - agbno = be32_to_cpu(*XFS_ALLOC_PTR_ADDR(block, keyno, cur)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, agbno, level))) - return error; -#endif - cur->bc_ptrs[level] = keyno; - } - } - /* - * Done with the search. - * See if we need to adjust the results. - */ - if (dir != XFS_LOOKUP_LE && diff < 0) { - keyno++; - /* - * If ge search and we went off the end of the block, but it's - * not the last block, we're in the wrong block. - */ - if (dir == XFS_LOOKUP_GE && - keyno > be16_to_cpu(block->bb_numrecs) && - be32_to_cpu(block->bb_rightsib) != NULLAGBLOCK) { - int i; - - cur->bc_ptrs[0] = keyno; - if ((error = xfs_alloc_increment(cur, 0, &i))) - return error; - XFS_WANT_CORRUPTED_RETURN(i == 1); - *stat = 1; - return 0; - } - } - else if (dir == XFS_LOOKUP_LE && diff > 0) - keyno--; - cur->bc_ptrs[0] = keyno; - /* - * Return if we succeeded or not. 
- */ - if (keyno == 0 || keyno > be16_to_cpu(block->bb_numrecs)) - *stat = 0; - else - *stat = ((dir != XFS_LOOKUP_EQ) || (diff == 0)); - return 0; +STATIC void +xfs_alloc_init_rec_from_cur( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec) +{ + BUG_ON(cur->bc_rec.a.ar_startblock == 0); + rec->u.alloc.ar_startblock = cpu_to_be32(cur->bc_rec.a.ar_startblock); + rec->u.alloc.ar_blockcount = cpu_to_be32(cur->bc_rec.a.ar_blockcount); } -/* - * Move 1 record left from cur/level if possible. - * Update cur to reflect the new path. - */ -STATIC int /* error */ -xfs_alloc_lshift( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to shift record on */ - int *stat) /* success/failure */ +STATIC xfs_btree_key_t * +xfs_alloc_key_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) { - int error; /* error return value */ -#ifdef DEBUG - int i; /* loop index */ -#endif - xfs_alloc_key_t key; /* key value for leaf level upward */ - xfs_buf_t *lbp; /* buffer for left neighbor block */ - xfs_alloc_block_t *left; /* left neighbor btree block */ - int nrec; /* new number of left block entries */ - xfs_buf_t *rbp; /* buffer for right (current) block */ - xfs_alloc_block_t *right; /* right (current) btree block */ - xfs_alloc_key_t *rkp=NULL; /* key pointer for right block */ - xfs_alloc_ptr_t *rpp=NULL; /* address pointer for right block */ - xfs_alloc_rec_t *rrp=NULL; /* record pointer for right block */ + return (xfs_btree_key_t *)XFS_ALLOC_KEY_ADDR(&block->bb_h, index, cur); +} - /* - * Set up variables for this block as "right". - */ - rbp = cur->bc_bufs[level]; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; -#endif - /* - * If we've got no left sibling then we can't shift an entry left. - */ - if (be32_to_cpu(right->bb_leftsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * If the cursor entry is the one that would be moved, don't - * do it... 
it's too complicated. - */ - if (cur->bc_ptrs[level] <= 1) { - *stat = 0; - return 0; - } - /* - * Set up the left neighbor as "left". - */ - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(right->bb_leftsib), - 0, &lbp, XFS_ALLOC_BTREE_REF))) - return error; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; - /* - * If it's full, it can't take another entry. - */ - if (be16_to_cpu(left->bb_numrecs) == XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - *stat = 0; - return 0; - } - nrec = be16_to_cpu(left->bb_numrecs) + 1; - /* - * If non-leaf, copy a key and a ptr to the left block. - */ - if (level > 0) { - xfs_alloc_key_t *lkp; /* key pointer for left block */ - xfs_alloc_ptr_t *lpp; /* address pointer for left block */ - - lkp = XFS_ALLOC_KEY_ADDR(left, nrec, cur); - rkp = XFS_ALLOC_KEY_ADDR(right, 1, cur); - *lkp = *rkp; - xfs_alloc_log_keys(cur, lbp, nrec, nrec); - lpp = XFS_ALLOC_PTR_ADDR(left, nrec, cur); - rpp = XFS_ALLOC_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(*rpp), level))) - return error; -#endif - *lpp = *rpp; - xfs_alloc_log_ptrs(cur, lbp, nrec, nrec); - xfs_btree_check_key(cur->bc_btnum, lkp - 1, lkp); - } - /* - * If leaf, copy a record to the left block. - */ - else { - xfs_alloc_rec_t *lrp; /* record pointer for left block */ +STATIC xfs_btree_ptr_t * +xfs_alloc_ptr_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) +{ + return (xfs_btree_ptr_t *)XFS_ALLOC_PTR_ADDR(&block->bb_h, index, cur); +} - lrp = XFS_ALLOC_REC_ADDR(left, nrec, cur); - rrp = XFS_ALLOC_REC_ADDR(right, 1, cur); - *lrp = *rrp; - xfs_alloc_log_recs(cur, lbp, nrec, nrec); - xfs_btree_check_rec(cur->bc_btnum, lrp - 1, lrp); - } - /* - * Bump and log left's numrecs, decrement and log right's numrecs. 
- */ - be16_add(&left->bb_numrecs, 1); - xfs_alloc_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS); - be16_add(&right->bb_numrecs, -1); - xfs_alloc_log_block(cur->bc_tp, rbp, XFS_BB_NUMRECS); - /* - * Slide the contents of right down one entry. - */ - if (level > 0) { -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(right->bb_numrecs); i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i + 1]), - level))) - return error; - } -#endif - memmove(rkp, rkp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp, rpp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); - xfs_alloc_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - xfs_alloc_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - } else { - memmove(rrp, rrp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - xfs_alloc_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - key.ar_startblock = rrp->ar_startblock; - key.ar_blockcount = rrp->ar_blockcount; - rkp = &key; - } - /* - * Update the parent key values of right. - */ - if ((error = xfs_alloc_updkey(cur, rkp, level + 1))) - return error; - /* - * Slide the cursor value left one. - */ - cur->bc_ptrs[level]--; - *stat = 1; - return 0; +STATIC xfs_btree_rec_t * +xfs_alloc_rec_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) +{ + return (xfs_btree_rec_t *)XFS_ALLOC_REC_ADDR(&block->bb_h, index, cur); } -/* - * Allocate a new root block, fill it in. 
- */ -STATIC int /* error */ -xfs_alloc_newroot( - xfs_btree_cur_t *cur, /* btree cursor */ - int *stat) /* success/failure */ +STATIC int64_t +xfs_alloc_key_diff( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key) { - int error; /* error return value */ - xfs_agblock_t lbno; /* left block number */ - xfs_buf_t *lbp; /* left btree buffer */ - xfs_alloc_block_t *left; /* left btree block */ - xfs_mount_t *mp; /* mount structure */ - xfs_agblock_t nbno; /* new block number */ - xfs_buf_t *nbp; /* new (root) buffer */ - xfs_alloc_block_t *new; /* new (root) btree block */ - int nptr; /* new value for key index, 1 or 2 */ - xfs_agblock_t rbno; /* right block number */ - xfs_buf_t *rbp; /* right btree buffer */ - xfs_alloc_block_t *right; /* right btree block */ + xfs_alloc_rec_incore_t *rec = &cur->bc_rec.a; + xfs_alloc_key_t *kp = &key->u.alloc; + int64_t diff; - mp = cur->bc_mp; + if (cur->bc_btnum == XFS_BTNUM_BNO) + return (int64_t)(be32_to_cpu(kp->ar_startblock)) - + rec->ar_startblock; - ASSERT(cur->bc_nlevels < XFS_AG_MAXLEVELS(mp)); - /* - * Get a buffer from the freelist blocks, for the new root. - */ - error = xfs_alloc_get_freelist(cur->bc_tp, - cur->bc_private.a.agbp, &nbno, 1); - if (error) - return error; - /* - * None available, we fail. - */ - if (nbno == NULLAGBLOCK) { - *stat = 0; - return 0; - } - xfs_trans_agbtree_delta(cur->bc_tp, 1); - nbp = xfs_btree_get_bufs(mp, cur->bc_tp, cur->bc_private.a.agno, nbno, - 0); - new = XFS_BUF_TO_ALLOC_BLOCK(nbp); - /* - * Set the root data in the a.g. freespace structure. - */ - { - xfs_agf_t *agf; /* a.g. 
freespace header */ - xfs_agnumber_t seqno; + diff = (int64_t)(be32_to_cpu(kp->ar_blockcount)) - rec->ar_blockcount; + if (!diff) + diff = (int64_t)(be32_to_cpu(kp->ar_startblock)) - + rec->ar_startblock; + return diff; +} - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - agf->agf_roots[cur->bc_btnum] = cpu_to_be32(nbno); - be32_add(&agf->agf_levels[cur->bc_btnum], 1); - seqno = be32_to_cpu(agf->agf_seqno); - mp->m_perag[seqno].pagf_levels[cur->bc_btnum]++; - xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, - XFS_AGF_ROOTS | XFS_AGF_LEVELS); - } - /* - * At the previous root level there are now two blocks: the old - * root, and the new block generated when it was split. - * We don't know which one the cursor is pointing at, so we - * set up variables "left" and "right" for each case. - */ - lbp = cur->bc_bufs[cur->bc_nlevels - 1]; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, cur->bc_nlevels - 1, lbp))) - return error; -#endif - if (be32_to_cpu(left->bb_rightsib) != NULLAGBLOCK) { - /* - * Our block is left, pick up the right block. - */ - lbno = XFS_DADDR_TO_AGBNO(mp, XFS_BUF_ADDR(lbp)); - rbno = be32_to_cpu(left->bb_rightsib); - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, rbno, 0, &rbp, - XFS_ALLOC_BTREE_REF))) - return error; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); - if ((error = xfs_btree_check_sblock(cur, right, - cur->bc_nlevels - 1, rbp))) - return error; - nptr = 1; - } else { - /* - * Our block is right, pick up the left block. 
- */ - rbp = lbp; - right = left; - rbno = XFS_DADDR_TO_AGBNO(mp, XFS_BUF_ADDR(rbp)); - lbno = be32_to_cpu(right->bb_leftsib); - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.a.agno, lbno, 0, &lbp, - XFS_ALLOC_BTREE_REF))) - return error; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); - if ((error = xfs_btree_check_sblock(cur, left, - cur->bc_nlevels - 1, lbp))) - return error; - nptr = 2; - } - /* - * Fill in the new block's btree header and log it. - */ - new->bb_magic = cpu_to_be32(xfs_magics[cur->bc_btnum]); - new->bb_level = cpu_to_be16(cur->bc_nlevels); - new->bb_numrecs = cpu_to_be16(2); - new->bb_leftsib = cpu_to_be32(NULLAGBLOCK); - new->bb_rightsib = cpu_to_be32(NULLAGBLOCK); - xfs_alloc_log_block(cur->bc_tp, nbp, XFS_BB_ALL_BITS); - ASSERT(lbno != NULLAGBLOCK && rbno != NULLAGBLOCK); - /* - * Fill in the key data in the new root. - */ - { - xfs_alloc_key_t *kp; /* btree key pointer */ +STATIC xfs_daddr_t +xfs_alloc_ptr_to_daddr( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr) +{ + return XFS_AGB_TO_DADDR(cur->bc_mp, cur->bc_private.a.agno, + be32_to_cpu(ptr->u.alloc)); +} - kp = XFS_ALLOC_KEY_ADDR(new, 1, cur); - if (be16_to_cpu(left->bb_level) > 0) { - kp[0] = *XFS_ALLOC_KEY_ADDR(left, 1, cur); - kp[1] = *XFS_ALLOC_KEY_ADDR(right, 1, cur); - } else { - xfs_alloc_rec_t *rp; /* btree record pointer */ - - rp = XFS_ALLOC_REC_ADDR(left, 1, cur); - kp[0].ar_startblock = rp->ar_startblock; - kp[0].ar_blockcount = rp->ar_blockcount; - rp = XFS_ALLOC_REC_ADDR(right, 1, cur); - kp[1].ar_startblock = rp->ar_startblock; - kp[1].ar_blockcount = rp->ar_blockcount; - } +STATIC void +xfs_alloc_move_keys( + xfs_btree_cur_t *cur, + xfs_btree_key_t *src_key, + xfs_btree_key_t *dst_key, + int from, + int to, + int numkeys) +{ + BUG_ON(from < 0 || to < 0); + BUG_ON(from > 1000 || to > 1000); + BUG_ON(numkeys < 0); + + /* + * we can get a request to move zero records if the + * block is already empty. e.g. 
xfs_alloc_fix_freelist + * will delete the current entry and then reinsert a + * modified entry. If there is only a single entry in + * the block, this will result in an empty block. + */ + if (numkeys == 0) + return; + if (dst_key == NULL) { + /* moving within a block */ + xfs_alloc_key_t *kp = &src_key->u.alloc; + memmove(&kp[to], &kp[from], numkeys * sizeof(*kp)); + } else { + /* moving between blocks */ + memcpy(dst_key, src_key, numkeys * sizeof(xfs_alloc_key_t)); } - xfs_alloc_log_keys(cur, nbp, 1, 2); - /* - * Fill in the pointer data in the new root. - */ - { - xfs_alloc_ptr_t *pp; /* btree address pointer */ +} - pp = XFS_ALLOC_PTR_ADDR(new, 1, cur); - pp[0] = cpu_to_be32(lbno); - pp[1] = cpu_to_be32(rbno); +STATIC void +xfs_alloc_move_ptrs( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *src_ptr, + xfs_btree_ptr_t *dst_ptr, + int from, + int to, + int numptrs) +{ + BUG_ON(from < 0 || to < 0); + BUG_ON(from > 1000 || to > 1000); + BUG_ON(numptrs < 0); + if (numptrs == 0) + return; + if (dst_ptr == NULL) { + xfs_alloc_ptr_t *pp = &src_ptr->u.alloc; + memmove(&pp[to], &pp[from], numptrs * sizeof(*pp)); + } else { + memcpy(dst_ptr, src_ptr, numptrs * sizeof(xfs_alloc_ptr_t)); } - xfs_alloc_log_ptrs(cur, nbp, 1, 2); - /* - * Fix up the cursor. - */ - xfs_btree_setbuf(cur, cur->bc_nlevels, nbp); - cur->bc_ptrs[cur->bc_nlevels] = nptr; - cur->bc_nlevels++; - *stat = 1; - return 0; } -/* - * Move 1 record right from cur/level if possible. - * Update cur to reflect the new path.
- */ -STATIC int /* error */ -xfs_alloc_rshift( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to shift record on */ - int *stat) /* success/failure */ -{ - int error; /* error return value */ - int i; /* loop index */ - xfs_alloc_key_t key; /* key value for leaf level upward */ - xfs_buf_t *lbp; /* buffer for left (current) block */ - xfs_alloc_block_t *left; /* left (current) btree block */ - xfs_buf_t *rbp; /* buffer for right neighbor block */ - xfs_alloc_block_t *right; /* right neighbor btree block */ - xfs_alloc_key_t *rkp; /* key pointer for right block */ - xfs_btree_cur_t *tcur; /* temporary cursor */ - - /* - * Set up variables for this block as "left". - */ - lbp = cur->bc_bufs[level]; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; -#endif - /* - * If we've got no right sibling then we can't shift an entry right. - */ - if (be32_to_cpu(left->bb_rightsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * If the cursor entry is the one that would be moved, don't - * do it... it's too complicated. - */ - if (cur->bc_ptrs[level] >= be16_to_cpu(left->bb_numrecs)) { - *stat = 0; - return 0; - } - /* - * Set up the right neighbor as "right". - */ - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(left->bb_rightsib), - 0, &rbp, XFS_ALLOC_BTREE_REF))) - return error; - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; - /* - * If it's full, it can't take another entry. - */ - if (be16_to_cpu(right->bb_numrecs) == XFS_ALLOC_BLOCK_MAXRECS(level, cur)) { - *stat = 0; - return 0; - } - /* - * Make a hole at the start of the right neighbor block, then - * copy the last left block entry to the hole. 
- */ - if (level > 0) { - xfs_alloc_key_t *lkp; /* key pointer for left block */ - xfs_alloc_ptr_t *lpp; /* address pointer for left block */ - xfs_alloc_ptr_t *rpp; /* address pointer for right block */ - - lkp = XFS_ALLOC_KEY_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - lpp = XFS_ALLOC_PTR_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rkp = XFS_ALLOC_KEY_ADDR(right, 1, cur); - rpp = XFS_ALLOC_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = be16_to_cpu(right->bb_numrecs) - 1; i >= 0; i--) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i]), level))) - return error; - } -#endif - memmove(rkp + 1, rkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp + 1, rpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(*lpp), level))) - return error; -#endif - *rkp = *lkp; - *rpp = *lpp; - xfs_alloc_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - xfs_alloc_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - xfs_btree_check_key(cur->bc_btnum, rkp, rkp + 1); +STATIC void +xfs_alloc_move_recs( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *src_rec, + xfs_btree_rec_t *dst_rec, + int from, + int to, + int numrecs) +{ + BUG_ON(from < 0 || to < 0); + BUG_ON(from > 1000 || to > 1000); + BUG_ON(numrecs < 0); + if (numrecs == 0) + return; + if (dst_rec == NULL) { + xfs_alloc_rec_t *rp = &src_rec->u.alloc; + memmove(&rp[to], &rp[from], numrecs * sizeof(*rp)); } else { - xfs_alloc_rec_t *lrp; /* record pointer for left block */ - xfs_alloc_rec_t *rrp; /* record pointer for right block */ - - lrp = XFS_ALLOC_REC_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rrp = XFS_ALLOC_REC_ADDR(right, 1, cur); - memmove(rrp + 1, rrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - *rrp = *lrp; - xfs_alloc_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - key.ar_startblock = rrp->ar_startblock; - key.ar_blockcount = rrp->ar_blockcount; - rkp = &key; - 
xfs_btree_check_rec(cur->bc_btnum, rrp, rrp + 1); + memcpy(dst_rec, src_rec, numrecs * sizeof(xfs_alloc_rec_t)); } - /* - * Decrement and log left's numrecs, bump and log right's numrecs. - */ - be16_add(&left->bb_numrecs, -1); - xfs_alloc_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS); - be16_add(&right->bb_numrecs, 1); - xfs_alloc_log_block(cur->bc_tp, rbp, XFS_BB_NUMRECS); - /* - * Using a temporary cursor, update the parent key values of the - * block on the right. - */ - if ((error = xfs_btree_dup_cursor(cur, &tcur))) - return error; - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_alloc_increment(tcur, level, &i)) || - (error = xfs_alloc_updkey(tcur, rkp, level + 1))) - goto error0; - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - *stat = 1; - return 0; -error0: - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; } -/* - * Split cur/level block in half. - * Return new block number and its first record (to be inserted into parent). - */ -STATIC int /* error */ -xfs_alloc_split( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to split */ - xfs_agblock_t *bnop, /* output: block number allocated */ - xfs_alloc_key_t *keyp, /* output: first key of new block */ - xfs_btree_cur_t **curp, /* output: new cursor */ - int *stat) /* success/failure */ + +STATIC void +xfs_alloc_set_key( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key_addr, + int index, + xfs_btree_key_t *newkey) { - int error; /* error return value */ - int i; /* loop index/record number */ - xfs_agblock_t lbno; /* left (current) block number */ - xfs_buf_t *lbp; /* buffer for left block */ - xfs_alloc_block_t *left; /* left (current) btree block */ - xfs_agblock_t rbno; /* right (new) block number */ - xfs_buf_t *rbp; /* buffer for right block */ - xfs_alloc_block_t *right; /* right (new) btree block */ + xfs_alloc_key_t *kp = &key_addr->u.alloc; - /* - * Allocate the new block from the freelist. 
- * If we can't do it, we're toast. Give up. - */ - error = xfs_alloc_get_freelist(cur->bc_tp, - cur->bc_private.a.agbp, &rbno, 1); - if (error) - return error; - if (rbno == NULLAGBLOCK) { - *stat = 0; - return 0; - } - xfs_trans_agbtree_delta(cur->bc_tp, 1); - rbp = xfs_btree_get_bufs(cur->bc_mp, cur->bc_tp, cur->bc_private.a.agno, - rbno, 0); - /* - * Set up the new block as "right". - */ - right = XFS_BUF_TO_ALLOC_BLOCK(rbp); - /* - * "Left" is the current (according to the cursor) block. - */ - lbp = cur->bc_bufs[level]; - left = XFS_BUF_TO_ALLOC_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; -#endif - /* - * Fill in the btree header for the new block. - */ - right->bb_magic = cpu_to_be32(xfs_magics[cur->bc_btnum]); - right->bb_level = left->bb_level; - right->bb_numrecs = cpu_to_be16(be16_to_cpu(left->bb_numrecs) / 2); - /* - * Make sure that if there's an odd number of entries now, that - * each new block will have the same number of entries. - */ - if ((be16_to_cpu(left->bb_numrecs) & 1) && - cur->bc_ptrs[level] <= be16_to_cpu(right->bb_numrecs) + 1) - be16_add(&right->bb_numrecs, 1); - i = be16_to_cpu(left->bb_numrecs) - be16_to_cpu(right->bb_numrecs) + 1; - /* - * For non-leaf blocks, copy keys and addresses over to the new block. 
- */ - if (level > 0) { - xfs_alloc_key_t *lkp; /* left btree key pointer */ - xfs_alloc_ptr_t *lpp; /* left btree address pointer */ - xfs_alloc_key_t *rkp; /* right btree key pointer */ - xfs_alloc_ptr_t *rpp; /* right btree address pointer */ - - lkp = XFS_ALLOC_KEY_ADDR(left, i, cur); - lpp = XFS_ALLOC_PTR_ADDR(left, i, cur); - rkp = XFS_ALLOC_KEY_ADDR(right, 1, cur); - rpp = XFS_ALLOC_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(right->bb_numrecs); i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(lpp[i]), level))) - return error; - } -#endif - memcpy(rkp, lkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memcpy(rpp, lpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); - xfs_alloc_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - xfs_alloc_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - *keyp = *rkp; - } - /* - * For leaf blocks, copy records over to the new block. - */ - else { - xfs_alloc_rec_t *lrp; /* left btree record pointer */ - xfs_alloc_rec_t *rrp; /* right btree record pointer */ - - lrp = XFS_ALLOC_REC_ADDR(left, i, cur); - rrp = XFS_ALLOC_REC_ADDR(right, 1, cur); - memcpy(rrp, lrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - xfs_alloc_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - keyp->ar_startblock = rrp->ar_startblock; - keyp->ar_blockcount = rrp->ar_blockcount; - } - /* - * Find the left block number by looking in the buffer. - * Adjust numrecs, sibling pointers. - */ - lbno = XFS_DADDR_TO_AGBNO(cur->bc_mp, XFS_BUF_ADDR(lbp)); - be16_add(&left->bb_numrecs, -(be16_to_cpu(right->bb_numrecs))); - right->bb_rightsib = left->bb_rightsib; - left->bb_rightsib = cpu_to_be32(rbno); - right->bb_leftsib = cpu_to_be32(lbno); - xfs_alloc_log_block(cur->bc_tp, rbp, XFS_BB_ALL_BITS); - xfs_alloc_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); - /* - * If there's a block to the new block's right, make that block - * point back to right instead of to left. 
- */ - if (be32_to_cpu(right->bb_rightsib) != NULLAGBLOCK) { - xfs_alloc_block_t *rrblock; /* rr btree block */ - xfs_buf_t *rrbp; /* buffer for rrblock */ - - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, be32_to_cpu(right->bb_rightsib), 0, - &rrbp, XFS_ALLOC_BTREE_REF))) - return error; - rrblock = XFS_BUF_TO_ALLOC_BLOCK(rrbp); - if ((error = xfs_btree_check_sblock(cur, rrblock, level, rrbp))) - return error; - rrblock->bb_leftsib = cpu_to_be32(rbno); - xfs_alloc_log_block(cur->bc_tp, rrbp, XFS_BB_LEFTSIB); - } - /* - * If the cursor is really in the right block, move it there. - * If it's just pointing past the last entry in left, then we'll - * insert there, so don't change anything in that case. - */ - if (cur->bc_ptrs[level] > be16_to_cpu(left->bb_numrecs) + 1) { - xfs_btree_setbuf(cur, level, rbp); - cur->bc_ptrs[level] -= be16_to_cpu(left->bb_numrecs); - } - /* - * If there are more levels, we'll need another cursor which refers to - * the right block, no matter where this cursor was. - */ - if (level + 1 < cur->bc_nlevels) { - if ((error = xfs_btree_dup_cursor(cur, curp))) - return error; - (*curp)->bc_ptrs[level + 1]++; - } - *bnop = rbno; - *stat = 1; - return 0; + kp[index] = newkey->u.alloc; +} + +STATIC void +xfs_alloc_set_ptr( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr_addr, + int index, + xfs_btree_ptr_t *newptr) +{ + xfs_alloc_ptr_t *pp = &ptr_addr->u.alloc; + + pp[index] = newptr->u.alloc; +} + +STATIC void +xfs_alloc_set_rec( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec_addr, + int index, + xfs_btree_rec_t *newrec) +{ + xfs_alloc_rec_t *rp = &rec_addr->u.alloc; + + rp[index] = newrec->u.alloc; } /* - * Update keys at all levels from here to the root along the cursor's path. + * Log keys from a btree block (nonleaf). 
*/ -STATIC int /* error */ -xfs_alloc_updkey( +STATIC void +xfs_alloc_log_keys( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_alloc_key_t *keyp, /* new key value to update to */ - int level) /* starting level for update */ + xfs_buf_t *bp, /* buffer containing btree block */ + int kfirst, /* index of first key to log */ + int klast) /* index of last key to log */ { - int ptr; /* index of key in block */ - - /* - * Go up the tree from this level toward the root. - * At each level, update the key value to the value input. - * Stop when we reach a level where the cursor isn't pointing - * at the first entry in the block. - */ - for (ptr = 1; ptr == 1 && level < cur->bc_nlevels; level++) { - xfs_alloc_block_t *block; /* btree block */ - xfs_buf_t *bp; /* buffer for block */ -#ifdef DEBUG - int error; /* error return value */ -#endif - xfs_alloc_key_t *kp; /* ptr to btree block keys */ + xfs_alloc_block_t *block; /* btree block to log from */ + int first; /* first byte offset logged */ + xfs_alloc_key_t *kp; /* key pointer in btree block */ + int last; /* last byte offset logged */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - ptr = cur->bc_ptrs[level]; - kp = XFS_ALLOC_KEY_ADDR(block, ptr, cur); - *kp = *keyp; - xfs_alloc_log_keys(cur, bp, ptr, ptr); - } - return 0; + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, kfirst, klast); + block = XFS_BUF_TO_ALLOC_BLOCK(bp); + kp = XFS_ALLOC_KEY_ADDR(block, 1, cur); + first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); + last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); } /* - * Externally visible routines. + * Log block pointer fields from a btree block (nonleaf). 
*/ +STATIC void +xfs_alloc_log_ptrs( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_buf_t *bp, /* buffer containing btree block */ + int pfirst, /* index of first pointer to log */ + int plast) /* index of last pointer to log */ +{ + xfs_alloc_block_t *block; /* btree block to log from */ + int first; /* first byte offset logged */ + int last; /* last byte offset logged */ + xfs_alloc_ptr_t *pp; /* block-pointer pointer in btree blk */ + + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, pfirst, plast); + block = XFS_BUF_TO_ALLOC_BLOCK(bp); + pp = XFS_ALLOC_PTR_ADDR(block, 1, cur); + first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); + last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); +} /* - * Decrement cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. + * Log records from a btree block (leaf). */ -int /* error */ -xfs_alloc_decrement( +STATIC void +xfs_alloc_log_recs( xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level in btree, 0 is leaf */ - int *stat) /* success/failure */ + xfs_buf_t *bp, /* buffer containing btree block */ + int rfirst, /* index of first record to log */ + int rlast) /* index of last record to log */ { - xfs_alloc_block_t *block; /* btree block */ - int error; /* error return value */ - int lev; /* btree level */ + xfs_alloc_block_t *block; /* btree block to log from */ + int first; /* first byte offset logged */ + int last; /* last byte offset logged */ + xfs_alloc_rec_t *rp; /* record pointer for btree block */ - ASSERT(level < cur->bc_nlevels); - /* - * Read-ahead to the left at this level. - */ - xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); - /* - * Decrement the ptr at this level. If we're still in the block - * then we're done. 
- */ - if (--cur->bc_ptrs[level] > 0) { - *stat = 1; - return 0; - } - /* - * Get a pointer to the btree block. - */ - block = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[level]); + + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, rfirst, rlast); + block = XFS_BUF_TO_ALLOC_BLOCK(bp); + rp = XFS_ALLOC_REC_ADDR(block, 1, cur); #ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, - cur->bc_bufs[level]))) - return error; -#endif - /* - * If we just went off the left edge of the tree, return failure. - */ - if (be32_to_cpu(block->bb_leftsib) == NULLAGBLOCK) { - *stat = 0; - return 0; + { + xfs_agf_t *agf; + xfs_alloc_rec_t *p; + + agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); + for (p = &rp[rfirst - 1]; p <= &rp[rlast - 1]; p++) + ASSERT(be32_to_cpu(p->ar_startblock) + + be32_to_cpu(p->ar_blockcount) <= + be32_to_cpu(agf->agf_length)); } +#endif + first = (int)((xfs_caddr_t)&rp[rfirst - 1] - (xfs_caddr_t)block); + last = (int)(((xfs_caddr_t)&rp[rlast] - 1) - (xfs_caddr_t)block); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); +} + +static const struct xfs_btree_record_ops xfs_alloc_recops = { + .get_minrecs = xfs_alloc_get_minrecs, + .get_maxrecs = xfs_alloc_get_maxrecs, + .get_numrecs = xfs_btree_get_numrecs, + .set_numrecs = xfs_btree_set_numrecs, + + .init_key_from_rec = xfs_alloc_init_key_from_rec, + .init_ptr_from_cur = xfs_alloc_init_ptr_from_cur, + .init_rec_from_key = xfs_alloc_init_rec_from_key, + .init_rec_from_cur = xfs_alloc_init_rec_from_cur, + + .key_addr = xfs_alloc_key_addr, + .ptr_addr = xfs_alloc_ptr_addr, + .rec_addr = xfs_alloc_rec_addr, + + .key_diff = xfs_alloc_key_diff, + .ptr_to_daddr = xfs_alloc_ptr_to_daddr, + + .move_keys = xfs_alloc_move_keys, + .move_ptrs = xfs_alloc_move_ptrs, + .move_recs = xfs_alloc_move_recs, + + .set_key = xfs_alloc_set_key, + .set_ptr = xfs_alloc_set_ptr, + .set_rec = xfs_alloc_set_rec, + + .log_keys = xfs_alloc_log_keys, + .log_ptrs = 
xfs_alloc_log_ptrs, + .log_recs = xfs_alloc_log_recs, + + .check_ptrs = xfs_btree_check_sptr, +}; + +STATIC void +xfs_alloc_setroot( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr, + int inc) +{ + xfs_agf_t *agf; /* a.g. freespace header */ + xfs_agnumber_t seqno; + + agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); + + BUG_ON(ptr->u.alloc == 0); + agf->agf_roots[cur->bc_btnum] = ptr->u.alloc; + be32_add(&agf->agf_levels[cur->bc_btnum], inc); + + seqno = be32_to_cpu(agf->agf_seqno); + cur->bc_mp->m_perag[seqno].pagf_levels[cur->bc_btnum] += inc; + + xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, + XFS_AGF_ROOTS | XFS_AGF_LEVELS); +} + +STATIC int +xfs_alloc_killroot( + xfs_btree_cur_t *cur, + int level, + xfs_btree_ptr_t *newroot) +{ + xfs_agf_t *agf; /* allocation group freelist header */ + xfs_agblock_t bno; /* old root block number */ + int error; + + agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); + /* - * March up the tree decrementing pointers. - * Stop when we don't go off the left edge of a block. + * Set the root entry in the agf structure, + * decreasing the level by 1. */ - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - if (--cur->bc_ptrs[lev] > 0) - break; - /* - * Read-ahead the left block, we're going to read it - * in the next loop. - */ - xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); - } + bno = be32_to_cpu(agf->agf_roots[cur->bc_btnum]); + xfs_alloc_setroot(cur, newroot, -1); /* - * If we went off the root then we are seriously confused. + * Put this buffer/block on the ag's freelist. */ - ASSERT(lev < cur->bc_nlevels); + BUG_ON(bno == 0); + error = xfs_alloc_put_freelist(cur->bc_tp, + cur->bc_private.a.agbp, NULL, bno, 1); + if (error) + return error; /* - * Now walk back down the tree, fixing up the cursor's buffer - * pointers and key numbers. 
+ * Since blocks move to the free list without the + * coordination used in xfs_bmap_finish, we can't allow + * block to be available for reallocation and + * non-transaction writing (user data) until we know + * that the transaction that moved it to the free list + * is permanently on disk. We track the blocks by + * declaring these blocks as "busy"; the busy list is + * maintained on a per-ag basis and each transaction + * records which entries should be removed when the + * iclog commits to disk. If a busy block is + * allocated, the iclog is pushed up to the LSN + * that freed the block. */ - for (block = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[lev]); lev > level; ) { - xfs_agblock_t agbno; /* block number of btree block */ - xfs_buf_t *bp; /* buffer pointer for block */ - - agbno = be32_to_cpu(*XFS_ALLOC_PTR_ADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, agbno, 0, &bp, - XFS_ALLOC_BTREE_REF))) - return error; - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; - cur->bc_ptrs[lev] = be16_to_cpu(block->bb_numrecs); - } - *stat = 1; - return 0; -} - -/* - * Delete the record pointed to by cur. - * The cursor refers to the place where the record was (could be inserted) - * when the operation returns. - */ -int /* error */ -xfs_alloc_delete( - xfs_btree_cur_t *cur, /* btree cursor */ - int *stat) /* success/failure */ -{ - int error; /* error return value */ - int i; /* result code */ - int level; /* btree level */ + xfs_alloc_mark_busy(cur->bc_tp, + be32_to_cpu(agf->agf_seqno), bno, 1); + xfs_trans_agbtree_delta(cur->bc_tp, -1); /* - * Go up the tree, starting at leaf level. - * If 2 is returned then a join was done; go to the next level. - * Otherwise we are done. + * Update the cursor so there's one fewer level. 
*/ - for (level = 0, i = 2; i == 2; level++) { - if ((error = xfs_alloc_delrec(cur, level, &i))) - return error; - } - if (i == 0) { - for (level = 1; level < cur->bc_nlevels; level++) { - if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_alloc_decrement(cur, level, &i))) - return error; - break; - } - } - } - *stat = i; + xfs_btree_setbuf(cur, level, NULL); + cur->bc_nlevels--; return 0; } /* - * Get the data from the pointed-to record. + * update the longest extent in the AGF */ -int /* error */ -xfs_alloc_get_rec( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t *bno, /* output: starting block of extent */ - xfs_extlen_t *len, /* output: length of extent */ - int *stat) /* output: success/failure */ +STATIC int +xfs_alloc_update_lastrec( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block) { - xfs_alloc_block_t *block; /* btree block */ -#ifdef DEBUG - int error; /* error return value */ -#endif - int ptr; /* record number */ + xfs_agf_t *agf; /* allocation group freelist header */ + xfs_alloc_rec_t *rrp; /* right block record pointer */ + int numrecs; - ptr = cur->bc_ptrs[0]; - block = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[0]); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, 0, cur->bc_bufs[0]))) - return error; -#endif + if (cur->bc_btnum != XFS_BTNUM_CNT) + return 0; + + agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); /* - * Off the right end or left end, return failure. + * There are still records in the block. Grab the size + * from the last one. */ - if (ptr > be16_to_cpu(block->bb_numrecs) || ptr <= 0) { - *stat = 0; - return 0; + numrecs = xfs_btree_get_numrecs(cur, block); + if (numrecs) { + rrp = XFS_ALLOC_REC_ADDR(block, numrecs, cur); + ASSERT(be32_to_cpu(rrp->ar_blockcount) >= + be32_to_cpu(agf->agf_longest)); + agf->agf_longest = rrp->ar_blockcount; } /* - * Point to the record and extract its data. + * No free extents left. 
*/ - { - xfs_alloc_rec_t *rec; /* record data */ + else + agf->agf_longest = 0; - rec = XFS_ALLOC_REC_ADDR(block, ptr, cur); - *bno = be32_to_cpu(rec->ar_startblock); - *len = be32_to_cpu(rec->ar_blockcount); - } - *stat = 1; + cur->bc_mp->m_perag[be32_to_cpu(agf->agf_seqno)].pagf_longest = + be32_to_cpu(agf->agf_longest); + xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, XFS_AGF_LONGEST); return 0; } +static const struct xfs_btree_cur_ops xfs_alloc_curops = { + .update_lastrec = xfs_alloc_update_lastrec, + .set_root = xfs_alloc_setroot, + .new_root = xfs_btree_newroot, + .kill_root = xfs_alloc_killroot, +}; + +#if defined(XFS_BTREE_TRACE) + /* - * Increment cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. + * Global alloc btree trace buffer */ -int /* error */ -xfs_alloc_increment( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level in btree, 0 is leaf */ - int *stat) /* success/failure */ +ktrace_t *xfs_allocbt_trace_buf; +/* + * Add a trace buffer entry for the arguments given to the routine, + * generic form. 
+ */ +STATIC void +xfs_alloc_trace_enter( + const char *func, + xfs_btree_cur_t *cur, + char *s, + int type, + int line, + __psunsigned_t a0, + __psunsigned_t a1, + __psunsigned_t a2, + __psunsigned_t a3, + __psunsigned_t a4, + __psunsigned_t a5, + __psunsigned_t a6, + __psunsigned_t a7, + __psunsigned_t a8, + __psunsigned_t a9, + __psunsigned_t a10) +{ + ktrace_enter(xfs_allocbt_trace_buf, + (void *)(__psint_t)type, + (void *)func, (void *)s, NULL, (void *)cur, + (void *)a0, (void *)a1, (void *)a2, (void *)a3, + (void *)a4, (void *)a5, (void *)a6, (void *)a7, + (void *)a8, (void *)a9, (void *)a10); +} + +STATIC void +xfs_alloc_trace_cursor( + xfs_btree_cur_t *cur, + __uint32_t *s0, + __uint64_t *l0, + __uint64_t *l1) +{ + *s0 = cur->bc_private.a.agno; + *l0 = cur->bc_rec.a.ar_startblock; + *l1 = cur->bc_rec.a.ar_blockcount; +} + +STATIC void +xfs_alloc_trace_record( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec, + __uint64_t *l0, + __uint64_t *l1, + __uint64_t *l2) { - xfs_alloc_block_t *block; /* btree block */ - xfs_buf_t *bp; /* tree block buffer */ - int error; /* error return value */ - int lev; /* btree level */ + *l0 = be32_to_cpu(rec->u.alloc.ar_startblock); + *l1 = be32_to_cpu(rec->u.alloc.ar_blockcount); + *l2 = 0; +} - ASSERT(level < cur->bc_nlevels); - /* - * Read-ahead to the right at this level. - */ - xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); - /* - * Get a pointer to the btree block. - */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; +static const struct xfs_btree_trc_ops xfs_alloc_trcops = { + .enter = xfs_alloc_trace_enter, + .cursor = xfs_alloc_trace_cursor, + .record = xfs_alloc_trace_record, +}; #endif - /* - * Increment the ptr at this level. If we're still in the block - * then we're done. 
- */ - if (++cur->bc_ptrs[level] <= be16_to_cpu(block->bb_numrecs)) { - *stat = 1; - return 0; - } - /* - * If we just went off the right edge of the tree, return failure. - */ - if (be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * March up the tree incrementing pointers. - * Stop when we don't go off the right edge of a block. - */ - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - bp = cur->bc_bufs[lev]; - block = XFS_BUF_TO_ALLOC_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; + +void +xfs_alloc_init_cursor( + xfs_btree_cur_t *cur) +{ + cur->bc_flags = 0; + if (cur->bc_btnum == XFS_BTNUM_CNT) + cur->bc_flags |= XFS_BTREE_LASTREC_UPDATE; + cur->bc_curops = &xfs_alloc_curops; + cur->bc_blkops = &xfs_alloc_blkops; + cur->bc_recops = &xfs_alloc_recops; +#if defined(XFS_BTREE_TRACE) + cur->bc_trcops = &xfs_alloc_trcops; #endif - if (++cur->bc_ptrs[lev] <= be16_to_cpu(block->bb_numrecs)) - break; - /* - * Read-ahead the right block, we're going to read it - * in the next loop. - */ - xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); - } - /* - * If we went off the root then we are seriously confused. - */ - ASSERT(lev < cur->bc_nlevels); - /* - * Now walk back down the tree, fixing up the cursor's buffer - * pointers and key numbers. - */ - for (bp = cur->bc_bufs[lev], block = XFS_BUF_TO_ALLOC_BLOCK(bp); - lev > level; ) { - xfs_agblock_t agbno; /* block number of btree block */ - - agbno = be32_to_cpu(*XFS_ALLOC_PTR_ADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.a.agno, agbno, 0, &bp, - XFS_ALLOC_BTREE_REF))) - return error; - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_ALLOC_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; - cur->bc_ptrs[lev] = 1; - } - *stat = 1; - return 0; } /* - * Insert the current record at the point referenced by cur. 
- * The cursor may be inconsistent on return if splits have been done. + * ALLOC functions that are not covered by core btree code. + * Externally visible routines. + */ + +/* + * Update the record referred to by cur, to the value given by [bno, len]. + * This either works (return 0) or gets an EFSCORRUPTED error. */ int /* error */ -xfs_alloc_insert( - xfs_btree_cur_t *cur, /* btree cursor */ - int *stat) /* success/failure */ +xfs_alloc_update( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_agblock_t bno, /* starting block of extent */ + xfs_extlen_t len) /* length of extent */ { - int error; /* error return value */ - int i; /* result value, 0 for failure */ - int level; /* current level number in btree */ - xfs_agblock_t nbno; /* new block number (split result) */ - xfs_btree_cur_t *ncur; /* new cursor (split result) */ - xfs_alloc_rec_t nrec; /* record being inserted this level */ - xfs_btree_cur_t *pcur; /* previous level's cursor */ - - level = 0; - nbno = NULLAGBLOCK; - nrec.ar_startblock = cpu_to_be32(cur->bc_rec.a.ar_startblock); - nrec.ar_blockcount = cpu_to_be32(cur->bc_rec.a.ar_blockcount); - ncur = NULL; - pcur = cur; - /* - * Loop going up the tree, starting at the leaf level. - * Stop when we don't get a split block, that must mean that - * the insert is finished with this level. - */ - do { - /* - * Insert nrec/nbno into this level of the tree. - * Note if we fail, nbno will be null. - */ - if ((error = xfs_alloc_insrec(pcur, level++, &nbno, &nrec, &ncur, - &i))) { - if (pcur != cur) - xfs_btree_del_cursor(pcur, XFS_BTREE_ERROR); - return error; - } - /* - * See if the cursor we just used is trash. - * Can't trash the caller's cursor, but otherwise we should - * if ncur is a new cursor or we're about to be done. - */ - if (pcur != cur && (ncur || nbno == NULLAGBLOCK)) { - cur->bc_nlevels = pcur->bc_nlevels; - xfs_btree_del_cursor(pcur, XFS_BTREE_NOERROR); - } - /* - * If we got a new cursor, switch to it. 
- */ - if (ncur) { - pcur = ncur; - ncur = NULL; - } - } while (nbno != NULLAGBLOCK); - *stat = i; - return 0; + xfs_btree_rec_t rec; + + rec.u.alloc.ar_startblock = cpu_to_be32(bno); + rec.u.alloc.ar_blockcount = cpu_to_be32(len); + return xfs_btree_update(cur, &rec); } /* @@ -2105,7 +828,7 @@ xfs_alloc_lookup_eq( { cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_alloc_lookup(cur, XFS_LOOKUP_EQ, stat); + return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); } /* @@ -2121,7 +844,7 @@ xfs_alloc_lookup_ge( { cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_alloc_lookup(cur, XFS_LOOKUP_GE, stat); + return xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); } /* @@ -2137,75 +860,53 @@ xfs_alloc_lookup_le( { cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_alloc_lookup(cur, XFS_LOOKUP_LE, stat); + return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); } /* - * Update the record referred to by cur, to the value given by [bno, len]. - * This either works (return 0) or gets an EFSCORRUPTED error. + * Get the data from the pointed-to record. */ int /* error */ -xfs_alloc_update( +xfs_alloc_get_rec( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t bno, /* starting block of extent */ - xfs_extlen_t len) /* length of extent */ + xfs_agblock_t *bno, /* output: starting block of extent */ + xfs_extlen_t *len, /* output: length of extent */ + int *stat) /* output: success/failure */ { - xfs_alloc_block_t *block; /* btree block to update */ + xfs_btree_block_t *block; /* btree block */ + xfs_btree_rec_t *rec; /* record data */ + xfs_buf_t *bp; /* buffer containing btree block */ +#ifdef DEBUG int error; /* error return value */ - int ptr; /* current record number (updating) */ +#endif + int ptr; /* record number */ - ASSERT(len > 0); - /* - * Pick up the a.g. freelist struct and the current block. 
- */ - block = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[0]); + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGFFF(cur, *ino, *fcnt, *free); + + ptr = cur->bc_ptrs[0]; + block = xfs_alloc_get_block(cur, 0, &bp); #ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, 0, cur->bc_bufs[0]))) + error = xfs_btree_check_sblock(cur, block, 0, bp); + if (error) return error; #endif /* - * Get the address of the rec to be updated. - */ - ptr = cur->bc_ptrs[0]; - { - xfs_alloc_rec_t *rp; /* pointer to updated record */ - - rp = XFS_ALLOC_REC_ADDR(block, ptr, cur); - /* - * Fill in the new contents and log them. - */ - rp->ar_startblock = cpu_to_be32(bno); - rp->ar_blockcount = cpu_to_be32(len); - xfs_alloc_log_recs(cur, cur->bc_bufs[0], ptr, ptr); - } - /* - * If it's the by-size btree and it's the last leaf block and - * it's the last record... then update the size of the longest - * extent in the a.g., which we cache in the a.g. freelist header. + * Off the right end or left end, return failure. */ - if (cur->bc_btnum == XFS_BTNUM_CNT && - be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK && - ptr == be16_to_cpu(block->bb_numrecs)) { - xfs_agf_t *agf; /* a.g. freespace header */ - xfs_agnumber_t seqno; - - agf = XFS_BUF_TO_AGF(cur->bc_private.a.agbp); - seqno = be32_to_cpu(agf->agf_seqno); - cur->bc_mp->m_perag[seqno].pagf_longest = len; - agf->agf_longest = cpu_to_be32(len); - xfs_alloc_log_agf(cur->bc_tp, cur->bc_private.a.agbp, - XFS_AGF_LONGEST); + if (ptr > be16_to_cpu(block->bb_h.bb_numrecs) || ptr <= 0) { + XFS_BTREE_TRACE_CURSOR(cur, EXIT); + *stat = 0; + return 0; } /* - * Updating first record in leaf. Pass new key value up to our parent. + * Point to the record and extract its data. 
*/ - if (ptr == 1) { - xfs_alloc_key_t key; /* key containing [bno, len] */ - - key.ar_startblock = cpu_to_be32(bno); - key.ar_blockcount = cpu_to_be32(len); - if ((error = xfs_alloc_updkey(cur, &key, 1))) - return error; - } + rec = xfs_alloc_rec_addr(cur, ptr, block); + *bno = be32_to_cpu(rec->u.alloc.ar_startblock); + *len = be32_to_cpu(rec->u.alloc.ar_blockcount); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); + *stat = 1; return 0; } + Index: 2.6.x-xfs-new/fs/xfs/xfs_alloc_btree.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_alloc_btree.h 2007-02-07 13:24:32.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_alloc_btree.h 2007-11-06 19:40:29.702675076 +1100 @@ -94,6 +94,8 @@ typedef struct xfs_btree_sblock xfs_allo #define XFS_ALLOC_PTR_ADDR(bb,i,cur) \ XFS_BTREE_PTR_ADDR(xfs_alloc, bb, i, XFS_ALLOC_BLOCK_MAXRECS(1, cur)) +extern void xfs_alloc_init_cursor(struct xfs_btree_cur *cur); + /* * Decrement cursor by one record at the level. * For nonzero levels the leaf-ward information is untouched. 
Index: 2.6.x-xfs-new/fs/xfs/xfs_bmap.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_bmap.c 2007-11-05 10:08:51.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_bmap.c 2007-11-06 19:40:29.710674046 +1100 @@ -817,10 +817,10 @@ xfs_bmap_add_extent_delay_real( RIGHT.br_blockcount, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; ASSERT(i == 1); if ((error = xfs_bmbt_update(cur, LEFT.br_startoff, @@ -930,7 +930,7 @@ xfs_bmap_add_extent_delay_real( goto done; ASSERT(i == 0); cur->bc_rec.b.br_state = XFS_EXT_NORM; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error = xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); } @@ -1006,7 +1006,7 @@ xfs_bmap_add_extent_delay_real( goto done; ASSERT(i == 0); cur->bc_rec.b.br_state = XFS_EXT_NORM; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error = xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); } @@ -1096,7 +1096,7 @@ xfs_bmap_add_extent_delay_real( goto done; ASSERT(i == 0); cur->bc_rec.b.br_state = XFS_EXT_NORM; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error = xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); } @@ -1151,7 +1151,7 @@ xfs_bmap_add_extent_delay_real( goto done; ASSERT(i == 0); cur->bc_rec.b.br_state = XFS_EXT_NORM; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error = xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); } @@ -1378,16 +1378,16 @@ xfs_bmap_add_extent_unwritten_real( RIGHT.br_blockcount, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = 
xfs_btree_delete(cur, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; ASSERT(i == 1); if ((error = xfs_bmbt_update(cur, LEFT.br_startoff, @@ -1427,10 +1427,10 @@ xfs_bmap_add_extent_unwritten_real( &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; ASSERT(i == 1); if ((error = xfs_bmbt_update(cur, LEFT.br_startoff, @@ -1470,10 +1470,10 @@ xfs_bmap_add_extent_unwritten_real( RIGHT.br_blockcount, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; ASSERT(i == 1); if ((error = xfs_bmbt_update(cur, new->br_startoff, @@ -1556,7 +1556,7 @@ xfs_bmap_add_extent_unwritten_real( PREV.br_blockcount - new->br_blockcount, oldext))) goto done; - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; if (xfs_bmbt_update(cur, LEFT.br_startoff, LEFT.br_startblock, @@ -1604,7 +1604,7 @@ xfs_bmap_add_extent_unwritten_real( oldext))) goto done; cur->bc_rec.b = *new; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error = xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); } @@ -1646,7 +1646,7 @@ xfs_bmap_add_extent_unwritten_real( PREV.br_blockcount - new->br_blockcount, oldext))) goto done; - if ((error = xfs_bmbt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto done; if ((error = xfs_bmbt_update(cur, new->br_startoff, new->br_startblock, @@ -1694,7 +1694,7 @@ xfs_bmap_add_extent_unwritten_real( goto done; ASSERT(i == 0); cur->bc_rec.b.br_state = XFS_EXT_NORM; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error 
= xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); } @@ -1742,15 +1742,15 @@ xfs_bmap_add_extent_unwritten_real( PREV.br_blockcount = new->br_startoff - PREV.br_startoff; cur->bc_rec.b = PREV; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error = xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto done; ASSERT(i == 1); /* new middle extent - newext */ cur->bc_rec.b = *new; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error = xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); } @@ -2098,10 +2098,10 @@ xfs_bmap_add_extent_hole_real( right.br_blockcount, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; ASSERT(i == 1); - if ((error = xfs_bmbt_decrement(cur, 0, &i))) + if ((error = xfs_btree_decrement(cur, 0, &i))) goto done; ASSERT(i == 1); if ((error = xfs_bmbt_update(cur, left.br_startoff, @@ -2210,7 +2210,7 @@ xfs_bmap_add_extent_hole_real( goto done; ASSERT(i == 0); cur->bc_rec.b.br_state = new->br_state; - if ((error = xfs_bmbt_insert(cur, &i))) + if ((error = xfs_btree_insert(cur, &i))) goto done; ASSERT(i == 1); } @@ -2989,7 +2989,7 @@ xfs_bmap_btree_to_extents( int whichfork) /* data or attr fork */ { /* REFERENCED */ - xfs_bmbt_block_t *cblock;/* child btree block */ + xfs_btree_block_t *cblock;/* child btree block */ xfs_fsblock_t cbno; /* child block number */ xfs_buf_t *cbp; /* child block's buffer */ int error; /* error return value */ @@ -3016,7 +3016,7 @@ xfs_bmap_btree_to_extents( if ((error = xfs_btree_read_bufl(mp, tp, cbno, 0, &cbp, XFS_BMAP_BTREE_REF))) return error; - cblock = XFS_BUF_TO_BMBT_BLOCK(cbp); + cblock = XFS_BUF_TO_BLOCK(cbp); if ((error = xfs_btree_check_lblock(cur, cblock, 0, cbp))) return error; xfs_bmap_add_free(cbno, 1, cur->bc_private.b.flist, mp); @@ -3163,7 +3163,7 @@ xfs_bmap_del_extent( flags |= XFS_ILOG_FEXT(whichfork); break; } 
- if ((error = xfs_bmbt_delete(cur, &i))) + if ((error = xfs_btree_delete(cur, &i))) goto done; ASSERT(i == 1); break; @@ -3247,10 +3247,10 @@ xfs_bmap_del_extent( got.br_startblock, temp, got.br_state))) goto done; - if ((error = xfs_bmbt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto done; cur->bc_rec.b = new; - error = xfs_bmbt_insert(cur, &i); + error = xfs_btree_insert(cur, &i); if (error && error != ENOSPC) goto done; /* Index: 2.6.x-xfs-new/fs/xfs/xfs_bmap_btree.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_bmap_btree.c 2007-11-05 10:09:31.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_bmap_btree.c 2007-11-06 19:41:45.344933663 +1100 @@ -35,1466 +35,544 @@ #include "xfs_dinode.h" #include "xfs_inode.h" #include "xfs_inode_item.h" -#include "xfs_alloc.h" #include "xfs_btree.h" #include "xfs_ialloc.h" +#include "xfs_alloc.h" #include "xfs_itable.h" #include "xfs_bmap.h" #include "xfs_error.h" #include "xfs_quota.h" -#if defined(XFS_BMBT_TRACE) -ktrace_t *xfs_bmbt_trace_buf; -#endif - /* - * Prototypes for internal btree functions. + * Determine the extent state. 
*/ - - -STATIC int xfs_bmbt_killroot(xfs_btree_cur_t *); -STATIC void xfs_bmbt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_bmbt_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC int xfs_bmbt_lshift(xfs_btree_cur_t *, int, int *); -STATIC int xfs_bmbt_rshift(xfs_btree_cur_t *, int, int *); -STATIC int xfs_bmbt_split(xfs_btree_cur_t *, int, xfs_fsblock_t *, - __uint64_t *, xfs_btree_cur_t **, int *); -STATIC int xfs_bmbt_updkey(xfs_btree_cur_t *, xfs_bmbt_key_t *, int); - - -#if defined(XFS_BMBT_TRACE) - -static char ARGS[] = "args"; -static char ENTRY[] = "entry"; -static char ERROR[] = "error"; -#undef EXIT -static char EXIT[] = "exit"; +/* ARGSUSED */ +STATIC xfs_exntst_t +xfs_extent_state( + xfs_filblks_t blks, + int extent_flag) +{ + if (extent_flag) { + ASSERT(blks != 0); /* saved for DMIG */ + return XFS_EXT_UNWRITTEN; + } + return XFS_EXT_NORM; +} /* - * Add a trace buffer entry for the arguments given to the routine, - * generic form. + * Convert on-disk form of btree root to in-memory form. 
*/ -STATIC void -xfs_bmbt_trace_enter( - const char *func, - xfs_btree_cur_t *cur, - char *s, - int type, - int line, - __psunsigned_t a0, - __psunsigned_t a1, - __psunsigned_t a2, - __psunsigned_t a3, - __psunsigned_t a4, - __psunsigned_t a5, - __psunsigned_t a6, - __psunsigned_t a7, - __psunsigned_t a8, - __psunsigned_t a9, - __psunsigned_t a10) +void +xfs_bmdr_to_bmbt( + xfs_bmdr_block_t *dblock, + int dblocklen, + xfs_bmbt_block_t *rblock, + int rblocklen) { - xfs_inode_t *ip; - int whichfork; + int dmxr; + xfs_bmbt_key_t *fkp; + __be64 *fpp; + xfs_bmbt_key_t *tkp; + __be64 *tpp; - ip = cur->bc_private.b.ip; - whichfork = cur->bc_private.b.whichfork; - ktrace_enter(xfs_bmbt_trace_buf, - (void *)((__psint_t)type | (whichfork << 8) | (line << 16)), - (void *)func, (void *)s, (void *)ip, (void *)cur, - (void *)a0, (void *)a1, (void *)a2, (void *)a3, - (void *)a4, (void *)a5, (void *)a6, (void *)a7, - (void *)a8, (void *)a9, (void *)a10); - ASSERT(ip->i_btrace); - ktrace_enter(ip->i_btrace, - (void *)((__psint_t)type | (whichfork << 8) | (line << 16)), - (void *)func, (void *)s, (void *)ip, (void *)cur, - (void *)a0, (void *)a1, (void *)a2, (void *)a3, - (void *)a4, (void *)a5, (void *)a6, (void *)a7, - (void *)a8, (void *)a9, (void *)a10); + rblock->bb_magic = cpu_to_be32(XFS_BMAP_MAGIC); + rblock->bb_level = dblock->bb_level; + ASSERT(be16_to_cpu(rblock->bb_level) > 0); + rblock->bb_numrecs = dblock->bb_numrecs; + rblock->bb_leftsib = cpu_to_be64(NULLDFSBNO); + rblock->bb_rightsib = cpu_to_be64(NULLDFSBNO); + dmxr = (int)XFS_BTREE_BLOCK_MAXRECS(dblocklen, xfs_bmdr, 0); + fkp = XFS_BTREE_KEY_ADDR(xfs_bmdr, dblock, 1); + tkp = XFS_BMAP_BROOT_KEY_ADDR(rblock, 1, rblocklen); + fpp = XFS_BTREE_PTR_ADDR(xfs_bmdr, dblock, 1, dmxr); + tpp = XFS_BMAP_BROOT_PTR_ADDR(rblock, 1, rblocklen); + dmxr = be16_to_cpu(dblock->bb_numrecs); + memcpy(tkp, fkp, sizeof(*fkp) * dmxr); + memcpy(tpp, fpp, sizeof(*fpp) * dmxr); } + /* - * Add a trace buffer entry for arguments, for a buffer 
& 1 integer arg. + * Convert a compressed bmap extent record to an uncompressed form. + * This code must be in sync with the routines xfs_bmbt_get_startoff, + * xfs_bmbt_get_startblock, xfs_bmbt_get_blockcount and xfs_bmbt_get_state. */ -STATIC void -xfs_bmbt_trace_argbi( - const char *func, - xfs_btree_cur_t *cur, - xfs_buf_t *b, - int i, - int line) +STATIC_INLINE void +__xfs_bmbt_get_all( + __uint64_t l0, + __uint64_t l1, + xfs_bmbt_irec_t *s) { - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGBI, line, - (__psunsigned_t)b, i, 0, 0, - 0, 0, 0, 0, - 0, 0, 0); + int ext_flag; + xfs_exntst_t st; + + ext_flag = (int)(l0 >> (64 - BMBT_EXNTFLAG_BITLEN)); + s->br_startoff = ((xfs_fileoff_t)l0 & + XFS_MASK64LO(64 - BMBT_EXNTFLAG_BITLEN)) >> 9; +#if XFS_BIG_BLKNOS + s->br_startblock = (((xfs_fsblock_t)l0 & XFS_MASK64LO(9)) << 43) | + (((xfs_fsblock_t)l1) >> 21); +#else +#ifdef DEBUG + { + xfs_dfsbno_t b; + + b = (((xfs_dfsbno_t)l0 & XFS_MASK64LO(9)) << 43) | + (((xfs_dfsbno_t)l1) >> 21); + ASSERT((b >> 32) == 0 || ISNULLDSTARTBLOCK(b)); + s->br_startblock = (xfs_fsblock_t)b; + } +#else /* !DEBUG */ + s->br_startblock = (xfs_fsblock_t)(((xfs_dfsbno_t)l1) >> 21); +#endif /* DEBUG */ +#endif /* XFS_BIG_BLKNOS */ + s->br_blockcount = (xfs_filblks_t)(l1 & XFS_MASK64LO(21)); + /* This is xfs_extent_state() in-line */ + if (ext_flag) { + ASSERT(s->br_blockcount != 0); /* saved for DMIG */ + st = XFS_EXT_UNWRITTEN; + } else + st = XFS_EXT_NORM; + s->br_state = st; } -/* - * Add a trace buffer entry for arguments, for a buffer & 2 integer args. 
- */ -STATIC void -xfs_bmbt_trace_argbii( - const char *func, - xfs_btree_cur_t *cur, - xfs_buf_t *b, - int i0, - int i1, - int line) +void +xfs_bmbt_get_all( + xfs_bmbt_rec_host_t *r, + xfs_bmbt_irec_t *s) { - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGBII, line, - (__psunsigned_t)b, i0, i1, 0, - 0, 0, 0, 0, - 0, 0, 0); + __xfs_bmbt_get_all(r->l0, r->l1, s); } /* - * Add a trace buffer entry for arguments, for 3 block-length args - * and an integer arg. + * Extract the blockcount field from an in memory bmap extent record. */ -STATIC void -xfs_bmbt_trace_argfffi( - const char *func, - xfs_btree_cur_t *cur, - xfs_dfiloff_t o, - xfs_dfsbno_t b, - xfs_dfilblks_t i, - int j, - int line) +xfs_filblks_t +xfs_bmbt_get_blockcount( + xfs_bmbt_rec_host_t *r) { - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGFFFI, line, - o >> 32, (int)o, b >> 32, (int)b, - i >> 32, (int)i, (int)j, 0, - 0, 0, 0); + return (xfs_filblks_t)(r->l1 & XFS_MASK64LO(21)); } /* - * Add a trace buffer entry for arguments, for one integer arg. + * Extract the startblock field from an in memory bmap extent record. */ -STATIC void -xfs_bmbt_trace_argi( - const char *func, - xfs_btree_cur_t *cur, - int i, - int line) +xfs_fsblock_t +xfs_bmbt_get_startblock( + xfs_bmbt_rec_host_t *r) { - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGI, line, - i, 0, 0, 0, - 0, 0, 0, 0, - 0, 0, 0); +#if XFS_BIG_BLKNOS + return (((xfs_fsblock_t)r->l0 & XFS_MASK64LO(9)) << 43) | + (((xfs_fsblock_t)r->l1) >> 21); +#else +#ifdef DEBUG + xfs_dfsbno_t b; + + b = (((xfs_dfsbno_t)r->l0 & XFS_MASK64LO(9)) << 43) | + (((xfs_dfsbno_t)r->l1) >> 21); + ASSERT((b >> 32) == 0 || ISNULLDSTARTBLOCK(b)); + return (xfs_fsblock_t)b; +#else /* !DEBUG */ + return (xfs_fsblock_t)(((xfs_dfsbno_t)r->l1) >> 21); +#endif /* DEBUG */ +#endif /* XFS_BIG_BLKNOS */ } /* - * Add a trace buffer entry for arguments, for int, fsblock, key. + * Extract the startoff field from an in memory bmap extent record. 
*/ -STATIC void -xfs_bmbt_trace_argifk( - const char *func, - xfs_btree_cur_t *cur, - int i, - xfs_fsblock_t f, - xfs_dfiloff_t o, - int line) +xfs_fileoff_t +xfs_bmbt_get_startoff( + xfs_bmbt_rec_host_t *r) { - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGIFK, line, - i, (xfs_dfsbno_t)f >> 32, (int)f, o >> 32, - (int)o, 0, 0, 0, - 0, 0, 0); + return ((xfs_fileoff_t)r->l0 & + XFS_MASK64LO(64 - BMBT_EXNTFLAG_BITLEN)) >> 9; } -/* - * Add a trace buffer entry for arguments, for int, fsblock, rec. - */ -STATIC void -xfs_bmbt_trace_argifr( - const char *func, - xfs_btree_cur_t *cur, - int i, - xfs_fsblock_t f, - xfs_bmbt_rec_t *r, - int line) +xfs_exntst_t +xfs_bmbt_get_state( + xfs_bmbt_rec_host_t *r) { - xfs_dfsbno_t b; - xfs_dfilblks_t c; - xfs_dfsbno_t d; - xfs_dfiloff_t o; - xfs_bmbt_irec_t s; - - d = (xfs_dfsbno_t)f; - xfs_bmbt_disk_get_all(r, &s); - o = (xfs_dfiloff_t)s.br_startoff; - b = (xfs_dfsbno_t)s.br_startblock; - c = s.br_blockcount; - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGIFR, line, - i, d >> 32, (int)d, o >> 32, - (int)o, b >> 32, (int)b, c >> 32, - (int)c, 0, 0); + int ext_flag; + + ext_flag = (int)((r->l0) >> (64 - BMBT_EXNTFLAG_BITLEN)); + return xfs_extent_state(xfs_bmbt_get_blockcount(r), + ext_flag); } -/* - * Add a trace buffer entry for arguments, for int, key. - */ -STATIC void -xfs_bmbt_trace_argik( - const char *func, - xfs_btree_cur_t *cur, - int i, - xfs_bmbt_key_t *k, - int line) +/* Endian flipping versions of the bmbt extraction functions */ +void +xfs_bmbt_disk_get_all( + xfs_bmbt_rec_t *r, + xfs_bmbt_irec_t *s) { - xfs_dfiloff_t o; - - o = be64_to_cpu(k->br_startoff); - xfs_bmbt_trace_enter(func, cur, ARGS, XFS_BMBT_KTRACE_ARGIFK, line, - i, o >> 32, (int)o, 0, - 0, 0, 0, 0, - 0, 0, 0); + __xfs_bmbt_get_all(be64_to_cpu(r->l0), be64_to_cpu(r->l1), s); } /* - * Add a trace buffer entry for the cursor/operation. + * Extract the blockcount field from an on disk bmap extent record. 
*/ -STATIC void -xfs_bmbt_trace_cursor( - const char *func, - xfs_btree_cur_t *cur, - char *s, - int line) +xfs_filblks_t +xfs_bmbt_disk_get_blockcount( + xfs_bmbt_rec_t *r) { - xfs_bmbt_rec_host_t r; - - xfs_bmbt_set_all(&r, &cur->bc_rec.b); - xfs_bmbt_trace_enter(func, cur, s, XFS_BMBT_KTRACE_CUR, line, - (cur->bc_nlevels << 24) | (cur->bc_private.b.flags << 16) | - cur->bc_private.b.allocated, - r.l0 >> 32, (int)r.l0, - r.l1 >> 32, (int)r.l1, - (unsigned long)cur->bc_bufs[0], (unsigned long)cur->bc_bufs[1], - (unsigned long)cur->bc_bufs[2], (unsigned long)cur->bc_bufs[3], - (cur->bc_ptrs[0] << 16) | cur->bc_ptrs[1], - (cur->bc_ptrs[2] << 16) | cur->bc_ptrs[3]); -} - -#define XFS_BMBT_TRACE_ARGBI(c,b,i) \ - xfs_bmbt_trace_argbi(__FUNCTION__, c, b, i, __LINE__) -#define XFS_BMBT_TRACE_ARGBII(c,b,i,j) \ - xfs_bmbt_trace_argbii(__FUNCTION__, c, b, i, j, __LINE__) -#define XFS_BMBT_TRACE_ARGFFFI(c,o,b,i,j) \ - xfs_bmbt_trace_argfffi(__FUNCTION__, c, o, b, i, j, __LINE__) -#define XFS_BMBT_TRACE_ARGI(c,i) \ - xfs_bmbt_trace_argi(__FUNCTION__, c, i, __LINE__) -#define XFS_BMBT_TRACE_ARGIFK(c,i,f,s) \ - xfs_bmbt_trace_argifk(__FUNCTION__, c, i, f, s, __LINE__) -#define XFS_BMBT_TRACE_ARGIFR(c,i,f,r) \ - xfs_bmbt_trace_argifr(__FUNCTION__, c, i, f, r, __LINE__) -#define XFS_BMBT_TRACE_ARGIK(c,i,k) \ - xfs_bmbt_trace_argik(__FUNCTION__, c, i, k, __LINE__) -#define XFS_BMBT_TRACE_CURSOR(c,s) \ - xfs_bmbt_trace_cursor(__FUNCTION__, c, s, __LINE__) -#else -#define XFS_BMBT_TRACE_ARGBI(c,b,i) -#define XFS_BMBT_TRACE_ARGBII(c,b,i,j) -#define XFS_BMBT_TRACE_ARGFFFI(c,o,b,i,j) -#define XFS_BMBT_TRACE_ARGI(c,i) -#define XFS_BMBT_TRACE_ARGIFK(c,i,f,s) -#define XFS_BMBT_TRACE_ARGIFR(c,i,f,r) -#define XFS_BMBT_TRACE_ARGIK(c,i,k) -#define XFS_BMBT_TRACE_CURSOR(c,s) -#endif /* XFS_BMBT_TRACE */ - + return (xfs_filblks_t)(be64_to_cpu(r->l1) & XFS_MASK64LO(21)); +} /* - * Internal functions. + * Extract the startoff field from a disk format bmap extent record. 
*/ +xfs_fileoff_t +xfs_bmbt_disk_get_startoff( + xfs_bmbt_rec_t *r) +{ + return ((xfs_fileoff_t)be64_to_cpu(r->l0) & + XFS_MASK64LO(64 - BMBT_EXNTFLAG_BITLEN)) >> 9; +} /* - * Delete record pointed to by cur/level. + * Set all the fields in a bmap extent record from the arguments. */ -STATIC int /* error */ -xfs_bmbt_delrec( - xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ +void +xfs_bmbt_set_allf( + xfs_bmbt_rec_host_t *r, + xfs_fileoff_t startoff, + xfs_fsblock_t startblock, + xfs_filblks_t blockcount, + xfs_exntst_t state) { - xfs_bmbt_block_t *block; /* bmap btree block */ - xfs_fsblock_t bno; /* fs-relative block number */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop counter */ - int j; /* temp state */ - xfs_bmbt_key_t key; /* bmap btree key */ - xfs_bmbt_key_t *kp=NULL; /* pointer to bmap btree key */ - xfs_fsblock_t lbno; /* left sibling block number */ - xfs_buf_t *lbp; /* left buffer pointer */ - xfs_bmbt_block_t *left; /* left btree block */ - xfs_bmbt_key_t *lkp; /* left btree key */ - xfs_bmbt_ptr_t *lpp; /* left address pointer */ - int lrecs=0; /* left record count */ - xfs_bmbt_rec_t *lrp; /* left record pointer */ - xfs_mount_t *mp; /* file system mount point */ - xfs_bmbt_ptr_t *pp; /* pointer to bmap block addr */ - int ptr; /* key/record index */ - xfs_fsblock_t rbno; /* right sibling block number */ - xfs_buf_t *rbp; /* right buffer pointer */ - xfs_bmbt_block_t *right; /* right btree block */ - xfs_bmbt_key_t *rkp; /* right btree key */ - xfs_bmbt_rec_t *rp; /* pointer to bmap btree rec */ - xfs_bmbt_ptr_t *rpp; /* right address pointer */ - xfs_bmbt_block_t *rrblock; /* right-right btree block */ - xfs_buf_t *rrbp; /* right-right buffer pointer */ - int rrecs=0; /* right record count */ - xfs_bmbt_rec_t *rrp; /* right record pointer */ - xfs_btree_cur_t *tcur; /* temporary btree cursor */ - int numrecs; /* temporary numrec count */ - int numlrecs, numrrecs; - - 
XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - ptr = cur->bc_ptrs[level]; - tcur = NULL; - if (ptr == 0) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - block = xfs_bmbt_get_block(cur, level, &bp); - numrecs = be16_to_cpu(block->bb_numrecs); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } -#endif - if (ptr > numrecs) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - XFS_STATS_INC(xs_bmbt_delrec); - if (level > 0) { - kp = XFS_BMAP_KEY_IADDR(block, 1, cur); - pp = XFS_BMAP_PTR_IADDR(block, 1, cur); -#ifdef DEBUG - for (i = ptr; i < numrecs; i++) { - if ((error = xfs_btree_check_lptr_disk(cur, pp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - } -#endif - if (ptr < numrecs) { - memmove(&kp[ptr - 1], &kp[ptr], - (numrecs - ptr) * sizeof(*kp)); - memmove(&pp[ptr - 1], &pp[ptr], - (numrecs - ptr) * sizeof(*pp)); - xfs_bmbt_log_ptrs(cur, bp, ptr, numrecs - 1); - xfs_bmbt_log_keys(cur, bp, ptr, numrecs - 1); - } - } else { - rp = XFS_BMAP_REC_IADDR(block, 1, cur); - if (ptr < numrecs) { - memmove(&rp[ptr - 1], &rp[ptr], - (numrecs - ptr) * sizeof(*rp)); - xfs_bmbt_log_recs(cur, bp, ptr, numrecs - 1); - } - if (ptr == 1) { - key.br_startoff = - cpu_to_be64(xfs_bmbt_disk_get_startoff(rp)); - kp = &key; - } - } - numrecs--; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_bmbt_log_block(cur, bp, XFS_BB_NUMRECS); - /* - * We're at the root level. - * First, shrink the root block in-memory. - * Try to get rid of the next level down. - * If we can't then there's nothing left to do. 
- */ - if (level == cur->bc_nlevels - 1) { - xfs_iroot_realloc(cur->bc_private.b.ip, -1, - cur->bc_private.b.whichfork); - if ((error = xfs_bmbt_killroot(cur))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (level > 0 && (error = xfs_bmbt_decrement(cur, level, &j))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - if (ptr == 1 && (error = xfs_bmbt_updkey(cur, kp, level + 1))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (numrecs >= XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if (level > 0 && (error = xfs_bmbt_decrement(cur, level, &j))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - rbno = be64_to_cpu(block->bb_rightsib); - lbno = be64_to_cpu(block->bb_leftsib); - /* - * One child of root, need to get a chance to copy its contents - * into the root and delete it. Can't go up to next level, - * there's nothing to delete there. 
- */ - if (lbno == NULLFSBLOCK && rbno == NULLFSBLOCK && - level == cur->bc_nlevels - 2) { - if ((error = xfs_bmbt_killroot(cur))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (level > 0 && (error = xfs_bmbt_decrement(cur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - ASSERT(rbno != NULLFSBLOCK || lbno != NULLFSBLOCK); - if ((error = xfs_btree_dup_cursor(cur, &tcur))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - bno = NULLFSBLOCK; - if (rbno != NULLFSBLOCK) { - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_bmbt_increment(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - rbp = tcur->bc_bufs[level]; - right = XFS_BUF_TO_BMBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, right, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } -#endif - bno = be64_to_cpu(right->bb_leftsib); - if (be16_to_cpu(right->bb_numrecs) - 1 >= - XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if ((error = xfs_bmbt_lshift(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_BMAP_BLOCK_IMINRECS(level, tcur)); - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - tcur = NULL; - if (level > 0) { - if ((error = xfs_bmbt_decrement(cur, - level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, - ERROR); - goto error0; - } - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - } - rrecs = be16_to_cpu(right->bb_numrecs); - if (lbno != NULLFSBLOCK) { - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_bmbt_decrement(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - 
XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - } - } - if (lbno != NULLFSBLOCK) { - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - /* - * decrement to last in block - */ - if ((error = xfs_bmbt_decrement(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - i = xfs_btree_firstrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - lbp = tcur->bc_bufs[level]; - left = XFS_BUF_TO_BMBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, left, level, lbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } -#endif - bno = be64_to_cpu(left->bb_rightsib); - if (be16_to_cpu(left->bb_numrecs) - 1 >= - XFS_BMAP_BLOCK_IMINRECS(level, cur)) { - if ((error = xfs_bmbt_rshift(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_BMAP_BLOCK_IMINRECS(level, tcur)); - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - tcur = NULL; - if (level == 0) - cur->bc_ptrs[0]++; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - } - lrecs = be16_to_cpu(left->bb_numrecs); - } - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - tcur = NULL; - mp = cur->bc_mp; - ASSERT(bno != NULLFSBLOCK); - if (lbno != NULLFSBLOCK && - lrecs + be16_to_cpu(block->bb_numrecs) <= XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - rbno = bno; - right = block; - rbp = bp; - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, lbno, 0, &lbp, - XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - left = XFS_BUF_TO_BMBT_BLOCK(lbp); - if ((error = xfs_btree_check_lblock(cur, left, level, lbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - } else if (rbno != NULLFSBLOCK && - rrecs + be16_to_cpu(block->bb_numrecs) <= - XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - lbno = bno; - left = block; - lbp = bp; - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, rbno, 0, &rbp, - XFS_BMAP_BTREE_REF))) { - 
XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - right = XFS_BUF_TO_BMBT_BLOCK(rbp); - if ((error = xfs_btree_check_lblock(cur, right, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - lrecs = be16_to_cpu(left->bb_numrecs); - } else { - if (level > 0 && (error = xfs_bmbt_decrement(cur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - numlrecs = be16_to_cpu(left->bb_numrecs); - numrrecs = be16_to_cpu(right->bb_numrecs); - if (level > 0) { - lkp = XFS_BMAP_KEY_IADDR(left, numlrecs + 1, cur); - lpp = XFS_BMAP_PTR_IADDR(left, numlrecs + 1, cur); - rkp = XFS_BMAP_KEY_IADDR(right, 1, cur); - rpp = XFS_BMAP_PTR_IADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < numrrecs; i++) { - if ((error = xfs_btree_check_lptr_disk(cur, rpp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - } -#endif - memcpy(lkp, rkp, numrrecs * sizeof(*lkp)); - memcpy(lpp, rpp, numrrecs * sizeof(*lpp)); - xfs_bmbt_log_keys(cur, lbp, numlrecs + 1, numlrecs + numrrecs); - xfs_bmbt_log_ptrs(cur, lbp, numlrecs + 1, numlrecs + numrrecs); + int extent_flag = (state == XFS_EXT_NORM) ? 
0 : 1; + + ASSERT(state == XFS_EXT_NORM || state == XFS_EXT_UNWRITTEN); + ASSERT((startoff & XFS_MASK64HI(64-BMBT_STARTOFF_BITLEN)) == 0); + ASSERT((blockcount & XFS_MASK64HI(64-BMBT_BLOCKCOUNT_BITLEN)) == 0); + +#if XFS_BIG_BLKNOS + ASSERT((startblock & XFS_MASK64HI(64-BMBT_STARTBLOCK_BITLEN)) == 0); + + r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) | + ((xfs_bmbt_rec_base_t)startoff << 9) | + ((xfs_bmbt_rec_base_t)startblock >> 43); + r->l1 = ((xfs_bmbt_rec_base_t)startblock << 21) | + ((xfs_bmbt_rec_base_t)blockcount & + (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); +#else /* !XFS_BIG_BLKNOS */ + if (ISNULLSTARTBLOCK(startblock)) { + r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) | + ((xfs_bmbt_rec_base_t)startoff << 9) | + (xfs_bmbt_rec_base_t)XFS_MASK64LO(9); + r->l1 = XFS_MASK64HI(11) | + ((xfs_bmbt_rec_base_t)startblock << 21) | + ((xfs_bmbt_rec_base_t)blockcount & + (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); } else { - lrp = XFS_BMAP_REC_IADDR(left, numlrecs + 1, cur); - rrp = XFS_BMAP_REC_IADDR(right, 1, cur); - memcpy(lrp, rrp, numrrecs * sizeof(*lrp)); - xfs_bmbt_log_recs(cur, lbp, numlrecs + 1, numlrecs + numrrecs); - } - be16_add(&left->bb_numrecs, numrrecs); - left->bb_rightsib = right->bb_rightsib; - xfs_bmbt_log_block(cur, lbp, XFS_BB_RIGHTSIB | XFS_BB_NUMRECS); - if (be64_to_cpu(left->bb_rightsib) != NULLDFSBNO) { - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, - be64_to_cpu(left->bb_rightsib), - 0, &rrbp, XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - rrblock = XFS_BUF_TO_BMBT_BLOCK(rrbp); - if ((error = xfs_btree_check_lblock(cur, rrblock, level, rrbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; - } - rrblock->bb_leftsib = cpu_to_be64(lbno); - xfs_bmbt_log_block(cur, rrbp, XFS_BB_LEFTSIB); - } - xfs_bmap_add_free(XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(rbp)), 1, - cur->bc_private.b.flist, mp); - cur->bc_private.b.ip->i_d.di_nblocks--; - xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, XFS_ILOG_CORE); - 
XFS_TRANS_MOD_DQUOT_BYINO(mp, cur->bc_tp, cur->bc_private.b.ip, - XFS_TRANS_DQ_BCOUNT, -1L); - xfs_trans_binval(cur->bc_tp, rbp); - if (bp != lbp) { - cur->bc_bufs[level] = lbp; - cur->bc_ptrs[level] += lrecs; - cur->bc_ra[level] = 0; - } else if ((error = xfs_bmbt_increment(cur, level + 1, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - goto error0; + r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) | + ((xfs_bmbt_rec_base_t)startoff << 9); + r->l1 = ((xfs_bmbt_rec_base_t)startblock << 21) | + ((xfs_bmbt_rec_base_t)blockcount & + (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); } - if (level > 0) - cur->bc_ptrs[level]--; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 2; - return 0; - -error0: - if (tcur) - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; +#endif /* XFS_BIG_BLKNOS */ } /* - * Insert one record/level. Return information to the caller - * allowing the next level up to proceed if necessary. + * Set all the fields in a bmap extent record from the uncompressed form. */ -STATIC int /* error */ -xfs_bmbt_insrec( - xfs_btree_cur_t *cur, - int level, - xfs_fsblock_t *bnop, - xfs_bmbt_rec_t *recp, - xfs_btree_cur_t **curp, - int *stat) /* no-go/done/continue */ +void +xfs_bmbt_set_all( + xfs_bmbt_rec_host_t *r, + xfs_bmbt_irec_t *s) { - xfs_bmbt_block_t *block; /* bmap btree block */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop index */ - xfs_bmbt_key_t key; /* bmap btree key */ - xfs_bmbt_key_t *kp=NULL; /* pointer to bmap btree key */ - int logflags; /* inode logging flags */ - xfs_fsblock_t nbno; /* new block number */ - struct xfs_btree_cur *ncur; /* new btree cursor */ - __uint64_t startoff; /* new btree key value */ - xfs_bmbt_rec_t nrec; /* new record count */ - int optr; /* old key/record index */ - xfs_bmbt_ptr_t *pp; /* pointer to bmap block addr */ - int ptr; /* key/record index */ - xfs_bmbt_rec_t *rp=NULL; /* pointer to bmap btree rec */ - int numrecs; - - ASSERT(level < cur->bc_nlevels); - 
XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGIFR(cur, level, *bnop, recp); - ncur = NULL; - key.br_startoff = cpu_to_be64(xfs_bmbt_disk_get_startoff(recp)); - optr = ptr = cur->bc_ptrs[level]; - if (ptr == 0) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - XFS_STATS_INC(xs_bmbt_insrec); - block = xfs_bmbt_get_block(cur, level, &bp); - numrecs = be16_to_cpu(block->bb_numrecs); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if (ptr <= numrecs) { - if (level == 0) { - rp = XFS_BMAP_REC_IADDR(block, ptr, cur); - xfs_btree_check_rec(XFS_BTNUM_BMAP, recp, rp); - } else { - kp = XFS_BMAP_KEY_IADDR(block, ptr, cur); - xfs_btree_check_key(XFS_BTNUM_BMAP, &key, kp); - } - } -#endif - nbno = NULLFSBLOCK; - if (numrecs == XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - if (numrecs < XFS_BMAP_BLOCK_DMAXRECS(level, cur)) { - /* - * A root block, that can be made bigger. - */ - xfs_iroot_realloc(cur->bc_private.b.ip, 1, - cur->bc_private.b.whichfork); - block = xfs_bmbt_get_block(cur, level, &bp); - } else if (level == cur->bc_nlevels - 1) { - if ((error = xfs_bmbt_newroot(cur, &logflags, stat)) || - *stat == 0) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, - logflags); - block = xfs_bmbt_get_block(cur, level, &bp); - } else { - if ((error = xfs_bmbt_rshift(cur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if (i) { - /* nothing */ - } else { - if ((error = xfs_bmbt_lshift(cur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if (i) { - optr = ptr = cur->bc_ptrs[level]; - } else { - if ((error = xfs_bmbt_split(cur, level, - &nbno, &startoff, &ncur, - &i))) { - XFS_BMBT_TRACE_CURSOR(cur, - ERROR); - return error; - } - if (i) { - block = xfs_bmbt_get_block( - cur, level, &bp); -#ifdef DEBUG - if ((error = - xfs_btree_check_lblock(cur, - block, 
level, bp))) { - XFS_BMBT_TRACE_CURSOR( - cur, ERROR); - return error; - } -#endif - ptr = cur->bc_ptrs[level]; - xfs_bmbt_disk_set_allf(&nrec, - startoff, 0, 0, - XFS_EXT_NORM); - } else { - XFS_BMBT_TRACE_CURSOR(cur, - EXIT); - *stat = 0; - return 0; - } - } - } - } - } - numrecs = be16_to_cpu(block->bb_numrecs); - if (level > 0) { - kp = XFS_BMAP_KEY_IADDR(block, 1, cur); - pp = XFS_BMAP_PTR_IADDR(block, 1, cur); -#ifdef DEBUG - for (i = numrecs; i >= ptr; i--) { - if ((error = xfs_btree_check_lptr_disk(cur, pp[i - 1], - level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } -#endif - memmove(&kp[ptr], &kp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*kp)); - memmove(&pp[ptr], &pp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*pp)); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr(cur, *bnop, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - kp[ptr - 1] = key; - pp[ptr - 1] = cpu_to_be64(*bnop); - numrecs++; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_bmbt_log_keys(cur, bp, ptr, numrecs); - xfs_bmbt_log_ptrs(cur, bp, ptr, numrecs); - } else { - rp = XFS_BMAP_REC_IADDR(block, 1, cur); - memmove(&rp[ptr], &rp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*rp)); - rp[ptr - 1] = *recp; - numrecs++; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_bmbt_log_recs(cur, bp, ptr, numrecs); - } - xfs_bmbt_log_block(cur, bp, XFS_BB_NUMRECS); -#ifdef DEBUG - if (ptr < numrecs) { - if (level == 0) - xfs_btree_check_rec(XFS_BTNUM_BMAP, rp + ptr - 1, - rp + ptr); - else - xfs_btree_check_key(XFS_BTNUM_BMAP, kp + ptr - 1, - kp + ptr); - } -#endif - if (optr == 1 && (error = xfs_bmbt_updkey(cur, &key, level + 1))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - *bnop = nbno; - if (nbno != NULLFSBLOCK) { - *recp = nrec; - *curp = ncur; - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; + xfs_bmbt_set_allf(r, s->br_startoff, s->br_startblock, + s->br_blockcount, s->br_state); } -STATIC int -xfs_bmbt_killroot( - 
xfs_btree_cur_t *cur) + +/* + * Set all the fields in a disk format bmap extent record from the arguments. + */ +void +xfs_bmbt_disk_set_allf( + xfs_bmbt_rec_t *r, + xfs_fileoff_t startoff, + xfs_fsblock_t startblock, + xfs_filblks_t blockcount, + xfs_exntst_t state) { - xfs_bmbt_block_t *block; - xfs_bmbt_block_t *cblock; - xfs_buf_t *cbp; - xfs_bmbt_key_t *ckp; - xfs_bmbt_ptr_t *cpp; -#ifdef DEBUG - int error; -#endif - int i; - xfs_bmbt_key_t *kp; - xfs_inode_t *ip; - xfs_ifork_t *ifp; - int level; - xfs_bmbt_ptr_t *pp; + int extent_flag = (state == XFS_EXT_NORM) ? 0 : 1; - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - level = cur->bc_nlevels - 1; - ASSERT(level >= 1); - /* - * Don't deal with the root block needs to be a leaf case. - * We're just going to turn the thing back into extents anyway. - */ - if (level == 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - return 0; - } - block = xfs_bmbt_get_block(cur, level, &cbp); - /* - * Give up if the root has multiple children. - */ - if (be16_to_cpu(block->bb_numrecs) != 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - return 0; - } - /* - * Only do this if the next level will fit. - * Then the data must be copied up to the inode, - * instead of freeing the root you free the next level. 
- */ - cbp = cur->bc_bufs[level - 1]; - cblock = XFS_BUF_TO_BMBT_BLOCK(cbp); - if (be16_to_cpu(cblock->bb_numrecs) > XFS_BMAP_BLOCK_DMAXRECS(level, cur)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - return 0; - } - ASSERT(be64_to_cpu(cblock->bb_leftsib) == NULLDFSBNO); - ASSERT(be64_to_cpu(cblock->bb_rightsib) == NULLDFSBNO); - ip = cur->bc_private.b.ip; - ifp = XFS_IFORK_PTR(ip, cur->bc_private.b.whichfork); - ASSERT(XFS_BMAP_BLOCK_IMAXRECS(level, cur) == - XFS_BMAP_BROOT_MAXRECS(ifp->if_broot_bytes)); - i = (int)(be16_to_cpu(cblock->bb_numrecs) - XFS_BMAP_BLOCK_IMAXRECS(level, cur)); - if (i) { - xfs_iroot_realloc(ip, i, cur->bc_private.b.whichfork); - block = ifp->if_broot; - } - be16_add(&block->bb_numrecs, i); - ASSERT(block->bb_numrecs == cblock->bb_numrecs); - kp = XFS_BMAP_KEY_IADDR(block, 1, cur); - ckp = XFS_BMAP_KEY_IADDR(cblock, 1, cur); - memcpy(kp, ckp, be16_to_cpu(block->bb_numrecs) * sizeof(*kp)); - pp = XFS_BMAP_PTR_IADDR(block, 1, cur); - cpp = XFS_BMAP_PTR_IADDR(cblock, 1, cur); -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(cblock->bb_numrecs); i++) { - if ((error = xfs_btree_check_lptr_disk(cur, cpp[i], level - 1))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } + ASSERT(state == XFS_EXT_NORM || state == XFS_EXT_UNWRITTEN); + ASSERT((startoff & XFS_MASK64HI(64-BMBT_STARTOFF_BITLEN)) == 0); + ASSERT((blockcount & XFS_MASK64HI(64-BMBT_BLOCKCOUNT_BITLEN)) == 0); + +#if XFS_BIG_BLKNOS + ASSERT((startblock & XFS_MASK64HI(64-BMBT_STARTBLOCK_BITLEN)) == 0); + + r->l0 = cpu_to_be64( + ((xfs_bmbt_rec_base_t)extent_flag << 63) | + ((xfs_bmbt_rec_base_t)startoff << 9) | + ((xfs_bmbt_rec_base_t)startblock >> 43)); + r->l1 = cpu_to_be64( + ((xfs_bmbt_rec_base_t)startblock << 21) | + ((xfs_bmbt_rec_base_t)blockcount & + (xfs_bmbt_rec_base_t)XFS_MASK64LO(21))); +#else /* !XFS_BIG_BLKNOS */ + if (ISNULLSTARTBLOCK(startblock)) { + r->l0 = cpu_to_be64( + ((xfs_bmbt_rec_base_t)extent_flag << 63) | + ((xfs_bmbt_rec_base_t)startoff << 9) | + 
(xfs_bmbt_rec_base_t)XFS_MASK64LO(9)); + r->l1 = cpu_to_be64(XFS_MASK64HI(11) | + ((xfs_bmbt_rec_base_t)startblock << 21) | + ((xfs_bmbt_rec_base_t)blockcount & + (xfs_bmbt_rec_base_t)XFS_MASK64LO(21))); + } else { + r->l0 = cpu_to_be64( + ((xfs_bmbt_rec_base_t)extent_flag << 63) | + ((xfs_bmbt_rec_base_t)startoff << 9)); + r->l1 = cpu_to_be64( + ((xfs_bmbt_rec_base_t)startblock << 21) | + ((xfs_bmbt_rec_base_t)blockcount & + (xfs_bmbt_rec_base_t)XFS_MASK64LO(21))); } -#endif - memcpy(pp, cpp, be16_to_cpu(block->bb_numrecs) * sizeof(*pp)); - xfs_bmap_add_free(XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(cbp)), 1, - cur->bc_private.b.flist, cur->bc_mp); - ip->i_d.di_nblocks--; - XFS_TRANS_MOD_DQUOT_BYINO(cur->bc_mp, cur->bc_tp, ip, - XFS_TRANS_DQ_BCOUNT, -1L); - xfs_trans_binval(cur->bc_tp, cbp); - cur->bc_bufs[level - 1] = NULL; - be16_add(&block->bb_level, -1); - xfs_trans_log_inode(cur->bc_tp, ip, - XFS_ILOG_CORE | XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); - cur->bc_nlevels--; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - return 0; +#endif /* XFS_BIG_BLKNOS */ } /* - * Log key values from the btree block. + * Set all the fields in a bmap extent record from the uncompressed form. */ -STATIC void -xfs_bmbt_log_keys( - xfs_btree_cur_t *cur, - xfs_buf_t *bp, - int kfirst, - int klast) +void +xfs_bmbt_disk_set_all( + xfs_bmbt_rec_t *r, + xfs_bmbt_irec_t *s) { - xfs_trans_t *tp; + xfs_bmbt_disk_set_allf(r, s->br_startoff, s->br_startblock, + s->br_blockcount, s->br_state); +} - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGBII(cur, bp, kfirst, klast); - tp = cur->bc_tp; - if (bp) { - xfs_bmbt_block_t *block; - int first; - xfs_bmbt_key_t *kp; - int last; +/* + * Set the blockcount field in a bmap extent record. 
+ */ +void +xfs_bmbt_set_blockcount( + xfs_bmbt_rec_host_t *r, + xfs_filblks_t v) +{ + ASSERT((v & XFS_MASK64HI(43)) == 0); + r->l1 = (r->l1 & (xfs_bmbt_rec_base_t)XFS_MASK64HI(43)) | + (xfs_bmbt_rec_base_t)(v & XFS_MASK64LO(21)); +} - block = XFS_BUF_TO_BMBT_BLOCK(bp); - kp = XFS_BMAP_KEY_DADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(tp, bp, first, last); +/* + * Set the startblock field in a bmap extent record. + */ +void +xfs_bmbt_set_startblock( + xfs_bmbt_rec_host_t *r, + xfs_fsblock_t v) +{ +#if XFS_BIG_BLKNOS + ASSERT((v & XFS_MASK64HI(12)) == 0); + r->l0 = (r->l0 & (xfs_bmbt_rec_base_t)XFS_MASK64HI(55)) | + (xfs_bmbt_rec_base_t)(v >> 43); + r->l1 = (r->l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)) | + (xfs_bmbt_rec_base_t)(v << 21); +#else /* !XFS_BIG_BLKNOS */ + if (ISNULLSTARTBLOCK(v)) { + r->l0 |= (xfs_bmbt_rec_base_t)XFS_MASK64LO(9); + r->l1 = (xfs_bmbt_rec_base_t)XFS_MASK64HI(11) | + ((xfs_bmbt_rec_base_t)v << 21) | + (r->l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); } else { - xfs_inode_t *ip; - - ip = cur->bc_private.b.ip; - xfs_trans_log_inode(tp, ip, - XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); + r->l0 &= ~(xfs_bmbt_rec_base_t)XFS_MASK64LO(9); + r->l1 = ((xfs_bmbt_rec_base_t)v << 21) | + (r->l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); +#endif /* XFS_BIG_BLKNOS */ } /* - * Log pointer values from the btree block. + * Set the startoff field in a bmap extent record. 
*/ -STATIC void -xfs_bmbt_log_ptrs( - xfs_btree_cur_t *cur, - xfs_buf_t *bp, - int pfirst, - int plast) +void +xfs_bmbt_set_startoff( + xfs_bmbt_rec_host_t *r, + xfs_fileoff_t v) { - xfs_trans_t *tp; + ASSERT((v & XFS_MASK64HI(9)) == 0); + r->l0 = (r->l0 & (xfs_bmbt_rec_base_t) XFS_MASK64HI(1)) | + ((xfs_bmbt_rec_base_t)v << 9) | + (r->l0 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(9)); +} - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGBII(cur, bp, pfirst, plast); - tp = cur->bc_tp; - if (bp) { - xfs_bmbt_block_t *block; - int first; - int last; - xfs_bmbt_ptr_t *pp; +/* + * Set the extent state field in a bmap extent record. + */ +void +xfs_bmbt_set_state( + xfs_bmbt_rec_host_t *r, + xfs_exntst_t v) +{ + ASSERT(v == XFS_EXT_NORM || v == XFS_EXT_UNWRITTEN); + if (v == XFS_EXT_NORM) + r->l0 &= XFS_MASK64LO(64 - BMBT_EXNTFLAG_BITLEN); + else + r->l0 |= XFS_MASK64HI(BMBT_EXNTFLAG_BITLEN); +} - block = XFS_BUF_TO_BMBT_BLOCK(bp); - pp = XFS_BMAP_PTR_DADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(tp, bp, first, last); - } else { - xfs_inode_t *ip; +/* + * Convert in-memory form of btree root to on-disk form. 
+ */ +void +xfs_bmbt_to_bmdr( + xfs_bmbt_block_t *rblock, + int rblocklen, + xfs_bmdr_block_t *dblock, + int dblocklen) +{ + int dmxr; + xfs_bmbt_key_t *fkp; + __be64 *fpp; + xfs_bmbt_key_t *tkp; + __be64 *tpp; - ip = cur->bc_private.b.ip; - xfs_trans_log_inode(tp, ip, - XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); + ASSERT(be32_to_cpu(rblock->bb_magic) == XFS_BMAP_MAGIC); + ASSERT(be64_to_cpu(rblock->bb_leftsib) == NULLDFSBNO); + ASSERT(be64_to_cpu(rblock->bb_rightsib) == NULLDFSBNO); + ASSERT(be16_to_cpu(rblock->bb_level) > 0); + dblock->bb_level = rblock->bb_level; + dblock->bb_numrecs = rblock->bb_numrecs; + dmxr = (int)XFS_BTREE_BLOCK_MAXRECS(dblocklen, xfs_bmdr, 0); + fkp = XFS_BMAP_BROOT_KEY_ADDR(rblock, 1, rblocklen); + tkp = XFS_BTREE_KEY_ADDR(xfs_bmdr, dblock, 1); + fpp = XFS_BMAP_BROOT_PTR_ADDR(rblock, 1, rblocklen); + tpp = XFS_BTREE_PTR_ADDR(xfs_bmdr, dblock, 1, dmxr); + dmxr = be16_to_cpu(dblock->bb_numrecs); + memcpy(tkp, fkp, sizeof(*fkp) * dmxr); + memcpy(tpp, fpp, sizeof(*fpp) * dmxr); } /* - * Lookup the record. The cursor is made to point to it, based on dir. + * Check extent records, which have just been read, for + * any bit in the extent flag field. ASSERT on debug + * kernels, as this condition should not occur. + * Return an error condition (1) if any flags found, + * otherwise return 0. 
*/ -STATIC int /* error */ -xfs_bmbt_lookup( - xfs_btree_cur_t *cur, - xfs_lookup_t dir, - int *stat) /* success/failure */ -{ - xfs_bmbt_block_t *block=NULL; - xfs_buf_t *bp; - xfs_daddr_t d; - xfs_sfiloff_t diff; - int error; /* error return value */ - xfs_fsblock_t fsbno=0; - int high; - int i; - int keyno=0; - xfs_bmbt_key_t *kkbase=NULL; - xfs_bmbt_key_t *kkp; - xfs_bmbt_rec_t *krbase=NULL; - xfs_bmbt_rec_t *krp; - int level; - int low; - xfs_mount_t *mp; - xfs_bmbt_ptr_t *pp; - xfs_bmbt_irec_t *rp; - xfs_fileoff_t startoff; - xfs_trans_t *tp; - - XFS_STATS_INC(xs_bmbt_lookup); - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, (int)dir); - tp = cur->bc_tp; - mp = cur->bc_mp; - rp = &cur->bc_rec.b; - for (level = cur->bc_nlevels - 1, diff = 1; level >= 0; level--) { - if (level < cur->bc_nlevels - 1) { - d = XFS_FSB_TO_DADDR(mp, fsbno); - bp = cur->bc_bufs[level]; - if (bp && XFS_BUF_ADDR(bp) != d) - bp = NULL; - if (!bp) { - if ((error = xfs_btree_read_bufl(mp, tp, fsbno, - 0, &bp, XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - xfs_btree_setbuf(cur, level, bp); - block = XFS_BUF_TO_BMBT_BLOCK(bp); - if ((error = xfs_btree_check_lblock(cur, block, - level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } else - block = XFS_BUF_TO_BMBT_BLOCK(bp); - } else - block = xfs_bmbt_get_block(cur, level, &bp); - if (diff == 0) - keyno = 1; - else { - if (level > 0) - kkbase = XFS_BMAP_KEY_IADDR(block, 1, cur); - else - krbase = XFS_BMAP_REC_IADDR(block, 1, cur); - low = 1; - if (!(high = be16_to_cpu(block->bb_numrecs))) { - ASSERT(level == 0); - cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - while (low <= high) { - XFS_STATS_INC(xs_bmbt_compare); - keyno = (low + high) >> 1; - if (level > 0) { - kkp = kkbase + keyno - 1; - startoff = be64_to_cpu(kkp->br_startoff); - } else { - krp = krbase + keyno - 1; - startoff = 
xfs_bmbt_disk_get_startoff(krp); - } - diff = (xfs_sfiloff_t) - (startoff - rp->br_startoff); - if (diff < 0) - low = keyno + 1; - else if (diff > 0) - high = keyno - 1; - else - break; - } - } - if (level > 0) { - if (diff > 0 && --keyno < 1) - keyno = 1; - pp = XFS_BMAP_PTR_IADDR(block, keyno, cur); - fsbno = be64_to_cpu(*pp); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr(cur, fsbno, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - cur->bc_ptrs[level] = keyno; - } - } - if (dir != XFS_LOOKUP_LE && diff < 0) { - keyno++; - /* - * If ge search and we went off the end of the block, but it's - * not the last block, we're in the wrong block. - */ - if (dir == XFS_LOOKUP_GE && keyno > be16_to_cpu(block->bb_numrecs) && - be64_to_cpu(block->bb_rightsib) != NULLDFSBNO) { - cur->bc_ptrs[0] = keyno; - if ((error = xfs_bmbt_increment(cur, 0, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - XFS_WANT_CORRUPTED_RETURN(i == 1); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; + +int +xfs_check_nostate_extents( + xfs_ifork_t *ifp, + xfs_extnum_t idx, + xfs_extnum_t num) +{ + for (; num > 0; num--, idx++) { + xfs_bmbt_rec_host_t *ep = xfs_iext_get_ext(ifp, idx); + if ((ep->l0 >> + (64 - BMBT_EXNTFLAG_BITLEN)) != 0) { + ASSERT(0); + return 1; } } - else if (dir == XFS_LOOKUP_LE && diff > 0) - keyno--; - cur->bc_ptrs[0] = keyno; - if (keyno == 0 || keyno > be16_to_cpu(block->bb_numrecs)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - } else { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = ((dir != XFS_LOOKUP_EQ) || (diff == 0)); - } return 0; } /* - * Move 1 record left from cur/level if possible. - * Update cur to reflect the new path. 
+ * BMBT function vectors for core btree operations */ -STATIC int /* error */ -xfs_bmbt_lshift( - xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ -{ - int error; /* error return value */ -#ifdef DEBUG - int i; /* loop counter */ -#endif - xfs_bmbt_key_t key; /* bmap btree key */ - xfs_buf_t *lbp; /* left buffer pointer */ - xfs_bmbt_block_t *left; /* left btree block */ - xfs_bmbt_key_t *lkp=NULL; /* left btree key */ - xfs_bmbt_ptr_t *lpp; /* left address pointer */ - int lrecs; /* left record count */ - xfs_bmbt_rec_t *lrp=NULL; /* left record pointer */ - xfs_mount_t *mp; /* file system mount point */ - xfs_buf_t *rbp; /* right buffer pointer */ - xfs_bmbt_block_t *right; /* right btree block */ - xfs_bmbt_key_t *rkp=NULL; /* right btree key */ - xfs_bmbt_ptr_t *rpp=NULL; /* right address pointer */ - xfs_bmbt_rec_t *rrp=NULL; /* right record pointer */ - int rrecs; /* right record count */ - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - if (level == cur->bc_nlevels - 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - rbp = cur->bc_bufs[level]; - right = XFS_BUF_TO_BMBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, right, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (be64_to_cpu(right->bb_leftsib) == NULLDFSBNO) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - if (cur->bc_ptrs[level] <= 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - mp = cur->bc_mp; - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, be64_to_cpu(right->bb_leftsib), 0, - &lbp, XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - left = XFS_BUF_TO_BMBT_BLOCK(lbp); - if ((error = xfs_btree_check_lblock(cur, left, level, lbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if (be16_to_cpu(left->bb_numrecs) == XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - XFS_BMBT_TRACE_CURSOR(cur, 
EXIT); - *stat = 0; - return 0; - } - lrecs = be16_to_cpu(left->bb_numrecs) + 1; - if (level > 0) { - lkp = XFS_BMAP_KEY_IADDR(left, lrecs, cur); - rkp = XFS_BMAP_KEY_IADDR(right, 1, cur); - *lkp = *rkp; - xfs_bmbt_log_keys(cur, lbp, lrecs, lrecs); - lpp = XFS_BMAP_PTR_IADDR(left, lrecs, cur); - rpp = XFS_BMAP_PTR_IADDR(right, 1, cur); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr_disk(cur, *rpp, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - *lpp = *rpp; - xfs_bmbt_log_ptrs(cur, lbp, lrecs, lrecs); - } else { - lrp = XFS_BMAP_REC_IADDR(left, lrecs, cur); - rrp = XFS_BMAP_REC_IADDR(right, 1, cur); - *lrp = *rrp; - xfs_bmbt_log_recs(cur, lbp, lrecs, lrecs); - } - left->bb_numrecs = cpu_to_be16(lrecs); - xfs_bmbt_log_block(cur, lbp, XFS_BB_NUMRECS); -#ifdef DEBUG - if (level > 0) - xfs_btree_check_key(XFS_BTNUM_BMAP, lkp - 1, lkp); - else - xfs_btree_check_rec(XFS_BTNUM_BMAP, lrp - 1, lrp); -#endif - rrecs = be16_to_cpu(right->bb_numrecs) - 1; - right->bb_numrecs = cpu_to_be16(rrecs); - xfs_bmbt_log_block(cur, rbp, XFS_BB_NUMRECS); - if (level > 0) { -#ifdef DEBUG - for (i = 0; i < rrecs; i++) { - if ((error = xfs_btree_check_lptr_disk(cur, rpp[i + 1], - level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } -#endif - memmove(rkp, rkp + 1, rrecs * sizeof(*rkp)); - memmove(rpp, rpp + 1, rrecs * sizeof(*rpp)); - xfs_bmbt_log_keys(cur, rbp, 1, rrecs); - xfs_bmbt_log_ptrs(cur, rbp, 1, rrecs); - } else { - memmove(rrp, rrp + 1, rrecs * sizeof(*rrp)); - xfs_bmbt_log_recs(cur, rbp, 1, rrecs); - key.br_startoff = cpu_to_be64(xfs_bmbt_disk_get_startoff(rrp)); - rkp = &key; - } - if ((error = xfs_bmbt_updkey(cur, rkp, level + 1))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - cur->bc_ptrs[level]--; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; -} /* - * Move 1 record right from cur/level if possible. - * Update cur to reflect the new path. 
+ * Get the block pointer for the given level of the cursor. + * Fill in the buffer pointer, if applicable. */ -STATIC int /* error */ -xfs_bmbt_rshift( +STATIC xfs_btree_block_t * +xfs_bmbt_get_block( xfs_btree_cur_t *cur, int level, - int *stat) /* success/failure */ + xfs_buf_t **bpp) { - int error; /* error return value */ - int i; /* loop counter */ - xfs_bmbt_key_t key; /* bmap btree key */ - xfs_buf_t *lbp; /* left buffer pointer */ - xfs_bmbt_block_t *left; /* left btree block */ - xfs_bmbt_key_t *lkp; /* left btree key */ - xfs_bmbt_ptr_t *lpp; /* left address pointer */ - xfs_bmbt_rec_t *lrp; /* left record pointer */ - xfs_mount_t *mp; /* file system mount point */ - xfs_buf_t *rbp; /* right buffer pointer */ - xfs_bmbt_block_t *right; /* right btree block */ - xfs_bmbt_key_t *rkp; /* right btree key */ - xfs_bmbt_ptr_t *rpp; /* right address pointer */ - xfs_bmbt_rec_t *rrp=NULL; /* right record pointer */ - struct xfs_btree_cur *tcur; /* temporary btree cursor */ - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - if (level == cur->bc_nlevels - 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - lbp = cur->bc_bufs[level]; - left = XFS_BUF_TO_BMBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, left, level, lbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (be64_to_cpu(left->bb_rightsib) == NULLDFSBNO) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - if (cur->bc_ptrs[level] >= be16_to_cpu(left->bb_numrecs)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - mp = cur->bc_mp; - if ((error = xfs_btree_read_bufl(mp, cur->bc_tp, be64_to_cpu(left->bb_rightsib), 0, - &rbp, XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - right = XFS_BUF_TO_BMBT_BLOCK(rbp); - if ((error = xfs_btree_check_lblock(cur, right, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if 
(be16_to_cpu(right->bb_numrecs) == XFS_BMAP_BLOCK_IMAXRECS(level, cur)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - if (level > 0) { - lkp = XFS_BMAP_KEY_IADDR(left, be16_to_cpu(left->bb_numrecs), cur); - lpp = XFS_BMAP_PTR_IADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rkp = XFS_BMAP_KEY_IADDR(right, 1, cur); - rpp = XFS_BMAP_PTR_IADDR(right, 1, cur); -#ifdef DEBUG - for (i = be16_to_cpu(right->bb_numrecs) - 1; i >= 0; i--) { - if ((error = xfs_btree_check_lptr_disk(cur, rpp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } -#endif - memmove(rkp + 1, rkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp + 1, rpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr_disk(cur, *lpp, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - *rkp = *lkp; - *rpp = *lpp; - xfs_bmbt_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - xfs_bmbt_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); + xfs_ifork_t *ifp; + xfs_bmbt_block_t *rval; + + if (level < cur->bc_nlevels - 1) { + *bpp = cur->bc_bufs[level]; + rval = XFS_BUF_TO_BMBT_BLOCK(*bpp); } else { - lrp = XFS_BMAP_REC_IADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rrp = XFS_BMAP_REC_IADDR(right, 1, cur); - memmove(rrp + 1, rrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - *rrp = *lrp; - xfs_bmbt_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - key.br_startoff = cpu_to_be64(xfs_bmbt_disk_get_startoff(rrp)); - rkp = &key; - } - be16_add(&left->bb_numrecs, -1); - xfs_bmbt_log_block(cur, lbp, XFS_BB_NUMRECS); - be16_add(&right->bb_numrecs, 1); -#ifdef DEBUG - if (level > 0) - xfs_btree_check_key(XFS_BTNUM_BMAP, rkp, rkp + 1); - else - xfs_btree_check_rec(XFS_BTNUM_BMAP, rrp, rrp + 1); -#endif - xfs_bmbt_log_block(cur, rbp, XFS_BB_NUMRECS); - if ((error = xfs_btree_dup_cursor(cur, &tcur))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - i = 
xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_bmbt_increment(tcur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(tcur, ERROR); - goto error1; - } - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_bmbt_updkey(tcur, rkp, level + 1))) { - XFS_BMBT_TRACE_CURSOR(tcur, ERROR); - goto error1; + *bpp = NULL; + ifp = XFS_IFORK_PTR(cur->bc_private.b.ip, + cur->bc_private.b.whichfork); + rval = ifp->if_broot; } - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; + return (xfs_btree_block_t *)rval; +} + + +STATIC int +xfs_bmbt_get_buf( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr, + int flags, + xfs_buf_t **bpp) +{ + xfs_buf_t *bp; + + BUG_ON(be64_to_cpu(ptr->u.bmbt) == 0); + bp = xfs_btree_get_bufl(cur->bc_mp, cur->bc_tp, + be64_to_cpu(ptr->u.bmbt), flags); + *bpp = bp; return 0; -error0: - XFS_BMBT_TRACE_CURSOR(cur, ERROR); -error1: - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; + } -/* - * Determine the extent state. - */ -/* ARGSUSED */ -STATIC xfs_exntst_t -xfs_extent_state( - xfs_filblks_t blks, - int extent_flag) +STATIC int +xfs_bmbt_read_buf( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr, + int flags, + xfs_buf_t **bpp) +{ + BUG_ON(be64_to_cpu(ptr->u.bmbt) == 0); + return xfs_btree_read_bufl(cur->bc_mp, cur->bc_tp, + be64_to_cpu(ptr->u.bmbt), flags, + bpp, XFS_BMAP_BTREE_REF); +} + +STATIC xfs_btree_block_t * +xfs_bmbt_buf_to_block( + xfs_btree_cur_t *cur, + xfs_buf_t *bp) { - if (extent_flag) { - ASSERT(blks != 0); /* saved for DMIG */ - return XFS_EXT_UNWRITTEN; - } - return XFS_EXT_NORM; + /* XFS_BUF_TO_BMBT_BLOCK(rbp); */ + return XFS_BUF_TO_BLOCK(bp); } +STATIC void +xfs_bmbt_buf_to_ptr( + xfs_btree_cur_t *cur, + xfs_buf_t *bp, + xfs_btree_ptr_t *ptr) +{ + ptr->u.bmbt = cpu_to_be64(XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp))); +} -/* - * Split cur/level block in half. 
- * Return new block number and its first record (to be inserted into parent). - */ -STATIC int /* error */ -xfs_bmbt_split( - xfs_btree_cur_t *cur, - int level, - xfs_fsblock_t *bnop, - __uint64_t *startoff, - xfs_btree_cur_t **curp, - int *stat) /* success/failure */ +STATIC int +xfs_bmbt_alloc_block( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *start, + xfs_btree_ptr_t *new, + int length, + int *stat) { xfs_alloc_arg_t args; /* block allocation args */ int error; /* error return value */ - int i; /* loop counter */ - xfs_fsblock_t lbno; /* left sibling block number */ - xfs_buf_t *lbp; /* left buffer pointer */ - xfs_bmbt_block_t *left; /* left btree block */ - xfs_bmbt_key_t *lkp; /* left btree key */ - xfs_bmbt_ptr_t *lpp; /* left address pointer */ - xfs_bmbt_rec_t *lrp; /* left record pointer */ - xfs_buf_t *rbp; /* right buffer pointer */ - xfs_bmbt_block_t *right; /* right btree block */ - xfs_bmbt_key_t *rkp; /* right btree key */ - xfs_bmbt_ptr_t *rpp; /* right address pointer */ - xfs_bmbt_block_t *rrblock; /* right-right btree block */ - xfs_buf_t *rrbp; /* right-right buffer pointer */ - xfs_bmbt_rec_t *rrp; /* right record pointer */ + xfs_fsblock_t sbno = be64_to_cpu(start->u.bmbt); - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGIFK(cur, level, *bnop, *startoff); + memset(&args, 0, sizeof(args)); args.tp = cur->bc_tp; args.mp = cur->bc_mp; - lbp = cur->bc_bufs[level]; - lbno = XFS_DADDR_TO_FSB(args.mp, XFS_BUF_ADDR(lbp)); - left = XFS_BUF_TO_BMBT_BLOCK(lbp); args.fsbno = cur->bc_private.b.firstblock; args.firstblock = args.fsbno; if (args.fsbno == NULLFSBLOCK) { - args.fsbno = lbno; + args.fsbno = sbno; args.type = XFS_ALLOCTYPE_START_BNO; } else args.type = XFS_ALLOCTYPE_NEAR_BNO; @@ -1503,15 +581,16 @@ xfs_bmbt_split( args.minlen = args.maxlen = args.prod = 1; args.wasdel = cur->bc_private.b.flags & XFS_BTCUR_BPRV_WASDEL; if (!args.wasdel && xfs_trans_get_block_res(args.tp) == 0) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); + 
XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); return XFS_ERROR(ENOSPC); } - if ((error = xfs_alloc_vextent(&args))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); + error = xfs_alloc_vextent(&args); + if (error) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); return error; } if (args.fsbno == NULLFSBLOCK) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); *stat = 0; return 0; } @@ -1522,602 +601,383 @@ xfs_bmbt_split( xfs_trans_log_inode(args.tp, cur->bc_private.b.ip, XFS_ILOG_CORE); XFS_TRANS_MOD_DQUOT_BYINO(args.mp, args.tp, cur->bc_private.b.ip, XFS_TRANS_DQ_BCOUNT, 1L); - rbp = xfs_btree_get_bufl(args.mp, args.tp, args.fsbno, 0); - right = XFS_BUF_TO_BMBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, left, level, rbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - right->bb_magic = cpu_to_be32(XFS_BMAP_MAGIC); - right->bb_level = left->bb_level; - right->bb_numrecs = cpu_to_be16(be16_to_cpu(left->bb_numrecs) / 2); - if ((be16_to_cpu(left->bb_numrecs) & 1) && - cur->bc_ptrs[level] <= be16_to_cpu(right->bb_numrecs) + 1) - be16_add(&right->bb_numrecs, 1); - i = be16_to_cpu(left->bb_numrecs) - be16_to_cpu(right->bb_numrecs) + 1; - if (level > 0) { - lkp = XFS_BMAP_KEY_IADDR(left, i, cur); - lpp = XFS_BMAP_PTR_IADDR(left, i, cur); - rkp = XFS_BMAP_KEY_IADDR(right, 1, cur); - rpp = XFS_BMAP_PTR_IADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(right->bb_numrecs); i++) { - if ((error = xfs_btree_check_lptr_disk(cur, lpp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } -#endif - memcpy(rkp, lkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memcpy(rpp, lpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); - xfs_bmbt_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - xfs_bmbt_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - *startoff = be64_to_cpu(rkp->br_startoff); - } else { - lrp = XFS_BMAP_REC_IADDR(left, i, cur); - rrp = XFS_BMAP_REC_IADDR(right, 1, 
cur); - memcpy(rrp, lrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - xfs_bmbt_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - *startoff = xfs_bmbt_disk_get_startoff(rrp); - } - be16_add(&left->bb_numrecs, -(be16_to_cpu(right->bb_numrecs))); - right->bb_rightsib = left->bb_rightsib; - left->bb_rightsib = cpu_to_be64(args.fsbno); - right->bb_leftsib = cpu_to_be64(lbno); - xfs_bmbt_log_block(cur, rbp, XFS_BB_ALL_BITS); - xfs_bmbt_log_block(cur, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); - if (be64_to_cpu(right->bb_rightsib) != NULLDFSBNO) { - if ((error = xfs_btree_read_bufl(args.mp, args.tp, - be64_to_cpu(right->bb_rightsib), 0, &rrbp, - XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - rrblock = XFS_BUF_TO_BMBT_BLOCK(rrbp); - if ((error = xfs_btree_check_lblock(cur, rrblock, level, rrbp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - rrblock->bb_leftsib = cpu_to_be64(args.fsbno); - xfs_bmbt_log_block(cur, rrbp, XFS_BB_LEFTSIB); - } - if (cur->bc_ptrs[level] > be16_to_cpu(left->bb_numrecs) + 1) { - xfs_btree_setbuf(cur, level, rbp); - cur->bc_ptrs[level] -= be16_to_cpu(left->bb_numrecs); - } - if (level + 1 < cur->bc_nlevels) { - if ((error = xfs_btree_dup_cursor(cur, curp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - (*curp)->bc_ptrs[level + 1]++; - } - *bnop = args.fsbno; - XFS_BMBT_TRACE_CURSOR(cur, EXIT); + + new->u.bmbt = cpu_to_be64(args.fsbno); *stat = 1; return 0; } - -/* - * Update keys for the record. 
- */ STATIC int -xfs_bmbt_updkey( - xfs_btree_cur_t *cur, - xfs_bmbt_key_t *keyp, /* on-disk format */ - int level) +xfs_bmbt_free_block( + xfs_btree_cur_t *cur, + xfs_buf_t *bp, + int size) { - xfs_bmbt_block_t *block; - xfs_buf_t *bp; -#ifdef DEBUG - int error; -#endif - xfs_bmbt_key_t *kp; - int ptr; + xfs_mount_t *mp = cur->bc_mp; + xfs_inode_t *ip = cur->bc_private.b.ip; - ASSERT(level >= 1); - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGIK(cur, level, keyp); - for (ptr = 1; ptr == 1 && level < cur->bc_nlevels; level++) { - block = xfs_bmbt_get_block(cur, level, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - ptr = cur->bc_ptrs[level]; - kp = XFS_BMAP_KEY_IADDR(block, ptr, cur); - *kp = *keyp; - xfs_bmbt_log_keys(cur, bp, ptr, ptr); - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); + xfs_bmap_add_free(XFS_DADDR_TO_FSB(mp, XFS_BUF_ADDR(bp)), 1, + cur->bc_private.b.flist, mp); + ip->i_d.di_nblocks--; + xfs_trans_log_inode(cur->bc_tp, ip, XFS_ILOG_CORE); + XFS_TRANS_MOD_DQUOT_BYINO(mp, cur->bc_tp, ip, XFS_TRANS_DQ_BCOUNT, -1L); + xfs_trans_binval(cur->bc_tp, bp); return 0; } + /* - * Convert on-disk form of btree root to in-memory form. + * Log fields from the btree block header. 
*/ void -xfs_bmdr_to_bmbt( - xfs_bmdr_block_t *dblock, - int dblocklen, - xfs_bmbt_block_t *rblock, - int rblocklen) -{ - int dmxr; - xfs_bmbt_key_t *fkp; - __be64 *fpp; - xfs_bmbt_key_t *tkp; - __be64 *tpp; - - rblock->bb_magic = cpu_to_be32(XFS_BMAP_MAGIC); - rblock->bb_level = dblock->bb_level; - ASSERT(be16_to_cpu(rblock->bb_level) > 0); - rblock->bb_numrecs = dblock->bb_numrecs; - rblock->bb_leftsib = cpu_to_be64(NULLDFSBNO); - rblock->bb_rightsib = cpu_to_be64(NULLDFSBNO); - dmxr = (int)XFS_BTREE_BLOCK_MAXRECS(dblocklen, xfs_bmdr, 0); - fkp = XFS_BTREE_KEY_ADDR(xfs_bmdr, dblock, 1); - tkp = XFS_BMAP_BROOT_KEY_ADDR(rblock, 1, rblocklen); - fpp = XFS_BTREE_PTR_ADDR(xfs_bmdr, dblock, 1, dmxr); - tpp = XFS_BMAP_BROOT_PTR_ADDR(rblock, 1, rblocklen); - dmxr = be16_to_cpu(dblock->bb_numrecs); - memcpy(tkp, fkp, sizeof(*fkp) * dmxr); - memcpy(tpp, fpp, sizeof(*fpp) * dmxr); -} +xfs_bmbt_log_block( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_buf_t *bp, /* buffer containing btree block */ + int fields) /* mask of fields: XFS_BB_... */ +{ + int first; /* first byte offset logged */ + int last; /* last byte offset logged */ + static const short offsets[] = { /* table of offsets */ + offsetof(xfs_bmbt_block_t, bb_magic), + offsetof(xfs_bmbt_block_t, bb_level), + offsetof(xfs_bmbt_block_t, bb_numrecs), + offsetof(xfs_bmbt_block_t, bb_leftsib), + offsetof(xfs_bmbt_block_t, bb_rightsib), + sizeof(xfs_bmbt_block_t) + }; -/* - * Decrement cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. 
- */ -int /* error */ -xfs_bmbt_decrement( - xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ -{ - xfs_bmbt_block_t *block; - xfs_buf_t *bp; - int error; /* error return value */ - xfs_fsblock_t fsbno; - int lev; - xfs_mount_t *mp; - xfs_trans_t *tp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - ASSERT(level < cur->bc_nlevels); - if (level < cur->bc_nlevels - 1) - xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); - if (--cur->bc_ptrs[level] > 0) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - block = xfs_bmbt_get_block(cur, level, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (be64_to_cpu(block->bb_leftsib) == NULLDFSBNO) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - if (--cur->bc_ptrs[lev] > 0) - break; - if (lev < cur->bc_nlevels - 1) - xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); - } - if (lev == cur->bc_nlevels) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - tp = cur->bc_tp; - mp = cur->bc_mp; - for (block = xfs_bmbt_get_block(cur, lev, &bp); lev > level; ) { - fsbno = be64_to_cpu(*XFS_BMAP_PTR_IADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufl(mp, tp, fsbno, 0, &bp, - XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_BMBT_BLOCK(bp); - if ((error = xfs_btree_check_lblock(cur, block, lev, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - cur->bc_ptrs[lev] = be16_to_cpu(block->bb_numrecs); - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBI(cur, bp, fields); + if (bp) { + xfs_btree_offsets(fields, offsets, XFS_BB_NUM_BITS, &first, + &last); + xfs_trans_log_buf(cur->bc_tp, 
bp, first, last); + } else + xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, + XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); } -/* - * Delete the record pointed to by cur. - */ -int /* error */ -xfs_bmbt_delete( +static const struct xfs_btree_block_ops xfs_bmbt_blkops = { + .get_buf = xfs_bmbt_get_buf, + .read_buf = xfs_bmbt_read_buf, + .get_block = xfs_bmbt_get_block, + .buf_to_block = xfs_bmbt_buf_to_block, + .buf_to_ptr = xfs_bmbt_buf_to_ptr, + .log_block = xfs_bmbt_log_block, + .check_block = xfs_btree_check_lblock, + + .alloc_block = xfs_bmbt_alloc_block, + .free_block = xfs_bmbt_free_block, + + .get_sibling = xfs_btree_get_lsibling, + .set_sibling = xfs_btree_set_lsibling, + .init_sibling = xfs_btree_init_sibling, +}; + +STATIC int +xfs_bmbt_get_iminrecs( xfs_btree_cur_t *cur, - int *stat) /* success/failure */ + int lev) { - int error; /* error return value */ - int i; - int level; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - for (level = 0, i = 2; i == 2; level++) { - if ((error = xfs_bmbt_delrec(cur, level, &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - } - if (i == 0) { - for (level = 1; level < cur->bc_nlevels; level++) { - if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_bmbt_decrement(cur, level, - &i))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - break; - } - } - } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = i; - return 0; + return XFS_BMAP_BLOCK_IMINRECS(lev, cur); } -/* - * Convert a compressed bmap extent record to an uncompressed form. - * This code must be in sync with the routines xfs_bmbt_get_startoff, - * xfs_bmbt_get_startblock, xfs_bmbt_get_blockcount and xfs_bmbt_get_state. 
- */ - -STATIC_INLINE void -__xfs_bmbt_get_all( - __uint64_t l0, - __uint64_t l1, - xfs_bmbt_irec_t *s) +STATIC int +xfs_bmbt_get_imaxrecs( + xfs_btree_cur_t *cur, + int lev) { - int ext_flag; - xfs_exntst_t st; - - ext_flag = (int)(l0 >> (64 - BMBT_EXNTFLAG_BITLEN)); - s->br_startoff = ((xfs_fileoff_t)l0 & - XFS_MASK64LO(64 - BMBT_EXNTFLAG_BITLEN)) >> 9; -#if XFS_BIG_BLKNOS - s->br_startblock = (((xfs_fsblock_t)l0 & XFS_MASK64LO(9)) << 43) | - (((xfs_fsblock_t)l1) >> 21); -#else -#ifdef DEBUG - { - xfs_dfsbno_t b; + return XFS_BMAP_BLOCK_IMAXRECS(lev, cur); +} - b = (((xfs_dfsbno_t)l0 & XFS_MASK64LO(9)) << 43) | - (((xfs_dfsbno_t)l1) >> 21); - ASSERT((b >> 32) == 0 || ISNULLDSTARTBLOCK(b)); - s->br_startblock = (xfs_fsblock_t)b; - } -#else /* !DEBUG */ - s->br_startblock = (xfs_fsblock_t)(((xfs_dfsbno_t)l1) >> 21); -#endif /* DEBUG */ -#endif /* XFS_BIG_BLKNOS */ - s->br_blockcount = (xfs_filblks_t)(l1 & XFS_MASK64LO(21)); - /* This is xfs_extent_state() in-line */ - if (ext_flag) { - ASSERT(s->br_blockcount != 0); /* saved for DMIG */ - st = XFS_EXT_UNWRITTEN; - } else - st = XFS_EXT_NORM; - s->br_state = st; +STATIC int +xfs_bmbt_get_dminrecs( + xfs_btree_cur_t *cur, + int lev) +{ + return XFS_BMAP_BLOCK_DMINRECS(lev, cur); } -void -xfs_bmbt_get_all( - xfs_bmbt_rec_host_t *r, - xfs_bmbt_irec_t *s) +STATIC int +xfs_bmbt_get_dmaxrecs( + xfs_btree_cur_t *cur, + int lev) { - __xfs_bmbt_get_all(r->l0, r->l1, s); + return XFS_BMAP_BLOCK_DMAXRECS(lev, cur); } -/* - * Get the block pointer for the given level of the cursor. - * Fill in the buffer pointer, if applicable. 
- */ -xfs_bmbt_block_t * -xfs_bmbt_get_block( +STATIC int +xfs_btree_get_numrecs( xfs_btree_cur_t *cur, - int level, - xfs_buf_t **bpp) + xfs_btree_block_t *block) { - xfs_ifork_t *ifp; - xfs_bmbt_block_t *rval; + return be16_to_cpu(block->bb_h.bb_numrecs); +} - if (level < cur->bc_nlevels - 1) { - *bpp = cur->bc_bufs[level]; - rval = XFS_BUF_TO_BMBT_BLOCK(*bpp); - } else { - *bpp = NULL; - ifp = XFS_IFORK_PTR(cur->bc_private.b.ip, - cur->bc_private.b.whichfork); - rval = ifp->if_broot; - } - return rval; +STATIC void +xfs_btree_set_numrecs( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + int numrecs) +{ + block->bb_h.bb_numrecs = cpu_to_be16(numrecs); } -/* - * Extract the blockcount field from an in memory bmap extent record. - */ -xfs_filblks_t -xfs_bmbt_get_blockcount( - xfs_bmbt_rec_host_t *r) +STATIC void +xfs_bmbt_init_key_from_rec( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key, + xfs_btree_rec_t *rec) { - return (xfs_filblks_t)(r->l1 & XFS_MASK64LO(21)); + key->u.bmbt.br_startoff = cpu_to_be64( + xfs_bmbt_disk_get_startoff(&rec->u.bmbt)); } /* - * Extract the startblock field from an in memory bmap extent record. + * intial value of ptr for lookup */ -xfs_fsblock_t -xfs_bmbt_get_startblock( - xfs_bmbt_rec_host_t *r) +STATIC void +xfs_bmbt_init_ptr_from_cur( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr) { -#if XFS_BIG_BLKNOS - return (((xfs_fsblock_t)r->l0 & XFS_MASK64LO(9)) << 43) | - (((xfs_fsblock_t)r->l1) >> 21); -#else -#ifdef DEBUG - xfs_dfsbno_t b; - - b = (((xfs_dfsbno_t)r->l0 & XFS_MASK64LO(9)) << 43) | - (((xfs_dfsbno_t)r->l1) >> 21); - ASSERT((b >> 32) == 0 || ISNULLDSTARTBLOCK(b)); - return (xfs_fsblock_t)b; -#else /* !DEBUG */ - return (xfs_fsblock_t)(((xfs_dfsbno_t)r->l1) >> 21); -#endif /* DEBUG */ -#endif /* XFS_BIG_BLKNOS */ + ptr->u.bmbt = 0; } -/* - * Extract the startoff field from an in memory bmap extent record. 
- */ -xfs_fileoff_t -xfs_bmbt_get_startoff( - xfs_bmbt_rec_host_t *r) +STATIC void +xfs_bmbt_init_rec_from_key( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key, + xfs_btree_rec_t *rec) { - return ((xfs_fileoff_t)r->l0 & - XFS_MASK64LO(64 - BMBT_EXNTFLAG_BITLEN)) >> 9; + BUG_ON(be64_to_cpu(key->u.bmbt.br_startoff) == 0); + xfs_bmbt_disk_set_allf(&rec->u.bmbt, + be64_to_cpu(key->u.bmbt.br_startoff), + 0, 0, XFS_EXT_NORM); } -xfs_exntst_t -xfs_bmbt_get_state( - xfs_bmbt_rec_host_t *r) +STATIC void +xfs_bmbt_init_rec_from_cur( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec) { - int ext_flag; + BUG_ON(cur->bc_rec.b.br_startoff == 0); + xfs_bmbt_disk_set_all(&rec->u.bmbt, &cur->bc_rec.b); +} - ext_flag = (int)((r->l0) >> (64 - BMBT_EXNTFLAG_BITLEN)); - return xfs_extent_state(xfs_bmbt_get_blockcount(r), - ext_flag); +STATIC xfs_btree_key_t * +xfs_bmbt_key_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) +{ + return (xfs_btree_key_t *)XFS_BMAP_KEY_IADDR(&block->bb_h, index, cur); } -/* Endian flipping versions of the bmbt extraction functions */ -void -xfs_bmbt_disk_get_all( - xfs_bmbt_rec_t *r, - xfs_bmbt_irec_t *s) +STATIC xfs_btree_ptr_t * +xfs_bmbt_ptr_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) { - __xfs_bmbt_get_all(be64_to_cpu(r->l0), be64_to_cpu(r->l1), s); + return (xfs_btree_ptr_t *)XFS_BMAP_PTR_IADDR(&block->bb_h, index, cur); } -/* - * Extract the blockcount field from an on disk bmap extent record. - */ -xfs_filblks_t -xfs_bmbt_disk_get_blockcount( - xfs_bmbt_rec_t *r) +STATIC xfs_btree_rec_t * +xfs_bmbt_rec_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) { - return (xfs_filblks_t)(be64_to_cpu(r->l1) & XFS_MASK64LO(21)); + return (xfs_btree_rec_t *)XFS_BMAP_REC_IADDR(&block->bb_h, index, cur); } -/* - * Extract the startoff field from a disk format bmap extent record. 
- */ -xfs_fileoff_t -xfs_bmbt_disk_get_startoff( - xfs_bmbt_rec_t *r) +STATIC int64_t +xfs_bmbt_key_diff( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key) { - return ((xfs_fileoff_t)be64_to_cpu(r->l0) & - XFS_MASK64LO(64 - BMBT_EXNTFLAG_BITLEN)) >> 9; + return (int64_t)(be64_to_cpu(key->u.bmbt.br_startoff) - + cur->bc_rec.b.br_startoff); } -/* - * Increment cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. - */ -int /* error */ -xfs_bmbt_increment( +STATIC xfs_daddr_t +xfs_bmbt_ptr_to_daddr( xfs_btree_cur_t *cur, - int level, - int *stat) /* success/failure */ + xfs_btree_ptr_t *ptr) { - xfs_bmbt_block_t *block; - xfs_buf_t *bp; - int error; /* error return value */ - xfs_fsblock_t fsbno; - int lev; - xfs_mount_t *mp; - xfs_trans_t *tp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGI(cur, level); - ASSERT(level < cur->bc_nlevels); - if (level < cur->bc_nlevels - 1) - xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); - block = xfs_bmbt_get_block(cur, level, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, level, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (++cur->bc_ptrs[level] <= be16_to_cpu(block->bb_numrecs)) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; - } - if (be64_to_cpu(block->bb_rightsib) == NULLDFSBNO) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; - } - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - block = xfs_bmbt_get_block(cur, lev, &bp); -#ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, lev, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - if (++cur->bc_ptrs[lev] <= be16_to_cpu(block->bb_numrecs)) - break; - if (lev < cur->bc_nlevels - 1) - xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); + return XFS_FSB_TO_DADDR(cur->bc_mp, be64_to_cpu(ptr->u.bmbt)); +} + +STATIC void +xfs_bmbt_move_keys( + xfs_btree_cur_t *cur, + xfs_btree_key_t *src_key, + 
xfs_btree_key_t *dst_key, + int from, + int to, + int numkeys) +{ + if (dst_key == NULL) { + /* moving within a block */ + xfs_bmbt_key_t *kp = &src_key->u.bmbt; + memmove(&kp[to], &kp[from], numkeys * sizeof(*kp)); + } else { + /* moving between blocks */ + memcpy(dst_key, src_key, numkeys * sizeof(xfs_bmbt_key_t)); } - if (lev == cur->bc_nlevels) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; +} + +STATIC void +xfs_bmbt_move_ptrs( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *src_ptr, + xfs_btree_ptr_t *dst_ptr, + int from, + int to, + int numptrs) +{ + if (dst_ptr == NULL) { + /* moving within a block */ + xfs_bmbt_ptr_t *pp = &src_ptr->u.bmbt; + memmove(&pp[to], &pp[from], numptrs * sizeof(*pp)); + } else { + /* moving between blocks */ + memcpy(dst_ptr, src_ptr, numptrs * sizeof(xfs_bmbt_ptr_t)); } - tp = cur->bc_tp; - mp = cur->bc_mp; - for (block = xfs_bmbt_get_block(cur, lev, &bp); lev > level; ) { - fsbno = be64_to_cpu(*XFS_BMAP_PTR_IADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufl(mp, tp, fsbno, 0, &bp, - XFS_BMAP_BTREE_REF))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_BMBT_BLOCK(bp); - if ((error = xfs_btree_check_lblock(cur, block, lev, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - cur->bc_ptrs[lev] = 1; +} + +STATIC void +xfs_bmbt_move_recs( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *src_rec, + xfs_btree_rec_t *dst_rec, + int from, + int to, + int numrecs) +{ + if (dst_rec == NULL) { + /* moving within a block */ + xfs_bmbt_rec_t *rp = &src_rec->u.bmbt; + memmove(&rp[to], &rp[from], numrecs * sizeof(*rp)); + } else { + /* moving between blocks */ + memcpy(dst_rec, src_rec, numrecs * sizeof(xfs_bmbt_rec_t)); } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 1; - return 0; +} + + +STATIC void +xfs_bmbt_set_key( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key_addr, + int index, + xfs_btree_key_t *newkey) +{ + 
xfs_bmbt_key_t *kp = &key_addr->u.bmbt; + + kp[index] = newkey->u.bmbt; +} + +STATIC void +xfs_bmbt_set_ptr( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr_addr, + int index, + xfs_btree_ptr_t *newptr) +{ + xfs_bmbt_ptr_t *pp = &ptr_addr->u.bmbt; + + pp[index] = newptr->u.bmbt; +} + +STATIC void +xfs_bmbt_set_rec( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec_addr, + int index, + xfs_btree_rec_t *newrec) +{ + xfs_bmbt_rec_t *rp = &rec_addr->u.bmbt; + + rp[index] = newrec->u.bmbt; } /* - * Insert the current record at the point referenced by cur. + * Log keys from a btree block (nonleaf). */ -int /* error */ -xfs_bmbt_insert( +STATIC void +xfs_bmbt_log_keys( xfs_btree_cur_t *cur, - int *stat) /* success/failure */ + xfs_buf_t *bp, + int kfirst, + int klast) { - int error; /* error return value */ - int i; - int level; - xfs_fsblock_t nbno; - xfs_btree_cur_t *ncur; - xfs_bmbt_rec_t nrec; - xfs_btree_cur_t *pcur; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - level = 0; - nbno = NULLFSBLOCK; - xfs_bmbt_disk_set_all(&nrec, &cur->bc_rec.b); - ncur = NULL; - pcur = cur; - do { - if ((error = xfs_bmbt_insrec(pcur, level++, &nbno, &nrec, &ncur, - &i))) { - if (pcur != cur) - xfs_btree_del_cursor(pcur, XFS_BTREE_ERROR); - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if (pcur != cur && (ncur || nbno == NULLFSBLOCK)) { - cur->bc_nlevels = pcur->bc_nlevels; - cur->bc_private.b.allocated += - pcur->bc_private.b.allocated; - pcur->bc_private.b.allocated = 0; - ASSERT((cur->bc_private.b.firstblock != NULLFSBLOCK) || - XFS_IS_REALTIME_INODE(cur->bc_private.b.ip)); - cur->bc_private.b.firstblock = - pcur->bc_private.b.firstblock; - ASSERT(cur->bc_private.b.flist == - pcur->bc_private.b.flist); - xfs_btree_del_cursor(pcur, XFS_BTREE_NOERROR); - } - if (ncur) { - pcur = ncur; - ncur = NULL; - } - } while (nbno != NULLFSBLOCK); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = i; - return 0; -error0: - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - 
return error; + xfs_trans_t *tp; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, kfirst, klast); + tp = cur->bc_tp; + if (bp) { + xfs_bmbt_block_t *block; + int first; + xfs_bmbt_key_t *kp; + int last; + + block = XFS_BUF_TO_BMBT_BLOCK(bp); + kp = XFS_BMAP_KEY_DADDR(block, 1, cur); + first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); + last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); + xfs_trans_log_buf(tp, bp, first, last); + } else { + xfs_inode_t *ip; + + ip = cur->bc_private.b.ip; + xfs_trans_log_inode(tp, ip, + XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); + } + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); } /* - * Log fields from the btree block header. + * Log block pointer fields from a btree block (nonleaf). */ -void -xfs_bmbt_log_block( - xfs_btree_cur_t *cur, - xfs_buf_t *bp, - int fields) +STATIC void +xfs_bmbt_log_ptrs( + xfs_btree_cur_t *cur, + xfs_buf_t *bp, + int pfirst, + int plast) { - int first; - int last; - xfs_trans_t *tp; - static const short offsets[] = { - offsetof(xfs_bmbt_block_t, bb_magic), - offsetof(xfs_bmbt_block_t, bb_level), - offsetof(xfs_bmbt_block_t, bb_numrecs), - offsetof(xfs_bmbt_block_t, bb_leftsib), - offsetof(xfs_bmbt_block_t, bb_rightsib), - sizeof(xfs_bmbt_block_t) - }; + xfs_trans_t *tp; - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGBI(cur, bp, fields); + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, pfirst, plast); tp = cur->bc_tp; if (bp) { - xfs_btree_offsets(fields, offsets, XFS_BB_NUM_BITS, &first, - &last); + xfs_bmbt_block_t *block; + int first; + int last; + xfs_bmbt_ptr_t *pp; + + block = XFS_BUF_TO_BMBT_BLOCK(bp); + pp = XFS_BMAP_PTR_DADDR(block, 1, cur); + first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); + last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); xfs_trans_log_buf(tp, bp, first, last); - } else - xfs_trans_log_inode(tp, cur->bc_private.b.ip, + } else { + xfs_inode_t *ip; + + 
ip = cur->bc_private.b.ip; + xfs_trans_log_inode(tp, ip, XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); + } + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); } /* - * Log record values from the btree block. + * Log records from a btree block (leaf). */ void xfs_bmbt_log_recs( @@ -2130,445 +990,432 @@ xfs_bmbt_log_recs( int first; int last; xfs_bmbt_rec_t *rp; - xfs_trans_t *tp; - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGBII(cur, bp, rfirst, rlast); + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, rfirst, rlast); ASSERT(bp); - tp = cur->bc_tp; block = XFS_BUF_TO_BMBT_BLOCK(bp); rp = XFS_BMAP_REC_DADDR(block, 1, cur); first = (int)((xfs_caddr_t)&rp[rfirst - 1] - (xfs_caddr_t)block); last = (int)(((xfs_caddr_t)&rp[rlast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(tp, bp, first, last); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); } -int /* error */ -xfs_bmbt_lookup_eq( - xfs_btree_cur_t *cur, - xfs_fileoff_t off, - xfs_fsblock_t bno, - xfs_filblks_t len, - int *stat) /* success/failure */ -{ - cur->bc_rec.b.br_startoff = off; - cur->bc_rec.b.br_startblock = bno; - cur->bc_rec.b.br_blockcount = len; - return xfs_bmbt_lookup(cur, XFS_LOOKUP_EQ, stat); -} +static const struct xfs_btree_record_ops xfs_bmbt_recops = { + .get_minrecs = xfs_bmbt_get_iminrecs, + .get_maxrecs = xfs_bmbt_get_imaxrecs, + .get_dminrecs = xfs_bmbt_get_dminrecs, + .get_dmaxrecs = xfs_bmbt_get_dmaxrecs, + .get_numrecs = xfs_btree_get_numrecs, + .set_numrecs = xfs_btree_set_numrecs, + + .init_key_from_rec = xfs_bmbt_init_key_from_rec, + .init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur, + .init_rec_from_key = xfs_bmbt_init_rec_from_key, + .init_rec_from_cur = xfs_bmbt_init_rec_from_cur, + + .key_addr = xfs_bmbt_key_addr, + .ptr_addr = xfs_bmbt_ptr_addr, + .rec_addr = xfs_bmbt_rec_addr, + + .key_diff = xfs_bmbt_key_diff, + .ptr_to_daddr = 
xfs_bmbt_ptr_to_daddr, + + .move_keys = xfs_bmbt_move_keys, + .move_ptrs = xfs_bmbt_move_ptrs, + .move_recs = xfs_bmbt_move_recs, + + .set_key = xfs_bmbt_set_key, + .set_ptr = xfs_bmbt_set_ptr, + .set_rec = xfs_bmbt_set_rec, + + .log_keys = xfs_bmbt_log_keys, + .log_ptrs = xfs_bmbt_log_ptrs, + .log_recs = xfs_bmbt_log_recs, -int /* error */ -xfs_bmbt_lookup_ge( - xfs_btree_cur_t *cur, - xfs_fileoff_t off, - xfs_fsblock_t bno, - xfs_filblks_t len, - int *stat) /* success/failure */ -{ - cur->bc_rec.b.br_startoff = off; - cur->bc_rec.b.br_startblock = bno; - cur->bc_rec.b.br_blockcount = len; - return xfs_bmbt_lookup(cur, XFS_LOOKUP_GE, stat); -} + .check_ptrs = xfs_btree_check_lptr, +}; -/* - * Give the bmap btree a new root block. Copy the old broot contents - * down into a real block and make the broot point to it. - */ -int /* error */ -xfs_bmbt_newroot( +STATIC int /* error */ +xfs_bmbt_new_root( xfs_btree_cur_t *cur, /* btree cursor */ - int *logflags, /* logging flags for inode */ int *stat) /* return status - 0 fail */ { - xfs_alloc_arg_t args; /* allocation arguments */ - xfs_bmbt_block_t *block; /* bmap btree block */ - xfs_buf_t *bp; /* buffer for block */ - xfs_bmbt_block_t *cblock; /* child btree block */ - xfs_bmbt_key_t *ckp; /* child key pointer */ - xfs_bmbt_ptr_t *cpp; /* child ptr pointer */ - int error; /* error return code */ -#ifdef DEBUG - int i; /* loop counter */ -#endif - xfs_bmbt_key_t *kp; /* pointer to bmap btree key */ - int level; /* btree level */ - xfs_bmbt_ptr_t *pp; /* pointer to bmap block addr */ + int logflags = 0; + int error; + + error = xfs_bmbt_newroot(cur, &logflags, stat); + if (!(error || *stat == 0)) + xfs_trans_log_inode(cur->bc_tp, cur->bc_private.b.ip, logflags); + return error; +} + +STATIC int +xfs_bmbt_killroot( + xfs_btree_cur_t *cur, + int lev, /* unused */ + xfs_btree_ptr_t *newroot) /* unused */ +{ + xfs_btree_block_t *block; + xfs_btree_block_t *cblock; + xfs_buf_t *cbp; + xfs_btree_key_t *ckp; + 
xfs_btree_ptr_t *cpp; + int i; + xfs_btree_key_t *kp; + xfs_inode_t *ip; + xfs_ifork_t *ifp; + int level; + xfs_btree_ptr_t *pp; + + ASSERT(newroot == NULL); + ASSERT(lev == -1); - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); level = cur->bc_nlevels - 1; - block = xfs_bmbt_get_block(cur, level, &bp); + ASSERT(level >= 1); /* - * Copy the root into a real block. + * Don't deal with the root block needs to be a leaf case. + * We're just going to turn the thing back into extents anyway. */ - args.mp = cur->bc_mp; - pp = XFS_BMAP_PTR_IADDR(block, 1, cur); - args.tp = cur->bc_tp; - args.fsbno = cur->bc_private.b.firstblock; - args.mod = args.minleft = args.alignment = args.total = args.isfl = - args.userdata = args.minalignslop = 0; - args.minlen = args.maxlen = args.prod = 1; - args.wasdel = cur->bc_private.b.flags & XFS_BTCUR_BPRV_WASDEL; - args.firstblock = args.fsbno; - if (args.fsbno == NULLFSBLOCK) { -#ifdef DEBUG - if ((error = xfs_btree_check_lptr_disk(cur, *pp, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - args.fsbno = be64_to_cpu(*pp); - args.type = XFS_ALLOCTYPE_START_BNO; - } else - args.type = XFS_ALLOCTYPE_NEAR_BNO; - if ((error = xfs_alloc_vextent(&args))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } - if (args.fsbno == NULLFSBLOCK) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *stat = 0; - return 0; + if (level == 1) + goto out0; + + block = xfs_bmbt_get_block(cur, level, &cbp); + /* + * Give up if the root has multiple children. + */ + if (be16_to_cpu(block->bb_h.bb_numrecs) != 1) + goto out0; + /* + * Only do this if the next level will fit. + * Then the data must be copied up to the inode, + * instead of freeing the root you free the next level. 
+ */ + cbp = cur->bc_bufs[level - 1]; + cblock = xfs_bmbt_buf_to_block(cur, cbp); + if (be16_to_cpu(cblock->bb_h.bb_numrecs) > xfs_bmbt_get_dmaxrecs(cur, level)) + goto out0; + + ASSERT(be64_to_cpu(cblock->bb_h.bb_leftsib) == NULLDFSBNO); + ASSERT(be64_to_cpu(cblock->bb_h.bb_rightsib) == NULLDFSBNO); + ip = cur->bc_private.b.ip; + ifp = XFS_IFORK_PTR(ip, cur->bc_private.b.whichfork); + ASSERT(xfs_bmbt_get_imaxrecs(cur, level) == + XFS_BMAP_BROOT_MAXRECS(ifp->if_broot_bytes)); + i = (int)(be16_to_cpu(cblock->bb_h.bb_numrecs) - xfs_bmbt_get_imaxrecs(cur, level)); + if (i) { + xfs_iroot_realloc(ip, i, cur->bc_private.b.whichfork); + block = (xfs_btree_block_t *)ifp->if_broot; } - ASSERT(args.len == 1); - cur->bc_private.b.firstblock = args.fsbno; - cur->bc_private.b.allocated++; - cur->bc_private.b.ip->i_d.di_nblocks++; - XFS_TRANS_MOD_DQUOT_BYINO(args.mp, args.tp, cur->bc_private.b.ip, - XFS_TRANS_DQ_BCOUNT, 1L); - bp = xfs_btree_get_bufl(args.mp, cur->bc_tp, args.fsbno, 0); - cblock = XFS_BUF_TO_BMBT_BLOCK(bp); - *cblock = *block; - be16_add(&block->bb_level, 1); - block->bb_numrecs = cpu_to_be16(1); - cur->bc_nlevels++; - cur->bc_ptrs[level + 1] = 1; - kp = XFS_BMAP_KEY_IADDR(block, 1, cur); - ckp = XFS_BMAP_KEY_IADDR(cblock, 1, cur); - memcpy(ckp, kp, be16_to_cpu(cblock->bb_numrecs) * sizeof(*kp)); - cpp = XFS_BMAP_PTR_IADDR(cblock, 1, cur); -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(cblock->bb_numrecs); i++) { - if ((error = xfs_btree_check_lptr_disk(cur, pp[i], level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); + be16_add(&block->bb_h.bb_numrecs, i); + ASSERT(block->bb_h.bb_numrecs == cblock->bb_h.bb_numrecs); + kp = xfs_bmbt_key_addr(cur, 1, block); + ckp = xfs_bmbt_key_addr(cur, 1, cblock); + memcpy(kp, ckp, be16_to_cpu(block->bb_h.bb_numrecs) * sizeof(xfs_bmbt_key_t)); + pp = xfs_bmbt_ptr_addr(cur, 1, block); + cpp = xfs_bmbt_ptr_addr(cur, 1, cblock); +#ifdef DEBUG + for (i = 0; i < be16_to_cpu(cblock->bb_h.bb_numrecs); i++) { + int error; + error = 
xfs_btree_check_lptr_disk(cur, cpp, i, level - 1); + if (error) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); return error; } } #endif - memcpy(cpp, pp, be16_to_cpu(cblock->bb_numrecs) * sizeof(*pp)); -#ifdef DEBUG - if ((error = xfs_btree_check_lptr(cur, args.fsbno, level))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } -#endif - *pp = cpu_to_be64(args.fsbno); - xfs_iroot_realloc(cur->bc_private.b.ip, 1 - be16_to_cpu(cblock->bb_numrecs), - cur->bc_private.b.whichfork); - xfs_btree_setbuf(cur, level, bp); - /* - * Do all this logging at the end so that - * the root is at the right level. - */ - xfs_bmbt_log_block(cur, bp, XFS_BB_ALL_BITS); - xfs_bmbt_log_keys(cur, bp, 1, be16_to_cpu(cblock->bb_numrecs)); - xfs_bmbt_log_ptrs(cur, bp, 1, be16_to_cpu(cblock->bb_numrecs)); - XFS_BMBT_TRACE_CURSOR(cur, EXIT); - *logflags |= - XFS_ILOG_CORE | XFS_ILOG_FBROOT(cur->bc_private.b.whichfork); - *stat = 1; + memcpy(pp, cpp, be16_to_cpu(block->bb_h.bb_numrecs) * sizeof(xfs_bmbt_ptr_t)); + + xfs_bmbt_free_block(cur, cbp, 1); + cur->bc_bufs[level - 1] = NULL; + be16_add(&block->bb_h.bb_level, -1); + xfs_trans_log_inode(cur->bc_tp, ip, + XFS_ILOG_CORE | XFS_ILOG_FBROOT(cur->bc_private.b.whichfork)); + cur->bc_nlevels--; +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); return 0; } -/* - * Set all the fields in a bmap extent record from the arguments. - */ -void -xfs_bmbt_set_allf( - xfs_bmbt_rec_host_t *r, - xfs_fileoff_t startoff, - xfs_fsblock_t startblock, - xfs_filblks_t blockcount, - xfs_exntst_t state) +STATIC int +xfs_bmbt_realloc_root( + xfs_btree_cur_t *cur, + int index) { - int extent_flag = (state == XFS_EXT_NORM) ? 
0 : 1; - - ASSERT(state == XFS_EXT_NORM || state == XFS_EXT_UNWRITTEN); - ASSERT((startoff & XFS_MASK64HI(64-BMBT_STARTOFF_BITLEN)) == 0); - ASSERT((blockcount & XFS_MASK64HI(64-BMBT_BLOCKCOUNT_BITLEN)) == 0); + xfs_inode_t *ip = cur->bc_private.b.ip; + xfs_iroot_realloc(ip, index, cur->bc_private.b.whichfork); + return 0; +} -#if XFS_BIG_BLKNOS - ASSERT((startblock & XFS_MASK64HI(64-BMBT_STARTBLOCK_BITLEN)) == 0); +STATIC void +xfs_bmbt_update_cursor( + xfs_btree_cur_t *src, + xfs_btree_cur_t *dst) +{ + ASSERT((dst->bc_private.b.firstblock != NULLFSBLOCK) || + (dst->bc_private.b.ip->i_d.di_flags & XFS_DIFLAG_REALTIME)); + ASSERT(dst->bc_private.b.flist == src->bc_private.b.flist); - r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) | - ((xfs_bmbt_rec_base_t)startoff << 9) | - ((xfs_bmbt_rec_base_t)startblock >> 43); - r->l1 = ((xfs_bmbt_rec_base_t)startblock << 21) | - ((xfs_bmbt_rec_base_t)blockcount & - (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); -#else /* !XFS_BIG_BLKNOS */ - if (ISNULLSTARTBLOCK(startblock)) { - r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) | - ((xfs_bmbt_rec_base_t)startoff << 9) | - (xfs_bmbt_rec_base_t)XFS_MASK64LO(9); - r->l1 = XFS_MASK64HI(11) | - ((xfs_bmbt_rec_base_t)startblock << 21) | - ((xfs_bmbt_rec_base_t)blockcount & - (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); - } else { - r->l0 = ((xfs_bmbt_rec_base_t)extent_flag << 63) | - ((xfs_bmbt_rec_base_t)startoff << 9); - r->l1 = ((xfs_bmbt_rec_base_t)startblock << 21) | - ((xfs_bmbt_rec_base_t)blockcount & - (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); - } -#endif /* XFS_BIG_BLKNOS */ + dst->bc_private.b.allocated += src->bc_private.b.allocated; + src->bc_private.b.allocated = 0; + dst->bc_private.b.firstblock = src->bc_private.b.firstblock; } +static const struct xfs_btree_cur_ops xfs_bmbt_curops = { + .new_root = xfs_bmbt_new_root, + .realloc_root = xfs_bmbt_realloc_root, + .kill_root = xfs_bmbt_killroot, + .update_cursor =xfs_bmbt_update_cursor, +}; + +#if defined(XFS_BTREE_TRACE) + /* - * Set 
all the fields in a bmap extent record from the uncompressed form. + * Global bmbt trace buffer */ -void -xfs_bmbt_set_all( - xfs_bmbt_rec_host_t *r, - xfs_bmbt_irec_t *s) +ktrace_t *xfs_bmbt_trace_buf; +/* + * Add a trace buffer entry for the arguments given to the routine, + * generic form. + */ +STATIC void +xfs_bmbt_trace_enter( + const char *func, + xfs_btree_cur_t *cur, + char *s, + int type, + int line, + __psunsigned_t a0, + __psunsigned_t a1, + __psunsigned_t a2, + __psunsigned_t a3, + __psunsigned_t a4, + __psunsigned_t a5, + __psunsigned_t a6, + __psunsigned_t a7, + __psunsigned_t a8, + __psunsigned_t a9, + __psunsigned_t a10) { - xfs_bmbt_set_allf(r, s->br_startoff, s->br_startblock, - s->br_blockcount, s->br_state); + xfs_inode_t *ip; + int whichfork; + + ip = cur->bc_private.b.ip; + whichfork = cur->bc_private.b.whichfork; + ktrace_enter(xfs_bmbt_trace_buf, + (void *)((__psint_t)type | (whichfork << 8) | (line << 16)), + (void *)func, (void *)s, (void *)ip, (void *)cur, + (void *)a0, (void *)a1, (void *)a2, (void *)a3, + (void *)a4, (void *)a5, (void *)a6, (void *)a7, + (void *)a8, (void *)a9, (void *)a10); + ASSERT(ip->i_btrace); + ktrace_enter(ip->i_btrace, + (void *)((__psint_t)type | (whichfork << 8) | (line << 16)), + (void *)func, (void *)s, (void *)ip, (void *)cur, + (void *)a0, (void *)a1, (void *)a2, (void *)a3, + (void *)a4, (void *)a5, (void *)a6, (void *)a7, + (void *)a8, (void *)a9, (void *)a10); } - -/* - * Set all the fields in a disk format bmap extent record from the arguments. - */ -void -xfs_bmbt_disk_set_allf( - xfs_bmbt_rec_t *r, - xfs_fileoff_t startoff, - xfs_fsblock_t startblock, - xfs_filblks_t blockcount, - xfs_exntst_t state) +STATIC void +xfs_bmbt_trace_cursor( + xfs_btree_cur_t *cur, + __uint32_t *s0, + __uint64_t *l0, + __uint64_t *l1) { - int extent_flag = (state == XFS_EXT_NORM) ? 
0 : 1; - - ASSERT(state == XFS_EXT_NORM || state == XFS_EXT_UNWRITTEN); - ASSERT((startoff & XFS_MASK64HI(64-BMBT_STARTOFF_BITLEN)) == 0); - ASSERT((blockcount & XFS_MASK64HI(64-BMBT_BLOCKCOUNT_BITLEN)) == 0); + xfs_bmbt_rec_host_t r; -#if XFS_BIG_BLKNOS - ASSERT((startblock & XFS_MASK64HI(64-BMBT_STARTBLOCK_BITLEN)) == 0); + xfs_bmbt_set_all(&r, &cur->bc_rec.b); - r->l0 = cpu_to_be64( - ((xfs_bmbt_rec_base_t)extent_flag << 63) | - ((xfs_bmbt_rec_base_t)startoff << 9) | - ((xfs_bmbt_rec_base_t)startblock >> 43)); - r->l1 = cpu_to_be64( - ((xfs_bmbt_rec_base_t)startblock << 21) | - ((xfs_bmbt_rec_base_t)blockcount & - (xfs_bmbt_rec_base_t)XFS_MASK64LO(21))); -#else /* !XFS_BIG_BLKNOS */ - if (ISNULLSTARTBLOCK(startblock)) { - r->l0 = cpu_to_be64( - ((xfs_bmbt_rec_base_t)extent_flag << 63) | - ((xfs_bmbt_rec_base_t)startoff << 9) | - (xfs_bmbt_rec_base_t)XFS_MASK64LO(9)); - r->l1 = cpu_to_be64(XFS_MASK64HI(11) | - ((xfs_bmbt_rec_base_t)startblock << 21) | - ((xfs_bmbt_rec_base_t)blockcount & - (xfs_bmbt_rec_base_t)XFS_MASK64LO(21))); - } else { - r->l0 = cpu_to_be64( - ((xfs_bmbt_rec_base_t)extent_flag << 63) | - ((xfs_bmbt_rec_base_t)startoff << 9)); - r->l1 = cpu_to_be64( - ((xfs_bmbt_rec_base_t)startblock << 21) | - ((xfs_bmbt_rec_base_t)blockcount & - (xfs_bmbt_rec_base_t)XFS_MASK64LO(21))); - } -#endif /* XFS_BIG_BLKNOS */ + *s0 = (cur->bc_private.b.flags << 16) | cur->bc_private.b.allocated; + *l0 = r.l0; + *l1 = r.l1; } -/* - * Set all the fields in a bmap extent record from the uncompressed form. 
- */ -void -xfs_bmbt_disk_set_all( - xfs_bmbt_rec_t *r, - xfs_bmbt_irec_t *s) -{ - xfs_bmbt_disk_set_allf(r, s->br_startoff, s->br_startblock, - s->br_blockcount, s->br_state); -} +STATIC void +xfs_bmbt_trace_record( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec, + __uint64_t *l0, + __uint64_t *l1, + __uint64_t *l2) +{ + xfs_bmbt_irec_t s; + + xfs_bmbt_disk_get_all(&rec->u.bmbt, &s); + *l0 = s.br_startoff; + *l1 = s.br_startblock; + *l2 = s.br_blockcount; +} + +static const struct xfs_btree_trc_ops xfs_bmbt_trcops = { + .enter = xfs_bmbt_trace_enter, + .cursor = xfs_bmbt_trace_cursor, + .record = xfs_bmbt_trace_record, +}; +#endif -/* - * Set the blockcount field in a bmap extent record. - */ void -xfs_bmbt_set_blockcount( - xfs_bmbt_rec_host_t *r, - xfs_filblks_t v) +xfs_bmbt_init_cursor( + xfs_btree_cur_t *cur) { - ASSERT((v & XFS_MASK64HI(43)) == 0); - r->l1 = (r->l1 & (xfs_bmbt_rec_base_t)XFS_MASK64HI(43)) | - (xfs_bmbt_rec_base_t)(v & XFS_MASK64LO(21)); + cur->bc_flags = XFS_BTREE_ROOT_IN_INODE; + cur->bc_curops = &xfs_bmbt_curops; + cur->bc_blkops = &xfs_bmbt_blkops; + cur->bc_recops = &xfs_bmbt_recops; +#if defined(XFS_BTREE_TRACE) + cur->bc_trcops = &xfs_bmbt_trcops; +#endif } /* - * Set the startblock field in a bmap extent record. + * BMBT functions that are not covered by core btree code. + * Externally visible routines. 
*/ -void -xfs_bmbt_set_startblock( - xfs_bmbt_rec_host_t *r, - xfs_fsblock_t v) -{ -#if XFS_BIG_BLKNOS - ASSERT((v & XFS_MASK64HI(12)) == 0); - r->l0 = (r->l0 & (xfs_bmbt_rec_base_t)XFS_MASK64HI(55)) | - (xfs_bmbt_rec_base_t)(v >> 43); - r->l1 = (r->l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)) | - (xfs_bmbt_rec_base_t)(v << 21); -#else /* !XFS_BIG_BLKNOS */ - if (ISNULLSTARTBLOCK(v)) { - r->l0 |= (xfs_bmbt_rec_base_t)XFS_MASK64LO(9); - r->l1 = (xfs_bmbt_rec_base_t)XFS_MASK64HI(11) | - ((xfs_bmbt_rec_base_t)v << 21) | - (r->l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); - } else { - r->l0 &= ~(xfs_bmbt_rec_base_t)XFS_MASK64LO(9); - r->l1 = ((xfs_bmbt_rec_base_t)v << 21) | - (r->l1 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(21)); - } -#endif /* XFS_BIG_BLKNOS */ -} /* - * Set the startoff field in a bmap extent record. + * Update the record referred to by cur to the value given + * by [off, bno, len, state]. + * This either works (return 0) or gets an EFSCORRUPTED error. */ -void -xfs_bmbt_set_startoff( - xfs_bmbt_rec_host_t *r, - xfs_fileoff_t v) +int +xfs_bmbt_update( + xfs_btree_cur_t *cur, + xfs_fileoff_t off, + xfs_fsblock_t bno, + xfs_filblks_t len, + xfs_exntst_t state) { - ASSERT((v & XFS_MASK64HI(9)) == 0); - r->l0 = (r->l0 & (xfs_bmbt_rec_base_t) XFS_MASK64HI(1)) | - ((xfs_bmbt_rec_base_t)v << 9) | - (r->l0 & (xfs_bmbt_rec_base_t)XFS_MASK64LO(9)); + xfs_btree_rec_t rec; + + xfs_bmbt_disk_set_allf(&rec.u.bmbt, off, bno, len, state); + return xfs_btree_update(cur, &rec); } /* - * Set the extent state field in a bmap extent record. + * Lookup the record equal to [off, bno, len] in the btree given by cur. 
*/ -void -xfs_bmbt_set_state( - xfs_bmbt_rec_host_t *r, - xfs_exntst_t v) +int /* error */ +xfs_bmbt_lookup_eq( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_fileoff_t off, + xfs_fsblock_t bno, + xfs_filblks_t len, + int *stat) /* success/failure */ { - ASSERT(v == XFS_EXT_NORM || v == XFS_EXT_UNWRITTEN); - if (v == XFS_EXT_NORM) - r->l0 &= XFS_MASK64LO(64 - BMBT_EXNTFLAG_BITLEN); - else - r->l0 |= XFS_MASK64HI(BMBT_EXNTFLAG_BITLEN); + cur->bc_rec.b.br_startoff = off; + cur->bc_rec.b.br_startblock = bno; + cur->bc_rec.b.br_blockcount = len; + return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); } /* - * Convert in-memory form of btree root to on-disk form. + * Lookup the first record greater than or equal to [off, bno, len] + * in the btree given by cur. */ -void -xfs_bmbt_to_bmdr( - xfs_bmbt_block_t *rblock, - int rblocklen, - xfs_bmdr_block_t *dblock, - int dblocklen) +int /* error */ +xfs_bmbt_lookup_ge( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_fileoff_t off, + xfs_fsblock_t bno, + xfs_filblks_t len, + int *stat) /* success/failure */ { - int dmxr; - xfs_bmbt_key_t *fkp; - __be64 *fpp; - xfs_bmbt_key_t *tkp; - __be64 *tpp; - - ASSERT(be32_to_cpu(rblock->bb_magic) == XFS_BMAP_MAGIC); - ASSERT(be64_to_cpu(rblock->bb_leftsib) == NULLDFSBNO); - ASSERT(be64_to_cpu(rblock->bb_rightsib) == NULLDFSBNO); - ASSERT(be16_to_cpu(rblock->bb_level) > 0); - dblock->bb_level = rblock->bb_level; - dblock->bb_numrecs = rblock->bb_numrecs; - dmxr = (int)XFS_BTREE_BLOCK_MAXRECS(dblocklen, xfs_bmdr, 0); - fkp = XFS_BMAP_BROOT_KEY_ADDR(rblock, 1, rblocklen); - tkp = XFS_BTREE_KEY_ADDR(xfs_bmdr, dblock, 1); - fpp = XFS_BMAP_BROOT_PTR_ADDR(rblock, 1, rblocklen); - tpp = XFS_BTREE_PTR_ADDR(xfs_bmdr, dblock, 1, dmxr); - dmxr = be16_to_cpu(dblock->bb_numrecs); - memcpy(tkp, fkp, sizeof(*fkp) * dmxr); - memcpy(tpp, fpp, sizeof(*fpp) * dmxr); + cur->bc_rec.b.br_startoff = off; + cur->bc_rec.b.br_startblock = bno; + cur->bc_rec.b.br_blockcount = len; + return xfs_btree_lookup(cur, 
XFS_LOOKUP_GE, stat); } /* - * Update the record to the passed values. + * Give the bmap btree a new root block. Copy the old broot contents + * down into a real block and make the broot point to it. */ -int -xfs_bmbt_update( - xfs_btree_cur_t *cur, - xfs_fileoff_t off, - xfs_fsblock_t bno, - xfs_filblks_t len, - xfs_exntst_t state) +int /* error */ +xfs_bmbt_newroot( + xfs_btree_cur_t *cur, /* btree cursor */ + int *logflags, /* logging flags for inode */ + int *stat) /* return status - 0 fail */ { - xfs_bmbt_block_t *block; - xfs_buf_t *bp; - int error; - xfs_bmbt_key_t key; - int ptr; - xfs_bmbt_rec_t *rp; - - XFS_BMBT_TRACE_CURSOR(cur, ENTRY); - XFS_BMBT_TRACE_ARGFFFI(cur, (xfs_dfiloff_t)off, (xfs_dfsbno_t)bno, - (xfs_dfilblks_t)len, (int)state); - block = xfs_bmbt_get_block(cur, 0, &bp); + xfs_btree_block_t *block; /* bmap btree block */ + xfs_buf_t *bp; /* buffer for block */ + xfs_btree_block_t *cblock; /* child btree block */ + xfs_btree_key_t *ckp; /* child key pointer */ + xfs_btree_ptr_t *cpp; /* child ptr pointer */ + int error; /* error return code */ #ifdef DEBUG - if ((error = xfs_btree_check_lblock(cur, block, 0, bp))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; - } + int i; /* loop counter */ #endif - ptr = cur->bc_ptrs[0]; - rp = XFS_BMAP_REC_IADDR(block, ptr, cur); - xfs_bmbt_disk_set_allf(rp, off, bno, len, state); - xfs_bmbt_log_recs(cur, bp, ptr, ptr); - if (ptr > 1) { - XFS_BMBT_TRACE_CURSOR(cur, EXIT); + xfs_btree_key_t *kp; /* pointer to bmap btree key */ + int level; /* btree level */ + xfs_btree_ptr_t *pp; /* pointer to bmap block addr */ + xfs_btree_ptr_t nptr; /* pointer to bmap block addr */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + level = cur->bc_nlevels - 1; + block = xfs_bmbt_get_block(cur, level, &bp); + pp = xfs_bmbt_ptr_addr(cur, 1, block); + + /* + * Allocate the new block. + * If we can't do it, we're toast. Give up. 
+ */ + error = xfs_bmbt_alloc_block(cur, pp, &nptr, 1, stat); + if (error) + goto error0; + if (*stat == 0) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); return 0; } - key.br_startoff = cpu_to_be64(off); - if ((error = xfs_bmbt_updkey(cur, &key, 1))) { - XFS_BMBT_TRACE_CURSOR(cur, ERROR); - return error; + /* + * Copy the root into a real block. + */ + error = xfs_bmbt_get_buf(cur, &nptr, 0, &bp); + if (error) + goto error0; + cblock = xfs_bmbt_buf_to_block(cur, bp); + *cblock = *block; + be16_add(&block->bb_h.bb_level, 1); + block->bb_h.bb_numrecs = cpu_to_be16(1); + cur->bc_nlevels++; + cur->bc_ptrs[level + 1] = 1; + kp = xfs_bmbt_key_addr(cur, 1, block); + ckp = xfs_bmbt_key_addr(cur, 1, cblock); + memcpy(ckp, kp, be16_to_cpu(cblock->bb_h.bb_numrecs) * sizeof(xfs_bmbt_key_t)); + cpp = xfs_bmbt_ptr_addr(cur, 1, cblock); +#ifdef DEBUG + for (i = 0; i < be16_to_cpu(cblock->bb_h.bb_numrecs); i++) { + error = xfs_btree_check_lptr_disk(cur, pp[i], level); + if (error) + goto error0; } - XFS_BMBT_TRACE_CURSOR(cur, EXIT); +#endif + memcpy(cpp, pp, be16_to_cpu(cblock->bb_h.bb_numrecs) * sizeof(xfs_bmbt_ptr_t)); +#ifdef DEBUG + error = xfs_btree_check_lptr(cur, nptr.u.bmbt, level); + if (error) + goto error0; +#endif + memcpy(pp, &nptr, sizeof(xfs_bmbt_ptr_t)); + xfs_bmbt_realloc_root(cur, 1 - be16_to_cpu(cblock->bb_h.bb_numrecs)); + xfs_btree_setbuf(cur, level, bp); + /* + * Do all this logging at the end so that + * the root is at the right level. + */ + xfs_bmbt_log_block(cur, bp, XFS_BB_ALL_BITS); + xfs_bmbt_log_keys(cur, bp, 1, be16_to_cpu(cblock->bb_h.bb_numrecs)); + xfs_bmbt_log_ptrs(cur, bp, 1, be16_to_cpu(cblock->bb_h.bb_numrecs)); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *logflags |= + XFS_ILOG_CORE | XFS_ILOG_FBROOT(cur->bc_private.b.whichfork); + *stat = 1; return 0; +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; } -/* - * Check extent records, which have just been read, for - * any bit in the extent flag field. 
ASSERT on debug - * kernels, as this condition should not occur. - * Return an error condition (1) if any flags found, - * otherwise return 0. - */ - -int -xfs_check_nostate_extents( - xfs_ifork_t *ifp, - xfs_extnum_t idx, - xfs_extnum_t num) -{ - for (; num > 0; num--, idx++) { - xfs_bmbt_rec_host_t *ep = xfs_iext_get_ext(ifp, idx); - if ((ep->l0 >> - (64 - BMBT_EXNTFLAG_BITLEN)) != 0) { - ASSERT(0); - return 1; - } - } - return 0; -} Index: 2.6.x-xfs-new/fs/xfs/xfs_bmap_btree.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_bmap_btree.h 2007-08-02 22:13:10.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_bmap_btree.h 2007-11-06 19:40:29.718673016 +1100 @@ -254,11 +254,7 @@ extern ktrace_t *xfs_bmbt_trace_buf; * Prototypes for xfs_bmap.c to call. */ extern void xfs_bmdr_to_bmbt(xfs_bmdr_block_t *, int, xfs_bmbt_block_t *, int); -extern int xfs_bmbt_decrement(struct xfs_btree_cur *, int, int *); -extern int xfs_bmbt_delete(struct xfs_btree_cur *, int *); extern void xfs_bmbt_get_all(xfs_bmbt_rec_host_t *r, xfs_bmbt_irec_t *s); -extern xfs_bmbt_block_t *xfs_bmbt_get_block(struct xfs_btree_cur *cur, - int, struct xfs_buf **bpp); extern xfs_filblks_t xfs_bmbt_get_blockcount(xfs_bmbt_rec_host_t *r); extern xfs_fsblock_t xfs_bmbt_get_startblock(xfs_bmbt_rec_host_t *r); extern xfs_fileoff_t xfs_bmbt_get_startoff(xfs_bmbt_rec_host_t *r); @@ -268,8 +264,6 @@ extern void xfs_bmbt_disk_get_all(xfs_bm extern xfs_filblks_t xfs_bmbt_disk_get_blockcount(xfs_bmbt_rec_t *r); extern xfs_fileoff_t xfs_bmbt_disk_get_startoff(xfs_bmbt_rec_t *r); -extern int xfs_bmbt_increment(struct xfs_btree_cur *, int, int *); -extern int xfs_bmbt_insert(struct xfs_btree_cur *, int *); extern void xfs_bmbt_log_block(struct xfs_btree_cur *, struct xfs_buf *, int); extern void xfs_bmbt_log_recs(struct xfs_btree_cur *, struct xfs_buf *, int, int); @@ -299,6 +293,8 @@ extern void xfs_bmbt_disk_set_allf(xfs_b extern void xfs_bmbt_to_bmdr(xfs_bmbt_block_t 
*, int, xfs_bmdr_block_t *, int); extern int xfs_bmbt_update(struct xfs_btree_cur *, xfs_fileoff_t, xfs_fsblock_t, xfs_filblks_t, xfs_exntst_t); +extern void xfs_bmbt_init_cursor(struct xfs_btree_cur *cur); + #endif /* __KERNEL__ */ Index: 2.6.x-xfs-new/fs/xfs/xfs_btree.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_btree.c 2007-08-24 22:24:45.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_btree.c 2007-11-06 19:40:29.750668896 +1100 @@ -52,19 +52,7 @@ const __uint32_t xfs_magics[XFS_BTNUM_MA }; /* - * Prototypes for internal routines. - */ - -/* - * Checking routine: return maxrecs for the block. - */ -STATIC int /* number of records fitting in block */ -xfs_btree_maxrecs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_block_t *block);/* generic btree block pointer */ - -/* - * Internal routines. + * Internal prototypes */ /* @@ -75,7 +63,7 @@ STATIC xfs_btree_block_t * /* generic xfs_btree_get_block( xfs_btree_cur_t *cur, /* btree cursor */ int level, /* level in btree */ - struct xfs_buf **bpp); /* buffer containing the block */ + xfs_buf_t **bpp); /* buffer containing the block */ /* * Checking routine: return maxrecs for the block. @@ -177,65 +165,7 @@ xfs_btree_check_key( ASSERT(0); } } -#endif /* DEBUG */ - -/* - * Checking routine: check that long form block header is ok. 
- */ -/* ARGSUSED */ -int /* error (0 or EFSCORRUPTED) */ -xfs_btree_check_lblock( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_lblock_t *block, /* btree long form block pointer */ - int level, /* level of the btree block */ - xfs_buf_t *bp) /* buffer for block, if any */ -{ - int lblock_ok; /* block passes checks */ - xfs_mount_t *mp; /* file system mount point */ - - mp = cur->bc_mp; - lblock_ok = - be32_to_cpu(block->bb_magic) == xfs_magics[cur->bc_btnum] && - be16_to_cpu(block->bb_level) == level && - be16_to_cpu(block->bb_numrecs) <= - xfs_btree_maxrecs(cur, (xfs_btree_block_t *)block) && - block->bb_leftsib && - (be64_to_cpu(block->bb_leftsib) == NULLDFSBNO || - XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(block->bb_leftsib))) && - block->bb_rightsib && - (be64_to_cpu(block->bb_rightsib) == NULLDFSBNO || - XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(block->bb_rightsib))); - if (unlikely(XFS_TEST_ERROR(!lblock_ok, mp, XFS_ERRTAG_BTREE_CHECK_LBLOCK, - XFS_RANDOM_BTREE_CHECK_LBLOCK))) { - if (bp) - xfs_buftrace("LBTREE ERROR", bp); - XFS_ERROR_REPORT("xfs_btree_check_lblock", XFS_ERRLEVEL_LOW, - mp); - return XFS_ERROR(EFSCORRUPTED); - } - return 0; -} - -/* - * Checking routine: check that (long) pointer is ok. - */ -int /* error (0 or EFSCORRUPTED) */ -xfs_btree_check_lptr( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_dfsbno_t ptr, /* btree block disk address */ - int level) /* btree block level */ -{ - xfs_mount_t *mp; /* file system mount point */ - - mp = cur->bc_mp; - XFS_WANT_CORRUPTED_RETURN( - level > 0 && - ptr != NULLDFSBNO && - XFS_FSB_SANITY_CHECK(mp, ptr)); - return 0; -} -#ifdef DEBUG /* * Debug routine: check that records are in the right order. */ @@ -296,13 +226,73 @@ xfs_btree_check_rec( #endif /* DEBUG */ /* + * Checking routine: check that long form block header is ok. 
+ */ +/* ARGSUSED */ +int /* error (0 or EFSCORRUPTED) */ +xfs_btree_check_lblock( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_btree_block_t *block, /* btree long form block pointer */ + int level, /* level of the btree block */ + xfs_buf_t *bp) /* buffer for block, if any */ +{ + int lblock_ok; /* block passes checks */ + xfs_mount_t *mp; /* file system mount point */ + xfs_btree_lblock_t *lb; /* btree long form block pointer */ + + mp = cur->bc_mp; + lb = (xfs_btree_lblock_t *)block; + lblock_ok = + be32_to_cpu(lb->bb_magic) == xfs_magics[cur->bc_btnum] && + be16_to_cpu(lb->bb_level) == level && + be16_to_cpu(lb->bb_numrecs) <= xfs_btree_maxrecs(cur, block) && + lb->bb_leftsib && + (be64_to_cpu(lb->bb_leftsib) == NULLDFSBNO || + XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(lb->bb_leftsib))) && + lb->bb_rightsib && + (be64_to_cpu(lb->bb_rightsib) == NULLDFSBNO || + XFS_FSB_SANITY_CHECK(mp, be64_to_cpu(lb->bb_rightsib))); + if (unlikely(XFS_TEST_ERROR(!lblock_ok, mp, XFS_ERRTAG_BTREE_CHECK_LBLOCK, + XFS_RANDOM_BTREE_CHECK_LBLOCK))) { + if (bp) + xfs_buftrace("LBTREE ERROR", bp); + XFS_ERROR_REPORT("xfs_btree_check_lblock", XFS_ERRLEVEL_LOW, + mp); + return XFS_ERROR(EFSCORRUPTED); + } + return 0; +} + +/* + * Checking routine: check that (long) pointer is ok. + */ +int /* error (0 or EFSCORRUPTED) */ +xfs_btree_check_lptr( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_btree_ptr_t *ptr, /* btree block disk address */ + int index, /* offset from ptr */ + int level) /* btree block level */ +{ + xfs_mount_t *mp; /* file system mount point */ + xfs_fsblock_t bno; + + mp = cur->bc_mp; + bno = be64_to_cpu((&ptr->u.l)[index]); + XFS_WANT_CORRUPTED_RETURN(level > 0 && + bno != NULLDFSBNO && + XFS_FSB_SANITY_CHECK(mp, bno)); + return 0; +} + + +/* * Checking routine: check that block header is ok. 
*/ /* ARGSUSED */ int /* error (0 or EFSCORRUPTED) */ xfs_btree_check_sblock( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_sblock_t *block, /* btree short form block pointer */ + xfs_btree_block_t *block, /* btree short form block pointer */ int level, /* level of the btree block */ xfs_buf_t *bp) /* buffer containing block */ { @@ -310,21 +300,22 @@ xfs_btree_check_sblock( xfs_agf_t *agf; /* ag. freespace structure */ xfs_agblock_t agflen; /* native ag. freespace length */ int sblock_ok; /* block passes checks */ + xfs_btree_sblock_t *sb; /* btree short form block pointer */ agbp = cur->bc_private.a.agbp; agf = XFS_BUF_TO_AGF(agbp); agflen = be32_to_cpu(agf->agf_length); + sb = (xfs_btree_sblock_t *)block; sblock_ok = - be32_to_cpu(block->bb_magic) == xfs_magics[cur->bc_btnum] && - be16_to_cpu(block->bb_level) == level && - be16_to_cpu(block->bb_numrecs) <= - xfs_btree_maxrecs(cur, (xfs_btree_block_t *)block) && - (be32_to_cpu(block->bb_leftsib) == NULLAGBLOCK || - be32_to_cpu(block->bb_leftsib) < agflen) && - block->bb_leftsib && - (be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK || - be32_to_cpu(block->bb_rightsib) < agflen) && - block->bb_rightsib; + be32_to_cpu(sb->bb_magic) == xfs_magics[cur->bc_btnum] && + be16_to_cpu(sb->bb_level) == level && + be16_to_cpu(sb->bb_numrecs) <= xfs_btree_maxrecs(cur, block) && + (be32_to_cpu(sb->bb_leftsib) == NULLAGBLOCK || + be32_to_cpu(sb->bb_leftsib) < agflen) && + sb->bb_leftsib && + (be32_to_cpu(sb->bb_rightsib) == NULLAGBLOCK || + be32_to_cpu(sb->bb_rightsib) < agflen) && + sb->bb_rightsib; if (unlikely(XFS_TEST_ERROR(!sblock_ok, cur->bc_mp, XFS_ERRTAG_BTREE_CHECK_SBLOCK, XFS_RANDOM_BTREE_CHECK_SBLOCK))) { @@ -343,22 +334,105 @@ xfs_btree_check_sblock( int /* error (0 or EFSCORRUPTED) */ xfs_btree_check_sptr( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t ptr, /* btree block disk address */ + xfs_btree_ptr_t *ptr, /* btree block disk address */ + int index, /* offset from ptr to check */ int level) /* 
btree block level */ { xfs_buf_t *agbp; /* buffer for ag. freespace struct */ xfs_agf_t *agf; /* ag. freespace structure */ + xfs_agblock_t bno; agbp = cur->bc_private.a.agbp; agf = XFS_BUF_TO_AGF(agbp); - XFS_WANT_CORRUPTED_RETURN( - level > 0 && - ptr != NULLAGBLOCK && ptr != 0 && - ptr < be32_to_cpu(agf->agf_length)); + bno = be32_to_cpu((&ptr->u.s)[index]); + XFS_WANT_CORRUPTED_RETURN(level > 0 && bno != NULLAGBLOCK && + bno != 0 && bno < be32_to_cpu(agf->agf_length)); return 0; } /* + * Get/set/init sibling pointers + */ +void +xfs_btree_get_lsibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, + int lr) +{ + if (lr == XFS_BB_RIGHTSIB) { + ptr->u.l = block->bb_u.l.bb_rightsib; + } else { + ASSERT(lr == XFS_BB_LEFTSIB); + ptr->u.l = block->bb_u.l.bb_leftsib; + } + +} + +void +xfs_btree_set_lsibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, + int lr) +{ + if (lr == XFS_BB_RIGHTSIB) { + block->bb_u.l.bb_rightsib = ptr->u.l; + } else { + ASSERT(lr == XFS_BB_LEFTSIB); + block->bb_u.l.bb_leftsib = ptr->u.l; + } + +} + +void +xfs_btree_get_ssibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, + int lr) +{ + if (lr == XFS_BB_RIGHTSIB) { + ptr->u.s = block->bb_u.s.bb_rightsib; + } else { + ASSERT(lr == XFS_BB_LEFTSIB); + ptr->u.s = block->bb_u.s.bb_leftsib; + } + +} + +void +xfs_btree_set_ssibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, + int lr) +{ + if (lr == XFS_BB_RIGHTSIB) { + block->bb_u.s.bb_rightsib = ptr->u.s; + } else { + ASSERT(lr == XFS_BB_LEFTSIB); + block->bb_u.s.bb_leftsib = ptr->u.s; + } + +} + +/* set up block header and records for new block in split */ +void +xfs_btree_init_sibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *new, + xfs_btree_block_t *sib) /* sibling block next to new block */ +{ + /* + * Fill in the btree header for the new block.
+ */ + new->bb_h.bb_magic = cpu_to_be32(XFS_BMAP_MAGIC); + new->bb_h.bb_level = sib->bb_h.bb_level; + new->bb_h.bb_numrecs = 0; +} + +/* * Delete the btree cursor. */ void @@ -625,6 +699,7 @@ xfs_btree_init_cursor( */ cur->bc_private.a.agbp = agbp; cur->bc_private.a.agno = agno; + xfs_alloc_init_cursor(cur); break; case XFS_BTNUM_BMAP: /* @@ -637,6 +712,7 @@ xfs_btree_init_cursor( cur->bc_private.b.allocated = 0; cur->bc_private.b.flags = 0; cur->bc_private.b.whichfork = whichfork; + xfs_bmbt_init_cursor(cur); break; case XFS_BTNUM_INO: /* @@ -644,6 +720,7 @@ xfs_btree_init_cursor( */ cur->bc_private.i.agbp = agbp; cur->bc_private.i.agno = agno; + xfs_inobt_init_cursor(cur); break; default: ASSERT(0); @@ -848,60 +925,70 @@ xfs_btree_reada_bufs( * Read-ahead btree blocks, at the given level. * Bits in lr are set from XFS_BTCUR_{LEFT,RIGHT}RA. */ +STATIC int +xfs_btree_reada_cores( + xfs_btree_cur_t *cur, /* btree cursor */ + int lr, + xfs_agblock_t left, + xfs_agblock_t right) +{ + int rval = 0; + + if ((lr & XFS_BTCUR_LEFTRA) && (left != NULLAGBLOCK)) { + xfs_btree_reada_bufs(cur->bc_mp, + cur->bc_private.a.agno, left, 1); + rval++; + } + if ((lr & XFS_BTCUR_RIGHTRA) && (right != NULLAGBLOCK)) { + xfs_btree_reada_bufs(cur->bc_mp, + cur->bc_private.a.agno, right, 1); + rval++; + } + return rval; +} + +STATIC int +xfs_btree_reada_corel( + xfs_btree_cur_t *cur, /* btree cursor */ + int lr, + xfs_fsblock_t left, + xfs_fsblock_t right) +{ + int rval = 0; + + if ((lr & XFS_BTCUR_LEFTRA) && (left != NULLDFSBNO)) { + xfs_btree_reada_bufl(cur->bc_mp, left, 1); + rval++; + } + if ((lr & XFS_BTCUR_RIGHTRA) && (right != NULLDFSBNO)) { + xfs_btree_reada_bufl(cur->bc_mp, right, 1); + rval++; + } + return rval; +} + int xfs_btree_readahead_core( xfs_btree_cur_t *cur, /* btree cursor */ int lev, /* level in btree */ int lr) /* left/right bits */ { - xfs_alloc_block_t *a; - xfs_bmbt_block_t *b; - xfs_inobt_block_t *i; int rval = 0; ASSERT(cur->bc_bufs[lev] != NULL); cur->bc_ra[lev] 
|= lr; - switch (cur->bc_btnum) { - case XFS_BTNUM_BNO: - case XFS_BTNUM_CNT: - a = XFS_BUF_TO_ALLOC_BLOCK(cur->bc_bufs[lev]); - if ((lr & XFS_BTCUR_LEFTRA) && be32_to_cpu(a->bb_leftsib) != NULLAGBLOCK) { - xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno, - be32_to_cpu(a->bb_leftsib), 1); - rval++; - } - if ((lr & XFS_BTCUR_RIGHTRA) && be32_to_cpu(a->bb_rightsib) != NULLAGBLOCK) { - xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.a.agno, - be32_to_cpu(a->bb_rightsib), 1); - rval++; - } - break; - case XFS_BTNUM_BMAP: - b = XFS_BUF_TO_BMBT_BLOCK(cur->bc_bufs[lev]); - if ((lr & XFS_BTCUR_LEFTRA) && be64_to_cpu(b->bb_leftsib) != NULLDFSBNO) { - xfs_btree_reada_bufl(cur->bc_mp, be64_to_cpu(b->bb_leftsib), 1); - rval++; - } - if ((lr & XFS_BTCUR_RIGHTRA) && be64_to_cpu(b->bb_rightsib) != NULLDFSBNO) { - xfs_btree_reada_bufl(cur->bc_mp, be64_to_cpu(b->bb_rightsib), 1); - rval++; - } - break; - case XFS_BTNUM_INO: - i = XFS_BUF_TO_INOBT_BLOCK(cur->bc_bufs[lev]); - if ((lr & XFS_BTCUR_LEFTRA) && be32_to_cpu(i->bb_leftsib) != NULLAGBLOCK) { - xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.i.agno, - be32_to_cpu(i->bb_leftsib), 1); - rval++; - } - if ((lr & XFS_BTCUR_RIGHTRA) && be32_to_cpu(i->bb_rightsib) != NULLAGBLOCK) { - xfs_btree_reada_bufs(cur->bc_mp, cur->bc_private.i.agno, - be32_to_cpu(i->bb_rightsib), 1); - rval++; - } - break; - default: - ASSERT(0); + if (XFS_BTREE_LONG_PTRS(cur->bc_btnum)) { + xfs_btree_lblock_t *b; + b = XFS_BUF_TO_LBLOCK(cur->bc_bufs[lev]); + rval = xfs_btree_reada_corel(cur, lr, + be64_to_cpu(b->bb_leftsib), + be64_to_cpu(b->bb_rightsib)); + } else { + xfs_btree_sblock_t *b; + b = XFS_BUF_TO_SBLOCK(cur->bc_bufs[lev]); + rval = xfs_btree_reada_cores(cur, lr, + be32_to_cpu(b->bb_leftsib), + be32_to_cpu(b->bb_rightsib)); } return rval; } Index: 2.6.x-xfs-new/fs/xfs/xfs_btree.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_btree.h 2007-11-02 13:44:45.000000000 +1100 +++ 
2.6.x-xfs-new/fs/xfs/xfs_btree.h 2007-11-06 19:40:29.750668896 +1100 @@ -85,6 +85,43 @@ typedef struct xfs_btree_block { } xfs_btree_block_t; /* + * Generic block, key, ptr and record wrapper structures + * These are disk format structures, and are converted where + * necessary by the btree specific code that needs to interpret + * them. + */ +typedef struct xfs_btree_key { + union { + xfs_bmbt_key_t bmbt; + xfs_bmdr_key_t bmbr; /* bmbt root block */ + xfs_alloc_key_t alloc; + xfs_inobt_key_t inobt; + __be32 s; /* short form key */ + __be64 l; /* long form key */ + } u; +} xfs_btree_key_t; + +typedef struct xfs_btree_ptr { + union { + xfs_bmbt_ptr_t bmbt; + xfs_bmdr_ptr_t bmbr; /* bmbt root block */ + xfs_alloc_ptr_t alloc; + xfs_inobt_ptr_t inobt; + __be32 s; /* short form ptr */ + __be64 l; /* long form ptr */ + } u; +} xfs_btree_ptr_t; + +typedef struct xfs_btree_rec { + union { + xfs_bmbt_rec_t bmbt; + xfs_bmdr_rec_t bmbr; /* bmbt root block */ + xfs_alloc_rec_t alloc; + xfs_inobt_rec_t inobt; + } u; +} xfs_btree_rec_t; + +/* * For logging record fields. 
*/ #define XFS_BB_MAGIC 0x01 @@ -136,6 +173,183 @@ extern const __uint32_t xfs_magics[]; #define XFS_BTREE_MAXLEVELS 8 /* max of all btrees */ +typedef const struct xfs_btree_cur_ops { + int (*new_root)(struct xfs_btree_cur *cur, int *stat); + int (*realloc_root)(struct xfs_btree_cur *cur, int index); + int (*kill_root)(struct xfs_btree_cur *cur, int level, + xfs_btree_ptr_t *nptr); + void (*set_root)(struct xfs_btree_cur *cur, + xfs_btree_ptr_t *nptr, int level_change); + int (*update_lastrec)(struct xfs_btree_cur *cur, + xfs_btree_block_t *block); + void (*update_cursor)(struct xfs_btree_cur *src, + struct xfs_btree_cur *dst); +} xfs_btree_curops_t; + +typedef const struct xfs_btree_block_ops { + int (*get_buf)(struct xfs_btree_cur *cur, xfs_btree_ptr_t *ptr, + int flags, struct xfs_buf **bpp); + int (*read_buf)(struct xfs_btree_cur *cur, xfs_btree_ptr_t *ptr, + int flags, struct xfs_buf **bpp); + int (*check_block)(struct xfs_btree_cur *cur, + xfs_btree_block_t *block, + int level, struct xfs_buf *bp); + xfs_btree_block_t * + (*get_block)(struct xfs_btree_cur *cur, int lvl, + struct xfs_buf **bpp); + xfs_btree_block_t * + (*buf_to_block)(struct xfs_btree_cur *cur, struct xfs_buf *bp); + void (*buf_to_ptr)(struct xfs_btree_cur *cur, struct xfs_buf *bp, + xfs_btree_ptr_t *ptr); + void (*log_block)(struct xfs_btree_cur *cur, struct xfs_buf *bp, + int fields); + + int (*alloc_block)(struct xfs_btree_cur *cur, xfs_btree_ptr_t *sbno, + xfs_btree_ptr_t *nbno, int length, int *stat); + int (*free_block)(struct xfs_btree_cur *cur, struct xfs_buf *bp, + int length); + + void (*get_sibling)(struct xfs_btree_cur *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, int lr); + void (*set_sibling)(struct xfs_btree_cur *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, int lr); + void (*init_sibling)(struct xfs_btree_cur *cur, + xfs_btree_block_t *nsib, xfs_btree_block_t *sib); +} xfs_btree_blkops_t; + +typedef const struct xfs_btree_record_ops { + /* records in 
block/level */ + int (*get_minrecs)(struct xfs_btree_cur *cur, int level); + int (*get_maxrecs)(struct xfs_btree_cur *cur, int level); + int (*get_dminrecs)(struct xfs_btree_cur *cur, int level); + int (*get_dmaxrecs)(struct xfs_btree_cur *cur, int level); + int (*get_numrecs)(struct xfs_btree_cur *cur, + xfs_btree_block_t *block); + void (*set_numrecs)(struct xfs_btree_cur *cur, + xfs_btree_block_t *block, + int numrecs); + + /* init values of btree structures */ + void (*init_key_from_rec)(struct xfs_btree_cur *cur, + xfs_btree_key_t *key, xfs_btree_rec_t *rec); + void (*init_ptr_from_cur)(struct xfs_btree_cur *cur, + xfs_btree_ptr_t *ptr); + void (*init_rec_from_key)(struct xfs_btree_cur *cur, + xfs_btree_key_t *key, xfs_btree_rec_t *rec); + void (*init_rec_from_cur)(struct xfs_btree_cur *cur, + xfs_btree_rec_t *rec); + + /* return address of btree structures */ + xfs_btree_key_t * + (*key_addr)(struct xfs_btree_cur *cur, int index, + xfs_btree_block_t *block); + xfs_btree_ptr_t * + (*ptr_addr)(struct xfs_btree_cur *cur, int index, + xfs_btree_block_t *block); + xfs_btree_rec_t * + (*rec_addr)(struct xfs_btree_cur *cur, int index, + xfs_btree_block_t *block); + + /* difference between key value and cursor value */ + int64_t (*key_diff)(struct xfs_btree_cur *cur, xfs_btree_key_t *key); + + xfs_daddr_t + (*ptr_to_daddr)(struct xfs_btree_cur *cur, + xfs_btree_ptr_t *ptr); + + /* set values of btree structures */ + void (*set_key)(struct xfs_btree_cur *cur, + xfs_btree_key_t *key_addr, int index, + xfs_btree_key_t *newkey); + void (*set_ptr)(struct xfs_btree_cur *cur, + xfs_btree_ptr_t *ptr_addr, int index, + xfs_btree_ptr_t *newptr); + void (*set_rec)(struct xfs_btree_cur *cur, + xfs_btree_rec_t *rec_addr, int index, + xfs_btree_rec_t *newrec); + + /* move bits of btree blocks around */ + void (*move_keys)(struct xfs_btree_cur *cur, + xfs_btree_key_t *src_key, + xfs_btree_key_t *dst_key, int src_index, + int dst_index, int numkeys); + void (*move_ptrs)(struct 
xfs_btree_cur *cur, + xfs_btree_ptr_t *src_ptr, + xfs_btree_ptr_t *dst_ptr, int src_index, + int dst_index, int numptrs); + void (*move_recs)(struct xfs_btree_cur *cur, + xfs_btree_rec_t *src_rec, + xfs_btree_rec_t *dst_rec, int src_index, + int dst_index, int numrecs); + + /* log changes to btree structures */ + void (*log_keys)(struct xfs_btree_cur *cur, struct xfs_buf *bp, + int first, int last); + void (*log_ptrs)(struct xfs_btree_cur *cur, struct xfs_buf *bp, + int first, int last); + void (*log_recs)(struct xfs_btree_cur *cur, struct xfs_buf *bp, + int first, int last); + + /* paranoia */ + int (*check_ptrs)(struct xfs_btree_cur *cur, + xfs_btree_ptr_t *ptr, int index, int level); +} xfs_btree_recops_t; + +#ifdef XFS_BTREE_TRACE +typedef const struct xfs_btree_trc_ops { + void (*enter)(const char *func, xfs_btree_cur_t *cur, + char *s, int type, int line, + __psunsigned_t a0, __psunsigned_t a1, + __psunsigned_t a2, __psunsigned_t a3, + __psunsigned_t a4, __psunsigned_t a5, + __psunsigned_t a6, __psunsigned_t a7, + __psunsigned_t a8, __psunsigned_t a9, + __psunsigned_t a10); + void (*cursor)(xfs_btree_cur_t *cur, __uint32_t *s0, + __uint64_t *l0, __uint64_t *l1); + void (*record)(xfs_btree_cur_t *cur, xfs_btree_rec_t *rec, + __uint64_t *l0, __uint64_t *l1, + __uint64_t *l2); +} xfs_btree_trcops_t; + +#define XBT_ENTRY 1 +#define XBT_EXIT 2 +#define XBT_ERROR 3 +#define XBT_ARGS 4 + +/* + * Trace hooks. 
+ * i,j = integer (32 bit) + * b = btree block buffer (xfs_buf_t) + * p = btree ptr + * r = btree record + * k = btree key + */ +#define XFS_BTREE_TRACE_ARGI(c,i) \ + xfs_btree_trace_argi(__FUNCTION__, c, i, __LINE__) +#define XFS_BTREE_TRACE_ARGBI(c,b,i) \ + xfs_btree_trace_argbi(__FUNCTION__, c, b, i, __LINE__) +#define XFS_BTREE_TRACE_ARGBII(c,b,i,j) \ + xfs_btree_trace_argbii(__FUNCTION__, c, b, i, j, __LINE__) +#define XFS_BTREE_TRACE_ARGIPK(c,i,p,s) \ + xfs_btree_trace_argifk(__FUNCTION__, c, i, p, s, __LINE__) +#define XFS_BTREE_TRACE_ARGIPR(c,i,p,r) \ + xfs_btree_trace_argifr(__FUNCTION__, c, i, p, r, __LINE__) +#define XFS_BTREE_TRACE_ARGIK(c,i,k) \ + xfs_btree_trace_argik(__FUNCTION__, c, i, k, __LINE__) +#define XFS_BTREE_TRACE_CURSOR(c,s) \ + xfs_btree_trace_cursor(__FUNCTION__, c, s, __LINE__) +#else +#define XFS_BTREE_TRACE_ARGBI(c,b,i) +#define XFS_BTREE_TRACE_ARGBII(c,b,i,j) +#define XFS_BTREE_TRACE_ARGI(c,i) +#define XFS_BTREE_TRACE_ARGIPK(c,i,p,s) +#define XFS_BTREE_TRACE_ARGIPR(c,i,p,r) +#define XFS_BTREE_TRACE_ARGIK(c,i,k) +#define XFS_BTREE_TRACE_CURSOR(c,s) +#endif /* XFS_BTREE_TRACE */ /* * Btree cursor structure. * This collects all information needed by the btree code in one place. 
@@ -144,6 +358,13 @@ typedef struct xfs_btree_cur { struct xfs_trans *bc_tp; /* transaction we're in, if any */ struct xfs_mount *bc_mp; /* file system mount struct */ + xfs_btree_curops_t *bc_curops; + xfs_btree_blkops_t *bc_blkops; + xfs_btree_recops_t *bc_recops; +#ifdef XFS_BTREE_TRACE + xfs_btree_trcops_t *bc_trcops; +#endif + uint bc_flags; /* btree features - below */ union { xfs_alloc_rec_incore_t a; xfs_bmbt_irec_t b; @@ -179,6 +400,10 @@ typedef struct xfs_btree_cur } bc_private; /* per-btree type data */ } xfs_btree_cur_t; +/* cursor flags */ +#define XFS_BTREE_ROOT_IN_INODE (1<<0) /* root may be variable size */ +#define XFS_BTREE_LASTREC_UPDATE (1<<1) /* track last rec externally */ + #define XFS_BTREE_NOERROR 0 #define XFS_BTREE_ERROR 1 @@ -192,6 +417,17 @@ typedef struct xfs_btree_cur #ifdef __KERNEL__ +#define XFS_BTREE_TRACE_ARGBI(c,b,i) +#define XFS_BTREE_TRACE_ARGBII(c,b,i,j) +#define XFS_BTREE_TRACE_ARGFFF(c,o,b,i) +#define XFS_BTREE_TRACE_ARGFFFI(c,o,b,i,j) +#define XFS_BTREE_TRACE_ARGI(c,i) +#define XFS_BTREE_TRACE_ARGII(c,i,j) +#define XFS_BTREE_TRACE_ARGIFK(c,i,f,s) +#define XFS_BTREE_TRACE_ARGIFR(c,i,f,r) +#define XFS_BTREE_TRACE_ARGIK(c,i,k) +#define XFS_BTREE_TRACE_CURSOR(c,s) + #ifdef DEBUG /* * Debug routine: check that block header is ok. 
@@ -232,7 +468,7 @@ xfs_btree_check_rec( int /* error (0 or EFSCORRUPTED) */ xfs_btree_check_lblock( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_lblock_t *block, /* btree long form block pointer */ + xfs_btree_block_t *block, /* btree long form block pointer */ int level, /* level of the btree block */ struct xfs_buf *bp); /* buffer containing block, if any */ @@ -242,19 +478,17 @@ xfs_btree_check_lblock( int /* error (0 or EFSCORRUPTED) */ xfs_btree_check_lptr( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_dfsbno_t ptr, /* btree block disk address */ + xfs_btree_ptr_t *ptr, /* btree block ptr */ + int offset, /* offset from ptr to check */ int level); /* btree block level */ -#define xfs_btree_check_lptr_disk(cur, ptr, level) \ - xfs_btree_check_lptr(cur, be64_to_cpu(ptr), level) - /* * Checking routine: check that short form block header is ok. */ int /* error (0 or EFSCORRUPTED) */ xfs_btree_check_sblock( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_btree_sblock_t *block, /* btree short form block pointer */ + xfs_btree_block_t *block, /* btree short form block pointer */ int level, /* level of the btree block */ struct xfs_buf *bp); /* buffer containing block */ @@ -264,7 +498,8 @@ xfs_btree_check_sblock( int /* error (0 or EFSCORRUPTED) */ xfs_btree_check_sptr( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agblock_t ptr, /* btree block disk address */ + xfs_btree_ptr_t *ptr, /* btree block ptr */ + int offset, /* offset from ptr to check */ int level); /* btree block level */ /* @@ -423,12 +658,52 @@ xfs_btree_readahead( int lev, /* level in btree */ int lr) /* left/right bits */ { + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + (lev == cur->bc_nlevels - 1)) + return 0; + if ((cur->bc_ra[lev] | lr) == cur->bc_ra[lev]) return 0; return xfs_btree_readahead_core(cur, lev, lr); } +/* + * Block sibling operations. 
+ */ +void +xfs_btree_get_lsibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, + int lr); + +void +xfs_btree_set_lsibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, + int lr); + +void +xfs_btree_get_ssibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, + int lr); + +void +xfs_btree_set_ssibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + xfs_btree_ptr_t *ptr, + int lr); + +void +xfs_btree_init_sibling( + xfs_btree_cur_t *cur, + xfs_btree_block_t *nsib, + xfs_btree_block_t *sib); /* sibling block next to new block */ /* * Set the buffer for level "lev" in the cursor to bp, releasing @@ -440,6 +715,136 @@ xfs_btree_setbuf( int lev, /* level in btree */ struct xfs_buf *bp); /* new buffer to set */ +/* + * Core btree functions + */ + +/* + * Insert one record/level. Return information to the caller + * allowing the next level up to proceed if necessary. + */ +int xfs_btree_insrec( + xfs_btree_cur_t *cur, + int level, + xfs_btree_ptr_t *ptrp, + xfs_btree_rec_t *recp, + xfs_btree_cur_t **curp, + int *stat); /* no-go/done/continue */ + +/* + * Delete record pointed to by cur/level. + */ +int xfs_btree_delrec( + xfs_btree_cur_t *cur, + int level, + int *stat); /* success/failure */ + +/* + * Move 1 record right from cur/level if possible. + * Update cur to reflect the new path. + */ +int xfs_btree_rshift( + xfs_btree_cur_t *cur, + int level, + int *stat); /* success/failure */ + +/* + * Move 1 record left from cur/level if possible. + * Update cur to reflect the new path. + */ +int xfs_btree_lshift( + xfs_btree_cur_t *cur, + int level, + int *stat); /* success/failure */ + +/* + * Split cur/level block in half. + * Return new block number and the key to its + * first record (to be inserted into parent). 
+ */ +int /* error */ +xfs_btree_split( + xfs_btree_cur_t *cur, + int level, + xfs_btree_ptr_t *ptrp, + xfs_btree_key_t *key, + xfs_btree_cur_t **curp, + int *stat); /* success/failure */ + +/* + * Update keys for the record. + */ +int +xfs_btree_updkey( + xfs_btree_cur_t *cur, + xfs_btree_key_t *keyp, /* on-disk format */ + int level); + +/* + * Decrement cursor by one record at the level. + * For nonzero levels the leaf-ward information is untouched. + */ +int /* error */ +xfs_btree_decrement( + xfs_btree_cur_t *cur, + int level, + int *stat); /* success/failure */ + +/* + * Increment cursor by one record at the level. + * For nonzero levels the leaf-ward information is untouched. + */ +int /* error */ +xfs_btree_increment( + xfs_btree_cur_t *cur, + int level, + int *stat); /* success/failure */ + +/* + * Insert the record in cur at the point referenced by cur. + * The cursor may be inconsistent on return if splits have been done. + */ +int +xfs_btree_insert( + xfs_btree_cur_t *cur, + int *stat); + +/* + * Delete the record pointed to by cur. + */ +int /* error */ +xfs_btree_delete( + xfs_btree_cur_t *cur, + int *stat); /* success/failure */ + +/* + * Lookup the record. The cursor is made to point to it, based on dir. + * Return 0 if can't find any such record, 1 for success. + */ +int /* error */ +xfs_btree_lookup( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_lookup_t dir, /* <=, ==, or >= */ + int *stat); /* success/failure */ + +/* + * Allocate a new root block, fill it in. + */ +int /* error */ +xfs_btree_newroot( + xfs_btree_cur_t *cur, /* btree cursor */ + int *stat); /* success/failure */ + +/* + * Update the record referred to by cur to the value in the + * given record. This either works (return 0) or gets an + * EFSCORRUPTED error. 
+ */ +int +xfs_btree_update( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec); + #endif /* __KERNEL__ */ Index: 2.6.x-xfs-new/fs/xfs/xfs_btree_core.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ 2.6.x-xfs-new/fs/xfs/xfs_btree_core.c 2007-11-06 19:40:29.758667866 +1100 @@ -0,0 +1,2299 @@ +/* + * Copyright (c) 2007 Silicon Graphics, Inc. + * All Rights Reserved. + * + * Derived from existing XFS btree code by Dave Chinner. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_types.h" +#include "xfs_bit.h" +#include "xfs_log.h" +#include "xfs_inum.h" +#include "xfs_trans.h" +#include "xfs_sb.h" +#include "xfs_ag.h" +#include "xfs_dir2.h" +#include "xfs_dmapi.h" +#include "xfs_mount.h" +#include "xfs_bmap_btree.h" +#include "xfs_alloc_btree.h" +#include "xfs_ialloc_btree.h" +#include "xfs_dir2_sf.h" +#include "xfs_attr_sf.h" +#include "xfs_dinode.h" +#include "xfs_inode.h" +#include "xfs_btree.h" +#include "xfs_ialloc.h" +#include "xfs_error.h" + +/* + * ToDo: + * + * - trace infrastructure + * - fix 32bit-ness in xfs_btree_newroot + * - per-btree stats + * - fix check_sblock/sptr as they are alloc btree specific + */ + +/* + * Keys, ptrs and records are supposed to be passed around in host + * format in this code. 
type specific callouts need to do endian + * swapping as necessary. + */ + +/* + * Btree keys, ptrs and records are passed around in disk format + * and converted where needed by end functions. The values held in + * the cursor for anything is in host order. + */ + +/* + * Internal functions. + */ +STATIC int +xfs_btree_ptr_null( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr) +{ + switch(cur->bc_btnum) { + case XFS_BTNUM_BNO: + case XFS_BTNUM_CNT: + return be32_to_cpu(ptr->u.alloc) == NULLAGBLOCK; + break; + case XFS_BTNUM_INO: + return be32_to_cpu(ptr->u.inobt) == NULLAGBLOCK; + case XFS_BTNUM_BMAP: + return be64_to_cpu(ptr->u.bmbt) == NULLFSBLOCK; + default: + ASSERT(0); + break; + } + return 0; +} + +STATIC void +xfs_btree_set_ptr_null( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr) +{ + switch(cur->bc_btnum) { + case XFS_BTNUM_BNO: + case XFS_BTNUM_CNT: + ptr->u.alloc = cpu_to_be32(NULLAGBLOCK); + break; + case XFS_BTNUM_INO: + ptr->u.inobt = cpu_to_be32(NULLAGBLOCK); + break; + case XFS_BTNUM_BMAP: + ptr->u.bmbt = cpu_to_be64(NULLFSBLOCK); + break; + default: + ASSERT(0); + break; + } +} + +STATIC int +xfs_btree_dec_cursor( + xfs_btree_cur_t *cur, + int level, + int *stat) +{ + int i; + int error; + + if (level > 0) { + error = xfs_btree_decrement(cur, level, &i); + if (error) + return error; + } + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; +} + +/* + * Return true if ptr is the last record in the btree and + * we need to track updates to this record. 
+ */ +STATIC int +xfs_btree_is_lastrec( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + int level, + int last) +{ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + xfs_btree_ptr_t ptr; + int numrecs; + + numrecs = rops->get_numrecs(cur, block); + bops->get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB); + return ((cur->bc_flags & XFS_BTREE_LASTREC_UPDATE) && + level == 0 && + xfs_btree_ptr_null(cur, &ptr) && + last >= numrecs); + +} + +/* + * Move numrecs from the src block to the dst block. + * Log the changes to the destination block. + */ +STATIC int +xfs_btree_move_entries( + xfs_btree_cur_t *cur, + int level, + xfs_buf_t *sbp, /* source block */ + xfs_buf_t *dbp, /* destination block */ + int src_index, /* src block index */ + int dst_index, /* dst block index */ + int numrecs) /* number of records to move */ +{ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + xfs_btree_block_t *src; /* src btree block */ + xfs_btree_key_t *skp; /* src btree key */ + xfs_btree_ptr_t *spp; /* src address pointer */ + xfs_btree_rec_t *srp; /* src record pointer */ + xfs_btree_block_t *dst; /* dst btree block */ + xfs_btree_key_t *dkp; /* dst btree key */ + xfs_btree_ptr_t *dpp; /* dst address pointer */ + xfs_btree_rec_t *drp; /* dst record pointer */ + + src = bops->buf_to_block(cur, sbp); + dst = bops->buf_to_block(cur, dbp); + if (level > 0) { + /* + * It's a non-leaf. Move keys and pointers. 
+ */ + skp = rops->key_addr(cur, src_index, src); + spp = rops->ptr_addr(cur, src_index, src); + dkp = rops->key_addr(cur, dst_index, dst); + dpp = rops->ptr_addr(cur, dst_index, dst); +#ifdef DEBUG + for (i = ptr; i < numrecs; i++) { + error = bops->check_lptr_disk(cur, rpp, i, level); + if (error) + goto error0; + } +#endif + rops->move_keys(cur, skp, dkp, 0, 0, numrecs); + rops->move_ptrs(cur, spp, dpp, 0, 0, numrecs); + + rops->log_keys(cur, dbp, dst_index, numrecs); + rops->log_ptrs(cur, dbp, dst_index, numrecs); + } else { + /* + * It's a leaf. Move records. + */ + srp = rops->rec_addr(cur, src_index, src); + drp = rops->rec_addr(cur, dst_index, dst); + rops->move_recs(cur, srp, drp, 0, 0, numrecs); + rops->log_recs(cur, dbp, dst_index, numrecs); + } + rops->set_numrecs(cur, dst, rops->get_numrecs(cur, dst) + numrecs); + bops->log_block(cur, dbp, XFS_BB_NUMRECS); +#ifdef DEBUG + if (level > 0) + xfs_btree_check_key(cur->bc_btnum, lkp - 1, lkp); + else + xfs_btree_check_rec(cur->bc_btnum, lrp - 1, lrp); +#endif + return 0; +} +/* + * Excise the entries indicated by the start, end. + * Simply slide the entries past them down. + * Log the changed areas of the block. + */ +STATIC int +xfs_btree_remove_entry( + xfs_btree_cur_t *cur, + int level, + xfs_buf_t *bp, + xfs_btree_key_t *key, + int index) /* index to excise */ +{ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + xfs_btree_block_t *block; /* bmap btree block */ + xfs_btree_key_t *kp=NULL; /* pointer to bmap btree key */ + xfs_btree_ptr_t *pp; /* pointer to bmap block addr */ + xfs_btree_rec_t *rp; /* pointer to bmap btree rec */ + int numrecs; + + block = bops->buf_to_block(cur, bp); + numrecs = rops->get_numrecs(cur, block); + if (level > 0) { + /* + * It's a nonleaf. Excise the key and ptr being deleted, by + * sliding the entries past them down one. Log the changed + * areas of the block. 
+ */ + kp = rops->key_addr(cur, 1, block); + pp = rops->ptr_addr(cur, 1, block); +#ifdef DEBUG + for (i = index; i < numrecs; i++) { + error = cur->b_ops->check_lptr_disk(cur, pp, i, level); + if (error) + goto error0; + } +#endif + if (index < numrecs) { + rops->move_keys(cur, kp, NULL, index, index - 1, numrecs - index); + rops->move_ptrs(cur, pp, NULL, index, index - 1, numrecs - index); + rops->log_ptrs(cur, bp, index, numrecs - 1); + rops->log_keys(cur, bp, index, numrecs - 1); + } + } else { + /* + * It's a leaf. Excise the record being deleted, by sliding + * the entries past it down one. Log the changed areas of the + * block. + */ + rp = rops->rec_addr(cur, 1, block); + if (index < numrecs) { + rops->move_recs(cur, rp, NULL, index, index - 1, numrecs - index); + rops->log_recs(cur, bp, index, numrecs - 1); + } + /* + * If it's the first record in the block, we'll need a key + * structure to pass up to the next level (updkey). + */ + if (index == 1) + rops->init_key_from_rec(cur, key, rp); + } + numrecs--; + rops->set_numrecs(cur, block, numrecs); + bops->log_block(cur, bp, XFS_BB_NUMRECS); + return 0; +} + +/* + * Insert the entry indicated by the start index + * Simply slide the entries up one, insert the new entry and + * Log the changed areas of the block. + */ +STATIC int +xfs_btree_insert_entry( + xfs_btree_cur_t *cur, + int level, + xfs_buf_t *bp, + int index, /* index to insert at */ + xfs_btree_key_t *key, + xfs_btree_ptr_t *ptr, + xfs_btree_rec_t *rec) +{ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + xfs_btree_block_t *block; + xfs_btree_key_t *kp; + xfs_btree_ptr_t *pp; + xfs_btree_rec_t *rp; + int numrecs; + + block = bops->buf_to_block(cur, bp); + numrecs = rops->get_numrecs(cur, block); + if (level > 0) { + /* + * It's a non-leaf entry. Make a hole for the new data + * in the key and ptr regions of the block. 
+ */ + kp = rops->key_addr(cur, 1, block); + pp = rops->ptr_addr(cur, 1, block); +#ifdef DEBUG + for (i = numrecs; i >= index; i--) { + error = bops->check_lptr_disk(cur, pp, i - 1, level); + if (error) + goto error0; + } +#endif + rops->move_keys(cur, kp, NULL, index - 1, index, + numrecs - index + 1); + rops->move_ptrs(cur, pp, NULL, index - 1, index, + numrecs - index + 1); + /* + * Now stuff the new data in, bump numrecs and log the new data. + */ +#ifdef DEBUG + error = bops->check_lptr_disk(cur, ptr, 0, level); + if (error) + goto error0; +#endif + rops->set_key(cur, kp, index - 1, key); + rops->set_ptr(cur, pp, index - 1, ptr); + numrecs++; + rops->set_numrecs(cur, block, numrecs); + rops->log_ptrs(cur, bp, index, numrecs); + rops->log_keys(cur, bp, index, numrecs); + } else { + /* + * It's a leaf entry. Make a hole for the new record. + */ + rp = rops->rec_addr(cur, 1, block); + rops->move_recs(cur, rp, NULL, index - 1, index, + numrecs - index + 1); + /* + * Now stuff the new record in, bump numrecs + * and log the new data. + */ + rops->set_rec(cur, rp, index - 1, rec); + numrecs++; + rops->set_numrecs(cur, block, numrecs); + rops->log_recs(cur, bp, index, numrecs); + } + /* + * Log the new number of records in the btree header. + */ + bops->log_block(cur, bp, XFS_BB_NUMRECS); + +#ifdef DEBUG + /* + * Check that the key/record is in the right place, now. + */ + if (ptr < numrecs) { + if (level == 0) + xfs_btree_check_rec(cur->bc_btnum, rp + index - 1, + rp + index); + else + xfs_btree_check_key(cur->bc_btnum, kp + index - 1, + kp + index); + } +#endif + return 0; +} + +/* + * Single level of the btree record deletion routine. + * Delete record pointed to by cur/level. + * Remove the record from its block then rebalance the tree. + * Return 0 for error, 1 for done, 2 to go on to the next level. 
+ */ +int /* error */ +xfs_btree_delrec( + xfs_btree_cur_t *cur, /* btree cursor */ + int level, /* level removing record from */ + int *stat) /* fail/done/go-on */ +{ + xfs_btree_block_t *block; /* bmap btree block */ + xfs_btree_ptr_t cptr; /* current block ptr */ + xfs_buf_t *bp; /* buffer for block */ + int error; /* error return value */ + int i; /* loop counter */ + xfs_btree_key_t key; /* bmap btree key */ + xfs_btree_key_t *kp=NULL; /* pointer to bmap btree key */ + xfs_btree_ptr_t lptr; /* left sibling block ptr */ + xfs_buf_t *lbp; /* left buffer pointer */ + xfs_btree_block_t *left; /* left btree block */ + int lrecs=0; /* left record count */ + int ptr; /* key/record index */ + xfs_btree_ptr_t rptr; /* right sibling block ptr */ + xfs_buf_t *rbp; /* right buffer pointer */ + xfs_btree_block_t *right; /* right btree block */ + xfs_btree_block_t *rrblock; /* right-right btree block */ + xfs_buf_t *rrbp; /* right-right buffer pointer */ + int rrecs=0; /* right record count */ + xfs_btree_cur_t *tcur; /* temporary btree cursor */ + int numrecs; /* temporary numrec count */ + xfs_btree_curops_t *cops = cur->bc_curops; + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + tcur = NULL; + + /* + * Get the index of the entry being deleted, check for nothing there. + */ + ptr = cur->bc_ptrs[level]; + if (ptr == 0) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + } + + /* + * Get the buffer & block containing the record or key/ptr. + */ + block = bops->get_block(cur, level, &bp); + numrecs = rops->get_numrecs(cur, block); +#ifdef DEBUG + error = bops->check_block(cur, block, level, bp); + if (error) + goto error0; +#endif + /* + * Fail if we're off the end of the block. 
+ */ + if (ptr > numrecs) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + } + + XFS_STATS_INC(xs_bmbt_delrec); + + /* + * Excise the entries being deleted. + * Log the changed areas of the block. + */ + error = xfs_btree_remove_entry(cur, level, bp, &key, ptr); + if (error) + goto error0; + + /* + * If we are tracking the last record in the tree and + * we are at the far right edge of the tree, update it. + */ + numrecs = rops->get_numrecs(cur, block); + if (xfs_btree_is_lastrec(cur, block, level, ptr)) { + ASSERT(ptr == numrecs + 1); + error = cops->update_lastrec(cur, block); + if (error) + goto error0; + } + + /* + * We're at the root level. + * First, shrink the root block in-memory. + * Try to get rid of the next level down. + * If we can't then there's nothing left to do. + */ + if (level == cur->bc_nlevels - 1) { + /* root in inode is special */ + if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) { + cops->realloc_root(cur, -1); + error = cops->kill_root(cur, -1, NULL); + if (!error) + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + } + /* + * If this is the root level, and there's only one entry left, + * and it's NOT the leaf level, then we can get rid of this + * level. + */ + else if (numrecs == 1 && level > 0) { + xfs_btree_ptr_t *pp; + /* + * pp is still set to the first pointer in the block. + * Make it the new root of the btree. + */ + pp = rops->ptr_addr(cur, 1, block); + error = cops->kill_root(cur, level, pp); + if (error) + goto error0; + } else if (level > 0) { + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + } + *stat = 1; + return 0; + } + + /* + * If we deleted the leftmost entry in the block, update the + * key values above us in the tree. + */ + if (ptr == 1) { + error = xfs_btree_updkey(cur, kp, level + 1); + if (error) + goto error0; + } + + /* + * If the number of records remaining in the block is at least + * the minimum, we're done. 
+ */ + if (numrecs >= rops->get_minrecs(cur, level)) { + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + return 0; + } + + /* + * Otherwise, we have to move some records around to keep the + * tree balanced. Look at the left and right sibling blocks to + * see if we can re-balance by moving only one record. + */ + bops->get_sibling(cur, block, &rptr, XFS_BB_RIGHTSIB); + bops->get_sibling(cur, block, &lptr, XFS_BB_LEFTSIB); + if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) { + /* + * One child of root, need to get a chance to copy its contents + * into the root and delete it. Can't go up to next level, + * there's nothing to delete there. + */ + if (xfs_btree_ptr_null(cur, &rptr) && + xfs_btree_ptr_null(cur, &lptr) && + level == cur->bc_nlevels - 2) { + error = cops->kill_root(cur, -1, NULL); + if (!error) + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + return 0; + } + } + ASSERT(!xfs_btree_ptr_null(cur, &rptr) || + !xfs_btree_ptr_null(cur, &lptr)); + + /* + * Duplicate the cursor so our btree manipulations here won't + * disrupt the next level up. + */ + error = xfs_btree_dup_cursor(cur, &tcur); + if (error) + goto error0; + + /* + * If there's a right sibling, see if it's ok to shift an entry + * out of it. + */ + if (!xfs_btree_ptr_null(cur, &rptr)) { + /* + * Move the temp cursor to the last entry in the next block. + * Actually any entry but the first would suffice. + */ + i = xfs_btree_lastrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + error = xfs_btree_increment(tcur, level, &i); + if (error) + goto error0; + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + i = xfs_btree_lastrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + /* + * Grab a pointer to the block. 
+ */ + rbp = tcur->bc_bufs[level]; + right = bops->buf_to_block(tcur, rbp); +#ifdef DEBUG + error = bops->check_block(tcur, right, level, rbp); + if (error) + goto error0; +#endif + /* + * Grab the current block number, for future use. + */ + bops->get_sibling(tcur, right, &cptr, XFS_BB_LEFTSIB); + /* + * If right block is full enough so that removing one entry + * won't make it too empty, and left-shifting an entry out + * of right to us works, we're done. + */ + if (rops->get_numrecs(tcur, right) - 1 >= + rops->get_minrecs(tcur, level)) { + error = xfs_btree_lshift(tcur, level, &i); + if (error) + goto error0; + if (i) { + ASSERT(rops->get_numrecs(tcur, block) >= + rops->get_minrecs(tcur, level)); + xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); + tcur = NULL; + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + return 0; + } + } + /* + * Otherwise, grab the number of records in right for + * future reference, and fix up the temp cursor to point + * to our block again (last record). + */ + rrecs = rops->get_numrecs(tcur, right); + if (!xfs_btree_ptr_null(cur, &lptr)) { + i = xfs_btree_firstrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + error = xfs_btree_decrement(tcur, level, &i); + if (error) + goto error0; + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + } + } + /* + * If there's a left sibling, see if it's ok to shift an entry + * out of it. + */ + if (!xfs_btree_ptr_null(cur, &lptr)) { + /* + * Move the temp cursor to the first entry in the + * previous block. + */ + i = xfs_btree_firstrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + error = xfs_btree_decrement(tcur, level, &i); + if (error) + goto error0; + i = xfs_btree_firstrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + + /* + * Grab a pointer to the block. 
+ */ + lbp = tcur->bc_bufs[level]; + left = bops->buf_to_block(cur, lbp); +#ifdef DEBUG + error = bops->check_block(cur, left, level, lbp); + if (error) + goto error0; +#endif + /* + * Grab the current block number, for future use. + */ + bops->get_sibling(tcur, left, &cptr, XFS_BB_RIGHTSIB); + /* + * If left block is full enough so that removing one entry + * won't make it too empty, and right-shifting an entry out + * of left to us works, we're done. + */ + if (rops->get_numrecs(tcur, left) - 1 >= + rops->get_minrecs(tcur, level)) { + error = xfs_btree_rshift(tcur, level, &i); + if (error) + goto error0; + if (i) { + ASSERT(rops->get_numrecs(tcur, block) >= + rops->get_minrecs(tcur, level)); + xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); + tcur = NULL; + if (level == 0) + cur->bc_ptrs[0]++; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + } + } + /* + * Otherwise, grab the number of records in left for + * future reference. + */ + lrecs = rops->get_numrecs(tcur, left); + } + /* + * Delete the temp cursor, we're done with it. + */ + xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); + tcur = NULL; + + /* + * If here, we need to do a join to keep the tree balanced. + */ + ASSERT(!xfs_btree_ptr_null(cur, &cptr)); + if (!xfs_btree_ptr_null(cur, &lptr) && + ((lrecs + rops->get_numrecs(cur, block)) <= + (rops->get_maxrecs(cur, level)))) { + /* + * Set "right" to be the starting block, + * "left" to be the left neighbor. + */ + rptr = cptr; + right = block; + rbp = bp; + error = bops->read_buf(cur, &lptr, 0, &lbp); + if (error) + goto error0; + left = bops->buf_to_block(cur, lbp); + error = bops->check_block(cur, left, level, lbp); + if (error) + goto error0; + } + /* + * If that won't work, see if we can join with the right neighbor block.
+ */ + else if (!xfs_btree_ptr_null(cur, &rptr) && + ((rrecs + rops->get_numrecs(cur, block)) <= + (rops->get_maxrecs(cur, level)))) { + /* + * Set "left" to be the starting block, + * "right" to be the right neighbor. + */ + lptr = cptr; + left = block; + lbp = bp; + error = bops->read_buf(cur, &rptr, 0, &rbp); + if (error) + goto error0; + right = bops->buf_to_block(cur, rbp); + error = bops->check_block(cur, right, level, rbp); + if (error) + goto error0; + lrecs = rops->get_numrecs(cur, left); + } + /* + * Otherwise, we can't fix the imbalance. + * Just return. This is probably a logic error, but it's not fatal. + */ + else { + error = xfs_btree_dec_cursor(cur, level, stat); + if (error) + goto error0; + return 0; + } + /* + * We're now going to join "left" and "right" by moving all the stuff + * in "right" to "left" and deleting "right". + */ + error = xfs_btree_move_entries(cur, level, rbp, lbp, 1, lrecs + 1, rrecs); + if (error) + goto error0; + + /* + * Fix up the right block pointer in the surviving block, and log it. + */ + bops->get_sibling(cur, right, &cptr, XFS_BB_RIGHTSIB); + bops->set_sibling(cur, left, &cptr, XFS_BB_RIGHTSIB); + bops->log_block(cur, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); + + /* + * If there is a right sibling now, make it point to the + * remaining block. + */ + bops->get_sibling(cur, left, &cptr, XFS_BB_RIGHTSIB); + if (!xfs_btree_ptr_null(cur, &cptr)) { + error = bops->read_buf(cur, &cptr, 0, &rrbp); + if (error) + goto error0; + rrblock = bops->buf_to_block(cur, rrbp); + error = bops->check_block(cur, rrblock, level, rrbp); + if (error) + goto error0; + bops->set_sibling(cur, rrblock, &lptr, XFS_BB_LEFTSIB); + bops->log_block(cur, rrbp, XFS_BB_LEFTSIB); + } + /* + * Free the deleted block. + */ + error = bops->free_block(cur, rbp, 1); + if (error) + goto error0; + + /* + * If we joined with the left neighbor, set the buffer in the + * cursor to the left block, and fix up the index.
+ */ + if (bp != lbp) { + cur->bc_bufs[level] = lbp; + cur->bc_ptrs[level] += lrecs; + cur->bc_ra[level] = 0; + } + /* + * If we joined with the right neighbor and there's a level above + * us, increment the cursor at that level. + */ + else if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) || + (level + 1 < cur->bc_nlevels)) { + error = xfs_btree_increment(cur, level + 1, &i); + if (error) + goto error0; + } + + /* + * Readjust the ptr at this level if it's not a leaf, since it's + * still pointing at the deletion point, which makes the cursor + * inconsistent. If this makes the ptr 0, the caller fixes it up. + * We can't use decrement because it would change the next level up. + */ + if (level > 0) + cur->bc_ptrs[level]--; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + /* + * Return value means the next level up has something to do. + */ + *stat = 2; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + if (tcur) + xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); + return error; +} + +STATIC int +xfs_btree_make_block_unfull( + xfs_btree_cur_t *cur, /* btree cursor */ + int level, /* btree level */ + int numrecs, /* # of recs in block */ + int *oindex, /* old tree index */ + int *index, /* new tree index */ + xfs_btree_ptr_t *nptr, /* new btree ptr */ + xfs_btree_cur_t **ncur, /* new btree cursor */ + xfs_btree_rec_t *nrec, /* new record */ + int *stat) +{ + xfs_btree_curops_t *cops = cur->bc_curops; + xfs_btree_recops_t *rops = cur->bc_recops; + xfs_btree_key_t key; /* new btree key value */ + int error = 0; + + if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) { + if (numrecs < rops->get_dmaxrecs(cur, level)) { + /* A resizeable root block that can be made bigger. */ + cops->realloc_root(cur, 1); + return 0; + } + if (level == cur->bc_nlevels - 1) { + /* A root block that needs replacing */ + error = cops->new_root(cur, stat); + if (error || *stat == 0) + return error; + return 0; + } + } + + /* + * First, try shifting an entry to the right neighbor. 
+ */ + error = xfs_btree_rshift(cur, level, stat); + if (error) + return error; + if (*stat) { + /* nothing */ + } else { + /* + * Next, try shifting an entry to the left neighbor. + */ + error = xfs_btree_lshift(cur, level, stat); + if (error) + return error; + if (*stat) { + *oindex = *index = cur->bc_ptrs[level]; + } else { + /* + * Next, try splitting the current block in half. If + * this works we have to re-set our variables because + * we could be in a different block now. + */ + error = xfs_btree_split(cur, level, nptr, &key, + ncur, stat); + if (error || *stat == 0) + return error; + + *index = cur->bc_ptrs[level]; + rops->init_rec_from_key(cur, &key, nrec); + } + } + return 0; +} + +/* + * Insert one record/level. Return information to the caller + * allowing the next level up to proceed if necessary. + */ +int +xfs_btree_insrec( + xfs_btree_cur_t *cur, /* btree cursor */ + int level, /* level to insert record at */ + xfs_btree_ptr_t *ptrp, /* i/o: block number inserted */ + xfs_btree_rec_t *recp, /* i/o: record data inserted */ + xfs_btree_cur_t **curp, /* output: new cursor replacing cur */ + int *stat) /* success/failure */ +{ + xfs_btree_block_t *block; /* bmap btree block */ + xfs_buf_t *bp; /* buffer for block */ + int error; /* error return value */ + int i; /* loop index */ + xfs_btree_key_t key; /* bmap btree key */ + xfs_btree_ptr_t nptr; /* new block ptr */ + struct xfs_btree_cur *ncur; /* new btree cursor */ + xfs_btree_rec_t nrec; /* new record count */ + int optr; /* old key/record index */ + int ptr; /* key/record index */ + int numrecs; + xfs_btree_curops_t *cops = cur->bc_curops; + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + + ASSERT(level < cur->bc_nlevels); + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGIPR(cur, level, ptrp, recp); + ncur = NULL; + /* + * If we have an external root pointer, and we've made it to the + * root level, allocate a new root block and we're done. 
+ */ + if (!(cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + (level >= cur->bc_nlevels)) { + error = cops->new_root(cur, &i); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + xfs_btree_set_ptr_null(cur, ptrp); + *stat = i; + return error; + } + /* + * Make a key out of the record data to be inserted, and save it. + */ + rops->init_key_from_rec(cur, &key, recp); + /* + * If we're off the left edge, return failure. + */ + optr = ptr = cur->bc_ptrs[level]; + if (ptr == 0) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + } + XFS_STATS_INC(xs_bmbt_insrec); + + /* + * Get pointers to the btree buffer and block. + */ + block = bops->get_block(cur, level, &bp); + numrecs = rops->get_numrecs(cur, block); +#ifdef DEBUG + error = bops->check_block(cur, block, level, bp); + if (error) + goto error0; + /* + * Check that the new entry is being inserted in the right place. + */ + if (ptr <= numrecs) { + if (level == 0) { + rp = rops->rec_addr(cur, ptr, block); + xfs_btree_check_rec(cur->bc_btnum, recp, rp); + } else { + kp = rops->key_addr(cur, ptr, block); + xfs_btree_check_key(cur->bc_btnum, &key, kp); + } + } +#endif + /* + * If the block is full, we can't insert the new entry until we + * make the block un-full. + */ + xfs_btree_set_ptr_null(cur, &nptr); + ncur = NULL; + if (numrecs == rops->get_maxrecs(cur, level)) { + error = xfs_btree_make_block_unfull(cur, level, numrecs, + &optr, &ptr, &nptr, &ncur, &nrec, stat); + if (error || *stat == 0) + goto error0; + } + /* + * The current block may have changed during the split. + */ + block = bops->get_block(cur, level, &bp); +#ifdef DEBUG + error = bops->check_block(cur, block, level, bp); + if (error) + return error; +#endif + + /* + * At this point we know there's room for our new entry in the block + * we're pointing at. + */ + error = xfs_btree_insert_entry(cur, level, bp, ptr, &key, ptrp, recp); + if (error) + goto error0; + + /* + * If we inserted at the start of a block, update the parents' keys. 
+ */ + if (optr == 1) { + error = xfs_btree_updkey(cur, &key, level + 1); + if (error) + goto error0; + } + + /* + * Return the new block number, if any. + * If there is one, give back a record value and a cursor too. + */ + *ptrp = nptr; + if (!xfs_btree_ptr_null(cur, &nptr)) { + *recp = nrec; + *curp = ncur; + } + + /* + * If we are tracking the last record in the tree and + * we are at the far right edge of the tree, update it. + */ + if (xfs_btree_is_lastrec(cur, block, level, ptr)) { + error = cops->update_lastrec(cur, block); + if (error) + goto error0; + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* + * Move 1 record left from cur/level if possible. + * Update cur to reflect the new path. + */ +int /* error */ +xfs_btree_lshift( + xfs_btree_cur_t *cur, + int level, + int *stat) /* success/failure */ +{ + int error; /* error return value */ +#ifdef DEBUG + int i; /* loop counter */ +#endif + xfs_btree_key_t key; /* btree key */ + xfs_buf_t *lbp; /* left buffer pointer */ + xfs_btree_block_t *left; /* left btree block */ + int lrecs; /* left record count */ + xfs_buf_t *rbp; /* right buffer pointer */ + xfs_btree_block_t *right; /* right btree block */ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + xfs_btree_ptr_t rptr; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + level == cur->bc_nlevels - 1) + goto out0; + /* + * Set up variables for this block as "right". + */ + rbp = cur->bc_bufs[level]; + right = bops->buf_to_block(cur, rbp); +#ifdef DEBUG + error = bops->check_block(cur, right, level, rbp); + if (error) + goto error0; +#endif + /* + * If we've got no left sibling then we can't shift an entry left. 
+ */ + bops->get_sibling(cur, right, &rptr, XFS_BB_LEFTSIB); + if (xfs_btree_ptr_null(cur, &rptr)) + goto out0; + /* + * If the cursor entry is the one that would be moved, don't + * do it... it's too complicated. + */ + if (cur->bc_ptrs[level] <= 1) + goto out0; + + /* + * Set up the left neighbor as "left". + */ + error = bops->read_buf(cur, &rptr, 0, &lbp); + if (error) + goto error0; + left = bops->buf_to_block(cur, lbp); + error = bops->check_block(cur, left, level, lbp); + if (error) + goto error0; + + /* + * If it's full, it can't take another entry. + */ + lrecs = rops->get_numrecs(cur, left); + if (lrecs == rops->get_maxrecs(cur, level)) + goto out0; + /* + * If non-leaf, copy a key and a ptr to the left block. + * Log the changes to the left block. + */ + error = xfs_btree_move_entries(cur, level, rbp, lbp, 1, lrecs + 1, 1); + if (error) + goto error0; + + /* + * Slide the contents of right down one entry. + * Log the changes to the right block. + */ + error = xfs_btree_remove_entry(cur, level, rbp, &key, 1); + if (error) + goto error0; + + /* + * Update the parent key values of right. + */ + error = xfs_btree_updkey(cur, &key, level + 1); + if (error) + goto error0; + /* + * Slide the cursor value left one. + */ + cur->bc_ptrs[level]--; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* + * Move 1 record right from cur/level if possible. + * Update cur to reflect the new path. 
+ */ +int /* error */ +xfs_btree_rshift( + xfs_btree_cur_t *cur, + int level, + int *stat) /* success/failure */ +{ + int error; /* error return value */ + int i; /* loop counter */ + xfs_btree_key_t key; /* btree key */ + xfs_buf_t *lbp; /* left buffer pointer */ + xfs_btree_block_t *left; /* left btree block */ + xfs_buf_t *rbp; /* right buffer pointer */ + xfs_btree_block_t *right; /* right btree block */ + struct xfs_btree_cur *tcur; /* temporary btree cursor */ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + xfs_btree_ptr_t rptr; + int rrecs; /* right record count */ + int lrecs; /* left record count */ + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + (level == cur->bc_nlevels - 1)) + goto out0; + /* + * Set up variables for this block as "left". + */ + lbp = cur->bc_bufs[level]; + left = bops->buf_to_block(cur, lbp); +#ifdef DEBUG + error = bops->check_block(cur, left, level, lbp); + if (error) + goto error0; +#endif + /* + * If we've got no right sibling then we can't shift an entry right. + */ + bops->get_sibling(cur, left, &rptr, XFS_BB_RIGHTSIB); + if (xfs_btree_ptr_null(cur, &rptr)) + goto out0; + /* + * If the cursor entry is the one that would be moved, don't + * do it... it's too complicated. + */ + lrecs = rops->get_numrecs(cur, left); + if (cur->bc_ptrs[level] >= lrecs) + goto out0; + /* + * Set up the right neighbor as "right". + */ + error = bops->read_buf(cur, &rptr, 0, &rbp); + if (error) + goto error0; + right = bops->buf_to_block(cur, rbp); + error = bops->check_block(cur, right, level, rbp); + if (error) + goto error0; + + /* + * If it's full, it can't take another entry. + */ + rrecs = rops->get_numrecs(cur, right); + if (rrecs == rops->get_maxrecs(cur, level)) + goto out0; + + /* + * Make a hole at the start of the right neighbor block, then + * copy the last left block entry to the hole. 
Update and + * log the right block. + */ + error = xfs_btree_insert_entry(cur, level, rbp, 1, + rops->key_addr(cur, lrecs, left), + rops->ptr_addr(cur, lrecs, left), + rops->rec_addr(cur, lrecs, left)); + if (error) + goto error0; + + /* + * If we are at leaf level, grab the key of the new entry in + * the right block for later. + */ + if (level == 0) + rops->init_key_from_rec(cur, &key, rops->rec_addr(cur, 1, right)); + + /* + * Now update the left block to reflect the moved entry. + */ + lrecs--; + rops->set_numrecs(cur, left, lrecs); + bops->log_block(cur, lbp, XFS_BB_NUMRECS); + + /* + * Using a temporary cursor, update the parent key values of the + * block on the right. + */ + error = xfs_btree_dup_cursor(cur, &tcur); + if (error) + goto error0; + i = xfs_btree_lastrec(tcur, level); + XFS_WANT_CORRUPTED_GOTO(i == 1, error1); + + error = xfs_btree_increment(tcur, level, &i); + if (error) + goto error1; + XFS_WANT_CORRUPTED_GOTO(i == 1, error1); + + error = xfs_btree_updkey(tcur, &key, level + 1); + if (error) + goto error1; + + xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; + +error1: + XFS_BTREE_TRACE_CURSOR(tcur, XBT_ERROR); + xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); + return error; +} + +/* + * Split cur/level block in half. + * Return new block number and the key to its first + * record (to be inserted into parent).
+ */ +int /* error */ +xfs_btree_split( + xfs_btree_cur_t *cur, + int level, + xfs_btree_ptr_t *ptrp, + xfs_btree_key_t *key, + xfs_btree_cur_t **curp, + int *stat) /* success/failure */ +{ + int error; /* error return value */ + xfs_btree_ptr_t lptr; /* left sibling block ptr */ + xfs_buf_t *lbp; /* left buffer pointer */ + xfs_btree_block_t *left; /* left btree block */ + xfs_btree_ptr_t rptr; /* right sibling block ptr */ + xfs_buf_t *rbp; /* right buffer pointer */ + xfs_btree_block_t *right; /* right btree block */ + xfs_btree_ptr_t rrptr; /* right-right sibling ptr */ + xfs_buf_t *rrbp; /* right-right buffer pointer */ + xfs_btree_block_t *rrblock; /* right-right btree block */ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + int lrecs; + int rrecs; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGIPK(cur, level, ptrp, key); + + /* + * Set up left block (current one). + */ + lbp = cur->bc_bufs[level]; + bops->buf_to_ptr(cur, lbp, &lptr); + + /* + * Allocate the new block. + * If we can't do it, we're toast. Give up. + */ + error = bops->alloc_block(cur, &lptr, &rptr, 1, stat); + if (error) + goto error0; + if (*stat == 0) + goto out0; + + /* + * Set up the new block as "right". + */ + error = bops->get_buf(cur, &rptr, 0, &rbp); + if (error) + goto error0; + right = bops->buf_to_block(cur, rbp); + + /* + * "Left" is the current (according to the cursor) block. + */ + left = bops->buf_to_block(cur, lbp); +#ifdef DEBUG + error = bops->check_block(cur, left, level, lbp); + if (error) + goto error0; +#endif + + /* + * Fill in the btree header for the new block. + */ + bops->init_sibling(cur, right, left); + + /* + * Split the entries between the old and the new block evenly. + * Make sure that if there's an odd number of entries now, that + * each new block will have the same number of entries. 
+ */ + lrecs = rops->get_numrecs(cur, left); + rrecs = lrecs / 2; + if ((lrecs & 1) && cur->bc_ptrs[level] <= rrecs + 1) + rrecs++; + + /* + * Copy btree block entries from the left block over to the + * new block, the right. Update the right block and log the + * changes. + */ + error = xfs_btree_move_entries(cur, level, lbp, rbp, + (lrecs - rrecs + 1), 1, rrecs); + if (error) + goto error0; + + /* + * Grab the keys to the entries moved to the right block + */ + if (level > 0) { + xfs_btree_key_t *keyp; + keyp = rops->key_addr(cur, 1, right); + rops->move_keys(cur, keyp, key, 0, 0, 1); + } else { + rops->init_key_from_rec(cur, key, rops->rec_addr(cur, 1, right)); + } + + /* + * Find the left block number by looking in the buffer. + * Adjust numrecs, sibling pointers. + */ + bops->get_sibling(cur, left, &rrptr, XFS_BB_RIGHTSIB); + bops->set_sibling(cur, right, &rrptr, XFS_BB_RIGHTSIB); + bops->set_sibling(cur, right, &lptr, XFS_BB_LEFTSIB); + bops->set_sibling(cur, left, &rptr, XFS_BB_RIGHTSIB); + + lrecs -= rrecs; + rops->set_numrecs(cur, left, lrecs); + + bops->log_block(cur, rbp, XFS_BB_ALL_BITS); + bops->log_block(cur, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); + + /* + * If there's a block to the new block's right, make that block + * point back to right instead of to left. + */ + if (!xfs_btree_ptr_null(cur, &rrptr)) { + error = bops->read_buf(cur, &rrptr, 0, &rrbp); + if (error) + goto error0; + rrblock = bops->buf_to_block(cur, rrbp); + error = bops->check_block(cur, rrblock, level, rrbp); + if (error) + goto error0; + + bops->set_sibling(cur, rrblock, &rptr, XFS_BB_LEFTSIB); + bops->log_block(cur, rrbp, XFS_BB_LEFTSIB); + } + /* + * If the cursor is really in the right block, move it there. + * If it's just pointing past the last entry in left, then we'll + * insert there, so don't change anything in that case. 
+ */ + if (cur->bc_ptrs[level] > lrecs + 1) { + xfs_btree_setbuf(cur, level, rbp); + cur->bc_ptrs[level] -= lrecs; + } + /* + * If there are more levels, we'll need another cursor which refers to + * the right block, no matter where this cursor was. + */ + if (level + 1 < cur->bc_nlevels) { + error = xfs_btree_dup_cursor(cur, curp); + if (error) + goto error0; + (*curp)->bc_ptrs[level + 1]++; + } + *ptrp = rptr; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* + * Update keys at all levels from here to the root along the cursor's path. + */ +int +xfs_btree_updkey( + xfs_btree_cur_t *cur, + xfs_btree_key_t *keyp, /* on-disk format */ + int level) +{ + xfs_btree_block_t *block; + xfs_buf_t *bp; +#ifdef DEBUG + int error; +#endif + xfs_btree_key_t *kp; + int ptr; + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + + ASSERT(!(cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) || level >= 1); + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGIK(cur, level, keyp); + /* + * Go up the tree from this level toward the root. + * At each level, update the key value to the value input. + * Stop when we reach a level where the cursor isn't pointing + * at the first entry in the block. + */ + for (ptr = 1; ptr == 1 && level < cur->bc_nlevels; level++) { + block = bops->get_block(cur, level, &bp); +#ifdef DEBUG + error = bops->check_block(cur, block, level, bp); + if (error) { + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; + } +#endif + ptr = cur->bc_ptrs[level]; + kp = rops->key_addr(cur, ptr, block); + rops->move_keys(cur, keyp, kp, 0, 0, 1); + rops->log_keys(cur, bp, ptr, ptr); + } + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; +} + +/* + * Increment cursor by one record at the level. + * For nonzero levels the leaf-ward information is untouched.
+ */ +int /* error */ +xfs_btree_increment( + xfs_btree_cur_t *cur, + int level, + int *stat) /* success/failure */ +{ + xfs_btree_block_t *block; + xfs_btree_ptr_t ptr; + xfs_buf_t *bp; + int error; /* error return value */ + int lev; + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + ASSERT(level < cur->bc_nlevels); + /* + * Read-ahead to the right at this level. + */ + xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); + /* + * Get a pointer to the btree block. + */ + block = bops->get_block(cur, level, &bp); +#ifdef DEBUG + error = bops->check_block(cur, block, level, bp); + if (error) + goto error0; +#endif + /* + * Increment the ptr at this level. If we're still in the block + * then we're done. + */ + if (++cur->bc_ptrs[level] <= rops->get_numrecs(cur, block)) + goto out1; + /* + * If we just went off the right edge of the tree, return failure. + */ + bops->get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB); + if (xfs_btree_ptr_null(cur, &ptr)) + goto out0; + + /* + * March up the tree incrementing pointers. + * Stop when we don't go off the right edge of a block. + */ + for (lev = level + 1; lev < cur->bc_nlevels; lev++) { + block = bops->get_block(cur, lev, &bp); +#ifdef DEBUG + error = bops->check_block(cur, block, lev, bp); + if (error) + goto error0; +#endif + if (++cur->bc_ptrs[lev] <= rops->get_numrecs(cur, block)) + break; + /* + * Read-ahead the right block, we're going to read it + * in the next loop. + */ + xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); + } + /* + * If we went off the root then we are either seriously + * confused or have the tree root in an inode. + */ + if (lev == cur->bc_nlevels) { + ASSERT(cur->bc_flags & XFS_BTREE_ROOT_IN_INODE); + goto out0; + } + ASSERT(lev < cur->bc_nlevels); + + /* + * Now walk back down the tree, fixing up the cursor's buffer + * pointers and key numbers. 
+ */ + for (block = bops->get_block(cur, lev, &bp); lev > level; ) { + xfs_btree_ptr_t *ptrp; + + ptrp = rops->ptr_addr(cur, cur->bc_ptrs[lev], block); + error = bops->read_buf(cur, ptrp, 0, &bp); + if (error) + goto error0; + lev--; + xfs_btree_setbuf(cur, lev, bp); + block = bops->buf_to_block(cur, bp); + error = bops->check_block(cur, block, lev, bp); + if (error) + goto error0; + cur->bc_ptrs[lev] = 1; + } +out1: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* + * Decrement cursor by one record at the level. + * For nonzero levels the leaf-ward information is untouched. + */ +int /* error */ +xfs_btree_decrement( + xfs_btree_cur_t *cur, + int level, + int *stat) /* success/failure */ +{ + xfs_btree_block_t *block; + xfs_buf_t *bp; + int error; /* error return value */ + int lev; + xfs_btree_ptr_t ptr; + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + XFS_BTREE_TRACE_ARGI(cur, level); + ASSERT(level < cur->bc_nlevels); + /* + * Read-ahead to the left at this level. + */ + xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); + /* + * Decrement the ptr at this level. If we're still in the block + * then we're done. + */ + if (--cur->bc_ptrs[level] > 0) + goto out1; + /* + * Get a pointer to the btree block. + */ + block = bops->get_block(cur, level, &bp); +#ifdef DEBUG + error = bops->check_block(cur, block, level, bp); + if (error) + goto error0; +#endif + /* + * If we just went off the left edge of the tree, return failure. + */ + bops->get_sibling(cur, block, &ptr, XFS_BB_LEFTSIB); + if (xfs_btree_ptr_null(cur, &ptr)) + goto out0; + /* + * March up the tree decrementing pointers. + * Stop when we don't go off the left edge of a block. 
+ */ + for (lev = level + 1; lev < cur->bc_nlevels; lev++) { + if (--cur->bc_ptrs[lev] > 0) + break; + /* + * Read-ahead the left block, we're going to read it + * in the next loop. + */ + xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); + } + /* + * If we went off the root then we are either seriously + * confused or the root of the tree is in an inode. + */ + if (lev == cur->bc_nlevels) { + ASSERT(cur->bc_flags & XFS_BTREE_ROOT_IN_INODE); + goto out0; + } + ASSERT(lev < cur->bc_nlevels); + /* + * Now walk back down the tree, fixing up the cursor's buffer + * pointers and key numbers. + */ + for (block = bops->get_block(cur, lev, &bp); lev > level; ) { + xfs_btree_ptr_t *ptrp; + + ptrp = rops->ptr_addr(cur, cur->bc_ptrs[lev], block); + error = bops->read_buf(cur, ptrp, 0, &bp); + if (error) + goto error0; + lev--; + xfs_btree_setbuf(cur, lev, bp); + block = bops->buf_to_block(cur, bp); + error = bops->check_block(cur, block, lev, bp); + if (error) + goto error0; + cur->bc_ptrs[lev] = rops->get_numrecs(cur, block); + } +out1: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* + * Insert the record at the point referenced by cur. + * The cursor may be inconsistent on return if splits have been done.
+ */ +int +xfs_btree_insert( + xfs_btree_cur_t *cur, + int *stat) +{ + int error; /* error return value */ + int i; /* result value, 0 for failure */ + int level; /* current level number in btree */ + xfs_btree_ptr_t nptr; /* new block number (split result) */ + xfs_btree_cur_t *ncur; /* new cursor (split result) */ + xfs_btree_cur_t *pcur; /* previous level's cursor */ + xfs_btree_rec_t rec; /* record to insert */ + xfs_btree_curops_t *cops = cur->bc_curops; + + level = 0; + xfs_btree_set_ptr_null(cur, &nptr); + cur->bc_recops->init_rec_from_cur(cur, &rec); + ncur = NULL; + pcur = cur; + /* + * Loop going up the tree, starting at the leaf level. + * Stop when we don't get a split block, that must mean that + * the insert is finished with this level. + */ + do { + /* + * Insert nrec/nptr into this level of the tree. + * Note if we fail, nptr will be null. + */ + error = xfs_btree_insrec(pcur, level, &nptr, &rec, &ncur, &i); + if (error) { + if (pcur != cur) + xfs_btree_del_cursor(pcur, XFS_BTREE_ERROR); + goto error0; + } + XFS_WANT_CORRUPTED_GOTO(i == 1, error0); + level++; + /* + * See if the cursor we just used is trash. + * Can't trash the caller's cursor, but otherwise we should + * if ncur is a new cursor or we're about to be done. + */ + if (pcur != cur && (ncur || xfs_btree_ptr_null(cur, &nptr))) { + /* + * some btrees need to move state from one cursor + * to another here. + */ + if (cops->update_cursor) + cops->update_cursor(pcur, cur); + cur->bc_nlevels = pcur->bc_nlevels; + xfs_btree_del_cursor(pcur, XFS_BTREE_NOERROR); + } + /* + * If we got a new cursor, switch to it. + */ + if (ncur) { + pcur = ncur; + ncur = NULL; + } + } while (!xfs_btree_ptr_null(cur, &nptr)); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = i; + return 0; +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* + * Delete the record pointed to by cur. + * The cursor refers to the place where the record was (could be inserted) + * when the operation returns. 
+ */ +int /* error */ +xfs_btree_delete( + xfs_btree_cur_t *cur, + int *stat) /* success/failure */ +{ + int error; /* error return value */ + int i; + int level; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + /* + * Go up the tree, starting at leaf level. + * If 2 is returned then a join was done; go to the next level. + * Otherwise we are done. + */ + for (level = 0, i = 2; i == 2; level++) { + error = xfs_btree_delrec(cur, level, &i); + if (error) + goto error0; + } + if (i == 0) { + for (level = 1; level < cur->bc_nlevels; level++) { + if (cur->bc_ptrs[level] == 0) { + error = xfs_btree_decrement(cur, level, &i); + if (error) + goto error0; + break; + } + } + } + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = i; + return 0; +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +STATIC int +xfs_btree_lookup_get_block( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_btree_block_t **blkp, /* current btree block */ + int level, /* level in the btree */ + xfs_btree_ptr_t *pp) /* ptr to btree block */ +{ + xfs_buf_t *bp; /* buffer pointer for btree block */ + xfs_daddr_t d; /* disk address of btree block */ + int error = 0; + xfs_btree_block_t *block; /* current btree block */ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + + /* + * special case the root block if in an inode + */ + if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) && + (level >= cur->bc_nlevels - 1)) { + *blkp = bops->get_block(cur, level, &bp); + return 0; + } + + /* + * Get the disk address we're looking for. + */ + d = rops->ptr_to_daddr(cur, pp); + /* + * If the old buffer at this level is for a different block, + * throw it away, otherwise just use it. + */ + bp = cur->bc_bufs[level]; + if (bp && XFS_BUF_ADDR(bp) != d) + bp = NULL; + if (!bp) { + /* + * Need to get a new buffer. Read it, then + * set it in the cursor, releasing the old one. 
+ */ + error = bops->read_buf(cur, pp, 0, &bp); + if (error) + return error; + xfs_btree_setbuf(cur, level, bp); + /* + * Point to the btree block, now that we have the buffer + */ + block = bops->buf_to_block(cur, bp); + error = bops->check_block(cur, block, level, bp); + if (error) + return error; + } else + block = bops->buf_to_block(cur, bp); + + *blkp = block; + return 0; +} + +/* + * Lookup the record. The cursor is made to point to it, based on dir. + * Return 0 if can't find any such record, 1 for success. + */ +int /* error */ +xfs_btree_lookup( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_lookup_t dir, /* <=, ==, or >= */ + int *stat) /* success/failure */ +{ + xfs_btree_block_t *block = NULL; /* current btree block */ + __int64_t diff; /* difference for the current key */ + int error; /* error return value */ + int keyno = 0; /* current key number */ + int level; /* level in the btree */ + xfs_btree_ptr_t *pp; /* ptr to btree block */ + xfs_btree_ptr_t ptr; /* ptr to btree block */ + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + + /* + * initialise start pointer from cursor + */ + rops->init_ptr_from_cur(cur, &ptr); + pp = &ptr; + + /* + * Iterate over each level in the btree, starting at the root. + * For each level above the leaves, find the key we need, based + * on the lookup record, then follow the corresponding block + * pointer down to the next level. + */ + for (level = cur->bc_nlevels - 1, diff = 1; level >= 0; level--) { + /* + * Get the block we need to do the lookup on. + */ + error = xfs_btree_lookup_get_block(cur, &block, level, pp); + if (error) + goto error0; + + /* + * If we already had a key match at a higher level, we know + * we need to use the first entry in this block. + */ + if (diff == 0) + keyno = 1; + /* + * Otherwise we need to search this block. Do a binary search. 
+ */ + else { + int high; /* high entry number */ + int low; /* low entry number */ + + /* + * Set low and high entry numbers, 1-based. + */ + low = 1; + high = rops->get_numrecs(cur, block); + if (!high) { + /* + * If the block is empty, the tree must + * be an empty leaf. + */ + ASSERT(level == 0 && cur->bc_nlevels == 1); + cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; + } + /* + * Binary search the block. + */ + while (low <= high) { + xfs_btree_key_t key; + xfs_btree_key_t *kp; + + XFS_STATS_INC(xs_bmbt_compare); + /* + * keyno is average of low and high. + */ + keyno = (low + high) >> 1; + /* + * Get current search key + */ + if (level > 0) { + kp = rops->key_addr(cur, keyno, block); + } else { + xfs_btree_rec_t *krp; + + krp = rops->rec_addr(cur, keyno, block); + kp = &key; + rops->init_key_from_rec(cur, kp, krp); + } + /* + * Compute difference to get next direction. + */ + diff = rops->key_diff(cur, kp); + + /* + * Less than, move right. + * Greater than, move left. + * Equal, we're done. + */ + if (diff < 0) + low = keyno + 1; + else if (diff > 0) + high = keyno - 1; + else + break; + } + } + /* + * If there are more levels, set up for the next level + * by getting the block number and filling in the cursor. + */ + if (level > 0) { + /* + * If we moved left, need the previous key number, + * unless there isn't one. + */ + if (diff > 0 && --keyno < 1) + keyno = 1; + pp = rops->ptr_addr(cur, keyno, block); + +#ifdef DEBUG + error = bops->xfs_btree_check_ptr(cur, pp, level); + if (error) + goto error0; +#endif + cur->bc_ptrs[level] = keyno; + } + } + /* + * Done with the search. + * See if we need to adjust the results. + */ + if (dir != XFS_LOOKUP_LE && diff < 0) { + keyno++; + /* + * If ge search and we went off the end of the block, but it's + * not the last block, we're in the wrong block. 
+ */ + bops->get_sibling(cur, block, &ptr, XFS_BB_RIGHTSIB); + if (dir == XFS_LOOKUP_GE && + keyno > rops->get_numrecs(cur, block) && + !xfs_btree_ptr_null(cur, &ptr)) { + int i; + + cur->bc_ptrs[0] = keyno; + error = xfs_btree_increment(cur, 0, &i); + if (error) + goto error0; + ASSERT(i == 1); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; + } + } + else if (dir == XFS_LOOKUP_LE && diff > 0) + keyno--; + cur->bc_ptrs[0] = keyno; + /* + * Return if we succeeded or not. + */ + if (keyno == 0 || keyno > rops->get_numrecs(cur, block)) + *stat = 0; + else + *stat = ((dir != XFS_LOOKUP_EQ) || (diff == 0)); + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + +/* + * Allocate a new root block, fill it in. + */ +int /* error */ +xfs_btree_newroot( + xfs_btree_cur_t *cur, /* btree cursor */ + int *stat) /* success/failure */ +{ + xfs_btree_block_t *block; /* one half of the old root block */ + xfs_buf_t *bp; /* buffer containing block */ + int error; /* error return value */ + xfs_btree_key_t *kp; /* btree key pointer */ + xfs_buf_t *lbp; /* left buffer pointer */ + xfs_btree_block_t *left; /* left btree block */ + xfs_buf_t *nbp; /* new (root) buffer */ + xfs_btree_block_t *new; /* new (root) btree block */ + int nptr; /* new value for key index, 1 or 2 */ + xfs_btree_ptr_t *pp; /* btree address pointer */ + xfs_buf_t *rbp; /* right buffer pointer */ + xfs_btree_block_t *right; /* right btree block */ + xfs_btree_curops_t *cops = cur->bc_curops; + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + xfs_btree_ptr_t rptr; + xfs_btree_ptr_t lptr; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + //ASSERT(cur->bc_nlevels < XFS_IN_MAXLEVELS(cur->bc_mp)); // inobt + //ASSERT(cur->bc_nlevels < XFS_AG_MAXLEVELS(cur->bc_mp)); // alloc + + /* + * Get a block & a buffer. + */ + rops->init_ptr_from_cur(cur, &rptr); + + /* + * Allocate the new block. 
+ * If we can't do it, we're toast. Give up. + */ + error = bops->alloc_block(cur, &rptr, &lptr, 1, stat); + if (error) + goto error0; + if (*stat == 0) + goto out0; + + /* + * Set up the new block. + */ + error = bops->get_buf(cur, &lptr, 0, &nbp); + if (error) + goto error0; + new = bops->buf_to_block(cur, nbp); + + /* + * Set the root data in the a.g. inode structure, + * increasing the level by 1. + */ + cops->set_root(cur, &lptr, 1); + + /* + * At the previous root level there are now two blocks: the old + * root, and the new block generated when it was split. + * We don't know which one the cursor is pointing at, so we + * set up variables "left" and "right" for each case. + */ + bp = cur->bc_bufs[cur->bc_nlevels - 1]; + block = bops->buf_to_block(cur, bp); +#ifdef DEBUG + error = bops->check_block(cur, block, cur->bc_nlevels - 1, bp); + if (error) + goto error0; +#endif + bops->get_sibling(cur, block, &rptr, XFS_BB_RIGHTSIB); + if (!xfs_btree_ptr_null(cur, &rptr)) { + /* + * Our block is left, pick up the right block. + */ + lbp = bp; + bops->buf_to_ptr(cur, lbp, &lptr); + left = block; + error = bops->read_buf(cur, &rptr, 0, &rbp); + if (error) + goto error0; + bp = rbp; + right = bops->buf_to_block(cur, rbp); + error = bops->check_block(cur, right, cur->bc_nlevels-1, rbp); + if (error) + goto error0; + nptr = 1; + } else { + /* + * Our block is right, pick up the left block. + */ + rbp = bp; + bops->buf_to_ptr(cur, rbp, &rptr); + right = block; + bops->get_sibling(cur, right, &lptr, XFS_BB_LEFTSIB); + error = bops->read_buf(cur, &lptr, 0, &lbp); + if (error) + goto error0; + bp = lbp; + left = bops->buf_to_block(cur, lbp); + error = bops->check_block(cur, left, cur->bc_nlevels-1, lbp); + if (error) + goto error0; + nptr = 2; + } + /* + * Fill in the new block's btree header and log it. 
+ * XXX: this is 32bit btree specific + */ + new->bb_h.bb_magic = cpu_to_be32(xfs_magics[cur->bc_btnum]); + new->bb_h.bb_level = cpu_to_be16(cur->bc_nlevels); + new->bb_h.bb_numrecs = cpu_to_be16(2); + new->bb_u.s.bb_leftsib = cpu_to_be32(NULLAGBLOCK); + new->bb_u.s.bb_rightsib = cpu_to_be32(NULLAGBLOCK); + bops->log_block(cur, nbp, XFS_BB_ALL_BITS); + ASSERT(!xfs_btree_ptr_null(cur, &lptr) && !xfs_btree_ptr_null(cur, &rptr)); + + /* + * Fill in the key data in the new root. + */ + kp = rops->key_addr(cur, 1, new); + if (be16_to_cpu(left->bb_h.bb_level) > 0) { + rops->set_key(cur, kp, 0, rops->key_addr(cur, 1, left)); + rops->set_key(cur, kp, 1, rops->key_addr(cur, 1, right)); + } else { + rops->init_key_from_rec(cur, kp, rops->rec_addr(cur, 1, left)); + kp = rops->key_addr(cur, 2, new); + rops->init_key_from_rec(cur, kp, rops->rec_addr(cur, 1, right)); + } + rops->log_keys(cur, nbp, 1, 2); + /* + * Fill in the pointer data in the new root. + */ + pp = rops->ptr_addr(cur, 1, new); + rops->set_ptr(cur, pp, 0, &lptr); + rops->set_ptr(cur, pp, 1, &rptr); + rops->log_ptrs(cur, nbp, 1, 2); + /* + * Fix up the cursor. + */ + xfs_btree_setbuf(cur, cur->bc_nlevels, nbp); + cur->bc_ptrs[cur->bc_nlevels] = nptr; + cur->bc_nlevels++; + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 1; + return 0; +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +out0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + *stat = 0; + return 0; +} + +/* + * Update the record referred to by cur to the value in the + * given record. This either works (return 0) or gets an + * EFSCORRUPTED error.
+ */ +int +xfs_btree_update( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec) +{ + xfs_btree_block_t *block; + xfs_buf_t *bp; + int error; + int ptr; + xfs_btree_rec_t *rp; + xfs_btree_curops_t *cops = cur->bc_curops; + xfs_btree_blkops_t *bops = cur->bc_blkops; + xfs_btree_recops_t *rops = cur->bc_recops; + + XFS_BTREE_TRACE_CURSOR(cur, XBT_ENTRY); + //XFS_BTREE_TRACE_ARGR(cur, rec); + + /* + * Pick up the current block. + */ + block = bops->get_block(cur, 0, &bp); +#ifdef DEBUG + error = bops->check_block(cur, block, 0, bp); + if (error) + goto error0; +#endif + /* + * Get the address of the rec to be updated. + */ + ptr = cur->bc_ptrs[0]; + rp = rops->rec_addr(cur, ptr, block); + /* + * Fill in the new contents and log them. + */ + rops->move_recs(cur, rec, rp, 0, 0, 1); + rops->log_recs(cur, bp, ptr, ptr); + /* + * If we are tracking the last record in the tree and + * we are at the far right edge of the tree, update it. + */ + if (xfs_btree_is_lastrec(cur, block, 0, ptr)) { + error = cops->update_lastrec(cur, block); + if (error) + goto error0; + } + + /* + * Updating first record in leaf. Pass new key value up to our parent. + */ + if (ptr == 1) { + xfs_btree_key_t key; + + rops->init_key_from_rec(cur, &key, rec); + error = xfs_btree_updkey(cur, &key, 1); + if (error) + goto error0; + } + + XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT); + return 0; + +error0: + XFS_BTREE_TRACE_CURSOR(cur, XBT_ERROR); + return error; +} + Index: 2.6.x-xfs-new/fs/xfs/xfs_btree_trace.c =================================================================== --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ 2.6.x-xfs-new/fs/xfs/xfs_btree_trace.c 2007-11-06 19:40:29.758667866 +1100 @@ -0,0 +1,202 @@ +/* + * Copyright (c) 2000-2001,2005 Silicon Graphics, Inc. + * All Rights Reserved. + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it would be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_types.h" +#include "xfs_bit.h" +#include "xfs_log.h" +#include "xfs_inum.h" +#include "xfs_trans.h" +#include "xfs_sb.h" +#include "xfs_ag.h" +#include "xfs_dir2.h" +#include "xfs_dmapi.h" +#include "xfs_mount.h" +#include "xfs_bmap_btree.h" +#include "xfs_alloc_btree.h" +#include "xfs_ialloc_btree.h" +#include "xfs_dir2_sf.h" +#include "xfs_attr_sf.h" +#include "xfs_dinode.h" +#include "xfs_inode.h" +#include "xfs_btree.h" +#include "xfs_ialloc.h" +#include "xfs_alloc.h" +#include "xfs_error.h" + +#if defined(XFS_BTREE_TRACE) + +/* + * Add a trace buffer entry for arguments, for one integer arg. + */ +STATIC void +xfs_btree_trace_argi( + const char *func, + xfs_btree_cur_t *cur, + int i, + int line) +{ + cur->bc_trcops->trace(func, cur, XBT_ARGS, XFS_BTREE_KTRACE_ARGI, line, + i, 0, 0, 0, + 0, 0, 0, 0, + 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for a buffer & 1 integer arg. + */ +STATIC void +xfs_btree_trace_argbi( + const char *func, + xfs_btree_cur_t *cur, + xfs_buf_t *b, + int i, + int line) +{ + cur->bc_trcops->trace(func, cur, XBT_ARGS, XFS_BTREE_KTRACE_ARGBI, line, + (__psunsigned_t)b, i, 0, 0, + 0, 0, 0, 0, + 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for a buffer & 2 integer args. 
+ */ +STATIC void +xfs_btree_trace_argbii( + const char *func, + xfs_btree_cur_t *cur, + xfs_buf_t *b, + int i0, + int i1, + int line) +{ + cur->bc_trcops->trace(func, cur, XBT_ARGS, XFS_BTREE_KTRACE_ARGBII, line, + (__psunsigned_t)b, i0, i1, 0, + 0, 0, 0, 0, + 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for int, ptr, key. + */ +STATIC void +xfs_btree_trace_argipk( + const char *func, + xfs_btree_cur_t *cur, + int i, + xfs_btree_ptr_t *p, + xfs_btree_key_t *k, + int line) +{ + __uint64_t v = 0, u = 0; + if (XFS_BTREE_LONG_PTRS(cur->bc_btnum)) { + u = be64_to_cpu(p->u.l); + v = be64_to_cpu(k->u.l); + } else { + u = be32_to_cpu(p->u.s); + v = be32_to_cpu(k->u.s); + } + + cur->bc_trcops->trace(func, cur, XBT_ARGS, XFS_BTREE_KTRACE_ARGIPK, + line, i, u >> 32, (int)u, + v >> 32, (int)v, 0, 0, 0, + 0, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for int, ptr, rec. + */ +STATIC void +xfs_btree_trace_argipr( + const char *func, + xfs_btree_cur_t *cur, + int i, + xfs_btree_ptr_t *p, + xfs_btree_rec_t *r, + int line) +{ + __uint64_t l0 = 0, l1 = 0, l2 = 0; + __uint64_t d; + + if (XFS_BTREE_LONG_PTRS(cur->bc_btnum)) + d = be64_to_cpu(p->u.l); + else + d = be32_to_cpu(p->u.s); + + if (cur->bc_trcops->record) + cur->bc_trcops->record(cur, r, &l0, &l1, &l2); + + cur->bc_trcops->trace(func, cur, XBT_ARGS, XFS_BTREE_KTRACE_ARGIPR, line, + i, d >> 32, (int)d, l0 >> 32, + (int)l0, l1 >> 32, (int)l1, l2 >> 32, + (int)l2, 0, 0); +} + +/* + * Add a trace buffer entry for arguments, for int, key. + */ +STATIC void +xfs_btree_trace_argik( + const char *func, + xfs_btree_cur_t *cur, + int i, + xfs_btree_key_t *k, + int line) +{ + __uint64_t v = 0; + + if (XFS_BTREE_LONG_PTRS(cur->bc_btnum)) + v = be64_to_cpu(k->u.l); + else + v = be32_to_cpu(k->u.s); + + cur->bc_trcops->trace(func, cur, XBT_ARGS, XFS_BTREE_KTRACE_ARGIPK, line, + i, 0, 0, v >> 32, (int)v, + 0, 0, 0, + 0, 0, 0); +} + +/* + * Add a trace buffer entry for the cursor/operation. 
+ */ +STATIC void +xfs_btree_trace_cursor( + const char *func, + xfs_btree_cur_t *cur, + char *s, + int line) +{ + __uint32_t s0 = 0; + __uint64_t l0 = 0, l1 = 0; + + if (cur->bc_trcops->cursor) + cur->bc_trcops->cursor(cur, &s0, &l0, &l1); + + cur->bc_trcops->enter(func, cur, s, XFS_BTREE_KTRACE_CUR, line, + (cur->bc_nlevels << 24) | s0, + l0 >> 32, (int)l0, + l1 >> 32, (int)l1, + (unsigned long)cur->bc_bufs[0], (unsigned long)cur->bc_bufs[1], + (unsigned long)cur->bc_bufs[2], (unsigned long)cur->bc_bufs[3], + (cur->bc_ptrs[0] << 16) | cur->bc_ptrs[1], + (cur->bc_ptrs[2] << 16) | cur->bc_ptrs[3]); +} + + + Index: 2.6.x-xfs-new/fs/xfs/xfs_ialloc.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_ialloc.c 2007-10-16 08:52:58.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_ialloc.c 2007-11-06 19:40:29.762667351 +1100 @@ -322,7 +322,7 @@ xfs_ialloc_ag_alloc( return error; } ASSERT(i == 0); - if ((error = xfs_inobt_insert(cur, &i))) { + if ((error = xfs_btree_insert(cur, &i))) { xfs_btree_del_cursor(cur, XFS_BTREE_ERROR); return error; } @@ -673,7 +673,7 @@ nextag: goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); freecount += rec.ir_freecount; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; } while (i == 1); @@ -717,7 +717,7 @@ nextag: /* * Search left with tcur, back up 1 record. */ - if ((error = xfs_inobt_decrement(tcur, 0, &i))) + if ((error = xfs_btree_decrement(tcur, 0, &i))) goto error1; doneleft = !i; if (!doneleft) { @@ -731,7 +731,7 @@ nextag: /* * Search right with cur, go forward 1 record. */ - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error1; doneright = !i; if (!doneright) { @@ -793,7 +793,7 @@ nextag: * further left. 
*/ if (useleft) { - if ((error = xfs_inobt_decrement(tcur, 0, + if ((error = xfs_btree_decrement(tcur, 0, &i))) goto error1; doneleft = !i; @@ -813,7 +813,7 @@ nextag: * further right. */ else { - if ((error = xfs_inobt_increment(cur, 0, + if ((error = xfs_btree_increment(cur, 0, &i))) goto error1; doneright = !i; @@ -868,7 +868,7 @@ nextag: XFS_WANT_CORRUPTED_GOTO(i == 1, error0); if (rec.ir_freecount > 0) break; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); } @@ -902,7 +902,7 @@ nextag: goto error0; XFS_WANT_CORRUPTED_GOTO(i == 1, error0); freecount += rec.ir_freecount; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; } while (i == 1); ASSERT(freecount == be32_to_cpu(agi->agi_freecount) || @@ -1012,7 +1012,7 @@ xfs_difree( goto error0; if (i) { freecount += rec.ir_freecount; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; } } while (i == 1); @@ -1074,8 +1074,8 @@ xfs_difree( xfs_trans_mod_sb(tp, XFS_TRANS_SB_ICOUNT, -ilen); xfs_trans_mod_sb(tp, XFS_TRANS_SB_IFREE, -(ilen - 1)); - if ((error = xfs_inobt_delete(cur, &i))) { - cmn_err(CE_WARN, "xfs_difree: xfs_inobt_delete returned an error %d on %s.\n", + if ((error = xfs_btree_delete(cur, &i))) { + cmn_err(CE_WARN, "xfs_difree: xfs_btree_delete returned an error %d on %s.\n", error, mp->m_fsname); goto error0; } @@ -1117,7 +1117,7 @@ xfs_difree( goto error0; if (i) { freecount += rec.ir_freecount; - if ((error = xfs_inobt_increment(cur, 0, &i))) + if ((error = xfs_btree_increment(cur, 0, &i))) goto error0; } } while (i == 1); Index: 2.6.x-xfs-new/fs/xfs/xfs_ialloc_btree.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_ialloc_btree.c 2007-06-05 22:12:50.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_ialloc_btree.c 2007-11-06 
19:40:29.770666321 +1100 @@ -39,711 +39,132 @@ #include "xfs_alloc.h" #include "xfs_error.h" -STATIC void xfs_inobt_log_block(xfs_trans_t *, xfs_buf_t *, int); -STATIC void xfs_inobt_log_keys(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_inobt_log_ptrs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC void xfs_inobt_log_recs(xfs_btree_cur_t *, xfs_buf_t *, int, int); -STATIC int xfs_inobt_lshift(xfs_btree_cur_t *, int, int *); -STATIC int xfs_inobt_newroot(xfs_btree_cur_t *, int *); -STATIC int xfs_inobt_rshift(xfs_btree_cur_t *, int, int *); -STATIC int xfs_inobt_split(xfs_btree_cur_t *, int, xfs_agblock_t *, - xfs_inobt_key_t *, xfs_btree_cur_t **, int *); -STATIC int xfs_inobt_updkey(xfs_btree_cur_t *, xfs_inobt_key_t *, int); /* - * Single level of the xfs_inobt_delete record deletion routine. - * Delete record pointed to by cur/level. - * Remove the record from its block then rebalance the tree. - * Return 0 for error, 1 for done, 2 to go on to the next level. + * Get the block pointer for the given level of the cursor. + * Fill in the buffer pointer, if applicable. */ -STATIC int /* error */ -xfs_inobt_delrec( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level removing record from */ - int *stat) /* fail/done/go-on */ +STATIC xfs_btree_block_t * +xfs_inobt_get_block( + xfs_btree_cur_t *cur, + int level, + xfs_buf_t **bpp) { - xfs_buf_t *agbp; /* buffer for a.g. 
inode header */ - xfs_mount_t *mp; /* mount structure */ - xfs_agi_t *agi; /* allocation group inode header */ - xfs_inobt_block_t *block; /* btree block record/key lives in */ - xfs_agblock_t bno; /* btree block number */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop index */ - xfs_inobt_key_t key; /* kp points here if block is level 0 */ - xfs_inobt_key_t *kp = NULL; /* pointer to btree keys */ - xfs_agblock_t lbno; /* left block's block number */ - xfs_buf_t *lbp; /* left block's buffer pointer */ - xfs_inobt_block_t *left; /* left btree block */ - xfs_inobt_key_t *lkp; /* left block key pointer */ - xfs_inobt_ptr_t *lpp; /* left block address pointer */ - int lrecs = 0; /* number of records in left block */ - xfs_inobt_rec_t *lrp; /* left block record pointer */ - xfs_inobt_ptr_t *pp = NULL; /* pointer to btree addresses */ - int ptr; /* index in btree block for this rec */ - xfs_agblock_t rbno; /* right block's block number */ - xfs_buf_t *rbp; /* right block's buffer pointer */ - xfs_inobt_block_t *right; /* right btree block */ - xfs_inobt_key_t *rkp; /* right block key pointer */ - xfs_inobt_rec_t *rp; /* pointer to btree records */ - xfs_inobt_ptr_t *rpp; /* right block address pointer */ - int rrecs = 0; /* number of records in right block */ - int numrecs; - xfs_inobt_rec_t *rrp; /* right block record pointer */ - xfs_btree_cur_t *tcur; /* temporary btree cursor */ + ASSERT(level < cur->bc_nlevels); + *bpp = cur->bc_bufs[level]; + return (xfs_btree_block_t *)XFS_BUF_TO_INOBT_BLOCK(*bpp); +} - mp = cur->bc_mp; - /* - * Get the index of the entry being deleted, check for nothing there. - */ - ptr = cur->bc_ptrs[level]; - if (ptr == 0) { - *stat = 0; - return 0; - } - - /* - * Get the buffer & block containing the record or key/ptr. 
- */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - /* - * Fail if we're off the end of the block. - */ +STATIC int +xfs_inobt_get_buf( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr, + int flags, + xfs_buf_t **bpp) +{ + xfs_buf_t *bp; - numrecs = be16_to_cpu(block->bb_numrecs); - if (ptr > numrecs) { - *stat = 0; - return 0; - } - /* - * It's a nonleaf. Excise the key and ptr being deleted, by - * sliding the entries past them down one. - * Log the changed areas of the block. - */ - if (level > 0) { - kp = XFS_INOBT_KEY_ADDR(block, 1, cur); - pp = XFS_INOBT_PTR_ADDR(block, 1, cur); -#ifdef DEBUG - for (i = ptr; i < numrecs; i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(pp[i]), level))) - return error; - } -#endif - if (ptr < numrecs) { - memmove(&kp[ptr - 1], &kp[ptr], - (numrecs - ptr) * sizeof(*kp)); - memmove(&pp[ptr - 1], &pp[ptr], - (numrecs - ptr) * sizeof(*kp)); - xfs_inobt_log_keys(cur, bp, ptr, numrecs - 1); - xfs_inobt_log_ptrs(cur, bp, ptr, numrecs - 1); - } - } - /* - * It's a leaf. Excise the record being deleted, by sliding the - * entries past it down one. Log the changed areas of the block. - */ - else { - rp = XFS_INOBT_REC_ADDR(block, 1, cur); - if (ptr < numrecs) { - memmove(&rp[ptr - 1], &rp[ptr], - (numrecs - ptr) * sizeof(*rp)); - xfs_inobt_log_recs(cur, bp, ptr, numrecs - 1); - } - /* - * If it's the first record in the block, we'll need a key - * structure to pass up to the next level (updkey). - */ - if (ptr == 1) { - key.ir_startino = rp->ir_startino; - kp = &key; - } - } - /* - * Decrement and log the number of entries in the block. - */ - numrecs--; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_inobt_log_block(cur->bc_tp, bp, XFS_BB_NUMRECS); - /* - * Is this the root level? If so, we're almost done. 
- */ - if (level == cur->bc_nlevels - 1) { - /* - * If this is the root level, - * and there's only one entry left, - * and it's NOT the leaf level, - * then we can get rid of this level. - */ - if (numrecs == 1 && level > 0) { - agbp = cur->bc_private.i.agbp; - agi = XFS_BUF_TO_AGI(agbp); - /* - * pp is still set to the first pointer in the block. - * Make it the new root of the btree. - */ - bno = be32_to_cpu(agi->agi_root); - agi->agi_root = *pp; - be32_add(&agi->agi_level, -1); - /* - * Free the block. - */ - if ((error = xfs_free_extent(cur->bc_tp, - XFS_AGB_TO_FSB(mp, cur->bc_private.i.agno, bno), 1))) - return error; - xfs_trans_binval(cur->bc_tp, bp); - xfs_ialloc_log_agi(cur->bc_tp, agbp, - XFS_AGI_ROOT | XFS_AGI_LEVEL); - /* - * Update the cursor so there's one fewer level. - */ - cur->bc_bufs[level] = NULL; - cur->bc_nlevels--; - } else if (level > 0 && - (error = xfs_inobt_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * If we deleted the leftmost entry in the block, update the - * key values above us in the tree. - */ - if (ptr == 1 && (error = xfs_inobt_updkey(cur, kp, level + 1))) - return error; - /* - * If the number of records remaining in the block is at least - * the minimum, we're done. - */ - if (numrecs >= XFS_INOBT_BLOCK_MINRECS(level, cur)) { - if (level > 0 && - (error = xfs_inobt_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * Otherwise, we have to move some records around to keep the - * tree balanced. Look at the left and right sibling blocks to - * see if we can re-balance by moving only one record. - */ - rbno = be32_to_cpu(block->bb_rightsib); - lbno = be32_to_cpu(block->bb_leftsib); - bno = NULLAGBLOCK; - ASSERT(rbno != NULLAGBLOCK || lbno != NULLAGBLOCK); - /* - * Duplicate the cursor so our btree manipulations here won't - * disrupt the next level up. 
- */ - if ((error = xfs_btree_dup_cursor(cur, &tcur))) - return error; - /* - * If there's a right sibling, see if it's ok to shift an entry - * out of it. - */ - if (rbno != NULLAGBLOCK) { - /* - * Move the temp cursor to the last entry in the next block. - * Actually any entry but the first would suffice. - */ - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - if ((error = xfs_inobt_increment(tcur, level, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - i = xfs_btree_lastrec(tcur, level); - XFS_WANT_CORRUPTED_GOTO(i == 1, error0); - /* - * Grab a pointer to the block. - */ - rbp = tcur->bc_bufs[level]; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - goto error0; -#endif - /* - * Grab the current block number, for future use. - */ - bno = be32_to_cpu(right->bb_leftsib); - /* - * If right block is full enough so that removing one entry - * won't make it too empty, and left-shifting an entry out - * of right to us works, we're done. - */ - if (be16_to_cpu(right->bb_numrecs) - 1 >= - XFS_INOBT_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_inobt_lshift(tcur, level, &i))) - goto error0; - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_INOBT_BLOCK_MINRECS(level, cur)); - xfs_btree_del_cursor(tcur, - XFS_BTREE_NOERROR); - if (level > 0 && - (error = xfs_inobt_decrement(cur, level, - &i))) - return error; - *stat = 1; - return 0; - } - } - /* - * Otherwise, grab the number of records in right for - * future reference, and fix up the temp cursor to point - * to our block again (last record). - */ - rrecs = be16_to_cpu(right->bb_numrecs); - if (lbno != NULLAGBLOCK) { - xfs_btree_firstrec(tcur, level); - if ((error = xfs_inobt_decrement(tcur, level, &i))) - goto error0; - } - } - /* - * If there's a left sibling, see if it's ok to shift an entry - * out of it. 
- */ - if (lbno != NULLAGBLOCK) { - /* - * Move the temp cursor to the first entry in the - * previous block. - */ - xfs_btree_firstrec(tcur, level); - if ((error = xfs_inobt_decrement(tcur, level, &i))) - goto error0; - xfs_btree_firstrec(tcur, level); - /* - * Grab a pointer to the block. - */ - lbp = tcur->bc_bufs[level]; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - goto error0; -#endif - /* - * Grab the current block number, for future use. - */ - bno = be32_to_cpu(left->bb_rightsib); - /* - * If left block is full enough so that removing one entry - * won't make it too empty, and right-shifting an entry out - * of left to us works, we're done. - */ - if (be16_to_cpu(left->bb_numrecs) - 1 >= - XFS_INOBT_BLOCK_MINRECS(level, cur)) { - if ((error = xfs_inobt_rshift(tcur, level, &i))) - goto error0; - if (i) { - ASSERT(be16_to_cpu(block->bb_numrecs) >= - XFS_INOBT_BLOCK_MINRECS(level, cur)); - xfs_btree_del_cursor(tcur, - XFS_BTREE_NOERROR); - if (level == 0) - cur->bc_ptrs[0]++; - *stat = 1; - return 0; - } - } - /* - * Otherwise, grab the number of records in right for - * future reference. - */ - lrecs = be16_to_cpu(left->bb_numrecs); - } - /* - * Delete the temp cursor, we're done with it. - */ - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - /* - * If here, we need to do a join to keep the tree balanced. - */ - ASSERT(bno != NULLAGBLOCK); - /* - * See if we can join with the left neighbor block. - */ - if (lbno != NULLAGBLOCK && - lrecs + numrecs <= XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - /* - * Set "right" to be the starting block, - * "left" to be the left neighbor. 
- */ - rbno = bno; - right = block; - rrecs = be16_to_cpu(right->bb_numrecs); - rbp = bp; - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.i.agno, lbno, 0, &lbp, - XFS_INO_BTREE_REF))) - return error; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); - lrecs = be16_to_cpu(left->bb_numrecs); - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; - } - /* - * If that won't work, see if we can join with the right neighbor block. - */ - else if (rbno != NULLAGBLOCK && - rrecs + numrecs <= XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - /* - * Set "left" to be the starting block, - * "right" to be the right neighbor. - */ - lbno = bno; - left = block; - lrecs = be16_to_cpu(left->bb_numrecs); - lbp = bp; - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.i.agno, rbno, 0, &rbp, - XFS_INO_BTREE_REF))) - return error; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); - rrecs = be16_to_cpu(right->bb_numrecs); - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; - } - /* - * Otherwise, we can't fix the imbalance. - * Just return. This is probably a logic error, but it's not fatal. - */ - else { - if (level > 0 && (error = xfs_inobt_decrement(cur, level, &i))) - return error; - *stat = 1; - return 0; - } - /* - * We're now going to join "left" and "right" by moving all the stuff - * in "right" to "left" and deleting "right". - */ - if (level > 0) { - /* - * It's a non-leaf. Move keys and pointers. 
- */ - lkp = XFS_INOBT_KEY_ADDR(left, lrecs + 1, cur); - lpp = XFS_INOBT_PTR_ADDR(left, lrecs + 1, cur); - rkp = XFS_INOBT_KEY_ADDR(right, 1, cur); - rpp = XFS_INOBT_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < rrecs; i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i]), level))) - return error; - } -#endif - memcpy(lkp, rkp, rrecs * sizeof(*lkp)); - memcpy(lpp, rpp, rrecs * sizeof(*lpp)); - xfs_inobt_log_keys(cur, lbp, lrecs + 1, lrecs + rrecs); - xfs_inobt_log_ptrs(cur, lbp, lrecs + 1, lrecs + rrecs); - } else { - /* - * It's a leaf. Move records. - */ - lrp = XFS_INOBT_REC_ADDR(left, lrecs + 1, cur); - rrp = XFS_INOBT_REC_ADDR(right, 1, cur); - memcpy(lrp, rrp, rrecs * sizeof(*lrp)); - xfs_inobt_log_recs(cur, lbp, lrecs + 1, lrecs + rrecs); - } - /* - * If we joined with the left neighbor, set the buffer in the - * cursor to the left block, and fix up the index. - */ - if (bp != lbp) { - xfs_btree_setbuf(cur, level, lbp); - cur->bc_ptrs[level] += lrecs; - } - /* - * If we joined with the right neighbor and there's a level above - * us, increment the cursor at that level. - */ - else if (level + 1 < cur->bc_nlevels && - (error = xfs_alloc_increment(cur, level + 1, &i))) - return error; - /* - * Fix up the number of records in the surviving block. - */ - lrecs += rrecs; - left->bb_numrecs = cpu_to_be16(lrecs); - /* - * Fix up the right block pointer in the surviving block, and log it. - */ - left->bb_rightsib = right->bb_rightsib; - xfs_inobt_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); - /* - * If there is a right sibling now, make it point to the - * remaining block. 
- */ - if (be32_to_cpu(left->bb_rightsib) != NULLAGBLOCK) { - xfs_inobt_block_t *rrblock; - xfs_buf_t *rrbp; - - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - cur->bc_private.i.agno, be32_to_cpu(left->bb_rightsib), 0, - &rrbp, XFS_INO_BTREE_REF))) - return error; - rrblock = XFS_BUF_TO_INOBT_BLOCK(rrbp); - if ((error = xfs_btree_check_sblock(cur, rrblock, level, rrbp))) - return error; - rrblock->bb_leftsib = cpu_to_be32(lbno); - xfs_inobt_log_block(cur->bc_tp, rrbp, XFS_BB_LEFTSIB); - } - /* - * Free the deleting block. - */ - if ((error = xfs_free_extent(cur->bc_tp, XFS_AGB_TO_FSB(mp, - cur->bc_private.i.agno, rbno), 1))) - return error; - xfs_trans_binval(cur->bc_tp, rbp); - /* - * Readjust the ptr at this level if it's not a leaf, since it's - * still pointing at the deletion point, which makes the cursor - * inconsistent. If this makes the ptr 0, the caller fixes it up. - * We can't use decrement because it would change the next level up. - */ - if (level > 0) - cur->bc_ptrs[level]--; - /* - * Return value means the next level up has something to do. - */ - *stat = 2; + bp = xfs_btree_get_bufs(cur->bc_mp, cur->bc_tp, cur->bc_private.i.agno, + be32_to_cpu(ptr->u.inobt), flags); + *bpp = bp; return 0; -error0: - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; } -/* - * Insert one record/level. Return information to the caller - * allowing the next level up to proceed if necessary. 
- */ -STATIC int /* error */ -xfs_inobt_insrec( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to insert record at */ - xfs_agblock_t *bnop, /* i/o: block number inserted */ - xfs_inobt_rec_t *recp, /* i/o: record data inserted */ - xfs_btree_cur_t **curp, /* output: new cursor replacing cur */ - int *stat) /* success/failure */ +STATIC int +xfs_inobt_read_buf( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr, + int flags, + xfs_buf_t **bpp) { - xfs_inobt_block_t *block; /* btree block record/key lives in */ - xfs_buf_t *bp; /* buffer for block */ - int error; /* error return value */ - int i; /* loop index */ - xfs_inobt_key_t key; /* key value being inserted */ - xfs_inobt_key_t *kp=NULL; /* pointer to btree keys */ - xfs_agblock_t nbno; /* block number of allocated block */ - xfs_btree_cur_t *ncur; /* new cursor to be used at next lvl */ - xfs_inobt_key_t nkey; /* new key value, from split */ - xfs_inobt_rec_t nrec; /* new record value, for caller */ - int numrecs; - int optr; /* old ptr value */ - xfs_inobt_ptr_t *pp; /* pointer to btree addresses */ - int ptr; /* index in btree block for this rec */ - xfs_inobt_rec_t *rp=NULL; /* pointer to btree records */ + return xfs_btree_read_bufs(cur->bc_mp, + cur->bc_tp, cur->bc_private.i.agno, + be32_to_cpu(ptr->u.inobt), flags, + bpp, XFS_INO_BTREE_REF); +} - /* - * GCC doesn't understand the (arguably complex) control flow in - * this function and complains about uninitialized structure fields - * without this. - */ - memset(&nrec, 0, sizeof(nrec)); +STATIC xfs_btree_block_t * +xfs_inobt_buf_to_block( + xfs_btree_cur_t *cur, + xfs_buf_t *bp) +{ + /* XFS_BUF_TO_INOBT_BLOCK(rbp); */ + return XFS_BUF_TO_BLOCK(bp); +} - /* - * If we made it to the root level, allocate a new root block - * and we're done. 
- */ - if (level >= cur->bc_nlevels) { - error = xfs_inobt_newroot(cur, &i); - *bnop = NULLAGBLOCK; - *stat = i; +STATIC void +xfs_inobt_buf_to_ptr( + xfs_btree_cur_t *cur, + xfs_buf_t *bp, + xfs_btree_ptr_t *ptr) +{ + ptr->u.inobt = cpu_to_be32(XFS_DADDR_TO_AGBNO(cur->bc_mp, XFS_BUF_ADDR(bp))); +} + +STATIC int +xfs_inobt_alloc_block( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *start, + xfs_btree_ptr_t *new, + int length, + int *stat) +{ + xfs_alloc_arg_t args; /* block allocation args */ + int error; /* error return value */ + xfs_agblock_t sbno = be32_to_cpu(start->u.inobt); + + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + memset(&args, 0, sizeof(args)); + args.tp = cur->bc_tp; + args.mp = cur->bc_mp; + args.fsbno = XFS_AGB_TO_FSB(args.mp, cur->bc_private.i.agno, sbno); + args.mod = args.minleft = args.alignment = args.total = args.wasdel = + args.isfl = args.userdata = args.minalignslop = 0; + args.minlen = args.maxlen = args.prod = 1; + args.type = XFS_ALLOCTYPE_NEAR_BNO; + + error = xfs_alloc_vextent(&args); + if (error) { + XFS_BTREE_TRACE_CURSOR(cur, ERROR); return error; } - /* - * Make a key out of the record data to be inserted, and save it. - */ - key.ir_startino = recp->ir_startino; - optr = ptr = cur->bc_ptrs[level]; - /* - * If we're off the left edge, return failure. - */ - if (ptr == 0) { + if (args.fsbno == NULLFSBLOCK) { + XFS_BTREE_TRACE_CURSOR(cur, EXIT); *stat = 0; return 0; } - /* - * Get pointers to the btree buffer and block. - */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); - numrecs = be16_to_cpu(block->bb_numrecs); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; - /* - * Check that the new entry is being inserted in the right place. 
- */ - if (ptr <= numrecs) { - if (level == 0) { - rp = XFS_INOBT_REC_ADDR(block, ptr, cur); - xfs_btree_check_rec(cur->bc_btnum, recp, rp); - } else { - kp = XFS_INOBT_KEY_ADDR(block, ptr, cur); - xfs_btree_check_key(cur->bc_btnum, &key, kp); - } - } -#endif - nbno = NULLAGBLOCK; - ncur = NULL; - /* - * If the block is full, we can't insert the new entry until we - * make the block un-full. - */ - if (numrecs == XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - /* - * First, try shifting an entry to the right neighbor. - */ - if ((error = xfs_inobt_rshift(cur, level, &i))) - return error; - if (i) { - /* nothing */ - } - /* - * Next, try shifting an entry to the left neighbor. - */ - else { - if ((error = xfs_inobt_lshift(cur, level, &i))) - return error; - if (i) { - optr = ptr = cur->bc_ptrs[level]; - } else { - /* - * Next, try splitting the current block - * in half. If this works we have to - * re-set our variables because - * we could be in a different block now. - */ - if ((error = xfs_inobt_split(cur, level, &nbno, - &nkey, &ncur, &i))) - return error; - if (i) { - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, - block, level, bp))) - return error; -#endif - ptr = cur->bc_ptrs[level]; - nrec.ir_startino = nkey.ir_startino; - } else { - /* - * Otherwise the insert fails. - */ - *stat = 0; - return 0; - } - } - } - } - /* - * At this point we know there's room for our new entry in the block - * we're pointing at. - */ - numrecs = be16_to_cpu(block->bb_numrecs); - if (level > 0) { - /* - * It's a non-leaf entry. Make a hole for the new data - * in the key and ptr regions of the block. 
- */ - kp = XFS_INOBT_KEY_ADDR(block, 1, cur); - pp = XFS_INOBT_PTR_ADDR(block, 1, cur); -#ifdef DEBUG - for (i = numrecs; i >= ptr; i--) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(pp[i - 1]), level))) - return error; - } -#endif - memmove(&kp[ptr], &kp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*kp)); - memmove(&pp[ptr], &pp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*pp)); - /* - * Now stuff the new data in, bump numrecs and log the new data. - */ -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, *bnop, level))) - return error; -#endif - kp[ptr - 1] = key; - pp[ptr - 1] = cpu_to_be32(*bnop); - numrecs++; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_inobt_log_keys(cur, bp, ptr, numrecs); - xfs_inobt_log_ptrs(cur, bp, ptr, numrecs); - } else { - /* - * It's a leaf entry. Make a hole for the new record. - */ - rp = XFS_INOBT_REC_ADDR(block, 1, cur); - memmove(&rp[ptr], &rp[ptr - 1], - (numrecs - ptr + 1) * sizeof(*rp)); - /* - * Now stuff the new record in, bump numrecs - * and log the new data. - */ - rp[ptr - 1] = *recp; - numrecs++; - block->bb_numrecs = cpu_to_be16(numrecs); - xfs_inobt_log_recs(cur, bp, ptr, numrecs); - } - /* - * Log the new number of records in the btree header. - */ - xfs_inobt_log_block(cur->bc_tp, bp, XFS_BB_NUMRECS); -#ifdef DEBUG - /* - * Check that the key/record is in the right place, now. - */ - if (ptr < numrecs) { - if (level == 0) - xfs_btree_check_rec(cur->bc_btnum, rp + ptr - 1, - rp + ptr); - else - xfs_btree_check_key(cur->bc_btnum, kp + ptr - 1, - kp + ptr); - } -#endif - /* - * If we inserted at the start of a block, update the parents' keys. - */ - if (optr == 1 && (error = xfs_inobt_updkey(cur, &key, level + 1))) - return error; - /* - * Return the new block number, if any. - * If there is one, give back a record value and a cursor too. 
- */ - *bnop = nbno; - if (nbno != NULLAGBLOCK) { - *recp = nrec; - *curp = ncur; - } + ASSERT(args.len == 1); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); + + new->u.inobt = cpu_to_be32(XFS_FSB_TO_AGBNO(args.mp, args.fsbno)); *stat = 1; return 0; } +STATIC int +xfs_inobt_free_block( + xfs_btree_cur_t *cur, + xfs_buf_t *bp, + int size) +{ + int error; + + error = xfs_free_extent(cur->bc_tp, + XFS_DADDR_TO_FSB(cur->bc_mp, XFS_BUF_ADDR(bp)), 1); + if (error) + return error; + xfs_trans_binval(cur->bc_tp, bp); + return 0; +} + /* - * Log header fields from a btree block. + * Log fields from the btree block header. */ STATIC void xfs_inobt_log_block( - xfs_trans_t *tp, /* transaction pointer */ + xfs_btree_cur_t *cur, /* btree cursor */ xfs_buf_t *bp, /* buffer containing btree block */ int fields) /* mask of fields: XFS_BB_... */ { @@ -758,1218 +179,514 @@ xfs_inobt_log_block( sizeof(xfs_inobt_block_t) }; + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGBI(cur, bp, fields); xfs_btree_offsets(fields, offsets, XFS_BB_NUM_BITS, &first, &last); - xfs_trans_log_buf(tp, bp, first, last); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); } -/* - * Log keys from a btree block (nonleaf). 
- */ -STATIC void -xfs_inobt_log_keys( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int kfirst, /* index of first key to log */ - int klast) /* index of last key to log */ +static const struct xfs_btree_block_ops xfs_inobt_blkops = { + .get_buf = xfs_inobt_get_buf, + .read_buf = xfs_inobt_read_buf, + .get_block = xfs_inobt_get_block, + .buf_to_block = xfs_inobt_buf_to_block, + .buf_to_ptr = xfs_inobt_buf_to_ptr, + .log_block = xfs_inobt_log_block, + .check_block = xfs_btree_check_sblock, + + .alloc_block = xfs_inobt_alloc_block, + .free_block = xfs_inobt_free_block, + + .get_sibling = xfs_btree_get_ssibling, + .set_sibling = xfs_btree_set_ssibling, + .init_sibling = xfs_btree_init_sibling, +}; + +STATIC int +xfs_inobt_get_minrecs( + xfs_btree_cur_t *cur, + int lev) { - xfs_inobt_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - xfs_inobt_key_t *kp; /* key pointer in btree block */ - int last; /* last byte offset logged */ + return cur->bc_mp->m_inobt_mnr[lev != 0]; +} - block = XFS_BUF_TO_INOBT_BLOCK(bp); - kp = XFS_INOBT_KEY_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); +STATIC int +xfs_inobt_get_maxrecs( + xfs_btree_cur_t *cur, + int lev) +{ + return cur->bc_mp->m_inobt_mxr[lev != 0]; +} + +STATIC int +xfs_btree_get_numrecs( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block) +{ + return be16_to_cpu(block->bb_h.bb_numrecs); } -/* - * Log block pointer fields from a btree block (nonleaf). 
- */ STATIC void -xfs_inobt_log_ptrs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int pfirst, /* index of first pointer to log */ - int plast) /* index of last pointer to log */ +xfs_btree_set_numrecs( + xfs_btree_cur_t *cur, + xfs_btree_block_t *block, + int numrecs) { - xfs_inobt_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - xfs_inobt_ptr_t *pp; /* block-pointer pointer in btree blk */ + block->bb_h.bb_numrecs = cpu_to_be16(numrecs); +} - block = XFS_BUF_TO_INOBT_BLOCK(bp); - pp = XFS_INOBT_PTR_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); +STATIC void +xfs_inobt_init_key_from_rec( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key, + xfs_btree_rec_t *rec) +{ + key->u.inobt.ir_startino = rec->u.inobt.ir_startino; } /* - * Log records from a btree block (leaf). + * initial value of ptr for lookup */ STATIC void -xfs_inobt_log_recs( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_buf_t *bp, /* buffer containing btree block */ - int rfirst, /* index of first record to log */ - int rlast) /* index of last record to log */ +xfs_inobt_init_ptr_from_cur( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr) { - xfs_inobt_block_t *block; /* btree block to log from */ - int first; /* first byte offset logged */ - int last; /* last byte offset logged */ - xfs_inobt_rec_t *rp; /* record pointer for btree block */ + xfs_agi_t *agi; /* a.g.
inode header */ - block = XFS_BUF_TO_INOBT_BLOCK(bp); - rp = XFS_INOBT_REC_ADDR(block, 1, cur); - first = (int)((xfs_caddr_t)&rp[rfirst - 1] - (xfs_caddr_t)block); - last = (int)(((xfs_caddr_t)&rp[rlast] - 1) - (xfs_caddr_t)block); - xfs_trans_log_buf(cur->bc_tp, bp, first, last); + agi = XFS_BUF_TO_AGI(cur->bc_private.i.agbp); + ASSERT(cur->bc_private.i.agno == be32_to_cpu(agi->agi_seqno)); + + ptr->u.inobt = agi->agi_root; } -/* - * Lookup the record. The cursor is made to point to it, based on dir. - * Return 0 if can't find any such record, 1 for success. - */ -STATIC int /* error */ -xfs_inobt_lookup( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_lookup_t dir, /* <=, ==, or >= */ - int *stat) /* success/failure */ +STATIC void +xfs_inobt_init_rec_from_key( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key, + xfs_btree_rec_t *rec) { - xfs_agblock_t agbno; /* a.g. relative btree block number */ - xfs_agnumber_t agno; /* allocation group number */ - xfs_inobt_block_t *block=NULL; /* current btree block */ - __int64_t diff; /* difference for the current key */ - int error; /* error return value */ - int keyno=0; /* current key number */ - int level; /* level in the btree */ - xfs_mount_t *mp; /* file system mount point */ - - /* - * Get the allocation group header, and the root block number. - */ - mp = cur->bc_mp; - { - xfs_agi_t *agi; /* a.g. inode header */ - - agi = XFS_BUF_TO_AGI(cur->bc_private.i.agbp); - agno = be32_to_cpu(agi->agi_seqno); - agbno = be32_to_cpu(agi->agi_root); - } - /* - * Iterate over each level in the btree, starting at the root. - * For each level above the leaves, find the key we need, based - * on the lookup record, then follow the corresponding block - * pointer down to the next level. - */ - for (level = cur->bc_nlevels - 1, diff = 1; level >= 0; level--) { - xfs_buf_t *bp; /* buffer pointer for btree block */ - xfs_daddr_t d; /* disk address of btree block */ - - /* - * Get the disk address we're looking for. 
- */ - d = XFS_AGB_TO_DADDR(mp, agno, agbno); - /* - * If the old buffer at this level is for a different block, - * throw it away, otherwise just use it. - */ - bp = cur->bc_bufs[level]; - if (bp && XFS_BUF_ADDR(bp) != d) - bp = NULL; - if (!bp) { - /* - * Need to get a new buffer. Read it, then - * set it in the cursor, releasing the old one. - */ - if ((error = xfs_btree_read_bufs(mp, cur->bc_tp, - agno, agbno, 0, &bp, XFS_INO_BTREE_REF))) - return error; - xfs_btree_setbuf(cur, level, bp); - /* - * Point to the btree block, now that we have the buffer - */ - block = XFS_BUF_TO_INOBT_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, level, - bp))) - return error; - } else - block = XFS_BUF_TO_INOBT_BLOCK(bp); - /* - * If we already had a key match at a higher level, we know - * we need to use the first entry in this block. - */ - if (diff == 0) - keyno = 1; - /* - * Otherwise we need to search this block. Do a binary search. - */ - else { - int high; /* high entry number */ - xfs_inobt_key_t *kkbase=NULL;/* base of keys in block */ - xfs_inobt_rec_t *krbase=NULL;/* base of records in block */ - int low; /* low entry number */ - - /* - * Get a pointer to keys or records. - */ - if (level > 0) - kkbase = XFS_INOBT_KEY_ADDR(block, 1, cur); - else - krbase = XFS_INOBT_REC_ADDR(block, 1, cur); - /* - * Set low and high entry numbers, 1-based. - */ - low = 1; - if (!(high = be16_to_cpu(block->bb_numrecs))) { - /* - * If the block is empty, the tree must - * be an empty leaf. - */ - ASSERT(level == 0 && cur->bc_nlevels == 1); - cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; - *stat = 0; - return 0; - } - /* - * Binary search the block. - */ - while (low <= high) { - xfs_agino_t startino; /* key value */ - - /* - * keyno is average of low and high. - */ - keyno = (low + high) >> 1; - /* - * Get startino. 
- */ - if (level > 0) { - xfs_inobt_key_t *kkp; - - kkp = kkbase + keyno - 1; - startino = be32_to_cpu(kkp->ir_startino); - } else { - xfs_inobt_rec_t *krp; - - krp = krbase + keyno - 1; - startino = be32_to_cpu(krp->ir_startino); - } - /* - * Compute difference to get next direction. - */ - diff = (__int64_t) - startino - cur->bc_rec.i.ir_startino; - /* - * Less than, move right. - */ - if (diff < 0) - low = keyno + 1; - /* - * Greater than, move left. - */ - else if (diff > 0) - high = keyno - 1; - /* - * Equal, we're done. - */ - else - break; - } - } - /* - * If there are more levels, set up for the next level - * by getting the block number and filling in the cursor. - */ - if (level > 0) { - /* - * If we moved left, need the previous key number, - * unless there isn't one. - */ - if (diff > 0 && --keyno < 1) - keyno = 1; - agbno = be32_to_cpu(*XFS_INOBT_PTR_ADDR(block, keyno, cur)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, agbno, level))) - return error; -#endif - cur->bc_ptrs[level] = keyno; - } - } - /* - * Done with the search. - * See if we need to adjust the results. - */ - if (dir != XFS_LOOKUP_LE && diff < 0) { - keyno++; - /* - * If ge search and we went off the end of the block, but it's - * not the last block, we're in the wrong block. - */ - if (dir == XFS_LOOKUP_GE && - keyno > be16_to_cpu(block->bb_numrecs) && - be32_to_cpu(block->bb_rightsib) != NULLAGBLOCK) { - int i; - - cur->bc_ptrs[0] = keyno; - if ((error = xfs_inobt_increment(cur, 0, &i))) - return error; - ASSERT(i == 1); - *stat = 1; - return 0; - } - } - else if (dir == XFS_LOOKUP_LE && diff > 0) - keyno--; - cur->bc_ptrs[0] = keyno; - /* - * Return if we succeeded or not. - */ - if (keyno == 0 || keyno > be16_to_cpu(block->bb_numrecs)) - *stat = 0; - else - *stat = ((dir != XFS_LOOKUP_EQ) || (diff == 0)); - return 0; + rec->u.inobt.ir_startino = key->u.inobt.ir_startino; } -/* - * Move 1 record left from cur/level if possible. - * Update cur to reflect the new path. 
- */ -STATIC int /* error */ -xfs_inobt_lshift( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to shift record on */ - int *stat) /* success/failure */ +STATIC void +xfs_inobt_init_rec_from_cur( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec) { - int error; /* error return value */ -#ifdef DEBUG - int i; /* loop index */ -#endif - xfs_inobt_key_t key; /* key value for leaf level upward */ - xfs_buf_t *lbp; /* buffer for left neighbor block */ - xfs_inobt_block_t *left; /* left neighbor btree block */ - xfs_inobt_key_t *lkp=NULL; /* key pointer for left block */ - xfs_inobt_ptr_t *lpp; /* address pointer for left block */ - xfs_inobt_rec_t *lrp=NULL; /* record pointer for left block */ - int nrec; /* new number of left block entries */ - xfs_buf_t *rbp; /* buffer for right (current) block */ - xfs_inobt_block_t *right; /* right (current) btree block */ - xfs_inobt_key_t *rkp=NULL; /* key pointer for right block */ - xfs_inobt_ptr_t *rpp=NULL; /* address pointer for right block */ - xfs_inobt_rec_t *rrp=NULL; /* record pointer for right block */ + rec->u.inobt.ir_startino = cpu_to_be32(cur->bc_rec.i.ir_startino); + rec->u.inobt.ir_freecount = cpu_to_be32(cur->bc_rec.i.ir_freecount); + rec->u.inobt.ir_free = cpu_to_be64(cur->bc_rec.i.ir_free); +} - /* - * Set up variables for this block as "right". - */ - rbp = cur->bc_bufs[level]; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; -#endif - /* - * If we've got no left sibling then we can't shift an entry left. - */ - if (be32_to_cpu(right->bb_leftsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * If the cursor entry is the one that would be moved, don't - * do it... it's too complicated. - */ - if (cur->bc_ptrs[level] <= 1) { - *stat = 0; - return 0; - } - /* - * Set up the left neighbor as "left". 
- */ - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.i.agno, be32_to_cpu(right->bb_leftsib), - 0, &lbp, XFS_INO_BTREE_REF))) - return error; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; - /* - * If it's full, it can't take another entry. - */ - if (be16_to_cpu(left->bb_numrecs) == XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - *stat = 0; - return 0; - } - nrec = be16_to_cpu(left->bb_numrecs) + 1; - /* - * If non-leaf, copy a key and a ptr to the left block. - */ - if (level > 0) { - lkp = XFS_INOBT_KEY_ADDR(left, nrec, cur); - rkp = XFS_INOBT_KEY_ADDR(right, 1, cur); - *lkp = *rkp; - xfs_inobt_log_keys(cur, lbp, nrec, nrec); - lpp = XFS_INOBT_PTR_ADDR(left, nrec, cur); - rpp = XFS_INOBT_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(*rpp), level))) - return error; -#endif - *lpp = *rpp; - xfs_inobt_log_ptrs(cur, lbp, nrec, nrec); - } - /* - * If leaf, copy a record to the left block. - */ - else { - lrp = XFS_INOBT_REC_ADDR(left, nrec, cur); - rrp = XFS_INOBT_REC_ADDR(right, 1, cur); - *lrp = *rrp; - xfs_inobt_log_recs(cur, lbp, nrec, nrec); - } - /* - * Bump and log left's numrecs, decrement and log right's numrecs. - */ - be16_add(&left->bb_numrecs, 1); - xfs_inobt_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS); -#ifdef DEBUG - if (level > 0) - xfs_btree_check_key(cur->bc_btnum, lkp - 1, lkp); - else - xfs_btree_check_rec(cur->bc_btnum, lrp - 1, lrp); -#endif - be16_add(&right->bb_numrecs, -1); - xfs_inobt_log_block(cur->bc_tp, rbp, XFS_BB_NUMRECS); - /* - * Slide the contents of right down one entry. 
- */ - if (level > 0) { -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(right->bb_numrecs); i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i + 1]), - level))) - return error; - } -#endif - memmove(rkp, rkp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp, rpp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); - xfs_inobt_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - xfs_inobt_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - } else { - memmove(rrp, rrp + 1, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - xfs_inobt_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - key.ir_startino = rrp->ir_startino; - rkp = &key; - } - /* - * Update the parent key values of right. - */ - if ((error = xfs_inobt_updkey(cur, rkp, level + 1))) - return error; - /* - * Slide the cursor value left one. - */ - cur->bc_ptrs[level]--; - *stat = 1; - return 0; +STATIC xfs_btree_key_t * +xfs_inobt_key_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) +{ + return (xfs_btree_key_t *)XFS_INOBT_KEY_ADDR(&block->bb_h, index, cur); } -/* - * Allocate a new root block, fill it in. - */ -STATIC int /* error */ -xfs_inobt_newroot( - xfs_btree_cur_t *cur, /* btree cursor */ - int *stat) /* success/failure */ +STATIC xfs_btree_ptr_t * +xfs_inobt_ptr_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) { - xfs_agi_t *agi; /* a.g. 
inode header */ - xfs_alloc_arg_t args; /* allocation argument structure */ - xfs_inobt_block_t *block; /* one half of the old root block */ - xfs_buf_t *bp; /* buffer containing block */ - int error; /* error return value */ - xfs_inobt_key_t *kp; /* btree key pointer */ - xfs_agblock_t lbno; /* left block number */ - xfs_buf_t *lbp; /* left buffer pointer */ - xfs_inobt_block_t *left; /* left btree block */ - xfs_buf_t *nbp; /* new (root) buffer */ - xfs_inobt_block_t *new; /* new (root) btree block */ - int nptr; /* new value for key index, 1 or 2 */ - xfs_inobt_ptr_t *pp; /* btree address pointer */ - xfs_agblock_t rbno; /* right block number */ - xfs_buf_t *rbp; /* right buffer pointer */ - xfs_inobt_block_t *right; /* right btree block */ - xfs_inobt_rec_t *rp; /* btree record pointer */ + return (xfs_btree_ptr_t *)XFS_INOBT_PTR_ADDR(&block->bb_h, index, cur); +} - ASSERT(cur->bc_nlevels < XFS_IN_MAXLEVELS(cur->bc_mp)); +STATIC xfs_btree_rec_t * +xfs_inobt_rec_addr( + xfs_btree_cur_t *cur, + int index, + xfs_btree_block_t *block) +{ + return (xfs_btree_rec_t *)XFS_INOBT_REC_ADDR(&block->bb_h, index, cur); +} - /* - * Get a block & a buffer. - */ - agi = XFS_BUF_TO_AGI(cur->bc_private.i.agbp); - args.tp = cur->bc_tp; - args.mp = cur->bc_mp; - args.fsbno = XFS_AGB_TO_FSB(args.mp, cur->bc_private.i.agno, - be32_to_cpu(agi->agi_root)); - args.mod = args.minleft = args.alignment = args.total = args.wasdel = - args.isfl = args.userdata = args.minalignslop = 0; - args.minlen = args.maxlen = args.prod = 1; - args.type = XFS_ALLOCTYPE_NEAR_BNO; - if ((error = xfs_alloc_vextent(&args))) - return error; - /* - * None available, we fail. - */ - if (args.fsbno == NULLFSBLOCK) { - *stat = 0; - return 0; - } - ASSERT(args.len == 1); - nbp = xfs_btree_get_bufs(args.mp, args.tp, args.agno, args.agbno, 0); - new = XFS_BUF_TO_INOBT_BLOCK(nbp); - /* - * Set the root data in the a.g. inode structure. 
- */ - agi->agi_root = cpu_to_be32(args.agbno); - be32_add(&agi->agi_level, 1); - xfs_ialloc_log_agi(args.tp, cur->bc_private.i.agbp, - XFS_AGI_ROOT | XFS_AGI_LEVEL); - /* - * At the previous root level there are now two blocks: the old - * root, and the new block generated when it was split. - * We don't know which one the cursor is pointing at, so we - * set up variables "left" and "right" for each case. - */ - bp = cur->bc_bufs[cur->bc_nlevels - 1]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, cur->bc_nlevels - 1, bp))) - return error; -#endif - if (be32_to_cpu(block->bb_rightsib) != NULLAGBLOCK) { - /* - * Our block is left, pick up the right block. - */ - lbp = bp; - lbno = XFS_DADDR_TO_AGBNO(args.mp, XFS_BUF_ADDR(lbp)); - left = block; - rbno = be32_to_cpu(left->bb_rightsib); - if ((error = xfs_btree_read_bufs(args.mp, args.tp, args.agno, - rbno, 0, &rbp, XFS_INO_BTREE_REF))) - return error; - bp = rbp; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); - if ((error = xfs_btree_check_sblock(cur, right, - cur->bc_nlevels - 1, rbp))) - return error; - nptr = 1; - } else { - /* - * Our block is right, pick up the left block. - */ - rbp = bp; - rbno = XFS_DADDR_TO_AGBNO(args.mp, XFS_BUF_ADDR(rbp)); - right = block; - lbno = be32_to_cpu(right->bb_leftsib); - if ((error = xfs_btree_read_bufs(args.mp, args.tp, args.agno, - lbno, 0, &lbp, XFS_INO_BTREE_REF))) - return error; - bp = lbp; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); - if ((error = xfs_btree_check_sblock(cur, left, - cur->bc_nlevels - 1, lbp))) - return error; - nptr = 2; - } - /* - * Fill in the new block's btree header and log it. 
- */ - new->bb_magic = cpu_to_be32(xfs_magics[cur->bc_btnum]); - new->bb_level = cpu_to_be16(cur->bc_nlevels); - new->bb_numrecs = cpu_to_be16(2); - new->bb_leftsib = cpu_to_be32(NULLAGBLOCK); - new->bb_rightsib = cpu_to_be32(NULLAGBLOCK); - xfs_inobt_log_block(args.tp, nbp, XFS_BB_ALL_BITS); - ASSERT(lbno != NULLAGBLOCK && rbno != NULLAGBLOCK); - /* - * Fill in the key data in the new root. - */ - kp = XFS_INOBT_KEY_ADDR(new, 1, cur); - if (be16_to_cpu(left->bb_level) > 0) { - kp[0] = *XFS_INOBT_KEY_ADDR(left, 1, cur); - kp[1] = *XFS_INOBT_KEY_ADDR(right, 1, cur); - } else { - rp = XFS_INOBT_REC_ADDR(left, 1, cur); - kp[0].ir_startino = rp->ir_startino; - rp = XFS_INOBT_REC_ADDR(right, 1, cur); - kp[1].ir_startino = rp->ir_startino; - } - xfs_inobt_log_keys(cur, nbp, 1, 2); - /* - * Fill in the pointer data in the new root. - */ - pp = XFS_INOBT_PTR_ADDR(new, 1, cur); - pp[0] = cpu_to_be32(lbno); - pp[1] = cpu_to_be32(rbno); - xfs_inobt_log_ptrs(cur, nbp, 1, 2); - /* - * Fix up the cursor. - */ - xfs_btree_setbuf(cur, cur->bc_nlevels, nbp); - cur->bc_ptrs[cur->bc_nlevels] = nptr; - cur->bc_nlevels++; - *stat = 1; - return 0; +STATIC int64_t +xfs_inobt_key_diff( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key) +{ + return (int64_t)(be32_to_cpu(key->u.inobt.ir_startino)) - + cur->bc_rec.i.ir_startino; } -/* - * Move 1 record right from cur/level if possible. - * Update cur to reflect the new path. 
- */ -STATIC int /* error */ -xfs_inobt_rshift( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to shift record on */ - int *stat) /* success/failure */ +STATIC xfs_daddr_t +xfs_inobt_ptr_to_daddr( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr) { - int error; /* error return value */ - int i; /* loop index */ - xfs_inobt_key_t key; /* key value for leaf level upward */ - xfs_buf_t *lbp; /* buffer for left (current) block */ - xfs_inobt_block_t *left; /* left (current) btree block */ - xfs_inobt_key_t *lkp; /* key pointer for left block */ - xfs_inobt_ptr_t *lpp; /* address pointer for left block */ - xfs_inobt_rec_t *lrp; /* record pointer for left block */ - xfs_buf_t *rbp; /* buffer for right neighbor block */ - xfs_inobt_block_t *right; /* right neighbor btree block */ - xfs_inobt_key_t *rkp; /* key pointer for right block */ - xfs_inobt_ptr_t *rpp; /* address pointer for right block */ - xfs_inobt_rec_t *rrp=NULL; /* record pointer for right block */ - xfs_btree_cur_t *tcur; /* temporary cursor */ + return XFS_AGB_TO_DADDR(cur->bc_mp, cur->bc_private.i.agno, + be32_to_cpu(ptr->u.inobt)); +} - /* - * Set up variables for this block as "left". - */ - lbp = cur->bc_bufs[level]; - left = XFS_BUF_TO_INOBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; -#endif - /* - * If we've got no right sibling then we can't shift an entry right. - */ - if (be32_to_cpu(left->bb_rightsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * If the cursor entry is the one that would be moved, don't - * do it... it's too complicated. - */ - if (cur->bc_ptrs[level] >= be16_to_cpu(left->bb_numrecs)) { - *stat = 0; - return 0; - } - /* - * Set up the right neighbor as "right". 
- */ - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.i.agno, be32_to_cpu(left->bb_rightsib), - 0, &rbp, XFS_INO_BTREE_REF))) - return error; - right = XFS_BUF_TO_INOBT_BLOCK(rbp); - if ((error = xfs_btree_check_sblock(cur, right, level, rbp))) - return error; - /* - * If it's full, it can't take another entry. - */ - if (be16_to_cpu(right->bb_numrecs) == XFS_INOBT_BLOCK_MAXRECS(level, cur)) { - *stat = 0; - return 0; - } - /* - * Make a hole at the start of the right neighbor block, then - * copy the last left block entry to the hole. - */ - if (level > 0) { - lkp = XFS_INOBT_KEY_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - lpp = XFS_INOBT_PTR_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rkp = XFS_INOBT_KEY_ADDR(right, 1, cur); - rpp = XFS_INOBT_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = be16_to_cpu(right->bb_numrecs) - 1; i >= 0; i--) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(rpp[i]), level))) - return error; - } -#endif - memmove(rkp + 1, rkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memmove(rpp + 1, rpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); -#ifdef DEBUG - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(*lpp), level))) - return error; -#endif - *rkp = *lkp; - *rpp = *lpp; - xfs_inobt_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); - xfs_inobt_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) + 1); +STATIC void +xfs_inobt_move_keys( + xfs_btree_cur_t *cur, + xfs_btree_key_t *src_key, + xfs_btree_key_t *dst_key, + int from, + int to, + int numkeys) +{ + if (dst_key == NULL) { + /* moving within a block */ + xfs_inobt_key_t *kp = &src_key->u.inobt; + memmove(&kp[to], &kp[from], numkeys * sizeof(*kp)); } else { - lrp = XFS_INOBT_REC_ADDR(left, be16_to_cpu(left->bb_numrecs), cur); - rrp = XFS_INOBT_REC_ADDR(right, 1, cur); - memmove(rrp + 1, rrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - *rrp = *lrp; - xfs_inobt_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs) 
+ 1); - key.ir_startino = rrp->ir_startino; - rkp = &key; + /* moving between blocks */ + memcpy(dst_key, src_key, numkeys * sizeof(xfs_inobt_key_t)); } - /* - * Decrement and log left's numrecs, bump and log right's numrecs. - */ - be16_add(&left->bb_numrecs, -1); - xfs_inobt_log_block(cur->bc_tp, lbp, XFS_BB_NUMRECS); - be16_add(&right->bb_numrecs, 1); -#ifdef DEBUG - if (level > 0) - xfs_btree_check_key(cur->bc_btnum, rkp, rkp + 1); - else - xfs_btree_check_rec(cur->bc_btnum, rrp, rrp + 1); -#endif - xfs_inobt_log_block(cur->bc_tp, rbp, XFS_BB_NUMRECS); - /* - * Using a temporary cursor, update the parent key values of the - * block on the right. - */ - if ((error = xfs_btree_dup_cursor(cur, &tcur))) - return error; - xfs_btree_lastrec(tcur, level); - if ((error = xfs_inobt_increment(tcur, level, &i)) || - (error = xfs_inobt_updkey(tcur, rkp, level + 1))) { - xfs_btree_del_cursor(tcur, XFS_BTREE_ERROR); - return error; - } - xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); - *stat = 1; - return 0; } -/* - * Split cur/level block in half. - * Return new block number and its first record (to be inserted into parent). 
- */ -STATIC int /* error */ -xfs_inobt_split( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level to split */ - xfs_agblock_t *bnop, /* output: block number allocated */ - xfs_inobt_key_t *keyp, /* output: first key of new block */ - xfs_btree_cur_t **curp, /* output: new cursor */ - int *stat) /* success/failure */ -{ - xfs_alloc_arg_t args; /* allocation argument structure */ - int error; /* error return value */ - int i; /* loop index/record number */ - xfs_agblock_t lbno; /* left (current) block number */ - xfs_buf_t *lbp; /* buffer for left block */ - xfs_inobt_block_t *left; /* left (current) btree block */ - xfs_inobt_key_t *lkp; /* left btree key pointer */ - xfs_inobt_ptr_t *lpp; /* left btree address pointer */ - xfs_inobt_rec_t *lrp; /* left btree record pointer */ - xfs_buf_t *rbp; /* buffer for right block */ - xfs_inobt_block_t *right; /* right (new) btree block */ - xfs_inobt_key_t *rkp; /* right btree key pointer */ - xfs_inobt_ptr_t *rpp; /* right btree address pointer */ - xfs_inobt_rec_t *rrp; /* right btree record pointer */ - - /* - * Set up left block (current one). - */ - lbp = cur->bc_bufs[level]; - args.tp = cur->bc_tp; - args.mp = cur->bc_mp; - lbno = XFS_DADDR_TO_AGBNO(args.mp, XFS_BUF_ADDR(lbp)); - /* - * Allocate the new block. - * If we can't do it, we're toast. Give up. - */ - args.fsbno = XFS_AGB_TO_FSB(args.mp, cur->bc_private.i.agno, lbno); - args.mod = args.minleft = args.alignment = args.total = args.wasdel = - args.isfl = args.userdata = args.minalignslop = 0; - args.minlen = args.maxlen = args.prod = 1; - args.type = XFS_ALLOCTYPE_NEAR_BNO; - if ((error = xfs_alloc_vextent(&args))) - return error; - if (args.fsbno == NULLFSBLOCK) { - *stat = 0; - return 0; - } - ASSERT(args.len == 1); - rbp = xfs_btree_get_bufs(args.mp, args.tp, args.agno, args.agbno, 0); - /* - * Set up the new block as "right". - */ - right = XFS_BUF_TO_INOBT_BLOCK(rbp); - /* - * "Left" is the current (according to the cursor) block. 
- */ - left = XFS_BUF_TO_INOBT_BLOCK(lbp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, left, level, lbp))) - return error; -#endif - /* - * Fill in the btree header for the new block. - */ - right->bb_magic = cpu_to_be32(xfs_magics[cur->bc_btnum]); - right->bb_level = left->bb_level; - right->bb_numrecs = cpu_to_be16(be16_to_cpu(left->bb_numrecs) / 2); - /* - * Make sure that if there's an odd number of entries now, that - * each new block will have the same number of entries. - */ - if ((be16_to_cpu(left->bb_numrecs) & 1) && - cur->bc_ptrs[level] <= be16_to_cpu(right->bb_numrecs) + 1) - be16_add(&right->bb_numrecs, 1); - i = be16_to_cpu(left->bb_numrecs) - be16_to_cpu(right->bb_numrecs) + 1; - /* - * For non-leaf blocks, copy keys and addresses over to the new block. - */ - if (level > 0) { - lkp = XFS_INOBT_KEY_ADDR(left, i, cur); - lpp = XFS_INOBT_PTR_ADDR(left, i, cur); - rkp = XFS_INOBT_KEY_ADDR(right, 1, cur); - rpp = XFS_INOBT_PTR_ADDR(right, 1, cur); -#ifdef DEBUG - for (i = 0; i < be16_to_cpu(right->bb_numrecs); i++) { - if ((error = xfs_btree_check_sptr(cur, be32_to_cpu(lpp[i]), level))) - return error; - } -#endif - memcpy(rkp, lkp, be16_to_cpu(right->bb_numrecs) * sizeof(*rkp)); - memcpy(rpp, lpp, be16_to_cpu(right->bb_numrecs) * sizeof(*rpp)); - xfs_inobt_log_keys(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - xfs_inobt_log_ptrs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - *keyp = *rkp; - } - /* - * For leaf blocks, copy records over to the new block. - */ - else { - lrp = XFS_INOBT_REC_ADDR(left, i, cur); - rrp = XFS_INOBT_REC_ADDR(right, 1, cur); - memcpy(rrp, lrp, be16_to_cpu(right->bb_numrecs) * sizeof(*rrp)); - xfs_inobt_log_recs(cur, rbp, 1, be16_to_cpu(right->bb_numrecs)); - keyp->ir_startino = rrp->ir_startino; - } - /* - * Find the left block number by looking in the buffer. - * Adjust numrecs, sibling pointers. 
- */ - be16_add(&left->bb_numrecs, -(be16_to_cpu(right->bb_numrecs))); - right->bb_rightsib = left->bb_rightsib; - left->bb_rightsib = cpu_to_be32(args.agbno); - right->bb_leftsib = cpu_to_be32(lbno); - xfs_inobt_log_block(args.tp, rbp, XFS_BB_ALL_BITS); - xfs_inobt_log_block(args.tp, lbp, XFS_BB_NUMRECS | XFS_BB_RIGHTSIB); - /* - * If there's a block to the new block's right, make that block - * point back to right instead of to left. - */ - if (be32_to_cpu(right->bb_rightsib) != NULLAGBLOCK) { - xfs_inobt_block_t *rrblock; /* rr btree block */ - xfs_buf_t *rrbp; /* buffer for rrblock */ - - if ((error = xfs_btree_read_bufs(args.mp, args.tp, args.agno, - be32_to_cpu(right->bb_rightsib), 0, &rrbp, - XFS_INO_BTREE_REF))) - return error; - rrblock = XFS_BUF_TO_INOBT_BLOCK(rrbp); - if ((error = xfs_btree_check_sblock(cur, rrblock, level, rrbp))) - return error; - rrblock->bb_leftsib = cpu_to_be32(args.agbno); - xfs_inobt_log_block(args.tp, rrbp, XFS_BB_LEFTSIB); - } - /* - * If the cursor is really in the right block, move it there. - * If it's just pointing past the last entry in left, then we'll - * insert there, so don't change anything in that case. - */ - if (cur->bc_ptrs[level] > be16_to_cpu(left->bb_numrecs) + 1) { - xfs_btree_setbuf(cur, level, rbp); - cur->bc_ptrs[level] -= be16_to_cpu(left->bb_numrecs); +STATIC void +xfs_inobt_move_ptrs( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *src_ptr, + xfs_btree_ptr_t *dst_ptr, + int from, + int to, + int numptrs) +{ + if (dst_ptr == NULL) { + /* moving within a block */ + xfs_inobt_ptr_t *pp = &src_ptr->u.inobt; + memmove(&pp[to], &pp[from], numptrs * sizeof(*pp)); + } else { + /* moving between blocks */ + memcpy(dst_ptr, src_ptr, numptrs * sizeof(xfs_inobt_ptr_t)); } - /* - * If there are more levels, we'll need another cursor which refers - * the right block, no matter where this cursor was. 
- */ - if (level + 1 < cur->bc_nlevels) { - if ((error = xfs_btree_dup_cursor(cur, curp))) - return error; - (*curp)->bc_ptrs[level + 1]++; +} + +STATIC void +xfs_inobt_move_recs( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *src_rec, + xfs_btree_rec_t *dst_rec, + int from, + int to, + int numrecs) +{ + if (dst_rec == NULL) { + /* moving within a block */ + xfs_inobt_rec_t *rp = &src_rec->u.inobt; + memmove(&rp[to], &rp[from], numrecs * sizeof(*rp)); + } else { + /* moving between blocks */ + memcpy(dst_rec, src_rec, numrecs * sizeof(xfs_inobt_rec_t)); } - *bnop = args.agbno; - *stat = 1; - return 0; } -/* - * Update keys at all levels from here to the root along the cursor's path. - */ -STATIC int /* error */ -xfs_inobt_updkey( - xfs_btree_cur_t *cur, /* btree cursor */ - xfs_inobt_key_t *keyp, /* new key value to update to */ - int level) /* starting level for update */ + +STATIC void +xfs_inobt_set_key( + xfs_btree_cur_t *cur, + xfs_btree_key_t *key_addr, + int index, + xfs_btree_key_t *newkey) { - int ptr; /* index of key in block */ + xfs_inobt_key_t *kp = &key_addr->u.inobt; - /* - * Go up the tree from this level toward the root. - * At each level, update the key value to the value input. - * Stop when we reach a level where the cursor isn't pointing - * at the first entry in the block. 
- */ - for (ptr = 1; ptr == 1 && level < cur->bc_nlevels; level++) { - xfs_buf_t *bp; /* buffer for block */ - xfs_inobt_block_t *block; /* btree block */ -#ifdef DEBUG - int error; /* error return value */ -#endif - xfs_inobt_key_t *kp; /* ptr to btree block keys */ + kp[index] = newkey->u.inobt; +} - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; -#endif - ptr = cur->bc_ptrs[level]; - kp = XFS_INOBT_KEY_ADDR(block, ptr, cur); - *kp = *keyp; - xfs_inobt_log_keys(cur, bp, ptr, ptr); - } - return 0; +STATIC void +xfs_inobt_set_ptr( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *ptr_addr, + int index, + xfs_btree_ptr_t *newptr) +{ + xfs_inobt_ptr_t *pp = &ptr_addr->u.inobt; + + pp[index] = newptr->u.inobt; } -/* - * Externally visible routines. - */ +STATIC void +xfs_inobt_set_rec( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec_addr, + int index, + xfs_btree_rec_t *newrec) +{ + xfs_inobt_rec_t *rp = &rec_addr->u.inobt; + + rp[index] = newrec->u.inobt; +} /* - * Decrement cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. + * Log keys from a btree block (nonleaf). */ -int /* error */ -xfs_inobt_decrement( +STATIC void +xfs_inobt_log_keys( xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level in btree, 0 is leaf */ - int *stat) /* success/failure */ + xfs_buf_t *bp, /* buffer containing btree block */ + int kfirst, /* index of first key to log */ + int klast) /* index of last key to log */ { - xfs_inobt_block_t *block; /* btree block */ - int error; - int lev; /* btree level */ + xfs_inobt_block_t *block; /* btree block to log from */ + int first; /* first byte offset logged */ + xfs_inobt_key_t *kp; /* key pointer in btree block */ + int last; /* last byte offset logged */ - ASSERT(level < cur->bc_nlevels); - /* - * Read-ahead to the left at this level. 
- */ - xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); - /* - * Decrement the ptr at this level. If we're still in the block - * then we're done. - */ - if (--cur->bc_ptrs[level] > 0) { - *stat = 1; - return 0; - } - /* - * Get a pointer to the btree block. - */ - block = XFS_BUF_TO_INOBT_BLOCK(cur->bc_bufs[level]); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, - cur->bc_bufs[level]))) - return error; -#endif - /* - * If we just went off the left edge of the tree, return failure. - */ - if (be32_to_cpu(block->bb_leftsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * March up the tree decrementing pointers. - * Stop when we don't go off the left edge of a block. - */ - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - if (--cur->bc_ptrs[lev] > 0) - break; - /* - * Read-ahead the left block, we're going to read it - * in the next loop. - */ - xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); - } - /* - * If we went off the root then we are seriously confused. - */ - ASSERT(lev < cur->bc_nlevels); - /* - * Now walk back down the tree, fixing up the cursor's buffer - * pointers and key numbers. 
- */ - for (block = XFS_BUF_TO_INOBT_BLOCK(cur->bc_bufs[lev]); lev > level; ) { - xfs_agblock_t agbno; /* block number of btree block */ - xfs_buf_t *bp; /* buffer containing btree block */ - - agbno = be32_to_cpu(*XFS_INOBT_PTR_ADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.i.agno, agbno, 0, &bp, - XFS_INO_BTREE_REF))) - return error; - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_INOBT_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; - cur->bc_ptrs[lev] = be16_to_cpu(block->bb_numrecs); - } - *stat = 1; - return 0; + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, kfirst, klast); + block = XFS_BUF_TO_INOBT_BLOCK(bp); + kp = XFS_INOBT_KEY_ADDR(block, 1, cur); + first = (int)((xfs_caddr_t)&kp[kfirst - 1] - (xfs_caddr_t)block); + last = (int)(((xfs_caddr_t)&kp[klast] - 1) - (xfs_caddr_t)block); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); } /* - * Delete the record pointed to by cur. - * The cursor refers to the place where the record was (could be inserted) - * when the operation returns. + * Log block pointer fields from a btree block (nonleaf). */ -int /* error */ -xfs_inobt_delete( - xfs_btree_cur_t *cur, /* btree cursor */ - int *stat) /* success/failure */ +STATIC void +xfs_inobt_log_ptrs( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_buf_t *bp, /* buffer containing btree block */ + int pfirst, /* index of first pointer to log */ + int plast) /* index of last pointer to log */ { - int error; - int i; /* result code */ - int level; /* btree level */ + xfs_inobt_block_t *block; /* btree block to log from */ + int first; /* first byte offset logged */ + int last; /* last byte offset logged */ + xfs_inobt_ptr_t *pp; /* block-pointer pointer in btree blk */ - /* - * Go up the tree, starting at leaf level. - * If 2 is returned then a join was done; go to the next level. 
- * Otherwise we are done. - */ - for (level = 0, i = 2; i == 2; level++) { - if ((error = xfs_inobt_delrec(cur, level, &i))) - return error; - } - if (i == 0) { - for (level = 1; level < cur->bc_nlevels; level++) { - if (cur->bc_ptrs[level] == 0) { - if ((error = xfs_inobt_decrement(cur, level, &i))) - return error; - break; - } - } - } - *stat = i; - return 0; + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, pfirst, plast); + block = XFS_BUF_TO_INOBT_BLOCK(bp); + pp = XFS_INOBT_PTR_ADDR(block, 1, cur); + first = (int)((xfs_caddr_t)&pp[pfirst - 1] - (xfs_caddr_t)block); + last = (int)(((xfs_caddr_t)&pp[plast] - 1) - (xfs_caddr_t)block); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); } - /* - * Get the data from the pointed-to record. + * Log records from a btree block (leaf). */ -int /* error */ -xfs_inobt_get_rec( +STATIC void +xfs_inobt_log_recs( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agino_t *ino, /* output: starting inode of chunk */ - __int32_t *fcnt, /* output: number of free inodes */ - xfs_inofree_t *free, /* output: free inode mask */ - int *stat) /* output: success/failure */ + xfs_buf_t *bp, /* buffer containing btree block */ + int rfirst, /* index of first record to log */ + int rlast) /* index of last record to log */ { - xfs_inobt_block_t *block; /* btree block */ - xfs_buf_t *bp; /* buffer containing btree block */ -#ifdef DEBUG - int error; /* error return value */ -#endif - int ptr; /* record number */ - xfs_inobt_rec_t *rec; /* record data */ + xfs_inobt_block_t *block; /* btree block to log from */ + int first; /* first byte offset logged */ + int last; /* last byte offset logged */ + xfs_inobt_rec_t *rp; /* record pointer for btree block */ - bp = cur->bc_bufs[0]; - ptr = cur->bc_ptrs[0]; + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGBII(cur, bp, rfirst, rlast); block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, 
0, bp))) - return error; -#endif + rp = XFS_INOBT_REC_ADDR(block, 1, cur); + first = (int)((xfs_caddr_t)&rp[rfirst - 1] - (xfs_caddr_t)block); + last = (int)(((xfs_caddr_t)&rp[rlast] - 1) - (xfs_caddr_t)block); + xfs_trans_log_buf(cur->bc_tp, bp, first, last); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); +} + +static const struct xfs_btree_record_ops xfs_inobt_recops = { + .get_minrecs = xfs_inobt_get_minrecs, + .get_maxrecs = xfs_inobt_get_maxrecs, + .get_numrecs = xfs_btree_get_numrecs, + .set_numrecs = xfs_btree_set_numrecs, + + .init_key_from_rec = xfs_inobt_init_key_from_rec, + .init_ptr_from_cur = xfs_inobt_init_ptr_from_cur, + .init_rec_from_key = xfs_inobt_init_rec_from_key, + .init_rec_from_cur = xfs_inobt_init_rec_from_cur, + + .key_addr = xfs_inobt_key_addr, + .ptr_addr = xfs_inobt_ptr_addr, + .rec_addr = xfs_inobt_rec_addr, + + .key_diff = xfs_inobt_key_diff, + .ptr_to_daddr = xfs_inobt_ptr_to_daddr, + + .move_keys = xfs_inobt_move_keys, + .move_ptrs = xfs_inobt_move_ptrs, + .move_recs = xfs_inobt_move_recs, + + .set_key = xfs_inobt_set_key, + .set_ptr = xfs_inobt_set_ptr, + .set_rec = xfs_inobt_set_rec, + + .log_keys = xfs_inobt_log_keys, + .log_ptrs = xfs_inobt_log_ptrs, + .log_recs = xfs_inobt_log_recs, + + .check_ptrs = xfs_btree_check_sptr, +}; + +STATIC void +xfs_inobt_setroot( + xfs_btree_cur_t *cur, + xfs_btree_ptr_t *nptr, + int inc) /* level change */ +{ + xfs_buf_t *agbp = cur->bc_private.i.agbp; + xfs_agi_t *agi = XFS_BUF_TO_AGI(agbp); + + agi->agi_root = nptr->u.inobt; + be32_add(&agi->agi_level, inc); + xfs_ialloc_log_agi(cur->bc_tp, agbp, XFS_AGI_ROOT | XFS_AGI_LEVEL); +} + + +STATIC int +xfs_inobt_killroot( + xfs_btree_cur_t *cur, + int level, + xfs_btree_ptr_t *newroot) +{ + xfs_buf_t *agbp = cur->bc_private.i.agbp; + xfs_agi_t *agi = XFS_BUF_TO_AGI(agbp); + xfs_agblock_t bno; + int error; + /* - * Off the right end or left end, return failure. + * Set the root entry in the a.g. inode structure, + * decreasing the level by 1. 
*/ - if (ptr > be16_to_cpu(block->bb_numrecs) || ptr <= 0) { - *stat = 0; - return 0; - } + bno = be32_to_cpu(agi->agi_root); + xfs_inobt_setroot(cur, newroot, -1); /* - * Point to the record and extract its data. + * Free the old root. */ - rec = XFS_INOBT_REC_ADDR(block, ptr, cur); - *ino = be32_to_cpu(rec->ir_startino); - *fcnt = be32_to_cpu(rec->ir_freecount); - *free = be64_to_cpu(rec->ir_free); - *stat = 1; + error = xfs_free_extent(cur->bc_tp, + XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.i.agno, bno), 1); + if (error) + return error; + xfs_trans_binval(cur->bc_tp, cur->bc_bufs[level]); + /* + * Update the cursor so there's one fewer level. + */ + cur->bc_bufs[level] = NULL; + cur->bc_nlevels--; return 0; } +static const struct xfs_btree_cur_ops xfs_inobt_curops = { + .set_root = xfs_inobt_setroot, + .new_root = xfs_btree_newroot, + .kill_root = xfs_inobt_killroot, +}; + + +#if defined(XFS_BTREE_TRACE) + /* - * Increment cursor by one record at the level. - * For nonzero levels the leaf-ward information is untouched. + * Global inobt trace buffer */ -int /* error */ -xfs_inobt_increment( - xfs_btree_cur_t *cur, /* btree cursor */ - int level, /* level in btree, 0 is leaf */ - int *stat) /* success/failure */ +ktrace_t *xfs_inobt_trace_buf; +/* + * Add a trace buffer entry for the arguments given to the routine, + * generic form. 
+ */ +STATIC void +xfs_inobt_trace_enter( + const char *func, + xfs_btree_cur_t *cur, + char *s, + int type, + int line, + __psunsigned_t a0, + __psunsigned_t a1, + __psunsigned_t a2, + __psunsigned_t a3, + __psunsigned_t a4, + __psunsigned_t a5, + __psunsigned_t a6, + __psunsigned_t a7, + __psunsigned_t a8, + __psunsigned_t a9, + __psunsigned_t a10) +{ + ktrace_enter(xfs_inobt_trace_buf, + (void *)(__psint_t)type, + (void *)func, (void *)s, NULL, (void *)cur, + (void *)a0, (void *)a1, (void *)a2, (void *)a3, + (void *)a4, (void *)a5, (void *)a6, (void *)a7, + (void *)a8, (void *)a9, (void *)a10); +} + +STATIC void +xfs_inobt_trace_cursor( + xfs_btree_cur_t *cur, + __uint32_t *s0, + __uint64_t *l0, + __uint64_t *l1) +{ + *s0 = cur->bc_private.i.agno; + *l0 = cur->bc_rec.i.ir_startino; + *l1 = cur->bc_rec.i.ir_free; +} + +STATIC void +xfs_inobt_trace_record( + xfs_btree_cur_t *cur, + xfs_btree_rec_t *rec, + __uint64_t *l0, + __uint64_t *l1, + __uint64_t *l2) { - xfs_inobt_block_t *block; /* btree block */ - xfs_buf_t *bp; /* buffer containing btree block */ - int error; /* error return value */ - int lev; /* btree level */ + *l0 = be32_to_cpu(rec->u.inobt.ir_startino); + *l1 = be32_to_cpu(rec->u.inobt.ir_freecount); + *l2 = be64_to_cpu(rec->u.inobt.ir_free); +} - ASSERT(level < cur->bc_nlevels); - /* - * Read-ahead to the right at this level. - */ - xfs_btree_readahead(cur, level, XFS_BTCUR_RIGHTRA); - /* - * Get a pointer to the btree block. - */ - bp = cur->bc_bufs[level]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, level, bp))) - return error; +static const struct xfs_btree_trc_ops xfs_inobt_trcops = { + .enter = xfs_inobt_trace_enter, + .cursor = xfs_inobt_trace_cursor, + .record = xfs_inobt_trace_record, +}; #endif - /* - * Increment the ptr at this level. If we're still in the block - * then we're done.
- */ - if (++cur->bc_ptrs[level] <= be16_to_cpu(block->bb_numrecs)) { - *stat = 1; - return 0; - } - /* - * If we just went off the right edge of the tree, return failure. - */ - if (be32_to_cpu(block->bb_rightsib) == NULLAGBLOCK) { - *stat = 0; - return 0; - } - /* - * March up the tree incrementing pointers. - * Stop when we don't go off the right edge of a block. - */ - for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - bp = cur->bc_bufs[lev]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); -#ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; + +void +xfs_inobt_init_cursor( + xfs_btree_cur_t *cur) +{ + cur->bc_flags = 0; + cur->bc_curops = &xfs_inobt_curops; + cur->bc_blkops = &xfs_inobt_blkops; + cur->bc_recops = &xfs_inobt_recops; +#if defined(XFS_BTREE_TRACE) + cur->bc_trcops = &xfs_inobt_trcops; #endif - if (++cur->bc_ptrs[lev] <= be16_to_cpu(block->bb_numrecs)) - break; - /* - * Read-ahead the right block, we're going to read it - * in the next loop. - */ - xfs_btree_readahead(cur, lev, XFS_BTCUR_RIGHTRA); - } - /* - * If we went off the root then we are seriously confused. - */ - ASSERT(lev < cur->bc_nlevels); - /* - * Now walk back down the tree, fixing up the cursor's buffer - * pointers and key numbers. - */ - for (bp = cur->bc_bufs[lev], block = XFS_BUF_TO_INOBT_BLOCK(bp); - lev > level; ) { - xfs_agblock_t agbno; /* block number of btree block */ - - agbno = be32_to_cpu(*XFS_INOBT_PTR_ADDR(block, cur->bc_ptrs[lev], cur)); - if ((error = xfs_btree_read_bufs(cur->bc_mp, cur->bc_tp, - cur->bc_private.i.agno, agbno, 0, &bp, - XFS_INO_BTREE_REF))) - return error; - lev--; - xfs_btree_setbuf(cur, lev, bp); - block = XFS_BUF_TO_INOBT_BLOCK(bp); - if ((error = xfs_btree_check_sblock(cur, block, lev, bp))) - return error; - cur->bc_ptrs[lev] = 1; - } - *stat = 1; - return 0; } /* - * Insert the current record at the point referenced by cur. - * The cursor may be inconsistent on return if splits have been done. 
+ * INOBT functions that are not covered by core btree code. + * Externally visible routines. + */ + +/* + * Update the record referred to by cur to the value given + * by [ino, fcnt, free]. + * This either works (return 0) or gets an EFSCORRUPTED error. */ int /* error */ -xfs_inobt_insert( - xfs_btree_cur_t *cur, /* btree cursor */ - int *stat) /* success/failure */ +xfs_inobt_update( + xfs_btree_cur_t *cur, /* btree cursor */ + xfs_agino_t ino, /* starting inode of chunk */ + __int32_t fcnt, /* free inode count */ + xfs_inofree_t free) /* free inode mask */ { - int error; /* error return value */ - int i; /* result value, 0 for failure */ - int level; /* current level number in btree */ - xfs_agblock_t nbno; /* new block number (split result) */ - xfs_btree_cur_t *ncur; /* new cursor (split result) */ - xfs_inobt_rec_t nrec; /* record being inserted this level */ - xfs_btree_cur_t *pcur; /* previous level's cursor */ - - level = 0; - nbno = NULLAGBLOCK; - nrec.ir_startino = cpu_to_be32(cur->bc_rec.i.ir_startino); - nrec.ir_freecount = cpu_to_be32(cur->bc_rec.i.ir_freecount); - nrec.ir_free = cpu_to_be64(cur->bc_rec.i.ir_free); - ncur = NULL; - pcur = cur; - /* - * Loop going up the tree, starting at the leaf level. - * Stop when we don't get a split block, that must mean that - * the insert is finished with this level. - */ - do { - /* - * Insert nrec/nbno into this level of the tree. - * Note if we fail, nbno will be null. - */ - if ((error = xfs_inobt_insrec(pcur, level++, &nbno, &nrec, &ncur, - &i))) { - if (pcur != cur) - xfs_btree_del_cursor(pcur, XFS_BTREE_ERROR); - return error; - } - /* - * See if the cursor we just used is trash. - * Can't trash the caller's cursor, but otherwise we should - * if ncur is a new cursor or we're about to be done. - */ - if (pcur != cur && (ncur || nbno == NULLAGBLOCK)) { - cur->bc_nlevels = pcur->bc_nlevels; - xfs_btree_del_cursor(pcur, XFS_BTREE_NOERROR); - } - /* - * If we got a new cursor, switch to it. 
- */ - if (ncur) { - pcur = ncur; - ncur = NULL; - } - } while (nbno != NULLAGBLOCK); - *stat = i; - return 0; + xfs_btree_rec_t rec; + + rec.u.inobt.ir_startino = cpu_to_be32(ino); + rec.u.inobt.ir_freecount = cpu_to_be32(fcnt); + rec.u.inobt.ir_free = cpu_to_be64(free); + return xfs_btree_update(cur, &rec); } /* @@ -1986,7 +703,7 @@ xfs_inobt_lookup_eq( cur->bc_rec.i.ir_startino = ino; cur->bc_rec.i.ir_freecount = fcnt; cur->bc_rec.i.ir_free = free; - return xfs_inobt_lookup(cur, XFS_LOOKUP_EQ, stat); + return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); } /* @@ -2004,7 +721,7 @@ xfs_inobt_lookup_ge( cur->bc_rec.i.ir_startino = ino; cur->bc_rec.i.ir_freecount = fcnt; cur->bc_rec.i.ir_free = free; - return xfs_inobt_lookup(cur, XFS_LOOKUP_GE, stat); + return xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); } /* @@ -2022,57 +739,55 @@ xfs_inobt_lookup_le( cur->bc_rec.i.ir_startino = ino; cur->bc_rec.i.ir_freecount = fcnt; cur->bc_rec.i.ir_free = free; - return xfs_inobt_lookup(cur, XFS_LOOKUP_LE, stat); + return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); } /* - * Update the record referred to by cur, to the value given - * by [ino, fcnt, free]. - * This either works (return 0) or gets an EFSCORRUPTED error. + * Get the data from the pointed-to record. 
*/ int /* error */ -xfs_inobt_update( +xfs_inobt_get_rec( xfs_btree_cur_t *cur, /* btree cursor */ - xfs_agino_t ino, /* starting inode of chunk */ - __int32_t fcnt, /* free inode count */ - xfs_inofree_t free) /* free inode mask */ + xfs_agino_t *ino, /* output: starting inode of chunk */ + __int32_t *fcnt, /* output: number of free inodes */ + xfs_inofree_t *free, /* output: free inode mask */ + int *stat) /* output: success/failure */ { - xfs_inobt_block_t *block; /* btree block to update */ + xfs_btree_block_t *block; /* btree block */ xfs_buf_t *bp; /* buffer containing btree block */ +#ifdef DEBUG int error; /* error return value */ - int ptr; /* current record number (updating) */ - xfs_inobt_rec_t *rp; /* pointer to updated record */ +#endif + int ptr; /* record number */ + xfs_btree_rec_t *rec; /* record data */ - /* - * Pick up the current block. - */ - bp = cur->bc_bufs[0]; - block = XFS_BUF_TO_INOBT_BLOCK(bp); + XFS_BTREE_TRACE_CURSOR(cur, ENTRY); + XFS_BTREE_TRACE_ARGFFF(cur, *ino, *fcnt, *free); + + ptr = cur->bc_ptrs[0]; + block = xfs_inobt_get_block(cur, 0, &bp); #ifdef DEBUG - if ((error = xfs_btree_check_sblock(cur, block, 0, bp))) + error = xfs_btree_check_sblock(cur, block, 0, bp); + if (error) return error; #endif /* - * Get the address of the rec to be updated. - */ - ptr = cur->bc_ptrs[0]; - rp = XFS_INOBT_REC_ADDR(block, ptr, cur); - /* - * Fill in the new contents and log them. + * Off the right end or left end, return failure. */ - rp->ir_startino = cpu_to_be32(ino); - rp->ir_freecount = cpu_to_be32(fcnt); - rp->ir_free = cpu_to_be64(free); - xfs_inobt_log_recs(cur, bp, ptr, ptr); + if (ptr > be16_to_cpu(block->bb_h.bb_numrecs) || ptr <= 0) { + XFS_BTREE_TRACE_CURSOR(cur, EXIT); + *stat = 0; + return 0; + } /* - * Updating first record in leaf. Pass new key value up to our parent. + * Point to the record and extract its data. 
*/ - if (ptr == 1) { - xfs_inobt_key_t key; /* key containing [ino] */ - - key.ir_startino = cpu_to_be32(ino); - if ((error = xfs_inobt_updkey(cur, &key, 1))) - return error; - } + rec = xfs_inobt_rec_addr(cur, ptr, block); + *ino = be32_to_cpu(rec->u.inobt.ir_startino); + *fcnt = be32_to_cpu(rec->u.inobt.ir_freecount); + *free = be64_to_cpu(rec->u.inobt.ir_free); + XFS_BTREE_TRACE_CURSOR(cur, EXIT); + *stat = 1; return 0; } + Index: 2.6.x-xfs-new/fs/xfs/xfs_ialloc_btree.h =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_ialloc_btree.h 2007-10-15 09:58:18.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_ialloc_btree.h 2007-11-06 19:40:29.770666321 +1100 @@ -116,6 +116,8 @@ typedef struct xfs_btree_sblock xfs_inob (XFS_BTREE_PTR_ADDR(xfs_inobt, bb, \ i, XFS_INOBT_BLOCK_MAXRECS(1, cur))) +extern void xfs_inobt_init_cursor(struct xfs_btree_cur *cur); + /* * Decrement cursor by one record at the level. * For nonzero levels the leaf-ward information is untouched. Index: 2.6.x-xfs-new/fs/xfs/xfs_itable.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_itable.c 2007-10-24 16:01:47.000000000 +1000 +++ 2.6.x-xfs-new/fs/xfs/xfs_itable.c 2007-11-06 19:40:29.770666321 +1100 @@ -475,7 +475,7 @@ xfs_bulkstat( * In any case, increment to the next record. */ if (!error) - error = xfs_inobt_increment(cur, 0, &tmp); + error = xfs_btree_increment(cur, 0, &tmp); } else { /* * Start of ag. Lookup the first inode chunk. @@ -541,7 +541,7 @@ xfs_bulkstat( * Set agino to after this chunk and bump the cursor. */ agino = gino + XFS_INODES_PER_CHUNK; - error = xfs_inobt_increment(cur, 0, &tmp); + error = xfs_btree_increment(cur, 0, &tmp); } /* * Drop the btree buffers and the agi buffer. 
@@ -881,7 +881,7 @@ xfs_inumbers( bufidx = 0; } if (left) { - error = xfs_inobt_increment(cur, 0, &tmp); + error = xfs_btree_increment(cur, 0, &tmp); if (error) { xfs_btree_del_cursor(cur, XFS_BTREE_ERROR); cur = NULL;

From owner-xfs@oss.sgi.com Tue Nov 6 01:21:38 2007
Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 01:21:44 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664
Received: from spike.grumly.eu.org (spike.grumly.eu.org [195.5.253.226]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA69LXZZ003355 for ; Tue, 6 Nov 2007 01:21:37 -0800
Received: by spike.grumly.eu.org (Postfix, from userid 1001) id AD82611984; Tue, 6 Nov 2007 10:21:57 +0100 (CET)
Date: Tue, 6 Nov 2007 10:21:57 +0100
From: Cedric - Equinoxe Media
To: David Chinner
Cc: xfs@oss.sgi.com
Subject: Re: xfs crash
Message-ID: <20071106092157.GB16694@e-m.fr>
References: <20071105215135.GA12238@e-m.fr> <20071106082632.GU995458@sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20071106082632.GU995458@sgi.com>
X-Virus-Scanned: ClamAV 0.91.2/4680/Mon Nov 5 20:49:40 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13563
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: cedric@e-m.fr
Precedence: bulk
X-list: xfs

On 06/11/2007 19:26, David Chinner wrote:
> On Mon, Nov 05, 2007 at 10:51:35PM +0100, Cedric - Equinoxe Media wrote:
> > XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1563 of file
> > fs/xfs/xfs_alloc.c.
Caller 0xffffffff88113b35
> >
> > Call Trace:
> > [] :xfs:xfs_free_ag_extent+0x1a6/0x6b5
> > [] :xfs:xfs_free_extent+0xa9/0xc9
> > [] :xfs:xfs_bmap_finish+0xee/0x167
> > [] :xfs:xfs_itruncate_finish+0x19b/0x2e0
> > [] :xfs:xfs_setattr+0x841/0xe57
>
> Corrupted free space btree, by the look of it. Can you run
> xfs_check on the filesystem and report the output. You can recover
> from this by running xfs_repair.

xfs_check /dev/sda4 :
bad format 2 for inode 2961770479 type 0
bad format 2 for inode 3229517262 type 0
block 20/621714 type unknown not expected
link count mismatch for inode 2961770479 (name ?), nlink 0, counted 1
link count mismatch for inode 3229517262 (name ?), nlink 0, counted 1

the xfs_repair worked perfectly.

Do you have an idea why this corruption happened ?

Thanks.

Cédric

From owner-xfs@oss.sgi.com Tue Nov 6 06:41:33 2007
Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 06:41:35 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.0 required=5.0 tests=BAYES_05,SUBJECT_FUZZY_TION autolearn=no version=3.3.0-r574664
Received: from r2d2.neofacto.lu (mail.neofacto.lu [158.64.60.195]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6EfULA018283 for ; Tue, 6 Nov 2007 06:41:32 -0800
Received: from localhost (localhost [127.0.0.1]) by r2d2.neofacto.lu (Postfix) with ESMTP id 323CCC2F3E for ; Tue, 6 Nov 2007 15:09:14 +0100 (CET)
X-Virus-Scanned: ClamAV 0.91.2/4680/Mon Nov 5 20:49:40 2007 on oss.sgi.com
X-Virus-Scanned: Ubuntu amavisd-new at neofacto.lu
Received: from r2d2.neofacto.lu ([127.0.0.1]) by localhost (r2d2.neofacto.lu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cYvlIgzHPDPY for ; Tue, 6 Nov 2007 15:09:00 +0100 (CET)
Received: by r2d2.neofacto.lu (Postfix, from userid 65534) id C956DC302E; Tue, 6 Nov 2007 14:58:24 +0100 (CET)
Received: from [192.168.1.166] (SU105.tudor.lu [158.64.4.205]) (using TLSv1 with cipher
DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by r2d2.neofacto.lu (Postfix) with ESMTP id 80341C2F49 for ; Tue, 6 Nov 2007 14:58:15 +0100 (CET) Message-ID: <473072FD.4070104@jamendo.com> Date: Tue, 06 Nov 2007 14:58:21 +0100 From: Amandine AUPETIT User-Agent: Thunderbird 2.0.0.6 (X11/20071022) MIME-Version: 1.0 To: xfs@oss.sgi.com Subject: 7Tb XFS partition lost on reboot Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Status: Clean X-archive-position: 13564 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: amandine@jamendo.com Precedence: bulk X-list: xfs Hi ! (I'm on Ubuntu 7.10 64 bits) I have array of 12 750gb disks in hardware Raid 6 that gives me a 7Tb partition. I tried to format it in ext3, but it took too much time, so I tried in Reiserfs, but the partition were lost on reboot, and now i'm trying XFS. So I created the partition with parted, because fdisk can't do more that 2tb partitions. It's ok, I can do what I want but... on reboot, there is a Superblock problem, something like that. When I check with xfs_check : xfs_check: unexpected XFS SB magic number 0x00000000 xfs_check: read failed: Invalid argument xfs_check: data size check failed cache_node_purge: refcount was 1, not zero (node=0x681420) xfs_check: cannot read root inode (22) bad superblock magic number 0, giving up So, I tried to delete the partition with parted, to recreate a new one. No problem. But when I mount the new partition, all the data that were on my deleted partition are there !!! That's of course not a problem, but I'm wondering if there's a way to have this partition work directly without having to delete and recreate it ? 
I checked /proc/partitions before and after running parted:

BEFORE
major minor  #blocks    name
 104     0     35532720 cciss/c0d0
 104     1     34025638 cciss/c0d0p1
 104     2            1 cciss/c0d0p2
 104     5      1502046 cciss/c0d0p5
 105     0   7325417080 cciss/c1d0
 105     1    882966102 cciss/c1d0p1

AFTER
major minor  #blocks    name
 104     0     35532720 cciss/c0d0
 104     1     34025638 cciss/c0d0p1
 104     2            1 cciss/c0d0p2
 104     5      1502046 cciss/c0d0p5
 105     0   7325417080 cciss/c1d0
 105     1   7325417046 cciss/c1d0p1

So, any idea what I could do?

Thanks a lot
Amandine

From owner-xfs@oss.sgi.com Tue Nov 6 08:07:03 2007
Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 08:07:07 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664
Received: from spike.grumly.eu.org (spike.grumly.eu.org [195.5.253.226])
	by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6G6xxb030197
	for ; Tue, 6 Nov 2007 08:07:02 -0800
Received: by spike.grumly.eu.org (Postfix, from userid 1001)
	id 5AC22119FA; Tue, 6 Nov 2007 17:07:21 +0100 (CET)
Date: Tue, 6 Nov 2007 17:07:21 +0100
From: Cedric - Equinoxe Media
To: David Chinner
Cc: xfs@oss.sgi.com
Subject: Re: xfs crash
Message-ID: <20071106160721.GB25295@e-m.fr>
References: <20071105215135.GA12238@e-m.fr> <20071106082632.GU995458@sgi.com> <20071106092157.GB16694@e-m.fr>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20071106092157.GB16694@e-m.fr>
X-Virus-Scanned: ClamAV 0.91.2/4681/Tue Nov 6 04:52:41 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13565
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: cedric@e-m.fr
Precedence: bulk
X-list: xfs

I just had exactly the same crash again today :

/dev/sda4 on /filer type xfs (rw,noexec,nosuid,nodev,noatime)

Nov 6 16:40:24
fng2 kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1563 of file fs/xfs/xfs_alloc.c. Caller 0xffffffff88113b35 Nov 6 16:40:24 fng2 kernel: Nov 6 16:40:24 fng2 kernel: Call Trace: Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_free_ag_extent+0x1a6/0x6b5 Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_free_extent+0xa9/0xc9 Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_bmap_finish+0xee/0x167 Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_itruncate_finish+0x19b/0x2e0 Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_setattr+0x841/0xe57 Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_fs_get_dentry+0x38/0x59 Nov 6 16:40:24 fng2 kernel: [] task_rq_lock+0x3d/0x6f Nov 6 16:40:24 fng2 kernel: [] __activate_task+0x26/0x38 Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_vn_setattr+0x121/0x144 Nov 6 16:40:24 fng2 kernel: [] notify_change+0x156/0x2f1 Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd_setattr+0x334/0x4b1 Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd3_proc_setattr+0xa2/0xae Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd_dispatch+0xdd/0x19e Nov 6 16:40:24 fng2 kernel: [] :sunrpc:svc_process+0x3df/0x6ef Nov 6 16:40:24 fng2 kernel: [] __down_read+0x12/0x9a Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd+0x191/0x2ac Nov 6 16:40:24 fng2 kernel: [] child_rip+0xa/0x12 Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd+0x0/0x2ac Nov 6 16:40:24 fng2 kernel: [] child_rip+0x0/0x12 Nov 6 16:40:24 fng2 kernel: Nov 6 16:40:24 fng2 kernel: xfs_force_shutdown(sda4,0x8) called from line 4258 of file fs/xfs/xfs_bmap.c. Return address = 0xffffffff8811cfb4 Nov 6 16:40:24 fng2 kernel: Filesystem "sda4": Corruption of in-memory data detected. Shutting down filesystem: sda4 Nov 6 16:40:24 fng2 kernel: Please umount the filesystem, and rectify the problem(s) Seems to be again on a setattr() ? Regards. 
Cédric From owner-xfs@oss.sgi.com Tue Nov 6 08:44:15 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 08:44:19 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664 Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6GiEm2007594 for ; Tue, 6 Nov 2007 08:44:15 -0800 Received: by lucidpixels.com (Postfix, from userid 1001) id 58AC71C000263; Tue, 6 Nov 2007 11:44:19 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id 549744019521; Tue, 6 Nov 2007 11:44:19 -0500 (EST) Date: Tue, 6 Nov 2007 11:44:19 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: Cedric - Equinoxe Media cc: David Chinner , xfs@oss.sgi.com Subject: Re: xfs crash In-Reply-To: <20071106160721.GB25295@e-m.fr> Message-ID: References: <20071105215135.GA12238@e-m.fr> <20071106082632.GU995458@sgi.com> <20071106092157.GB16694@e-m.fr> <20071106160721.GB25295@e-m.fr> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="-1463747160-1479258781-1194367459=:17411" X-Virus-Scanned: ClamAV 0.91.2/4681/Tue Nov 6 04:52:41 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13566 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. 
---1463747160-1479258781-1194367459=:17411
Content-Type: TEXT/PLAIN; charset=iso-8859-1; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE

On Tue, 6 Nov 2007, Cedric - Equinoxe Media wrote:

> I just had exactly the same crash again today :
> /dev/sda4 on /filer type xfs (rw,noexec,nosuid,nodev,noatime)
>
> Nov 6 16:40:24 fng2 kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1563 of file fs/xfs/xfs_alloc.c. Caller 0xffffffff88113b35
> Nov 6 16:40:24 fng2 kernel:
> Nov 6 16:40:24 fng2 kernel: Call Trace:
> Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_free_ag_extent+0x1a6/0x6b5
> Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_free_extent+0xa9/0xc9
> Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_bmap_finish+0xee/0x167
> Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_itruncate_finish+0x19b/0x2e0
> Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_setattr+0x841/0xe57
> Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_fs_get_dentry+0x38/0x59
> Nov 6 16:40:24 fng2 kernel: [] task_rq_lock+0x3d/0x6f
> Nov 6 16:40:24 fng2 kernel: [] __activate_task+0x26/0x38
> Nov 6 16:40:24 fng2 kernel: [] :xfs:xfs_vn_setattr+0x121/0x144
> Nov 6 16:40:24 fng2 kernel: [] notify_change+0x156/0x2f1
> Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd_setattr+0x334/0x4b1
> Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd3_proc_setattr+0xa2/0xae
> Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd_dispatch+0xdd/0x19e
> Nov 6 16:40:24 fng2 kernel: [] :sunrpc:svc_process+0x3df/0x6ef
> Nov 6 16:40:24 fng2 kernel: [] __down_read+0x12/0x9a
> Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd+0x191/0x2ac
> Nov 6 16:40:24 fng2 kernel: [] child_rip+0xa/0x12
> Nov 6 16:40:24 fng2 kernel: [] :nfsd:nfsd+0x0/0x2ac
> Nov 6 16:40:24 fng2 kernel: [] child_rip+0x0/0x12
> Nov 6 16:40:24 fng2 kernel:
> Nov 6 16:40:24 fng2 kernel: xfs_force_shutdown(sda4,0x8) called from line 4258 of file fs/xfs/xfs_bmap.c. Return address = 0xffffffff8811cfb4
> Nov 6 16:40:24 fng2 kernel: Filesystem "sda4": Corruption of in-memory data detected. Shutting down filesystem: sda4
> Nov 6 16:40:24 fng2 kernel: Please umount the filesystem, and rectify the problem(s)
>
> Seems to be again on a setattr() ?
>
> Regards.
> Cédric
>
>

Have you run a memory test on your server? memtest86
---1463747160-1479258781-1194367459=:17411--

From owner-xfs@oss.sgi.com Tue Nov 6 08:46:13 2007
Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 08:46:16 -0800 (PST)
X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com
X-Spam-Level:
X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00, SUBJECT_FUZZY_TION autolearn=no version=3.3.0-r574664
Received: from hogwarts.egr.duke.edu (hogwarts.egr.duke.edu [152.3.195.84])
	by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6GkAKd008127
	for ; Tue, 6 Nov 2007 08:46:13 -0800
Received: from hogwarts.egr.duke.edu (localhost.localdomain [127.0.0.1])
	by hogwarts.egr.duke.edu (8.13.1/8.13.1) with ESMTP id lA6GkBNZ017533;
	Tue, 6 Nov 2007 11:46:11 -0500
Received: from localhost (jlb@localhost)
	by hogwarts.egr.duke.edu (8.13.1/8.13.1/Submit) with ESMTP id lA6Gk9uA017530;
	Tue, 6 Nov 2007 11:46:11 -0500
X-Authentication-Warning: hogwarts.egr.duke.edu: jlb owned process doing -bs
Date: Tue, 6 Nov 2007 11:46:09 -0500 (EST)
From: Joshua Baker-LePain
X-X-Sender: jlb@hogwarts.egr.duke.edu
To: Amandine AUPETIT
cc: xfs@oss.sgi.com
Subject: Re: 7Tb XFS partition lost on reboot
In-Reply-To: <473072FD.4070104@jamendo.com>
Message-ID:
References: <473072FD.4070104@jamendo.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Virus-Scanned: ClamAV 0.91.2/4681/Tue Nov 6 04:52:41 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13567
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: jlb17@duke.edu
Precedence: bulk
X-list: xfs

On Tue, 6 Nov 2007 at 2:58pm, Amandine AUPETIT wrote

> So I created the partition with parted, because fdisk can't do
more that 2tb > partitions. > It's ok, I can do what I want but... > > on reboot, there is a Superblock problem, something like that. When I check > with xfs_check : First guess -- did you use a gpt disklabel on that device? Standard (msdos) disklabels don't work on devices >2TB. The usual symptom of a big device with an msdos disklabel is that the partition table goes away on reboot. -- Joshua Baker-LePain QB3 Shared Cluster Sysadmin UCSF From owner-xfs@oss.sgi.com Tue Nov 6 09:08:16 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 09:08:23 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from spike.grumly.eu.org (spike.grumly.eu.org [195.5.253.226]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6H8CYu011051 for ; Tue, 6 Nov 2007 09:08:16 -0800 Received: by spike.grumly.eu.org (Postfix, from userid 1001) id 072A811960; Tue, 6 Nov 2007 18:08:37 +0100 (CET) Date: Tue, 6 Nov 2007 18:08:37 +0100 From: Cedric - Equinoxe Media To: xfs@oss.sgi.com Subject: Re: xfs crash Message-ID: <20071106170837.GC25295@e-m.fr> References: <20071105215135.GA12238@e-m.fr> <20071106082632.GU995458@sgi.com> <20071106092157.GB16694@e-m.fr> <20071106160721.GB25295@e-m.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Virus-Scanned: ClamAV 0.91.2/4682/Tue Nov 6 07:42:37 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13568 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cedric@e-m.fr Precedence: bulk X-list: xfs On 06/11/2007 11:44, Justin Piszcz wrote: > Have you run a memory test on your server? 
memtest86 But I doubt it is hardware memory corruption because it is a brand new dell server with ECC memory and the backtrace is always the same. Anyway I will do the memtest tomorrow... I also have a spare server, If I find nothing I will move everything to the spare and wait for the possible bug to appear again. Regards Cédric From owner-xfs@oss.sgi.com Tue Nov 6 09:27:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 09:27:06 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.9 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE, J_CHICKENPOX_42,J_CHICKENPOX_43 autolearn=no version=3.3.0-r574664 Received: from wr-out-0506.google.com (wr-out-0506.google.com [64.233.184.225]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6HR0ux013831 for ; Tue, 6 Nov 2007 09:27:03 -0800 Received: by wr-out-0506.google.com with SMTP id c48so148366wra for ; Tue, 06 Nov 2007 09:27:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; bh=bGZp5kFip+2MK0zxw/wQj1HCX6f9DWekhByHVUd5t/Q=; b=dXFsJn5GKaieMrqaHaA82fXGKfZeeOOK/nTTG9JTzY1ALbKzswlnWYNR0fCGZ42znRrIgHeGYmnWYqqN4dp9FfZVJaxz4DejapzGUM4JNR+KBYoRiJx39/WcftRBufb1L8TD7DHF2Nlq7PnVPfJ/lM0o9xECFxR2bHBUpjiL6Wk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=Y6LFvWyhB8cO/5Xy9+vd2ZMy0ojbdyZB5CywX2ZPZAv/0FvFRYcWyKr7H5qc4hwv/VtGifrDGZ/DdLPd02ARe31qwTV79RxNdb4gxc9ourRmd+2sUH+9gNSWL4yOum9gPkm8gwFdL16bHXxRURwm06PnGCDatLqY0T+PzF0owME= Received: by 10.142.191.2 with SMTP id o2mr1670563wff.1194370023388; Tue, 06 Nov 2007 09:27:03 -0800 (PST) Received: by 10.142.162.19 with HTTP; Tue, 6 Nov 2007 09:27:03 -0800 (PST) Message-ID: Date: Tue, 6 Nov 2007 22:57:03 +0530 
From: "Bhagi rathi" To: "David Chinner" Subject: Re: TAKE 972756 - Implement fallocate. Cc: xfs@oss.sgi.com In-Reply-To: <20071106001223.GY66820511@sgi.com> MIME-Version: 1.0 References: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> <20071106001223.GY66820511@sgi.com> X-Virus-Scanned: ClamAV 0.91.2/4682/Tue Nov 6 07:42:37 2007 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1184 X-archive-position: 13569 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs File is of size 1k. A 4k block is allocated as file-system block size is 4k. Preallocation happened from 1k to 256k. Now, it looks to me that we have un-written extents from 4k to 256k. There is no guarantee that data from 1k to 4k is all zero'es. Fallocate is updating size. Hence on subsequent read, we can get garbage from 1k to 4k and all zero'es from 4k to 256k Is the expectation here is application should take the responsibility of zero'ing data? I still need to through fallocate requirements. -Thanks, Bhagi. On 11/6/07, David Chinner wrote: > > On Tue, Nov 06, 2007 at 12:12:52AM +0530, Bhagi rathi wrote: > > David, What happens if offset is not aligned to 4k? Let's say we have a > file > > whose size is > > not aligned to 4k. It could have blocks beyond the eof which haven't > been > > zero'ed out. > > No it won't. They are *preallocated* blocks, which by definition are > zero-filled. Preallocated blocks are marked as unwritten on disk, so > it is known that they contain zeros, even if they lie beyond EOF. > > Cheers, > > Dave. 
> -- > Dave Chinner > Principal Engineer > SGI Australian Software Group > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Tue Nov 6 10:01:27 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 10:01:33 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=BAYES_50 autolearn=ham version=3.3.0-r574664 Received: from astoria.ccjclearline.com (astoria.ccjclearline.com [64.235.106.9]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6I1Mh0018327 for ; Tue, 6 Nov 2007 10:01:26 -0800 Received: from [99.236.101.138] (helo=crashcourse.ca) by astoria.ccjclearline.com with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.68) (envelope-from ) id 1IpQNv-0007fs-6Z for xfs@oss.sgi.com; Tue, 06 Nov 2007 10:30:43 -0500 Date: Tue, 6 Nov 2007 10:28:44 -0500 (EST) From: "Robert P. J. Day" X-X-Sender: rpjday@localhost.localdomain To: xfs@oss.sgi.com Subject: use is_power_of_2() macro? 
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - astoria.ccjclearline.com
X-AntiAbuse: Original Domain - oss.sgi.com
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - crashcourse.ca
X-Source:
X-Source-Args:
X-Source-Dir:
X-Virus-Scanned: ClamAV 0.91.2/4682/Tue Nov 6 07:42:37 2007 on oss.sgi.com
X-Virus-Status: Clean
X-archive-position: 13570
X-ecartis-version: Ecartis v1.0.0
Sender: xfs-bounce@oss.sgi.com
Errors-to: xfs-bounce@oss.sgi.com
X-original-sender: rpjday@crashcourse.ca
Precedence: bulk
X-list: xfs

  given this in fs/xfs/xfs_inode.c:

/*
 * xfs_iroundup: round up argument to next power of two
 */
uint
xfs_iroundup(
        uint    v)
{
        int     i;
        uint    m;

        if ((v & (v - 1)) == 0)
                return v;
        ASSERT((v & 0x80000000) == 0);
        if ((v & (v + 1)) == 0)
                return v + 1;
        for (i = 0, m = 1; i < 31; i++, m <<= 1) {
                if (v & m)
                        continue;
                v |= m;
                if ((v & (v + 1)) == 0)
                        return v + 1;
        }
        ASSERT(0);
        return( 0 );
}

is there any reason that can't be rewritten with simply
roundup_pow_of_two() as defined in include/linux/log2.h?

#define roundup_pow_of_two(n)                   \
(                                               \
        __builtin_constant_p(n) ? (             \
                (n == 1) ? 1 :                  \
                (1UL << (ilog2((n) - 1) + 1))   \
                                   ) :          \
        __roundup_pow_of_two(n)                 \
 )

  just curious.

rday
--
========================================================================
Robert P. J.
Day Linux Consulting, Training and Annoying Kernel Pedantry Waterloo, Ontario, CANADA http://crashcourse.ca ======================================================================== From owner-xfs@oss.sgi.com Tue Nov 6 10:59:29 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 10:59:36 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-4.6 required=5.0 tests=AWL,BAYES_00, RCVD_IN_DNSWL_MED,SPF_HELO_PASS,SUBJECT_FUZZY_TION autolearn=ham version=3.3.0-r574664 Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6IxPYn028921 for ; Tue, 6 Nov 2007 10:59:29 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.1) with ESMTP id lA6IxRXo000423; Tue, 6 Nov 2007 13:59:27 -0500 Received: from lacrosse.corp.redhat.com (lacrosse.corp.redhat.com [172.16.52.154]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id lA6IxR93023320; Tue, 6 Nov 2007 13:59:27 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by lacrosse.corp.redhat.com (8.12.11.20060308/8.11.6) with ESMTP id lA6IxO6U031373; Tue, 6 Nov 2007 13:59:26 -0500 Message-ID: <4730B98C.5090008@sandeen.net> Date: Tue, 06 Nov 2007 12:59:24 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.12 (X11/20070530) MIME-Version: 1.0 To: Joshua Baker-LePain CC: Amandine AUPETIT , xfs@oss.sgi.com Subject: Re: 7Tb XFS partition lost on reboot References: <473072FD.4070104@jamendo.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4683/Tue Nov 6 10:30:56 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13571 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Joshua 
Baker-LePain wrote: > On Tue, 6 Nov 2007 at 2:58pm, Amandine AUPETIT wrote > >> So I created the partition with parted, because fdisk can't do more that 2tb >> partitions. >> It's ok, I can do what I want but... >> >> on reboot, there is a Superblock problem, something like that. When I check >> with xfs_check : > > First guess -- did you use a gpt disklabel on that device? Standard > (msdos) disklabels don't work on devices >2TB. The usual symptom of a big > device with an msdos disklabel is that the partition table goes away on > reboot. I second that hunch. :) -Eric From owner-xfs@oss.sgi.com Tue Nov 6 11:04:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 11:04:07 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-4.6 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_42, J_CHICKENPOX_43,RCVD_IN_DNSWL_MED,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664 Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6J43b7029852 for ; Tue, 6 Nov 2007 11:04:04 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.13.8/8.13.1) with ESMTP id lA6J47a8001845; Tue, 6 Nov 2007 14:04:07 -0500 Received: from lacrosse.corp.redhat.com (lacrosse.corp.redhat.com [172.16.52.154]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id lA6J47xK026664; Tue, 6 Nov 2007 14:04:07 -0500 Received: from [10.15.80.10] (neon.msp.redhat.com [10.15.80.10]) by lacrosse.corp.redhat.com (8.12.11.20060308/8.11.6) with ESMTP id lA6J46uE000447; Tue, 6 Nov 2007 14:04:06 -0500 Message-ID: <4730BAA5.1080406@sandeen.net> Date: Tue, 06 Nov 2007 13:04:05 -0600 From: Eric Sandeen User-Agent: Thunderbird 1.5.0.12 (X11/20070530) MIME-Version: 1.0 To: Bhagi rathi CC: David Chinner , xfs@oss.sgi.com Subject: Re: TAKE 972756 - Implement fallocate. 
References: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> <20071106001223.GY66820511@sgi.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4683/Tue Nov 6 10:30:56 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13572 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Bhagi rathi wrote: > File is of size 1k. A 4k block is allocated as file-system block size is > 4k. > Preallocation happened from 1k to 256k. Now, it looks to me that we have > un-written extents from 4k to 256k. There is no guarantee that data from 1k > to 4k is all zero'es. Fallocate is updating size. Hence on subsequent read, > we can get garbage from 1k to 4k and all zero'es from 4k to 256k You've tested this and found it to be true? -Eric > Is the expectation here is application should take the responsibility of > zero'ing > data? I still need to through fallocate requirements. 
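
One way to check the behaviour questioned above from user space is to write 1 KiB of non-zero data, preallocate out to 256 KiB, and read the range back. The following is only a minimal sketch under stated assumptions, not code from this thread: it uses the portable posix_fallocate() call rather than the new fallocate interface under discussion, and the helper name first_stale_byte() and both constants are made up for the illustration. Note that glibc may emulate posix_fallocate() by writing zeros on filesystems without native support, and that reads of a freshly written file are served from the page cache, so a clean result shows only what an application would read back, not how the extents are laid out on disk.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for pread() and posix_fallocate() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define DATA_LEN	1024L	/* bytes of real data written (hypothetical) */
#define PREALLOC_LEN	262144L	/* preallocate out to 256 KiB (hypothetical) */

/*
 * Hypothetical helper: create `path`, write DATA_LEN bytes of 0xcd,
 * preallocate the file out to PREALLOC_LEN, then scan the range
 * [DATA_LEN, PREALLOC_LEN).
 *
 * Returns -1 if every byte in that range reads back as zero, the
 * offset of the first non-zero byte otherwise, or -2 on any error.
 * The file is removed again before returning.
 */
long first_stale_byte(const char *path)
{
	unsigned char buf[4096];
	long off, result = -1;
	ssize_t n, i;
	int fd;

	/* the initial write is served from buf, so it must fit */
	assert(DATA_LEN <= (long)sizeof(buf));

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -2;

	memset(buf, 0xcd, sizeof(buf));
	if (write(fd, buf, DATA_LEN) != DATA_LEN || fsync(fd) != 0)
		goto fail;

	/* posix_fallocate() returns an errno value directly, 0 on success */
	if (posix_fallocate(fd, 0, PREALLOC_LEN) != 0)
		goto fail;

	for (off = DATA_LEN; off < PREALLOC_LEN; off += n) {
		size_t want = sizeof(buf);

		if ((long)want > PREALLOC_LEN - off)
			want = (size_t)(PREALLOC_LEN - off);
		n = pread(fd, buf, want, off);
		if (n <= 0)
			goto fail;
		for (i = 0; i < n; i++) {
			if (buf[i] != 0) {
				result = off + i;
				goto out;
			}
		}
	}
out:
	close(fd);
	unlink(path);
	return result;
fail:
	close(fd);
	unlink(path);
	return -2;
}
```

Calling first_stale_byte() with a path on the filesystem under test would be expected to return -1 if the range from 1k to 256k reads back as zeros; any non-negative return value is the offset of the first garbage byte.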
From owner-xfs@oss.sgi.com Tue Nov 6 11:18:28 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 11:18:33 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6JIQBR032278 for ; Tue, 6 Nov 2007 11:18:27 -0800 Received: from f237116.upc-f.chello.nl ([80.56.237.116] helo=[192.168.0.111]) by pentafluge.infradead.org with esmtpsa (Exim 4.63 #1 (Red Hat Linux)) id 1IpTfv-0007IT-Cx; Tue, 06 Nov 2007 19:01:23 +0000 Subject: Re: writeout stalls in current -git From: Peter Zijlstra To: David Chinner Cc: Torsten Kaiser , Fengguang Wu , Maxim Levitsky , linux-kernel@vger.kernel.org, Andrew Morton , linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com In-Reply-To: <20071106042527.GT995458@sgi.com> References: <393903856.06449@ustc.edu.cn> <64bb37e0711011120i63cdfe3ci18995d57b6649a8@mail.gmail.com> <64bb37e0711011200n228e708eg255640388f83da22@mail.gmail.com> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com> <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com> <20071105014510.GU66820511@sgi.com> <64bb37e0711051027v49869699s9593ea54713b15ff@mail.gmail.com> <20071106042527.GT995458@sgi.com> Content-Type: text/plain Date: Tue, 06 Nov 2007 20:01:22 +0100 Message-Id: <1194375682.6289.88.camel@twins> Mime-Version: 1.0 X-Mailer: Evolution 2.10.1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4683/Tue Nov 6 10:30:56 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13573 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: peterz@infradead.org Precedence: bulk X-list: xfs On Tue, 2007-11-06 at 15:25 
+1100, David Chinner wrote: > I'm struggling to understand what possible changed in XFS or writeback that > would lead to stalls like this, esp. as you appear to be removing files when > the stalls occur. Just a crazy idea,.. Could there be a set_page_dirty() that doesn't have balance_dirty_pages() call near? For example modifying meta data in unlink? Such a situation could lead to an excess of dirty pages and the next call to balance_dirty_pages() would appear to stall, as it would desperately try to get below the limit again. From owner-xfs@oss.sgi.com Tue Nov 6 12:26:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 12:26:18 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.180]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6KQ1jx011466 for ; Tue, 6 Nov 2007 12:26:10 -0800 Received: by py-out-1112.google.com with SMTP id u77so4148314pyb for ; Tue, 06 Nov 2007 12:26:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=aSteImI9UbqzHyP5g7FCeQ+QCyP8T0EWeHKKbofjW0o=; b=S7gDtLwRUiE5Ovdf8ur4fxwRnoaOXK2sVwA7E6OUz4ya4PK2by4zM+QHiMD9mHqPkz1Efzq8LL0dVaC3m9nRfYT0tYScklfxXCCRNdvnlmmu1+JbkLoPkm55cQP2zRLLaTmunT9GLIZWWaHJOyEp3qti/rRHNfGlhE7Ilm2147E= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; 
b=QRbL0zT95BKKQ+R/H0hZFxCjLgIYnr0gnPQaQ3b0R5Py/dP7MltHWw118p4ZP2lls6Em6I34ZooZDS6VGmuPG+n8+r+Y1lnl53ncf0FZorpLCHGSJBxP+KoDzH3VqV7FjhB9n2nBppye+f2yZgLDHHLdT7xiMENqLPpeLDHRydM= Received: by 10.65.100.14 with SMTP id c14mr13116180qbm.1194380765201; Tue, 06 Nov 2007 12:26:05 -0800 (PST) Received: by 10.65.112.13 with HTTP; Tue, 6 Nov 2007 12:26:05 -0800 (PST) Message-ID: <64bb37e0711061226l48dce395ub2f9539efc66ecc0@mail.gmail.com> Date: Tue, 6 Nov 2007 21:26:05 +0100 From: "Torsten Kaiser" To: "Peter Zijlstra" Subject: Re: writeout stalls in current -git Cc: "David Chinner" , "Fengguang Wu" , "Maxim Levitsky" , linux-kernel@vger.kernel.org, "Andrew Morton" , linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com In-Reply-To: <1194375682.6289.88.camel@twins> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <393903856.06449@ustc.edu.cn> <1193998532.27652.343.camel@twins> <64bb37e0711021222q7d12c825mc62d433c4fe19e8@mail.gmail.com> <20071102204258.GR995458@sgi.com> <64bb37e0711040319l5de285c3xea64474540a51b6e@mail.gmail.com> <20071105014510.GU66820511@sgi.com> <64bb37e0711051027v49869699s9593ea54713b15ff@mail.gmail.com> <20071106042527.GT995458@sgi.com> <1194375682.6289.88.camel@twins> X-Virus-Scanned: ClamAV 0.91.2/4683/Tue Nov 6 10:30:56 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13574 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: just.for.lkml@googlemail.com Precedence: bulk X-list: xfs On 11/6/07, Peter Zijlstra wrote: > On Tue, 2007-11-06 at 15:25 +1100, David Chinner wrote: > > > I'm struggling to understand what possible changed in XFS or writeback that > > would lead to stalls like this, esp. as you appear to be removing files when > > the stalls occur. > > Just a crazy idea,.. > > Could there be a set_page_dirty() that doesn't have > balance_dirty_pages() call near? 
For example modifying meta data in > unlink? > > Such a situation could lead to an excess of dirty pages and the next > call to balance_dirty_pages() would appear to stall, as it would > desperately try to get below the limit again. Only if accounting of the dirty pages is also broken. In the unmerge testcase I see most of the time only <200kb of dirty data in /proc/meminfo. The system has 4Gb of RAM so I'm not sure if it should ever be valid to stall even the emerge/install testcase. Torsten Now building a kernel with the skipped-pages-accounting-patch reverted... From owner-xfs@oss.sgi.com Tue Nov 6 12:41:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 12:41:14 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_42 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA6Kf6iY013333 for ; Tue, 6 Nov 2007 12:41:10 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id HAA07528; Wed, 7 Nov 2007 07:41:04 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA6Kf2dD96273635; Wed, 7 Nov 2007 07:41:03 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA6Kf1Tj94970484; Wed, 7 Nov 2007 07:41:01 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Wed, 7 Nov 2007 07:41:00 +1100 From: David Chinner To: Bhagi rathi Cc: David Chinner , xfs@oss.sgi.com Subject: Re: TAKE 972756 - Implement fallocate. 
Message-ID: <20071106204100.GW995458@sgi.com> References: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> <20071106001223.GY66820511@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4683/Tue Nov 6 10:30:56 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13575 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Tue, Nov 06, 2007 at 10:57:03PM +0530, Bhagi rathi wrote: > File is of size 1k. A 4k block is allocated as file-system block size is > 4k. > Preallocation happened from 1k to 256k. Now, it looks to me that we have > un-written extents from 4k to 256k. There is no guarantee that data from 1k > to 4k is all zero'es. Fallocate is updating size. Hence on subsequent read, > we can get garbage from 1k to 4k and all zero'es from 4k to 256k # rm /mnt/test/fred # xfs_io -f -c "pwrite 0 1024" -c "fsync" -c "falloc_allocsp 0 262144" -c "bmap -vp" /mnt/test/fred wrote 1024/1024 bytes at offset 0 1 KiB, 1 ops; 0.0000 sec (42.459 MiB/sec and 43478.2609 ops/sec) /mnt/test/fred: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS 0: [0..7]: 14520..14527 0 (14520..14527) 8 00000 1: [8..511]: 345688..346191 0 (345688..346191) 504 10000 # dd if=/mnt/test/fred bs=4k count=1 |od -Ax 1+0 records in 1+0 records out 4096 bytes (4.1 kB) copied, 0.004566 seconds, 897 kB/s 000000 146715 146715 146715 146715 146715 146715 146715 146715 * 000400 000000 000000 000000 000000 000000 000000 000000 000000 * 001000 Only 1k of modified data, then 3k of zeros, then a bunch of unwritten extents out to EOF. Cheers, Dave. 
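The xfs_io and dd check above can be approximated from user space. What follows is only a sketch under stated assumptions: it uses the portable posix_fallocate() call rather than the XFS preallocation interface behind falloc_allocsp, and the helper name check_prealloc_tail and its return convention are made up for illustration. It writes 1k of 0xcd bytes (the fill that od prints as 146715), extends the file to 256k, and checks that the rest of the first 4k block reads back as zeros.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of xfs_io or XFS). Returns 0 if the first
 * 4k of the file holds 1k of 0xcd followed by 3k of zeros after
 * preallocating out to 256k, 1 if the contents differ, -1 on an I/O error.
 */
int check_prealloc_tail(const char *path)
{
	unsigned char buf[4096];
	size_t i;
	int fd, ret = -1;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* Equivalent of "pwrite 0 1024": 1k of the 0xcd pattern. */
	memset(buf, 0xcd, 1024);
	if (pwrite(fd, buf, 1024, 0) != 1024)
		goto out;

	/* Equivalent of "fsync". */
	if (fsync(fd) != 0)
		goto out;

	/*
	 * Portable stand-in for the preallocation step. posix_fallocate()
	 * returns an errno value on failure, not -1.
	 */
	if (posix_fallocate(fd, 0, 262144) != 0)
		goto out;

	/* Equivalent of "dd bs=4k count=1": read the first block back. */
	memset(buf, 0xff, sizeof(buf));
	if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
		goto out;

	ret = 0;
	for (i = 0; i < 1024; i++)
		if (buf[i] != 0xcd)
			ret = 1;	/* written data was not preserved */
	for (i = 1024; i < sizeof(buf); i++)
		if (buf[i] != 0)
			ret = 1;	/* tail of the block is not zeroed */
out:
	close(fd);
	return ret;
}
```

A return value of 0 matches the od output above. The sketch says nothing about unwritten extents on filesystems other than XFS, where posix_fallocate() may be implemented differently.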
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Nov 6 12:56:05 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 12:56:09 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA6Ku13c015102 for ; Tue, 6 Nov 2007 12:56:04 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id HAA07893; Wed, 7 Nov 2007 07:55:59 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA6KtwdD96464484; Wed, 7 Nov 2007 07:55:58 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA6KtuKm95774644; Wed, 7 Nov 2007 07:55:56 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Wed, 7 Nov 2007 07:55:56 +1100 From: David Chinner To: Cedric - Equinoxe Media Cc: David Chinner , xfs@oss.sgi.com Subject: Re: xfs crash Message-ID: <20071106205556.GZ995458@sgi.com> References: <20071105215135.GA12238@e-m.fr> <20071106082632.GU995458@sgi.com> <20071106092157.GB16694@e-m.fr> <20071106160721.GB25295@e-m.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20071106160721.GB25295@e-m.fr> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4684/Tue Nov 6 11:09:39 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13576 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Tue, Nov 06, 2007 
at 05:07:21PM +0100, Cedric - Equinoxe Media wrote: > I just had exactly the same crash again today : > /dev/sda4 on /filer type xfs (rw,noexec,nosuid,nodev,noatime) What did xfs_check tell you about the corruption? > Seems to be again on a setattr() ? Doing a truncation freeing some blocks. What is the client doing (i.e. I/O patterns, application, etc.) to cause this? Can you reproduce it without NFS being used? To track this down I'm going to need a reproducible test case.... Seeing this is a brand new server, have you run any soak or stress tests on the raw storage to confirm it is error free? Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Nov 6 14:38:42 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 14:38:48 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_42 autolearn=no version=3.3.0-r574664 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA6McbiH026277 for ; Tue, 6 Nov 2007 14:38:41 -0800 Received: from edge.yarra.acx (unknown [203.89.192.141]) by postoffice.aconex.com (Postfix) with ESMTP id 727B492C5AC; Wed, 7 Nov 2007 09:38:41 +1100 (EST) Subject: Re: TAKE 972756 - Implement fallocate.
From: Nathan Scott Reply-To: nscott@aconex.com To: Bhagi rathi , David Chinner Cc: xfs@oss.sgi.com In-Reply-To: <20071106204100.GW995458@sgi.com> References: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> <20071106001223.GY66820511@sgi.com> <20071106204100.GW995458@sgi.com> Content-Type: text/plain Organization: Aconex Date: Wed, 07 Nov 2007 09:38:53 +1100 Message-Id: <1194388733.3862.206.camel@edge.yarra.acx> Mime-Version: 1.0 X-Mailer: Evolution 2.6.3 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4684/Tue Nov 6 11:09:39 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13577 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs On Wed, 2007-11-07 at 07:41 +1100, David Chinner wrote: > > Preallocation happened from 1k to 256k. Now, it looks to me that we > have > > un-written extents from 4k to 256k. There is no guarantee that data > from 1k > > to 4k is all zero'es. That guarantee does exist - when the initial 1K block write is done, the end of the block is zeroed (by the kernel write path). This is always done (guaranteed) and is required independently to unwritten extents. cheers. 
-- Nathan From owner-xfs@oss.sgi.com Tue Nov 6 20:58:45 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 20:58:47 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from rv-out-0910.google.com (rv-out-0910.google.com [209.85.198.186]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA74wgQ4021606 for ; Tue, 6 Nov 2007 20:58:44 -0800 Received: by rv-out-0910.google.com with SMTP id k20so1515943rvb for ; Tue, 06 Nov 2007 20:58:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; bh=/OCOo4yLX3XBaDQTzpW04ma0EgRKZRo8JYDfB4wlBIg=; b=Uv0CbHorBtp086Jl0VQEZcTQn7fSY3ejCkIGrfHG5P9bN+lKV5+3fG55oNQQdO4IgBz5rwLtFvmyA1Q6SdmwD+p/7+jYnJDiJAFaDDOmKsydFTfyAs1Rhs3dzpJo0NWDWU/t+qSIfvtmwuiFcsPZCND7RcYLIIs6hkXZ6o8Oh/s= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:mime-version:content-type:content-transfer-encoding:content-disposition; b=gsXjC/lFtz1weSFvus2RAqiR1dAFuxF/d64BtQq2FnNdTyZqtERr40PfcHALH4fU6MYA9WxmSdjbCbDhVLsGbuEUdSyNv08jFEkaUICNQqA9zTIaqBk3ZiE8OAY/vC9otadD6OVw6OGBoh3zshX7d3Uk9uhsWscvavuOoSfUbqE= Received: by 10.115.88.1 with SMTP id q1mr1689915wal.1194409932422; Tue, 06 Nov 2007 20:32:12 -0800 (PST) Received: by 10.115.88.8 with HTTP; Tue, 6 Nov 2007 20:32:12 -0800 (PST) Message-ID: Date: Wed, 7 Nov 2007 10:02:12 +0530 From: "Manoj Kumar Pradhan" To: xfs@oss.sgi.com Subject: Deviation from XSDM in DM_EVENT_XXX MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Virus-Scanned: ClamAV 0.91.2/4689/Tue Nov 6 20:23:47 2007 on oss.sgi.com
X-Virus-Status: Clean X-archive-position: 13579 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: manojkp80@gmail.com Precedence: bulk X-list: xfs Hi, Can someone tell me why XFS-DMAPI deviates in the enum DM_EVEN_XXX from the standard? Thanks, Manoj From owner-xfs@oss.sgi.com Tue Nov 6 21:18:59 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 21:19:05 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA75Isf5024386 for ; Tue, 6 Nov 2007 21:18:57 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA20497; Wed, 7 Nov 2007 16:18:52 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA75IpdD96666406; Wed, 7 Nov 2007 16:18:52 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA75ImNS96065424; Wed, 7 Nov 2007 16:18:48 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Wed, 7 Nov 2007 16:18:48 +1100 From: David Chinner To: "Robert P. J. Day" Cc: xfs@oss.sgi.com Subject: Re: use is_power_of_2() macro? 
Message-ID: <20071107051848.GI995458@sgi.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4689/Tue Nov 6 20:23:47 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13580 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Tue, Nov 06, 2007 at 10:28:44AM -0500, Robert P. J. Day wrote: > > given this in fs/xfs/xfs_inode.c: > > /* > * xfs_iroundup: round up argument to next power of two > */ > uint > xfs_iroundup( > uint v) > { > int i; > uint m; > > if ((v & (v - 1)) == 0) > return v; > ASSERT((v & 0x80000000) == 0); > if ((v & (v + 1)) == 0) > return v + 1; > for (i = 0, m = 1; i < 31; i++, m <<= 1) { > if (v & m) > continue; > v |= m; > if ((v & (v + 1)) == 0) > return v + 1; > } > ASSERT(0); > return( 0 ); > } > > is there any reason that can't be rewritten with simply > roundup_pow_of_two() as defined in include/linux/log2.h? > > #define roundup_pow_of_two(n) \ > ( \ > __builtin_constant_p(n) ? ( \ > (n == 1) ? 1 : \ > (1UL << (ilog2((n) - 1) + 1)) \ > ) : \ > __roundup_pow_of_two(n) \ > ) > > just curious. No - patch please. Cheers, Dave. 
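To see why the two helpers are interchangeable for the sizes XFS passes in, the quoted loop can be checked against a plain "next power of two" routine in user space. This is a sketch only: roundup_pow2_sketch and helpers_agree are illustrative names, the stand-in is not the kernel's roundup_pow_of_two() macro, and ASSERT is mapped to the C library assert().

```c
#include <assert.h>

/* Copy of the xfs_iroundup() loop quoted above, with ASSERT -> assert. */
unsigned int xfs_iroundup(unsigned int v)
{
	int i;
	unsigned int m;

	if ((v & (v - 1)) == 0)
		return v;		/* already a power of two (or zero) */
	assert((v & 0x80000000) == 0);
	if ((v & (v + 1)) == 0)
		return v + 1;		/* all low bits set, e.g. 7 -> 8 */
	for (i = 0, m = 1; i < 31; i++, m <<= 1) {
		if (v & m)
			continue;
		v |= m;			/* fill in low zero bits one at a time */
		if ((v & (v + 1)) == 0)
			return v + 1;
	}
	assert(0);
	return 0;
}

/*
 * User-space stand-in for roundup_pow_of_two(). Assumption: only valid
 * for 1 <= n <= 0x80000000; larger n would overflow the shift.
 */
unsigned int roundup_pow2_sketch(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Returns 1 if both helpers agree for every v in [1, limit]. */
int helpers_agree(unsigned int limit)
{
	unsigned int v;

	for (v = 1; v <= limit; v++)
		if (xfs_iroundup(v) != roundup_pow2_sketch(v))
			return 0;
	return 1;
}
```

The one place the two differ is zero: the quoted xfs_iroundup() returns 0 for an argument of 0, while the stand-in returns 1 and the kernel macro's result for 0 is undefined, so any conversion should make sure callers never pass 0.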
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Tue Nov 6 21:42:26 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 21:42:36 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=AWL,BAYES_00,HTML_MESSAGE, J_CHICKENPOX_42 autolearn=no version=3.3.0-r574664 Received: from nz-out-0506.google.com (nz-out-0506.google.com [64.233.162.224]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA75gOGb026653 for ; Tue, 6 Nov 2007 21:42:25 -0800 Received: by nz-out-0506.google.com with SMTP id x3so1391256nzd for ; Tue, 06 Nov 2007 21:42:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; bh=wlZWvUDl1yZ4aFk5SmE5QW4tW1UDfVwiNd9J8RN7mTk=; b=eIjCB0Sv9TOiztUUbEL+PLfvTO7vL0rx+zc4sVLea302OlHwV8CA6WZS1onJf84Ir9crzPfMxA5MT0IZ4RYRRsr0zSCRyEsE9Uar+1woLJhbvPp/M3mZrWdd3rr7BgelMclrBcFVqasYjQsQj3BhMzsSKKvhDzCWoH+7udMt0Y4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=Fz0P8MWta04jnXP4EppGxDEBGmidyZt3FHCEfzJb91Q8aKHRMlwFqcmwOBWJi4X9q+TQh4nK80ylYBqJZ0ppCuErlfveYeDeaT2V4qXYJm+YZBiNxb0HgmOwwXx/IN8QTVBAuTdZ+NZtq7hw42Dn/k7TobcaPl3r/m0xhWQDeok= Received: by 10.142.213.9 with SMTP id l9mr1911578wfg.1194414148592; Tue, 06 Nov 2007 21:42:28 -0800 (PST) Received: by 10.142.162.19 with HTTP; Tue, 6 Nov 2007 21:42:28 -0800 (PST) Message-ID: Date: Wed, 7 Nov 2007 11:12:28 +0530 From: "Bhagi rathi" To: nscott@aconex.com Subject: Re: TAKE 972756 - Implement fallocate. 
Cc: "David Chinner" , xfs@oss.sgi.com In-Reply-To: <1194388733.3862.206.camel@edge.yarra.acx> MIME-Version: 1.0 References: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> <20071106001223.GY66820511@sgi.com> <20071106204100.GW995458@sgi.com> <1194388733.3862.206.camel@edge.yarra.acx> X-Virus-Scanned: ClamAV 0.91.2/4689/Tue Nov 6 20:23:47 2007 on oss.sgi.com X-Virus-Status: Clean Content-Type: text/plain Content-Disposition: inline Content-Transfer-Encoding: 7bit Content-length: 1345 X-archive-position: 13581 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jahnu77@gmail.com Precedence: bulk X-list: xfs Since the logged size change and the data I/O are not bound together, it is always possible for the new size to reach the disk before the data does. The other problem comes from speculative allocation. Write-back can convert delayed-allocation extents into real extents, and the excess is pruned only when the file is closed. If fallocate arrives before that, it allocates its extents, but the extents left over from delayed-allocation write-back will not have zeroed content. Conceptually, if fallocate intends to change the size, it is no different from a size-extending write. We call xfs_zero_eof for a write but not in this case. I may be missing some context about how fallocate is used, if its semantics are overloaded. -Thanks, Bhagi. On 11/7/07, Nathan Scott wrote: > > On Wed, 2007-11-07 at 07:41 +1100, David Chinner wrote: > > > Preallocation happened from 1k to 256k. Now, it looks to me that we > > have > > > un-written extents from 4k to 256k. There is no guarantee that data > > from 1k > > > to 4k is all zero'es. > > That guarantee does exist - when the initial 1K block write is done, the > end of the block is zeroed (by the kernel write path). This is always > done (guaranteed) and is required independently to unwritten extents. > > cheers.
> > -- > Nathan > > [[HTML alternate version deleted]] From owner-xfs@oss.sgi.com Tue Nov 6 23:36:26 2007 Received: with ECARTIS (v1.0.0; list xfs); Tue, 06 Nov 2007 23:36:32 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from astoria.ccjclearline.com (astoria.ccjclearline.com [64.235.106.9]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA77aMvs007362 for ; Tue, 6 Nov 2007 23:36:26 -0800 Received: from [99.236.101.138] (helo=crashcourse.ca) by astoria.ccjclearline.com with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.68) (envelope-from ) id 1IpfSd-0005EZ-28; Wed, 07 Nov 2007 02:36:27 -0500 Date: Wed, 7 Nov 2007 02:34:36 -0500 (EST) From: "Robert P. J. Day" X-X-Sender: rpjday@localhost.localdomain To: xfs@oss.sgi.com cc: dgc@sgi.com Subject: [PATCH] XFS: Use kernel-supplied "roundup_pow_of_two" for simplicity. Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - astoria.ccjclearline.com X-AntiAbuse: Original Domain - oss.sgi.com X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - crashcourse.ca X-Source: X-Source-Args: X-Source-Dir: X-Virus-Scanned: ClamAV 0.91.2/4691/Tue Nov 6 21:39:41 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13582 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: rpjday@crashcourse.ca Precedence: bulk X-list: xfs Signed-off-by: Robert P. J. Day --- compile-tested on i386. 
 fs/xfs/xfs_inode.c | 32 ++++----------------------------
 fs/xfs/xfs_inode.h | 1 -
 2 files changed, 4 insertions(+), 29 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index abf509a..bcc3d27 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -15,6 +15,8 @@
  * along with this program; if not, write the Free Software Foundation,
  * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
  */
+#include <linux/log2.h>
+
 #include "xfs.h"
 #include "xfs_fs.h"
 #include "xfs_types.h"
@@ -3672,32 +3674,6 @@ xfs_iaccess(
 	return XFS_ERROR(EACCES);
 }
 
-/*
- * xfs_iroundup: round up argument to next power of two
- */
-uint
-xfs_iroundup(
-	uint	v)
-{
-	int	i;
-	uint	m;
-
-	if ((v & (v - 1)) == 0)
-		return v;
-	ASSERT((v & 0x80000000) == 0);
-	if ((v & (v + 1)) == 0)
-		return v + 1;
-	for (i = 0, m = 1; i < 31; i++, m <<= 1) {
-		if (v & m)
-			continue;
-		v |= m;
-		if ((v & (v + 1)) == 0)
-			return v + 1;
-	}
-	ASSERT(0);
-	return( 0 );
-}
-
 #ifdef XFS_ILOCK_TRACE
 ktrace_t	*xfs_ilock_trace_buf;
 
@@ -4204,7 +4180,7 @@ xfs_iext_realloc_direct(
 			return;
 		}
 		if (!is_power_of_2(new_size)){
-			rnew_size = xfs_iroundup(new_size);
+			rnew_size = roundup_pow_of_two(new_size);
 		}
 		if (rnew_size != ifp->if_real_bytes) {
 			ifp->if_u1.if_extents =
@@ -4227,7 +4203,7 @@ xfs_iext_realloc_direct(
 	else {
 		new_size += ifp->if_bytes;
 		if (!is_power_of_2(new_size)) {
-			rnew_size = xfs_iroundup(new_size);
+			rnew_size = roundup_pow_of_two(new_size);
 		}
 		xfs_iext_inline_to_direct(ifp, rnew_size);
 	}
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index e5aff92..e3a552e 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -568,7 +568,6 @@ int xfs_iextents_copy(xfs_inode_t *, xfs_bmbt_rec_t *, int);
 int		xfs_iflush(xfs_inode_t *, uint);
 void		xfs_iflush_all(struct xfs_mount *);
 int		xfs_iaccess(xfs_inode_t *, mode_t, cred_t *);
-uint		xfs_iroundup(uint);
 void		xfs_ichgtime(xfs_inode_t *, int);
 xfs_fsize_t	xfs_file_last_byte(xfs_inode_t *);
 void		xfs_lock_inodes(xfs_inode_t **, int, int, uint);
-- 
======================================================================== Robert P. J. Day Linux Consulting, Training and Annoying Kernel Pedantry Waterloo, Ontario, CANADA http://crashcourse.ca ======================================================================== From owner-xfs@oss.sgi.com Wed Nov 7 01:34:52 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Nov 2007 01:34:56 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_42 autolearn=no version=3.3.0-r574664 Received: from postoffice.aconex.com (mail.app.aconex.com [203.89.192.138]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA79YoO4025749 for ; Wed, 7 Nov 2007 01:34:52 -0800 Received: from mail.aconex.com (castle.yarra.acx [192.168.3.3]) by postoffice.aconex.com (Postfix) with ESMTP id 2168D92C8E4; Wed, 7 Nov 2007 20:34:56 +1100 (EST) Received: from 192.168.3.1 (proxying for 58.107.42.33) (SquirrelMail authenticated user nscott) by mail.aconex.com with HTTP; Wed, 7 Nov 2007 20:35:21 +1100 (EST) Message-ID: <56697.192.168.3.1.1194428121.squirrel@mail.aconex.com> In-Reply-To: References: <20071102024314.9BF3458C38F7@chook.melbourne.sgi.com> <20071106001223.GY66820511@sgi.com> <20071106204100.GW995458@sgi.com> <1194388733.3862.206.camel@edge.yarra.acx> Date: Wed, 7 Nov 2007 20:35:21 +1100 (EST) Subject: Re: TAKE 972756 - Implement fallocate. 
From: nscott@aconex.com To: "Bhagi rathi" Cc: "David Chinner" , xfs@oss.sgi.com User-Agent: SquirrelMail/1.4.8-4.el4.centos MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal X-Virus-Scanned: ClamAV 0.91.2/4691/Tue Nov 6 21:39:41 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13583 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: nscott@aconex.com Precedence: bulk X-list: xfs > Since size log change and data I/O are not binded, it is always possible > that size can reach to the > disk before I/O reaching to the disk. Not clear what that has to do with whether partial blocks are zeroed or not? Can you give a specific series of steps that would demonstrate a problem? (preferably with a test case) > Also, the other problem is because > of > speculative allocation. > A write-back allocation can leady to allocation of delayed extents into > real > and gets pruned only > close of the file. > Before that, we get fallocate, it allocates the exents, > but the extents residing > because of delayed allocation write-back will not have zero'ed content. Again, I think a test case demonstrating the problem would go a long way to helping explain the issue. The preallocation code and ioctl interface have been in XFS forever on Linux - are you reporting problems you've actually observed here, or are these rather "potential issues" that you foresee from code analysis? cheers. 
-- Nathan From owner-xfs@oss.sgi.com Wed Nov 7 01:55:37 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Nov 2007 01:55:42 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA79tXg3028455 for ; Wed, 7 Nov 2007 01:55:35 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id UAA25829; Wed, 7 Nov 2007 20:55:33 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA79tWdD96463197; Wed, 7 Nov 2007 20:55:33 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA79tUun92134906; Wed, 7 Nov 2007 20:55:30 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Wed, 7 Nov 2007 20:55:30 +1100 From: David Chinner To: Manoj Kumar Pradhan Cc: xfs@oss.sgi.com Subject: Re: Deviation from XSDM in DM_EVENT_XXX Message-ID: <20071107095530.GJ995458@sgi.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4691/Tue Nov 6 21:39:41 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13584 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Wed, Nov 07, 2007 at 10:02:12AM +0530, Manoj Kumar Pradhan wrote: > Hi, > > Can someone tell me why XFS-DMAPI deviates in the enum DM_EVEN_XXX > from the standard? 
From the spec: (http://www.opengroup.org/onlinepubs/9657099/chap4.htm) " dm_eventtype_t REQUIREMENT This enumeration must contain at least the elements listed here. The DMAPI implementation may choose a different order for the elements. " So as long as we have the events defined, it doesn't matter what their value or order in the enum is. It's a very rubbery spec.... Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Wed Nov 7 02:57:54 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Nov 2007 02:57:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from spike.grumly.eu.org (spike.grumly.eu.org [195.5.253.226]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA7AvoJU004674 for ; Wed, 7 Nov 2007 02:57:54 -0800 Received: by spike.grumly.eu.org (Postfix, from userid 1001) id 528F911935; Wed, 7 Nov 2007 11:58:14 +0100 (CET) Date: Wed, 7 Nov 2007 11:58:14 +0100 From: Cedric - Equinoxe Media To: David Chinner Cc: xfs@oss.sgi.com Subject: Re: xfs crash Message-ID: <20071107105814.GD25295@e-m.fr> References: <20071105215135.GA12238@e-m.fr> <20071106082632.GU995458@sgi.com> <20071106092157.GB16694@e-m.fr> <20071106160721.GB25295@e-m.fr> <20071106205556.GZ995458@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20071106205556.GZ995458@sgi.com> X-Virus-Scanned: ClamAV 0.91.2/4691/Tue Nov 6 21:39:41 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13585 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: cedric@e-m.fr Precedence: bulk X-list: xfs On 07/11/2007 07:55, David Chinner wrote: > On Tue, Nov 06, 2007 at 05:07:21PM +0100, Cedric - Equinoxe Media wrote: > 
> I just had exactly the same crash again today : > > /dev/sda4 on /filer type xfs (rw,noexec,nosuid,nodev,noatime) > > What did xfs_check tell you about the corruption? I had quite the same message as last time, forgot to copy/paste... > > Seems to be again on a setattr() ? > > Doing a truncation freeing some blocks. > > What is the client doing (i.e. io patterns, application, etc) to > cause this? can you reproduce it without NFS being used? To track > this down I'm going to need a reproducable test case.... It is 5 web servers with php as nfsv3 clients. > Seeing this is a brand new server, have you run and soak or stress > test on the raw storage to confirm it is error free? I have run bonnie++ and memtest86+ with no errors. I am now trying to recompile linux without nfsv4, ACL and all experimental features of nfs and xfs. -- Cédric Tabary Ingénieur réseau - Equinoxe Media +33 (0)6 77 45 80 15 From owner-xfs@oss.sgi.com Wed Nov 7 16:38:51 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Nov 2007 16:39:18 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA80cmTx011409 for ; Wed, 7 Nov 2007 16:38:50 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA17067; Thu, 8 Nov 2007 11:38:52 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA80cpdD95857910; Thu, 8 Nov 2007 11:38:51 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA80cmCg97600867; Thu, 8 Nov 2007 11:38:48 +1100 (AEDT) X-Authentication-Warning: 
snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Thu, 8 Nov 2007 11:38:48 +1100 From: David Chinner To: xfs-oss Cc: xfs-dev Subject: [patch] Fix broken inode clustering Message-ID: <20071108003848.GA66820511@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4694/Wed Nov 7 10:55:51 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13586 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs The radix tree based inode caches did away with the inode cluster hashes, replacing them with a bunch of masking and gang lookups on the radix tree. This masking got broken when moving the code to per-ag radix trees and indexing by agino # rather than straight inode number. The result is clustered inode writeback does not cluster and things can go extremely slowly when there are lots of inodes to write. The following patch fixes this up by comparing agino # of the inode found to the index of the cluster we are looking for. 
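The masking problem described above can be reproduced with plain integers. This is a sketch under assumptions, not the XFS code: it assumes an inode number is the AG number shifted left by a fixed number of bits and OR'd with the AG-relative inode number (agino), and the helper names, the 20-bit agino width and the 16-inode cluster used in the usage note are all invented for illustration.

```c
#include <stdint.h>

/* Assumed layout: ino = (agno << agino_bits) | agino. */
uint64_t make_ino(uint32_t agno, uint32_t agino, unsigned int agino_bits)
{
	return ((uint64_t)agno << agino_bits) | agino;
}

/* Stand-in for XFS_INO_TO_AGINO(): keep only the low agino_bits bits. */
uint64_t ino_to_agino(uint64_t ino, unsigned int agino_bits)
{
	return ino & ((1ULL << agino_bits) - 1);
}

/*
 * Check as described for the broken code: the full inode number is masked,
 * so the AG bits survive and never match an agino-based first_index
 * unless the inode lives in AG 0.
 */
int same_cluster_old(uint64_t found_ino, uint64_t mask, uint64_t first_index)
{
	return (found_ino & mask) == first_index;
}

/* Check as described for the fix: reduce to agino before masking. */
int same_cluster_new(uint64_t found_ino, uint64_t mask, uint64_t first_index,
		     unsigned int agino_bits)
{
	return (ino_to_agino(found_ino, agino_bits) & mask) == first_index;
}
```

With mask = ~(uint64_t)(16 - 1) and first_index = 0x1230, an inode built as make_ino(1, 0x1237, 20) fails the old check and passes the new one, while the same agino in AG 0 passes both. That matches the symptom in the description: clustering silently stops working for inodes outside the first allocation group.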
Signed-off-by: Dave Chinner Tested-by: Torsten Kaiser --- fs/xfs/xfs_iget.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: 2.6.x-xfs-new/fs/xfs/xfs_iget.c =================================================================== --- 2.6.x-xfs-new.orig/fs/xfs/xfs_iget.c 2007-11-02 13:44:46.000000000 +1100 +++ 2.6.x-xfs-new/fs/xfs/xfs_iget.c 2007-11-07 13:08:42.534440675 +1100 @@ -248,7 +248,7 @@ finish_inode: icl = NULL; if (radix_tree_gang_lookup(&pag->pag_ici_root, (void**)&iq, first_index, 1)) { - if ((iq->i_ino & mask) == first_index) + if ((XFS_INO_TO_AGINO(mp, iq->i_ino) & mask) == first_index) icl = iq->i_cluster; }
From owner-xfs@oss.sgi.com Wed Nov 7 19:12:55 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Nov 2007 19:12:58 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA83Co1V026200 for ; Wed, 7 Nov 2007 19:12:53 -0800 Received: from timothy-shimmins-power-mac-g5.local (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA20497; Thu, 8 Nov 2007 14:12:49 +1100 Message-ID: <47327ED2.8060402@sgi.com> Date: Thu, 08 Nov 2007 14:13:22 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Roger Willcocks CC: xfs@oss.sgi.com Subject: Re: bug: truncate to zero + setuid
References: <47249E7A.7060709@filmlight.ltd.uk> <47252F62.6030503@sgi.com> <47262CD0.5010708@filmlight.ltd.uk> <4726ADAE.9070206@sgi.com> <472769A1.5090605@filmlight.ltd.uk> <472A7940.5070800@sgi.com> <000001c81f3e$eff344b0$6501a8c0@BODDINGTON> In-Reply-To: <000001c81f3e$eff344b0$6501a8c0@BODDINGTON> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4699/Wed Nov 7 18:08:23 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13588 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Hi Roger, Roger Willcocks wrote: > Timothy Shimmin wrote: >> Hi Roger, >> > ... >> I don't like all these inconsistencies. > > Take a look at the attached patch relative to the current cvs (it's a > bit big to put > inline). The basic problem is it's currently unclear when to set the > times from > va_atime etc. and when to set them to the current time. So I've used the > already > defined XFS_AT_UPDxTIME flags to indicate that a time should be set to > 'now' > and XFS_AT_xTIME to mean set it using va_xtime. This seems to fit well with > the current code and I wonder if that's how it was meant to work in the > first > place. Yeah, I've looked at this a few times now ;-) and this _seems_ like a reasonable thing to do to me. So patch: ATTR_ATIME_SET => XFS_AT_ATIME (& set va_atime etc) (used to set to given time) ATTR_ATIME => XFS_AT_UPDATIME (used to set to "now") likewise for M variant. Previously: ATTR_ATIME_SET => ATTR_UTIME flag (used to set given time) must expect ATTR_ATIME to be set too to get va_atime ATTR_ATIME => XFS_AT_ATIME (& set va_atime) (used to set to "now") a bit confusing since it can store va_atime even if ATTR_ATIME_SET is not on > I've also removed the now redundant ATTR_UTIME flag and pulled > the null truncate to the top, which simplifies things. 
> So these changes of: if (mask & (XFS_AT_ATIME|XFS_AT_MTIME)) { if (!file_owner) { - if ((flags & ATTR_UTIME) && - !capable(CAP_FOWNER)) { + if (!capable(CAP_FOWNER)) { Where you take out ATTR_UTIME make sense since XFS_AT_ATIME et al, now refer to the case where a given time is provided instead of requiring ATTR_UTIME to be set. > One query: in both xfs_iops.c/xfs_vn_setattr and > xfs_dm.c/xfs_dm_set_fileattr the > ATIME branch sets the inode's atime directly. xfs_vn_setattr() if (ia_valid & ATTR_ATIME) { vattr.va_mask |= XFS_AT_ATIME; vattr.va_atime = attr->ia_atime; inode->i_atime = attr->ia_atime; } xfs_dm_set_fileattr() if (mask & DM_AT_ATIME) { vat.va_mask |= XFS_AT_ATIME; vat.va_atime.tv_sec = stat.fa_atime; vat.va_atime.tv_nsec = 0; inode->i_atime.tv_sec = stat.fa_atime; } Hmmm.... So this could change behavior for xfs_vn_setattr(). If previously we had ATTR_ATIME set but NOT ATTR_ATIME_SET, then we would set inode->i_atime. Now with the patch, in this case, we don't set inode->i_atime at this point. However, in this case we wouldn't want i_atime to be set to ia_atime as we would want it to be set to "now" in xfs_ichgtime(). > This is probably something > to do with > the comment above xfs_iops.c/xfs_ichgtime ('to make sure the access time > update > will take') but it could probably be handled better. > I'll need to look. >> BTW, your locking looks wrong - it appears you don't unlock when the >> file is non-zero size. > > Oops... > I was also thinking of a read lock here. And initializing quot vars to zero in variable definition at top. This stuff really needs to be QA'ed well. It would be too easy to get a regression in expected behavior. Need to hunt out qa tests. Thanks for the effort, Tim. 
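The flag mapping Tim summarises above can be sketched roughly as follows. This is an illustration only: the DEMO_ values are stand-ins rather than the real kernel or XFS definitions, demo_time_mask is a hypothetical helper and not a function from the patch, and it assumes the _SET flag takes precedence when both flags are present.

```c
#include <assert.h>

/* Stand-in values for the VFS-side flags (not the real kernel values). */
enum {
    DEMO_ATTR_ATIME     = 1 << 0,  /* update atime */
    DEMO_ATTR_MTIME     = 1 << 1,  /* update mtime */
    DEMO_ATTR_ATIME_SET = 1 << 2,  /* caller supplied an explicit atime */
    DEMO_ATTR_MTIME_SET = 1 << 3,  /* caller supplied an explicit mtime */
};

/* Stand-in values for the XFS-side mask bits (not the real XFS values). */
enum {
    DEMO_XFS_AT_ATIME    = 1 << 0,  /* set atime from va_atime */
    DEMO_XFS_AT_MTIME    = 1 << 1,  /* set mtime from va_mtime */
    DEMO_XFS_AT_UPDATIME = 1 << 2,  /* set atime to "now" */
    DEMO_XFS_AT_UPDMTIME = 1 << 3,  /* set mtime to "now" */
};

/* Translate the VFS time flags into the XFS mask as described for the patch. */
static unsigned int demo_time_mask(unsigned int ia_valid)
{
    unsigned int mask = 0;

    if (ia_valid & DEMO_ATTR_ATIME_SET)
        mask |= DEMO_XFS_AT_ATIME;      /* given time: caller also fills va_atime */
    else if (ia_valid & DEMO_ATTR_ATIME)
        mask |= DEMO_XFS_AT_UPDATIME;   /* no time given: use the current time */

    if (ia_valid & DEMO_ATTR_MTIME_SET)
        mask |= DEMO_XFS_AT_MTIME;
    else if (ia_valid & DEMO_ATTR_MTIME)
        mask |= DEMO_XFS_AT_UPDMTIME;

    /* A timestamp is never requested as both "given" and "now". */
    assert(!((mask & DEMO_XFS_AT_ATIME) && (mask & DEMO_XFS_AT_UPDATIME)));
    assert(!((mask & DEMO_XFS_AT_MTIME) && (mask & DEMO_XFS_AT_UPDMTIME)));
    return mask;
}
```

Under this scheme a permission check on the "given time" bits no longer needs a separate ATTR_UTIME-style flag, which is the simplification discussed in the thread.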
From owner-xfs@oss.sgi.com Wed Nov 7 22:45:42 2007 Received: with ECARTIS (v1.0.0; list xfs); Wed, 07 Nov 2007 22:45:45 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA86jaUx015322 for ; Wed, 7 Nov 2007 22:45:40 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id RAA24077; Thu, 8 Nov 2007 17:45:34 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA86jXdD97868567; Thu, 8 Nov 2007 17:45:33 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA86jUrZ97942740; Thu, 8 Nov 2007 17:45:30 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Thu, 8 Nov 2007 17:45:30 +1100 From: David Chinner To: "Robert P. J. Day" Cc: xfs@oss.sgi.com, dgc@sgi.com Subject: Re: [PATCH] XFS: Use kernel-supplied "roundup_pow_of_two" for simplicity. Message-ID: <20071108064530.GE66820511@sgi.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4703/Wed Nov 7 20:19:56 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13589 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs On Wed, Nov 07, 2007 at 02:34:36AM -0500, Robert P. J. Day wrote: > > Signed-off-by: Robert P. J. Day > > --- > > compile-tested on i386. Thanks. I'll QA it and queue it up for .25. 
Cheers, Dave. -- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 8 03:27:12 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 03:27:16 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from waldorf.loreland.org (uk.loreland.org [89.16.172.112]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA8BR7Ub026046 for ; Thu, 8 Nov 2007 03:27:12 -0800 Received: by waldorf.loreland.org (Postfix, from userid 33) id B00BC200AA; Thu, 8 Nov 2007 11:27:09 +0000 (GMT) To: xfs@oss.sgi.com Subject: =?UTF-8?Q?xfs=5Frepair=20=32=2E=39=2E=34=20threading/progress=20info=20mi?= =?UTF-8?Q?ssing=3F?= MIME-Version: 1.0 Date: Thu, 8 Nov 2007 11:27:09 +0000 From: James Braid Message-ID: X-Sender: jamesb@loreland.org User-Agent: RoundCube Webmail/0.1-rc1 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV 0.91.2/4708/Wed Nov 7 22:07:54 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13590 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jamesb@loreland.org Precedence: bulk X-list: xfs I just upgraded to xfsprogs 2.9.4 and it seems the multi-threading and progress information is no longer being reported? 2.8.18: - creating 8 worker thread(s) Phase 1 - find and verify superblock... - reporting progress in intervals of 15 minutes Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... 2.9.4: Phase 1 - find and verify superblock... Phase 2 - using internal log - zero log... - scan filesystem freespace and inode maps... - found root inode chunk Have I just mis-compiled something? The progress information in particular was REALLY useful on our bigger filesystems. 
From owner-xfs@oss.sgi.com Thu Nov 8 05:13:50 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 05:13:53 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00, SUBJECT_FUZZY_TION autolearn=no version=3.3.0-r574664 Received: from r2d2.neofacto.lu (mail.neofacto.lu [158.64.60.195]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA8DDlAj009037 for ; Thu, 8 Nov 2007 05:13:50 -0800 Received: from localhost (localhost [127.0.0.1]) by r2d2.neofacto.lu (Postfix) with ESMTP id 8BE67C2F95; Thu, 8 Nov 2007 14:13:51 +0100 (CET) X-Virus-Scanned: ClamAV 0.91.2/4708/Wed Nov 7 22:07:54 2007 on oss.sgi.com X-Virus-Scanned: Ubuntu amavisd-new at neofacto.lu Received: from r2d2.neofacto.lu ([127.0.0.1]) by localhost (r2d2.neofacto.lu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8WEtiOV3IGPu; Thu, 8 Nov 2007 14:13:37 +0100 (CET) Received: by r2d2.neofacto.lu (Postfix, from userid 65534) id B8116C2F99; Thu, 8 Nov 2007 14:13:37 +0100 (CET) Received: from [192.168.1.166] (SU105.tudor.lu [158.64.4.205]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by r2d2.neofacto.lu (Postfix) with ESMTP id 10BC1C2F95; Thu, 8 Nov 2007 14:13:28 +0100 (CET) Message-ID: <47330B8E.5010008@jamendo.com> Date: Thu, 08 Nov 2007 14:13:50 +0100 From: Amandine AUPETIT User-Agent: Thunderbird 2.0.0.6 (X11/20071022) MIME-Version: 1.0 To: Eric Sandeen CC: Joshua Baker-LePain , xfs@oss.sgi.com Subject: Re: 7Tb XFS partition lost on reboot References: <473072FD.4070104@jamendo.com> <4730B98C.5090008@sandeen.net> In-Reply-To: <4730B98C.5090008@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Status: Clean X-archive-position: 13591 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: 
amandine@jamendo.com Precedence: bulk X-list: xfs Hi, Thanks for the advice ! I checked the label with xfs_admin -l : # xfs_admin -l /dev/cciss/c1d0p1 label = "" I tried to blank it in case there is something invisible : # xfs_admin -L -- /dev/cciss/c1d0p1 writing all SBs new label = "" But it seems to be the same. :( Amandine Eric Sandeen wrote: > Joshua Baker-LePain wrote: > >> On Tue, 6 Nov 2007 at 2:58pm, Amandine AUPETIT wrote >> >> >>> So I created the partition with parted, because fdisk can't do more than 2tb >>> partitions. >>> It's ok, I can do what I want but... >>> >>> on reboot, there is a Superblock problem, something like that. When I check >>> with xfs_check : >>> >> First guess -- did you use a gpt disklabel on that device? Standard >> (msdos) disklabels don't work on devices >2TB. The usual symptom of a big >> device with an msdos disklabel is that the partition table goes away on >> reboot. >> > > I second that hunch. :) > > -Eric > From owner-xfs@oss.sgi.com Thu Nov 8 06:41:33 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 06:41:38 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS, SUBJECT_FUZZY_TION autolearn=no version=3.3.0-r574664 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA8EfSDM021423 for ; Thu, 8 Nov 2007 06:41:32 -0800 Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 26C8D18008612; Thu, 8 Nov 2007 08:41:33 -0600 (CST) Message-ID: <4733201C.9060802@sandeen.net> Date: Thu, 08 Nov 2007 08:41:32 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Amandine AUPETIT CC: Joshua Baker-LePain , xfs@oss.sgi.com
Subject: Re: 7Tb XFS partition lost on reboot References: <473072FD.4070104@jamendo.com> <4730B98C.5090008@sandeen.net> <47330B8E.5010008@jamendo.com> In-Reply-To: <47330B8E.5010008@jamendo.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4708/Wed Nov 7 22:07:54 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13592 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Amandine AUPETIT wrote: > Hi, > Thanks for the advice ! > I checked the label with xfs_admin -l : > # xfs_admin -l /dev/cciss/c1d0p1 > label = "" > > I tried to blank it in case there is something invisible : > # xfs_admin -L -- /dev/cciss/c1d0p1 > writing all SBs > new label = "" > > But it seems to be the same. :( No, not the filesystem label, the disklabel, otherwise known as the partition table - dos vs. gpt. This is something you set with parted or fdisk, not xfs_admin. 
-Eric From owner-xfs@oss.sgi.com Thu Nov 8 06:59:36 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 06:59:39 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-0.9 required=5.0 tests=AWL,BAYES_40, SUBJECT_FUZZY_TION autolearn=no version=3.3.0-r574664 Received: from r2d2.neofacto.lu (mail.neofacto.lu [158.64.60.195]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA8ExWfY024308 for ; Thu, 8 Nov 2007 06:59:35 -0800 Received: from localhost (localhost [127.0.0.1]) by r2d2.neofacto.lu (Postfix) with ESMTP id B37FFC2FC7; Thu, 8 Nov 2007 15:59:37 +0100 (CET) X-Virus-Scanned: ClamAV 0.91.2/4708/Wed Nov 7 22:07:54 2007 on oss.sgi.com X-Virus-Scanned: Ubuntu amavisd-new at neofacto.lu Received: from r2d2.neofacto.lu ([127.0.0.1]) by localhost (r2d2.neofacto.lu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id iBCyDkDgGrAb; Thu, 8 Nov 2007 15:59:22 +0100 (CET) Received: by r2d2.neofacto.lu (Postfix, from userid 65534) id 4A3BDC2FC8; Thu, 8 Nov 2007 15:59:22 +0100 (CET) Received: from [192.168.1.166] (SU105.tudor.lu [158.64.4.205]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by r2d2.neofacto.lu (Postfix) with ESMTP id 6BAE9C2FC7; Thu, 8 Nov 2007 15:59:12 +0100 (CET) Message-ID: <47332456.9030805@jamendo.com> Date: Thu, 08 Nov 2007 15:59:34 +0100 From: Amandine AUPETIT User-Agent: Thunderbird 2.0.0.6 (X11/20071022) MIME-Version: 1.0 To: Eric Sandeen CC: Joshua Baker-LePain , xfs@oss.sgi.com Subject: Re: 7Tb XFS partition lost on reboot References: <473072FD.4070104@jamendo.com> <4730B98C.5090008@sandeen.net> <47330B8E.5010008@jamendo.com> <4733201C.9060802@sandeen.net> In-Reply-To: <4733201C.9060802@sandeen.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Status: Clean X-archive-position: 13593 X-ecartis-version: Ecartis v1.0.0 Sender: 
xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: amandine@jamendo.com Precedence: bulk X-list: xfs Ok, sorry for the mix-up. You were right, the problem was about the partition table. Actually, even though it was a gpt partition table, it was corrupted. It seems to be an Ubuntu parted problem. I had to rebuild parted from the sources and when I ran it, it said : # ./parted GNU Parted 1.8.8 Using /dev/cciss/c1d0 Welcome to GNU Parted! Type 'help' to view a list of commands. (parted) print print Warning: /dev/cciss/c1d0 contains GPT signatures, indicating that it has a GPT table. However, it does not have a valid fake msdos partition table, as it should. Perhaps it was corrupted -- possibly by a program that doesn't understand GPT partition tables. Or perhaps you deleted the GPT table, and are now using an msdos partition table. Is this a GPT partition table? Yes/No? yes yes Model: Compaq Smart Array (cpqarray) Disk /dev/cciss/c1d0: 7501GB Sector size (logical/physical): 512B/512B Partition Table: gpt Number Start End Size File system Name Flags 1 17.4kB 7501GB 7501GB xfs primary So I deleted the existing partition and recreated a new one. I found back all my data on this new partition, and on reboot there is no problem anymore. Thanks a lot for your help ! :) Amandine Eric Sandeen wrote: > Amandine AUPETIT wrote: > >> Hi, >> Thanks for the advice ! >> I checked the label with xfs_admin -l : >> # xfs_admin -l /dev/cciss/c1d0p1 >> label = "" >> >> I tried to blank it in case there is something invisible : >> # xfs_admin -L -- /dev/cciss/c1d0p1 >> writing all SBs >> new label = "" >> >> But it seems to be the same. :( >> > > No, not the filesystem label, the disklabel, otherwise known as the > partition table - dos vs. gpt. This is something you set with parted or > fdisk, not xfs_admin.
> > -Eric > From owner-xfs@oss.sgi.com Thu Nov 8 07:14:31 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 07:14:34 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00,SPF_HELO_PASS, SUBJECT_FUZZY_TION autolearn=no version=3.3.0-r574664 Received: from sandeen.net (sandeen.net [209.173.210.139]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA8FET4r026436 for ; Thu, 8 Nov 2007 07:14:30 -0800 Received: from liberator.sandeen.net (liberator.sandeen.net [10.0.0.4]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by sandeen.net (Postfix) with ESMTP id 3890C18008614; Thu, 8 Nov 2007 09:14:35 -0600 (CST) Message-ID: <473327DA.8070909@sandeen.net> Date: Thu, 08 Nov 2007 09:14:34 -0600 From: Eric Sandeen User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Amandine AUPETIT CC: Joshua Baker-LePain , xfs@oss.sgi.com Subject: Re: 7Tb XFS partition lost on reboot References: <473072FD.4070104@jamendo.com> <4730B98C.5090008@sandeen.net> <47330B8E.5010008@jamendo.com> <4733201C.9060802@sandeen.net> <47332456.9030805@jamendo.com> In-Reply-To: <47332456.9030805@jamendo.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4708/Wed Nov 7 22:07:54 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13594 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: sandeen@sandeen.net Precedence: bulk X-list: xfs Amandine AUPETIT wrote: > So I deleted the existing partition and recreated a new one. I found > back all my data on this new partition, and on reboot there is no > problem anymore. 
Wow, I love a happy ending, especially when it involves 7T of data ;-) -Eric From owner-xfs@oss.sgi.com Thu Nov 8 15:29:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 15:29:26 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA8NTIRM001191 for ; Thu, 8 Nov 2007 15:29:21 -0800 Received: from pc-bnaujok.melbourne.sgi.com (pc-bnaujok.melbourne.sgi.com [134.14.55.58]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id KAA16669; Fri, 9 Nov 2007 10:29:18 +1100 Date: Fri, 09 Nov 2007 10:29:28 +1100 To: "James Braid" , xfs@oss.sgi.com Subject: Re: xfs_repair 2.9.4 threading/progress info missing? From: "Barry Naujok" Organization: SGI Content-Type: text/plain; format=flowed; delsp=yes; charset=utf-8 MIME-Version: 1.0 References: Message-ID: In-Reply-To: User-Agent: Opera Mail/9.24 (Win32) X-Virus-Scanned: ClamAV 0.91.2/4715/Thu Nov 8 14:31:53 2007 on oss.sgi.com X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from Quoted-Printable to 8bit by oss.sgi.com id lA8NTMRM001213 X-archive-position: 13595 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: bnaujok@sgi.com Precedence: bulk X-list: xfs On Thu, 08 Nov 2007 22:27:09 +1100, James Braid wrote: > > I just upgraded to xfsprogs 2.9.4 and it seems the multi-threading and > progress information is no longer being reported? > > 2.8.18: > - creating 8 worker thread(s) > Phase 1 - find and verify superblock... > - reporting progress in intervals of 15 minutes > Phase 2 - using internal log > - zero log... > - scan filesystem freespace and inode maps...
> > 2.9.4: > Phase 1 - find and verify superblock... > Phase 2 - using internal log > - zero log... > - scan filesystem freespace and inode maps... > - found root inode chunk > > Have I just mis-compiled something? The progress information in > particular > was REALLY useful on our bigger filesystems. Hi, It is sort of still there. With the performance improvements in 2.9.4, the multithreading implementation was changed and hidden from the user. A side effect of this change was that the progress info is now only visible when using the ag_stride option. Depending on your drive layout, ag_stride may speed up repair even further, especially on concats. With other layouts, try "-o ag_stride=1". I found on one of our RAIDs that it actually doubled the repair performance on a 5-way stripe. Progress info for normal runs will be reintroduced in the near future and other output will be cleaned up a tad. Regards, Barry. From owner-xfs@oss.sgi.com Thu Nov 8 16:34:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 16:34:07 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.6 required=5.0 tests=AWL,BAYES_05,J_CHICKENPOX_62, J_CHICKENPOX_64 autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA90XxAB017738 for ; Thu, 8 Nov 2007 16:34:01 -0800 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id LAA18419; Fri, 9 Nov 2007 11:33:58 +1100 Message-ID: <4733AB27.70208@sgi.com> Date: Fri, 09 Nov 2007 11:34:47 +1100 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.6 (X11/20070728) MIME-Version: 1.0 To: David Chinner CC: xfs-oss , xfs-dev Subject: Re: [PATCH, RFC] Move AIL pushing into a separate thread References:
<20071105050706.GW66820511@sgi.com> In-Reply-To: <20071105050706.GW66820511@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4715/Thu Nov 8 14:31:53 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13596 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs I like the sound of this Dave. I'm still going through the code in detail. Could we convert the ail lock into a mutex to ease the load? I know it may not improve throughput but it would at least relieve the CPUs to do other stuff. David Chinner wrote: > When many hundreds to thousands of threads all try to do simultaneous > transactions and the log is in a tail-pushing situation (i.e. full), > we can get multiple threads walking the AIL list and contending on > the AIL lock. > > Recently we've had two cases of machines basically locking up because > most of the CPUs in the system are trying to obtain the AIL lock. > The first was an 8p machine with ~2,500 kernel threads trying to > do transactions, and the latest is a 2048p altix closing a file per > MPI rank in a synchronised fashion resulting in > 400 processes > all trying to walk and push the AIL at the same time. > > The AIL push is, in effect, a simple I/O dispatch algorithm complicated > by the ordering constraints placed on it by the transaction subsystem. > It really does not need multiple threads to push on it - even when > only a single CPU is pushing the AIL, it can push the I/O out far faster > than pretty much any disk subsystem can handle. > > So, to avoid contention problems stemming from multiple list walkers, > move the list walk off into another thread and simply provide a "target" > to push to.
When a thread requires a push, it sets the target and wakes > the push thread, then goes to sleep waiting for the required amount > of space to become available in the log. > > This mechanism should also be a lot fairer under heavy load as the > waiters will queue in arrival order, rather than queuing in "who completed > a push first" order. > > Also, by moving the pushing to a separate thread we can do more effective > overload detection and prevention as we can keep context from loop iteration > to loop iteration. That is, we can push only part of the list each loop and not > have to loop back to the start of the list every time we run. This should > also help by reducing the number of items we try to lock and/or push items > that we cannot move. > > Note that this patch is not intended to solve the inefficiencies in the > AIL structure and the associated issues with extremely large list contents. > That needs to be addressed separately; parallel access would cause problems > to any new structure as well, so I'm only aiming to isolate the structure > from unbounded parallelism here.
> > Signed-Off-By: Dave Chinner > --- > fs/xfs/linux-2.6/xfs_super.c | 60 +++++++++++ > fs/xfs/xfs_log.c | 12 ++ > fs/xfs/xfs_mount.c | 6 - > fs/xfs/xfs_mount.h | 10 + > fs/xfs/xfs_trans.h | 1 > fs/xfs/xfs_trans_ail.c | 231 ++++++++++++++++++++++++++++--------------- > fs/xfs/xfs_trans_priv.h | 8 + > fs/xfs/xfsidbg.c | 12 +- > 8 files changed, 247 insertions(+), 93 deletions(-) > > Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_super.c 2007-11-05 10:39:05.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_super.c 2007-11-05 14:48:39.871177707 +1100 > @@ -51,6 +51,7 @@ > #include "xfs_vfsops.h" > #include "xfs_version.h" > #include "xfs_log_priv.h" > +#include "xfs_trans_priv.h" > > #include > #include > @@ -765,6 +766,65 @@ xfs_blkdev_issue_flush( > blkdev_issue_flush(buftarg->bt_bdev, NULL); > } > > +/* > + * XFS AIL push thread support > + */ > +void > +xfsaild_wakeup( > + xfs_mount_t *mp, > + xfs_lsn_t threshold_lsn) > +{ > + > + if (XFS_LSN_CMP(threshold_lsn, mp->m_ail.xa_target) > 0) { > + mp->m_ail.xa_target = threshold_lsn; > + wake_up_process(mp->m_ail.xa_task); > + } > +} > + > +int > +xfsaild( > + void *data) > +{ > + xfs_mount_t *mp = (xfs_mount_t *)data; > + xfs_lsn_t last_pushed_lsn = 0; > + long tout = 0; > + > + while (!kthread_should_stop()) { > + if (tout) > + schedule_timeout_interruptible(msecs_to_jiffies(tout)); > + > + /* swsusp */ > + try_to_freeze(); > + > + /* we're either starting or stopping if there is no log */ > + if (!mp->m_log) > + continue; > + > + tout = xfsaild_push(mp, &last_pushed_lsn); > + } > + > + return 0; > +} /* xfsaild */ > + > +void > +xfsaild_start( > + xfs_mount_t *mp) > +{ > + mp->m_ail.xa_target = 0; > + mp->m_ail.xa_task = kthread_run(xfsaild, mp, "xfsaild"); > + ASSERT(!IS_ERR(mp->m_ail.xa_task)); > + /* XXX: should return error but nowhere to do it */ > +} > + > +void > +xfsaild_stop( > + 
xfs_mount_t *mp) > +{ > + kthread_stop(mp->m_ail.xa_task); > +} > + > + > + > STATIC struct inode * > xfs_fs_alloc_inode( > struct super_block *sb) > Index: 2.6.x-xfs-new/fs/xfs/xfs_log.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_log.c 2007-11-02 18:00:19.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_log.c 2007-11-05 14:07:16.850189316 +1100 > @@ -515,6 +515,12 @@ xfs_log_mount(xfs_mount_t *mp, > mp->m_log = xlog_alloc_log(mp, log_target, blk_offset, num_bblks); > > /* > + * Initialize the AIL now we have a log. > + */ > + spin_lock_init(&mp->m_ail_lock); > + xfs_trans_ail_init(mp); > + > + /* > * skip log recovery on a norecovery mount. pretend it all > * just worked. > */ > @@ -530,7 +536,7 @@ xfs_log_mount(xfs_mount_t *mp, > mp->m_flags |= XFS_MOUNT_RDONLY; > if (error) { > cmn_err(CE_WARN, "XFS: log mount/recovery failed: error %d", error); > - xlog_dealloc_log(mp->m_log); > + xfs_log_unmount_dealloc(mp); > return error; > } > } > @@ -722,10 +728,14 @@ xfs_log_unmount_write(xfs_mount_t *mp) > > /* > * Deallocate log structures for unmount/relocation. > + * > + * We need to stop the aild from running before we destroy > + * and deallocate the log as the aild references the log. > */ > void > xfs_log_unmount_dealloc(xfs_mount_t *mp) > { > + xfs_trans_ail_destroy(mp); > xlog_dealloc_log(mp->m_log); > } > > Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.c 2007-11-02 13:44:50.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.c 2007-11-05 14:12:22.554601173 +1100 > @@ -137,15 +137,9 @@ xfs_mount_init(void) > mp->m_flags |= XFS_MOUNT_NO_PERCPU_SB; > } > > - spin_lock_init(&mp->m_ail_lock); > spin_lock_init(&mp->m_sb_lock); > mutex_init(&mp->m_ilock); > mutex_init(&mp->m_growlock); > - /* > - * Initialize the AIL. 
> - */ > - xfs_trans_ail_init(mp); > - > atomic_set(&mp->m_active_trans, 0); > > return mp; > Index: 2.6.x-xfs-new/fs/xfs/xfs_mount.h > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_mount.h 2007-10-16 08:52:58.000000000 +1000 > +++ 2.6.x-xfs-new/fs/xfs/xfs_mount.h 2007-11-05 14:14:42.652456849 +1100 > @@ -219,12 +219,18 @@ extern void xfs_icsb_sync_counters_flags > #define xfs_icsb_sync_counters_flags(mp, flags) do { } while (0) > #endif > > +typedef struct xfs_ail { > + xfs_ail_entry_t xa_ail; > + uint xa_gen; > + struct task_struct *xa_task; > + xfs_lsn_t xa_target; > +} xfs_ail_t; > + > typedef struct xfs_mount { > struct super_block *m_super; > xfs_tid_t m_tid; /* next unused tid for fs */ > spinlock_t m_ail_lock; /* fs AIL mutex */ > - xfs_ail_entry_t m_ail; /* fs active log item list */ > - uint m_ail_gen; /* fs AIL generation count */ > + xfs_ail_t m_ail; /* fs active log item list */ > xfs_sb_t m_sb; /* copy of fs superblock */ > spinlock_t m_sb_lock; /* sb counter lock */ > struct xfs_buf *m_sb_bp; /* buffer for superblock */ > Index: 2.6.x-xfs-new/fs/xfs/xfs_trans.h > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_trans.h 2007-11-02 13:44:46.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_trans.h 2007-11-05 14:01:13.205272667 +1100 > @@ -993,6 +993,7 @@ int _xfs_trans_commit(xfs_trans_t *, > #define xfs_trans_commit(tp, flags) _xfs_trans_commit(tp, flags, NULL) > void xfs_trans_cancel(xfs_trans_t *, int); > void xfs_trans_ail_init(struct xfs_mount *); > +void xfs_trans_ail_destroy(struct xfs_mount *); > xfs_lsn_t xfs_trans_push_ail(struct xfs_mount *, xfs_lsn_t); > xfs_lsn_t xfs_trans_tail_ail(struct xfs_mount *); > void xfs_trans_unlocked_item(struct xfs_mount *, > Index: 2.6.x-xfs-new/fs/xfs/xfs_trans_ail.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_trans_ail.c 2007-10-02 
16:01:48.000000000 +1000 > +++ 2.6.x-xfs-new/fs/xfs/xfs_trans_ail.c 2007-11-05 14:46:44.206327966 +1100 > @@ -57,7 +57,7 @@ xfs_trans_tail_ail( > xfs_log_item_t *lip; > > spin_lock(&mp->m_ail_lock); > - lip = xfs_ail_min(&(mp->m_ail)); > + lip = xfs_ail_min(&(mp->m_ail.xa_ail)); > if (lip == NULL) { > lsn = (xfs_lsn_t)0; > } else { > @@ -71,25 +71,22 @@ xfs_trans_tail_ail( > /* > * xfs_trans_push_ail > * > - * This routine is called to move the tail of the AIL > - * forward. It does this by trying to flush items in the AIL > - * whose lsns are below the given threshold_lsn. > + * This routine is called to move the tail of the AIL forward. It does this by > + * trying to flush items in the AIL whose lsns are below the given > + * threshold_lsn. > * > - * The routine returns the lsn of the tail of the log. > + * the push is run asynchronously in a separate thread, so we return the tail > + * of the log right now instead of the tail after the push. This means we will > + * either continue right away, or we will sleep waiting on the async thread to > + * do it's work. > */ > xfs_lsn_t > xfs_trans_push_ail( > xfs_mount_t *mp, > xfs_lsn_t threshold_lsn) > { > - xfs_lsn_t lsn; > xfs_log_item_t *lip; > int gen; > - int restarts; > - int lock_result; > - int flush_log; > - > -#define XFS_TRANS_PUSH_AIL_RESTARTS 1000 > > spin_lock(&mp->m_ail_lock); > lip = xfs_trans_first_ail(mp, &gen); > @@ -100,57 +97,105 @@ xfs_trans_push_ail( > spin_unlock(&mp->m_ail_lock); > return (xfs_lsn_t)0; > } > + if (XFS_LSN_CMP(threshold_lsn, mp->m_ail.xa_target) > 0) > + xfsaild_wakeup(mp, threshold_lsn); > + spin_unlock(&mp->m_ail_lock); > + return (xfs_lsn_t)lip->li_lsn; > +} > + > +/* > + * Return the item in the AIL with the current lsn. > + * Return the current tree generation number for use > + * in calls to xfs_trans_next_ail(). 
> + */ > +STATIC xfs_log_item_t * > +xfs_trans_first_push_ail( > + xfs_mount_t *mp, > + int *gen, > + xfs_lsn_t lsn) > +{ > + xfs_log_item_t *lip; > + > + lip = xfs_ail_min(&(mp->m_ail.xa_ail)); > + *gen = (int)mp->m_ail.xa_gen; > + while (lip && (XFS_LSN_CMP(lip->li_lsn, lsn) < 0)) > + lip = lip->li_ail.ail_forw; > + > + return (lip); > +} > + > +/* > + * Function that does the work of pushing on the AIL > + */ > +long > +xfsaild_push( > + xfs_mount_t *mp, > + xfs_lsn_t *last_lsn) > +{ > + long tout = 100; /* milliseconds */ > + xfs_lsn_t last_pushed_lsn = *last_lsn; > + xfs_lsn_t target = mp->m_ail.xa_target; > + xfs_lsn_t lsn; > + xfs_log_item_t *lip; > + int lock_result; > + int gen; > + int restarts; > + int flush_log, count, stuck; > + > +#define XFS_TRANS_PUSH_AIL_RESTARTS 10 > + > + spin_lock(&mp->m_ail_lock); > + lip = xfs_trans_first_push_ail(mp, &gen, *last_lsn); > + if (lip == NULL || XFS_FORCED_SHUTDOWN(mp)) { > + /* > + * AIL is empty or our push has reached the end. > + */ > + spin_unlock(&mp->m_ail_lock); > + last_pushed_lsn = 0; > + goto out; > + } > > XFS_STATS_INC(xs_push_ail); > > /* > * While the item we are looking at is below the given threshold > - * try to flush it out. Make sure to limit the number of times > - * we allow xfs_trans_next_ail() to restart scanning from the > - * beginning of the list. We'd like not to stop until we've at least > + * try to flush it out. We'd like not to stop until we've at least > * tried to push on everything in the AIL with an LSN less than > - * the given threshold. However, we may give up before that if > - * we realize that we've been holding the AIL lock for 'too long', > - * blocking interrupts. Currently, too long is < 500us roughly. > + * the given threshold. > + * > + * However, we will stop after a certain number of pushes and wait > + * for a reduced timeout to fire before pushing further. 
This > + * prevents use from spinning when we can't do anything or there is > + * lots of contention on the AIL lists. > */ > - flush_log = 0; > - restarts = 0; > - while (((restarts < XFS_TRANS_PUSH_AIL_RESTARTS) && > - (XFS_LSN_CMP(lip->li_lsn, threshold_lsn) < 0))) { > + tout = 10; > + lsn = lip->li_lsn; > + flush_log = stuck = count = 0; > + while ((XFS_LSN_CMP(lip->li_lsn, target) < 0)) { > /* > - * If we can lock the item without sleeping, unlock > - * the AIL lock and flush the item. Then re-grab the > - * AIL lock so we can look for the next item on the > - * AIL. Since we unlock the AIL while we flush the > - * item, the next routine may start over again at the > - * the beginning of the list if anything has changed. > - * That is what the generation count is for. > + * If we can lock the item without sleeping, unlock the AIL > + * lock and flush the item. Then re-grab the AIL lock so we > + * can look for the next item on the AIL. List changes are > + * handled by the AIL lookup functions internally > * > - * If we can't lock the item, either its holder will flush > - * it or it is already being flushed or it is being relogged. > - * In any of these case it is being taken care of and we > - * can just skip to the next item in the list. > + * If we can't lock the item, either its holder will flush it > + * or it is already being flushed or it is being relogged. In > + * any of these case it is being taken care of and we can just > + * skip to the next item in the list. 
> */ > lock_result = IOP_TRYLOCK(lip); > + spin_unlock(&mp->m_ail_lock); > switch (lock_result) { > case XFS_ITEM_SUCCESS: > - spin_unlock(&mp->m_ail_lock); > XFS_STATS_INC(xs_push_ail_success); > IOP_PUSH(lip); > - spin_lock(&mp->m_ail_lock); > + last_pushed_lsn = lsn; > break; > > case XFS_ITEM_PUSHBUF: > - spin_unlock(&mp->m_ail_lock); > XFS_STATS_INC(xs_push_ail_pushbuf); > -#ifdef XFSRACEDEBUG > - delay_for_intr(); > - delay(300); > -#endif > - ASSERT(lip->li_ops->iop_pushbuf); > - ASSERT(lip); > IOP_PUSHBUF(lip); > - spin_lock(&mp->m_ail_lock); > + last_pushed_lsn = lsn; > break; > > case XFS_ITEM_PINNED: > @@ -160,10 +205,14 @@ xfs_trans_push_ail( > > case XFS_ITEM_LOCKED: > XFS_STATS_INC(xs_push_ail_locked); > + last_pushed_lsn = lsn; > + stuck++; > break; > > case XFS_ITEM_FLUSHING: > XFS_STATS_INC(xs_push_ail_flushing); > + last_pushed_lsn = lsn; > + stuck++; > break; > > default: > @@ -171,19 +220,26 @@ xfs_trans_push_ail( > break; > } > > + spin_lock(&mp->m_ail_lock); > + count++; > + /* Too many items we can't do anything with? */ > + if (stuck > 100) > + break; > + /* we're either starting or stopping if there is no log */ > + if (!mp->m_log) > + break; > + /* should we bother continuing? */ > + if (XFS_FORCED_SHUTDOWN(mp)) > + break; > + /* get the next item */ > lip = xfs_trans_next_ail(mp, lip, &gen, &restarts); > - if (lip == NULL) { > + if (lip == NULL) > break; > - } > - if (XFS_FORCED_SHUTDOWN(mp)) { > - /* > - * Just return if we shut down during the last try. > - */ > - spin_unlock(&mp->m_ail_lock); > - return (xfs_lsn_t)0; > - } > - > + if (restarts > XFS_TRANS_PUSH_AIL_RESTARTS) > + break; > + lsn = lip->li_lsn; > } > + spin_unlock(&mp->m_ail_lock); > > if (flush_log) { > /* > @@ -191,22 +247,33 @@ xfs_trans_push_ail( > * push out the log so it will become unpinned and > * move forward in the AIL. 
> */ > - spin_unlock(&mp->m_ail_lock); > XFS_STATS_INC(xs_push_ail_flush); > xfs_log_force(mp, (xfs_lsn_t)0, XFS_LOG_FORCE); > - spin_lock(&mp->m_ail_lock); > } > > - lip = xfs_ail_min(&(mp->m_ail)); > - if (lip == NULL) { > - lsn = (xfs_lsn_t)0; > - } else { > - lsn = lip->li_lsn; > + /* > + * We reached the target so wait a bit longer for I/O to complete and > + * remove pushed items from the AIL before we start the next scan from > + * the start of the AIL. > + */ > + if ((XFS_LSN_CMP(lsn, target) >= 0)) { > + tout += 20; > + last_pushed_lsn = 0; > + } else if ((restarts > XFS_TRANS_PUSH_AIL_RESTARTS) || > + (count && (count < (stuck + 10)))) { > + /* > + * Either there is a lot of contention on the AIL or we > + * found a lot of items we couldn't do anything with. > + * Backoff a bit more to allow some I/O to complete before > + * continuing from where we were. > + */ > + tout += 10; > } > > - spin_unlock(&mp->m_ail_lock); > - return lsn; > -} /* xfs_trans_push_ail */ > +out: > + *last_lsn = last_pushed_lsn; > + return tout; > +} /* xfsaild_push */ > > > /* > @@ -247,7 +314,7 @@ xfs_trans_unlocked_item( > * the call to xfs_log_move_tail() doesn't do anything if there's > * not enough free space to wake people up so we're safe calling it. 
> */ > - min_lip = xfs_ail_min(&mp->m_ail); > + min_lip = xfs_ail_min(&mp->m_ail.xa_ail); > > if (min_lip == lip) > xfs_log_move_tail(mp, 1); > @@ -279,7 +346,7 @@ xfs_trans_update_ail( > xfs_log_item_t *dlip=NULL; > xfs_log_item_t *mlip; /* ptr to minimum lip */ > > - ailp = &(mp->m_ail); > + ailp = &(mp->m_ail.xa_ail); > mlip = xfs_ail_min(ailp); > > if (lip->li_flags & XFS_LI_IN_AIL) { > @@ -292,10 +359,10 @@ xfs_trans_update_ail( > lip->li_lsn = lsn; > > xfs_ail_insert(ailp, lip); > - mp->m_ail_gen++; > + mp->m_ail.xa_gen++; > > if (mlip == dlip) { > - mlip = xfs_ail_min(&(mp->m_ail)); > + mlip = xfs_ail_min(&(mp->m_ail.xa_ail)); > spin_unlock(&mp->m_ail_lock); > xfs_log_move_tail(mp, mlip->li_lsn); > } else { > @@ -330,7 +397,7 @@ xfs_trans_delete_ail( > xfs_log_item_t *mlip; > > if (lip->li_flags & XFS_LI_IN_AIL) { > - ailp = &(mp->m_ail); > + ailp = &(mp->m_ail.xa_ail); > mlip = xfs_ail_min(ailp); > dlip = xfs_ail_delete(ailp, lip); > ASSERT(dlip == lip); > @@ -338,10 +405,10 @@ xfs_trans_delete_ail( > > lip->li_flags &= ~XFS_LI_IN_AIL; > lip->li_lsn = 0; > - mp->m_ail_gen++; > + mp->m_ail.xa_gen++; > > if (mlip == dlip) { > - mlip = xfs_ail_min(&(mp->m_ail)); > + mlip = xfs_ail_min(&(mp->m_ail.xa_ail)); > spin_unlock(&mp->m_ail_lock); > xfs_log_move_tail(mp, (mlip ? 
mlip->li_lsn : 0)); > } else { > @@ -379,10 +446,10 @@ xfs_trans_first_ail( > { > xfs_log_item_t *lip; > > - lip = xfs_ail_min(&(mp->m_ail)); > - *gen = (int)mp->m_ail_gen; > + lip = xfs_ail_min(&(mp->m_ail.xa_ail)); > + *gen = (int)mp->m_ail.xa_gen; > > - return (lip); > + return lip; > } > > /* > @@ -402,11 +469,11 @@ xfs_trans_next_ail( > xfs_log_item_t *nlip; > > ASSERT(mp && lip && gen); > - if (mp->m_ail_gen == *gen) { > - nlip = xfs_ail_next(&(mp->m_ail), lip); > + if (mp->m_ail.xa_gen == *gen) { > + nlip = xfs_ail_next(&(mp->m_ail.xa_ail), lip); > } else { > - nlip = xfs_ail_min(&(mp->m_ail)); > - *gen = (int)mp->m_ail_gen; > + nlip = xfs_ail_min(&(mp->m_ail).xa_ail); > + *gen = (int)mp->m_ail.xa_gen; > if (restarts != NULL) { > XFS_STATS_INC(xs_push_ail_restarts); > (*restarts)++; > @@ -435,8 +502,16 @@ void > xfs_trans_ail_init( > xfs_mount_t *mp) > { > - mp->m_ail.ail_forw = (xfs_log_item_t*)&(mp->m_ail); > - mp->m_ail.ail_back = (xfs_log_item_t*)&(mp->m_ail); > + mp->m_ail.xa_ail.ail_forw = (xfs_log_item_t*)&mp->m_ail.xa_ail; > + mp->m_ail.xa_ail.ail_back = (xfs_log_item_t*)&mp->m_ail.xa_ail; > + xfsaild_start(mp); > +} > + > +void > +xfs_trans_ail_destroy( > + xfs_mount_t *mp) > +{ > + xfsaild_stop(mp); > } > > /* > Index: 2.6.x-xfs-new/fs/xfs/xfs_trans_priv.h > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_trans_priv.h 2007-10-02 16:01:48.000000000 +1000 > +++ 2.6.x-xfs-new/fs/xfs/xfs_trans_priv.h 2007-11-05 14:02:18.784782356 +1100 > @@ -57,4 +57,12 @@ struct xfs_log_item *xfs_trans_next_ail( > struct xfs_log_item *, int *, int *); > > > +/* > + * AIL push thread support > + */ > +long xfsaild_push(struct xfs_mount *, xfs_lsn_t *); > +void xfsaild_wakeup(struct xfs_mount *, xfs_lsn_t); > +void xfsaild_start(struct xfs_mount *); > +void xfsaild_stop(struct xfs_mount *); > + > #endif /* __XFS_TRANS_PRIV_H__ */ > Index: 2.6.x-xfs-new/fs/xfs/xfsidbg.c > 
=================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfsidbg.c 2007-11-02 13:44:50.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfsidbg.c 2007-11-05 14:50:43.099049624 +1100 > @@ -6220,13 +6220,13 @@ xfsidbg_xaildump(xfs_mount_t *mp) > }; > int count; > > - if ((mp->m_ail.ail_forw == NULL) || > - (mp->m_ail.ail_forw == (xfs_log_item_t *)&mp->m_ail)) { > + if ((mp->m_ail.xa_ail.ail_forw == NULL) || > + (mp->m_ail.xa_ail.ail_forw == (xfs_log_item_t *)&mp->m_ail.xa_ail)) { > kdb_printf("AIL is empty\n"); > return; > } > kdb_printf("AIL for mp 0x%p, oldest first\n", mp); > - lip = (xfs_log_item_t*)mp->m_ail.ail_forw; > + lip = (xfs_log_item_t*)mp->m_ail.xa_ail.ail_forw; > for (count = 0; lip; count++) { > kdb_printf("[%d] type %s ", count, xfsidbg_item_type_str(lip)); > printflags((uint)(lip->li_flags), li_flags, "flags:"); > @@ -6255,7 +6255,7 @@ xfsidbg_xaildump(xfs_mount_t *mp) > break; > } > > - if (lip->li_ail.ail_forw == (xfs_log_item_t*)&mp->m_ail) { > + if (lip->li_ail.ail_forw == (xfs_log_item_t*)&mp->m_ail.xa_ail) { > lip = NULL; > } else { > lip = lip->li_ail.ail_forw; > @@ -6312,9 +6312,9 @@ xfsidbg_xmount(xfs_mount_t *mp) > > kdb_printf("xfs_mount at 0x%p\n", mp); > kdb_printf("tid 0x%x ail_lock 0x%p &ail 0x%p\n", > - mp->m_tid, &mp->m_ail_lock, &mp->m_ail); > + mp->m_tid, &mp->m_ail_lock, &mp->m_ail.xa_ail); > kdb_printf("ail_gen 0x%x &sb 0x%p\n", > - mp->m_ail_gen, &mp->m_sb); > + mp->m_ail.xa_gen, &mp->m_sb); > kdb_printf("sb_lock 0x%p sb_bp 0x%p dev 0x%x logdev 0x%x rtdev 0x%x\n", > &mp->m_sb_lock, mp->m_sb_bp, > mp->m_ddev_targp ? 
mp->m_ddev_targp->bt_dev : 0, > From owner-xfs@oss.sgi.com Thu Nov 8 17:02:30 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 17:02:34 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.4 required=5.0 tests=AWL,BAYES_00,J_CHICKENPOX_66, T_STOX_BOUND_090909_B autolearn=no version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA912QOj021135 for ; Thu, 8 Nov 2007 17:02:29 -0800 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA18984; Fri, 9 Nov 2007 12:02:18 +1100 Message-ID: <4733B1CA.9030109@sgi.com> Date: Fri, 09 Nov 2007 12:03:06 +1100 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.6 (X11/20070728) MIME-Version: 1.0 To: Christoph Hellwig CC: xfs-dev , xfs-oss Subject: Re: [PATCH] Turn off XBF_READ_AHEAD in io completion References: <47296FF7.8080607@sgi.com> <20071101100012.GA20065@infradead.org> In-Reply-To: <20071101100012.GA20065@infradead.org> Content-Type: multipart/mixed; boundary="------------020301010005050909020108" X-Virus-Scanned: ClamAV 0.91.2/4717/Thu Nov 8 16:05:49 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13597 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs This is a multi-part message in MIME format. --------------020301010005050909020108 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Christoph Hellwig wrote: > On Thu, Nov 01, 2007 at 05:19:35PM +1100, Lachlan McIlroy wrote: >> Read-ahead of an inode cluster will set XBF_READ_AHEAD in the buffer. 
>> If we don't remove the flag it will still be set when we flush the >> buffer back to disk. Not sure if leaving this flag set causes any >> serious problems but it does trigger an assert. > > It might be better if such temporary flags never actually make it to > bp->b_flags. Just pass down a flags variable all the way to > _xfs_buf_ioapply and keep the flags just for this I/O separate from > those that are permanent and in bp->b_flags. > Okay, I've done that (new patch attached). It's certainly not as clean as the last patch. --------------020301010005050909020108 Content-Type: text/x-patch; name="readahead.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="readahead.diff" --- fs/xfs/linux-2.6/xfs_buf.c_1.247 2007-10-29 16:01:29.000000000 +1100 +++ fs/xfs/linux-2.6/xfs_buf.c 2007-11-02 13:39:48.000000000 +1100 @@ -1020,7 +1020,7 @@ xfs_buf_iodone_work( (bp->b_flags & (XBF_ORDERED|XBF_ASYNC)) == (XBF_ORDERED|XBF_ASYNC)) { XB_TRACE(bp, "ordered_retry", bp->b_iodone); bp->b_flags &= ~XBF_ORDERED; - xfs_buf_iorequest(bp); + xfs_buf_iorequest(bp, bp->b_flags); } else if (bp->b_iodone) (*(bp->b_iodone))(bp); else if (bp->b_flags & XBF_ASYNC) @@ -1082,9 +1082,9 @@ xfs_buf_iostart( } bp->b_flags &= ~(XBF_READ | XBF_WRITE | XBF_ASYNC | XBF_DELWRI | \ - XBF_READ_AHEAD | _XBF_RUN_QUEUES); + _XBF_RUN_QUEUES); bp->b_flags |= flags & (XBF_READ | XBF_WRITE | XBF_ASYNC | \ - XBF_READ_AHEAD | _XBF_RUN_QUEUES); + _XBF_RUN_QUEUES); BUG_ON(bp->b_bn == XFS_BUF_DADDR_NULL); @@ -1093,7 +1093,7 @@ xfs_buf_iostart( * a shutdown situation, for example). */ status = (flags & XBF_WRITE) ? - xfs_buf_iostrategy(bp) : xfs_buf_iorequest(bp); + xfs_buf_iostrategy(bp, flags) : xfs_buf_iorequest(bp, flags); /* Wait for I/O if we are not an async request. 
* Note: async I/O request completion will release the buffer, @@ -1172,7 +1172,8 @@ xfs_buf_bio_end_io( STATIC void _xfs_buf_ioapply( - xfs_buf_t *bp) + xfs_buf_t *bp, + xfs_buf_flags_t flags) { int i, rw, map_i, total_nr_pages, nr_pages; struct bio *bio; @@ -1194,7 +1195,7 @@ _xfs_buf_ioapply( rw = (bp->b_flags & XBF_WRITE) ? WRITE_SYNC : READ_SYNC; } else { rw = (bp->b_flags & XBF_WRITE) ? WRITE : - (bp->b_flags & XBF_READ_AHEAD) ? READA : READ; + (flags & XBF_READ_AHEAD) ? READA : READ; } /* Special code path for reading a sub page size buffer in -- @@ -1279,7 +1280,8 @@ submit_io: int xfs_buf_iorequest( - xfs_buf_t *bp) + xfs_buf_t *bp, + xfs_buf_flags_t flags) { XB_TRACE(bp, "iorequest", 0); @@ -1299,7 +1301,7 @@ xfs_buf_iorequest( * all the I/O from calling xfs_buf_ioend too early. */ atomic_set(&bp->b_io_remaining, 1); - _xfs_buf_ioapply(bp); + _xfs_buf_ioapply(bp, flags); _xfs_buf_ioend(bp, 0); xfs_buf_rele(bp); @@ -1775,7 +1777,7 @@ xfsbufd( ASSERT(target == bp->b_target); list_del_init(&bp->b_list); - xfs_buf_iostrategy(bp); + xfs_buf_iostrategy(bp, bp->b_flags); count++; } @@ -1819,7 +1821,7 @@ xfs_flush_buftarg( else list_del_init(&bp->b_list); - xfs_buf_iostrategy(bp); + xfs_buf_iostrategy(bp, bp->b_flags); } if (wait) --- fs/xfs/linux-2.6/xfs_buf.h_1.122 2007-11-02 13:34:39.000000000 +1100 +++ fs/xfs/linux-2.6/xfs_buf.h 2007-11-02 13:38:35.000000000 +1100 @@ -191,14 +191,14 @@ extern void xfs_buf_unlock(xfs_buf_t *); extern void xfs_buf_ioend(xfs_buf_t *, int); extern void xfs_buf_ioerror(xfs_buf_t *, int); extern int xfs_buf_iostart(xfs_buf_t *, xfs_buf_flags_t); -extern int xfs_buf_iorequest(xfs_buf_t *); +extern int xfs_buf_iorequest(xfs_buf_t *, xfs_buf_flags_t); extern int xfs_buf_iowait(xfs_buf_t *); extern void xfs_buf_iomove(xfs_buf_t *, size_t, size_t, xfs_caddr_t, xfs_buf_rw_t); -static inline int xfs_buf_iostrategy(xfs_buf_t *bp) +static inline int xfs_buf_iostrategy(xfs_buf_t *bp, xfs_buf_flags_t flags) { - return bp->b_strat ? 
bp->b_strat(bp) : xfs_buf_iorequest(bp); + return bp->b_strat ? bp->b_strat(bp) : xfs_buf_iorequest(bp, flags); } static inline int xfs_buf_geterror(xfs_buf_t *bp) @@ -380,7 +380,7 @@ static inline int XFS_bwrite(xfs_buf_t * bp->b_flags |= _XBF_RUN_QUEUES; xfs_buf_delwri_dequeue(bp); - xfs_buf_iostrategy(bp); + xfs_buf_iostrategy(bp, bp->b_flags); if (iowait) { error = xfs_buf_iowait(bp); xfs_buf_relse(bp); @@ -395,7 +395,7 @@ static inline int xfs_bdwrite(void *mp, return xfs_buf_iostart(bp, XBF_DELWRI | XBF_ASYNC); } -#define XFS_bdstrat(bp) xfs_buf_iorequest(bp) +#define XFS_bdstrat(bp) xfs_buf_iorequest(bp, (bp)->b_flags) #define xfs_iowait(bp) xfs_buf_iowait(bp) --- fs/xfs/linux-2.6/xfs_lrw.c_1.271 2007-11-09 11:58:53.000000000 +1100 +++ fs/xfs/linux-2.6/xfs_lrw.c 2007-11-09 11:57:50.000000000 +1100 @@ -878,7 +878,7 @@ xfs_bdstrat_cb(struct xfs_buf *bp) mp = XFS_BUF_FSPRIVATE3(bp, xfs_mount_t *); if (!XFS_FORCED_SHUTDOWN(mp)) { - xfs_buf_iorequest(bp); + xfs_buf_iorequest(bp, bp->b_flags); return 0; } else { xfs_buftrace("XFS__BDSTRAT IOERROR", bp); @@ -912,7 +912,7 @@ xfsbdstrat( * if (XFS_BUF_IS_GRIO(bp)) { */ - xfs_buf_iorequest(bp); + xfs_buf_iorequest(bp, bp->b_flags); return 0; } --------------020301010005050909020108-- From owner-xfs@oss.sgi.com Thu Nov 8 17:08:53 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 17:08:59 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA918kYk022102 for ; Thu, 8 Nov 2007 17:08:52 -0800 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id MAA19109; Fri, 9 Nov 2007 12:08:46 +1100 Message-ID: 
<4733B34F.70407@sgi.com> Date: Fri, 09 Nov 2007 12:09:35 +1100 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.6 (X11/20070728) MIME-Version: 1.0 To: David Chinner CC: xfs-oss , xfs-dev Subject: Re: [patch] Fix broken inode clustering References: <20071108003848.GA66820511@sgi.com> In-Reply-To: <20071108003848.GA66820511@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4717/Thu Nov 8 16:05:49 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13598 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs Looks good Dave. David Chinner wrote: > The radix tree based inode caches did away with the inode cluster hashes, > replacing them with a bunch of masking and gang lookups on the radix tree. > > This masking got broken when moving the code to per-ag radix trees and > indexing by agino # rather than straight inode number. The result is > clustered inode writeback does not cluster and things can go extremely > slowly when there are lots of inodes to write. > > The following patch fixes this up by comparing agino # of the inode > found to the index of the cluster we are looking for. 
> > Signed-off-by: Dave Chinner > Tested-by: Torsten Kaiser > > --- > fs/xfs/xfs_iget.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Index: 2.6.x-xfs-new/fs/xfs/xfs_iget.c > =================================================================== > --- 2.6.x-xfs-new.orig/fs/xfs/xfs_iget.c 2007-11-02 13:44:46.000000000 +1100 > +++ 2.6.x-xfs-new/fs/xfs/xfs_iget.c 2007-11-07 13:08:42.534440675 +1100 > @@ -248,7 +248,7 @@ finish_inode: > icl = NULL; > if (radix_tree_gang_lookup(&pag->pag_ici_root, (void**)&iq, > first_index, 1)) { > - if ((iq->i_ino & mask) == first_index) > + if ((XFS_INO_TO_AGINO(mp, iq->i_ino) & mask) == first_index) > icl = iq->i_cluster; > } > > > > From owner-xfs@oss.sgi.com Thu Nov 8 19:17:04 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 19:17:07 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA93GxfV001867 for ; Thu, 8 Nov 2007 19:17:02 -0800 Received: from snort.melbourne.sgi.com (snort.melbourne.sgi.com [134.14.54.149]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id OAA21450; Fri, 9 Nov 2007 14:16:58 +1100 Received: from snort.melbourne.sgi.com (localhost [127.0.0.1]) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5) with ESMTP id lA93GvdD98944657; Fri, 9 Nov 2007 14:16:58 +1100 (AEDT) Received: (from dgc@localhost) by snort.melbourne.sgi.com (SGI-8.12.5/8.12.5/Submit) id lA93Gtnx99032984; Fri, 9 Nov 2007 14:16:55 +1100 (AEDT) X-Authentication-Warning: snort.melbourne.sgi.com: dgc set sender to dgc@sgi.com using -f Date: Fri, 9 Nov 2007 14:16:55 +1100 From: David Chinner To: Lachlan McIlroy Cc: David Chinner , xfs-oss , xfs-dev Subject: Re: [PATCH, RFC] Move AIL pushing into a separate 
thread Message-ID: <20071109031655.GQ66820511@sgi.com> References: <20071105050706.GW66820511@sgi.com> <4733AB27.70208@sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4733AB27.70208@sgi.com> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.91.2/4719/Thu Nov 8 17:49:00 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13599 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs

On Fri, Nov 09, 2007 at 11:34:47AM +1100, Lachlan McIlroy wrote:
> I like the sound of this Dave. I'm still going through the code in
> detail.
>
> Could we convert the ail lock into a mutex to ease the load? I know
> it may not improve throughput but it would at least relieve the CPUs
> to do other stuff.

Most of the time the ail lock is used for very short periods of time (e.g. less than ten lines of code) so a spin lock is appropriate. What we are seeing here is too many CPUs holding it for too long trying to do the work one CPU could easily do. i.e. the bug we are seeing here is the contention on the lock, not the type of lock. If we change to a sleeping lock, all people will see is a slowdown and that is much, much harder to diagnose on a production system than spin lock contention....

Cheers,

Dave.
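A rough user-space sketch of the point above, not code from the patch: if every caller only raises a shared push target and leaves the actual pushing to one thread, the locked section stays a few lines long no matter how many CPUs want log space. The names `ail_model` and `ail_request_push` are made up for illustration, and LSNs are compared as plain integers here rather than with XFS_LSN_CMP.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t lsn_t;     /* stand-in for xfs_lsn_t (assumption: plain integer compare) */

struct ail_model {
    lsn_t    target;        /* highest LSN anyone has asked to be pushed to */
    unsigned wakeups;       /* how many times the push thread was woken */
};

/*
 * Called by any CPU that needs log space.  In the kernel this would run
 * under the AIL spin lock; the critical section is only these few lines.
 * Returns 1 if the push thread had to be woken, 0 otherwise.
 */
static int ail_request_push(struct ail_model *ail, lsn_t threshold)
{
    assert(ail != 0);
    if (threshold > ail->target) {
        ail->target = threshold;
        ail->wakeups++;     /* models wake_up_process() on the push thread */
        return 1;
    }
    return 0;               /* someone already asked for at least this much */
}
```

Callers asking for a threshold at or below the current target return immediately, so the lock is never held across item flushing.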
-- Dave Chinner Principal Engineer SGI Australian Software Group From owner-xfs@oss.sgi.com Thu Nov 8 21:02:40 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 21:02:43 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA952YG8015282 for ; Thu, 8 Nov 2007 21:02:38 -0800 Received: from chook.melbourne.sgi.com (chook.melbourne.sgi.com [134.14.54.237]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA23484; Fri, 9 Nov 2007 16:02:29 +1100 Received: by chook.melbourne.sgi.com (Postfix, from userid 16346) id 2C8B058C38F7; Fri, 9 Nov 2007 16:02:29 +1100 (EST) To: sgi.bugs.xfs@engr.sgi.com Cc: xfs@oss.sgi.com Subject: TAKE 972915 - Fix broken inode cluster setup. Message-Id: <20071109050229.2C8B058C38F7@chook.melbourne.sgi.com> Date: Fri, 9 Nov 2007 16:02:29 +1100 (EST) From: dgc@sgi.com (David Chinner) X-Virus-Scanned: ClamAV 0.91.2/4721/Thu Nov 8 19:47:43 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13600 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: dgc@sgi.com Precedence: bulk X-list: xfs Fix broken inode cluster setup. The radix tree based inode caches did away with the inode cluster hashes, replacing them with a bunch of masking and gang lookups on the radix tree. This masking got broken when moving the code to per-ag radix trees and indexing by agino # rather than straight inode number. The result is clustered inode writeback does not cluster and things can go extremely slowly when there are lots of inodes to write. Fix it up by comparing the agino # of the inode we just looked up to the index of the cluster we are looking for. 
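To illustrate the masking problem described above, here is a minimal sketch under a simplified inode layout. It assumes the low AGINO_BITS bits of an inode number hold the per-AG index and the bits above hold the AG number; the constants and helper names (`make_ino`, `ino_to_agino`, `same_cluster_broken`, `same_cluster_fixed`) are hypothetical, whereas real XFS derives the widths from the superblock and uses XFS_INO_TO_AGINO().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed layout, for illustration only. */
#define AGINO_BITS          20u
#define INODES_PER_CLUSTER  64u

/* Build a full inode number from an AG number and an AG-relative index. */
static uint64_t make_ino(uint32_t agno, uint32_t agino)
{
    assert(agino < (1u << AGINO_BITS));
    return ((uint64_t)agno << AGINO_BITS) | agino;
}

/* Strip the AG number, leaving the AG-relative inode index. */
static uint32_t ino_to_agino(uint64_t ino)
{
    return (uint32_t)(ino & ((1u << AGINO_BITS) - 1));
}

/* Broken check: masks the full inode number, so the AG bits leak into
 * the comparison against an AG-relative cluster index. */
static int same_cluster_broken(uint64_t found_ino, uint32_t first_index)
{
    uint64_t mask = ~(uint64_t)(INODES_PER_CLUSTER - 1);
    return (found_ino & mask) == first_index;
}

/* Fixed check: reduce to the AG-relative index before masking. */
static int same_cluster_fixed(uint64_t found_ino, uint32_t first_index)
{
    uint32_t mask = ~(INODES_PER_CLUSTER - 1);
    return (ino_to_agino(found_ino) & mask) == first_index;
}
```

In this model the broken check still matches in AG 0, where the full inode number and the agino coincide, which may be why a bug of this kind is easy to miss on small filesystems.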
Signed-off-by: Dave Chinner Tested-by: Torsten Kaiser Date: Fri Nov 9 16:02:06 AEDT 2007 Workarea: chook.melbourne.sgi.com:/build/dgc/isms/2.6.x-xfs Inspected by: lachlan@sgi.com The following file(s) were checked into: longdrop.melbourne.sgi.com:/isms/linux/2.6.x-xfs-melb Modid: xfs-linux-melb:xfs-kern:30033a fs/xfs/xfs_iget.c - 1.238 - changed http://oss.sgi.com/cgi-bin/cvsweb.cgi/xfs-linux/xfs_iget.c.diff?r1=text&tr1=1.238&r2=text&tr2=1.237&f=h - Make the cluster lookup use the agino rather than full inode number so that clusters are correctly built. From owner-xfs@oss.sgi.com Thu Nov 8 21:23:19 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 21:23:38 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.5 required=5.0 tests=AWL,BAYES_00, T_STOX_BOUND_090909_B autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA95NDcJ017668 for ; Thu, 8 Nov 2007 21:23:17 -0800 Received: from [134.14.55.78] (redback.melbourne.sgi.com [134.14.55.78]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA23821; Fri, 9 Nov 2007 16:23:13 +1100 Message-ID: <4733EEF2.9010504@sgi.com> Date: Fri, 09 Nov 2007 16:24:02 +1100 From: Lachlan McIlroy Reply-To: lachlan@sgi.com User-Agent: Thunderbird 2.0.0.6 (X11/20070728) MIME-Version: 1.0 To: xfs-dev , xfs-oss Subject: [PATCH] bulkstat fixups Content-Type: multipart/mixed; boundary="------------080005070805090205050607" X-Virus-Scanned: ClamAV 0.91.2/4721/Thu Nov 8 19:47:43 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13601 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: lachlan@sgi.com Precedence: bulk X-list: xfs This is a multi-part message in MIME format. 
--------------080005070805090205050607
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Here's a collection of fixups for bulkstat for all the remaining issues.

- sanity check for NULL user buffer in xfs_ioc_bulkstat[_compat]()

- remove the special case for XFS_IOC_FSBULKSTAT with count == 1. This special case causes bulkstat to fail because the special case uses xfs_bulkstat_single() instead of xfs_bulkstat() and the two functions have different semantics. xfs_bulkstat() will return the next inode after the one supplied while skipping internal inodes (i.e. quota inodes). xfs_bulkstat_single() will only look up the inode supplied and return an error if it is an internal inode.

- in xfs_bulkstat(), need to initialise 'lastino' to the inode supplied so in cases where we return without examining any inodes the scan won't restart back at zero.

- sanity check for valid *ubcountp values. Cannot sanity check for valid ubuffer here because some users of xfs_bulkstat() don't supply a buffer.

- checks against 'ubleft' (the space left in the user's buffer) should be against 'statstruct_size' which is the supplied minimum object size. The mixture of checks against statstruct_size and 0 was one of the reasons we were skipping inodes.

- if the formatter function returns BULKSTAT_RV_NOTHING and an error and the error is not ENOENT or EINVAL then we need to abort the scan. ENOENT is for inodes that are no longer valid and we just skip them. EINVAL is returned if we try to look up an internal inode so we skip them too. For a DMF scan if the inode and DMF attribute cannot fit into the space left in the user's buffer it would return ERANGE. We didn't handle this error and skipped the inode. We would continue to skip inodes until one fitted into the user's buffer or we completed the scan.

- put back the recalculation of agino (that got removed with the last fix) at the end of the while loop.
This is because the code at the start of the loop expects agino to be the last inode examined if it is non-zero. - if we found some inodes but then encountered an error, return success this time and the error next time. If the formatter aborted with ENOMEM we will now return this error but only if we couldn't read any inodes. Previously if we encountered ENOMEM without reading any inodes we returned a zero count and no error which falsely indicated the scan was complete. --------------080005070805090205050607 Content-Type: text/x-patch; name="bulkstat.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="bulkstat.diff" --- fs/xfs/linux-2.6/xfs_ioctl.c_1.158 2007-11-09 15:51:03.000000000 +1100 +++ fs/xfs/linux-2.6/xfs_ioctl.c 2007-11-09 11:57:50.000000000 +1100 @@ -1024,24 +1024,20 @@ xfs_ioc_bulkstat( if ((count = bulkreq.icount) <= 0) return -XFS_ERROR(EINVAL); + if (bulkreq.ubuffer == NULL) + return -XFS_ERROR(EINVAL); + if (cmd == XFS_IOC_FSINUMBERS) error = xfs_inumbers(mp, &inlast, &count, bulkreq.ubuffer, xfs_inumbers_fmt); else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE) error = xfs_bulkstat_single(mp, &inlast, bulkreq.ubuffer, &done); - else { /* XFS_IOC_FSBULKSTAT */ - if (count == 1 && inlast != 0) { - inlast++; - error = xfs_bulkstat_single(mp, &inlast, - bulkreq.ubuffer, &done); - } else { - error = xfs_bulkstat(mp, &inlast, &count, - (bulkstat_one_pf)xfs_bulkstat_one, NULL, - sizeof(xfs_bstat_t), bulkreq.ubuffer, - BULKSTAT_FG_QUICK, &done); - } - } + else /* XFS_IOC_FSBULKSTAT */ + error = xfs_bulkstat(mp, &inlast, &count, + (bulkstat_one_pf)xfs_bulkstat_one, NULL, + sizeof(xfs_bstat_t), bulkreq.ubuffer, + BULKSTAT_FG_QUICK, &done); if (error) return -error; --- fs/xfs/linux-2.6/xfs_ioctl32.c_1.23 2007-11-02 14:27:11.000000000 +1100 +++ fs/xfs/linux-2.6/xfs_ioctl32.c 2007-11-02 14:13:27.000000000 +1100 @@ -292,6 +292,9 @@ xfs_ioc_bulkstat_compat( if ((count = bulkreq.icount) <= 0) return -XFS_ERROR(EINVAL); + if (bulkreq.ubuffer == 
NULL) + return -XFS_ERROR(EINVAL); + if (cmd == XFS_IOC_FSINUMBERS) error = xfs_inumbers(mp, &inlast, &count, bulkreq.ubuffer, xfs_inumbers_fmt_compat); --- fs/xfs/xfs_itable.c_1.157 2007-10-25 17:22:09.000000000 +1000 +++ fs/xfs/xfs_itable.c 2007-11-01 17:22:28.000000000 +1100 @@ -353,7 +353,7 @@ xfs_bulkstat( xfs_inobt_rec_incore_t *irbp; /* current irec buffer pointer */ xfs_inobt_rec_incore_t *irbuf; /* start of irec buffer */ xfs_inobt_rec_incore_t *irbufend; /* end of good irec buffer entries */ - xfs_ino_t lastino=0; /* last inode number returned */ + xfs_ino_t lastino; /* last inode number returned */ int nbcluster; /* # of blocks in a cluster */ int nicluster; /* # of inodes in a cluster */ int nimask; /* mask for inode clusters */ @@ -373,6 +373,7 @@ xfs_bulkstat( * Get the last inode value, see if there's nothing to do. */ ino = (xfs_ino_t)*lastinop; + lastino = ino; dip = NULL; agno = XFS_INO_TO_AGNO(mp, ino); agino = XFS_INO_TO_AGINO(mp, ino); @@ -382,6 +383,9 @@ xfs_bulkstat( *ubcountp = 0; return 0; } + if (!ubcountp || *ubcountp <= 0) { + return EINVAL; + } ubcount = *ubcountp; /* statstruct's */ ubleft = ubcount * statstruct_size; /* bytes */ *ubcountp = ubelem = 0; @@ -560,7 +564,7 @@ xfs_bulkstat( * Now process this chunk of inodes. */ for (agino = irbp->ir_startino, chunkidx = clustidx = 0; - ubleft > 0 && + ubleft >= statstruct_size && irbp->ir_freecount < XFS_INODES_PER_CHUNK; chunkidx++, clustidx++, agino++) { ASSERT(chunkidx < XFS_INODES_PER_CHUNK); @@ -663,15 +667,13 @@ xfs_bulkstat( ubleft, private_data, bno, &ubused, dip, &fmterror); if (fmterror == BULKSTAT_RV_NOTHING) { - if (error == EFAULT) { - ubleft = 0; - rval = error; - break; - } - else if (error == ENOMEM) + if (error && error != ENOENT && + error != EINVAL) { ubleft = 0; - else - lastino = ino; + rval = error; + break; + } + lastino = ino; continue; } if (fmterror == BULKSTAT_RV_GIVEUP) { @@ -694,11 +696,12 @@ xfs_bulkstat( /* * Set up for the next loop iteration. 
*/ - if (ubleft > 0) { + if (ubleft >= statstruct_size) { if (end_of_ag) { agno++; agino = 0; - } + } else + agino = XFS_INO_TO_AGINO(mp, lastino); } else break; } @@ -707,6 +710,11 @@ xfs_bulkstat( */ kmem_free(irbuf, irbsize); *ubcountp = ubelem; + /* + * Found some inodes, return them now and return the error next time. + */ + if (ubelem) + rval = 0; if (agno >= mp->m_sb.sb_agcount) { /* * If we ran out of filesystem, mark lastino as off --------------080005070805090205050607-- From owner-xfs@oss.sgi.com Thu Nov 8 21:33:57 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 21:34:05 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA95XoLt018838 for ; Thu, 8 Nov 2007 21:33:56 -0800 Received: from [134.14.55.89] (soarer.melbourne.sgi.com [134.14.55.89]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA24049; Fri, 9 Nov 2007 16:33:52 +1100 Message-ID: <4733F198.1090107@sgi.com> Date: Fri, 09 Nov 2007 16:35:20 +1100 From: Vlad Apostolov User-Agent: Thunderbird 2.0.0.6 (X11/20070728) MIME-Version: 1.0 To: lachlan@sgi.com CC: xfs-dev , xfs-oss Subject: Re: [PATCH] bulkstat fixups References: <4733EEF2.9010504@sgi.com> In-Reply-To: <4733EEF2.9010504@sgi.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4721/Thu Nov 8 19:47:43 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13602 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: vapo@sgi.com Precedence: bulk X-list: xfs It is looking good Lachlan. 
I also verified the patch with XFS QA and Judith reported that it fixed the xfs_bulkstat() problem - skipping inodes in the last AG. Regards, Vlad Lachlan McIlroy wrote: > Here's a collection of fixups for bulkstat for all the remaining issues. > > - sanity check for NULL user buffer in xfs_ioc_bulkstat[_compat]() > > - remove the special case for XFS_IOC_FSBULKSTAT with count == 1. > This special > case causes bulkstat to fail because the special case uses > xfs_bulkstat_single() > instead of xfs_bulkstat() and the two functions have different > semantics. > xfs_bulkstat() will return the next inode after the one supplied > while skipping > internal inodes (i.e. quota inodes). xfs_bulkstat_single() will only > look up the > inode supplied and return an error if it is an internal inode. > > - in xfs_bulkstat(), need to initialise 'lastino' to the inode > supplied so in cases > where we return without examining any inodes the scan won't restart > back at zero. > > - sanity check for valid *ubcountp values. Cannot sanity check for > valid ubuffer > here because some users of xfs_bulkstat() don't supply a buffer. > > - checks against 'ubleft' (the space left in the user's buffer) should > be against > 'statstruct_size' which is the supplied minimum object size. The > mixture of > checks against statstruct_size and 0 was one of the reasons we were > skipping > inodes. > > - if the formatter function returns BULKSTAT_RV_NOTHING and an error > and the error > is not ENOENT or EINVAL then we need to abort the scan. ENOENT is > for inodes that > are no longer valid and we just skip them. EINVAL is returned if we > try to look up > an internal inode so we skip them too. For a DMF scan if the inode > and DMF > attribute cannot fit into the space left in the user's buffer it > would return > ERANGE. We didn't handle this error and skipped the inode. We > would continue to > skip inodes until one fitted into the user's buffer or we completed > the scan.
> > - put back the recalculation of agino (that got removed with the last > fix) at the > end of the while loop. This is because the code at the start of the > loop expects > agino to be the last inode examined if it is non-zero. > > - if we found some inodes but then encountered an error, return > success this time > and the error next time. If the formatter aborted with ENOMEM we > will now return > this error but only if we couldn't read any inodes. Previously if > we encountered > ENOMEM without reading any inodes we returned a zero count and no > error which > falsely indicated the scan was complete. From owner-xfs@oss.sgi.com Thu Nov 8 21:53:52 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 21:53:56 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA95rnXx021081 for ; Thu, 8 Nov 2007 21:53:51 -0800 Received: from timothy-shimmins-power-mac-g5.local (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id QAA24166; Fri, 9 Nov 2007 16:40:44 +1100 Message-ID: <4733F301.9020706@sgi.com> Date: Fri, 09 Nov 2007 16:41:21 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Andreas Gruenbacher CC: linux-xfs@oss.sgi.com, Gerald Bringhurst , Brandon Philips Subject: Re: acl and attr: Fix path walking code References: <200710281858.24428.agruen@suse.de> In-Reply-To: <200710281858.24428.agruen@suse.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4721/Thu Nov 8 19:47:43 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13603 X-ecartis-version: Ecartis 
v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Hi Andreas, Andreas Gruenbacher wrote: > Hello, > > the tree walking code in acl and attr broke when resolve_symlinks() was > introduced (by me, unfortunately). Following symlinks passed in on the > command line is the intended behavior for the tools (unless in -P mode). The > first version was buggy, and so someone "fixed" it by replacing readlink() > with realpath() in resolve_symlinks(). > > The result is that the output of getfattr and getfacl will show pathnames that > may point anywhere. When processing a directory tree it sometimes is helpful > to treat symlinks as regular files, but resolving the pathnames is totally > wrong. > > After running into problem after problem with nftw and never ending up with > even half-way clean code, I think it's time to ditch it altogether and > replace it with sane code. So here are two patches, one for attr and one for > acl, that do that. > > Files include/walk_tree.h and libmisc/walk_tree.c are identical in both > patches; that code is shared between the two packages. > > Okay to apply? > > Thanks, > Andreas > I applied the attr patch and tried it out on xfstests/062 (which I believe was based on one of your tests). ========================================================== --- 062.out 2006-03-28 12:52:32.000000000 +1000 +++ 062.out.bad 2007-11-09 15:38:09.000000000 +1100 @@ -526,6 +526,10 @@ user.name=0xbabe user.name3=0xdeface +# file: SCRATCH_MNT/lnk +trusted.name=0xbabe +trusted.name3=0xdeface + # file: SCRATCH_MNT/dev/b trusted.name=0xbabe trusted.name3=0xdeface @@ -562,6 +566,10 @@ user.1=0x3233 user.x=0x797a +# file: SCRATCH_MNT/descend/and/ascend +trusted.9=0x3837 +trusted.a=0x6263 + *** directory descent without following symlinks # file: SCRATCH_MNT/reg ========================================================== So for the following of symlinks with getfattr -L i.e.
echo "*** directory descent with us following symlinks" getfattr -h -L -R -m '.' -e hex $SCRATCH_MNT Looking at the 2nd difference... It now picks up descend/and/ascend which contains the symlink of descend/and --> here/up. So that makes sense, it is following a symlink which it didn't before and finding a dir, "up" in the linked dir. Good. Looking at 1st difference... It is now showing up "lnk" which is a symlink: lnk --> dir So why is it showing this up and yet it is not showing descend/and (which is a link to here/up)? So yes we are following symlinks but are we supposed to just do the symlinks themselves as well? BTW, do we not allow user EAs on symlinks? (I've forgotten) --Tim From owner-xfs@oss.sgi.com Thu Nov 8 23:39:22 2007 Received: with ECARTIS (v1.0.0; list xfs); Thu, 08 Nov 2007 23:39:27 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from larry.melbourne.sgi.com (larry.melbourne.sgi.com [134.14.52.130]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with SMTP id lA97dJSV032511 for ; Thu, 8 Nov 2007 23:39:20 -0800 Received: from timothy-shimmins-power-mac-g5.local (boing.melbourne.sgi.com [134.14.55.141]) by larry.melbourne.sgi.com (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id SAA26203; Fri, 9 Nov 2007 18:39:19 +1100 Message-ID: <47340ECC.4000205@sgi.com> Date: Fri, 09 Nov 2007 18:39:56 +1100 From: Timothy Shimmin User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Andreas Gruenbacher CC: linux-xfs@oss.sgi.com, Gerald Bringhurst , Brandon Philips Subject: Re: acl and attr: Fix path walking code References: <200710281858.24428.agruen@suse.de> In-Reply-To: <200710281858.24428.agruen@suse.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.91.2/4723/Thu Nov 8 22:33:05 2007 on oss.sgi.com 
X-Virus-Status: Clean X-archive-position: 13604 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: tes@sgi.com Precedence: bulk X-list: xfs Andreas Gruenbacher wrote: > Hello, > > the tree walking code in acl and attr broke when resolve_symlinks() was > introduced (by me, unfortunately). Following symlinks passed in on the > command line is the intended behavior for the tools (unless in -P mode). The > first version was buggy, and so someone "fixed" it by replacing readlink() > with realpath() in resolve_symlinks(). > > The result is that the output of getfattr and getfacl will show pathnames that > may point anywhere. When processing a directory tree it sometimes is helpful > to treat symlinks as regular files, but resolving the pathnames is totally > wrong. > > After running into problem after problem with nftw and never ending up with > even half-way clean code, I think it's time to ditch it altogether and > replace it with sane code. So here are two patches, one for attr and one for > acl, that do that. > > Files include/walk_tree.h and libmisc/walk_tree.c are identical in both > patches; that code is shared between the two packages. > > Okay to apply? > > Thanks, > Andreas > You mention -L/-P is like chown. However, -P for getfattr isn't about not walking symlinks to directories, it's about skipping symlinks altogether, right?
--Tim From owner-xfs@oss.sgi.com Fri Nov 9 01:14:24 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 09 Nov 2007 01:14:32 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=AWL,BAYES_20,SPF_HELO_PASS autolearn=ham version=3.3.0-r574664 Received: from lucidpixels.com (lucidpixels.com [75.144.35.66]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA99EMsS020232 for ; Fri, 9 Nov 2007 01:14:24 -0800 Received: by lucidpixels.com (Postfix, from userid 1001) id D0FF71C000263; Fri, 9 Nov 2007 04:14:27 -0500 (EST) Received: from localhost (localhost [127.0.0.1]) by lucidpixels.com (Postfix) with ESMTP id AEF8D4019521; Fri, 9 Nov 2007 04:14:27 -0500 (EST) Date: Fri, 9 Nov 2007 04:14:27 -0500 (EST) From: Justin Piszcz X-X-Sender: jpiszcz@p34.internal.lan To: Carlos Carvalho cc: Jeff Lessem , root@c3sl.ufpr.br, Dan Williams , =?iso-8859-1?Q?BERTRAND_Jo=EBl?= , Neil Brown , linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state In-Reply-To: <18227.33346.994456.270194@fisica.ufpr.br> Message-ID: References: <18222.16003.92062.970530@notabene.brown> <47303FB8.7000801@systella.fr> <1194398700.2970.18.camel@dwillia2-linux.ch.intel.com> <47314653.80905@Lessem.org> <18227.33346.994456.270194@fisica.ufpr.br> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.91.2/4724/Thu Nov 8 22:48:44 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13605 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: jpiszcz@lucidpixels.com Precedence: bulk X-list: xfs On Thu, 8 Nov 2007, Carlos Carvalho wrote: > Jeff Lessem (Jeff@Lessem.org) wrote on 6 November 2007 22:00: > >Dan Williams wrote: > > > The following patch, also attached, cleans up cases where the code looks > > > at 
sh->ops.pending when it should be looking at the consistent > > > stack-based snapshot of the operations flags. > > > >I tried this patch (against a stock 2.6.23), and it did not work for > >me. Not only did I/O to the affected RAID5 & XFS partition stop, but > >also I/O to all other disks. I was not able to capture any debugging > >information, but I should be able to do that tomorrow when I can hook > >a serial console to the machine. > > > >I'm not sure if my problem is identical to these others, as mine only > >seems to manifest with RAID5+XFS. The RAID rebuilds with no problem, > >and I've not had any problems with RAID5+ext3. > > Us too! We're stuck trying to build a disk server with several disks > in a raid5 array, and the rsync from the old machine stops writing to > the new filesystem. It only happens under heavy IO. We can make it > lock without rsync, using 8 simultaneous dd's to the array. All IO > stops, including the resync after a newly created raid or after an > unclean reboot. > > We could not trigger the problem with ext3 or reiser3; it only happens > with xfs. > - > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Including the XFS mailing list as well; can you provide more information to them?
From owner-xfs@oss.sgi.com Fri Nov 9 06:37:11 2007 Received: with ECARTIS (v1.0.0; list xfs); Fri, 09 Nov 2007 06:37:23 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.3.0-r574664 (2007-09-11) on oss.sgi.com X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.3.0-r574664 Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.182]) by oss.sgi.com (8.12.11.20060308/8.12.10/SuSE Linux 0.7) with ESMTP id lA9Eb5R1010445 for ; Fri, 9 Nov 2007 06:37:08 -0800 Received: by py-out-1112.google.com with SMTP id u77so1092696pyb for ; Fri, 09 Nov 2007 06:37:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:reply-to:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; bh=0Y6W1bg1EB8oyZ+kPaaOFGaRejveHLlNPWtaLM3r6MU=; b=UGskOKJpj62NjBlAuTwMORWsOakzg5mJh0+6xusb9x4hzl9U4kHS2WyK9JQZgK1IjCWJweZ3jI7bTC274ylU430cybp/t4XNS77BiMJnEUyy1z4uE0SN//ai52Sax/rH0OXCmEmfxa/9Rc062JWJxMmFUCgocoi70jTvogSeEaM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:reply-to:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; b=DvvIO4nZO2Wz23PtQv1f0Guzhc7AELqJBbHArF+5LrL2fWTGBn/QC36RmeXux3pWHGay/IQRuP8cC/veXxPJe0Km+f4YgyRXFoBPeAE3V5fiK014V9turJWqQeLLN99W6V6rxsE65N29q7PQ6Q0dm44o1gr2ETKGnFZmV5rQHFM= Received: by 10.64.203.4 with SMTP id a4mr5752184qbg.1194617364250; Fri, 09 Nov 2007 06:09:24 -0800 (PST) Received: by 10.65.137.2 with HTTP; Fri, 9 Nov 2007 06:09:24 -0800 (PST) Message-ID: <3993afa00711090609j7ba8dee0t8c1772f8654eb2f0@mail.gmail.com> Date: Fri, 9 Nov 2007 12:09:24 -0200 From: "Fabiano Silva" Reply-To: fabiano@c3sl.ufpr.br To: "Justin Piszcz" Subject: Re: 2.6.23.1: mdadm/raid5 hung/d-state Cc: "Carlos Carvalho" , "Jeff Lessem" , root@c3sl.ufpr.br, "Dan Williams" , "=?ISO-8859-1?Q?BERTRAND_Jo=EBl?=" , "Neil Brown" 
, linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, xfs@oss.sgi.com In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_35744_18065873.1194617364213" References: <47303FB8.7000801@systella.fr> <1194398700.2970.18.camel@dwillia2-linux.ch.intel.com> <47314653.80905@Lessem.org> <18227.33346.994456.270194@fisica.ufpr.br> X-Google-Sender-Auth: 6d02a7b44afc9315 X-Virus-Scanned: ClamAV 0.91.2/4724/Thu Nov 8 22:48:44 2007 on oss.sgi.com X-Virus-Status: Clean X-archive-position: 13606 X-ecartis-version: Ecartis v1.0.0 Sender: xfs-bounce@oss.sgi.com Errors-to: xfs-bounce@oss.sgi.com X-original-sender: fabiano@c3sl.ufpr.br Precedence: bulk X-list: xfs ------=_Part_35744_18065873.1194617364213 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline On Nov 9, 2007 7:14 AM, Justin Piszcz wrote: > > > > On Thu, 8 Nov 2007, Carlos Carvalho wrote: > > > Jeff Lessem (Jeff@Lessem.org) wrote on 6 November 2007 22:00: > > >Dan Williams wrote: > > > > The following patch, also attached, cleans up cases where the code looks > > > > at sh->ops.pending when it should be looking at the consistent > > > > stack-based snapshot of the operations flags. > > > > > >I tried this patch (against a stock 2.6.23), and it did not