| To: | "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>, xfs-dev <xfs-dev@xxxxxxx> |
|---|---|
| Subject: | [REVIEW 3/3] - xfs_repair speedups (enhanced prefetch) |
| From: | "Barry Naujok" <bnaujok@xxxxxxx> |
| Date: | Tue, 05 Jun 2007 12:23:32 +1000 |
| Organization: | SGI |
| Sender: | xfs-bounce@xxxxxxxxxxx |
| User-agent: | Opera Mail/9.10 (Win32) |
|
Back in Jan 2007, Michael Nishimoto from Agami posted a patch to a 2.7.x
based xfs_repair tree to perform prefetch/readahead, which primed the Linux
block buffer cache for the main xfs_repair processing threads
(http://oss.sgi.com/archives/xfs/2007-01/msg00135.html). Benchmarking this
and the 2.8.1x xfs_repair at the time revealed very interesting numbers:
2.8.x was very slow using direct I/O and the libxfs cache. Researching this
technique and integrating it with the libxfs cache proved rather
challenging. Many changes were required:

- proper xfs_buf_t locking for multi-threaded access.
- unified I/O sizes for inodes and metadata blocks.
- serialising as much I/O as possible.
- handling queuing, I/O and processing in parallel while minimising
  starvation, especially when only a subset of the metadata can be
  stored in memory.
- smarter work queues.

Unifying the I/O sizes was a significant change which resulted in a lot of
improvements in both performance and correctness, particularly with inode
blocks. During phase 6, inodes are accessed using xfs_iread/xfs_iget, which
use inode "clusters" that are either 8KB or the blocksize, whichever is
greater. Phases 3/4 read using inode "chunks", which can be 16KB or larger.
With the libxfs caching method, this meant all data had to be
flushed/purged before phase 6 started, and all the inodes read again.
Also, one part of the libxfs transaction code didn't release buffers
properly. This behaviour has been seen in the past with the infamous
"shake on cache 0x######## left # nodes!?" warning.

Batch reading/serialising I/O requests in the prefetch code had major
benefits when metadata is close together, especially with RAIDs. Also, the
AIO/LIO code was yanked in favour of the threaded I/O prefetch.

Synchronising the queuing, I/O and processing threads efficiently,
especially in low-memory conditions, was the most challenging aspect. Most
of the changes for this are in prefetch.c, with minor changes for I/O in
the phases.

Phase 6 also has the dir_stack code eliminated, as it is not required. It
now processes the directory inodes as per the layout of the inode AVL tree
(which it did anyway after doing a path traversal).

The patch will have a lot of apparently noop changes; these are automatic
EOL whitespace cleanups.
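To make the inode I/O size point more concrete, here is a rough sketch of
the mismatch. This is not code from the patch; the helper names and the
example geometry are assumptions for illustration only. It just shows why
phases 3/4 and phase 6 ended up reading inodes at different sizes, and why
reading at a single unified size lets the buffers primed earlier be found
again in phase 6 instead of being purged and re-read:

```c
/*
 * Hypothetical illustration only (not from the patch). Values follow
 * the description above: phase 6 reads inode "clusters" of
 * max(8K, blocksize); phases 3/4 read inode "chunks" of 64 inodes,
 * i.e. 16KB or larger.
 */
#include <stdio.h>

#define MAX(a, b)		((a) > (b) ? (a) : (b))
#define INODES_PER_CHUNK	64	/* XFS allocates inodes 64 at a time */

static int
cluster_bytes(int blocksize)
{
	/* phase 6 (xfs_iread/xfs_iget) I/O unit */
	return MAX(8192, blocksize);
}

static int
chunk_bytes(int inodesize)
{
	/* phases 3/4 I/O unit */
	return INODES_PER_CHUNK * inodesize;
}

int
main(void)
{
	int	blocksize = 4096;	/* example geometry only */
	int	inodesize = 256;

	printf("cluster I/O = %d bytes, chunk I/O = %d bytes\n",
		cluster_bytes(blocksize), chunk_bytes(inodesize));
	/*
	 * With two different I/O sizes, a buffer cached during phases
	 * 3/4 cannot simply be reused by a phase 6 lookup of another
	 * length, which is why the cache had to be flushed between
	 * phases. Reading everything at one unified size avoids that.
	 */
	return 0;
}
```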
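The queuing/I/O/processing split itself is roughly the classic bounded
producer/consumer pattern. The sketch below is not the prefetch.c code
(all of the pf_* names are made up for illustration); it only shows the
general shape: requests are kept sorted by disk address so the prefetch
thread can issue them as serially as possible, and the queue is bounded so
prefetch cannot run arbitrarily far ahead of processing when only part of
the metadata fits in memory:

```c
/*
 * Minimal, hypothetical sketch of the queue/prefetch/process split.
 * Not the xfs_repair code; pf_request/pf_queue and friends are
 * invented names.
 */
#include <pthread.h>
#include <stddef.h>

#define PF_QUEUE_MAX	256		/* cap on outstanding requests */

typedef struct pf_request {
	long long		blkno;	/* disk address of the metadata */
	int			length;	/* unified I/O size */
	struct pf_request	*next;
} pf_request_t;

typedef struct pf_queue {
	pf_request_t		*head;	/* kept sorted by blkno */
	int			count;
	pthread_mutex_t		lock;
	pthread_cond_t		not_empty;	/* prefetch thread waits */
	pthread_cond_t		not_full;	/* queuing thread waits */
} pf_queue_t;

static pf_queue_t pf_queue = {
	.lock		= PTHREAD_MUTEX_INITIALIZER,
	.not_empty	= PTHREAD_COND_INITIALIZER,
	.not_full	= PTHREAD_COND_INITIALIZER,
};

/* queuing thread: insert in disk order, block while the queue is full */
void
pf_queue_push(pf_queue_t *q, pf_request_t *req)
{
	pf_request_t	**pp;

	pthread_mutex_lock(&q->lock);
	while (q->count >= PF_QUEUE_MAX)
		pthread_cond_wait(&q->not_full, &q->lock);

	/* keep the list sorted so reads are issued in ascending order */
	for (pp = &q->head; *pp && (*pp)->blkno < req->blkno; pp = &(*pp)->next)
		;
	req->next = *pp;
	*pp = req;
	q->count++;

	pthread_cond_signal(&q->not_empty);
	pthread_mutex_unlock(&q->lock);
}

/* prefetch thread: take the lowest-addressed request and read it */
pf_request_t *
pf_queue_pop(pf_queue_t *q)
{
	pf_request_t	*req;

	pthread_mutex_lock(&q->lock);
	while (q->head == NULL)
		pthread_cond_wait(&q->not_empty, &q->lock);

	req = q->head;
	q->head = req->next;
	q->count--;

	pthread_cond_signal(&q->not_full);
	pthread_mutex_unlock(&q->lock);
	return req;
}
```

As described above, the hard part is not the queue itself but keeping the
queuing, I/O and processing stages busy when only a subset of the metadata
fits in memory; most of those changes are in prefetch.c.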