
Re: [PATCH] xfs_repair: multithread phase 2

To: Dave Chinner <david@xxxxxxxxxxxxx>
Subject: Re: [PATCH] xfs_repair: multithread phase 2
From: Christoph Hellwig <hch@xxxxxxxxxxxxx>
Date: Tue, 4 Jan 2011 05:02:40 -0500
Cc: xfs@xxxxxxxxxxx
In-reply-to: <1294121588-17233-1-git-send-email-david@xxxxxxxxxxxxx>
References: <1294121588-17233-1-git-send-email-david@xxxxxxxxxxxxx>
User-agent: Mutt/1.5.21 (2010-09-15)
> This patch uses 32-way threading which results in no noticeable
> slowdown on single SATA drives with NCQ, but results in ~10x
> reduction in runtime on a 12 disk RAID-0 array.

Shouldn't we have at least an option to allow tuning this value,
similar to the ag_stride?  In fact I wonder why phases 3/4 should
use different values for it than phase 2.

> @@ -75,8 +80,10 @@ scan_sbtree(
>                               xfs_agblock_t           bno,
>                               xfs_agnumber_t          agno,
>                               int                     suspect,
> -                             int                     isroot),
> -     int             isroot)
> +                             int                     isroot,
> +                             struct aghdr_cnts       *agcnts),
> +     int             isroot,
> +     struct aghdr_cnts *agcnts)

Please make this a

        void *priv

to keep scan_sbtree generic.
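
Roughly along these lines, as an untested sketch. The types and the
names scan_sbtree_sketch and scanfunc_count are simplified stand-ins
made up for illustration, not the real xfs_repair definitions:

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the xfs_repair types. */
typedef uint32_t agblock_t;
typedef uint32_t agnumber_t;

struct aghdr_cnts {
	agnumber_t	agno;
	uint32_t	agffreeblks;	/* free blocks counted so far */
};

/* The callback only ever sees an opaque pointer. */
typedef void (*scan_func_t)(agblock_t bno, agnumber_t agno,
			    int suspect, int isroot, void *priv);

/*
 * Generic walker: it never looks inside priv, it just hands it on.
 * The real function would read and verify the block first.
 */
static void
scan_sbtree_sketch(
	agblock_t	root,
	agnumber_t	agno,
	int		suspect,
	scan_func_t	func,
	int		isroot,
	void		*priv)
{
	func(root, agno, suspect, isroot, priv);
}

/* Only the scan function knows what priv really points to. */
static void
scanfunc_count(
	agblock_t	bno,
	agnumber_t	agno,
	int		suspect,
	int		isroot,
	void		*priv)
{
	struct aghdr_cnts	*agcnts = priv;

	agcnts->agffreeblks++;
}
```

That way the scan functions that don't need the counters simply
ignore the pointer.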

>  void
> +scanfunc_bno(
> +     struct xfs_btree_block  *block,
> +     int                     level,
> +     xfs_agblock_t           bno,
> +     xfs_agnumber_t          agno,
> +     int                     suspect,
> +     int                     isroot,
> +     struct aghdr_cnts       *agcnts)
> +{
> +     return scanfunc_allocbt(block, level, bno, agno,
> +                             suspect, isroot, XFS_ABTB_MAGIC, agcnts);
> +}

Now that we have private data passed to the scanfuncs we could use that
to communicate if we're doing a bno or cnt scan.  Maybe writing it
directly into struct aghdr_cnts is too ugly; in that case we can have
a scan_priv structure that contains the magic and the aghdr_cnts.
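
A minimal sketch of that second variant, assuming a cut-down
aghdr_cnts and a made-up scanfunc_allocbt_sketch that takes the
on-disk magic directly instead of a btree block:

```c
#include <stdint.h>

#define XFS_ABTB_MAGIC	0x41425442	/* 'ABTB', by-block free space btree */
#define XFS_ABTC_MAGIC	0x41425443	/* 'ABTC', by-count free space btree */

/* Hypothetical, cut-down version of the per-AG counters. */
struct aghdr_cnts {
	uint32_t	agffreeblks;
};

/* Private data for the allocbt scan: which tree, and where to count. */
struct scan_priv {
	uint32_t		magic;
	struct aghdr_cnts	*agcnts;
};

/*
 * One scan function serves both trees.  Returns 0 if the block belongs
 * to the tree being scanned, -1 on a magic mismatch.
 */
static int
scanfunc_allocbt_sketch(
	uint32_t	block_magic,
	void		*priv)
{
	struct scan_priv	*sp = priv;

	if (block_magic != sp->magic)
		return -1;
	sp->agcnts->agffreeblks++;
	return 0;
}
```

The caller fills in one scan_priv per tree, and the scanfunc_bno and
scanfunc_cnt wrappers would no longer be needed.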

>  
>  void
>  scan_freelist(

This could become static.

>   * Scan an AG for obvious corruption.
>   *
>   * Note: This code is not reentrant due to the use of global variables.

That's not true any more I think.

>   */
> -void
> -scan_ag(
> -     xfs_agnumber_t  agno)
> +void *
> +scan_ag(void *args)

Can be static.

> +#define SCAN_THREADS 32
> +
> +void
> +scan_ags(
> +     struct xfs_mount        *mp)
> +{
> +     struct aghdr_cnts agcnts[mp->m_sb.sb_agcount];
> +     pthread_t       thr[SCAN_THREADS];
> +     __uint64_t      fdblocks = 0;
> +     __uint64_t      icount = 0;
> +     __uint64_t      ifreecount = 0;
> +     int             i, j, err;
> +
> +     /*
> +      * scan a few AGs in parallel. The scan is IO latency bound,
> +      * so running a few at a time will speed it up significantly.
> +      */
> +     for (i = 0; i < mp->m_sb.sb_agcount; i += SCAN_THREADS) {

I think this should use the workqueues from repair/threads.c.  Just
create a workqueue with 32 threads, and then enqueue all the AGs.
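
To illustrate the shape I mean, here is a self-contained sketch using
plain pthreads.  It is a stand-in for the idea only and does not use
the repair/threads.c interface; scan_queue, scan_worker, scan_all_ags
and mark_ag_scanned are names invented for this example:

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define SCAN_THREADS	32

/* Hypothetical queue: workers pull the next AG number until none remain. */
struct scan_queue {
	pthread_mutex_t	lock;
	uint32_t	next_agno;
	uint32_t	agcount;
	void		(*func)(uint32_t agno, void *arg);
	void		*arg;
};

static void *
scan_worker(void *p)
{
	struct scan_queue	*q = p;

	for (;;) {
		uint32_t	agno;
		int		done;

		pthread_mutex_lock(&q->lock);
		done = (q->next_agno >= q->agcount);
		agno = done ? 0 : q->next_agno++;
		pthread_mutex_unlock(&q->lock);
		if (done)
			break;
		q->func(agno, q->arg);
	}
	return NULL;
}

/* Run func once per AG on up to SCAN_THREADS threads; 0 on success. */
static int
scan_all_ags(uint32_t agcount, void (*func)(uint32_t, void *), void *arg)
{
	struct scan_queue	q = { .next_agno = 0, .agcount = agcount,
				      .func = func, .arg = arg };
	pthread_t		thr[SCAN_THREADS];
	int			nthreads = 0, err = 0, i;

	pthread_mutex_init(&q.lock, NULL);
	for (i = 0; i < SCAN_THREADS; i++) {
		err = pthread_create(&thr[i], NULL, scan_worker, &q);
		if (err)
			break;
		nthreads++;
	}
	/* Whatever workers did start will drain the whole queue. */
	for (i = 0; i < nthreads; i++)
		pthread_join(thr[i], NULL);
	pthread_mutex_destroy(&q.lock);
	return nthreads ? 0 : err;
}

/* Example per-AG work: each AG touches only its own array slot. */
static void
mark_ag_scanned(uint32_t agno, void *arg)
{
	((unsigned char *)arg)[agno]++;
}
```

Unlike the batched loop, no thread sits idle waiting for the slowest
AG in its batch, and the thread count is independent of sb_agcount.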
