To: Luben Tuikov <luben@xxxxxxxxxxxx>
Subject: Re: [PATCH] page_buf stuff
From: Stephen Lord <lord@xxxxxxx>
Date: 29 Oct 2002 07:43:33 -0600
Cc: linux-xfs <linux-xfs@xxxxxxxxxxx>
In-reply-to: <3DBE3924.22B019A1@xxxxxxxxxxxx>
References: <3DBE3924.22B019A1@xxxxxxxxxxxx>
Sender: linux-xfs-bounce@xxxxxxxxxxx
On Tue, 2002-10-29 at 01:30, Luben Tuikov wrote:
> Maybe like this?
> 
> --- linux/fs/xfs/pagebuf/page_buf.c.orig        Tue Oct 29 01:48:57 2002
> +++ linux/fs/xfs/pagebuf/page_buf.c     Tue Oct 29 01:52:17 2002
> @@ -1665,22 +1665,18 @@
>                         bh = bufferlist[i];
>  
>                         /* Complete the buffer_head, then submit the IO */
> -                       if (psync) {
> -                               init_buffer(bh, callback, psync);
> -                       } else {
> -                               init_buffer(bh, callback, pb);
> -                       }
> +                       init_buffer(bh, callback, psync ? : pb);
>  

                        Well that's a little cryptic: a GNU
                        conditional expression with the middle
                        operand omitted, which I have to deduce
                        evaluates to the first operand (psync)
                        when it is non-NULL, and to pb otherwise.
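
                        (For reference, a minimal userspace sketch
                        of the gcc extension being used here; psync
                        and pb are stand-ins for the pointers in
                        the patch, not the kernel types:)

                            #include <stdio.h>

                            int main(void)
                            {
                                    void *psync = NULL;
                                    void *pb = &psync;

                                    /* GNU C: "a ? : b" is short for
                                     * "a ? a : b", with "a" only
                                     * evaluated once */
                                    void *arg = psync ? : pb;

                                    printf("%s\n",
                                           arg == pb ? "pb" : "psync");
                                    return 0;
                            }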

>                         bh->b_rdev = bh->b_dev;
>                         bh->b_rsector = bh->b_blocknr;
>                         set_bit(BH_Mapped, &bh->b_state);
>                         set_bit(BH_Req, &bh->b_state);
> -
> -                       if (rw == WRITE) {
> -                               set_bit(BH_Uptodate, &bh->b_state);
> -                       }
> -                       generic_make_request(rw, bh);
> +                       if (rw == WRITE)
> +                               set_bit(BH_Dirty, &bh->b_state);

                        In general callers of the block layer
                        clear the dirty bit on buffers they submit;
                        see __block_write_full_page. In reality,
                        since we use generic_make_request, it does
                        not appear these flags matter.
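
                        (Paraphrasing from memory of the 2.4
                        __block_write_full_page() submission loop,
                        as an illustration of the convention rather
                        than a verbatim excerpt:)

                            struct buffer_head *bh = head;

                            do {
                                    struct buffer_head *next =
                                            bh->b_this_page;

                                    /* the caller marks the buffer
                                     * uptodate and clean before
                                     * handing it to the block layer */
                                    set_bit(BH_Uptodate, &bh->b_state);
                                    clear_bit(BH_Dirty, &bh->b_state);
                                    submit_bh(WRITE, bh);
                                    bh = next;
                            } while (bh != head);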

>                 }
> +               for (i = 0; i < cnt; i++)
> +                       generic_make_request(rw, bufferlist[i]);

                On the one hand you are doing fancy optimizations
                of the init_buffer call, and on the other you add
                a new loop, which is not needed once the other
                synchronization calls are in.

> +               run_task_queue(&tq_disk);

                ^^^^^^^^^

                This makes XFS more synchronous in terms of disk I/O
                than it needs to be. Try feeding this code a large
                dbench load or something similar.
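
                (A rough sketch of the 2.4 plugging model, using
                the names from your patch; wait_on_buffer() stands
                in for wherever the caller eventually blocks, and
                it runs tq_disk itself:)

                    /* requests queue up behind the plug, so the
                     * elevator can merge contiguous buffers */
                    for (i = 0; i < cnt; i++)
                            generic_make_request(rw, bufferlist[i]);

                    /* unplugging here dispatches whatever is queued
                     * right now and gives up any further merging */
                    run_task_queue(&tq_disk);

                    /* the usual pattern is to unplug only when a
                     * caller actually has to wait; in 2.4
                     * wait_on_buffer() does that internally */
                    wait_on_buffer(bh);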


>         } else {
>                 if (locking)
>                         unlock_page(page);
> 

I think the only part of this which makes a difference to the
execution of the code is the run_task_queue; please explain why
it is in there. It means xfs has no way to coalesce any
metadata I/O beyond an individual page into the device queue,
so our 32K log writes become 8 separate disk I/Os (one per 4K
page) for starters. Now, if you are sitting on top of lvm, it
is probably slicing and dicing under the covers and masking
this; if you try a single disk, I think you will notice a
performance drop.
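
(To make the arithmetic concrete, a trivial userspace sketch;
the 4K page size is an assumption, i386 PAGE_SIZE:)

    #include <stdio.h>

    int main(void)
    {
            const int log_write = 32 * 1024; /* one XFS log write */
            const int page_size = 4 * 1024;  /* assumed PAGE_SIZE */

            /* queue unplugged after every page: each buffer is
             * dispatched on its own */
            printf("unplugged per page: %d I/Os\n",
                   log_write / page_size);

            /* queue left plugged: the elevator can merge the
             * contiguous buffers into one request */
            printf("left plugged: as few as 1 I/O\n");
            return 0;
    }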

Steve



