To: Jeff Liu <jeff.liu@xxxxxxxxxx>
Subject: Re: [PATCH] xfs: fix log space reservation calculation if log stripe unit is specified
From: Mark Tinguely <tinguely@xxxxxxx>
Date: Wed, 01 May 2013 11:40:24 -0500
Cc: "xfs@xxxxxxxxxxx" <xfs@xxxxxxxxxxx>, Dave Chinner <dchinner@xxxxxxxxxx>
Delivered-to: xfs@xxxxxxxxxxx
In-reply-to: <51813BA5.3070306@xxxxxxxxxx>
References: <51813BA5.3070306@xxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:9.0) Gecko/20120122 Thunderbird/9.0
On 05/01/13 10:58, Jeff Liu wrote:

About two weeks ago, Dave found an issue by running xfstests/297.

According to our previous discussion, if a log stripe unit is configured, we
must take it into account, because it dynamically increases the log
reservation by twice the log stripe unit per ticket.

This patch tries to fix it by checking the given log space against the
request among those transactions (this procedure is implemented similar to
because the fundamental limit is that no single transaction can be larger than 
half of the log.
Also, it looks like at least another two log stripe units should be added
when calculating the minimum log space, or else I can easily trigger a dead
loop by creating a large number of files. I think I need some time to digest
Dave's comments posted on the original bug ticket, i.e.
>>  The question is this: how much space do we need to reserve. I'm
>>  thinking a minimum of 4*lsu - 2*lsu for the existing CIL context, and
>>  another 2*lsu for any queued ticket waiting for space to come
>>  available.
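
The arithmetic in that comment can be written out as a rough sketch. The
function and variable names below are hypothetical and are not XFS
identifiers; the 256 KiB stripe unit and 512-byte block size are assumed
from the mkfs.xfs example later in this mail.

```python
# Sketch of the reserve suggested in the quoted comment: 2*lsu for the
# existing CIL context plus 2*lsu for a queued ticket waiting for space.
# All names here are hypothetical, chosen for illustration only.

def min_cil_reserve_bytes(lsu_bytes: int) -> int:
    existing_cil_context = 2 * lsu_bytes
    queued_ticket = 2 * lsu_bytes
    return existing_cil_context + queued_ticket


lsu = 256 * 1024      # assumed: -l su=256k
block_size = 512      # assumed: -b size=512

reserve = min_cil_reserve_bytes(lsu)
print(reserve, reserve // block_size)   # 1048576 bytes, 2048 blocks
```

So under these assumptions the 4*lsu reserve alone would be 1 MiB, or 2048
of the 512-byte blocks.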

Put simply, with this fix, mounting a partition whose log space is too small
for its log stripe unit will fail, although mkfs still works. Maybe we can
improve the user space xfs_mkfs with a pre-check similar to the
implementation inside the kernel?  Besides that, the mount drops a warning to
syslog that shows the suggested log space for the given log stripe unit,
which looks like the following:

# mkfs.xfs -f -b size=512 -d agcount=16,su=256k,sw=12 -l su=256k,size=2560b 
meta-data=/dev/sdb1              isize=256    agcount=16, agsize=524288 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=512    blocks=8388608, imaxpct=25
         =                       sunit=512    swidth=6144 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=512    blocks=2560, version=2
         =                       sectsz=512   sunit=512 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Shouldn't mkfs.xfs also know it is building a filesystem that cannot be mounted?

When mkfs.xfs is given a log stripe unit that is greater than 256KB, should we divide the specified log stripe unit by 2 until it is under 256KB rather than reset it to 32KB?
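
A minimal sketch of that halving idea, not taken from mkfs.xfs: the function name and the constant are hypothetical, and the sketch assumes a value of exactly 256 KiB is acceptable, so it stops as soon as the stripe unit no longer exceeds the limit.

```python
# Hypothetical helper illustrating "divide by 2 until it fits" as an
# alternative to resetting an oversized log stripe unit to 32 KiB.
MAX_LSU_BYTES = 256 * 1024   # limit named in the question above


def halve_log_stripe_unit(requested: int, limit: int = MAX_LSU_BYTES) -> int:
    lsu = requested
    while lsu > limit:
        lsu //= 2
    return lsu


print(halve_log_stripe_unit(1024 * 1024))   # 1 MiB   -> 262144 (256 KiB)
print(halve_log_stripe_unit(768 * 1024))    # 768 KiB -> 196608 (192 KiB)
print(halve_log_stripe_unit(64 * 1024))     # already small enough -> 65536
```

The result stays a power-of-two fraction of what the user asked for, which keeps it related to the underlying stripe geometry where a fixed 32 KiB fallback might not.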

# mount /dev/sdb1 /xfs1
mount: wrong fs type, bad option, bad superblock on /dev/sdb1,
        missing codepage or helper program, or other error
        In some cases useful info is found in syslog - try
        dmesg | tail  or so

# dmesg:
XFS (sdb1): log space of 2560 blocks too small, minimum request 6656
XFS (sdb1): log space validation failed
XFS (sdb1): log mount failed
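
For scale, the two numbers in that warning can be converted to bytes. This
is only an illustrative sketch; the helper name is hypothetical, and the
minimum of 6656 blocks is taken from the dmesg output above rather than
recomputed.

```python
# Compare the configured log size with the minimum reported in dmesg.
# check_log_space() is a made-up helper, not a kernel or xfsprogs function.

def check_log_space(log_blocks: int, min_blocks: int, block_size: int = 512):
    """Return (ok, shortfall_bytes) for a log of log_blocks blocks."""
    shortfall_blocks = max(0, min_blocks - log_blocks)
    return shortfall_blocks == 0, shortfall_blocks * block_size


MIB = 1024 * 1024
print(2560 * 512 / MIB)              # configured log: 1.25 MiB
print(6656 * 512 / MIB)              # reported minimum: 3.25 MiB
print(check_log_space(2560, 6656))   # (False, 2097152): 2 MiB short
print(check_log_space(6656, 6656))   # (True, 0)
```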

I ran some cases in xfstests as well as a few self-defined Bonnie++/FIO tests
with the above configuration (6656 log blocks). It looks like the current fix
works; at least I saw no crashes. :)

But I have not yet dug into the details of how the suggested minimum log
space affects performance. Given that the AIL push threshold is defined as
25% of the log space, a small log might introduce IO overhead by pushing the
AIL too frequently. In addition, considering that the background CIL commit
threshold is 1/8 of the log, this would also impact the log IO throughput
IMHO.  Maybe we can figure out an optimized log space that combines those two
cases and drop it to syslog along with the minimum size?
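
To put numbers on those two fractions, here is a small sketch. It only
applies the 25% and 1/8 figures stated above to a log size in blocks; the
function name is hypothetical, and the 512-byte block size is assumed from
the mkfs.xfs example.

```python
# Apply the two fractions mentioned above to a given log size.
# log_thresholds() is a made-up helper for illustration only.

def log_thresholds(log_blocks: int, block_size: int = 512) -> dict:
    log_bytes = log_blocks * block_size
    return {
        "log_bytes": log_bytes,
        "ail_push_bytes": log_bytes // 4,        # 25% of the log space
        "cil_background_bytes": log_bytes // 8,  # 1/8 of the log
    }


print(log_thresholds(6656))
# {'log_bytes': 3407872, 'ail_push_bytes': 851968, 'cil_background_bytes': 425984}
```

With the 6656-block minimum, that is 832 KiB for the AIL figure and 416 KiB
for the CIL figure, which is not much larger than the 256 KiB log stripe
unit itself.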

I think 1 MB is the smallest log size before we soft hang, even without stripe units defined.

To Dave,

Sorry for the delay in dropping this patch, since I mentioned that I would
post it last night.  However, I ran into an issue at that time when testing
it by creating/removing tons of files in parallel :(

The iclog buffers have to be a multiple of the log stripe unit or we start punching the lsn in places where we should not. I think the idea that was mentioned is to remove the power-of-two requirement on the iclog buffer size and replace it with a multiple of the log stripe unit.
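
As a rough illustration of sizing by stripe-unit multiples rather than by powers of two (a sketch only, with a hypothetical function name, and not how the kernel currently sizes iclog buffers):

```python
# Round a desired iclog buffer size up to the next multiple of the log
# stripe unit. round_up_to_lsu() is a made-up helper for illustration.

def round_up_to_lsu(wanted_bytes: int, lsu_bytes: int) -> int:
    if lsu_bytes <= 0:
        raise ValueError("log stripe unit must be positive")
    return ((wanted_bytes + lsu_bytes - 1) // lsu_bytes) * lsu_bytes


KIB = 1024
print(round_up_to_lsu(32 * KIB, 256 * KIB) // KIB)    # 256
print(round_up_to_lsu(300 * KIB, 256 * KIB) // KIB)   # 512
print(round_up_to_lsu(256 * KIB, 96 * KIB) // KIB)    # 288, not a power of two
```

The last case uses a made-up 96 KiB stripe unit to show a result that a power-of-two rule could not produce.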


