posix_fallocate
Krzysztof Błaszkowski
kb at sysmikro.com.pl
Mon May 10 02:11:52 CDT 2010
On Friday 07 May 2010 18:53, Eric Sandeen wrote:
> Eric Sandeen wrote:
> > Krzysztof Błaszkowski wrote:
> >> Hello,
> >>
> >> I use this to preallocate large amounts of space but found an issue.
> >> posix_fallocate works correctly with sizes like 100G, 1T and even 10T on
> >> some boxes (on others it can fail after e.g. a 7T threshold), but if I
> >> try e.g. 16T the user space process stays "R"unning forever and is
> >> not interruptible. Furthermore, some unrelated processes like
> >> sshd and bash enter D state. There is nothing in the kernel log.
>
> Oh, one thing you should know is that depending on your version of glibc,
> posix_fallocate may be writing 0s and not using preallocation calls.
I am absolutely sure that recent libc doesn't emulate this syscall.
>
> Do you know which yours is using?
It uses the syscall (libc 2.9).
> strace should tell you on a small
> file test.
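A small-file test of this kind could look like the sketch below. It is only an illustration, not something from this thread: the helper name preallocate and the 1 MiB size are assumptions, and it relies on Python's os.posix_fallocate, which goes through the libc wrapper. The script alone cannot tell whether libc used the real syscall or wrote zeros; running it under something like "strace -e trace=fallocate" should show that.

```python
import os
import tempfile


def preallocate(path, length):
    """Preallocate `length` bytes for `path`; return (size, allocated_bytes).

    Hypothetical helper for a small-file test, not code from the thread.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Raises OSError (e.g. ENOSPC or EFBIG) if the space cannot be reserved.
        os.posix_fallocate(fd, 0, length)
        st = os.fstat(fd)
    finally:
        os.close(fd)
    # On Linux, st_blocks is counted in 512-byte units.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        size, allocated = preallocate(os.path.join(d, "smallfile"), 1 << 20)
        print("size:", size, "allocated:", allocated)
```

After a successful call the file size should be at least the requested length; the allocated figure depends on the filesystem.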
>
> Anyway, I am seeing things get stuck around 8T it seems...
Yes, I noticed that sometimes the threshold is higher.
>
> # touch /mnt/test/bigfile
> # xfs_io -c "resvsp 0 16t" /mnt/test/bigfile
>
> ... wait ... in other window ...
>
> # du -hc /mnt/test/bigfile
> 8.0G /mnt/test/bigfile
> 8.0G total
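For watching allocation progress from another window without du, a rough equivalent is to read the block count from stat. This is a minimal sketch under the assumption that st_blocks is in 512-byte units (true on Linux); the function name disk_usage is made up for illustration.

```python
import os
import tempfile


def disk_usage(path):
    """Return (apparent_size, allocated_bytes) for `path`.

    allocated_bytes is roughly what du reports; apparent_size is what ls -l shows.
    """
    st = os.stat(path)
    # Assumption: 512-byte units for st_blocks, as on Linux.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        # Extend the file without writing data; most filesystems leave it sparse.
        f.truncate(1 << 20)
        apparent, allocated = disk_usage(f.name)
        print("apparent:", apparent, "allocated:", allocated)
```

Calling disk_usage repeatedly on the file being preallocated would show whether the allocated figure is still growing or has stalled.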
>
> # echo t > /proc/sysrq-trigger
It was a good idea to use sysrq. I didn't think of this; instead I focused on
ftrace and how to analyse the megabytes of data it produces.
> # dmesg | grep -A20 xfs_io
> xfs_io R running task 3576 29444 29362 0x00000006
> ffff8809cfbb4920 ffffffff81478d9f ffffffffa032d3c5 0000000000000246
> ffff8809cfbb4920 ffffffff814788bc 0000000000000000 ffffffff81ba3510
> ffff8809d3429a68 ffffffffa032b60f ffff8809d3429aa8 000000000000001e
> Call Trace:
> [<ffffffff81478d9f>] ? __mutex_lock_common+0x36d/0x392
> [<ffffffffa032d3c5>] ? xfs_icsb_modify_counters+0x17f/0x1ac [xfs]
> [<ffffffffa032b60f>] ? xfs_icsb_unlock_all_counters+0x4d/0x60 [xfs]
> [<ffffffffa032b8bf>] ? xfs_icsb_disable_counter+0x8c/0x95 [xfs]
> [<ffffffff81478e88>] ? mutex_lock_nested+0x3e/0x43
> [<ffffffffa032d3d3>] ? xfs_icsb_modify_counters+0x18d/0x1ac [xfs]
> [<ffffffffa032d536>] ? xfs_mod_incore_sb+0x29/0x6e [xfs]
> [<ffffffffa033052c>] ? _xfs_trans_alloc+0x27/0x61 [xfs]
> [<ffffffffa03303d3>] ? xfs_trans_reserve+0x6c/0x19e [xfs]
> [<ffffffff8106fb45>] ? up_write+0x2b/0x32
> [<ffffffffa0335e55>] ? xfs_alloc_file_space+0x163/0x306 [xfs]
> [<ffffffff8107120a>] ? sched_clock_cpu+0xc3/0xce
> [<ffffffffa0336122>] ? xfs_change_file_space+0x12a/0x2b8 [xfs]
> [<ffffffff8106f9bf>] ? down_write_nested+0x80/0x8b
> [<ffffffffa031b8ce>] ? xfs_ilock+0x30/0xb4 [xfs]
> [<ffffffffa033e0e4>] ? xfs_vn_fallocate+0x80/0xf4 [xfs]
> --
> R xfs_io 29444 86014624.786617 162 120 86014624.786617
> 137655.161327 408.979977 /
>
> # uname -r
> 2.6.34-0.4.rc0.git2.fc14.x86_64
>
> I'll look into it.
We are sticking with 2.6.31.5, which seems to work well for us. We do not
upgrade kernels as soon as a newer revision arrives, because that doesn't make
sense from a stability point of view. We have seen regression bugs too many
times, so once we are confident in a revision there is no point in changing it.
Krzysztof Błaszkowski
>
> -Eric