I was contacted in private and asked to provide more information, and
thought it might be a good idea to let the list know:
* all files are being written into the same directory, by something like
  2-4 slow "dd of=/xfs/video.dat"-style writers (roughly as in the
  sketch below)
* the disk is used exclusively for this application; no other writers
  are present.
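To give a clearer picture, the writers behave roughly like this (chunk
size and pacing below are only illustrative, not the recorder's actual
parameters):

    # start three slow streaming writers into the same directory
    for i in 1 2 3; do
        (
            while true; do
                # append one large chunk, then pause, to mimic a slow
                # writer issuing big write()s
                dd if=/dev/zero of=/xfs/video$i.dat bs=512k count=1 \
                   oflag=append conv=notrunc 2>/dev/null
                sleep 0.1
            done
        ) &
    done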
(fragmentation reduces i/o performance)
I mentioned that only to cover the case where xfs reports a high extent
count but the file is, in fact, almost contiguous. I have now run
xfs_bmap on a recently written file (2.2GB); it looks like this:
17_20050403135800_20050403150000.nuv:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL
0: [0..895]: 161889248..161890143 8 (5598368..5599263) 896
1: [896..1151]: 161882848..161883103 8 (5591968..5592223) 256
2: [1152..101119]: 173099520..173199487 8 (16808640..16908607) 99968
3: [101120..511231]: 195363656..195773767 10 (56..410167) 410112
4: [511232..910975]: 214936240..215335983 11 (36280..436023) 399744
5: [910976..987903]: 243994088..244071015 12 (9557768..9634695) 76928
6: [987904..988927]: 238584712..238585735 12 (4148392..4149415) 1024
7: [988928..989951]: 238583688..238584711 12 (4147368..4148391) 1024
8: [989952..991999]: 238581640..238583687 12 (4145320..4147367) 2048
9: [992000..994175]: 238579464..238581639 12 (4143144..4145319) 2176
10: [994176..996351]: 238577280..238579455 12 (4140960..4143135) 2176
11: [996352..998399]: 238575232..238577279 12 (4138912..4140959) 2048
12: [998400..1000575]: 238573056..238575231 12 (4136736..4138911) 2176
13: [1000576..1002623]: 238571008..238573055 12 (4134688..4136735) 2048
14: [1002624..1003775]: 238569856..238571007 12 (4133536..4134687) 1152
15: [1003776..1004799]: 238568832..238569855 12 (4132512..4133535) 1024
16: [1004800..1005823]: 238567808..238568831 12 (4131488..4132511) 1024
17: [1005824..1006847]: 238566784..238567807 12 (4130464..4131487) 1024
18: [1006848..1007871]: 238565760..238566783 12 (4129440..4130463) 1024
19: [1007872..1009023]: 238564608..238565759 12 (4128288..4129439) 1152
20: [1009024..1010047]: 238563584..238564607 12 (4127264..4128287) 1024
21: [1010048..1011071]: 238562560..238563583 12 (4126240..4127263) 1024
22: [1011072..1012095]: 238561536..238562559 12 (4125216..4126239) 1024
23: [1012096..1013119]: 238560512..238561535 12 (4124192..4125215) 1024
24: [1013120..1014271]: 238559360..238560511 12 (4123040..4124191) 1152
25: [1014272..1015295]: 238558336..238559359 12 (4122016..4123039) 1024
26: [1015296..1016319]: 238557312..238558335 12 (4120992..4122015) 1024
27: [1016320..1018367]: 238555264..238557311 12 (4118944..4120991) 2048
28: [1018368..1019391]: 238554240..238555263 12 (4117920..4118943) 1024
... the remaining ~500 extents look very similar (~1024 blocks each).
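(For reference, that map is xfs_bmap's verbose output, i.e. something
like "xfs_bmap -v 17_20050403135800_20050403150000.nuv".)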
It looks as if there was only one writer initially (that's just a
conjecture), and as if xfs simply interleaves the write()s of multiple
writers (1024 blocks - i.e. 512k, since xfs_bmap counts in 512-byte
blocks - is probably the i/o size the writers use; they issue rather
large write()s).
Looking at the extent map above, I also see this pattern quite often:
if the natural block order is abcdefghi,
then xfs allocates the extents in the order abdcefhgi,
i.e. it often swaps adjacent extents; see, for example, pairs 6&7 and
13&14. Looking at some other files, this is quite common.
ext3 looks much better as it (seemingly) tries to allocate the files in
different block groups when multiple files are being written.
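(On ext3 the layout can be checked with filefrag from e2fsprogs, e.g.
"filefrag -v somefile.nuv", which lists a file's extents much like
xfs_bmap does; the file name is again just a placeholder.)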
xfs_fsr, OTOH, does a perfect job - all files are single-extent files
after it has run, even when I run it while three other writers are
active! I'd run xfs_fsr continuously, but the i/o bandwidth lost is
immense, and xfs_fsr tends to copy gigabytes of a file and only then
detect that the file is being modified, which somewhat precludes its use
on a busy filesystem.
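A possible workaround (only a sketch; the paths are placeholders): run
xfs_fsr only against recordings that no process has open any more, so it
never wastes a copy on a file that is still growing:

    # defragment finished recordings only; skip anything still open
    for f in /xfs/*.nuv; do
        if ! fuser -s "$f"; then
            xfs_fsr -v "$f"
        fi
    done

There is an obvious race if a writer opens the file between the check
and the xfs_fsr run, but for recordings that have already finished that
should be good enough.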