On Wed, 12 Feb 2003, Chris Wedgwood wrote:
> Short answer:
> xfs_fsr can (and usually does) destroy locality of files.
Thank you for the answer!
> I'm curious as to why it helps so much in your case... what are you
> doing that causes lots of fragmentation?
Scientific simulations that produce large amounts of data over a long
time. Of course, "large amounts" and "long time" are relative, in our case
being "several hundred megabytes" and "days". The file is created at the
beginning of the simulation and data is appended to it for the whole
simulation duration; data is never rewritten during the simulation.
Depending on data size, simulations finish sooner or later and produce
files of different sizes; however, to ease later backup, we limit the
size of one file to approximately 650 MB. Writing to disk is not a
problem, as the rate at which data is produced is very low. However,
reading the data back becomes a problem, and we need to do it to either
analyze the data or transfer it to another computer or to CD.
When using ext2, writing a file of several hundred megabytes with
'dd if=/dev/zero' would produce a file that could be read at about
20 MiB/s, independent of the state of the FS (newly formatted or nearly
full), which was what we expected from this disk; however, reading one of
the simulation files is done at only 2-3 MiB/s, leading to problems e.g.
when trying to write to CD with a piped "mkisofs | cdrecord".
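A minimal sketch of such a throughput check (the file size is scaled down
here so it runs anywhere; the original test used a ~650 MB file, and the
temp-file path is whatever mktemp picks, not the FS under test):

```shell
#!/bin/sh
# Write a test file sequentially, as dd if=/dev/zero does.
F=$(mktemp)
dd if=/dev/zero of="$F" bs=1M count=8 2>/dev/null

# Read it back; wrap this dd in `time` to get the throughput figure.
# On a freshly written file this mostly measures the page cache, so for
# a real measurement unmount and remount the FS first to empty the cache.
dd if="$F" of=/dev/null bs=1M 2>/dev/null

SIZE=$(wc -c < "$F")
rm -f "$F"
echo "$SIZE"
```

For the fragmentation comparison, the same read is timed once on a dd-created
file and once on a simulation output file of comparable size.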
After switching to XFS and (mea culpa!) forgetting to set up xfs_fsr to be
run by cron, the read speed would degrade to similarly low values after
several days to weeks. However, by running xfs_fsr we get back to reading
at around 20 MiB/s even on a pretty full FS.
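The cron setup we forgot could look like this (the path to xfs_fsr is an
assumption and varies by distribution; -t limits the run to the given
number of seconds, 7200 being xfs_fsr's own default):

```shell
# Example crontab entry: reorganize extents nightly at 03:00,
# for at most two hours per run.
0 3 * * * /usr/sbin/xfs_fsr -t 7200
```

Since xfs_fsr works on mounted filesystems, this runs without interrupting
the simulations.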
> Does your disk ever get really full?
Sometimes; not everybody is careful about taking data away after it was
produced... But >75% full is normal.
> Do you do lots of synchronous/or direct writes?
Probably not, most of the code is written in Fortran...
> Do lots of NFS clients access your filesystem?
Most often, we run these simulations as parallel jobs; however, only one
process does the I/O.
The behaviour is about the same when using either:
- NFS clients (<10 at one time) write to this FS
- there is one parallel job running on several nodes, with one process on
the node with the disk being the only process writing to the file; no
NFS access in this case
> As a general rule, XFS doesn't fragment all that badly.
Well, given the conditions, I think that any filesystem would have
problems. But the nice thing about XFS, which made me switch, is that
defragmentation occurs without unmounting the FS, which lets us run
simulations _and_ read data at high speed.
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868