XFS write cache flush policy
Matthias Schniedermeyer
ms at citd.de
Fri Dec 14 05:19:24 CST 2012
> >> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
> >>
> >> Basically, you have an IO error situation, and you have dm-crypt
> >> in-between buffering an unknown amount of changes. In my experience,
> >> data loss events are rarely filesystem problems when USB drives or
> >> dm-crypt is involved...
> >
> > I don't know the inner workings of dm-*, but shouldn't it behave
> > transparently and rely on the block layer for buffering?
>
> I think that's partly why Dave asked you to test it, to check
> that theory ;)
To test that theory.
Technically this is a different machine than the original, but I tried to
recreate as many of the original circumstances as possible.
Kernel is 3.6.7
First I recreated the circumstances.
I plugged an HDD I'm throwing out into the enclosure that was the most
problematic, created the dm-crypt layer & filesystem as reported and
started copying.
In all tests I didn't supply any mount options!
1)
After a few minutes I "emulated" the problem by unplugging the cable.
At that point about 40 files had been copied, but only 25 were there
after I replugged the cable.
2)
BUT the directory structure had changed in the meantime: the first 22
files were in another directory I didn't have the first time. In the
first test all >=200 files were in the same directory.
So I retested by copying just the directory with which I had my original
trouble.
This time I used a timer, and after a little over 5 minutes 23 files had
been copied; after replugging, only the same 3 files as in the first try
were retained.
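For anyone repeating this, counting "copied" versus "retained" by eye is
error-prone. A minimal sketch of how the comparison could be automated,
assuming the filesystem is mounted at some path you pass in: take a
listing just before unplugging, take another after replugging and
remounting, and diff the two. The function names list_files and
lost_files are made up for this illustration, and the sketch only
compares names, so a file that is present but truncated would not show
up as lost.

```python
import os


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def lost_files(copied, retained):
    """Return the sorted names that were visible before but are gone now."""
    return sorted(set(copied) - set(retained))
```

Usage would be something like before = list_files(mountpoint) right
before pulling the cable, and lost_files(before, list_files(mountpoint))
after the remount.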
3)
This time I ditched the dm-crypt layer.
I mkfs'ed with the same parameters on a plain 100GB partition.
Copied the same files as in 2), after 5 minutes 24 files were copied and
after re-plugging the same 3 files were retained.
At this point the amateur in me says: dm-crypt is "transparent".
A new kernel was released, so a retry with 3.7.0/plain-partition.
4)
Same as 3)
The only difference is that 3.7.0 appears to be much quicker to pass on
the error: with 3.6.7 the rsync process was "happily" proceeding until I
manually cancelled it a few seconds after unplugging the cable.
With 3.7.0 it immediately stopped with an Input/Output error.
5)
Same as 3/4)
A second before unplugging I 'ls -l'ed the directory; all files copied
were visible at that point.
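A file being visible in 'ls -l' only shows what the kernel knows about,
not what has reached the disk. One rough way to see how much data is
still waiting in memory is the Dirty and Writeback lines of
/proc/meminfo. The sketch below is only an illustration and assumes a
Linux system; the helper names parse_meminfo and pending_writeback_kb
are made up here, and the numbers are system-wide, not per device, so
they only give a hint.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def pending_writeback_kb(path="/proc/meminfo"):
    """Return (dirty_kb, writeback_kb) as reported by the kernel."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info.get("Dirty", 0), info.get("Writeback", 0)
```

Calling pending_writeback_kb() during the copy and again right before
unplugging would show whether a large amount of data is still dirty at
that moment.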
6)
Same as 5)
But this time I issued a 'sync' at about the halfway point.
This time a total of 13 files were retained; an ls -l just before the
sync showed 12 files. But the sync took 20 seconds, so the 13th file
must have been completed in the time between the start and finish of
the sync command.
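The 'sync' result suggests that an explicit flush is what gets the data
out. As an illustration only (this is not what rsync did in these
tests), a copy routine can request the flush per file with fsync
instead of a global sync. The sketch below assumes Linux, where a
directory can be opened and fsync'ed; copy_durably is a made-up name.
Note that fsync only asks the kernel to push the data to the device;
whether the drive or USB bridge really has it on stable media is a
separate question.

```python
import os
import shutil


def copy_durably(src, dst):
    """Copy src to dst and flush both the data and the directory entry."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's buffer to the kernel
        os.fsync(fout.fileno())  # ask the kernel to write the file out

    # Flush the containing directory so the new name is written out too
    # (assumption: Linux, where fsync on a directory fd is allowed).
    dir_fd = os.open(os.path.dirname(os.path.abspath(dst)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

This is much slower than letting the page cache do its job, which is the
trade-off: every file costs a round trip to the device.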
In conclusion, the amateur in me says:
The data is never sent to the drive, as all these tests DON'T include a
power failure, only a connection failure.
--
Matthias