XFS write cache flush policy
Ric Wheeler
rwheeler at redhat.com
Fri Dec 14 07:57:11 CST 2012
On 12/14/2012 11:19 AM, Matthias Schniedermeyer wrote:
>>>> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>>>>
>>>> Basically, you have an IO error situation, and you have dm-crypt
>>>> in-between buffering an unknown amount of changes. In my experience,
>>>> data loss events are rarely filesystem problems when USB drives or
>>>> dm-crypt is involved...
>>> I don't know the inner workings of dm-*, but shouldn't it behave
>>> transparently and rely on the block-layer for buffering?
>> I think that's partly why Dave asked you to test it, to check
>> that theory ;)
> To test that theory.
>
> Technically this is another machine than the original but i tried to
> recreate as much of the original circumstances as possible.
> Kernel is 3.6.7
>
> First i recreated the circumstances.
> I plugged a HDD i'm throwing out into the enclosure that was the most
> problematic, created the dm-crypt-layer & filesystem as reported and
> started copying.
>
> In all tests i didn't supply any mount-options!
>
> 1)
> After a few minutes i "emulated" the problem by unplugging the cable.
> At that point about 40 files were copied, but only 25 were there after
> i replugged the cable.
Just a note - depending on the drive and its firmware, unplugging a cable is
*not* the same as a power loss since the firmware detects the loss of link and
immediately writes back any volatile cache data to platter (and it has power, so
that is easy for it to do :)).
You really should drop power to the enclosure to get a "mean" test :)
Ric
>
> 2)
> BUT the directory-structure had changed in the meantime, the first 22
> files were in another directory i didn't have the first time. In the
> first test all >=200 files were in the same directory.
>
> So i retested by just copying the directory with which i had my original
> trouble.
> This time i used a timer and after a little over 5 minutes 23 files were
> copied, after replugging only the same 3 files as from the first try
> were retained.
>
> 3)
> This time i ditched the dm-crypt-layer.
> I mkfs'ed with the same parameters on a plain 100GB partition.
>
> Copied the same files as in 2), after 5 minutes 24 files were copied and
> after re-plugging the same 3 files were retained.
>
>
> At this point the amateur in me says: dm-crypt is "transparent".
>
> A new kernel was released, so a retry with 3.7.0/plain-partition.
>
> 4)
> Same as 3)
>
> The only difference is that 3.7.0 appears to be much quicker to pass on
> the error, the rsync-process was "happily" proceeding with 3.6.7 until i
> manually cancelled it a few seconds after unplugging the cable.
> With 3.7.0 it immediately stopped with Input/Output error.
>
> 5)
> Same as 3/4)
>
> A second before unplugging i 'ls -l'ed the directory, all files copied
> were visible at that point.
>
> 6)
> Same as 5)
>
> But this time i issued a 'sync' at about the halfway-point.
> This time a total of 13 files were retained, a ls -l just before the
> sync showed 12 files. But the sync took 20 seconds, so the 13th file
> must have been completed in the time between start/finished of the sync
> command.
>
>
> In conclusion the amateur in me says:
> The data is never sent to the drive, as all these tests DON'T include a
> power-failure, only connection failure.
>
>
>
>
>
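Test 6) above is the only run where an explicit 'sync' was issued, and the
files completed by the time it returned were the ones retained. A rough
per-file equivalent at the application level is to fsync() each copied file
and its parent directory before treating it as safe. The sketch below is only
an illustration of that idea: copy_durably is a hypothetical helper name, not
something used in this thread, and it assumes a Linux system where a directory
can be opened read-only and fsync()ed. Whether the drive or enclosure actually
honours the resulting cache flush is a separate question and depends on the
hardware.

```python
import os
import shutil


def copy_durably(src, dst):
    """Copy src to dst and return only after fsync() has succeeded.

    Hypothetical helper for illustration; not taken from the thread.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
        fout.flush()             # push Python's userspace buffer to the kernel
        os.fsync(fout.fileno())  # ask the kernel to write the file data out

    # fsync the parent directory so the new directory entry is written too
    # (assumption: Linux-style behaviour, directories can be opened O_RDONLY).
    dir_fd = os.open(os.path.dirname(os.path.abspath(dst)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return dst
```

If the device has already gone away, os.fsync() raises OSError (typically
EIO), so the copy fails at the file that could not be written rather than
appearing to succeed.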