Thanks for the heads up. My current implementation is definitely not stable.
Do you know if the VFS-lock patch is included in the 2.4.19 or 2.4.19-aa
kernels? (I am running SuSE which is based on the -aa kernel, so I may have
this functionality and not know it.)
Does anyone know if there are versions of xfs/lvm that work together reliably
to take snapshots under load?
====> My simple tests
I just tried doing some snapshots under a couple of different i/o loads, and I
found them unreliable with simultaneous write activity.
I was using the latest SuSE kernel which I believe is based on the
SuSE also has an experimental kernel based on 2.4.19-aa that I have not tried.
For all cases I was using a simple script to invoke the snapshot process and I
was attempting to create a 2.5 Gig snapshot volume:
xfs_freeze -f /data
lvcreate --snapshot -L 2500m --name Data_snap /dev/VG/Data
xfs_freeze -u /data
lvremove -f /dev/VG/Data_snap
I was invoking the above script by hand, so there were at least several seconds
between each iteration.
1) With zero i/o, I did 10 snapshots with no lockups.
2) With read-only i/o, I also did 10 snapshots with no lockups. (i.e. dd
if=/data/largefile of=/dev/null bs=64k)
3) With read/write i/o, I had a lockup on my very first snapshot attempt.
(i.e. dd if=/data/largefile of=/data/junk bs=64k)
The lockup occurred on the lvcreate step.
Issuing xfs_freeze -u /data from a different ssh login cleared the lockup.
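For reference, here is a minimal sketch of how the freeze/snapshot/thaw sequence above could be wrapped so that the filesystem is always thawed after lvcreate returns, even when lvcreate fails. The function names (path_is_under, snapshot_with_freeze) and the argument order are my own, purely for illustration. The sketch also includes a check for the problem mentioned in the quoted text below, where the script lived on the filesystem being snapshotted. It does not help if lvcreate itself never returns, which is what I saw in case 3.

```shell
#!/bin/sh
# Sketch only: the function names and argument order are illustrative.

# Return 0 if PATH ($1) is MOUNTPOINT ($2) itself or lies beneath it.
path_is_under() {
    base=${2%/}                      # drop a trailing slash; "/" becomes ""
    case "$1" in
        "$base"|"$base"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Freeze the filesystem, create the snapshot, then thaw.
# The thaw runs whether or not lvcreate succeeded.
# usage: snapshot_with_freeze MOUNTPOINT ORIGIN_LV SNAP_NAME SIZE
snapshot_with_freeze() {
    mnt=$1 origin=$2 snap=$3 size=$4
    xfs_freeze -f "$mnt" || return 1
    lvcreate --snapshot -L "$size" --name "$snap" "$origin"
    rc=$?
    xfs_freeze -u "$mnt" || rc=1     # a failed thaw is also reported as failure
    return "$rc"
}

# Example call, using the same values as the script above:
#   path_is_under "$(cd "$(dirname "$0")" && pwd)" /data && exit 1
#   snapshot_with_freeze /data /dev/VG/Data Data_snap 2500m
```

The path check is there to refuse to run when the script's own directory is under the mount point that is about to be frozen.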
Deployment and Integration Specialist
Compaq ASE - Tru64 v4, v5
Compaq Master ASE - SAN Architect
The Norcross Group
>> Snapshots should be able to be generated on an active filesystem in full
>> flight. If you cannot do that then something is wrong.
>> I have had many issues with LVM snapshots and XFS/xfs_freeze but that was a
>> long time ago (last year and early this year); you might check the archives to
>> see if the issue is something similar. For me, I stopped using xfs_freeze and
>> relied on the VFS-lock patch.
>> I did have a problem that was time related. If I ran through snapshots
>> manually everything worked. If I ran exactly the same commands in a
>> it would die. I don't think anyone was able to work it out. In the end I
>> changed the way I was doing it.
>> > I think I know what I was doing wrong. I had the script I was
>> > writing/executing on the filesystem I was trying to snapshot.
>> > Bad idea!!!!
>> > I have moved my script to a drive I don't need to snapshot, and it seems to
>> > be running reliably now.
>> > Before I figured this out, I locked up LVM so badly a couple of times that I
>> > had to cycle power on the server. :<
>> Adrian Head
>> (Public Key available on request.)