RE: clone of filesystem across network preserving ionodes

To: Peter Grandi <pg@xxxxxxxxxxxxxxxxxxx>, Linux fs XFS <xfs@xxxxxxxxxxx>
Subject: RE: clone of filesystem across network preserving ionodes
From: "Meij, Henk" <hmeij@xxxxxxxxxxxx>
Date: Tue, 4 Nov 2014 14:16:59 +0000
Thank you, list. I think the problem is in my workflow, and in my head. I will 
be testing some suggestions and will report back. To answer some of the 
questions:

S1 and S2 are identical research storage units (112TB each), each divided into 
four 28TB partitions, holding duplicate data for failover purposes (the data 
exceeds our enterprise backup capacity, and this is cheaper).

DRBD is where we started. v8.4.5 was tested but could not deliver rates above 
40MB/s, despite the list's excellent suggestions. We're unsure why; we ran 
with and without BBU, changed RW policies, configured the 2 global hot spares 
as raid 0, even made ram disks and multiple drbd devices, but the storage unit 
(xfs on raid 60) never reached the wanted speed of 100MB/s. rsync tests had no 
problems reaching 112MB/s, and slightly better nic-to-nic. The first 
initialization/re-sync for DRBD would take about 30 days unless that is fixed.
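
As a rough check of that 30-day figure, here is a small sketch. It assumes decimal units (1TB = 10^6 MB) and a rate sustained for the whole transfer; the function name transfer_days is made up for illustration.

```shell
# transfer_days SIZE_TB RATE_MBS
# Prints the number of days needed to move SIZE_TB terabytes at RATE_MBS
# megabytes per second (assumption: 1 TB = 10^6 MB, constant rate).
transfer_days() {
    awk -v tb="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", tb * 1e6 / rate / 86400 }'
}

transfer_days 112 40     # one 112TB unit at the DRBD rate seen above
transfer_days 112 112    # the same unit at the rsync rate seen above
```

At 40MB/s a full 112TB unit comes out at a bit over 32 days, which is in line with the estimate above; at 112MB/s it is under 12 days.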

So we're building our own: pulling data from multiple locations to seed S1, 
then using rsync or xfsdump to seed S2, with manual failover of the service IP 
via scripts and nightly incrementals. We'd like to keep daily, weekly and 
monthly snapshots on S2. We do that with rsync --link-dest=/path/to/d.1 
/path/to/source /path/to/d.0 ... but then the first daily snapshot (d.0) will 
be as large as the source. That is what I'd like to avoid (having 3 copies), so 
I need to change my thinking and present the S2 copy to --link-dest as the 
latest backup.


From: xfs-bounces@xxxxxxxxxxx [xfs-bounces@xxxxxxxxxxx] on behalf of Peter 
Grandi [pg@xxxxxxxxxxxxxxxxxxx]
Sent: Monday, November 03, 2014 7:05 PM
To: Linux fs XFS
Subject: Re: clone of filesystem across network preserving ionodes

> What is the best practice way to get the contents from
> server1 to server2 identical disk array partitions

Here you do not say that «get the contents» has to be over the
network, but the 'Subject:' line seems to indicate that.

> and preserve the ionode information using xfs tools or non-xfs
> tools? xfsdump/xfsrestore appear to not preserve this or I can
> not find the obvious setting.

If «preserve the ionode information» means "keep the same
inumbers for each inode" that seems to me fairly pointless in
almost every situation (except perhaps for forensic purposes, and
that opens a whole series of other issues), so there is no «best
practice way» as there is little or no practice, and management
speak won't change that. More clarity in explaining what you mean
and/or what you want to achieve would be better.

If you want for whatever reason to make a byte-by-byte copy from
block device X (potentially mounted on M) on host S1 to the
corresponding one of «identical disk array partitions» Y on host
S2, which also preserves the inumbers, something fairly common is
to use 'ssh' as a network pipe (run on host S1):

  M="`grep \"^$X\" /proc/mounts | cut -d' ' -f2`"
  case "$M" in ?*) xfs_freeze -f "$M";; esac

  sudo sh -c "lzop < '$X'" | dd bs=64k iflag=fullblock \
    | ssh -oCipher=arcfour -oCompression=no "$S2" \
        "dd bs=64k iflag=fullblock | sudo sh -c "\'"lzop -d > '$Y'"\'""

  case "$M" in ?*) xfs_freeze -u "$M";; esac

This works if 'xfs_freeze' is suitable and no interruptions happen; if
they do, put reasonable "skip" and "seek" parameters on 'dd' and
rerun.
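
A minimal sketch of such a rerun, shown locally so the offsets are easy to follow. The function name resume_copy is made up for illustration, and the number of blocks already copied is assumed to be known (for example a conservative value below the last "records out" count reported by 'dd'). Note that "skip" and "seek" count uncompressed blocks, so they have to be applied before compression on the sending side and after decompression on the receiving side.

```shell
# resume_copy SRC DST DONE_BLOCKS
# Continues a raw copy from SRC to DST after DONE_BLOCKS blocks of 64 KiB
# have already been written. skip= moves the read offset on the source,
# seek= moves the write offset on the target by the same amount.
resume_copy() {
    src=$1
    dst=$2
    done_blocks=$3
    dd if="$src" bs=64k skip="$done_blocks" 2>/dev/null \
        | dd of="$dst" bs=64k seek="$done_blocks" \
             iflag=fullblock conv=notrunc 2>/dev/null
}

# Across hosts the same offsets would go on both ends, for example
# (untested sketch, with N the number of blocks already copied):
#   sudo dd if="$X" bs=64k skip="$N" | lzop \
#     | ssh "$S2" "lzop -d | sudo dd of='$Y' bs=64k seek='$N' iflag=fullblock"
```

conv=notrunc only matters when the target is a regular file; on a block device it is harmless.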

If the target block device is bigger than the source block
device, one can use 'xfs_growfs' after the byte-by-byte copy
(growing filetrees has downsides though).  Unless by «preserve
the ionode information» or «identical disk array partitions» you
mean something different from what they appear to mean to me.

The above just copied for me a 88GiB block device in 1624s
between two ordinary home PCs.

One could replace 'ssh' with 'nc' or similar network piping
commands for higher speed (at some cost). I also tried to
interpolate 'xfs_copy -d -b "$X" /dev/stdout | lzop ...'  but
'xfs_copy' won't do that, not entirely unexpectedly.

If by «preserve the ionode information» you simply mean copy all
file attributes but without keeping the inumber (except for
'atime', but that is usually disabled anyhow, and there is a patch
to restore it), and the source filetree is mounted at M on S1 and
the target is mounted at N (you may opt to mount with 'nobarrier')
on S2, a fairly common way is to use RSYNC with various preserving
options (run on host S1):

  sudo rsync -i -axAHX --del -z \
     -e 'ssh -oCipher=arcfour -oCompression=no -l root' \
     "$M"/ "$S2":"$N"/

The following is another common way that has slightly different
effects on 'atime' and 'ctime' (it requires a patched GNU 'tar', or
'tar' can be replaced with 'star'), combining 'ssh' with 'tar':

  sudo sh -c "cd '$M' && tar -c --one --selinux --acls --xattrs -f - ." \
  | lzop | dd bs=64k iflag=fullblock \
    | ssh -oCipher=arcfour -oCompression=no "$S2" \
      "dd bs=64k iflag=fullblock | lzop -d \
        | sudo sh -c 'cd \"$N\" && tar -xv --preserve-p --atime-p -f -'"

> xfs_copy would, per documentation,

Probably 'xfs_copy' is only necessary in the case where you
really want to preserve inums, and the source block device is
larger than the target block device (and obviously the size of
the content is smaller than the size of the target block device).

> but can I get from server1 to server2?

That probably requires finding a target medium (file or block
device) big enough and that can be shared (or transported)
between S1 and S2.

This could involve a filetree on S2 that is exported (for example
by NFSv4) to S1 and large enough to hold the target file(s) of

PS Whichever way is chosen to do the copy, a second run with RSYNC
   options '-n -c' might help verify that corruption did not
   happen during the copy.
