Keith C. Perry on 28 Apr 2016 11:40:13 -0700



[PLUG] disk (filesystem) imaging one liner: corrections and a recent use case


In an earlier thread JP asked me about this and I gave an example of how a simple one line statement could be used to image XFS file systems live.  I also mentioned that you could use a similar one liner for any file system as long as it was not in use (but if you are running against an LV snapshot of a running file system, this will work just fine).

It occurred to me while working on something last night that I mangled the one liners previously posted, so I want to correct them.  The statements were:

find / -xdev -print0 | cpio -pmdv


would image the fs into /mnt/vm


find / -xdev -print0 | cpio -pmdv | lzop > fs.lzo


would create /mnt/vm/fs.lzo


In the first one liner the cpio pass-through mode command is missing a zero "0" in its parameters and a destination mount point (e.g. /mnt/mynewfs).  The find command is incorrect- the "/" should be a "." to indicate that the operation will start from the directory you are currently in.  It is usually best to cd into the source path and then use find this way.
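
Putting those corrections together (with /mnt/mynewfs as the destination mount point and a placeholder /mnt/myoldfs as the source path), the corrected first one liner would be something like:

cd /mnt/myoldfs && find . -xdev -print0 | cpio -pmdv0 /mnt/mynewfs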


In the second one liner, the find is wrong again in the same way but in this case, cpio should be used in copy-out mode, so the parameters would be "-ov0".  Also, you would not be writing that file into the same path.  You would be storing it elsewhere, such as /mnt/myexternal/fs.lzo.
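
With those corrections, and again starting from inside the source path (same placeholder as above), the corrected second one liner would be something like:

cd /mnt/myoldfs && find . -xdev -print0 | cpio -ov0 | lzop > /mnt/myexternal/fs.lzo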


~ ~ ~

To help eliminate the confusion, let me give an exact use case that occurred last night.  Wil (yes, Wil from PLUG) and I are working on a project.  Wil has a 320GB drive with 17GB of previous work product in use that we want to save.  Since we're talking about 2 to 3 hours of driving (round trip) for one of us plus data transfer time, for 17GB worth of data it makes sense to just image the data over the internet.  Wil connected the drive over USB to his Raspberry Pi and gave me an account (with sudo privileges) so I could log in.  After connecting, I sudo, cd to the mounted disk file system I need to image and do...

find . -xdev -print0 | cpio -ov0 | ssh someuser@some.server.ip.addy "lzop > fs.cpio.lzop" 

This walks the directory tree and uses cpio in copy-out mode.  The data is piped over SSH and compressed with lzop into the fs.cpio.lzop file on the receiving end.  The connection was pushing about 1.15MB/s (which is about 9.2Mb/s), so we're just about maxing out Wil's upstream.  For 17GB this would be about a 4.1 hour run.
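
(Roughly: 17GB is about 17,000MB, and 17,000MB / 1.15MB/s is about 14,800 seconds, which is right around 4.1 hours.)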

Of course at 23:00 that was ok...

Around 04:00 today, I checked the job and it had completed.  I had an 8.5GB file.  50% compression ratio- not amazing but I'll take it.  Now, to make sure what I have works, I'm going to drop this data into a 32GB raw image for a VM container.  That way I know this can be moved or even run from a 32GB flash drive in the future if necessary.

1) create a sparse file...  dd if=/dev/zero of=fs.img bs=1 count=0 seek=31G
2) partition it with fdisk...   fdisk fs.img
3) create a loop device pointing to the partition... losetup -o $((512*2048)) /dev/loop1 fs.img

$((512*2048)) does the math for the partition offset so I don't have to- 512 byte sectors times fdisk's default first sector of 2048 gives the 1048576 byte offset where the partition starts

4) format the partition...  mkfs.ext4 /dev/loop1
5) mount the partition... mount /dev/loop1 /mnt/tmp

I could also do mount -o loop,offset=$((512*2048)) fs.img  /mnt/tmp but since /dev/loop1 has already been set up, I might as well use it.

6) cd to "/mnt/tmp" and then restore the fs image...  lzop -dc /home/someuser/fs.cpio.lzop | cpio -imdv

This runs cpio in copy-in mode and rebuilds the file system.  That took about 2 minutes- probably because the user account was sitting on an SSD partition.  That completes the rebuild of the file system into a raw image that will be used for the VM.  It could also be converted into a qcow2 or vmdk format if I were going to keep it, but this is just to confirm I can reconstruct a working system if necessary.
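
If you did want to keep it in one of those formats, that conversion is a one liner too- for example, to get a qcow2 (assuming qemu-img is installed; fs.qcow2 is just a name I picked):

qemu-img convert -f raw -O qcow2 fs.img fs.qcow2

Also, before handing fs.img to a VM, the partition should be unmounted and the loop device detached:

umount /mnt/tmp
losetup -d /dev/loop1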

Since there are no boot sectors, I'll have to boot the VM with a live CD or an .iso of a live CD (I generally keep the latter on my VM host servers).  I'll skip the qemu/kvm launch string this time and instead give the commands for fixing the boot sectors.  Once the VM is up on the live CD, the disk image will show as sda with the sda1 partition we've already rebuilt.  Grab a terminal window, sudo if you have to and...

1) I create my own mount point so I don't have to verify that other directories are empty, since that can vary from distro to distro... mkdir /src
2) mount the file system...  mount /dev/sda1 /src
3) before we chroot into the system, we need some bind mounts so here's another one liner...  for d in dev proc sys; do mount --bind /$d /src/$d; done
4) now let's chroot...  chroot /src
5) Ubuntu / Debian variants have a script which rebuilds your grub configuration... update-grub
6) now let's install it to "disk" sda... grub-install /dev/sda
7) exit and shut down the VM
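
For reference, since I skipped the launch string above, a minimal qemu/kvm invocation for this kind of image would be something along the lines of (the 2048MB of RAM and the .iso name are just placeholders):

qemu-system-x86_64 -enable-kvm -m 2048 -drive file=fs.img,format=raw -cdrom livecd.iso -boot d

Dropping the -cdrom and -boot options afterwards boots straight from the repaired image.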

That's it.  Since everything went well, I was able to boot the VM container without any issues.  That might seem like a lot of work but just like anything else, if you know the steps and do it enough, you can fly through it pretty quickly... even at 4am... without crashing something  :D

Of course, a valid question would be: if the data transfer took 4 hours and truck replication would have taken a max of 3 hours, why isn't this just a waste of time?  Fair question, and the answer would be "yes, it was" assuming the 3 hour drive could be made on demand with no delays.  Practically speaking it's more than likely a no, but here are two other points...

1) Wil actually was NOT running on a Raspberry Pi, he was running on an O-Droid XU3.  That has 8 ARMv7 cores we can put to work!
2) A compression ratio of 50% was realized on my end, so an XU3 could probably handle doing that compression before sending

To prove that, while I was doing all those other steps, I repeated the initial cpio task, but this time compressing before sending, so...

time find . -xdev -print0 | cpio -ov0 | lzop -c | ssh someuser@some.server.ip.addy "dd of=fs.cpio.lzop"

For those who are unfamiliar, prepending the "time" command gives you timing statistics for how long the work after it takes.  Since I was going back to bed, I let the system tell me how long the job took.

132 minutes

So now, this makes a lot more sense   :D

The O-Droid XU3 also generated an 8.5GB file (same 50% compression) but more importantly, it did that in pretty much 50% of the time at the same 1.15MB/s transfer rate.  One of the reasons I use LZOP (LZO) compression is that it is opportunistic and designed for data transfers (fast compression and decompression), where you want to compress data but don't want to sacrifice a lot of transmission speed.  It was designed by the Oberhumer company in Austria.  NASA has used their algorithm on Mars mission assets Spirit, Opportunity and Curiosity.  So, when you use LZOP you can literally say you're using space grade tech  ;)

See http://www.oberhumer.com/


Again, sorry for the confusion.  Use of cpio is a bit of a lost art and I wanted to get it right.  I don't use it that much anymore since I run XFS, but it really is an essential tool to understand if you move file systems around a lot.  Since it works with any file system, you have less to remember.  I hope this detailed example was useful.  I figured I would get it out while it was still fresh in my mind.


~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Keith C. Perry, MS E.E.
Owner, DAO Technologies LLC
(O) +1.215.525.4165 x2033
(M) +1.215.432.5167
www.daotechnologies.com
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug