Notes on Linux Disk Tools
I am setting up an old PC as a TrueNAS replication target to back up data on my drive array. Fitting a modern SSD into the old box was only part of the challenge; I also need an SSD to put in it. This is a problem easily solved with money, because I don't need a big system drive for this task and we live in an era of 256GB SSDs on sale for under $20.(*) But where's the fun in that? I already have some old and small SSDs; I just need to play a bit of musical chairs to free one up.
These small drives are running various machines in my hoard of old PC hardware: 64-bit capable machines run Ubuntu LTS, and 32-bit only hardware runs Raspberry Pi Desktop. Historically they were quite... disposable, in the sense that I usually wipe the system and start fresh whenever I want to repurpose them. This time is different: one of these is currently a print server, turning my old Canon imageCLASS D550 laser printer into a network-connected printer. Getting Canon's Linux driver up and running on this old printer was a miserable experience. Canon has since updated the imageCLASS D550 Linux driver so things might be better now, but I didn't want to risk repeating that experience. Instead of wiping a disk and starting fresh, I took this as an opportunity to learn and practice Linux disk administration tools.
Clonezilla
My first attempt used Clonezilla Live to move my print server from one drive to another. It failed with errors that scrolled by too fast for me to read. I rediscovered the "Scroll Lock" key on my keyboard to pause the scrolling text so I could read the errors: one stage of the tool expected partition table information that was missing from a file created by an earlier stage. I have no idea how to resolve that. Time to try something else.
dd
I decided it was long overdue for me to learn and practice using the Linux disk tool dd. My primary reference is the Arch Linux Wiki page for dd. It's a powerful tool with many options, but I didn't need anything fancy for my introduction. I just wanted to directly copy from one drive to another (larger) drive. To list all of my installed storage drives, I knew about fdisk -l, but this time I also learned of lsblk, which lists all block storage device names and their capacities without requiring the root password.
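By default lsblk prints a tree of every drive and its partitions. It also accepts a column selection; this particular set of columns is just an illustration, not something this job required:
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT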
Once I figured out the name of the source (/dev/sdc) and the destination (/dev/sde), I could perform a direct copy:
sudo dd if=/dev/sdc of=/dev/sde bs=512K status=progress
The "bs" parameter is "block size" and apparently the ideal value varies depending on hardware capabilities. But it defaults to 512 bytes for historical reasons and that's apparently far too small for modern hardware. I bumped it up several orders of magnitude to 512 kilobytes without really understanding the tradeoffs involved. "status=progress
" prints the occasional status report so I know the process is ongoing, as it can take some time to complete.
gparted
After the successful copy, I wanted to extend the partition so my print server could take advantage of the new space. Resizing the partition with Ubuntu's "disks" app failed with the error message "Unable to satisfy all constraints on the partition." Fortunately, gparted had no such complaints, and my print server was back up and running with more elbow room.
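For reference, the same growth can be done from the command line. This is only a sketch, and it assumes the copied system lives on an ext4 filesystem in the first partition of the new drive:
sudo parted /dev/sde resizepart 1 100%
sudo resize2fs /dev/sde1
The first command pushes the partition's end to the end of the disk, and the second grows the ext4 filesystem to fill it, which is roughly what gparted did behind its GUI.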
Back to dd
Before I erase the smaller drive, though, I thought I would try making a disk image backup of it. If Canon driver installation had been painless, I would not have bothered: in case of SSD failure, I would simply replace the drive, reinstall Ubuntu, and set up a new print server. But Canon driver installation was painful, and I wanted an image to restore if needed. I went looking for how to create a disk image, and in the Linux world of "everything is a file" I was not too surprised to find it's just a matter of using a file name (~/canonserver.img) instead of a device name (/dev/sde) as the dd output.
sudo dd if=/dev/sdc of=~/canonserver.img bs=512K status=progress
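As an aside, such an image file can be spot-checked without writing it back to a drive by attaching it to a loop device (not a step my backup depended on):
sudo losetup --find --show --partscan ~/canonserver.img
This prints a device name such as /dev/loop0, with the image's partitions appearing as /dev/loop0p1 and so on, ready to mount read-only. Detach it afterwards with losetup -d.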
gzip and xz
But that raw disk image file is rather large: exactly the size of the source drive, 80GB in my case. To compress this data, the Arch Linux Wiki page on dd had examples of how to pipe dd output into gzip for compression. Following those directions worked fine, but I noticed Ubuntu's "disks" app natively recognized img.xz as a compressed disk image file format and not img.gz. Looking into that xz suffix, I learned xz is a different compression tool analogous to gzip, and I could generate my own img.xz image by piping dd output into xz, which in turn emits its output into a file, with the following command:
sudo dd if=/dev/sdc bs=512K status=progress | xz --compress -9 --block-size=100MiB -T4 > ~/canonserver.img.xz
I used xz parameter "-9" for maximum compression. "-T4" spins up four threads to work in parallel, as I was running this on a quad-core processor. "--block-size=100MiB" sets how big a chunk of data each thread receives to work on.
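Restoration can run the same pipe in the other direction, decompressing the archive and writing it back out with dd. The target device name here is a placeholder, to be replaced with whichever drive is being restored:
xz --decompress --stdout ~/canonserver.img.xz | sudo dd of=/dev/sdX bs=512K status=progress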
I used a spinning-platter HDD as a test target and verified that restoring this compressed image worked. Now I need to move the file to my TrueNAS array for backup, which kind of brings the project full circle. At 20GB it is far smaller than the raw 80GB file, but still nontrivial to move.
gio
I tried to mount my TrueNAS SMB shares as CIFS but kept running into errors. The share would mount and I could read files; I just couldn't write any. After several failures I started looking for an alternative and found gio.
gio mount --anonymous "smb://servername/sharename"
gio copy --progress ~/canonserver.img.xz "smb://servername/sharename/canonserver.img.xz"
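For completeness, the same tool can list active mounts and detach the share when finished; I only strictly needed the copy above:
gio mount --list
gio mount --unmount "smb://servername/sharename"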
OK, the copy worked, but what did I just use? The name "gio" is far too generic. My first search hit was a "Cross-Platform GUI for Go", which is definitely wrong. My second hit, "GNOME Input/Output", might be correct or at least related. As a beginner this is all very fuzzy; perhaps it'll get better with practice. For today I have an operating system disk up and running, so I can work on my ZFS data storage drive.