Imaging mounted disk volumes under duress (2021)
(blog.benjojo.co.uk) | 76 points by yamrzou 6 days ago
Another marvel is Tom Ehlert's Drive Snapshot[0], which supports "disk image backups of live or offline Windows 2000-2022 systems all in a portable (and bootable!) ~1MB EXE".[1]
I keep my gaming machines for a long time and usually only upgrade the GPU in that time, so the old machine's main "fast SSD" is much smaller than the "storage disk" of my new machine. But yes, I get rid of the Steam directory entirely and any large media files.
If I wanted to be more careful I could probably just do a full registry export and keep C:\Users\[username]\AppData. But rather than dig around trying to recall and export MORE stuff (when I want to be playing on the new machine...) I'll just keep a copy of the whole thing for reference.
And it'll get deleted down the track when I'm happily bedded into the new machine.
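For anyone doing the same, the "registry export plus AppData" step might look roughly like this; the paths and drive letters are placeholders, run from an elevated prompt on the old install:

    :: Placeholder paths; run from an elevated prompt on the old install.
    reg export HKCU D:\old-machine\hkcu.reg /y
    reg export HKLM\SOFTWARE D:\old-machine\hklm-software.reg /y
    :: /E copies subfolders (including empty ones), /XJ skips junction points
    robocopy "C:\Users\%USERNAME%\AppData" "D:\old-machine\AppData" /E /XJ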
Other tips: if you moved your license for Windows to the new machine, run the VM without networking...
If you are wondering how to get stuff off it with no networking - because you are using Hyper-V (built into Windows Pro) instead of VMware - you can mount the VHD disks directly on your new machine while the VM is off.
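If the Hyper-V PowerShell module is installed, that mount step is roughly this (the path is a placeholder):

    # Placeholder path; requires the Hyper-V PowerShell module and the VM powered off.
    Mount-VHD -Path 'D:\VMs\old-gaming-rig\os.vhdx' -ReadOnly
    # ...browse the drive letter Windows assigns and copy what you need, then:
    Dismount-VHD -Path 'D:\VMs\old-gaming-rig\os.vhdx'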
Since 2007, my working assumption has been that if data is not on ZFS on physically redundant media, the data has not been successfully saved. And any machines that don't have ZFS (some Red Hat-based boxes) should be configured only through Ansible, and configured with the intention that all data (including syslogs) is either forwarded somewhere that does have ZFS or accessed via NFS (backed by ZFS).
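As a rough sketch of that policy (pool, dataset, and host names are made up): mirrored vdevs for the physical redundancy, snapshots shipped elsewhere, and scrubs to exercise the checksums.

    # Names are made up; this is the shape of the policy, not a recipe.
    zpool create tank mirror /dev/sda /dev/sdb      # physically redundant vdev
    zfs create tank/data
    zfs snapshot tank/data@nightly
    zfs send tank/data@nightly | ssh backuphost zfs receive -F backup/data
    zpool scrub tank                                # verify checksums end to end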
Or Ceph BlueStore, which does checksums on physically redundant media. We do N+3 replication because we're lazy.
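One reading of "N+3" in Ceph terms is a replicated pool holding four copies of every object; the pool name and PG count below are made up:

    # Pool name and PG count are made up.
    ceph osd pool create backups 128 128 replicated
    ceph osd pool set backups size 4        # one primary plus three replicas
    ceph osd pool set backups min_size 2    # keep serving I/O with two copies left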
FYI, while these block-level methods do have a use case, parallel rsync and other file-level tools are far safer and often faster, with less additional load on the disk.
Duplicating the OS/FS behavior runs into an undecidability problem; with block-level copies you just hope for the best, and often you won't even notice corruption.
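A common way to parallelize rsync at the file level is to fan out one process per top-level entry and finish with a single cleanup pass (paths and the concurrency level are placeholders):

    # Placeholders throughout; -P8 means eight rsyncs in flight at once.
    find /srcvol -mindepth 1 -maxdepth 1 -print0 |
      xargs -0 -P8 -I{} rsync -a "{}" /mnt/backup/
    # final serial pass: catch stragglers and propagate deletions
    rsync -a --delete /srcvol/ /mnt/backup/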
1) The article's use case is explicitly bootable images.
2) No, most of us don't "hope for the best" with imaging, but would like to actually achieve a reasonable level of confidence. If your approach to data integrity is "you probably won't notice corruption", you don't have an approach to data integrity.
Block-level copies of boot volumes are high risk, because boot volumes are almost exclusively mounted read-write by label or GUID.
It's a pretty common problem for someone to image their boot drive, reboot, and find the system has mounted the backup copy instead.
If you are using iSCSI or anything with multipathing, this can happen even without a reboot.
I know that block-level copies seem like a good, easy solution. But several decades in storage admin and architect roles during the height of the SAN era showed me it is more problematic than you'd expect.
To be honest, a full block-level backup of a boot volume is something you do when the data isn't that critical anyway.
But if your use case requires this and the data is critical, I highly encourage you to dig into how even device files are emitted.
If you are like most people who don't have the pager scars that forced you to dig into udev etc., you probably won't realize that what appears to be a concrete implementation detail is really just a facade pattern.
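To make the label/GUID hazard concrete, here is the shape of the problem and the usual mitigation, with placeholder device names (the exact command depends on the filesystem):

    # Placeholder devices. After a block-level clone, two disks answer to one UUID:
    blkid /dev/sdb2 /dev/sdc2          # both report the same UUID, so UUID-based mounts are ambiguous
    # Give the clone a fresh identity before it is ever present at boot:
    tune2fs -U random /dev/sdc2        # ext4 clone
    xfs_admin -U generate /dev/sdc2    # XFS clone (must be cleanly unmounted)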
initrd with native OverlayFS kernel support is very versatile. ;)
Yet btrfs, CephFS, and ZFS all have snapshot-syncing tricks that make state mirrors far more practical and safe to pull off. =3
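For example, the btrfs flavor of that trick (subvolume paths are made up; ZFS incremental send/receive works along the same lines):

    # Subvolume paths are made up; read-only snapshots are required for send.
    btrfs subvolume snapshot -r /data /data/.snaps/day1
    btrfs send /data/.snaps/day1 | btrfs receive /mnt/backup
    # later, ship only the delta relative to the previous snapshot:
    btrfs subvolume snapshot -r /data /data/.snaps/day2
    btrfs send -p /data/.snaps/day1 /data/.snaps/day2 | btrfs receive /mnt/backup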
1) The article stated that using bootable images for backup was a preference. That doesn't invalidate asking whether it's an ideal preference.
2) Arguing that it might be better to avoid such methods because of possible problems with data integrity isn’t a lack of an approach to data integrity.
Rice–Shapiro theorem.
The number of writers on a typical OS means you can't count on avoiding the pathological cases.
I suppose you could reduce it to Rice's theorem and the undecidability of Turing machine equivalence, but remember it generalizes even to total functions.
It just goes back to the fact that establishing the equivalence of two static programs requires running them, and there is too much entropy in file operations to practically cover much of the behavior.
When forced to, it can save you, but a block-level copy of a live filesystem is opportunistic.
Crash consistency is obviously the best you can hope for here, so that, plus the holes in classic NFS write semantics, may be a more accessible lens on the non-happy path than my preferred computability one.
The problem I mentioned of the GUID being copied and no longer unique is where I have seen people lose data the most.
The undecidability point is really a caution that it doesn't matter how smart the programmer is: there is simply no general algorithm that can remove the cost of losing the semantic meaning of those writes.
So it is not a good default strategy, only one to use when context forces it.
TL;DR: it's horses for courses.
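If context does force a block-level copy of a live volume, one standard way to at least pin it to a single point in time is a snapshot at the volume-manager layer, accepting that crash consistency is all you get (VG/LV names and sizes are placeholders):

    # Placeholder VG/LV names; this still only buys crash consistency.
    lvcreate --snapshot --size 10G --name root_snap /dev/vg0/root
    dd if=/dev/vg0/root_snap of=/mnt/backup/root.img bs=4M status=progress
    lvremove -y /dev/vg0/root_snap    # drop it before the COW space fills up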
I know TheFineArticle is in Linux land, but for Windows people with this issue you might look at Sysinternals Disk2vhd.[0]
It can be run from the online OS itself, and it can store the resulting VHD on the same disk it is imaging (with space and disk-performance constraints).
I find it handy for turning my freshly superseded gaming machine into a VM on my new machine for easy access to files, before doing whatever with my old hardware.
[0] https://learn.microsoft.com/en-us/sysinternals/downloads/dis...
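From memory, Disk2vhd also takes the drives and output file on the command line, roughly like the line below; check the Sysinternals page for the exact switches.

    :: Rough sketch from memory; drive letters and paths are placeholders.
    disk2vhd.exe C: D:\images\old-gaming-rig.vhdx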