A few days ago a colleague of mine emailed to ask for new laptop recommendations. His elderly Dell had finally had enough. Being who I am, I suggested a Mac, somewhat tongue-in-cheek, but also tried to steer him toward some favorably reviewed PC laptops (they were difficult to find). Two days later, another person, on the same project no less, emailed to say their laptop had gone to the choir invisible as well. An HP this time, felled by some XP malware. This rash of bad luck got me thinking about how I arrange my development machine to protect my work (and save me time), and I figured the setup might be useful to others.
My system derives from a healthy sense of paranoia. I often have multiple clients, some without their own source control or even formal network storage on site. Loss of my personal laptop means loss of not only my work, but the client’s data and assets. That could mean anything from a bunch of unpaid hours as I recover (at best) to loss of a contract or a lawsuit (at worst). So I am motivated to protect my data. I am a Mac user, so the specifics are tailored to Apple devices, but there’s really nothing here that a PC user couldn’t do just as well.
The four levels of paranoia
I came to virtualization quite by accident, and realized the side benefits only later. I kept acquiring new contracts with different platform requirements. One would be Linux-based, one Windows XP, one Windows 7, one would want some Mac OS stuff on the side. Obviously I wasn’t going to buy a separate machine for each one, and Boot Camp would only get me one more platform[1]. So virtual machines were really the only answer.
So, for each client I set up a different VM. I currently use Parallels, but I’ve had good success with VMware as well. They tend to compete on features, which is nice as it makes both products better.
The side benefits of virtualization are many. So many that even if I had a single client targeting Windows, and I had a Windows laptop, I’d still create a VM for that work. Here’s what you get:
Client isolation. Have multiple clients? Are they or you dubious about sharing a development machine? VMs instantly solve this, with minimum headache. Each client gets their own system, and they can’t cross-pollinate[2]. As a bonus, virtual machines can be network isolated by the host, providing another layer of protection behind double-NAT and the VM’s firewall.
Easy full backup and archiving. This is a big one, at least to me. One of the most boring and yet necessary parts of being a grown-up professional developer is making sure that my clients’ data are safe. VMs allow you to back up an entire OS instance as a single file. Just drag and drop the folder to an external drive and you are done.
Quick portability. This is also a huge benefit, and a follow-on to the ease of backing up virtual machines. Has your main computer been lost, to failure, theft, or total protonic reversal? It doesn’t matter: grab that VM backup you made, transfer it to almost any other computer and you are up and running with the exact same data and configuration in a matter of minutes. With almost any other backup scheme, a restore (with all your applications and data as they were installed) can take hours, and that’s if you have a system you can restore to.
Ok, here’s the next level of paranoia. For any recovery scheme, you are only as safe as the last time you remembered to back up. For most of us, even the most conscientious, that’s maybe once a week. Probably once a month. Did you do any work in the last month that you would hate to lose? I thought so. Granted, most of us use source control (right?) so regular checkins can mitigate your loss somewhat (provided your SCM system is elsewhere), but there are plenty of times when we have amassed a body of work that is purely local.
The easiest way to have up-to-date backups that I’ve found is with an online provider. I personally use CrashPlan, but there are plenty of other options, such as Carbonite and Backblaze. The key idea behind online backup is that you don’t have to think about it. The client software runs in the background, monitoring your file system for changes and periodically syncing those changes to the server. It’s perfect for folks who forget to take out the trash[3].
How does online backup mix with virtual machines? Well, it’s a bit non-obvious. The first approach one might take is to have just the backup client on the host and to allow it to back up the VM folders directly. This doesn’t work well, however. The VM files are huge and change constantly, which defeats the clever differential schemes the online backup systems use. You’ll end up uploading large files constantly and will likely end up permanently behind. So what to do? Install the backup client on each VM individually, and have it run natively there. Select the folders that are important to you, and just treat each virtual machine as a separate computer. It’s not ideal, but it beats the alternative.
An added advantage of online backups is that you can often access files remotely, even from iOS devices (CrashPlan supports this).
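To get a feel for why a busy disk image is hard on differential uploads, here is a rough illustration. It is only a sketch of one simple scheme (comparing fixed-size blocks by hash), not a description of how CrashPlan or any other provider actually works, and the function name and block size are made up for the example. A guest OS tends to scatter small writes all over its virtual disk, so even a little activity can dirty a lot of blocks.

```python
import hashlib


def changed_blocks(old: bytes, new: bytes, block_size: int = 4096) -> int:
    """Count fixed-size blocks whose hashes differ between two versions."""
    count = 0
    length = max(len(old), len(new))
    for offset in range(0, length, block_size):
        old_block = old[offset:offset + block_size]
        new_block = new[offset:offset + block_size]
        # Compare digests, as a backup client might do instead of
        # shipping both copies around.
        if hashlib.sha256(old_block).digest() != hashlib.sha256(new_block).digest():
            count += 1
    return count


# Hypothetical usage: two single-byte writes in different places
# dirty two whole blocks, each of which would need re-uploading.
# before = bytes(4096 * 10)
# after = bytearray(before); after[0] = 1; after[4096 * 5] = 1
# changed_blocks(before, bytes(after))
```

Multiply that by a disk image of tens of gigabytes being written to all day and it is easy to see how the uploader falls behind.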
VM offline backup
As I described earlier, just copy the VM folders somewhere else periodically.
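If drag and drop gets tedious, the copy is easy to script. The following is a minimal sketch, not a polished tool: the function name and the paths in the usage comment are hypothetical, and it assumes the VM is shut down (or at least suspended) so the files are not changing mid-copy.

```python
import shutil
import time
from pathlib import Path


def archive_vm(vm_path: Path, backup_root: Path) -> Path:
    """Copy a VM folder into backup_root under a timestamped name."""
    if not vm_path.is_dir():
        raise FileNotFoundError(f"VM folder not found: {vm_path}")
    stamp = time.strftime("%Y-%m-%d_%H%M%S")
    # Keep the original extension so the copy is still recognized as a VM.
    dest = backup_root / f"{vm_path.stem}-{stamp}{vm_path.suffix}"
    # copytree raises if dest already exists, so an older archive is
    # never silently overwritten.
    shutil.copytree(vm_path, dest, symlinks=True)
    return dest


# Hypothetical usage:
# archive_vm(Path.home() / "Parallels" / "ClientA.pvm",
#            Path("/Volumes/Backup/vm-archives"))
```

Each run produces a separate dated copy, so old archives pile up until you prune them by hand.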
Drive level clone and Time Machine
Now we’re starting to get paranoid. In addition to all the stuff above, about once a week I will do a Time Machine backup of my entire system (minus the VMs: Time Machine has similar issues to online backups with those). This theoretically would allow me to recover my entire system in the event of a drive failure.
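One way to keep the VM folders out of Time Machine is to add an exclusion. A minimal sketch, assuming a macOS version that ships the `tmutil` command-line tool; the helper names and the folder in the usage comment are hypothetical, and the same exclusion can be set by hand in the Time Machine preferences instead.

```python
import subprocess
from pathlib import Path
from typing import List


def exclusion_command(paths: List[Path]) -> List[str]:
    """Build a `tmutil addexclusion` command for the given folders."""
    if not paths:
        raise ValueError("at least one path is required")
    return ["tmutil", "addexclusion", *[str(p) for p in paths]]


def exclude_from_time_machine(paths: List[Path]) -> None:
    """Run the command; only meaningful on a Mac where tmutil exists."""
    subprocess.run(exclusion_command(paths), check=True)


# Hypothetical usage:
# exclude_from_time_machine([Path.home() / "Parallels"])
```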
Since I don’t entirely trust Time Machine (no reason why, again: paranoia), I also do one more thing about once a month: a block-level drive clone to another external drive. I use Carbon Copy Cloner for this, but SuperDuper is good too. What these applications do is create an exact copy of every bit of your drive on another one. If you have a laptop and perform the clone to a similar 2.5” drive, that means you can just swap and go without missing a beat.
Put the clone (and the drive you store your VM backups on) in a firesafe or off-site if possible for maximum security.
All of the above may seem like a lot of work, but it really isn’t. The online backups take care of themselves. The VMs save a lot of work since they provide so much flexibility. Backups that require connecting an external drive take just a few minutes to set up every week or month, and can run overnight.
What this combination of schemes does allow is maximum adaptability to failure and minimum recovery time. Lose a drive? Restore from the clone or Time Machine, get recent files from the online backup. Lose an entire system to another hardware failure, theft, total house destruction? For the short term, run your VMs elsewhere, then restore on the new system from the clones. Accidentally delete an important file? Get it from online backup.
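The manual backups are the ones that slip, so a small reminder helps. Here is a rough sketch that reports how old the newest item in a backup folder is. The function names and the seven-day default are my own choices for illustration, and it assumes each item's modification time reflects when the backup was written, which is not true of every copy tool (some preserve the source's timestamps).

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_days(backup_root: Path,
                           now: Optional[float] = None) -> Optional[float]:
    """Age in days of the most recently modified top-level entry, or None."""
    entries = list(backup_root.iterdir())
    if not entries:
        return None
    newest = max(entry.stat().st_mtime for entry in entries)
    current = time.time() if now is None else now
    return (current - newest) / 86400


def is_overdue(backup_root: Path, max_age_days: float = 7.0,
               now: Optional[float] = None) -> bool:
    """True if there is no backup at all or the newest one is too old."""
    age = newest_backup_age_days(backup_root, now)
    return age is None or age > max_age_days


# Hypothetical usage:
# if is_overdue(Path("/Volumes/Backup/vm-archives"), max_age_days=7):
#     print("Time to plug in the backup drive.")
```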
This may not work for everyone. Is it too much? Maybe. Are there other ways of getting the same result? Almost certainly. Feel free to mix and match. It’s the mixing and matching that makes for the safest solution.
[1] VMs do cut into your speed a bit, and necessarily have fewer resources (like RAM) available than your host system does. If you need maximum speed and bare-metal GPU support, you’ll have trouble with a VM in some cases.
[2] Both VMware and Parallels come with features to more tightly integrate the VMs with the host OS: things like sharing application lists and drives, and making application windows appear on the host desktop. For complete isolation, I turn most of these off. On top of that, they can be annoying.
[3] You do, however, have to think about just what you want to back up online. As your whole system might be hundreds of GB of data, you’ll want to judiciously prune what you send online. I don’t consider this a “recover my system” scheme, more “recover those recent files I lost.”