At the weekend I read an article on Heise about backup. It reflects my own thoughts, experience, and approach to private data backup quite well. However, there are some specifics and aspects that I would like to outline here.
Disclaimer: I do not get any money to promote or bash any software. I only share my opinion and experience. The weaknesses of some tools could be just my own inability to use them properly – so please do not read this as general criticism of any of them.
For other people
For myself, I actually implement the 3-2-1-(0) rule mentioned in the article in a slightly enhanced way. For the PCs I take care of in my family, I have not been able to enforce it so far, since I do not physically perform the backups. The options set up in those environments boil down to Windows File History or OneDrive. But especially in combination, these do not work well, and only a fraction of the data is really backed up. Maybe in the future I can offer another option with Resilio Sync and also back up their data at my place. But for that I am still waiting for my new server and the fiber connection.
Back then …
My first memory of a real backup is my ZIP drive with its 100 MB disks. My backup concept back then was neither successful nor sophisticated, because there are files from the 90s that I no longer have. But I don't miss much. The main loss that comes to mind is the labeling of my leaf collection from science class, which I must have created in 5th grade with MS Works (on my Toshiba T1000 SE).
What I still have are save files of my old DOS games and my first programming attempts with Turbo Pascal. Along with my many Corel Draw files, these are probably not trivial to open today. I migrated most of the documents from those days to Rich Text Format during my time at university.
But this already drifts into the topic of archiving; more on that below.
My backup today
Synkron and Beyond Compare
For my own data, I have a somewhat more complex structure of many sources and sinks. The most important data lives on my laptop, and its backup is what I mainly describe in the following.
My primary backup is a weekly (or, after major changes, ad hoc) backup to an external HDD using the tool Synkron. This creates a 1:1 copy, but without versioning. The data is then readily available in the target with the same NTFS permissions and can be checked, compared, and restored without additional tools. Since I have used it to transfer data after previous PC or drive migrations, I know it works without problems. For some purposes, I have replicated Synkron's functionality in my PowerShell Backup Script (but with versioning).
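The core idea can be sketched in a few lines of PowerShell – a simplified stand-in for my actual script, with placeholder paths and a naive, space-hungry way of keeping old versions:

```powershell
# Simplified sketch of a 1:1 mirror with versioning; all paths are placeholders.
$Source   = 'D:\Data'
$Target   = 'F:\Backup\Data'
$Versions = Join-Path 'F:\Backup\Versions' (Get-Date -Format 'yyyy-MM-dd_HHmm')

# Preserve the current target state before mirroring, so files that the
# mirror would overwrite or delete remain restorable (simple but space-hungry).
New-Item -ItemType Directory -Path $Versions -Force | Out-Null
robocopy $Target $Versions /E /NP | Out-Null

# Mirror source to target: a 1:1 copy that also removes files deleted at the source.
robocopy $Source $Target /MIR /NP
```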
Furthermore, I've been experimenting with Beyond Compare for years, but was never entirely convinced. While writing this article, however, I became convinced enough that I will use the tool more in the future, so I purchased a license instead of falling back on the 30-day trial as before. I find it advantageous to see in advance what will be synchronized and how long it will take. Batch processing, however, is better in Synkron.
Windows File History
Since its introduction with Windows 8, I have also used the Windows File History. The target is my server – yes, the one with the great SATA power cable – but it is only turned on for backups. At first I was confused whether the problems with “Backup and Restore” in the Heise article referred to File History. But that does not seem to be the case, since they are two different programs (Windows 7 is explicitly mentioned). It is a pity, then, that File History, which has been a standard Windows program for 10 years now, is not mentioned in the article. While thinking about possible errors, some problems and solutions came to my mind:
- Not all folders can be backed up – for example, the folder with my marching music. I must explicitly exclude it, otherwise the whole backup fails. Here I can live without this additional layer of backup.
- I have not done a full restore test yet. Partial restores of old versions have always worked without problems. If necessary, you can access all files in their respective folder structure, and I use this for random spot checks that the backup ran correctly.
[Update 2022-09-29]: This week I did a full restore with File History for someone I know. I find it counterintuitive that you must first create a backup in the new Windows installation to be able to restore the old one, so I hesitated. In the end, it worked as expected and was even fast. However, I could not verify against the original data whether all files were restored correctly.
Offsite storage
The mentioned external HDD and the server are both in the same house as my laptop with the original data. Therefore, I keep two more copies off-site. Here, too, I make a copy with Synkron (or just Beyond Compare in the future) to various old HDDs. I am aware that it is not good to use old HDDs for backup, and especially not those with SMART errors. But I have made a clear risk evaluation and concluded that I would rather have an additional, somewhat unreliable backup than none at all. These off-site disks also hold the backups from my server (for example my RAW files), which exist only there. The allocation of M libraries to N drives of different sizes (3 or 4 TB) I plan in Excel.
When mass copying and moving between various folders on various HDDs, I miss a flag in the Windows copy dialog to execute the individual copy operations sequentially. I used to use additional software for this, but I no longer consider that sustainable and now sequence the jobs manually – which, as the sketch below shows, a few lines of PowerShell can also handle.
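A minimal sketch of such a sequential queue (with illustrative paths):

```powershell
# Sketch: running several large copy jobs strictly one after another,
# instead of letting Explorer start them all in parallel (illustrative paths).
$Jobs = @(
    @{ Source = 'E:\Photos2019'; Target = 'G:\Photos2019' },
    @{ Source = 'E:\Videos';     Target = 'H:\Videos' }
)
foreach ($Job in $Jobs) {
    # robocopy blocks until it is done, so the next job only starts afterwards
    robocopy $Job.Source $Job.Target /E /NP
}
```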
Partitioning
As mentioned in the last article, I have been using the Windows Storage Pool feature with several Storage Spaces on my server since Windows 8 (at that time only manageable via PowerShell). I have eight HDDs of different sizes (currently 4 TB – 14 TB) in one pool and seven different Spaces – six with parity and one as a mirror. They hold, for example, my photos, my videos, and two File History targets.
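For illustration, setting up a pool and spaces via PowerShell looks roughly like this – disk selection, names, and sizes are placeholders, not my actual configuration:

```powershell
# Sketch: creating a pool and two spaces roughly like the setup described.
# Disk selection, names and sizes are placeholders, not my actual values.
$Disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName 'DataPool' `
    -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName `
    -PhysicalDisks $Disks

# One space with parity and one as a two-way mirror:
New-VirtualDisk -StoragePoolFriendlyName 'DataPool' -FriendlyName 'Photos' `
    -ResiliencySettingName 'Parity' -Size 4TB
New-VirtualDisk -StoragePoolFriendlyName 'DataPool' -FriendlyName 'FileHistory' `
    -ResiliencySettingName 'Mirror' -Size 2TB
```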
I am only moderately satisfied with the performance, but reliability has held up so far despite various hardware failures and my own idiocy. Basically, I trust this technology because it is also used in a very similar form in the Windows Server operating system.
So at least on my server I have a certain resilience beyond the backup.
I will describe an extensive failure in the second part of this article soon. Already in the last one I hinted that the failure of a cheap SATA power cable had the potential to end in total disaster.
Smartphone
Then there is my Android phone. I sync a lot of data to my laptop and use it, together with the mechanisms described above, for backup. The synchronization is done with Resilio Sync in the Pro version. The main focus is the data that changes on the phone – mainly photos. Under Android I am very satisfied with this tool, but under iOS it is unfortunately barely usable, because the iOS file system can only be accessed in a very limited way. I only came across Syncthing through the Heise article, but from what I read there, I will stay with Resilio Sync.
Some folders from the laptop also end up on the smartphone immediately via Resilio Sync and could serve as an additional backup.
The Cloud
Using OneDrive for data backup is not an option for me for several reasons:
- I have too much data. With Microsoft 365 Family, six users can be created and, with clever distribution, the combined 6 TB could be used from one account. But that seems neither reliable nor stable to me.
- My current internet connection is still too slow. In the best case I will get a fiber connection this year, and then this argument disappears. I will surely find a new one to replace it.
- It's hard to sync the data back to another disk, especially if the OneDrive folders exist only in the cloud. It would only be an option for folders that never or rarely change, and even that sounds like too much of a hassle. Downloading X TB just to back it up to HDD does not sound reasonable.
- I don't trust Microsoft enough. There are enough documented cases of banned accounts, and I have little desire to lose data to such arbitrariness. Still, for some data, especially additionally shared data (because Resilio Sync probably won't take off with the others), the option is good enough.
All other options like Dropbox or Google Drive (except S3 or similar) are more expensive for my volume than my current solution anyway. Their problems are mostly similar.
WordPress
Worth mentioning briefly is the backup of this blog. Optimization is still on the to-do list, but fundamentally I use an export and snapshots within AWS.
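The snapshot part of an ad-hoc run boils down to a single AWS CLI call; a sketch with a placeholder volume ID (my actual AWS setup is not detailed here):

```powershell
# Sketch: an ad-hoc EBS snapshot of the blog server's volume via the AWS CLI
# (the volume ID is a placeholder; my actual setup is not detailed here).
$Description = 'WordPress backup ' + (Get-Date -Format 'yyyy-MM-dd')
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description $Description
```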
System recovery
I no longer use full system restores or rescue disks. I have had rather bad experiences with them in the past (see the next post), and they probably never would have helped me anyway. Besides, I find the alternative of rebuilding Windows and restoring my data quite appealing. I would reorganize various partitions, structures, etc. differently anyway.
The time savings are relative, because I assume I will always have access to a spare device and my data backups, enough to handle the most important everyday tasks.
Further aspects
Encryption
Another important aspect for me is encryption. All backup HDDs are encrypted with BitLocker. The recovery keys are themselves part of the backup and additionally secured (this is where security by obscurity comes into play).
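For illustration, encrypting a new backup drive and saving its recovery key can be done from PowerShell roughly like this (drive letter and paths are illustrative):

```powershell
# Sketch: encrypting an external backup drive with BitLocker from PowerShell
# (drive letter and file paths are illustrative, not my actual setup).
$Password = Read-Host -AsSecureString -Prompt 'BitLocker password'
Enable-BitLocker -MountPoint 'F:' -EncryptionMethod XtsAes256 `
    -PasswordProtector -Password $Password

# Add a recovery password and save it, so the key itself can be
# backed up (and secured separately, as described above):
Add-BitLockerKeyProtector -MountPoint 'F:' -RecoveryPasswordProtector
((Get-BitLockerVolume -MountPoint 'F:').KeyProtector |
    Where-Object KeyProtectorType -eq 'RecoveryPassword').RecoveryPassword |
    Out-File 'C:\Keys\F-RecoveryKey.txt'
```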
Storage media
Despite all the problems I have had so far, no medium holding original data has ever been damaged. The remedy has therefore always been quite simple: I removed the disks and reused them as external drives.
Neither the disks nor the server are permanently connected and turned on; they run only for the duration of a backup, to protect myself from ransomware and, more importantly, to save power. However, all drives are powered on often enough not to lose data, and they are also visited by chkdsk on a regular basis.
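Such a routine check with on-board tools can look like this (drive letter illustrative):

```powershell
# Sketch: a routine integrity check with on-board tools (drive letter illustrative).
# /scan runs an online NTFS scan without taking the volume offline.
chkdsk F: /scan
if ($LASTEXITCODE -ne 0) {
    Write-Warning "chkdsk reported issues on F: (exit code $LASTEXITCODE)"
}
```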
For my post on USB OTG, I was very surprised to find that the floppy drive still worked smoothly after what felt like 15 years without use. The chkdsk was error-free.
File system
For the file system I use NTFS exclusively. I have had very good experiences with various utilities (not only on-board tools) for recovering defective NTFS file systems. For a while I also used ReFS, but when I went looking for recovery tools comparable to those for NTFS, I decided to return to NTFS.
Photos
For photos, only the undeveloped RAWs and the developed exports really matter. These go through the full backup as described above (though the RAW files are not in the File History). Fully processed RAW files are offloaded to the server and rarely touched again (though occasionally further development is requested). In addition, they are backed up only once outside the house. Before 2014, I did not keep the RAW files at all.
Especially since my 2017 incident (see the next post), I back up important shootings (especially weddings) to an additional device via PowerShell immediately before sorting. Just yesterday I backed up 20 GB of spider photos as quickly as possible.
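In essence, this pre-sorting safety copy is just a quick one-way copy; a simplified sketch with illustrative paths (my actual script does a bit more):

```powershell
# Sketch: a quick safety copy of a fresh shooting to a second device
# before sorting begins (paths illustrative; my actual script differs).
$Shoot  = 'D:\Import\2022-09_Spiders'
$Safety = Join-Path 'G:\ShootBackup' (Split-Path $Shoot -Leaf)
robocopy $Shoot $Safety /E /NP "/LOG+:G:\ShootBackup\copy.log"
```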
E-mails
Also worth mentioning is how I back up my e-mails. Too many providers and clients over the years have taught me to just export them and save them like any other files.
I rarely need them anyway, and when I do, I can always drag and drop a folder into a mail client and use it. This also spares me the usual problems with PST files. In addition, I can file the e-mails next to related documents (e.g. invoices).
OneNote
I have been using OneNote very intensively since 2009. Years ago I moved some notebooks to the cloud to be able to use them on the smartphone as well. Since I don't quite trust OneDrive, as described above, I initially used the export function. When that became too much manual work, I automated the backup of the OneNote notebooks and included it in my PowerShell Backup Script.
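The automated export can be sketched via the OneNote COM API – this is a simplified version with an illustrative target path, and it requires a local OneNote desktop installation:

```powershell
# Sketch: exporting all notebooks as .onepkg packages via the OneNote COM API.
# Requires a local OneNote desktop installation; the target path is illustrative.
$OneNote = New-Object -ComObject OneNote.Application
$RawXml = ''
$OneNote.GetHierarchy('', 2, [ref]$RawXml)          # 2 = hsNotebooks
[xml]$Hierarchy = $RawXml

foreach ($Notebook in $Hierarchy.Notebooks.Notebook) {
    $Target = Join-Path 'F:\Backup\OneNote' ($Notebook.name + '.onepkg')
    if (Test-Path $Target) { Remove-Item $Target }  # Publish fails if the file exists
    $OneNote.Publish($Notebook.ID, $Target, 1, '')  # 1 = pfOneNotePackage
}
```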
I have actually used the resulting versioning quite a bit.
Folders
A backup is only good if it is complete. But backing up everything is too much for me. So I maintain a long list of folders that are important to me:
- My PortableApps (without File History)
- Favorites, which I prefer to manage as files in Windows Explorer
- Various Adobe settings (C:\Users\Stuxnerd\AppData\Roaming\Adobe\)
- Signatures in Outlook (C:\Users\Stuxnerd\AppData\Roaming\Microsoft\Signatures)
- SSH settings (C:\Users\Stuxnerd\.ssh\)
- The Bitcoin blockchain data, because otherwise it would take a week to download everything again (C:\Users\Stuxnerd\AppData\Roaming\Bitcoin\)
- My Jump List settings (C:\Users\Stuxnerd\AppData\Roaming\Microsoft\Recent\AutomaticDestinations\ and ..\CustomDestinations) – again without File History
- Lightroom MIDI profiles (C:\Program Files\Adobe\Adobe Lightroom\profiles)
This is just an excerpt, but it shows the direction it is headed; a sketch of how such a list can be processed follows below.
What I am still missing are the Explorer settings, but I have not yet found them as a file to save.
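A minimal sketch of how such a folder list can be fed to a script – a simplified stand-in for my PowerShell Backup Script:

```powershell
# Sketch: copying a curated folder list into the backup target
# (a simplified stand-in for my backup script; the target path is illustrative).
$Folders = @(
    "$env:APPDATA\Adobe",
    "$env:APPDATA\Microsoft\Signatures",
    "$env:USERPROFILE\.ssh",
    "$env:APPDATA\Microsoft\Recent\AutomaticDestinations"
)
foreach ($Folder in $Folders) {
    $Name = $Folder -replace '[:\\]', '_'   # flatten the path into a folder name
    robocopy $Folder (Join-Path 'F:\Backup\Folders' $Name) /E /NP
}
```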
Archiving – proprietary formats and tools
As already mentioned above, permanent access to important documents matters to me. Some old formats I have not yet tried to open again (e.g. Corel Draw (*.CDR)). But since these files cost nothing to keep, I leave them in a ZIP archive – the ZIP file for all my school projects is only 24 MB.
With the programs I use, it is also important to me that they are either part of the operating system (I am staying with Windows) or at least usable offline (in an emergency I can install Windows 10 without Internet access and still reach my data), and that the results are ideally accessible with on-board tools alone, without any additional programs.
Recovery
So far, there have been plenty of opportunities to test my procedures.
One important lesson from this is that certain system adjustments work better before the data is restored. I work quite a bit with hard links created via mklink, and I have them fully scripted as part of my backup.
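Recreating the links from a script can be as simple as this sketch (link and target paths are illustrative; hard links must stay on the same volume):

```powershell
# Sketch: recreating a set of hard links after a Windows rebuild
# (link and target paths are illustrative; hard links cannot span volumes).
$Links = @(
    @{ Link = "$env:USERPROFILE\Documents\Settings.ini"; Target = 'C:\Sync\Settings.ini' }
)
foreach ($L in $Links) {
    New-Item -ItemType HardLink -Path $L.Link -Target $L.Target -Force | Out-Null
}
```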
Scope
Everything described here is purely private in nature. In a business environment, I analyze different threats and different requirements and arrive at completely different solutions. But the lessons there are less entertaining and less controversial (at least on the subject of data backup).
Interim conclusion
Experience has shown that the Recovery Time Objective (RTO) is not a major issue for me, as so far I have usually been able to continue working directly from a backup on another device. The Recovery Point Objective (RPO) is something I could improve: so far I only back up weekly and on an ad hoc basis, so up to a week of changes could be lost. But protection from ransomware and saving power are higher priorities for me here.
So far, despite years of unprofessional backups, I have been spared any really painful data loss. That is not just skill; by now I would claim to have a sophisticated concept and to invest enough time in sufficient backups to at least avoid a total loss. Given my tendency to tinker, a backup is very important.
But as my former office mate always said: “Luck is always with the stupid.” So I guess I was also lucky.
Outlook
Having described here a rather idealized picture of my data backup and archiving, I would like to look at the reality in the next article. The failures of recent years have contributed significantly to my current approach.