I'd like to delve a little further into Gostev's excellent topic today:
For the big topic for today, I wanted to talk about disk space efficiencies of ReFS/XFS backup repositories. I felt there was the need to raise more awareness here, because I often find myself recommending them as the solution to various customers' challenges, and in most cases this feature comes as a surprise. The fast cloning technology we are able to use with ReFS/XFS volumes comes with many benefits, but the biggest one is what we call "spaceless" synthetic full backups. This quite literally means that GFS backups are "free" to create, because they take no physical disk space. But even if you don't need GFS backups, there may still be the need for periodic synthetic fulls, for example to "seal" the previous full backup chain in order to make it eligible for offload to object storage.
I'd love to know how to achieve this goal. I have numerous Backup Copy jobs with GFS policy specifications (usually: 30D (forever forward) + 4W, 3M, 4Q, 7Y), usually targeting ReFS filesystems, and they seem to always grow. That's because the GFS point is always a Full Backup. These Repos grow and grow, and only reach stability at the 1Y point, then grow each additional 1Y.
THX,
-John
John Borhek, Solutions Architect
https://vmsources.com
While creating synthetic GFS backups is "free" because they simply reference blocks created by previous incremental backups, you still need space to host those continuous increments of the backup chain, which - being deltas - contain mostly unique data. I would say what you observe is totally expected.
I will try to explain it in other words. Forget you have GFS fulls for a moment, and imagine you're instead using a forever-incremental job with a funky schedule that matches your current retention policy - but is reversed on the timeline, as if you were doing backups for a client located in a universe where time runs backwards.
So for the first 3 quarters, your job will only run once a quarter, creating those 3 quarterly increments. Then in the 4th quarter, the job will start running monthly - and so on, down to the final 30 days of this 1 year period, when it will be running daily. In the end, you will end up with the exact same backup chain as you have today, just reversed in time. But now it should be easier to understand that of course your disk space usage will keep growing each and every time this backup job creates a new incremental backup file and saves it to the disk.
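To put rough numbers on this, here is a minimal Python sketch of the growth, not a sizing tool. Everything in it is an assumption made for illustration: the function names, the 1000 GB full, the 2% daily change rate, the simplified calendar (weeks of 7 days, months of 30, quarters of 90, years of 365), and the rule that the delta between two retained points grows linearly with the gap and is capped at the size of a full.

```python
def retained_days(today, daily=30, weekly=4, monthly=3, quarterly=4, yearly=7):
    """Days (counted from the first backup, day 0) whose restore points are kept.

    Hypothetical simplification of a 30D + 4W, 3M, 4Q, 7Y policy.
    """
    keep = set(range(max(0, today - daily + 1), today + 1))
    for interval, count in ((7, weekly), (30, monthly), (90, quarterly), (365, yearly)):
        if count:
            marks = list(range(0, today + 1, interval))
            keep.update(marks[-count:])
    return sorted(keep)


def estimated_usage(today, full_gb=1000.0, daily_change=0.02):
    """Estimated physical space in GB when GFS fulls share blocks with the chain.

    Assumption: the oldest point costs one full, and every later retained
    point only costs the delta since the previous retained point.
    """
    days = retained_days(today)
    usage = full_gb
    for prev, cur in zip(days, days[1:]):
        usage += min(full_gb, (cur - prev) * daily_change * full_gb)
    return usage


if __name__ == "__main__":
    for day in (30, 90, 180, 365, 730):
        print(day, len(retained_days(day)), round(estimated_usage(day)))
```

Under these assumptions the estimate keeps climbing through the first year simply because the oldest retained point keeps getting further away from the newest one, which is the same effect as the reversed schedule described above.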
Now, the beauty of "spaceless" synthetic full backups is that if you decide to have some of those GFS incremental backups stored on disk in the form of full backup files, then this will not require any extra disk space compared to the forever-incremental chain.
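The idea can be illustrated with a toy model in Python. This is only a sketch of block referencing in general, not how ReFS, XFS, or the backup software implements it; the `Repository` class and its methods are hypothetical names invented for this example.

```python
class Repository:
    """Toy model of a block-cloning repository (hypothetical, for illustration)."""

    def __init__(self):
        self.blocks = {}  # block id -> data; this is the physical storage
        self.files = {}   # file name -> {offset: block id}

    def write_backup(self, name, changed):
        """Store a full or incremental; every written block uses physical space."""
        refs = {}
        for offset, data in changed.items():
            block_id = len(self.blocks)
            self.blocks[block_id] = data
            refs[offset] = block_id
        self.files[name] = refs

    def synthetic_full(self, name, chain):
        """Create a full that only references blocks already in the chain."""
        merged = {}
        for file_name in chain:  # oldest first, so newer blocks win
            merged.update(self.files[file_name])
        self.files[name] = merged

    def read(self, name):
        """Resolve a file's block references back into data."""
        return {off: self.blocks[bid] for off, bid in self.files[name].items()}

    def physical_blocks(self):
        return len(self.blocks)


if __name__ == "__main__":
    repo = Repository()
    repo.write_backup("full", {0: b"A", 1: b"B", 2: b"C", 3: b"D"})
    repo.write_backup("inc1", {1: b"B2"})
    repo.write_backup("inc2", {3: b"D2"})
    before = repo.physical_blocks()
    repo.synthetic_full("gfs_weekly", ["full", "inc1", "inc2"])
    print(before, repo.physical_blocks(), repo.read("gfs_weekly"))
```

In this model the synthetic full is a complete, independently readable file, yet the physical block count is the same before and after creating it. The space is consumed by the increments themselves, which is why the repository still grows.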