Comprehensive data protection for all workloads
dasfliege
Service Provider
Posts: 316
Liked: 69 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland
Contact:

Backup storage evaluation / Changerate

Post by dasfliege »

We're evaluating a new backup storage for our MSP environment. Currently we're using SAN storage and are running ReFS and XFS repos.
For the future we're considering the following options:
- New all-flash SAN storage with XFS repos
- Huawei OceanProtect with DataTurbo, dedupe, etc.

We have roughly 100 TB of source data and a pretty unpredictable change rate, since it's a service provider environment with tenants from all kinds of businesses.
We are performing daily incremental backups and keep them for 30 days / 52 weeks on disk, and we also copy them to a second datacenter site.

In order to get the right sizing for the OceanProtect, we need to find out our daily and yearly change rate, which, due to block clone savings, is pretty hard to determine. We do not have Veeam ONE.
I wrote a script which searches for the last successful run of each backup job and compares "Processed" with "Read" data to calculate the change rate. To my understanding, "Read" is the value that CBT (or RCT in our Hyper-V case) reports as changed data. "Transferred" is the amount of data that really travels over the network after it has been compressed by the source proxy. Is this correct? Where does per-job dedupe happen in Veeam? Do I have to take this into consideration when reading the job session stats, or does it happen afterwards?

For the sizing estimation for the new storage, we really need the "raw" change rate, as OceanProtect uses its own compression and dedupe mechanisms to achieve the best space savings.
My script, which compares the "Processed" and "Read" amounts, reports between 3% and 30% change rate per day. This results in an average change rate of 12.6% overall. Quite a lot in my opinion. I don't know whether this really is the truth or whether I misunderstood some of the measurements Veeam provides. I would also need to know the yearly change rate, and I guess it's not as easy as just multiplying the daily change rate by 365 :-)
Does anyone have experience with getting reliable information?

I can share my script if someone is interested.
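For illustration, the per-job calculation described above can be sketched in Python. This is a minimal sketch: the byte values are made-up examples, not our real session stats, and the saturation model for the yearly figure is my own simplifying assumption, not anything Veeam reports.

```python
# Sketch of the per-job change-rate calculation described above.
# All numbers are made-up examples, not real session statistics.

def daily_change_rate(processed_used_bytes: int, read_bytes: int) -> float:
    """Change rate for one incremental run: CBT/RCT-reported changed
    data ("Read") relative to the used size of all processed disks."""
    return read_bytes / processed_used_bytes

# The yearly change rate is not simply daily * 365: a block that changes
# on many days still only counts once as unique data. Under a toy model
# where each block independently changes with daily probability p, the
# fraction of blocks touched at least once in a year is:
def yearly_unique_fraction(p_daily: float, days: int = 365) -> float:
    return 1 - (1 - p_daily) ** days

rate = daily_change_rate(100 * 2**40, 12 * 2**40)  # 12 TB read of 100 TB used
print(f"daily change rate: {rate:.1%}")            # 12.0%
print(f"unique blocks touched per year (toy model): "
      f"{yearly_unique_fraction(0.01):.1%}")
```

The toy model shows why the yearly figure saturates: even a modest 1% daily rate touches nearly all blocks over a year if the changes are spread out, while it touches far fewer if the same hot blocks change every day.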
david.domask
Veeam Software
Posts: 2873
Liked: 660 times
Joined: Jun 28, 2016 12:12 pm
Contact:

Re: Backup storage evaluation / Changerate

Post by david.domask » 1 person likes this post

Hi dasfliege,
I wrote a script which searches for the last successful run of each backup job and compares "Processed" with "Read" data to calculate the change rate. To my understanding, "Read" is the value that CBT (or RCT in our Hyper-V case) reports as changed data. "Transferred" is the amount of data that really travels over the network after it has been compressed by the source proxy. Is this correct? Where does per-job dedupe happen in Veeam? Do I have to take this into consideration when reading the job session stats, or does it happen afterwards?
Your understanding is correct. Read is the total amount returned from the Hypervisor when we query for changed data (or all data in the event of Active Full), and transferred is the amount sent from the source datamover to the target datamover. (i.e., from the proxy to the repository)

Deduplication is done on the source side before transferring to the target datamover.

Processed may be throwing off your calculations a bit, as Processed relates to the total size of all VM disks processed by the job, not the actual allocated/used space. I would focus on the Read and Transferred amounts, and keep in mind that because the backup process works at the block level, even a small change in a block results in the entire block being marked as changed (i.e., if even a few bytes of a 1 MB block change, the entire block is returned by RCT as changed). So the total Processed/Read/Transferred may be higher than the actual changes on the guest OSes, but you still need to accommodate the amount of changed data.
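As a toy illustration of this block-granularity effect (the 1 MB tracking granularity and the write pattern below are assumptions chosen for the example, not measured values):

```python
# Toy illustration: small scattered writes inflate the CBT/RCT-reported
# change, because tracking works on whole blocks (assume 1 MB here).
BLOCK = 1 * 1024 * 1024  # assumed tracking granularity of 1 MB

def reported_change(write_offsets: list[int], write_size: int) -> int:
    """Bytes the tracker would report changed: every block touched by a
    write counts in full, no matter how small the write was."""
    touched = set()
    for off in write_offsets:
        first = off // BLOCK
        last = (off + write_size - 1) // BLOCK
        touched.update(range(first, last + 1))
    return len(touched) * BLOCK

# 1000 scattered 4 KB writes, one per 1 MB region of the disk:
offsets = [i * BLOCK for i in range(1000)]
actual = 1000 * 4096                       # ~3.9 MB actually written
reported = reported_change(offsets, 4096)  # 1000 MB reported as changed
print(reported // actual)                  # prints 256: ~256x inflation
```

The worst case (scattered tiny writes) inflates by block_size / write_size; sequential writes to the same region barely inflate at all, which is why the observed 3-30% daily spread across tenants is plausible.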

Veeam ONE is really the best tool for this, but your strategy is sound and will probably be reasonably accurate. Tracking the full space savings from fast clone is unfortunately a challenge, as there isn't a convenient way of knowing how much was fast-cloned versus written normally, so in addition to tracking what was processed during the backup jobs, it's best to also include the growth rate of the actual space usage on the repository in your calculation.
David Domask | Product Management: Principal Analyst
dasfliege
Service Provider
Posts: 316
Liked: 69 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland
Contact:

Re: Backup storage evaluation / Changerate

Post by dasfliege »

Thanks a lot for this clarification.

So if I got it right, "Transferred" contains the same data that was "Read", but after it has been compressed and deduped, right? If this is the case, the "Transferred" value is not suitable for us, as we have to deliver the real change rate, without any Veeam magic :-)
On the other hand, I understand that a single flipped byte can cause a reported change of 1 MB, and this is most likely the root cause of why I get such high calculated change rates. So maybe the real change rate is somewhere between the values of "Read" and "Transferred", but that's just a wild guess.
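That guess can at least be written down as a bracket. This is a sketch under the assumption that Transferred gives a lower bound (it is post-compression/dedupe) and Read an upper bound (it is inflated by block granularity); the byte values are hypothetical.

```python
# Sketch: bound the raw change rate between the two session counters.
# Assumption: Transferred <= raw changed data <= Read for a given run.

def change_bracket(read_bytes: int, transferred_bytes: int,
                   used_bytes: int) -> tuple[float, float]:
    """Rough lower/upper bounds on the raw daily change rate:
    lower = Transferred (already compressed and deduped by Veeam),
    upper = Read (inflated by whole-block change tracking)."""
    return transferred_bytes / used_bytes, read_bytes / used_bytes

low, high = change_bracket(read_bytes=12 * 2**40,       # hypothetical values
                           transferred_bytes=5 * 2**40,
                           used_bytes=100 * 2**40)
print(f"raw daily change rate likely between {low:.0%} and {high:.0%}")
# prints: raw daily change rate likely between 5% and 12%
```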

Tracking the actual growth rate of the backup repo isn't really accurate either, because it includes Veeam compression, dedupe, and block clone savings, and there are also months where tenants disappear and are deleted, but also months where we on-board 5 new tenants. But maybe tracking it over the period of one year would give us an additional source of information.

The PowerShell property that corresponds to the "Processed" value I'm seeing in the Veeam console is "$VBRBackupSession.Info.Progress.ProcessedUsedSize [System.Int64]". Judging by its name, I would assume it reflects the actual used space, not the total assigned space you mentioned?
https://github.com/VeeamHub/powershell/ ... Session.md
david.domask
Veeam Software
Posts: 2873
Liked: 660 times
Joined: Jun 28, 2016 12:12 pm
Contact:

Re: Backup storage evaluation / Changerate

Post by david.domask »

Hi dasfliege,

I would focus just on the average change rate as shown in the backups -- the actual data change rate on the VMs will always diverge slightly from this due to RCT + deduplication + fast clone, but it is the best source for understanding how much data is actually being processed each run, and I think it's prudent to size based on this. I think you're taking the right approach with the elements you're considering.

For Powershell, yes, the Progress property is where you want to be -- Processed size from the job will match the TotalSize property under $_.Info.Progress.
David Domask | Product Management: Principal Analyst
dasfliege
Service Provider
Posts: 316
Liked: 69 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland
Contact:

Re: Backup storage evaluation / Changerate

Post by dasfliege »

For Powershell, yes, the Progress property is where you want to be -- Processed size from the job will match the TotalSize property under $_.Info.Progress.
This is actually not true. "ProcessedUsedSize" is what I see in the job, and this is what confuses me if you say that the job displays the total assigned size of all VM disks. The "TotalSize" property shows more than what is displayed in the job.

Image

Image
david.domask
Veeam Software
Posts: 2873
Liked: 660 times
Joined: Jun 28, 2016 12:12 pm
Contact:

Re: Backup storage evaluation / Changerate

Post by david.domask » 1 person likes this post

Hi dasfliege,

I must have had incorrect information in my notes on these objects :) Indeed, based on your numbers I am mistaken: ProcessedUsedSize will reflect the Processed value from the UI. Thank you for catching that!
David Domask | Product Management: Principal Analyst