KarlW
Influencer
Posts: 10
Liked: never
Joined: Jun 18, 2012 1:02 pm
Full Name: Karl Wallenius
Contact:

General questions about large environments

Post by KarlW »

Hello,

I've been given the task of reorganizing our Veeam infrastructure, and I have some questions for the more experienced Veeam administrators out there.

Our environment consists of about 1000 VMs.

Our Veeam setup is the following:
Two ESXi hosts, each with a Veeam B&R server installed on its own VM. We also have about two extra backup proxies that run on spare resources of the ESXi hosts (we double-checked the performance, and neither CPU nor RAM was the issue).

The storage is located on a SAN connected to the ESXi hosts via FC, and provisioned accordingly.
Our primary bottleneck is the time it takes for the backups to be written to the SAN disks.

Now, after describing the environment I will proceed to the actual question(s).

1. Since there is no way for us to take a daily incremental backup of every VM in the entire environment (or is there?), what is the best way to set up the jobs and the scheduling? In the current configuration we back up a different datastore each day, Monday to Friday, so each one is backed up once a week. Are there any better suggestions for how to pull this off?
2. Is it even possible to take incremental backups of this many servers daily with our current configuration? (I can provide more detailed info, of course.)

And last,
Is there anyone out there with experience with large environments who can give me some pointers or suggestions?

Best regards,
KarlW
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: General questions about large environments

Post by dellock6 » 1 person likes this post

Hi Karl,

I manage an environment of almost the same size: 1000 VMs and 64 TB of datastores. There are certainly many improvements you can apply to your design so that you can do complete backups of the whole infrastructure on a daily basis.

First of all, I think you have too few proxies to speed up backups, and I'm also not sure why you need two different Veeam backup servers instead of one managing the whole backup infrastructure.

Also, some more information is needed to better help you:
- Which version of Veeam Backup are you using?
- Are you using the same SAN for both production and backups? It's not clear.
- You say you checked performance, but what do the bottleneck stats show in the Veeam reports? You can see this value in the upper left of the job details.

Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
KarlW
Influencer
Posts: 10
Liked: never
Joined: Jun 18, 2012 1:02 pm
Full Name: Karl Wallenius
Contact:

Re: General questions about large environments

Post by KarlW »

Hi Luca, and thanks for your reply!

Well, I don't really know. It just seemed logical to me to have separate management servers for the different clusters. It was set up before I really knew exactly what the proxies were for.

So step One is to have just ONE management server? And the rest proxies?

And to respond to your questions:
- We are using Veeam Backup & Replication 6.5, but we will probably upgrade to version 7 now that we are rebuilding the whole thing from the ground up, also to take advantage of the combined deduplication power of Win2012 and Veeam 7.
- No. We have production and backups on different SANs. The Veeam server and proxies are connected to the VMware environment via the backbone Ethernet connection.
- According to the jobs, the bottleneck is "Target". So I presume it's the disks that Veeam writes the backups to.

And also: how do you split up the scheduling?

Thanks again for your quick response!
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: General questions about large environments

Post by dellock6 »

KarlW wrote:Hi Luca, and thanks for your reply!

Well, I don't really know. It just seemed logical to me to have separate management servers for the different clusters. It was set up before I really knew exactly what the proxies were for.

So step One is to have just ONE management server? And the rest proxies?
Yes! Even better, a central virtual Veeam server with NO proxy or repository role, only management functions. All the other servers are proxies or repositories. This way, Veeam can spread the jobs across all the proxies and maximize performance. Once upgraded to v7, you can also take advantage of parallel processing; with such a large environment, I can assure you it is going to make a huge impact (for the better ;))
KarlW wrote: - We are using Veeam Backup & Replication 6.5, but we will probably upgrade to version 7 now that we are rebuilding the whole thing from the ground up, also to take advantage of the combined deduplication power of Win2012 and Veeam 7.
OK, but precisely because you are looking ahead to v7, think about Backup Copy and the possibility of creating two-tier backups. Don't use Win2012 dedup on the landing storage of the backups; it is better to keep only a few restore points on the primary backup storage, and then use Backup Copy to extend the retention by saving data to the secondary backup storage.
(Sorry for the shameless plug) I wrote about it here a few months ago:
http://www.virtualtothecore.com/en/?p=5021
KarlW wrote: - No. We have production and backups on different SANs. The Veeam server and proxies are connected to the VMware environment via the backbone Ethernet connection.
OK. So another possible improvement is to have physical proxies that also act as repositories and are connected via FC to both the production and the backup SAN. This way you never hit the LAN anywhere in the Veeam data pipeline...
KarlW wrote: - According to the jobs, the bottleneck is "Target". So I presume it's the disks that Veeam writes the backups to.
"Target" is totally expected if you run reversed incremental backups.
KarlW wrote: And also: how do you split up the scheduling?
Ehm, I don't :)
I use multiple proxies, set the maximum concurrency on all of them, and then let all the jobs start at the same time. Only the maximum number of jobs run at once; the others stay in the queue waiting for their turn. This way, we get two results:
- we do not need to complicate the scheduling options: every new backup job is created like the others in a few clicks, and adding it does not require recalculating the whole scheduling matrix
- we maximize the use of the queues, since as soon as a slot frees up after a backup finishes, a new backup starts right away. You can clearly see this behaviour in the Enterprise Manager graph: there are no more points with zero data going into the repositories.
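
To make the queueing idea concrete, here is a minimal sketch of it in Python. It is not Veeam's actual scheduler: the `simulate` function, the job durations, and the slot count are all made up for illustration. Every job is submitted at time 0, and a fixed number of concurrent slots drains the queue.

```python
import heapq

def simulate(job_minutes, slots):
    """Submit every job at t=0 and let a fixed number of slots drain the queue.

    Returns (start_times, makespan), both in minutes.
    Hypothetical model for illustration only.
    """
    free_at = [0] * slots            # time at which each slot next becomes free
    heapq.heapify(free_at)
    start_times = []
    makespan = 0
    for minutes in job_minutes:      # jobs leave the queue in submission order
        start = heapq.heappop(free_at)   # the earliest free slot takes the job
        end = start + minutes
        heapq.heappush(free_at, end)
        start_times.append(start)
        makespan = max(makespan, end)
    return start_times, makespan

# Five jobs with invented durations, two concurrent slots.
starts, total = simulate([30, 10, 20, 10, 40], slots=2)
print(starts, total)
```

Each queued job starts at the exact moment a slot becomes free, so there is no schedule to maintain and no gap between one backup finishing and the next one starting.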

Glad to help :)
Luca.
KarlW
Influencer
Posts: 10
Liked: never
Joined: Jun 18, 2012 1:02 pm
Full Name: Karl Wallenius
Contact:

Re: General questions about large environments

Post by KarlW »

Thank you very much. I've summarized everything you've said and brought it up with my team.
Your help is very much appreciated!
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: General questions about large environments

Post by dellock6 »

Glad to help!
Let me know which of the many ideas I put on the table you decide to implement; it's always nice to compare ideas with other users :)

Luca.
KarlW
Influencer
Posts: 10
Liked: never
Joined: Jun 18, 2012 1:02 pm
Full Name: Karl Wallenius
Contact:

Re: General questions about large environments

Post by KarlW »

dellock6 wrote:Glad to help!
Let me know which of the many ideas I put on the table you decide to implement; it's always nice to compare ideas with other users :)

Luca.
Hello Luca!

We decided to go with direct SAN access, with physical proxies and repositories. We had all the prerequisites for it, so it was the natural way to go. We connected our backup SAN to our production SAN through FC and rebuilt the whole Veeam environment.

We're still in a test phase, but so far so good. The speeds are incredible. I got some help from more experienced VMware infrastructure colleagues.
We decided to run one job per cluster (my instinct was one job per LUN/datastore), but only as a speed test to see the time required.

So cluster one is speeding away at glorious speeds, and the first full backup will soon be complete; it will take approximately 24-36 hours in total. I guess the reverse incrementals will be a lot quicker.
kte
Expert
Posts: 179
Liked: 8 times
Joined: Jul 02, 2013 7:48 pm
Full Name: Koen Teugels
Contact:

Re: General questions about large environments

Post by kte »

Any numbers? I ask because my full backup is very quick at 500 MB/s, but reverse incremental is a lot slower at 35 MB/s. I have 22 4 TB SATA spindles.
K
chrisdearden
Veteran
Posts: 1531
Liked: 226 times
Joined: Jul 21, 2010 9:47 am
Full Name: Chris Dearden
Contact:

Re: General questions about large environments

Post by chrisdearden »

Don't necessarily focus on the processing speed; look at the time it took to process each VM.
Gostev
Chief Product Officer
Posts: 31804
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: General questions about large environments

Post by Gostev »

kte wrote: reverse incremental is a lot slower at 35 MB/s. I have 22 4 TB SATA spindles.
That's way too slow for 22 spindles. Are you using RAID10? Maybe you should try playing with the RAID controller cache settings.
kte
Expert
Posts: 179
Liked: 8 times
Joined: Jul 02, 2013 7:48 pm
Full Name: Koen Teugels
Contact:

Re: General questions about large environments

Post by kte »

2 array controllers, each with 2 GB cache (15% read, 85% write); 10 MDL SAS 7k disks in RAID 6 and 12 MDL SAS 7k disks in RAID 6; both LUNs striped as RAID 0 in Windows 2012. A normal full backup runs at 500 MB/s, and I guess incremental jobs will go at the same speed, but reverse incremental runs between 25 and 80 MB/s depending on the job.
Is that normal or not? Veeam 7 patch 1.

k
luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: General questions about large environments

Post by luckyinfil » 1 person likes this post

RAID 10 would be better for performance. (less write penalty than the other RAID groups)
luckyinfil
Enthusiast
Posts: 91
Liked: 10 times
Joined: Aug 30, 2013 8:25 pm
Contact:

Re: General questions about large environments

Post by luckyinfil » 2 people like this post

Your array would explain why you are getting horrible performance with reversed incrementals. It's been documented that reversed incrementals are heavy on IOPS (more specifically, write IOPS). Not only is your array slow due to the use of 7200 RPM drives (about 75 IOPS per drive), but the fact that it is RAID 6 makes the performance even worse due to the write penalty. I'd suggest you stick with forward incrementals, which are less heavy on IOPS.
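
As a rough back-of-the-envelope check of the numbers above, the following Python sketch estimates random write IOPS for the 22 drives. It assumes a simplified model (all drives treated as one pool, no controller cache, the usual rule-of-thumb write penalties of 6 for RAID 6 and 2 for RAID 10); the function name is hypothetical.

```python
def random_write_iops(drives, iops_per_drive, write_penalty):
    """Rule-of-thumb estimate: raw IOPS divided by the RAID write penalty.

    Ignores controller cache and full-stripe write optimizations.
    """
    return drives * iops_per_drive / write_penalty

raid6 = random_write_iops(22, 75, write_penalty=6)    # 22 x 7200 RPM drives in RAID 6
raid10 = random_write_iops(22, 75, write_penalty=2)   # same drives in RAID 10
print(raid6, raid10)
```

Under these assumptions the same spindles deliver three times as many random writes in RAID 10 as in RAID 6, which is why the RAID level matters so much for reverse incremental.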
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: General questions about large environments

Post by tsightler » 2 people like this post

Reverse incremental will be much slower from a throughput perspective since it's dependent on I/O. How long does the job take to run? That's more important.

You're using RAID6, so you have a 6x write penalty, and reverse incremental is 2x write/1x read, so a Veeam processing rate of 35MB/s is actually 105MB/s of random I/O, which is actually quite decent. However, you have decent cache sizes, so that should offer significant benefit. What's the stripe size of those RAID volumes and of the Windows stripe (Windows 2012 defaults to a 64K stripe)? Larger stripe sizes are generally better for Veeam.
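
The arithmetic in the paragraph above can be written out as a small Python sketch. The front-end figure (105 MB/s) follows directly from the 2x write/1x read pattern; the back-end figure is an added worst-case assumption that every write pays the full RAID 6 penalty with no help from cache, and the function name is hypothetical.

```python
def reverse_incremental_io(processing_mb_s, raid_write_penalty):
    """Translate a Veeam processing rate into repository I/O (MB/s).

    Reverse incremental does 1 read + 2 writes per changed block.
    Back-end load is a worst-case estimate: no cache, no full-stripe writes.
    """
    reads = processing_mb_s * 1
    writes = processing_mb_s * 2
    front_end = reads + writes                       # what the volume sees
    back_end = reads + writes * raid_write_penalty   # what the disks may see
    return front_end, back_end

front, back = reverse_incremental_io(35, raid_write_penalty=6)
print(front, back)
```

So a modest-looking 35 MB/s processing rate already means 105 MB/s of random I/O at the volume, and potentially several times that at the disks if the cache cannot absorb the writes.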
kte
Expert
Posts: 179
Liked: 8 times
Joined: Jul 02, 2013 7:48 pm
Full Name: Koen Teugels
Contact:

Re: General questions about large environments

Post by kte » 1 person likes this post

I can't lose 24 TB of net capacity by changing from RAID 6 to RAID 10, and I need RAID 6 because of the risks with the 4 TB disks.
A full backup of the Exchange server takes 2 h 30 min for 4 TB; in reverse incremental it takes between 4 and 6 hours, and the performance is not the same every day.
With forward incremental, how can I get minimal disk space usage? I tried a weekly synthetic full + reverse rollback, but it took 20 hours to convert 3 TB of source backups.
Should I run synthetic full + reverse rollback every day after the incremental backup, or just run incremental backups up to 31 restore points and then do an active full again?

I formatted the LUN with the biggest block size in Windows to get the largest possible LUN disk space, so I guess it would be 64K.

K
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: General questions about large environments

Post by tsightler » 1 person likes this post

kte wrote: I formatted the LUN with the biggest block size in Windows to get the largest possible LUN disk space, so I guess it would be 64K.
I wasn't referring to the NTFS block size; I was referring to the underlying stripe size of the Windows RAID0 stripe and of the RAID6 stripes of your physical arrays. Veeam generally uses a fairly large block size (typical I/O is 256KB blocks), so if you have a 64K stripe size (the Windows 2012 default), each single write I/O from Veeam results in 2 write I/Os on each of your underlying LUNs. Since running reverse incremental causes 2 write I/Os for every changed block, that's 8 total write I/Os (4 per LUN), which will then potentially be magnified by the stripe size of the RAID6 arrays, assuming it's something small (some devices default to 32K). Add to that the RAID6 penalty and you lose a lot of potential random I/O performance (not much of an issue for sequential).
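
The counting in the paragraph above can be sketched in Python as follows. This is an illustrative model only: it assumes the I/O is aligned to a stripe boundary and starts on the first LUN, and the function name is hypothetical.

```python
def writes_per_column(io_kb, stripe_kb, columns, writes_per_block):
    """Count the write I/Os each RAID0 column (LUN) receives for one changed block.

    Assumes the I/O is stripe-aligned and starts on column 0.
    """
    chunks = -(-io_kb // stripe_kb)          # ceiling division: stripe-sized pieces
    counts = [0] * columns
    for i in range(chunks):
        counts[i % columns] += writes_per_block   # chunks rotate across the columns
    return counts

# 256 KB Veeam block, 64 KB Windows stripe, 2 LUNs, 2 writes per changed block.
small_stripe = writes_per_column(256, 64, columns=2, writes_per_block=2)
# Same block with a 256 KB stripe: the whole block lands on a single LUN.
large_stripe = writes_per_column(256, 256, columns=2, writes_per_block=2)
print(small_stripe, sum(small_stripe), large_stripe, sum(large_stripe))
```

With a 64K stripe the single changed block turns into 8 writes (4 per LUN); with a stripe as large as the Veeam block it stays at 2, which is the reasoning behind preferring larger stripe sizes.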
kte
Expert
Posts: 179
Liked: 8 times
Joined: Jul 02, 2013 7:48 pm
Full Name: Koen Teugels
Contact:

Re: General questions about large environments

Post by kte » 1 person likes this post

How can I verify the underlying stripe size of my Windows RAID 0 stripe across my two RAID 6 LUNs?
On my Smart Array controllers the stripe size is 256K (but the full stripe size is 2560 KB, I guess because I have 12 disks in RAID 6). I can go up to 1024K; would this improve performance in RAID 6? Or should I take 128K (RAID 0 over 2 RAID controllers)? I use optimal compression in Veeam, so it should be 256K as well.

K
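
For reference, the 2560 KB figure mentioned in the post above is consistent with the usual relationship between per-disk strip size and full stripe width in RAID 6 (two disks' worth of parity per stripe). A small Python sketch, with a hypothetical function name; the value for the 10-disk group is derived here and was not reported in the thread.

```python
def full_stripe_kb(disks, parity_disks, strip_kb):
    """Full stripe width = number of data disks x per-disk strip size."""
    return (disks - parity_disks) * strip_kb

twelve_disk_group = full_stripe_kb(12, parity_disks=2, strip_kb=256)
ten_disk_group = full_stripe_kb(10, parity_disks=2, strip_kb=256)
print(twelve_disk_group, ten_disk_group)
```

A single 256 KB write covers only a fraction of a full stripe of that width, so a random write is a partial-stripe write, which is where the RAID 6 write penalty comes from.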
bertdhont
Service Provider
Posts: 45
Liked: 5 times
Joined: Nov 08, 2013 2:53 pm
Full Name: Bert D'hont
Contact:

Re: General questions about large environments

Post by bertdhont » 1 person likes this post

kte,

Have you had any answer on this one?
We have the same question, what is the best stripe size.....
albertwt
Veteran
Posts: 941
Liked: 53 times
Joined: Nov 05, 2009 12:24 pm
Location: Sydney, NSW
Contact:

Re: General questions about large environments

Post by albertwt »

Same thing here Bert,

What stripe size are you using now for the large NTFS LUN used by the repository?
--
/* Veeam software enthusiast user & supporter ! */
albertwt
Veteran
Posts: 941
Liked: 53 times
Joined: Nov 05, 2009 12:24 pm
Location: Sydney, NSW
Contact:

Re: General questions about large environments

Post by albertwt »

OK, so based on my understanding, these are the recommended settings:

If you are using Veeam Backup v8.0 or later and no backup to tape is required, convert all backup jobs to the new forward incremental-forever mode.

The backup retention policy will then delete old restore points to keep the backup size under control.

This is because using reverse incremental on RAID5 or RAID6 incurs a heavy write penalty, causing the backup window to run longer.

Is that correct?
veremin
Product Manager
Posts: 20400
Liked: 2298 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: General questions about large environments

Post by veremin » 1 person likes this post

Forward forever incremental is less stressful than reversed incremental mode in terms of the I/O load it puts on the target repository. However, it still involves transformation activity, and RAID 5/6 might not necessarily cope with it well.

So, you can check the resulting performance and see whether it meets your requirements.

Thanks.
albertwt
Veteran
Posts: 941
Liked: 53 times
Joined: Nov 05, 2009 12:24 pm
Location: Sydney, NSW
Contact:

Re: General questions about large environments

Post by albertwt »

Vladimir,

Yes, that's what I would like to try, because at the moment disk space usage is also an issue for me, not just the long backup window (snapshots stay open longer for bigger VMs).

So I guess I can just try forward incremental mode on the new LUN that I'm setting up now.
veremin
Product Manager
Posts: 20400
Liked: 2298 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: General questions about large environments

Post by veremin »

Yep, try that and see whether you find the performance acceptable. If not, it might be worth thinking about switching to either RAID 10 or forward incremental with periodic active fulls. Thanks.
albertwt
Veteran
Posts: 941
Liked: 53 times
Joined: Nov 05, 2009 12:24 pm
Location: Sydney, NSW
Contact:

Re: General questions about large environments

Post by albertwt »

... Deleted... Repost.
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: General questions about large environments

Post by dellock6 » 5 people like this post

Albert, just a kind request: you are posting the same questions in two different threads. This makes it hard to follow the conversation and help. Could you please stick to only one thread?

Thanks,
Luca
albertwt
Veteran
Posts: 941
Liked: 53 times
Joined: Nov 05, 2009 12:24 pm
Location: Sydney, NSW
Contact:

Re: General questions about large environments

Post by albertwt »

Sorry for the confusion Luca,
Please delete the thread above.