Dear Veeam product development team, dear Anton,
We are a VASP partner located in Switzerland and are currently delivering an important backup project for one of our major customers (public account).
The customer has more than 400 TB of data to protect, including 300 TB of file services. These rely on a Dell Isilon / PowerScale architecture with multiple tiers: full-flash nodes (F series), capacity nodes (A series), and S3 object storage based on another Dell platform (ECS EX series). The object storage layer is deployed in geo-cluster mode (XOR), so it should not need to be backed up.
To give you average figures, let's consider that the full-flash nodes provide 10-15 TB out of those 300 TB, the capacity nodes provide 70-90 TB of the NAS capacity (main tier), and the S3 tier represents 220 TB of the overall capacity to protect.
Our main concern is that the backup process should not impact any of the tiering mechanisms in place.
The tiering capabilities of the PowerScale are based on a fairly standard storage tiering process called SmartPools, which works between different clusters of Isilon / PowerScale nodes (e.g. SSD & HDD): files are moved transparently in a post-process manner, based on granular policies (e.g. file type, file size, file usage, location, etc.). Our first objective would of course be to avoid saturating the SSD tier because of the backup process.
The secondary tiering capability of the PowerScale, called CloudPools, consists of moving files between the NAS and an on-premises or Cloud-based object store, and replacing the files in their original location with a symlink / SmartLink so that users can still access those files transparently. Again, the tiering process is based on granular policies. Accessing one or more files from the S3 bucket (read operations) is a task that is performed in memory between the ECS cluster and the slowest PowerScale / Isilon nodes. Of course, the backup process should not influence that tiering mechanism either, and should not saturate the HDD tier of the NAS platform.
The main issue is that those tiering mechanisms are well suited to a group of users accessing random files, not to a backup solution that needs to retrieve a copy of 200 million files spread over thousands of shares across different storage tiers.
We have run several tests for our customer and discovered that the backup process had a huge and direct impact on the tiering mechanisms of the platform. Based on a few sample shares (small ones, less than 1 TB each) that we tried to back up, we calculated that the initial backup would potentially take years to complete, which is of course not acceptable for our customer. In addition, the monitoring tool of the NAS platform generated a number of alarms because, after the backup, the tiering was no longer compliant with the quotas in place.
The previous backup solution in place (not Veeam) relied on NDMP, which is of course old and has its own caveats, but is at least able, using an advanced option, to differentiate symlinks from real files. This means that the backup solution only backs up the files on the active tier, as well as the symlinks, and the customer protected the "real" files located on the S3 storage using another method.
We of course know that you don't use the same protocols (NDMP vs SMB) and don't have the same visibility / capabilities to differentiate symlinks from files. Still, I was wondering if there would be any workaround to avoid "disturbing" those tiering mechanisms, meaning to reproduce more or less the NDMP approach: protecting only the symlinks, without backing up the "real data".
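To illustrate the kind of pre-filtering we have in mind, here is a minimal Python sketch. It is not a Veeam or Dell feature, only an illustration. It assumes that the tiered SmartLink stubs are exposed to SMB clients with the Windows "offline" (or "recall on data access") file attribute, which we have not verified on this cluster. The names `is_tiered_stub`, `read_attributes`, `classify_tree` and `ScanResult` are hypothetical helpers. The scan only reads metadata, so it should not trigger a recall from the S3 tier under that assumption.

```python
import os
from dataclasses import dataclass, field

# Windows / SMB file attribute bits (values from the Win32 API).
FILE_ATTRIBUTE_OFFLINE = 0x1000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x400000
STUB_MASK = FILE_ATTRIBUTE_OFFLINE | FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS


def is_tiered_stub(attributes: int) -> bool:
    """Return True if the attribute bits mark the file as offline / recall-on-access.

    Assumption: the NAS flags tiered stubs with one of these bits.
    """
    return bool(attributes & STUB_MASK)


def read_attributes(path: str) -> int:
    """Read the attribute bits from metadata only (no file content is opened).

    st_file_attributes exists on Windows only; elsewhere this returns 0,
    so every file is treated as resident.
    """
    info = os.stat(path, follow_symlinks=False)
    return getattr(info, "st_file_attributes", 0)


@dataclass
class ScanResult:
    resident: list = field(default_factory=list)  # data still on the NAS tiers
    stubs: list = field(default_factory=list)     # data tiered to object storage


def classify_tree(root: str, get_attributes=read_attributes) -> ScanResult:
    """Walk a share and split its files into resident files and tiered stubs."""
    result = ScanResult()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if is_tiered_stub(get_attributes(path)):
                result.stubs.append(path)
            else:
                result.resident.append(path)
    return result
```

Run from a Windows host against a mounted share, `classify_tree(r"\\nas\share")` (a placeholder path) would give a rough idea of how much of a share is resident versus tiered before a backup job touches it.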
If we extrapolate from that use case, most modern NAS platforms support such S3 tiering mechanisms, and those mechanisms should be broadly similar to the one I described (replacement of files by symlinks, memory cache to retrieve files when they are accessed). The object storage could be Cloud storage from a public Cloud provider (e.g. AWS, MS Azure), and the customer would surely expect to have the freedom to choose whether:
• The entire workload should be protected on-premise (Local copy of all data incl. Cloud-tiered files)
• Only the local data should be protected on-premise, and the Cloud-based workloads should be protected using another optimized backup method at the bucket level (e.g. from Cloud to Cloud)
To be clear, for this specific project, we are simply unable to start the first backup of the NAS platform. We have to deal with the tiering mechanisms in place without impacting the production environment (no downtime allowed).
We do have product specialists from Dell involved in the process, but we absolutely need support from your product development team. Could you tell us if, how and when you could address such constraints for NAS backup, whether you already plan to address such use cases in your roadmap, and / or whether you could provide a workaround in the meantime?
This kind of setup is quite common for Enterprise customers (tiering mechanisms), and we are currently working on other projects involving the same platforms, but at a higher scale, with petabytes of file services, and we just want to make sure that Veeam is able to handle such workloads…
In parallel, we opened a support ticket (Case # 07109334) to get technical feedback on advanced features that could mitigate the tiering impact on the system during the first file backup.
Thanks in advance for your quick feedback.
Regards
- Lurker
- Posts: 1
- Liked: never
- Joined: Jan 29, 2024 5:25 pm
- Full Name: Eric Beltier
- Product Manager
- Posts: 9748
- Liked: 2575 times
- Joined: May 13, 2017 4:51 pm
- Full Name: Fabian K.
- Location: Switzerland
Re: Veeam file backup with Isilon Cloud Pool on ECS tiering compatibility
Hello Eric,
Allow me a few questions before I reach out to our NAS team.
How was the Isilon appliance added to the backup server? As a NAS device or as an SMB/NFS share?
If the tiered data is transparently accessible by an end user through the file share endpoint, it will also be accessible to a backup application. We simply read what the user can also read from the file share.
Veeam Backup & Replication has support for NDMP backups. Have you run tests with it as well? https://helpcenter.veeam.com/docs/backu ... ml?ver=120
Best,
Fabian
Product Management Analyst @ Veeam Software
- Product Manager
- Posts: 14782
- Liked: 3054 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
Re: Veeam file backup with Isilon Cloud Pool on ECS tiering compatibility
Hello,
NAS backup does not follow symlinks. If tiering is done with symlinks, and we pull it from the S3 tier, then it needs to be investigated.
Just to be 100% clear: is the goal to back up only the "flash tier" or also the tiered data or both?
The whole tiering thing has been the same for decades: massive vendor lock-in, painful for migrations and backup, something that looks great on PowerPoint but is painful in real life. 200M files is not many files. Isilon & Veeam can easily work at that scale (and much higher) without tiering.
Best regards,
Hannes