okrehan
Service Provider
Posts: 22
Liked: never
Joined: Mar 17, 2011 10:53 am
Full Name: Oliver Krehan

Any evolution planned for guest file index handling?

Post by okrehan »

Hi all,

We have several customers who run large file servers (100 TB and above) with millions of files. They need to search for deleted files across several restore points, because their users often can't say when a file was deleted. I don't want to discuss whether this is the right way to do things, but I've seen competitors handle this kind of "problem" much more elegantly.
Just to get some discussion started, here are my observations and proposals.
I really love Veeam and have used VBR since v3, but in my opinion guest file indexing isn't enterprise-ready. It's okay for smaller environments with "few" files on the indexed guest systems, but it falls short for large deployments.

- Creating file indexes is quite time-consuming. Back when we ran on an HDD SAN, guest file indexing took more than 24 h. After migrating to all-flash, this dropped to 2 h or less, which is still longer than the runtime of the backup job itself, but that's okay.
--> Wouldn't it be better to do guest file indexing as a post-backup process? Mounting the backup and indexing it on the backup server itself should reduce load on the primary target and shorten the backup job. For restores, the backup has to be mounted anyway, so why not use this functionality?

- Having Windows dedup enabled on the file server volumes (often used by our customers) is a killer for guest file indexing. Veeam has to fall back to a standard tree walk, which is extremely time-consuming.
--> Probably not optimizable, but one has to know about it. There is only a simple hint that using dedup COULD lead to higher indexing times, but the increase is dramatic, especially with a large number of files. This should be stated more clearly in the docs.

- The way guest file indexes are stored feels like the '90s. Creating flat text files, compressing them and copying them over the network to the VBRcatalog share isn't really state of the art. If you use a separate Enterprise Manager VM (which we do for security reasons), the whole VBRcatalog is at least duplicated. With only 6 file servers and 2 months of index data, the folder is nearly 1 TB in size. Sure, we have BIG file servers, but as Veeam is enterprise-grade, this should be normal. Using malware detection raises storage consumption further.
--> Why not index the data on the VM, transfer it via Veeam components such as the data mover, and store the index data in databases? This should reduce the footprint and could be integrated better and more securely than copying zip files over the network to SMB shares.

- Using the index data in the EM web interface is a mess. With index files this big, searching for files across multiple jobs either constantly crashes with the error "searcher not found" or is extremely time-consuming. Even if you specify the correct restore point, searching for files inside the index can take more than 1 h. During this period, the EM VM with 8 vCPUs is at 100% load. Even after the search results are displayed, the GuestCatalog service on the EM VM keeps running at 100% for tens of minutes more, like a cool-down phase.
--> Using databases with a much more efficient query language should reduce the load considerably. In the past, Veeam offered support for MOSS (MS Search Server). No idea why this support ended, perhaps MOSS was discontinued by MS, but the idea behind it was great.
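
To make the database idea a bit more concrete, here is a minimal sketch of what "find deleted files across restore points" could look like as a single query. It uses Python with an in-memory SQLite database purely for illustration. The table `file_index`, the functions `build_index` and `find_deleted`, and the sample paths and dates are all hypothetical; none of this reflects how Veeam stores or queries its catalog today.

```python
import sqlite3


def build_index(conn, restore_points):
    """Store one row per (restore point, file path).

    restore_points maps an ISO date string to the paths seen in that
    restore point. Schema and names are assumptions for illustration.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_index ("
        " restore_point TEXT NOT NULL,"
        " path TEXT NOT NULL,"
        " PRIMARY KEY (restore_point, path))"
    )
    # Secondary index so path lookups do not scan every restore point.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_path ON file_index(path)")
    rows = [(rp, p) for rp, paths in restore_points.items() for p in paths]
    conn.executemany("INSERT OR IGNORE INTO file_index VALUES (?, ?)", rows)
    conn.commit()


def find_deleted(conn, pattern):
    """Return (path, last_seen) for matching files that are missing
    from the newest restore point, i.e. files that were deleted."""
    return conn.execute(
        """
        SELECT path, MAX(restore_point) AS last_seen
        FROM file_index
        WHERE path LIKE ?
        GROUP BY path
        HAVING MAX(restore_point) < (SELECT MAX(restore_point) FROM file_index)
        ORDER BY path
        """,
        (pattern,),
    ).fetchall()


# Hypothetical sample data: budget.xlsx disappears after 2024-07-02.
conn = sqlite3.connect(":memory:")
build_index(conn, {
    "2024-07-01": ["D:/share/report.docx", "D:/share/budget.xlsx"],
    "2024-07-02": ["D:/share/report.docx", "D:/share/budget.xlsx"],
    "2024-07-03": ["D:/share/report.docx"],
})
deleted = find_deleted(conn, "%budget%")
print(deleted)  # [('D:/share/budget.xlsx', '2024-07-02')]
```

The point is only that the user no longer has to know the restore point: the query reports the last restore point in which the file still existed, which is the one to restore from. Whether an embedded or a full database server would fit at the scale described above is an open question that this sketch does not answer.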

These are only some of the observations we made with large customers. I just want to share my experience; perhaps others have the same problems, and perhaps even a solution or workaround. It would be great if someone from Veeam could have a look at this topic and perhaps take it to development to discuss improvements.

Regards,
Oliver
Mildur
Product Manager
Posts: 10277
Liked: 2746 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland

Re: Any evolution planned for guest file index handling?

Post by Mildur »

Hi Oliver

Thank you for sharing your valuable feedback. I can't provide any information or an ETA on when these points will be addressed. However, please be assured that we are aware of some of the concerns. Your feedback will be included in our feature tracking system.
okrehan wrote: "Using malware detection will raise storage consumption additionally."
In V12.2, we are introducing a registry key that will allow users to customize the index retention. By setting the key to keep only a few days of the index, customers who do not wish to install Enterprise Manager or do not require search capabilities will be able to resolve storage consumption-related issues.
okrehan wrote: "--> Probably not optimizable, but one has to know about it. There is only a simple hint that using dedup COULD lead to higher indexing times, but the increase is dramatic, especially with a large number of files. This should be stated more clearly in the docs."
You can directly provide feedback to our documentation team by using the dedicated user guide page. On each page, you will find a feedback button at the bottom. Our team sincerely appreciates customer feedback.
okrehan wrote: "- Using the index data in the EM web interface is a mess. With index files this big, searching for files across multiple jobs either constantly crashes with the error "searcher not found" or is extremely time-consuming. Even if you specify the correct restore point, searching for files inside the index can take more than 1 h. During this period, the EM VM with 8 vCPUs is at 100% load. Even after the search results are displayed, the GuestCatalog service on the EM VM keeps running at 100% for tens of minutes more, like a cool-down phase."
May I ask, was this investigated by our customer support team?

Best,
Fabian
Product Management Analyst @ Veeam Software
okrehan
Service Provider
Posts: 22
Liked: never
Joined: Mar 17, 2011 10:53 am
Full Name: Oliver Krehan

Re: Any evolution planned for guest file index handling?

Post by okrehan »

Hi Fabian,

Thanks for your feedback, and sorry for the delayed answer, I'm currently on vacation.
We opened a case with Veeam (case ID 07376907), and it seems there is a private fix available that lowers CPU consumption during searches and also fixes some minor issues. We installed the fix: CPU usage is indeed down by ~50% and searches finish faster. We are still testing whether this resolves all our issues, but at least it speeds some things up.
Nevertheless, having file indices in database format would be preferable in my opinion.

Regards,
Oliver