kongking
Lurker
Posts: 2
Liked: never
Joined: Oct 12, 2010 10:49 am

Backup error

Post by kongking » Oct 12, 2010 11:04 am

We have several virtual machines on an ESX server, which we back up with Veeam Backup and Replication.


The backups have suddenly stopped working:

Code:

Created by Veeam Backup
10/11/2010 2:21:19 AM
Backup: vcenter2 (Retry)
Created at: 8/31/2010 4:33:44 PM Created by: VEEAM1\root
Session Details
Status 	Error 	Start time 	10/11/2010 2:20:58 AM
Details 	Checking backup version; Failed to wait mutex perlsoapbackup5.fc2.com: timeout 20 sec exceeded
Total VMs 	0 	End time 	10/11/2010 2:21:18 AM 
Processed VMs 	0 	Duration 	0:00:20  
Successful VMs 	0 	Total size 	0.00 KB
Failed VMs 	0 	Processed size 	0.00 KB
VMs in progress 	0 	Processing rate 	0 KB/s
Processed Objects
VM name 	Status 	Start time 	End time 	Total files 	Processed files 	Total size 	Processed size 	Processing rate 	Duration 	Details

I dug through the logs and found nothing useful:

Code:

[12.10.2010 11:23:35] <01> Info  Starting job "vcenter2". See log file at "C:\Documents and Settings\root\Local Settings\Application Data\Veeam\Backup\Job_vcenter2.log"
[12.10.2010 11:23:35] <01> Info  Log has been started by VEEAM1\root user (Non-interactive)
[12.10.2010 11:23:35] <01> Info  Logging level is 4
[12.10.2010 11:23:35] <01> Info  Module: C:\Program Files\Veeam\Backup and FastSCP\Veeam.Backup.Manager.exe version: 4.1.1.105
[12.10.2010 11:23:35] <01> Info  OS: Microsoft Windows NT 5.1.2600 Service Pack 3
[12.10.2010 11:23:36] <01> Info  CPU: Intel Pentium III Xeon processor
[12.10.2010 11:23:36] <01> Info  Memory: 256.00 MB
[12.10.2010 11:23:36] <01> Info  Network: Local Area Connection, AMD PCNET Family PCI Ethernet Adapter - Packet Scheduler Miniport, Ethernet, Up; Unicast IPs: 10.40.107.15; Gateway IPs: 10.40.107.1;
[12.10.2010 11:23:36] <01> Info  Network: MS TCP Loopback interface, MS TCP Loopback interface, Loopback, Up; Unicast IPs: 127.0.0.1;
[12.10.2010 11:23:36] <01> Info  --------------------------------------------------
[12.10.2010 11:23:36] <01> Info  Creating job session, jobID {46f4f1c5-a0bd-411d-90ed-c64ecf497cf1}, jobName "vcenter2"
[12.10.2010 11:23:36] <01> Info  Job session {b8d9ea98-e316-411e-96fe-2c65f2deec8c} has been created
[12.10.2010 11:23:36] <01> Info  Job options: jobType "NET Backup", <BackupJobOptions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><NetBackupOnVcbFailure>false</NetBackupOnVcbFailure><CompressionLevel>6</CompressionLevel><RunManually>false</RunManually><FullBackupDays /><MaxAmountOfDiffBackups>0</MaxAmountOfDiffBackups><PostCommand><Days /></PostCommand><VDDKMode>san;nbd</VDDKMode><Templates>true</Templates></BackupJobOptions>
[12.10.2010 11:23:36] <01> Info  Job VSS options: <CVssOptions xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><Credentials><Password>AQAAANCMnd8BFdERjHoAwE/Cl+sBAAAAv23QvUMV+UOpH1SWPYoTdwQAAAACAAAAAAADZgAAqAAAABAAAADnghjAsvNr1j9Y9q4xYpOfAAAAAASAAACgAAAAEAAAAN1L0ylLqyzp3pD0UuIVR7oIAAAAvYWR+asNXpYUAAAAyeCmLcqHKR0QBNLDch3kl62p6cI=</Password></Credentials><IgnoreErrors>false</IgnoreErrors></CVssOptions>
[12.10.2010 11:23:36] <01> Info  Retry mode: False
[12.10.2010 11:23:36] <01> Info  Job operation: Checking backup version
[12.10.2010 11:23:36] <01> Info  Backup version 1, current supported version 1
[12.10.2010 11:23:36] <01> Info  Target host: name 'backup5.fc2.com', info 'Linux Host', apiVersion 'Linux Host'
[12.10.2010 11:23:36] <01> Info  Creating file commander for the host "backup5.fc2.com" (allowNfc: True)
[12.10.2010 11:23:36] <01> Info  [Ssh] Creating new connection 'backup5.fc2.com:22:fc2:True:True:1'.
[12.10.2010 11:23:36] <01> Info  [Ssh] logon, host "backup5.fc2.com", port 22, user "fc2", elevation to root "yes", autoSudo yes
[12.10.2010 11:23:37] <01> Info  [Ssh] Server (backup5.fc2.com) version string: "SSH-1.99-OpenSSH_5.3"
[12.10.2010 11:23:57] <01> Error  Failed to wait mutex perlsoapbackup5.fc2.com: timeout 20 sec exceeded   at Veeam.Backup.Common.MachineMutex.Init(String subSys, String name, Boolean waitForRelease, Int32 waitSec)
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Common.MachineMutex..ctor(String subSys, String name, Boolean waitForRelease, Int32 waitSec)
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Core.CSshFileCommander.Initialize()
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Core.CFileCommanderFactory.GetCommander_(CDBHost host, EProtocol protocol, Boolean allowNfc)
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Core.CFileCommanderFactory.GetCommander(CDBHost host, Boolean allowNfc)
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Core.CBackupTarget..ctor(CDBJob job, CDBSession jobSess)
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Core.CTargetFactory.CreateTarget(CDBJob job, CDBSession jobSess)
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Core.CBackupJob.Execute(CDBJob job, CDBSession jobSess)
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Core.CJobStarter.RunSession(CDBJob job, CDBSession session)
[12.10.2010 11:23:57] <01> Error     at Veeam.Backup.Core.CJobStarter.Run()
[12.10.2010 11:23:57] <01> Info  Job result: Failed to wait mutex perlsoapbackup5.fc2.com: timeout 20 sec exceeded
[12.10.2010 11:23:57] <01> Info  Job session {b8d9ea98-e316-411e-96fe-2c65f2deec8c} has been completed, status: Failed, 0 of 0 bytes, 0 of 0 tasks, 0 successful, 0 failed, details: "Checking backup version\nFailed to wait mutex perlsoapbackup5.fc2.com: timeout 20 sec exceeded\n"
[12.10.2010 11:23:58] <01> Info  Generating xml report for session {b8d9ea98-e316-411e-96fe-2c65f2deec8c}
[12.10.2010 11:23:58] <01> Info  Transforming xml report to html report
[12.10.2010 11:23:58] <01> Info  Sending email notification, server "10.40.10.197", port 25, timeout 100000
[12.10.2010 11:23:59] <01> Info  Job has been stopped
[12.10.2010 11:23:59] <01> Info  [Soap] Clearing cache.
[12.10.2010 11:23:59] <01> Info  [Soap] Cache is empty.
[12.10.2010 11:23:59] <01> Info  [Ssh] Clearing connection cache
[12.10.2010 11:23:59] <01> Info  Removing from cache
[12.10.2010 11:23:59] <01> Info  Disconnecting from backup5.fc2.com
[12.10.2010 11:23:59] <01> Info  [Ssh] Connection cache cleared
[12.10.2010 11:23:59] <01> Info  VeeamBackup Manager has stopped

It seems that Veeam is simply unable to access the Linux host via SSH, but I can find no reason why.
The Linux host is not even under load, and I can log on to it without any problems:

Code:

:~# ssh backup5.fc2.com
Warning: Permanently added 'backup5' (RSA) to the list of known hosts.
Last login: Tue Oct 12 20:00:01 2010 from 192.168.54.56
Linux backup5.fc2.com 2.6.33.2-v03 #3 SMP Sat Jul 3 00:30:17 JST 2010 x86_64

To access official Ubuntu documentation, please visit:
http://help.ubuntu.com/

root@backup5:~# w
 20:03:32 up 44 days,  5:51,  2 users,  load average: 1.74, 1.28, 1.17
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    ss-i.fc2.com     20:03    0.00s  0.00s  0.00s w

Vitaliy S.
Product Manager
Posts: 22777
Liked: 1526 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Backup error

Post by Vitaliy S. » Oct 12, 2010 12:45 pm

Hello, try rebooting the Veeam backup machine. If that doesn't help, please contact our support team and provide all log files for further investigation.

Re: Backup error

Post by kongking » Oct 12, 2010 8:41 pm

Vitaliy S. wrote: Hello, try rebooting the Veeam backup machine. If that doesn't help, please contact our support team and provide all log files for further investigation.

I think the problem was that we have two huge jobs running almost simultaneously every night: vcenter2 and vcenter3.
vcenter2 is about 264 GB, and vcenter3 is about 4.5 TB.

I disabled the schedule on both jobs, restarted the server, and then ran the vcenter2 job by hand.
It started working and is still in progress: 24% at the moment.

I'll keep an eye on its progress and hope it won't fail.

Thank you.

Re: Backup error

Post by Vitaliy S. » Oct 12, 2010 9:26 pm

Yes, mutex timeouts can be caused by concurrently running jobs. I would suggest avoiding that so you don't run into this kind of error later.
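
To illustrate how a concurrently running job can produce this kind of timeout, here is a minimal Python sketch. It is not Veeam's implementation: the shared lock, the `run_job` and `simulate` helpers, and the shortened timings are all assumptions made for illustration. One job holds a lock while the other waits a bounded time for it and gives up if the lock is not released in time.

```python
import threading
import time


def run_job(name, lock, wait_sec, work_sec, results):
    """Hypothetical job: take the shared lock or give up after wait_sec."""
    if not lock.acquire(timeout=wait_sec):
        # Mirrors the wording of the error in the log above (illustrative only).
        results[name] = f"Failed to wait mutex: timeout {wait_sec} sec exceeded"
        return
    try:
        time.sleep(work_sec)  # stand-in for the work done while holding the lock
        results[name] = "Success"
    finally:
        lock.release()


def simulate(stagger_sec, wait_sec=0.1, work_sec=0.5):
    """Start two jobs stagger_sec apart and report how each one ended."""
    lock = threading.Lock()
    results = {}
    first = threading.Thread(
        target=run_job, args=("vcenter3", lock, wait_sec, work_sec, results)
    )
    second = threading.Thread(
        target=run_job, args=("vcenter2", lock, wait_sec, work_sec, results)
    )
    first.start()
    time.sleep(stagger_sec)  # delay between the two job start times
    second.start()
    first.join()
    second.join()
    return results


if __name__ == "__main__":
    print("overlapping:", simulate(stagger_sec=0.05))
    print("staggered:  ", simulate(stagger_sec=0.8))
```

With overlapping start times, the second job runs out of waiting time while the first still holds the lock. With a stagger longer than the first job's runtime, both complete.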

NightBird
Service Provider
Posts: 175
Liked: 32 times
Joined: Apr 28, 2009 8:33 am
Location: Strasbourg, FRANCE

Re: Backup error

Post by NightBird » Mar 19, 2011 9:14 am

I have the same kind of problem with replication jobs on my side.

I run two replication jobs every hour from 6 am to 7 pm.
The problem is that both replication jobs start at the same time. Is it possible to keep them from starting together (one at 6:00 and the other at 6:10, for example)?

Regards,
Boris

Re: Backup error

Post by Vitaliy S. » Mar 21, 2011 12:38 pm

Yes, I believe you should be able to use PowerShell to schedule the replication jobs so that they start at different times.
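
As a rough sketch of the staggering idea (one job on the hour, the next ten minutes later), the following Python snippet only computes the start times. It does not talk to Veeam at all: the `staggered_starts` helper and the job names `replica-a` and `replica-b` are hypothetical, and applying the resulting times to real jobs would still have to be done through the product's own scheduler or PowerShell.

```python
from datetime import datetime, timedelta


def staggered_starts(jobs, first_hour=6, last_hour=19, offset_min=10):
    """Return {job: ["HH:MM", ...]}, shifting each job by offset_min minutes.

    The first job starts on the hour, the second offset_min later, and so on,
    for every hour from first_hour to last_hour inclusive.
    """
    schedule = {}
    for index, job in enumerate(jobs):
        shift = timedelta(minutes=index * offset_min)
        times = []
        for hour in range(first_hour, last_hour + 1):
            # The date is arbitrary; only the time of day is used.
            start = datetime(2000, 1, 1, hour) + shift
            times.append(start.strftime("%H:%M"))
        schedule[job] = times
    return schedule


if __name__ == "__main__":
    # Hypothetical job names, hourly from 6 am to 7 pm as in the question.
    for job, times in staggered_starts(["replica-a", "replica-b"]).items():
        print(job, times[0], "...", times[-1])
```

The offset should be chosen to be longer than the first job usually takes, otherwise the two jobs can still overlap.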
