Our team is very interested in the questions posted on SMVI functionality, configuration, and automation … thanks for taking the time to share. And please keep the questions and feedback on your experiences working with SMVI, good or bad, coming. Good feedback makes for a good day, bad feedback makes for a better product :)
I also wanted to let you know that NetApp is hosting an SMVI/SRM Webcast on February 19 focusing on data protection in a VMware environment: NetApp SMVI for backup/restore and VMware SRM for disaster recovery. SDDPC Sys Admin Rick Scherer - who designed and maintains a 25 host VMware ESX 3.5 farm with well over 300 Virtual Machines, plus writes a great blog (http://vmwaretips.com/wp/) - will be joining us to describe how his team uses SMVI. There will also be a panel of folks, including best practices authors and reference architects, to address questions submitted via chat.
SMVI Product Manager
I am in the UK and it will be late in the evening here when this event starts. Will the streamed content be recorded and made available via the NOW site after the event?
We are currently using OSSV to back up our VMs but will be changing to SMVI soon, and it would be good to know about some of the "holes in the road" before we hit them.
We have been running SMVI 1.0.1 for several weeks now and it is pretty stable (compared to 1.0!). Last night, 2 of our daily backup jobs started, and they are still in the running state, producing event 4096 in the application log like this:
1st an Error:
2936921 [backup2 6778732c8323002d813f32f6dab0368e] ERROR com.netapp.common.flow.JDBCPersistenceManager - FLOW-10110: Lock "50228a39-7eba-0c39-5d7c-bb5b34be305f" already held by backup-create operation 9a80d4f1bc7f39fa6e4cabb0559a097a 
2nd a warning:
2936921 [backup2 6778732c8323002d813f32f6dab0368e] WARN com.netapp.smvi.task.AcquireVirtualMachineLockWrapper - Could not lock all virtual machines. The lock process will continue to be retried until it succeeds.
Yesterday afternoon I tried to restore a test VM, which failed! Could that have disturbed the backup jobs?
I cannot stop the 2 backup jobs. I have restarted the VC server (where SMVI is also installed), but nothing helps. I cannot start new jobs either, so we are somewhat locked up!
Hope someone can give me a hint on how to stop this...
Thanks in advance
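For anyone chasing the same symptom, here is a minimal sketch of pulling the lock ID and the holding operation out of entries like the two quoted above. The pattern is modeled only on the single FLOW-10110 line in this post, so treat the format as an assumption; the function name `find_held_locks` is made up for illustration.

```python
import re

# Pattern inferred from the one FLOW-10110 line quoted in this thread;
# other SMVI versions may word the message differently.
LOCK_HELD = re.compile(
    r'FLOW-10110: Lock "(?P<lock_id>[^"]+)" already held by '
    r'(?P<op_type>\S+) operation (?P<op_id>[0-9a-f]+)'
)


def find_held_locks(lines):
    """Return (lock_id, op_type, op_id) for every FLOW-10110 entry found."""
    held = []
    for line in lines:
        match = LOCK_HELD.search(line)
        if match:
            held.append((match["lock_id"], match["op_type"], match["op_id"]))
    return held


# The two log entries from the post above, used as sample input.
sample = [
    '2936921 [backup2 6778732c8323002d813f32f6dab0368e] ERROR '
    'com.netapp.common.flow.JDBCPersistenceManager - FLOW-10110: Lock '
    '"50228a39-7eba-0c39-5d7c-bb5b34be305f" already held by backup-create '
    'operation 9a80d4f1bc7f39fa6e4cabb0559a097a',
    '2936921 [backup2 6778732c8323002d813f32f6dab0368e] WARN '
    'com.netapp.smvi.task.AcquireVirtualMachineLockWrapper - Could not lock '
    'all virtual machines. The lock process will continue to be retried '
    'until it succeeds.',
]

for lock_id, op_type, op_id in find_held_locks(sample):
    print(f"lock {lock_id} is held by {op_type} operation {op_id}")
```

The operation ID in the error is the one to look for among the jobs that never finished.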
If you want to just clear out the current problem so that you can
continue running again, please read our knowledge base article,
I get an error: "Solution does not exist in this knowledge base" when clicking the link!
Oh - we found out why the scheduled jobs were running all night long until 30 minutes ago! One of the volumes on the DR filer was removed because the storage people thought we didn't need it anymore. But the SMVI jobs trigger a SnapMirror update, and that was failing all the time.
Again, it would be very nice to know how we could stop these running jobs. As I said, we tried reinstalling SMVI, clearing all sorts of temp files, and rebooting the VC server several times. Those jobs just cannot be killed!
I'm glad you found the cause on your end.
You can clear out ALL running tasks by using the following steps (from that KB article):
SnapManager for VI utilizes an internal database to keep track of these locks and provides persistence across reboots. Simply rebooting the SnapManager for VI host will not clear these locks.
If you want to remove all currently running tasks in SMVI, perform the following:
- Stop SnapManager for VI service.
- Remove the <SMVI dir>/server/crashdb directory.
- Start SnapManager for VI service.
Performing these steps will not affect the scheduled jobs, nor remove them from the interface. It will kill and remove any outstanding or in-process tasks.
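The three steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes a Windows host where `net stop` / `net start` control the service, and the service name and install path in the example call are placeholders, not values from the KB article. The `run` argument is a hypothetical hook so the sequence can be dry-run without touching a real service.

```python
import shutil
import subprocess
from pathlib import Path


def clear_smvi_tasks(smvi_dir, service_name, run=subprocess.run):
    """Stop the service, remove <SMVI dir>/server/crashdb, start the service.

    service_name is an assumption: pass whatever name the SnapManager for VI
    service has on your host. run is injectable so the steps can be dry-run.
    """
    crashdb = Path(smvi_dir) / "server" / "crashdb"
    run(["net", "stop", service_name], check=True)
    try:
        # Deleting this directory discards all outstanding and in-process
        # tasks; per the KB text, scheduled jobs are not affected.
        if crashdb.is_dir():
            shutil.rmtree(crashdb)
    finally:
        # Bring the service back even if the removal failed.
        run(["net", "start", service_name], check=True)
    return crashdb


if __name__ == "__main__":
    # Placeholder values: substitute the real install directory and the
    # real service name before running this on an SMVI host.
    clear_smvi_tasks(r"C:\path\to\SMVI", "REPLACE_WITH_SMVI_SERVICE_NAME")
```

Run it from an elevated prompt, and only when no backup or restore that you care about is in flight, since everything in progress is discarded.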
I have a question/problem with regard to restores. I have successfully backed up a test VM and then deleted it from disk. When I attempt to restore it, I get the following error:
=== CLIENT ===
OS Name=Windows 2003
=== ERROR ===
=== MESSAGE ===
Error restoring backed up entity
=== DETAILS ===
Could not locate or create an initiator group on storage system "vanna01" for ESX server "192.168.16.65". Please ensure the ESX server has one or more initiators logged into the storage system.
=== CORRECTIVE ACTION ===
=== STACK TRACE ===
com.netapp.nmf.smvi.main.SmviErrorDetailException: Error restoring backed up entity
at java.awt.event.InvocationEvent.dispatch(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.run(Unknown Source)
The ESX host in question is listed in VC as "192.168.16.65"
The Initiator group is called ISCSI-VANVS05
The initiator nodename is iqn.1998-01.com.vmware:vanvs05-5ae8d131
The initiator alias is vanvs05.van.davis.ca
The initiator is logged into the storage system
Is there something that is supposed to match up between the host/ip/name used to login via SSH in SMVI and the iGroup that I am not aware of?
I have a customer running SMVI 1.0.1 on a 6080, with hundreds of VMs.
When we run a backup on a datastore holding 15 guests, SMVI creates 1 snapshot per machine, which makes it impossible for the customer to keep the snapshots on the main site for more than a few days (255 max snaps / 15 = 15 days).
The customer got burned in the past by a virus attack that took a while to discover, and wants to retain the snapshots for more than 15 days. Is it possible to make SMVI take 1 snapshot of all 15 machines? (We are aware that all machines will need to be in Hot Backup mode and that this will affect performance for the duration of the backup window.)
SE, Strategic Accounts,
It sounds like your customer is creating separate SMVI backups for each VM. For their case, I would suggest looking at creating a smaller number of backups with more VMs. They could try a single backup of just that datastore. SMVI will, by default, create VMware snapshots for every VM in the datastore, then it will create a single ONTAP snapshot for the datastore(s) involved.
Please be aware that if the VMs are experiencing heavy I/O, it is possible for one or more of the VMware snapshots to fail. In those cases, the VMs that did not complete their VMware snapshots will not be properly backed up. If this occurs, the two resolutions are to either a) disable VMware snapshots, or b) reduce the number of VMs per SMVI backup. If the choice is to reduce the number of VMs per backup, I would start by just dividing them in half and testing.
Appreciate the quick response. Apparently there was a misunderstanding between the customer and me:
He has 17 LUNs (300GB each, 500 VMs total) inside one volume, and has 17 backup jobs (at the datastore level) running one after the other. He doesn't want to back up all the machines in one snapshot, fearing that 400 quiesce requests will overload the servers, and that's why he's getting 17 snaps per backup.
He claims that our best practices for SMVI were the reason he chose the LUN size to be 300GB, and what led to the need for 17 LUNs. I realise that we can ask them to move some of the LUNs to other volumes, but that would require downtime for some of the servers. Is there another way around this issue?
Any idea how we can work around this issue?
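To make the retention arithmetic in this thread concrete, here is a small sketch. It assumes the 255-snapshot-per-volume limit quoted above, one run of every job per day, and nothing else consuming snapshots on the volume; `retention_days` is just an illustrative helper name.

```python
MAX_SNAPSHOTS_PER_VOLUME = 255  # limit quoted earlier in this thread


def retention_days(snapshots_per_day, max_snapshots=MAX_SNAPSHOTS_PER_VOLUME):
    """Whole days of history that fit before the volume runs out of snapshots."""
    return max_snapshots // snapshots_per_day


# 17 datastore-level jobs in one volume, each creating one snapshot per day.
print(retention_days(17))  # 15

# Hypothetical groupings of the same 17 datastores into fewer jobs.
for jobs_per_day in (9, 4, 1):
    print(jobs_per_day, "jobs/day ->", retention_days(jobs_per_day), "days")
```

Every job that shares the volume consumes one of the 255 slots per run, so the number of days retained drops in proportion to the number of jobs, not the number of VMs.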