Win1, win2, win3, win8 and win13 - Resolved, moved machines to other nodes

Very sorry guys, this is my fault. I was doing maintenance on the iSCSI storage, and one of the servers there had a link to it. For some reason, when that link went down it forced the node to reboot with an error. All but win1 are up (win1 had the iSCSI link to the storage I was doing maintenance on).

The iSCSI was not critical to the operation of these servers, but it does seem it was important enough to make the node fault.
 
:( The node rebooted again from this. iSCSI will be up very shortly; working on it at highest priority.
 
Crud, it just rebooted again. iSCSI is fully up now, and it should be fully up at boot as well when the node comes live again, so iSCSI can be ruled out if it reboots again this time. Doesn't look too good right now.
 
Looking at the node side, I'm not seeing any alerts from the built-in controls, and temps are really good at 26C internally. Node is up again now, watching with an eagle eye.
 
This can't be a virtual machine causing this issue, it has to be something else. Node just rebooted again. RAID subsystems all good, iSCSI good, nothing can be faulted there.
 
I am still debugging the issue. I shut down the node, reset all the monitoring values, and am starting it back up now. Not seeing any ECC errors or other things.
 
There is something pretty badly wrong with the node; it just goes to a black screen when it stops replying. We are moving all impacted servers to a different node, and we will do everything possible to pull them from the live data on this node.
 
We currently have the node down while we plan our attack on moving that data. If it keeps rebooting like this we'll see corruption occur, and we don't want that at all.
 
This is still going to be quite a while, probably looking at 3-6 hours depending on the speed of the file copies. We are working to move it all as fast as we can. About to try moving some via the network while the servers are offline, at least giving a shot at 2x the speed by running two copies at once.

The good news is that we will have a huge upgrade in the node OS once it is done, allowing such things as live migrations, as we'll be moved to a Windows 2012 based node, not 2008 R1 sp1.
 
Well, not good news but news: in safe mode and in safe mode with networking, it is rebooting every time on trying to copy a file. Pulled the server down to the workbench and working on it that way. Pulled RAM out and running with minimum RAM, and it just rebooted yet again there. About to change out the RAM sticks now, still staying small, so we can try to copy.
 
We've been up about 30 min (about the time of the last post) with new RAM swapped in, and it seems to be at least copying data properly now, a solid step. Once I get one VM copied I will expand the operation to network copying as well as an HD copy; then we will be able to start getting clients up and running ASAP at least.
 
I can report we are finally stable on getting copies off, and should have some VMs coming back up in the coming hourish!
 
Windows says 26 minutes left on the file copy for Win8 server now, hoping that is accurate!
 
I went ahead and made win8 live; I will get the logs attached back to it and logging ASAP. This is a temporary measure, and it may be slightly slower while the logs are not attached.
 