We are seeing packet loss on some servers, resolved

Stephen (US Operations, Staff member)
We are investigating some packet loss that is appearing randomly on servers across the network. So far it doesn't follow any pattern of a particular switch or subnet, so we are continuing to dig into it.
 
MSSQL12's public interface is currently down, as it appears to be under some sort of attack or issue; the other servers have been working well since its interface was disabled.
 
It looks like we've identified the issue: a DDoS attack that has been hitting a VPS-hosted site for a couple of days and had been fought off, but is now expanding to target other things. We've shut that VPS off and everything is looking a lot better, but we are still keeping a close watch.
 
We are seeing the attacks grow even against the downed IP, and we are working to have it null routed now. We are also working with the client to move some of their sites that aren't being attacked, but keeping them on that IP is not an option because these attacks are impacting other portions of the network too often.
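For anyone wondering what "null routing" means in practice: it simply tells a router or host to silently drop every packet destined for the attacked address. A rough sketch of the idea, assuming a Linux box with iproute2 (not necessarily the gear actually in use at our edge, and with a placeholder address from the documentation range):

```python
# Sketch only: install a blackhole (null) route on a Linux host using
# iproute2, so the kernel drops anything destined for the attacked address.
# The IP below is a documentation placeholder, not the real target.
import subprocess

ATTACKED_IP = "192.0.2.10"  # hypothetical placeholder

subprocess.run(
    ["ip", "route", "add", "blackhole", f"{ATTACKED_IP}/32"],
    check=True,
)
print(f"Null route installed for {ATTACKED_IP}")
```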
 
There are occasional momentary issues with SQL database access on a few servers because of this; VZSQL and MSSQL12 are the two I am aware of right now.
 
We got the null route in place, and it looked good for a bit, but we are seeing a bit more packet loss now. Still checking on it.
 
This is very frustrating to all of us, and we know it is to you as well. It isn't impacting every server, but a couple of hardware virtualization nodes and a couple of SQL servers (MySQL and MS SQL) are affected; PCS2, PSBM5, VZSQL, and MSSQL12 are seeing the most packet loss at this time.
 
I am now heading onsite to see if there is anything I am missing that may help here. We've just gotten off the phone; since the null route, traffic levels have been completely normal, yet we are still seeing packet loss occur regularly but sporadically.
 
I am going to reboot the core distribution switch in about 20-30 minutes. I am seeing a number of errors on it and want to reboot to fully clear them out; it looks like they are tied to the earlier DDoS attacks sending some bad data. We will be on standby so that if it turns out to be a larger issue with the switch itself, we can quickly swap it out and get back up and running.
 
The switch is back up after the reboot, but the packet loss continues, so we have at least eliminated the switch as a possible cause.
 
We have been monitoring and are still working on this. We are seeing somewhere between 0% and 3% packet loss in any 15-minute period, so it is better now and less impacting, but we are still trying to bring it back down to zero packet loss.
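For reference, figures like "0 to 3% in a 15-minute period" come from sending a steady stream of pings and counting how many go unanswered in each window. A minimal sketch of that kind of check, assuming the standard Linux ping command and a placeholder host (this is not our actual monitoring system):

```python
# Sketch: estimate packet loss per 15-minute window by sending one ping
# per second and counting failures. Host and window size are placeholders.
import subprocess
import time

HOST = "example.com"        # placeholder target, not one of our servers
WINDOW_SECONDS = 15 * 60    # 15-minute measurement window

sent = lost = 0
window_start = time.time()

while True:
    # "ping -c 1 -W 1" sends a single echo request with a 1-second timeout (Linux ping).
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", HOST],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    sent += 1
    if result.returncode != 0:
        lost += 1

    if time.time() - window_start >= WINDOW_SECONDS:
        print(f"{lost}/{sent} lost = {100.0 * lost / sent:.1f}% loss this window")
        sent = lost = 0
        window_start = time.time()

    time.sleep(1)
```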
 
Well, after 3 solid hours with no issues and no packet loss at any level, we have packets dropping again on a few servers.
 
We've made some optimizations to how certain packets are handled, and VZSQL1/MSSQL12 are both working quite well now. We are still seeing some issues on the old Virtuozzo subnet, from which we have been moving some containers over to Parallels Cloud Server.
 
During the night (well, it still is night, except for the part where I took a "sleep" of 4 hours) I ran a continuous ping from my home cable/wifi connection, and while a few packets were lost on the wifi link itself, the total came out to "0%" in the end because of the vast number of successful pings.
 
Just as a comprehensive update: we have found what is happening, and while we don't yet know quite how to block it outright, we do know how to keep it from being as bad. It is an amplification attack against internal networks, driven by forged packets from the outside. The forged packets cause some of Windows' network auto-discovery components to respond in large volumes, and those responses are what create the packet loss.
On the virtual server side (Parallels Virtuozzo and Cloud Server) we are still seeing occasional packet loss. When it happens now, we are able to identify where it is coming from and stop it. Anyone for whom we do not have working login information, and who has not responded to our recent requests regarding security updates, may find their password reset to the default we set in the welcome email, so that we can get in and shut down the attackers' amplification method of causing havoc.
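As an illustration of what "turning it off" involves on an affected VM (the exact Windows component being abused isn't named above, so SSDP Discovery and LLMNR are shown here only as two common auto-discovery responders, not as our confirmed procedure):

```python
# Sketch: disable two common Windows discovery responders on a guest VM.
# Run as Administrator. SSDP Discovery and LLMNR are examples only; the
# component actually being abused in this incident isn't specified.
import subprocess

def run(cmd):
    print(">", " ".join(cmd))
    subprocess.run(cmd)  # no check=True: "sc stop" fails harmlessly if already stopped

# Stop and disable the SSDP Discovery service (UPnP/SSDP responder).
run(["sc", "stop", "SSDPSRV"])
run(["sc", "config", "SSDPSRV", "start=", "disabled"])

# Turn off LLMNR (link-local multicast name resolution) via its policy registry key.
run([
    "reg", "add",
    r"HKLM\SOFTWARE\Policies\Microsoft\Windows NT\DNSClient",
    "/v", "EnableMulticast", "/t", "REG_DWORD", "/d", "0", "/f",
])
```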
 
We are seeing some packet loss right now; we can also see where the reflection is coming from and are just trying to get in to turn it off.
 
The current one is resolved, at least until the next VM starts responding to these requests. We are trying to be very proactive and disable the auto-response before it actively causes problems, but we can only catch a VM once some auto-response from it comes in; we just hope it is not on a large scale by the time we see it and stop it. Sometimes it is already a large, non-stop stream of replies, and that is what creates the packet loss for others.
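One way to catch responders proactively is to probe the subnet ourselves and see which VMs answer discovery requests before an attacker finds them. A rough sketch using an SSDP M-SEARCH probe (assuming SSDP is the discovery protocol involved, which the notes above don't confirm):

```python
# Sketch: list hosts on the local network that answer SSDP discovery probes,
# so their responders can be turned off proactively. Illustrative only; the
# real attack traffic may use a different discovery protocol.
import socket

MSEARCH = (
    "M-SEARCH * HTTP/1.1\r\n"
    "HOST: 239.255.255.250:1900\r\n"
    'MAN: "ssdp:discover"\r\n'
    "MX: 2\r\n"
    "ST: ssdp:all\r\n"
    "\r\n"
).encode()

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.settimeout(3)
sock.sendto(MSEARCH, ("239.255.255.250", 1900))

responders = set()
try:
    while True:
        data, addr = sock.recvfrom(65535)
        responders.add(addr[0])
except socket.timeout:
    pass

for ip in sorted(responders):
    print("SSDP responder:", ip)
```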
 
Since the PCS4 node issue and the servers coming back live there, we are seeing some packet loss due to the auto-discovery functions. We are working to log in to the VMs and turn it off on each one as quickly as possible; sometimes the packet loss itself makes it hard for us to log in.
 
I forgot to update this, but the network has been packet-loss free since about 10 minutes after the last post.
 