IIS failure alert?

After Sunday's maintenance window my VPS server came back up with a crappy IIS service -- some sites were okay, others just hung. I had to wait for a client to call me to point this out before I realized what was going on, and b/c of the nature of the problem I had to get jodohost tech support's help in troubleshooting the issue (always great tech support, of course, but nicer to not have the trouble in the first place).

I'd like to be able to sleep even on nights when my VPS server is in the shop for a service overhaul, so my question is this: does anyone know of a good tool that would monitor IIS and other services on the server and send out some kind of ping (SMS, email, carrier pigeon) to alert me that my server is behaving badly? I appreciate that a server-based service can't fire if the server is completely busted, but if it's running on two cylinders perhaps an OS-level app could still do the job?
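For what it's worth, an OS-level check along these lines can be sketched in a few lines of Python run from Task Scheduler. This is only a rough sketch under assumptions: it shells out to the built-in `sc query` command, `W3SVC` is the IIS web service name, and the SMTP host and email address are placeholders rather than real values.

```python
import smtplib
import subprocess
from email.message import EmailMessage

# Service names to watch; W3SVC is the IIS web service. Adjust to taste.
SERVICES = ["W3SVC"]


def parse_state(sc_output):
    """Return the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    for line in sc_output.splitlines():
        if line.strip().startswith("STATE"):
            # Line looks like: "STATE              : 4  RUNNING"
            return line.split(":", 1)[1].split()[1]
    return "UNKNOWN"


def service_state(name):
    """Ask Windows for the current state of one service."""
    result = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return parse_state(result.stdout)


def send_alert(body, smtp_host="mail.example.com", to_addr="admin@example.com"):
    """Email the alert. Host and address are placeholders, not real values."""
    msg = EmailMessage()
    msg["Subject"] = "Service alert"
    msg["From"] = to_addr
    msg["To"] = to_addr
    msg.set_content(body)
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)


if __name__ == "__main__":
    down = [s for s in SERVICES if service_state(s) != "RUNNING"]
    if down:
        send_alert("Not running: " + ", ".join(down))
```

One caveat: a service can report RUNNING while individual sites hang, which is what happened here, so a check like this catches a dead service but not a sick one.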

I've got a couple of sites set up with http-based site monitors, but those things misfire all the time, so not really as reliable as a server-level service.
 
It's odd that some sites would work and others not. Did you find anything logged in Event Viewer? There are a few "remote" event viewer tools that could email you in the event of a similar log entry, matched on some keywords.
 
Stephen,

Looking over log files, I find this warning: "A process serving application pool 'plesk(default)(pool)' failed to respond to a ping. The process id was ..." for a whole bunch of processes -- I'm assuming those were the sites on the server that failed to come back up after the maintenance. But it doesn't tell me anything about WHY they failed. There's no other indication that IIS encountered trouble during the reboot. I'm surprised that Windows only considers the ping failure worthy of a warning in the log, not an error -- but maybe there really was no error reported by IIS during startup, just a ping failure when the botched sites were first hit? There's nothing special about the sites that failed -- they don't specifically rely on CF or MySQL or have anything much in common (except, of course, they're challenging clients of mine who'll spend the day giving me grief about the AM downtime... :( )
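The keyword idea suggested earlier in the thread could be applied to a warning like the one quoted above. A minimal sketch, assuming the event messages have already been pulled out of the log as plain strings (how they get exported is left open here, and the keyword list is just an illustration):

```python
# Keywords worth an alert; an illustrative choice, not a vetted list.
KEYWORDS = ["failed to respond to a ping"]


def matches(message, keywords=KEYWORDS):
    """Return the keywords found in one event message (case-insensitive)."""
    text = message.lower()
    return [k for k in keywords if k.lower() in text]


def alerts_for(messages, keywords=KEYWORDS):
    """Keep only the messages that contain at least one keyword."""
    return [m for m in messages if matches(m, keywords)]
```

Because the match is on message text rather than severity, it would still fire even though Windows logs the ping failure as a warning and not an error.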

Also, ColdFusion MX7 (of course) failed to come back up after the reboot this AM; digging thru the logs I find: "The ColdFusion MX 7 Application Server service could not be started within 240 seconds. Increase the server startup timeout value using C:\CFusionMX7\runtime\bin\jrunsvc.exe -starttimeout <seconds> "ColdFusion MX 7 Application Server"." I couldn't get the service started thru Virtuozzo, either, but Akshay was able to do it thru the VPS eventually. I'm going to try to increase that timeout value and see if it works next time the server reboots; but I would've thunk 240 seconds would be enough time even for a monster like MX7 to get up and running, no?
 
I have seen the ping failures happen; in fact we occasionally have it happen on the real servers.

As for the CFMX startup, 240 seconds should be plenty. I'd check your Java settings and make sure the RAM limit is not set to the 512 MB that I believe is the default; if it is trying to launch a large process, it may be dying and getting stuck in a re-spawn loop. If it were up to me, I'd set about 96 MB default size and 160 MB max size for CF on a VPS.
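As a rough illustration of that change, here is a small Python helper that rewrites the heap flags on a JVM arguments line. It rests on a few assumptions that are not from this thread: that the heap is set through a `java.args=` line in ColdFusion's JVM config file, that "default size" maps to the JVM's `-Xms` flag and "max size" to `-Xmx`, and the helper name `set_heap` is made up for the example.

```python
import re


def set_heap(java_args_line, initial_mb, max_mb):
    """Return the java.args line with -Xms/-Xmx set to the given sizes."""
    # Drop any existing heap flags so they are not duplicated.
    cleaned = re.sub(r"\s*-Xm[sx]\S+", "", java_args_line)
    key, _, args = cleaned.partition("=")
    # Append the new flags after whatever other options were present.
    parts = args.split() + [f"-Xms{initial_mb}m", f"-Xmx{max_mb}m"]
    return key + "=" + " ".join(parts)
```

For example, `set_heap("java.args=-server -Xmx512m", 96, 160)` gives `java.args=-server -Xms96m -Xmx160m`. Back up the config file before editing, and restart the CF service afterwards for the new sizes to take effect.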
 
...in fact we occasionally have it happen on the real servers.

<sobbing>Makes me feel like I'm on some kind of "fake" server... </sobbing>

Thx for the CF info, I'll tweak and see what happens. I'm not pushing CF that hard, so the lesser RAM settings should hold.
 
Haha, not what I meant at all; it's just that it is a virtual private server.
In fact, we run jodohost.com on winvps1 :)
 
ipcheck is good for local service (we run it on outside servers, checking the internal ones each minute). For an outsourced service, look at alertra or hyperspin (we used hyperspin a while back, but they had some bad false alerts that caused us to drop them; according to them, they have since improved).
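For anyone who would rather roll their own, the once-a-minute outside check can be sketched in Python. This is not how ipcheck or the hosted services work internally; it is just a minimal sketch that assumes a plain HTTP GET is a good enough health test, and it only alerts after several failures in a row to cut down on the false alerts mentioned above. The `notify` callback, the interval, and the threshold are placeholders.

```python
import time
import urllib.request


def site_ok(url, timeout=10):
    """True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


def update_streak(streak, ok):
    """Count consecutive failures; any success resets the count."""
    return 0 if ok else streak + 1


def watch(url, notify, interval=60, threshold=3):
    """Check once per interval; notify once when failures reach threshold."""
    streak = 0
    while True:
        streak = update_streak(streak, site_ok(url))
        if streak == threshold:
            notify(f"{url} failed {threshold} checks in a row")
        time.sleep(interval)
```

Run it from a machine outside the VPS, one `watch` per site, so a hung application pool on one site still gets noticed while the others answer normally.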
 