
Windows Server 2003 R2 64-bit Terminal Servers Seizing Up


We currently run three Windows Server 2003 R2 64-bit terminal servers that all of our employees work on, and we have done so for a number of years. Since 2011 we have been running them in a virtual environment (VMware ESXi 4.1, migrating to 5.1 soon). All in all, they have worked well and have been a great work environment for our employees. Recently, however, we started having issues out of nowhere.

On May 20th of this year, one of our terminal servers suddenly seized up. All employees working on this server saw their sessions stop responding entirely. No new logins to the server could be made, even at the console. Attempts to connect remotely with Event Viewer or Performance Monitor failed. Interestingly, remotely browsing the file system works fine, as does connecting to view the list of services (though starting and stopping services doesn't work). The bottom line is that the server becomes completely unresponsive, and all I can do is a hard shutdown of the server followed by booting it back up.

Here is a list of my observations regarding this issue:

* Since that date, this has happened multiple times and to all 3 of our terminal servers. Sometimes it goes a couple of days between incidents, sometimes a couple of weeks (we actually just went two weeks without any of this and then it happened again yesterday).

* Event Viewer looks clean when I get the server back up. No issues are logged, so whatever is causing this doesn't leave anything in the event logs.

* I have scoured the logs of our VMware ESXi server and things look mostly normal there.  No indications of abnormal events.

* I have moved the virtual machines to an entirely different VMware ESXi host and the issue still happens, so this helped rule out hardware issues on the ESXi host.

* No new updates have been made to the ESXi host servers--they have been running the same software and update revisions for quite some time now.

* I removed the most recent Windows updates we had installed before the first event on May 20th (as best I could determine what had recently been installed), but to no avail. The issue still popped up.

* The terminal server doesn't seize up for everyone at once. A few users will first report that their sessions have frozen, and usually by this time I can't log in at the console or make any new remote desktop connection. However, some users may still be able to work for a couple more minutes. It takes maybe 3-5 minutes from the first report before every last person is fully seized up.

* We have the latest service pack and Windows updates (minus the recently installed ones I removed after the first event).

* We run the latest UPHClean software (version 2.0.49 beta, the most current that works on 64-bit terminal servers). Unfortunately, the most recent non-beta version doesn't work on 64-bit Windows.

* The issue can happen no matter the load on the terminal server.  One of our terminal servers has half the number of people on it as the other two and roughly half the load.  The CPU usage and memory can be quite a bit lower than the maximum allocated for the machine and the issue will still happen.

* The issue doesn't seem to be tied to the length of terminal server uptime.  The first incident happened within 24 hours of a terminal server reboot.  This last stint has been two weeks of no seizing up (though right now we reboot our terminal servers weekly in off hours through a scheduled task).

* This KB article best describes the issue, except that it's from 2003 and, I think, pre-SP1. All the relevant files I currently have are newer, and I couldn't install this hotfix if I wanted to (I think it's rolled into current SPs / updates): http://support.microsoft.com/kb/832971?wa=wsignin1.0

* Another note on hardware activity. Once everyone has seized up and nobody is working anymore, the hardware monitor on the ESXi host shows nearly all resource usage for the virtual machine dropping to very little, except that CPU usage never flatlines. There is still a combined usage of a couple of GHz across the 4 cores, which indicates something is still running on the terminal server and using CPU cycles. It isn't anywhere near pegged, but the CPU activity stays constant after everything has stopped functioning. I've noticed this every time one of our terminal servers has locked up, and the level is roughly the same on whichever server seizes up.
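
Since the onset is gradual (3-5 minutes) and nothing lands in the event logs, one option is to timestamp each incident from another machine. Below is a minimal sketch of such a probe, assuming Python is available on a separate monitoring workstation. The function names, the log file name, and the host names in the comment are all hypothetical. It only checks whether a TCP connection to the RDP port (3389) can be opened. Because file browsing still works during a lockup, the TCP connection may well keep succeeding even when logins hang, so treat this as a rough signal at best.

```python
import socket
import time
from datetime import datetime


def port_is_open(host, port=3389, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe_once(hosts, port=3389, timeout=3.0):
    """Check every host once; return a list of (timestamp, host, reachable)."""
    results = []
    for host in hosts:
        stamp = datetime.now().isoformat(timespec="seconds")
        results.append((stamp, host, port_is_open(host, port, timeout)))
    return results


def run(hosts, interval=30, logfile="rdp_probe.log"):
    """Append one log line per host every `interval` seconds until interrupted."""
    while True:
        with open(logfile, "a") as fh:
            for stamp, host, ok in probe_once(hosts):
                fh.write(f"{stamp}\t{host}\t{'up' if ok else 'DOWN'}\n")
        time.sleep(interval)


# Example call (hypothetical host names, replace with the real servers):
# run(["ts1.example.local", "ts2.example.local", "ts3.example.local"])
```

The resulting log could then be lined up against the times users first report frozen sessions and against the ESXi performance charts.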

I'm sure I have made other observations that I don't have readily available. I am completely stumped and have no idea what more to do. There are no leading indicators that it's going to happen; it's sudden and out of the blue. I've been searching high and low for a pattern. I've gone through all of our software to see whether a particular program or a software update might be causing it, but I've found nothing so far. After yesterday's lockup I spent a couple of hours last night trying to get into the server to see what was going on (I moved all employees on that server to a standby server so they could work while I investigated). I couldn't get anywhere near identifying the culprit.

I would be truly grateful to anyone who can offer insight or help with this problem. I am going to continue looking into other pieces of 3rd party software, and maybe printers as well (though nothing has been updated there in quite some time either). I'm grasping at straws to find what suddenly brought this on.

Thanks.

