Server Issue Today

Status
Not open for further replies.

Scott Greczkowski

As you might have noticed, our servers went nuts today just as we started our Dish Network Retailer Recap.

The load on all of our servers went through the roof without rhyme or reason. We restarted the services and then the servers themselves, but the problem remained.

Because I was busy doing the Retailer Chat Recap (and grabbing screenshots) and LER was busy with his real job, I made the decision to throw us into "Pat Battle Mode" (O&A fans will love that) so that I could update everyone on what's happening in the retailer chat.

After the chat was done, whatever was going on with our servers was gone and server loads were back to normal.

The speed is not back to 100% yet because our servers are verifying their RAID arrays, and that verification will take a while.

Tonight LER will go through the logs to find out what happened and what we can do to keep this from happening again.

Thanks for your patience and understanding.
 
"Pat Battle Mode"? I'm familiar with "Franchise Mode" and "Serious Questions Only Mode", but not that one.
 
Well, those Battle Mode updates were very useful, and I appreciate the work you did (even if the actual updates themselves were infuriating).
 
I had a choice: try fixing the server or do the uplink report. Since I knew most people were coming for the uplink report, I decided to do that instead of fixing the server.

Speaking of fixing the server, I have the vBulletin Tiger Team looking into our issues, which are still ongoing. Because of this, there may be periods where you receive database errors or the site is unavailable altogether. Do not be alarmed, as we are working to fix things for you. :)

Thanks again for your support!
 
Wow, I saw a "Page Generated" time of 57+ seconds! That's a first (and I hope last!)... This page generated in under a second, so thanks for all of your and LER's efforts to keep the lights on.

Maybe you need more kibble for the mouse that turns the wheel?
 
Nah, we just need my customers to stop having Sev 1's that require me to build 3 new AIX LPAR'S in 2 days, and not have a release of a SaaS product the same week...........
 
Nah, we just need my customers to stop having Sev 1's that require me to build 3 new AIX LPAR'S in 2 days, and not have a release of a SaaS product the same week...........
Have you been hanging around with P Smith? Decoder needed for us common folk. :D
 
Let's decode:
1) 2 different customers of mine (well, my full-time employer) had issues that have our Engineering team scrambling
2) I needed to build 3 new AIX (IBM Unix) VMs (virtual machines), a.k.a. LPARs (logical partitions), for them to reproduce the issues on.

That took me until after 10 pm last night, and today I'm catching up.

Basically, I'm swamped......
 
Thanks Ler, now if we could get one of those decoders for P Smith life would be good. :D :D

BTW, they are now working on the server. I have been trying to see what they are doing, but when I try anything I get the following message from the server...

fork: Resource temporarily unavailable

Damn no peeking. :D
 
They are still working on the servers now. They mentioned that they may have to reboot the servers a few times as they work.

I am headed to bed after a very long day. I hope that when I wake up in the morning, all the server issues we had today will be a thing of the past. :)

The servers we run on are overkill for our site, so they should be running better than they are now.

Thanks for your patience and support!
 
Looks to be working perfectly now.
 
Pages are indeed popping right up. However we are not at our normal user load, so I will keep an eye on it today.
 
the bottom of the page said:
Copyright 2003 - 2009 SatelliteGuys Incorporated- All Rights Reserved
Page generated in 0.23403 seconds with 18 queries
:up:up:up
Scott Greczkowski said:
Pages are indeed popping right up. However we are not at our normal user load, so I will keep an eye on it today.
Yeah, let Dish light up all the HD we've been waiting for, then let's see what happens to the server load...
 
Actually, looking now as we start getting normal traffic, something is still wrong.

Normally our server load is under 2.0 but as I am writing this here are the loads...

[Server Loads: 5.22 5.46 : 4.65]

This concerns me as we are not at our peak traffic period of the day yet.

All was working great until yesterday just after noon. Then our load jumped from 1.47 to 25.75, kept climbing, and was up to about 100 when I killed the server and flipped us to just the HTML page. We were not doing anything to the server at the time, and nothing had changed on the server in a few days. LER did the database server update the other night, but that does not seem to be the issue, as the database server's load averages are currently 0.34, 0.53, 0.54.

I have notified the server tuning team of the high load I am seeing now. Even with that abnormal load, pages are still snapping up nicely.
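
For anyone who wants to watch the same numbers, here is a minimal sketch of the check described above: parse a "Server Loads" footer line and flag it when any figure is above the "under 2.0" level mentioned in this post. The function names and the idea of treating 2.0 as a hard ceiling are assumptions made for illustration, not how the site actually monitors itself.

```python
import os
import re

# Assumption: 2.0 is used as the ceiling because the post says the
# load is normally under 2.0. It is not an official threshold.
NORMAL_CEILING = 2.0


def parse_load_line(line):
    """Pull the three load figures out of a footer such as
    '[Server Loads: 5.22 5.46 : 4.65]'."""
    values = tuple(float(v) for v in re.findall(r"\d+(?:\.\d+)?", line))
    if len(values) != 3:
        raise ValueError(f"expected three load figures, got {len(values)}")
    return values


def is_abnormal(loads, ceiling=NORMAL_CEILING):
    """Return True if any of the load figures is above the ceiling."""
    return any(value > ceiling for value in loads)


def current_loads():
    """Read the live load averages from the local machine (Unix only)."""
    return os.getloadavg()
```

With the figures quoted in this thread, `is_abnormal(parse_load_line("[Server Loads: 5.22 5.46 : 4.65]"))` is true for the web server, while the database server's `(0.34, 0.53, 0.54)` is not flagged.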
 
Not good: server load continues to climb as traffic reaches normal user levels.

[Server Loads: 7.36 7.49 : 6.86] :(
 
Just got an update from the server tuning guys...

So, after careful examination of your server, I found that your server is under Syn TCP Flooding attack intensifying from time to time and that's why the CPU load sometimes jumps to over 10.

I have contacted the ISP who is now on the case. :)
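
As a rough illustration of how a SYN flood like the one reported above can show up on a Linux box, the sketch below counts half-open (SYN_RECV) connections per remote address from text in the `/proc/net/tcp` format. This is not what the tuning team ran (the thread does not say), and it assumes a Linux host, IPv4 only, and a little-endian machine for decoding the hex addresses. The function names are made up for this example.

```python
import socket
import struct
from collections import Counter

TCP_SYN_RECV = 0x03  # Linux state code for a half-open connection


def hex_to_ipv4(hex_addr):
    """Decode an address such as '0100007F' (127.0.0.1).
    Assumes the little-endian layout used on x86 hosts."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))


def count_half_open(proc_net_tcp_text):
    """Count SYN_RECV sockets per remote IPv4 address."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or truncated line
        # fields: slot, local addr:port, remote addr:port, state, ...
        if int(fields[3], 16) == TCP_SYN_RECV:
            remote_hex = fields[2].split(":")[0]
            counts[hex_to_ipv4(remote_hex)] += 1
    return counts
```

On a live server the input would come from reading `/proc/net/tcp`. A large and growing number of half-open entries is one hint of a SYN flood, though spoofed source addresses usually mean the per-address counts say little about who is really sending the traffic.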
 