Lots of Database errors

Status
Not open for further replies.

yaz96

Baby, It's Cold Outside
Dec 22, 2005
12,829
1
Front Range, Colorado
When trying to switch between pages, I've been getting a lot of database errors today.

And the site could not be found a few times over the past couple of days.

Thought I'd let you know.
 

Scott Greczkowski

Welcome HOME!
Staff member
HERE TO HELP YOU!
Cutting Edge
Sep 7, 2003
102,775
26,458
Newington, CT
The database errors were due to our ISP accidentally unplugging us from the database server when they were looking at something for me.

Since late Thursday night we have been having a problem with our web server. At times the site would just die and the screen would fill with page fault errors.

Larry suspected something wrong with one of our drives early on, but our RAID cards reported no problems... until this morning, when one reported that one of the drives had bad sectors on it. Now I know which drive is going bad. I ordered a new drive, which is being sent overnight to Dallas, and the bad drive will be replaced tomorrow.

Hopefully this fixes the issues we have been seeing. This is the second drive to fail on it this year. So much for server grade drives. :)
 

Scott Greczkowski

Welcome HOME!
Staff member
HERE TO HELP YOU!
Cutting Edge
Sep 7, 2003
102,775
26,458
Newington, CT
Yup, the database errors started at 5 and were fixed as soon as I was notified of the issue. :)

The other issue, as I said, has been going on since Thursday night.
 

Scott Greczkowski

Welcome HOME!
Staff member
HERE TO HELP YOU!
Cutting Edge
Sep 7, 2003
102,775
26,458
Newington, CT
No, only images are sent from the cloud servers.

The issue is not the RAID server; the issue is the hard drives failing. They are mechanical, so they will fail. Got to love these "Server Grade" drives. :)
 

Scott Greczkowski

Welcome HOME!
Staff member
HERE TO HELP YOU!
Cutting Edge
Sep 7, 2003
102,775
26,458
Newington, CT
They are not big enough yet.

Besides, I have had 2 SSD drives fail on me due to controller board failure... They died faster than a standard drive.
 

Hall

SatelliteGuys Master
Feb 14, 2004
18,409
3,199
Germantown OH
Besides, I have had 2 SSD drives fail on me due to controller board failure... They died faster than a standard drive.
And with a "mechanical" hard drive, you likely have signs that it's starting to fail. With an SSD, it probably just goes *poof* one day and it's gone.
 

Scott Greczkowski

Welcome HOME!
Staff member
HERE TO HELP YOU!
Cutting Edge
Sep 7, 2003
102,775
26,458
Newington, CT
Yup, that's true, Hall... (I use SSD drives on the computer in my truck, since the vibration of the truck quickly killed my regular hard drives.)

With a big RAID 10 array like we have, we did not know which drive was bad until the RAID card reported that one of the drives was starting to have sector errors.

Until then we couldn't do anything, as we did not know which drive to replace in the array.
 

scooby2

Pub Member / Supporter
Jun 25, 2005
659
0
Chicago, IL
What kind of RAID card? Some allow you to check the predictive failure and SMART variables of the drives. I have some nice Nagios plugins to throw warnings when drives believe they are going to fail. When they start failing or completely fail it throws a critical.
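For anyone curious what a check like that might look like, here is a minimal sketch in Python. It is not any particular existing plugin: the function names, parameters, and the warning threshold are all made up for illustration, and it assumes the SMART values have already been read from the controller or drive by some other means. The only part that is standard is the Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# How we rank statuses when combining drives (an assumption of this sketch):
# CRITICAL beats WARNING, which beats UNKNOWN, which beats OK.
SEVERITY = {OK: 0, UNKNOWN: 1, WARNING: 2, CRITICAL: 3}


def check_drive(name, health_passed, reallocated_sectors, pending_sectors,
                warn_threshold=1):
    """Return (exit_code, message) for one drive.

    All inputs are hypothetical; a real plugin would fill them in from the
    RAID controller's CLI or from SMART data. health_passed is True/False,
    or None when no SMART data could be read.
    """
    if health_passed is None:
        return UNKNOWN, f"UNKNOWN - {name}: no SMART data"
    if not health_passed:
        return CRITICAL, f"CRITICAL - {name}: SMART health check failed"
    bad = reallocated_sectors + pending_sectors
    if bad >= warn_threshold:
        return WARNING, f"WARNING - {name}: {bad} reallocated/pending sectors"
    return OK, f"OK - {name}: healthy"


def check_array(drives):
    """Check every drive and report the worst status for the whole array."""
    results = [check_drive(**d) for d in drives]
    worst = max((code for code, _ in results), key=SEVERITY.get)
    return worst, "; ".join(msg for _, msg in results)
```

A real plugin would print the message and exit with the returned code, so a drive with a growing bad-sector count shows up as a warning before the health check fails outright.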
 

Scott Greczkowski

Welcome HOME!
Staff member
HERE TO HELP YOU!
Cutting Edge
Sep 7, 2003
102,775
26,458
Newington, CT
They are 3WARE cards. (9500s I believe)

When the problem first started happening, LER suspected a drive or RAID issue because of all the page fault errors we were getting. The problem was that we didn't know what was wrong until a few days later, when the RAID controller noticed the drive had bad sectors and was trying to fix it.

That is when we finally knew exactly what the problem was and what drive to replace. :)
 

Foxbat

Addicted to new HW
Supporting Founder
Pub Member / Supporter
Lifetime Supporter
Nov 25, 2003
20,800
14,612
Michiana
With a big RAID 10 array like we have, we did not know which drive was bad until the RAID card reported that one of the drives was starting to have sector errors.

Until then we couldn't do anything, as we did not know which drive to replace in the array.
Why RAID 10? I assume the RAID controller is striping mirrored sets of drives, because the other way (mirroring stripe sets) is more likely to fail (I should know!). If you have enough room and money, you set up a hot-swap spare that can take over for a drive that starts throwing errors. Whatever the situation, one failing drive in a RAID array should never cause the system to "see" the error at the file or application level. Losing one drive in a RAID array (except for RAID-0) should still leave the system in a usable, albeit vulnerable, state.
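To put a rough number on why striping mirrors beats mirroring stripes, here is a small Python sketch that enumerates every possible two-drive failure for both layouts. The 8-drive size is an assumption picked for illustration (the thread does not say how many drives are in the array), and the model is simplified: in the mirrored-stripes layout, a stripe set is treated as lost as soon as any one of its drives fails.

```python
from fractions import Fraction
from itertools import combinations


def survives_raid10(failed, n_pairs):
    """Stripe of mirrors: drives 2k and 2k+1 form mirror pair k.

    The array is lost only if both drives of the same pair fail.
    """
    return not any({2 * k, 2 * k + 1} <= failed for k in range(n_pairs))


def survives_raid01(failed, n_pairs):
    """Mirror of stripes: drives 0..n-1 are stripe set A, n..2n-1 are set B.

    Simplified model: one failed drive takes its whole stripe set offline,
    so the array is lost once both sets contain a failed drive.
    """
    a_lost = any(d < n_pairs for d in failed)
    b_lost = any(d >= n_pairs for d in failed)
    return not (a_lost and b_lost)


def two_failure_survival(survives, n_pairs):
    """Fraction of all two-drive failure combinations the array survives."""
    combos = list(combinations(range(2 * n_pairs), 2))
    ok = sum(1 for pair in combos if survives(set(pair), n_pairs))
    return Fraction(ok, len(combos))


if __name__ == "__main__":
    n = 4  # assumed: 8 drives total
    print("RAID 10 :", two_failure_survival(survives_raid10, n))
    print("RAID 0+1:", two_failure_survival(survives_raid01, n))
```

Under those assumptions, 8 drives in RAID 10 survive 6 out of 7 possible double failures, while the same drives as mirrored stripe sets survive only 3 out of 7. Either way, a single failed drive leaves the array running but exposed, which is the point about replacing it quickly.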
 