Wednesday 3 March 2010

Recover a Suspect SQL 2005 database

One day it will happen - a database will drop off the face of the earth while someone is working on it, or will not appear after a restart, and you get the terrifying words ...

     Database cannot be opened.  It has been marked SUSPECT    

Step One - Don't Panic!! It's easy to do, and it is what most of the rest of your company will be doing, especially if it is a main production database.

Step Two - read step one again!

Now read the error log - there will probably be a clue in there as to why it has been marked SUSPECT, along with the helpful advice of restoring the database from a backup. 

You may find the symptoms can easily be fixed (e.g. out of diskspace), but otherwise you need to set the database into EMERGENCY mode so you can at least access it again.

     ALTER DATABASE DBName SET EMERGENCY     

Once set, the Database icon should turn red in management studio, and you can access your database. 
The best way to fix the issue is now to run

     DBCC CHECKDB (DBname)    

This may magically cure the problem, or may alert you there are errors that need fixing - the best you can hope for here is to fix with data loss but at least it will get your database back online

You will probably want to make a note of the outcome of this check, so you can identify which table(s) were corrupted, as you may want / need to manually update data in them later.

Note that you will need to switch the database into SINGLE_USER mode to run this.

     ALTER DATABASE DBName SET SINGLE_USER    

     DBCC CHECKDB (DBName, REPAIR_ALLOW_DATA_LOSS)    

This should fix any issues, and you can switch back to MULTI_USER mode to get back up and running again.

ALTERNATIVES

If the database in question is large, you may find the DBCC CHECKDB takes a very long time.
You may be able to identify however from the SQL error log, exactly which table has caused the problem, and directly attack the issue.
You can run a DBCC CHECKTABLE to confirm the errors are on that table - although running with repair_allow_data_loss doesn't seem to fix these errors. 

If this indicates it is an index that has caused the problem, you could drop the index and rebuild which will fix the issue.

More likely it will indicate the actual table, but in Emergency mode you can SELECT the available data out of this table (although there will be some loss on the corrupted pages) and write it into a holding table. 
Now you can drop the offending table, and switch the database back online with

     ALTER DATABASE DBName SET MULTI_USER    
     ALTER DATABASE DBName SET ONLINE    

and then manually fix the corrupt table.

Hopefully you have now recovered your database, and proved why the company needs a DBA!

Finally, refer to Step one again!

No comments:

Post a Comment