SharePoint Failover Scenarios


I recently published a blog post entitled, “SharePoint and Disaster Recovery Options,” but it did not specifically address SharePoint failover scenarios.

I began to give more thought to failover scenarios when a prospective customer asked us to design a SharePoint environment with the option of a fully redundant, geographically dispersed disaster recovery plan. First, I was asked to break out the cost for direct failover times of a few seconds, 15 minutes and 4 hours. Then I was asked what our options were for meeting each of those failover times with no data loss.

How SharePoint failover is implemented depends on which versions of SharePoint and SQL Server a client is using. There are usually two components to “recovery”: RTO (Recovery Time Objective) and RPO (Recovery Point Objective). RTO is how long it takes to recover. RPO is the point in time you want to recover to, which determines how much data you can afford to lose.
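To make the two objectives concrete, here is a minimal Python sketch that checks one recovery against an RTO and an RPO. The class, the function names and the timestamps are all made up for illustration; none of them come from SharePoint or SQL Server.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryResult:
    """Outcome of one recovery drill or incident (hypothetical structure)."""
    failure_time: datetime          # when the primary environment went down
    service_restored: datetime      # when users could work again
    last_recovered_write: datetime  # newest data still present after recovery


def downtime(result: RecoveryResult) -> timedelta:
    """How long recovery took; this is what RTO limits."""
    return result.service_restored - result.failure_time


def data_loss_window(result: RecoveryResult) -> timedelta:
    """How much recent work was lost; this is what RPO limits."""
    return result.failure_time - result.last_recovered_write


def meets_objectives(result: RecoveryResult, rto: timedelta, rpo: timedelta) -> bool:
    """True only if both the downtime and the data loss are within target."""
    return downtime(result) <= rto and data_loss_window(result) <= rpo


# Illustrative drill: back online in 12 minutes, but 5 minutes of data lost.
failure = datetime(2011, 1, 10, 9, 0, 0)
drill = RecoveryResult(
    failure_time=failure,
    service_restored=failure + timedelta(minutes=12),
    last_recovered_write=failure - timedelta(minutes=5),
)
```

With these made-up numbers, the drill meets a 15-minute RTO but fails an RPO of zero, which is exactly the gap between the customer's first question and the second.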

Breaking out the cost pertains only to RTO. However, the second question, about meeting those failover times with no data loss, is an RPO question, and RPO is probably more important than RTO in determining a solution. If your RPO is “recovery with no loss,” then the only solution is to use SQL Server Enterprise Edition with multi-threaded Mirroring enabled.

The failover dilemma with SharePoint 2010 is that there are many more databases than in previous versions. For example, in a fully implemented MOSS 2007 installation, there are only four databases. In WSS 3.0, there are only three. In SharePoint Foundation 2010, there are 12. In a bare-bones SharePoint Server 2010 installation, there are 22.

The number of databases in SharePoint 2010 is directly tied to the number of features that are implemented. If, for example, Excel Services is implemented, then that’s another database. If Access Services is implemented, then that’s another database. You could get away with Log Shipping in previous versions, but with SharePoint 2010 it is a royal pain in the butt, because each database needs a separate Log Shipping configuration. Worse, some of the databases will not recover properly with Log Shipping, because they are tied to SharePoint’s always-running Timer activity.
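To illustrate how the work multiplies, here is a rough Python sketch that writes one T-SQL `BACKUP LOG` statement per database. It covers only the backup step of Log Shipping; a real configuration also needs copy and restore jobs on the secondary server for every database. The function name, the database names and the `\\dr-sql01\logship` share are assumptions for the example, not values from any particular farm.

```python
def log_backup_statements(databases, backup_share):
    """Return one T-SQL BACKUP LOG statement per database name.

    Every database gets its own statement and its own .trn file, so the
    list grows by one entry for each database the farm contains.
    """
    statements = []
    for name in databases:
        path = f"{backup_share}\\{name}.trn"
        statements.append(f"BACKUP LOG [{name}] TO DISK = N'{path}';")
    return statements


# Hypothetical database names and share; a real SharePoint 2010 farm has many more.
example_databases = ["SharePoint_Config", "WSS_Content", "Search_Service_DB"]
example_statements = log_backup_statements(example_databases, r"\\dr-sql01\logship")

for statement in example_statements:
    print(statement)
```

Three databases produce three statements. Repeat that for 22 databases, and again for the copy and restore jobs, and it is easy to see why this approach stops scaling.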

That said, here are my recommendations assuming RPO is immediate (i.e., no data loss):

For recovery in a few seconds:

– SQL Server Enterprise Edition with multi-threaded Mirroring implemented

For recovery in 15 minutes:

– SQL Server Enterprise Edition with multi-threaded Mirroring implemented

For recovery in 4 hours:

– SQL Server Enterprise Edition with multi-threaded Mirroring implemented

As I mentioned in my previous blog post, the SharePoint 2010 Disaster Recovery Guide by John L. Ferringer and Sean P. McDonough is a good resource. I would be interested in your thoughts on this too. Leave a comment, please.

 

Terry Engelstad is Network Operations Manager for AIS Network. He holds MCP, MCSE, CCNA, MCDBA, MCTS and MCITP certifications.
