We will be upgrading our two primary SQL clusters tomorrow morning, November 9th, 2013, beginning at 11 am EST (4 pm UTC). We will be upgrading the databases only (not the OS) from SQL Server 2012 SP1 to SQL Server 2014 CTP2.
Here’s the planned timeline, which will of course go out the window once we get started:
- 10 am: Polish resumes just in case
- 10:05 am: Remove node weights from redundant datacenter nodes
- 10:10 am: Begin backups on NY-SQL03 (40 minute runtime)
- 10:30 am: Begin backups on NY-SQL01 (20 minute runtime)
- 11 am: Take NY-SQL02 and NY-SQL04 SQL offline, begin patches and upgrading
(this is where timelines go to hell)
- 1st Upgrade + 5 min: Run health checks
- 1st Upgrade + 10 min: Fail over internal availability groups
- 1st Upgrade + 15 min: StackOverflow.com goes READ ONLY and fails over to NY-SQL02
- 1st Upgrade + 20 min: Monitor Stack Overflow health; if we’re good, exit READ ONLY
- 1st Upgrade + 25 min: Fail over all other availability groups
(the Stack Exchange network is now running on 2014)
- Failover + 20 min: If all is well, continue upgrading remaining nodes to 2014 CTP2
- Failover + 21 min: Patch and upgrade NY-SQL01 & 03
- Failover + Restart: Smoke test NY-SQL01 & 03
- Failover + Restart + 5 min: Fail back to NY-SQL01 & 03
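Since most of the steps above are keyed to "1st Upgrade" rather than to the clock, here is a rough Python sketch of how those relative offsets turn into wall-clock times once the first upgrade actually finishes. The function name `schedule` and the 11:40 completion time are made up for illustration; the real time is unknown until the upgrade is done.

```python
from datetime import datetime, timedelta

# Offsets in minutes after the first upgrade completes, copied from the timeline.
STEPS = [
    (5, "Run health checks"),
    (10, "Fail over internal availability groups"),
    (15, "StackOverflow.com goes READ ONLY and fails over to NY-SQL02"),
    (20, "Monitor Stack Overflow health, exit READ ONLY if healthy"),
    (25, "Fail over all other availability groups"),
]

def schedule(first_upgrade_done):
    """Return (clock time, step) pairs for the steps keyed to the first upgrade."""
    return [(first_upgrade_done + timedelta(minutes=m), step) for m, step in STEPS]

# Hypothetical completion time (EST), used only to illustrate the arithmetic.
done = datetime(2013, 11, 9, 11, 40)
for when, step in schedule(done):
    print(when.strftime("%H:%M"), "EST:", step)
```

If the first upgrade runs long, every later step simply shifts by the same amount, which is why the list uses offsets instead of fixed times.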
We expect only two brief interruptions in total as we fail over to redundant nodes and back during this process. These failover interruptions typically last a few seconds as the moved IPs come online and ARP is updated everywhere. We’ll blog more about what advancements this upgrade gets us in the coming weeks.
If you want updates during the upgrade or you see something you think we’re unaware of, you can use @StackStatus on Twitter - we’ll be watching.