Today we did our last deployment of 2013 (Sprint 57). You can read about it here: http://visualstudio.com/en-us/news/2013-dec-11-vso
Our next deployment would normally fall very close to the Christmas holiday. To avoid that, we’ll skip the Sprint 58 deployment and roll all of that work into the Sprint 59 deployment in mid-January.
We worked very hard to make this deployment go more smoothly than the one we did in mid-November. Overall, it went much better, but it wasn’t incident-free. We had a hiccup on Tuesday at the beginning of the rollout. We know what happened and have applied a fix, but we aren’t yet 100% sure why it happened. The code that caused the issue hasn’t changed in a long time. The strongest suspicion is that the instance is now running with enough load that even things that are a little out of whack can cause backups that cascade into outages. In this case, lock contention caused the backup.
In my post on the launch issues, I talked about doing work to find and eliminate all the cases where we make cross-server/service requests while holding a lock. If that work had been finished, this issue would not have happened. That work is in progress and should finish up early next year. It will eliminate a major source of the backups we’ve been seeing. We’ve also accelerated the work to enable multiple production instances so we can spread the load across instances and reduce choke points. The second production instance will go live in late January or early February, and we will gradually direct traffic off the current instance to balance the two.
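To make the "request while holding a lock" pattern concrete, here is a minimal Python sketch. It is an illustration only, not the service's actual code: the `Cache` class, its method names, and the `fetch` callable (a stand-in for a cross-service request) are all hypothetical.

```python
import threading


class Cache:
    """Hypothetical in-memory cache guarded by a single lock."""

    def __init__(self, fetch):
        self._lock = threading.Lock()
        self._data = {}
        self._fetch = fetch  # assumption: stands in for a slow cross-service call

    def get_holding_lock(self, key):
        # Anti-pattern: the remote call runs while the lock is held, so every
        # other caller queues behind it for as long as the call takes.
        with self._lock:
            if key not in self._data:
                self._data[key] = self._fetch(key)
            return self._data[key]

    def get_without_lock(self, key):
        # Hold the lock only long enough to check the cache.
        with self._lock:
            if key in self._data:
                return self._data[key]
        # The slow call happens with the lock released.
        value = self._fetch(key)
        # Re-acquire briefly to publish the result; if another thread got
        # there first, keep its value and discard ours.
        with self._lock:
            return self._data.setdefault(key, value)
```

The trade-off in the second version is that two callers can occasionally fetch the same key at the same time and one result is thrown away. That costs some duplicate work, but a slow remote call can no longer stall everyone waiting on the lock.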
Also, starting with this deployment and continuing until we feel we have a firm grasp on the issues that have been causing deployment-related outages, we’ve moved our deployments to off hours to reduce the risk of interrupting large numbers of people. Stay tuned for more.
Brian