Google Apps | Great Example of Accountability and Professionalism
Below is a message I received from Google Apps today regarding a recent outage of the Google Apps (Gmail) web interface.
--- START GOOGLE EMAIL---
"Dear Google Apps customer,
We would like to follow up on the recent Gmail outage and resulting credit through our service level agreement (SLA) with you.
Between 12:45 PM to 2:15 PM PDT | 19:45 - 21:15 GMT on Tuesday, September 1, 2009, Google Apps Gmail users were unable to access their accounts through the Gmail web interface. Users could continue to access their accounts via IMAP and POP. No data was lost during this time; messages were received and delivered, but could not be displayed.
As a result of this incident, we are extending a 3-day SLA credit to your account. This credit will be reflected in an automatic 3-day extension to your Google Apps term date, and no action is needed on the part of your administrators.
We understand that this service outage has affected our valued customers and their users, and we sincerely apologize for the disruption and any impact.
Following are the key points from the incident report:
On Tuesday, September 1, a small portion of Gmail's web capacity was taken offline during a routine upgrade and service update. This is normal operating procedure as the Gmail web interface runs in multiple locations, and Gmail's request routing automatically directs users' requests to available servers. However, we underestimated the increased load that some of the new updates placed on request routing.
As a result, at approximately 12:30 PDT, a few request routers became overloaded and responded by refusing all incoming requests. This response transferred the load to the other request routers, and as the effect rippled through the system, almost all of the request routers became overloaded. As a result, users could not access Gmail through the web interface since their requests could not be routed to a Gmail server. Gmail processing and access through the IMAP/POP interfaces continued as usual because these processes use different request systems.
Upon receiving the error alerts, the Gmail Engineering team immediately began analyzing the issue and initiated a series of actions to help alleviate the symptoms. After determining the root cause to be insufficient available capacity, the Engineering team deployed a large-scale addition of request routers through Google's flexible capacity server systems. As they distributed incoming traffic across the expanded pool of request routers, access to the Gmail web interface returned to normal.
During the incident, we published ongoing reports to the Google Apps dashboard, Gmail Help Center, the Enterprise and Gmail blogs, and the GoogleAtWork and Google Twitter feeds, to help provide customers with the latest status and available workarounds.
The complete incident report (http://www.google.com/appsstatus/ir/buuqdnt6fcervea.pdf) in the Google Apps Status Dashboard describes the corrective and preventative measures to address the underlying causes of the issue and to help prevent recurrence. For ongoing service performance information, please see the Google Apps Status Dashboard at http://www.google.com/appsstatus.
Once again, we apologize for the impact that this incident has caused. Thank you very much for your continued support.
The Google Apps Team
Email preferences: You have received this required email service announcement to notify you about important changes to your Google Enterprise product or account.
1600 Amphitheatre Parkway
Mountain View, CA 94043"
--- END GOOGLE EMAIL---
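For readers curious about the cascade the incident report describes (a few overloaded request routers refuse traffic, their load shifts to the rest, and the overload ripples outward), here is a minimal Python sketch of that dynamic. It is only a toy model, not Google's actual routing logic: the function name `simulate_cascade`, the even split of load across healthy routers, and all capacity and load numbers are assumptions made up for illustration.

```python
def simulate_cascade(capacities, total_load):
    """Toy model of a cascading overload (illustrative assumptions only).

    capacities: hypothetical per-router capacity, in arbitrary request units.
    total_load: total incoming load, split evenly across healthy routers.
    Returns the number of healthy routers after each round.
    """
    healthy = list(capacities)
    history = [len(healthy)]
    while healthy:
        # Each healthy router receives an equal share of the total load.
        share = total_load / len(healthy)
        # A router whose share exceeds its capacity refuses all requests.
        survivors = [c for c in healthy if c >= share]
        if len(survivors) == len(healthy):
            break  # Stable: no router is overloaded this round.
        healthy = survivors
        history.append(len(healthy))
    return history


routers = [100, 110, 120, 130, 140]

# Slightly too much load: one router drops out, then the rest follow.
print(simulate_cascade(routers, 520))

# Same load, but with three extra routers added to the pool.
print(simulate_cascade(routers + [130, 130, 130], 520))
```

In this toy model the first call ends with no healthy routers, while the second stays stable at eight, which mirrors the fix described in the email: adding request routers so that each one's share of the traffic falls back under its capacity.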
This is a great example of Accountability and Professionalism. Moreover, it illustrates why the world should want to work with Google Apps. It proves to me that when the %@#!^ hits the fan, accountability is not optional! Thank you, Google Apps, for stepping up and following through! We are grateful to be a customer!
President & CEO