Issues With Connecting to Microsoft Office 365 on 22nd July 2015

17 Aug 2015 by Ian Byrne

One of the cool perks of collecting data from over two million Microsoft Office 365 seats on a daily basis is that we often witness interesting trends and patterns.
In one recent incident, we found we were unable to connect to a number of our customers over a brief period on the 22nd July 2015. The error “The sign-in name or password does not match one in the Microsoft account system” was being returned. For one customer, this occurred over 200 times within the space of a few minutes before authentication finally succeeded.
[Figure: Failed Login Attempts]
We noticed that the issue seemed to begin at around 9:30 UTC, with a brief lull before occurring again and finally resolving just before midday UTC:
[Figure: Failed Login Attempts Histogram]
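For readers who want to build a similar view from their own logs, here is a minimal sketch of how failed-login timestamps could be grouped into fixed-width time buckets. It is not the code we used for the chart: the function name `bucket_failures`, the 10-minute bucket width, and the sample timestamps are all assumptions made purely for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def bucket_failures(timestamps, width_minutes=10):
    """Count timestamps per fixed-width bucket, keyed by bucket start time."""
    width = timedelta(minutes=width_minutes)
    counts = Counter()
    for ts in timestamps:
        # Floor each timestamp to the start of its bucket within the day.
        midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        bucket_index = (ts - midnight) // width
        counts[midnight + bucket_index * width] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample data, NOT the real failures from the incident.
sample = [
    datetime(2015, 7, 22, 9, 31),
    datetime(2015, 7, 22, 9, 38),
    datetime(2015, 7, 22, 9, 42),
    datetime(2015, 7, 22, 11, 50),
]

histogram = bucket_failures(sample)
for bucket_start, count in histogram.items():
    # Print a simple text bar per bucket.
    print(bucket_start.strftime("%H:%M"), "#" * count)
```

A lull like the one we saw would show up as a run of missing buckets between two clusters.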
It is possible that this was linked to a Microsoft incident logged under the ID MO27972, although the times in Microsoft’s report are several hours earlier than those we observed. You can see Microsoft’s full report below:

Office 365 Customer Ready Post Incident Review

Incident Information

Incident ID: MO27972
Incident Title: Office 365 third-party application access issue
Service(s) Impacted: Office 365

User Experience

Affected users were unable to log in to the Office 365 service using third-party applications.

Customer Impact

Federated customers using specific cloud identity providers were impacted by this event. Other federated customers using Active Directory Federation Services (ADFS) were not affected.

Incident Start Date and Time

July 22, 2015, at 12:17 AM UTC

Incident End Date and Time

July 22, 2015, at 4:22 AM UTC

Root Cause

As part of our ongoing work to improve the Office 365 login experience, an update was deployed to redirect requests to new authentication infrastructure. The improved service encountered a protocol issue in certain third-party identity provider services and this caused login attempts to fail.

Actions Taken (All times UTC)

July 22, 2015
12:17 AM: Incident began.
2:31 AM: Engineers correlated customer reports of the issue and began to investigate.
3:36 AM: Engineers identified specific errors which indicated a potential issue with authentication infrastructure.
4:12 AM: The investigation determined that an update was deployed to redirect service requests to new authentication infrastructure. Some third-party identity providers did not correctly handle a new protocol option, causing authentication requests to fail for customers of those services.
4:15 AM: Engineers began to revert the update.
4:22 AM: The update was reverted and engineers confirmed that the issue was no longer present. The incident was resolved.

Next Steps

Finding: An update to the Office 365 service caused specific types of authentication requests to fail.
Action: Update the service to enable the handling of authentication requests from the known third-party identity providers.
Completion Date: July 2015

Finding: Customers reported this event before an alert prompted a high priority investigation.
Action: Review the monitoring infrastructure for improvements.
Completion Date: August 2015
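
To put the two timelines side by side, the sketch below works out a few intervals from the times in Microsoft's report and compares them with the window we observed. The variable names are illustrative, and our observed window is only approximate: "around 9:30 UTC" is taken as 09:30 and "just before midday" as 12:00, both assumptions.

```python
from datetime import datetime, timezone


def utc(hour, minute):
    """Build a UTC timestamp on 22 July 2015."""
    return datetime(2015, 7, 22, hour, minute, tzinfo=timezone.utc)


def minutes_between(start, end):
    """Whole minutes elapsed from start to end."""
    return int((end - start).total_seconds() // 60)


# Times taken from Microsoft's "Actions Taken" list (12:17 AM is 00:17).
ms_start = utc(0, 17)          # Incident began
ms_investigation = utc(2, 31)  # Engineers began to investigate
ms_end = utc(4, 22)            # Update reverted, incident resolved

# Our observed window; both endpoints are approximations.
observed_start = utc(9, 30)
observed_end = utc(12, 0)

time_to_investigate = minutes_between(ms_start, ms_investigation)
incident_duration = minutes_between(ms_start, ms_end)
start_gap = minutes_between(ms_end, observed_start)

# Two windows overlap only if each starts before the other ends.
windows_overlap = observed_start < ms_end and ms_start < observed_end

print("Minutes until investigation began:", time_to_investigate)
print("Reported incident duration (minutes):", incident_duration)
print("Minutes from reported end to our first failures:", start_gap)
print("Windows overlap:", windows_overlap)
```

Under these assumptions the two windows do not overlap, which is why the link to MO27972 remains a possibility rather than a confirmed cause.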