Exchange Delivery Delays
Incident Report for AppRiver LLC
Postmortem

We sincerely apologize for any inconvenience that yesterday’s Exchange outage caused for our customers. Here’s the root cause analysis, along with our future mitigation strategy:

Incident timeline: 11:15 AM CT to 2:30 PM CT

Symptoms/Impacts: During the incident, mail flow was delayed by 15-20 minutes on three occasions and by nearly an hour on one occasion. Some customers also saw a few login prompts from Outlook and forced reconnects in OWA. This did not affect a wide range of customers and should have been mostly confined to about an hour of the outage window.

Root cause: Bulk disabling of Split Domain Routing caused the Microsoft Exchange message routing table to become corrupt.

Troubleshooting/Corrective Actions: Our monitoring system detected the performance problem and alerted us properly. Finding and fixing the root cause took most of the outage time because the corrupt routing table had been replicated to most of the servers in our Hosting Exchange environment.

Follow Up (Actions and Changes): The bulk disabling has run daily for the past several years without incident, which is another reason it took so long to track down the root cause. To ensure something like this doesn’t impact us in the future, we have modified that process to run only on the weekend.

Posted 6 days ago. Dec 05, 2018 - 10:15 CST

Resolved
The delays impacting Exchange delivery and provisioning have been resolved. We'll post more once the root cause analysis has been completed.
Posted 7 days ago. Dec 04, 2018 - 15:33 CST
Update
Our Exchange engineers have identified the issue and are currently resolving it. Some Outlook clients may occasionally have trouble connecting, and Customer Portal users may experience some slowness. We will continue to monitor the issue and provide updates as they become available.
Posted 7 days ago. Dec 04, 2018 - 14:23 CST
Update
Our Exchange engineers have identified the issue and are currently resolving it. Some Outlook clients may occasionally have trouble connecting, and Customer Portal users may experience some slowness. We will continue to monitor the issue and provide updates as they become available.
Posted 7 days ago. Dec 04, 2018 - 14:22 CST
Update
Our Exchange engineers have identified the issue and are currently resolving it. Some Outlook clients may occasionally have trouble connecting, and Customer Portal users may experience some slowness. We will continue to monitor the issue and provide updates as they become available.
Posted 7 days ago. Dec 04, 2018 - 14:12 CST
Update
Our Exchange engineers are continuing to investigate and resolve the mail delivery issues. We have also seen reports of Outlook client connectivity issues. We will continue to monitor the issue and provide updates as they become available.
Posted 7 days ago. Dec 04, 2018 - 13:15 CST
Identified
Exchange is currently experiencing mail delivery and provisioning delays in some cases. Our engineers have identified the issue and are working to resolve it.
Posted 7 days ago. Dec 04, 2018 - 12:03 CST
This incident affected: Secure Hosted Exchange (Exchange 2013/2016+ (EXG7)).