Increased Latency Caused Site Unresponsiveness
Incident Report for Inc.

Summary of event: The API experienced a severe increase in response time.

Business Impacts: The site was unresponsive from 10:01 AM until roughly 10:20 AM.

How was the issue resolved: We rolled back an update that had been pushed out on the morning of 3/10/2022. Removing the update from our production servers allowed them to recover and return to normal capacity.

Root cause: One of the updates we rolled out on 3/10/2022 introduced a problem in how our servers write to the cache. An error occurred while data was being written to the cache, which caused the servers to fail. As servers failed, response times increased and the system became unresponsive.
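
The failure mode described above (a cache write error taking down the server that triggered it) can be illustrated with a short sketch. This is not the code involved in the incident, which this report does not show. It is a minimal, hypothetical example that assumes the cache is only an optimization and that a failed write can safely be logged and skipped. The names `fetch_with_cache`, `load`, and `BrokenCache` are made up for illustration.

```python
import logging
from typing import Any, Callable, MutableMapping

logger = logging.getLogger("cache_guard")


def fetch_with_cache(
    key: str,
    cache: MutableMapping[str, Any],
    load: Callable[[str], Any],
) -> Any:
    """Return the value for `key`, treating the cache as best-effort."""
    try:
        if key in cache:
            return cache[key]
    except Exception:
        # A failed cache read falls through to the source of truth.
        logger.warning("cache read failed for %r; falling back to source", key)

    value = load(key)

    try:
        cache[key] = value
    except Exception:
        # A failed cache write is logged, not raised, so the request still succeeds.
        logger.warning("cache write failed for %r; serving uncached value", key)
    return value


class BrokenCache(dict):
    """Hypothetical stand-in for a cache whose writes always fail."""

    def __setitem__(self, key, value):
        raise ConnectionError("simulated cache write failure")
```

With this shape, calling `fetch_with_cache("a", BrokenCache(), load)` still returns the loaded value even though every cache write raises. Whether swallowing the error is acceptable depends on the system; if the cache is required for correctness, failing fast may be the better choice.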

Preventative next steps: We have prepared a fix for the problematic code and will be redeploying the update with the fix next week. In addition, we are improving testing in our staging environments so that problems like this are caught before they reach our providers and patients.

Posted Mar 14, 2022 - 10:48 EDT

The fix we implemented has corrected the problem. We are marking this incident as resolved. We will be sharing a postmortem in the coming days.
Posted Mar 10, 2022 - 17:35 EST
A fix has been implemented and we are continuing to monitor the situation.
Posted Mar 10, 2022 - 10:49 EST
The issue has been identified, and we are currently working on a fix.
Posted Mar 10, 2022 - 10:20 EST
We are continuing to investigate this issue.
Posted Mar 10, 2022 - 10:11 EST
We are currently investigating the issue.
Posted Mar 10, 2022 - 10:11 EST
This incident affected: API.