Inkling - Inkling Web Reader Performance Degradation – Incident details

Major outage
Started over 2 years ago. Lasted about 1 hour.


Inkling Web Reader

Major outage from 2:36 PM to 4:02 PM

  • Update

    At 7:36 AM PT on September 9, 2022, an internal system emitted a burst of roughly 45,000 tasks into the API task queue. The queueing system could not hold that many tasks in memory and became unresponsive. Inkling first tried offloading the tasks to a "dead" queue, where no other service component would attempt to operate on them. This restored service to parts of the system, but every component that depends on the task queue continued to fail, because the queueing system was still holding the large collection of tasks in memory and still could not communicate with other parts of the system. Engineering then decided to purge the tasks completely, and service improved further. Afterward, engineers monitored the systems and restarted services that were not fully functional. Our systems show a total of 13 minutes and 58 seconds of downtime. Investigation continues into the source of the burst and into how the task queueing system can be improved so that a rare spike of this size does not overload it.

  • Resolved

    This incident has been resolved.

  • Monitoring

    The issue is resolved and Inkling engineers are continuing to monitor this closely.
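
The failure mode described in the update above (a burst of tasks larger than the queue can hold in memory) is commonly addressed with a hard capacity limit plus an overflow path. The following is a minimal illustrative sketch only, not Inkling's implementation: the class `BoundedTaskQueue`, its `capacity` parameter, and the in-process `dead_letter` list are all hypothetical names chosen for this example. In a real deployment the dead-letter store would presumably live outside the queueing system's memory.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BoundedTaskQueue:
    """Hypothetical in-memory task queue with a hard capacity limit."""

    capacity: int
    _tasks: deque = field(default_factory=deque)
    # Stand-in for a dead-letter store; assumed to be external in practice.
    dead_letter: list = field(default_factory=list)

    def enqueue(self, task) -> bool:
        """Accept the task if there is room; otherwise divert it and return False."""
        if len(self._tasks) >= self.capacity:
            self.dead_letter.append(task)
            return False
        self._tasks.append(task)
        return True

    def dequeue(self):
        """Return the oldest queued task, or None when the queue is empty."""
        return self._tasks.popleft() if self._tasks else None

    def purge(self) -> int:
        """Drop every queued task and report how many were removed."""
        removed = len(self._tasks)
        self._tasks.clear()
        return removed

    def __len__(self) -> int:
        return len(self._tasks)


# Simulate a burst of the size reported in the incident.
queue = BoundedTaskQueue(capacity=1_000)
accepted = sum(queue.enqueue(f"task-{i}") for i in range(45_000))
print(accepted, len(queue.dead_letter))  # prints "1000 44000"
```

With a limit like this, the burst is diverted at the point of entry, so the live queue never grows beyond `capacity` and consumers can keep draining it while the overflow is inspected separately.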