Discord suffered what they classified as a ‘massive outage’ that prevented users from logging into the service or using voice chats.
The outage started at 2:49 PM EST and was initially caused by an application programming interface (API) outage that prevented various services from communicating with each other.
However, after identifying the underlying API issue, Discord discovered a secondary issue with one of its database clusters, causing further problems.
“We have identified the underlying issue with the API outage but are dealing with a secondary issue on one of our database clusters. We have our entire on-call response team online and responding to the issue,” Discord explained on their status page.
When users attempted to log into Discord during the outage, they were shown a spinning logo, followed eventually by a message about the API outage.
Discord states that it began rate limiting logins to avoid overloading its operational servers while it fixed the problematic database cluster. During this rate-limiting period, users had to wait a long time before they were fully logged into the service.
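For readers unfamiliar with the technique, login rate limiting of the kind described above is often built on a token bucket whose refill rate operators can raise as the backend recovers. The following is a minimal sketch under that assumption and is not Discord's actual implementation; the class name `LoginRateLimiter`, its methods, and its parameters are all hypothetical.

```python
import time


class LoginRateLimiter:
    """Token bucket admitting about `rate` logins per second (hypothetical sketch)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)      # tokens (logins) added per second
        self.burst = float(burst)    # maximum tokens that can accumulate
        self.clock = clock           # injectable clock, so behavior is testable
        self.tokens = float(burst)   # start with a full bucket
        self.last = clock()

    def _refill(self):
        # Credit the tokens earned since the last check, capped at `burst`.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def set_rate(self, rate):
        """Raise or lower the limit, for example as a database recovers."""
        self._refill()               # settle tokens earned at the old rate first
        self.rate = float(rate)

    def try_login(self):
        """Return True if a login may proceed now, False if it must wait."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a design like this, users whose attempt returns `False` are held at the loading screen and retried later, and calling `set_rate` with progressively larger values lets more users in without restarting anything.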
At around 5:12 PM EST, Discord removed the rate limiting but warned that users might continue to see issues when interacting with bots through slash commands. Discord expected these issues to clear up over the following hour.
The full status updates provided by Discord during this outage can be found below:
Monitoring – We have fully removed all rate limits and Discord is almost back to normal.
Over the next hour, some Discord servers may continue to see some issues interacting with bots using slash commands. As part of resolving the incident, we needed to reduce load on our databases and we turned down some parts of our slash command system.
We are going to complete our internal postmortem process to really dig in and understand exactly what happened here, but we really apologize for the inconvenience if you were unable to login today or had other issues.
Jan 26, 14:12 PST

Update – We are down to the last set of offline users and we anticipate everybody being fully online within the next 10 minutes.
Jan 26, 13:50 PST

Update – More than half of Discord users are back online and working normally. We continue to work to bring the rest of the users back online.
Jan 26, 13:07 PST

Update – We are continuing to work on a fix for this issue.
Jan 26, 13:06 PST

Update – The database is healthy again and our internal error rate has fallen to nominal levels. We are beginning to raise the login rate limit to allow users to reconnect.
Jan 26, 12:29 PST

Update – We are continuing to work through some issues with one of our database clusters. We are still rate limiting login traffic. Next update in 15 minutes.
Jan 26, 12:21 PST

Update – We have instituted a rate limit on logins to manage the traffic load. Users who are logged in are successfully using Discord at this point, and we will be slowly raising the limits here to allow more users in as we can. We expect this to be resolved in the next 15 minutes.
Jan 26, 12:07 PST

Identified – We have identified the underlying issue with the API outage but are dealing with a secondary issue on one of our database clusters. We have our entire on-call response team online and responding to the issue.
Jan 26, 12:03 PST

Investigating – We are currently investigating a widespread API outage.
Jan 26, 11:49 PST
Update 1/26/22 5:48 PM EST: Article rewritten to explain the outage.
Source: www.bleepingcomputer.com