Friday June 10, 2016

Outage

Website and API outage

We have completed the last steps of our immediate remediation of this morning's incident, in which web server capacity was exhausted. After the initial incident that impacted Slack's website and API was addressed, the problem resurfaced and affected file upload and download; this, too, has been fixed. We are reviewing logs and metrics to ensure we do not experience similar problems in the future.

12:39 PM PDT - Jun 10th

We now believe the Slack website, API, and file upload/download are all back to functioning normally. Monitoring continues as our operations team checks our assumptions about web tier memory usage. As of now, service is completely restored.

11:56 AM PDT - Jun 10th

The main Slack service and API are confirmed to be functioning well. However, our fix has affected file uploads and downloads, which may be very slow or fail outright. We are implementing a fix for this problem now.

11:36 AM PDT - Jun 10th

We believe we've addressed the cause of the terrible Slack outage this morning. We're very sorry for the interruption to your days and we're taking steps now to address the problems uncovered during this incident. We will continue to monitor the situation to ensure our changes completely fix the problem before closing this incident.

11:24 AM PDT - Jun 10th

We are releasing the fix more widely and ensuring that all our customers get reconnected to Slack. We're continuing to monitor the situation.

11:11 AM PDT - Jun 10th

We are preparing and testing a potential fix for the problem that has caused Slack's website and API to be down this morning. Thank you for your patience.

11:00 AM PDT - Jun 10th

We're continuing to search for the source of resource exhaustion that's causing Slack to be unavailable to our customers.

10:40 AM PDT - Jun 10th

The failures in Slack's web application have widened to include administrative web pages as well as API calls from Slack clients. We're continuing to work to restore service and allow disconnected users to connect to Slack once again.

10:21 AM PDT - Jun 10th

API failures continue, and as a result many users are unable to connect to Slack. We're working to restore service as quickly as possible.

10:12 AM PDT - Jun 10th

Slack's web servers are being overwhelmed at the moment, and we're working to restore full capacity and get everyone back to using Slack. In the meantime, API requests may return errors and chat may be quite slow.

10:06 AM PDT - Jun 10th

Status

Resolved