London Outage / Nov 5th – 6th Incident

First off. We’re sorry.

 

We know that for most people, the worst part of the outage was the lack of communication (it was for us too), so we’re going to answer some questions in depth so that you can be as informed as possible. A lack of communication from a provider in a situation like this is excruciatingly frustrating, and we get that. We hope this statement gives you an understanding of the whole situation.

 

Were there two outages?

Yes.

The first occurred between 17:33 and 18:47 GMT on 5th Nov 2018, and the second between 00:06 and 14:34 GMT on 6th Nov 2018.

 

Has the problem been 100% fixed?

It will be. We don’t anticipate any further outages in the near future and, as you will see in the answers further down, we’re going to be making some pretty massive changes to gain control over the environment we operate from, to prevent this from happening again.

 

What type of outage were these incidents?

These were network outages. There was no loss of power or cooling at any point, so the servers remained powered on throughout and no data was lost. The servers were never turned off.

 

What caused the outages?

The network feeds that supply our core infrastructure within London were suspended due to an administrative error that occurred between our colocation provider and the datacenter. The datacenter performed the suspension manually, which resulted in all of our systems being completely disconnected from the network.

 

Why were we unable to provide more information during the outage?

We fully understand that, more than likely, the worst part of the whole incident was the lack of information. We know. Unfortunately, we were in the same situation as you: due to communication failings with our colocation provider and the datacenter, the cause of the problems could not be definitively relayed to us. The communication breakdown resulted in us being passed incorrect information and assumptions, which initially pointed to an outage caused by the datacenter. After communicating directly with datacenter technicians, we decided not to post an announcement containing that information until it could be verified.

 

Why was the initial outage corrected and the second wasn’t?

Our colocation provider followed standard procedures and contacted the datacenter to inform them of the incident, after which the services were brought back online while the cause was investigated. The second outage was a manually performed suspension, carried out after the initial situation had been reviewed.

 

How did EL get notified of the outage and what was the response time?

We make use of status monitors on all core systems that we operate; on both occasions, these monitors let us know about the outage within 30 seconds of it starting. Unfortunately, unlike in other scenarios, a short notification time and a response time of under 2 minutes didn’t provide any benefit because of the communication failure.

 

What compensation are we providing customers that were affected?

While the outage was only in London, it also knocked all web packages offline, including our control panels, which resulted in the vast majority of our users being negatively impacted in some form. All services located in London will be issued with a two-week extension, and all game servers within New York City and Chicago will receive a one-week extension in case the customer experienced problems accessing the control interface. If you would like to receive this compensation, please let us know by contacting us via the ticket system with the request made to ‘Administration’.

 

What are we doing to prevent this from happening again?

Administrative issues and mistakes do happen in business. Given the uncertainty about the situation’s root cause, the current lack of administrative safeguards and the improvements still being made at our colocation provider, we have decided to make some significant changes within our London location. We will be discontinuing business with our colocation provider and switching to colocation directly with the datacenter within the next few weeks, which will give us much greater control over all aspects of the service that we provide. All IP addresses will remain the same, but we will be scheduling some emergency maintenance within the next few days, during which the equipment will be re-racked within the facility. We will post a status update with the scheduled time window once we’ve completed contract signing and rack provisioning.

 

I have questions that aren’t answered here. What do I do?

If we haven’t answered a question that you have, please feel free to reach out to us directly at [email protected] (applicable to both ELHostingServices and PrimeNodes customers), and we will be more than happy to have a conversation with you to get your questions answered.

 

We are incredibly sorry for any inconvenience this situation has caused. We hope that the compensation and the steps we’re taking to ensure that this doesn’t happen again are enough to show you that we genuinely care about every one of our customers. Thank you for taking the time to read this statement; we hope that you have found the information you needed.

 

Once again, sorry.

 

William Wilson & William Phillips

Encrypted Laser Management