Gator Crossing

The Official HostGator Company Blog!

Anatomy of an Outage

Written by Douglas Hanna

Monday, August 10th, 2009

Wednesday, August 5, 2009 started out as a normal day at HostGator’s Houston headquarters. Around 4:00 PM CT, a transformer near our office blew up, and the resulting power surge made the day anything but ordinary.

Lights flickered, battery backups beeped, fire alarms went off, and Internet connections died almost immediately. People waited for the building’s $200,000 hurricane-ready generator to start up, but it didn’t.

In the meantime, one of the three major “legs” of power feeding the building was out because of the exploded transformer. The building was underpowered, and the higher-voltage motors and equipment started burning out from the heat and stress of running without adequate power. Expensive equipment continued to get damaged.

A compressor on the air conditioning burnt out (cost: $35,000), air handlers got destroyed (cost: $5,000), an elevator motor got fried (cost: $10,000) and lots of other equipment in the building’s mechanical room still isn’t working correctly (cost: unknown). The total cost of the damages is expected to be upwards of $60,000.

As the building’s systems started to go down and the people in charge of HostGator’s office began calling in electricians, power companies, and repairmen, the rest of the management team began going into what we refer to internally as “hurricane mode.”

  • Twitter updates started to go out informing customers of a power problem in the building and possible service delays.
  • Employees were rallied and sent to other employees’ homes.
  • Our phone number was redirected (our VOIP system is housed in our office) and the message on our phone system was updated to inform customers of the outage.
  • Our support site was updated with an emergency notice.
  • A forum post was made with additional details.

As the makeshift offices were being set up in our managers’ homes, chats were being taken, servers were being monitored, and updates were being provided. Within an hour of the surge, HostGator’s support operations were almost fully functional, albeit delayed (with the exception of phone support).

By 11:30 PM, employees were starting to work at the office again. The phones were turned on shortly afterwards and average email response times went back down to 45 minutes or less.

Much of this expensive and inconvenient damage would have been prevented had the building’s generator worked as planned. If it had, the building would have lost power for only a minute or so instead of multiple hours. The cause was generator maintenance that an outsourced company had performed improperly less than a week before. The company put the wrong fuel filter on the generator, which caused the generator to fail immediately on startup.

The outage could obviously have been much worse. No customer servers or accounts were affected in any way (we don’t house any customer servers in our office building) and we were able to get back up and running relatively quickly.

Regardless of the relative severity of the event, though, HostGator did learn a lot.

  • Most notably, the fact that immediate communication is essential was reaffirmed. We first learned about the importance of immediate communication during a datacenter outage at The Planet. In this situation, a Twitter update went out less than 15 minutes after the power surge occurred. Updates continued to be provided across Twitter, the forums, and our support site until the situation was completely resolved. We were even lucky enough to get comments from customers praising us for our handling of the situation.
  • We also learned that it’s critical to have systems tested and maintained by companies we know are getting the job done properly. We are obviously looking into a new generator maintenance company and looking at our other vendors to ensure they’re prepared to deal with issues if they occur.

During the entire occurrence, our customers were patient and understanding and we sincerely appreciate that. Stanley Marcus of Neiman Marcus fame is credited with saying “The road to success is paved with well handled mistakes” and we couldn’t agree more.

Things happen (the web hosting business and the act of running a business are never dull) and Wednesday’s events were just one of the many examples of things that no one could have ever predicted happening.

Posted in: Customer Service, Random
Comments
  • http://drumcreative.com Ben Moffett

    I noticed all of these Twitter announcements coming through and it was a great relief to be in the know. I am not an employee of your company but I have 2 reseller accounts and a dedicated server through HostGator. I took a quick look to just make sure my servers were fine, which of course they were. I just wanted you guys to know that keeping us up to date on Twitter made me feel like I was a part of HostGator. Your transparency through this situation, even though it did not affect me, was really appreciated.
    Thanks
    Ben

  • http://www.collierrepair.com Cory Collier

    A GREAT Example of a great company putting my favorite motto to use…SEMPER GUMBY! WTG HOST GATOR.

  • dgibsonky

    I think it’s great that your employees would invite coworkers into their homes to keep the business going. Kudos to you all.

  • Hillary

    Wow. Pat and Lance are studs!

  • http://Hostgator.com Lance Custen

    Thank you Hillary.

  • http://www.ruicruz.pt Rui Cruz

    Hi,

    Just a little question: who and how will the blame be “spread”?

    Rui

  • PL

    Love the way you work guys… I’ve been using your hosting services for 2 years now and I just have great comments for you. As someone said before, I really appreciate your transparency, which I consider as a confidence, but also a respect mark. Keep on, gators ! :)

  • http://www.a2phone.com a2purn

    Just have a great comment for HG and hoping would keep the quality

  • jack

    You might want to emphasize more that it was HQ …as in NOT the place where the customer’s sites are hosted.
    Just a thought :)

  • http://www.balinter.net Eddy Harianto

    Thank you for emailing us and for the good support.

  • http://www.lilbeginnings.com Mary Lou

    Thank you Hostgator!! You are the BEST!!!

  • Lee UK

    I hope Hostgator can sue / claim for negligence against the company that set up the generator wrongly, thus causing its failure and the subsequent costly damage.

  • John Oroko

    This makes me love HG all the more. You handled it well. Thanks, guys.

  • http://www.alterxmedia.com Nick Walsh

    I switched to Hostgator about two years ago when my former host refused to get CPanel. Your service is unparalleled in the industry. They have never failed to resolve an issue… and they have done it with great attitudes! You can’t find this anywhere else.

    No damn wonder you’re among the fastest growing companies! I will soon be moving to a WHM with you and heck… you can have all my business and that of anyone I can influence! Kudos to who ever is calling the shots there.

    Please… don’t fix it.. .it ain’t broke!!

  • Customer

    Have virtual hosts of customers been down or unreachable during that time? Could the rest of the world connect to the virtual hosts of customers during the office meltdown?

    That’s actually the thing I care most. I appreciate your transparency.

  • Customer

    Oh sorry, I just read:

    “No customer servers or accounts were affected in any way (we don’t house any customer servers in our office building) and we were able to get back up and running relatively quickly.”

    That’s good planning. I hope you have a nicely working (and TESTED) emergency scenario for the servers.

  • http://www.akouseto.gr aritoni

    $200,000 generator? I’m impressed!

  • All Software List

    Top notch company that Ranks 239 on Inc. 5000, of course, HG is the best.

  • http://www.DutaNada.Com Kurnia

    I have been with Hostgator for more than 2 years. So far I have been very satisfied with the 24-hour service (my location is 12 hours’ difference). I refer the host to friends and colleagues because of the excellent service….

  • http://www.heftelstudios.com Kawika

    Thanks for the transparency, guys. Makes things much better when dealing with an outage.

  • http://www.monhosteur.com just2com

    As a professional webhost, we know that this kind of outage creates a lot of stress for all the staff. But you did just the right things!

  • http://www.studentdebtsrelief.com Dave

    Good pictures. I always wondered who I’m chatting with when I need help with my server.

  • Corey

    The more I read this blog, the more I like HostGator. Keep up the good work.

    P.S. -I’m a new customer, but I expect to be a customer who sticks around for quite a while – having read all that I have this morning.

  • http://best-seo.net Barry SEO

    When i switched to SEOHosting from my regular HG baby account i must say that i was somewhat concerned. But it turned out to be a good move, i can recommend C-Class hosting to anyone.
    Cheers

  • http://holygroundelectric.com Electrician

    As a licensed electrician I’m pretty amazed that the maintenance company didn’t do a final inspection/startup on a $200,000 piece of equipment. It’s great that you guys have a working, tested emergency plan. Kudos

  • http://magnetechtransformerrepair.com/ Mitchell

    Ha! That is such a great story! …and a great testimony of human ingenuity and quick acting. I once saw a transformer explode when I was pumping gas – needless to say it was a crazy moment!

  • http://www.electriciantoronto.ca/ Vic

    There is a lot of protective equipment on the market to save circuits or the entire electrical system of a building from power surges, a lost phase or two, voltage drops, etc.

    The building manager did not invest in such protection. Trouble like this was unavoidable.
