When Good Web Services Go Bad
TypePad collapses twice in a week, and now del.icio.us falls apart.
I have a mere 1986 del.icio.us bookmarks, but that’s a lot of information that I’d dearly miss. It’s a list of the stuff that I found important since late 2004. If that were to be lost, I’d have a hollow feeling. I’ve used del.icio.us to augment my memory. I’d have to wonder, what am I forgetting?
Software Is Only Human
Now is the time for the pundits to cry for quality in software, and liability. This collapse of infrastructure just won’t do! Software vendors should be held to the same standards of quality as aerospace. This is the only way that the ASP model of software can move forward.
But it’s back. My bookmarks are back. TypePad is back. We’re all good.
Why so good?
- If you were to add up the time spent rebooting, reformatting, or regretting MS-DOS, these outages are drops in the bucket. Now add Windows in all its flavors, and you’ve got yourself an ocean of blue screens.
UPDATE - Caveat: I’m not a Windows basher. That’s what it looks like here, but no. Windows grows ever more stable. My point was that we went through the teething of the PC. New is flakey. We dig it. We deal. Stable can be had, but we’re not having it. We want what’s next, now.
- Because we are software.
This is pretty obvious to bloggers and friends. We’ve written little scripts or macros, we’ve added little tweaks to our blogs, we’ve made countless mistakes. It’s really not all that foreign. A software failure is more of a human failure, and we can forgive those.
This is pretty obvious in a large software system, where training is used to complete a software application. In these cases it’s the task that matters, not the software, and counting on perfect software is a blatant violation of Murphy’s Law.
Those that call for aerospace quality in our bookmark manager are calling on us to waste valuable resources that should be used to keep airplanes in the air.
How a Web Service Can Survive an Outage
Here are the four things you must do to keep your customers sane during a meltdown.
- Tell Them What Went Wrong - Tell them why it went down. In English, describe the failure. Tell them what you are doing. Don’t apologize. People don’t want apologies, they want service.
Keep them posted. Live blog the crisis.
- Make no promises - You won’t keep them.
Do not promise deadlines. In crisis deadlines are meaningless. Ballpark it. That’s all. You’re moving as fast as you can.
Do not promise full recovery. Someone probably did lose something. Someone will take you literally, and feel betrayed if they lost a single record, even if they lost it by clicking “delete” and “yes, I’m sure I want to delete” two weeks before the incident.
- Apologize - Wait until the service has recovered. Running apologies sound insincere. Save it for a great big apology at the end of the crisis. Then apologize to everyone for the lack of service. Apologize to individuals who lost information, or have specific examples of lost revenue, and offer something to make them whole.
Move on. An apologetic non-human entity, like a corporation, gives me the heebie-jeebies.
- Tell Them What Went Wrong In Detail - Write down exactly what went wrong. Do a white paper describing the failure and your response, the issues faced in the response, and what measures are put in place to prevent recurrences. Put it out there for people to consider. Share your experiences.
The del.icio.us outage was no big thing for me. I’ve gathered my bookmarks on my desktop, to feed del.icio.us later. They blogged about the crisis. I felt for them. It was all too familiar. The service is back up. I’m happy.
How Users Can Prepare for a Web Service Outage
All of these services allow you to download your data. Consider the ones that are most important to you, and put a date on your calendar to download the information to your local computer.
In del.icio.us, for example, you can download your bookmarks by clicking on the settings link and then clicking the export link found on that page. The service will generate an HTML page that it can later re-import.
Here’s the URL for my bookmarks export. You’d have to put your own name in place of “alan”.
http://del.icio.us/settings/alan/export?export=1&showtags=showtags&showextended=showextended
Now, if only OS X Automator would remember my cookies, I might be able to share a script with you. Hmm…
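Outside of Automator, a rough Python sketch of such a script might look like the one below. It only rebuilds the export URL shown above for a given user name and sends a cookie along with the request. The function names are made up for illustration, and the cookie handling is an assumption: I'm guessing that pasting the Cookie header from a logged-in browser session would be enough, and I haven't confirmed what the service actually requires.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Pattern taken from the export URL shown above.
EXPORT_BASE = "http://del.icio.us/settings/{user}/export"


def build_export_url(user):
    """Build the bookmarks export URL for the given user name."""
    query = urlencode({
        "export": 1,
        "showtags": "showtags",
        "showextended": "showextended",
    })
    return EXPORT_BASE.format(user=quote(user)) + "?" + query


def fetch_export(user, cookie_header):
    """Download the export page as bytes.

    cookie_header is an assumption: the raw Cookie header copied by hand
    from a browser that is already logged in to the service.
    """
    request = Request(build_export_url(user), headers={"Cookie": cookie_header})
    with urlopen(request) as response:
        return response.read()
```

Calling build_export_url("alan") gives back the same URL as the one above. Saving the result of fetch_export to a dated file would be the whole backup.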
Crisis = Opportunity
Now is maybe a good time for one C# programmer to make a TypePad backup widget for Windows, and one Cocoa programmer to make a TypePad backup widget for OS X. Then do one for del.icio.us. Make it extensible so you can add other services. You’ll have one, very popular download. Put up a PayPal link and ask for $4.99.
Or make it a web service. Heh.
(Linux users wrote their shell program and popped it into their crontab when they signed up.)
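For the curious, the "make it extensible" part could be sketched in Python along these lines. This is only an illustration of the idea, not a finished tool: EXPORTERS, register, backup_all, and the placeholder exporter are all hypothetical names, and a real exporter would have to log in to its service and download the actual data.

```python
import datetime
from pathlib import Path

# Hypothetical registry: service name -> function returning the export as bytes.
EXPORTERS = {}


def register(name):
    """Decorator that adds an export function to the registry under a name."""
    def decorator(func):
        EXPORTERS[name] = func
        return func
    return decorator


def backup_all(dest_dir, today=None):
    """Run every registered exporter and write one dated file per service."""
    today = today or datetime.date.today()
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for name, export in sorted(EXPORTERS.items()):
        path = dest / f"{name}-{today.isoformat()}.export"
        path.write_bytes(export())
        written.append(path)
    return written


@register("example")
def export_example():
    # Placeholder only: a real exporter would fetch data from the service.
    return b"<html>placeholder export</html>"

# A crontab entry for a weekly run might look like this (the path is made up):
# 0 3 * * 0 python3 /path/to/backup.py
```

Adding another service means writing one more function and decorating it with register. The dated file names keep older backups around instead of overwriting them.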
UPDATE - Scoble Says: Long Live the Client Side
Scoble touches on Web Services gone Bad in his posting Salesforce Hits Problems.
But, back to the Six Apart point. Truth is that these systems are still way too fragile and having a totally resilient system is extremely difficult. I’m certainly not going to throw the first rock here. But, I love having systems that have BOTH a Web and a local storage capability.
See. Now go start developing your Web 2.0 desktop clients. They are the next big thing. And for goodness sake, why don’t you write the universal backup widget for Windows or OS X?
UPDATE - Christopher Coulter Gives 8 Reasons Why Web Services Fail
In a comment on Salesforce Hits Problems, Christopher Coulter reminds us of the great many ways in which web applications can fall apart.
Servers, scability, limits of mark-up tech, coding errors/buffer-overflow’s/poor error-control, differing browser implementations/different renderings, database strain, DOS attacks, Security/hackings — whatever the point of failure and wherever you decide to lay the (political) blame, the fact remains — backups are vital. Indeed the first step.
Something to consider. Why network when you don’t have to?