Putting your trust in the cloud?
10:47 am January 23rd, 2008 by Sal Cangeloso
If you haven’t been following the Bingodisk/Strongspace outage, you should, because it is an interesting one. Basically, after updating their Sun server, the people at Joyent started to experience some issues and took the services offline. At this point it’s been about three days, and a lot of people have noticed. Joyent apparently encountered a known issue with Sun’s OpenSolaris OS and its file system, ZFS. Not only has there been serious downtime, but the backups are corrupted as well.
The blog post points out that both systems were based on a single Sun X4500 server. Even now that the problem has been identified, it is taking a significant amount of time to get the server’s (up to) 24TB of data sorted out. Once this is done, all the data should be restored.
A little clip from Bingodisk’s site:
Protected against disk failure
BingoDisk uses hardware and disk architecture that is an order of magnitude more reliable than that hard drive you may be tempted to buy down at the local CompuMall. Rather than writing your data to a single drive, BingoDisk spreads the data across up to 45 disks, which significantly improves reliability while never sacrificing access speed.
Not to kick Joyent while they’re down, but this does raise some interesting questions about online storage and the entire cloud mentality. People store things in the cloud for security and access. We trade privacy and certain aspects of control for access from anywhere and the comfort of knowing that professionals are taking care of our data and that it is 100% secure. This is why people, even technically proficient people, use webmail and online storage, and it is the fundamental case for cloud computing.
As it turns out, all this data was stored on a single X4500, which amounts to a $35,000 piece of hardware.
…
I forgot to work on this draft for a few days, but it actually worked out well because now the two services are back online. No data was lost, which is a testament to the skill of the Joyent team and to ZFS. Another bit of semi-good news is that this happened with their consumer products, not Accelerator, which is their main commercial product (though there have been complaints about this, mainly from Twitter’s people, I think). Still, there were ten days of downtime. This is being addressed by replacing the Strongspace service and giving 9 months of free service to current Strongspace users. Users will also get two months free for the new service, and Strongspace will be open sourced, which is really cool. Bingodisk users get two months of free service, and people can opt out of extended contracts with a refund if they want.
Overall, Joyent is doing a great job of cleaning up their mess. Yes, they are just throwing money at people and hoping they stick around, but these things happen and they have handled everything extremely well. Not being a customer of theirs, I am feeling really forgiving, but if I were relying on that data and could not get to it for 10 days, that might be a different story.
As a side note, I am still considering using Joyent for hosting in future projects. This is not the best thing for their reputation, but the Accelerator (the important service) was untouched and Joyent handled the issue with transparency. Plus, they are really taking care of users now, which is nice, as a 10-day refund would not have been nearly enough.

