Hold it! Don't back up to a cloud until you've eyed up these figures
Trevor Pott drills into the real price of online storage
Online data vaults are everywhere. On the small storage side, we have options such as Google Drive, Dropbox, and Teamdrive. My Synology NAS, the upcoming 2012 Microsoft Server Suite and any number of virtual appliances can all back up bulk data to the cloud. The software side of things may be settled, but is this all truly feasible?
The internet is not a dump truck
Remember how we all had a laugh at US Senator Ted Stevens when he said the internet is a "series of tubes"? Well, the internet is not something that you just dump something on. It's not a big truck. It's a series of tubes.
And if you don't understand, consider that those tubes can be filled and if they are filled when you put your message in, it gets in line and it's going to be delayed by the enormous amount of material clogging that tube.
Spend enough time working out the details of cloud storage, and you'll start to be a lot more ambivalent about the ridicule the web rushed to heap upon Stevens, who later died in an aeroplane crash.
Looking at cloud storage, can you realistically push this kind of data over a WAN connection and get other work done at the same time? The answer is "it depends".
If you are doing "small storage" - defined by me as less than 100GB per month as of October 2012 - then the answer is most likely "yes, you can move that around without degrading your internet connectivity". Until we start talking about fibre to the premises, most businesses are probably restricted to a cable or ADSL connection. An upstream of 2.5Mbps (the average for my hometown in Alberta, Canada) can, if saturated around the clock, push up 791GB per month.
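For anyone who wants to check that figure, here is a minimal sketch of the arithmetic. It assumes binary units (1Mbps taken as 1,048,576 bits per second, 1GB as 2^30 bytes) and a 30-day month, which is what reproduces the 791GB quoted above; with decimal units the result would be closer to 810GB. The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 3600


def monthly_capacity_gb(upstream_mbps, days=30):
    """Theoretical upload capacity in GB for a link saturated around the clock.

    Assumption: binary units (1 Mbps = 1024 * 1024 bits/s, 1 GB = 1024**3 bytes).
    """
    bytes_per_second = upstream_mbps * 1024 * 1024 / 8
    return bytes_per_second * days * SECONDS_PER_DAY / 1024**3


print(f"{monthly_capacity_gb(2.5):.0f}GB per month")  # 791GB per month
```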
To make sure that your cloudy storage doesn't impinge upon your regular usage of your broadband you have two options: enable quality-of-service features in your router, or restrict uploads to "at night only". A lot of modern consumer routers do good QoS, so I don't think small businesses will have trouble with that option. Restricting your uploads to the eight "off-peak" hours in the dead of night means you can move 263GB per month on a 2.5Mbps upstream. Even allowing for overhead, you can easily get your 100GB per month worth of backups onto the cloud.
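The night-only figure comes from the same kind of arithmetic with a shorter daily window. The sketch below assumes binary units and a 30-day month; the 20% protocol overhead is an illustrative guess, not a measured value, and the function name is made up for this example.

```python
def window_capacity_gb(upstream_mbps, hours_per_day, days=30, overhead=0.0):
    """GB that fit through an upload window of `hours_per_day` hours each day.

    Assumptions: binary units; `overhead` is the fraction of throughput
    lost to protocol overhead (a hypothetical value, adjust as needed).
    """
    bytes_per_second = upstream_mbps * 1024 * 1024 / 8
    seconds = days * hours_per_day * 3600
    return bytes_per_second * seconds / 1024**3 * (1.0 - overhead)


raw = window_capacity_gb(2.5, hours_per_day=8)
usable = window_capacity_gb(2.5, hours_per_day=8, overhead=0.20)
print(f"raw: {raw:.1f}GB, with 20% overhead: {usable:.1f}GB")
```

Even with a fifth of the window written off, roughly 210GB still gets through on these assumptions, about double the 100GB target.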
I can state with confidence that if you already have a business ADSL connection with 2.5Mbps upstream and at least a 200GB per month transfer limit (not hard to find in urban areas in most developed nations) then cloud storage for anything below 100GB per month will make sense. The convenience and reliability are easily worth the marginal cost.
Wait, what does it cost to back up 1TB every month?
I have clients that have backup requirements of 1TB per month. These are small businesses; approximately 15 people per biz, but they generate this kind of data without a problem. According to Amazon's S3 calculator, Amazon does not let you make 1TB volumes. We will thus break up our monthly data into 500GB volumes for simplicity. The cost of storing 1TB of data on Amazon's S3 cloud is $97 per month.
What about the bandwidth cost? My local ISP offers 1TB of data for $350 per month. The maximum theoretical capacity of that link (15Mbps) is approximately 4.6TB a month, which is not realistic over a cable modem, business package or not. I should be able to squeeze a terabyte a month out of it though, assuming I don't use it for anything else. (Shaw offers a $250/month package with a 5Mbps upstream. Despite trying in multiple cities, we can't actually get it to sustain a high enough throughput to push up a whole terabyte in a single month.) That makes the cost of backing up 1TB per month to Amazon S3 $447 per month.
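One way to see why the 15Mbps link is comfortable and the 5Mbps one is not is to work out how much of each link a terabyte a month would occupy. A rough sketch, assuming binary units, a 30-day month and a link used for nothing else:

```python
def required_utilisation(data_tb, upstream_mbps, days=30):
    """Fraction of the month the link must run flat out to upload `data_tb`.

    Assumption: binary units (1 TB = 1024**4 bytes, 1 Mbps = 1024 * 1024 bits/s).
    """
    capacity_bytes = upstream_mbps * 1024 * 1024 / 8 * days * 24 * 3600
    return data_tb * 1024**4 / capacity_bytes


for mbps in (15, 5):
    share = required_utilisation(1, mbps)
    print(f"{mbps}Mbps: {share:.0%} of the month at full rate")
```

On these assumptions a terabyte needs about 22% of the 15Mbps link but about 65% of the 5Mbps one, which leaves little margin for a connection that cannot sustain its rated speed.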
Doing this in-house, I would buy two 2TB drives (the formatted capacity of a 1TB drive is less than 1TB) and put them into a RAID 1 setup. The internet tells me this costs $400 for the disks and the box to house them. I use this solution with many customers; there is a local delivery service that collects the hard drive boxes and stores them for $35 a month.
The in-house solution cost for 1TB is $435 per month; comparable to Amazon's S3. Where it falls apart is recovery. If I had to retrieve 1TB of data, Amazon would charge me $123. Shaw doesn't officially discuss what it would do if you went over the data limit on their top-tier connection. This is still being revisited internally, and rumours are that you will be throttled into the ground.
This means that in order to pull the data down from S3, I would need to get another connection for that month; the data recovery cost is $473 per terabyte. My delivery and storage service would charge me only $100 to find the right "tape" and return it to me. It would also take a month to get the data off the S3 cloud, versus a guaranteed 2-hour maximum from my local delivery guy.
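The 1TB comparison can be tallied in a few lines. This sketch only adds up the dollar figures quoted above; the constant names are made up for the example, and like the text it treats the $400 of hardware as a per-month purchase.

```python
# All figures are the US dollar amounts quoted in the article.
S3_STORAGE = 97        # 1TB stored on S3, per month
ISP_PLAN = 350         # 1TB/month business cable plan
S3_RETRIEVAL = 123     # Amazon's charge to pull 1TB back out
DRIVES_AND_BOX = 400   # two 2TB drives plus the box to house them
COURIER_STORAGE = 35   # monthly collect-and-store service
COURIER_RETURN = 100   # fee to find and return a "tape"

cloud_monthly = S3_STORAGE + ISP_PLAN
inhouse_monthly = DRIVES_AND_BOX + COURIER_STORAGE
cloud_recovery = S3_RETRIEVAL + ISP_PLAN  # a second connection for the month
inhouse_recovery = COURIER_RETURN

print(f"monthly:  cloud ${cloud_monthly}, in-house ${inhouse_monthly}")
print(f"recovery: cloud ${cloud_recovery}, in-house ${inhouse_recovery}")
```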
A terabyte is sod all. What about moving 15TB a month?
I also have one customer cheerfully ploughing through approximately 15TB per month of backups. Now, 31 Amazon S3 volumes at 500 gigabytes each, with 15TB of data transferred in, cost $1,547/month. Leaving the $900/month fee the ISP charges just to light up the fibre out of the equation, we pay $2,800/month for a sustained 50Mbps connection, no metering. Assuming 30 days in a month, this has a theoretical capacity of approximately 15.45TB a month. This lines up with what I see in the real world; we flatten it all month long and just get the backups done. This makes the monthly cost of the solution $4,347.
The 4TB 7200 RPM Hitachi Deskstar sells for $329 at my local computer retailer. Five of these drives (for RAID 5) is $1,645; a Synology DS1512+ costs $899. A 10x10 storage unit is $233/month, and the delivery guy costs me $33 per run. So for me to back up 15TB off-site each month costs $2,810 per month.
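Both 15TB monthly totals can be rebuilt from the quoted line items. A sketch, assuming one delivery run per month, fresh hardware bought every month, and the $900 fibre fee left out as stated; the constant names are made up for the example.

```python
# Cloud side: figures quoted for Amazon and the ISP (US dollars per month).
CLOUD_STORAGE = 1547
FIBRE_50MBPS = 2800

# In-house side: hardware plus off-site storage and one courier run.
DRIVE_PRICE = 329
DRIVE_COUNT = 5          # RAID 5 set
NAS_PRICE = 899          # Synology DS1512+
STORAGE_UNIT = 233       # 10x10 unit, per month
DELIVERY_RUN = 33        # assumption: a single run per month

cloud_monthly = CLOUD_STORAGE + FIBRE_50MBPS
inhouse_monthly = (DRIVE_PRICE * DRIVE_COUNT + NAS_PRICE
                   + STORAGE_UNIT + DELIVERY_RUN)

print(f"cloud: ${cloud_monthly:,}  in-house: ${inhouse_monthly:,}")
# cloud: $4,347  in-house: $2,810
```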
Any time we wanted to do a recovery of that data, Amazon would charge us $1,689.51. I would also have to up the bandwidth allocation for that month by another 50Mbps, meaning an additional $2,800. This makes the cost of recovery $4,489.51, and it would take a month to pull down the data. Even if I wanted to, I can't make it go faster; the fibre pipe the ISP can provision me only goes to 100Mbps.
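The recovery figures can be sanity-checked the same way. The sketch below assumes binary units and that the extra 50Mbps is dedicated entirely to the restore; the function name is illustrative.

```python
def days_to_transfer(data_tb, mbps):
    """Days needed to move `data_tb` over a link running flat out at `mbps`.

    Assumption: binary units (1 TB = 1024**4 bytes, 1 Mbps = 1024 * 1024 bits/s).
    """
    bytes_per_day = mbps * 1024 * 1024 / 8 * 24 * 3600
    return data_tb * 1024**4 / bytes_per_day


AMAZON_RETRIEVAL = 1689.51   # quoted charge for pulling the data out
EXTRA_BANDWIDTH = 2800.00    # a second 50Mbps for the month

recovery_cost = AMAZON_RETRIEVAL + EXTRA_BANDWIDTH
print(f"cost: ${recovery_cost:,.2f}")                          # cost: $4,489.51
print(f"time: {days_to_transfer(15, 50):.1f} days at 50Mbps")  # time: 29.1 days at 50Mbps
```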
Everything about those calculations is wrong
The above needs to be taken with a bag full of salt. These are back-of-beer-mat calculations that leave out a lot of important considerations. At first blush, 1TB of online storage doesn't compare too badly with the local delivery guy's solution. The 15TB one is obviously nowhere near feasible.
This misses a couple of things. The online solution presumes that my 1TB of online storage is completely overwritten each month. If not (as is the case with many types of backups), then the cost of storage will rise perpetually. The local storage option also presumes that the backup "tapes" are purchased and then never re-used. In some backup schemes, this is true. In others, these "tapes" are rotated out, being overwritten again on a regular cycle. I tend to use one where there are a total of 12 "tapes" in play; backups occur weekly, and each "tape" is recycled every three months.
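To illustrate both caveats: if nothing is ever deleted, the stored data and its bill grow every month, whereas a rotation caps the number of "tapes" in play. The sketch below assumes a flat $97 per stored terabyte per month (a simplification, so treat the numbers as a rough illustration) and models the 12-"tape" weekly rotation as a simple round-robin. The function names are made up for this example.

```python
PRICE_PER_TB_MONTH = 97  # the article's S3 figure; a flat rate is an assumption


def cloud_bill(month, tb_added_per_month=1, overwrite=False):
    """Storage bill in a given month (1-based), in US dollars."""
    stored_tb = tb_added_per_month if overwrite else tb_added_per_month * month
    return stored_tb * PRICE_PER_TB_MONTH


def tape_for_week(week, tapes=12):
    """Index of the "tape" written in a given week (0-based), round-robin."""
    return week % tapes


print([cloud_bill(m) for m in (1, 6, 12)])                  # [97, 582, 1164]
print([cloud_bill(m, overwrite=True) for m in (1, 6, 12)])  # [97, 97, 97]
print(tape_for_week(0), tape_for_week(12))                  # 0 0
```

Week 12 lands back on the first "tape", which is the roughly three-month recycling interval described above.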
The cost of connectivity is also highly variable. If you are sitting on an internet pipe so fat that bandwidth is measured in cents per gigabyte, then cloud storage economics shift entirely. What seems economically reckless to the small and medium enterprise market starts to make a lot more sense at scale.
What about legal jurisdictions?
I have a massive DVD and Blu-ray collection. I have ripped every single one of these discs to my NAS. Until just recently, this was legal in Canada; no "digital locks" laws had yet been passed. Those ripped movies are perfectly legal for me to retain; what is their legal status if they are synchronised to a cloud service? If they are synchronised to a service in a country where their mere creation is illegal (because it required breaking encryption) can I be financially ruined for backing them up?
What about an ebook of George Orwell's 1984? In Canada, Australia and other nations this novel is a public-domain work. It is not public domain in the United States. Assuming the ebook I am using has disclaimed all rights (there are several such works), or I have scanned a print version myself, under what laws am I judged if this gets stored on an American server?
What if it merely travels through an American router or American-owned fibre pipe? Corporate liability regarding intellectual property remains legally nebulous in many places.
We can do this cloud storage thing, if we choose to. In some circumstances, it makes good business sense to do so. In others, it would be ridiculous to even attempt.
Setting aside legal ambiguities, the limiting costs of cloud storage aren't the cost of the service provided by companies such as Amazon. Cloud storage providers offer storage cheaper than most companies can source it themselves. The boundaries on cloud storage are the throughput and rates provided to us by our ISPs. The cloud storage revolution can't truly begin until the cost of internet access comes down. ®