June 17, 2008, 10:32 p.m.

Good, Fast, Cheap? Eh, No Thanks

I had a conversation with a guy today about scaling his app. The app looks really simple and is still kind of below the radar in popularity, but is expected to grow for a couple of different reasons. They're experiencing a bit of slowness, but nothing too out of control. I heard they just bought about seventy servers to run the application.

Now, their app is a really simple read-mostly content spewing app. They're adding some more interactivity to it, but it's more realtime-ish stuff, the kind of thing I would absolutely not put a database in the critical path of. Admittedly, they don't have any experience with load balancing, caching, etc. However, they do have about seventy servers now.

I'm pretty sure I could meet their current load requirements on one really bored server, so I thought I'd offer some assistance. I suggested as much and showed examples of how I'd done this kind of thing in the past. I told him I thought buying so many machines was a really bad idea, and asked what his thoughts were on EC2, since it'd be immediately cheaper and could scale nearly instantly to whatever they needed in the medium-to-long term.

This is where things got a little weird for me, to the point of pulling me away from the conversation about software architecture I'd intended to have and into one about a lower level of scaling and general cost reduction. The guy said he didn't like the idea of EC2 for a number of reasons, which I'll list below.

  1. I don't want an EC2-based business
  2. I want to be running on my own hardware so I'll have more assets for a potential sale in 2-3 years
  3. What if Amazon jacks up the price a lot?
  4. What if Amazon decides to not run this service anymore?
  5. What if we have a huge traffic spike? We can't be spinning up instances while we're being beaten down.
  6. I used EC2 before and it cost me $150/mo for an idle machine. Can you imagine multiplying that by 70?
  7. We've got lots of money, so cost efficiency isn't a problem

I found the list a little... backwards. I'll go into detail just in case any of them seem to make sense.

I Don't Want to Be an EC2-Based Business

His is not an EC2-based business. His is a software business that provides services to clients. It's a web site. Buying piles of hardware adds a lot of op ex, builds a cost center for his business, and distracts him from his core competency (which, as he said, is not scaling out hardware).

I Need the Assets to Sell My Business

This one I didn't understand at all. If a significant part of his company's value 2-3 years from now is in commodity hardware, he'll be in sad shape.

The interesting thing is that the hardware is already about two years old. He paid about $24k for two racks of 34 machines each. Best part: it came out of an Amazon cluster. So it's older than anything you'd get in EC2, and if he's planning on being bought in 2-3 years, it'll be 4-5-year-old commodity hardware by then.

You may find after a year or so that newer hardware would give you a lower cost per request served, right around the time you need to serve significantly more requests. New hardware would make a lot of sense then. You can throw away all the machines you bought, or you can just start relaunching EC2 instances. I know which one I think is easier.

What if Amazon Raises the Price Significantly?

You move.

Right now, it's the cheapest way to deploy an app you want complete control over and want to be able to scale with demand. If it's not tomorrow, pick it up and take it elsewhere.

Here he's betting that Amazon is going to lock him in somehow and then screw him out of more than $24k. That's not the whole story, though. $24k is just the acquisition cost. These 68 machines still have to be housed somewhere, with connectivity to them, redundant switches, redundant power supplies, careful distribution across different PDUs and switches so local outages can't take you out, spare parts for when MTBF strikes, and lots of other hidden costs. It's a valid way to do things if you know it'll be cheaper, but the op ex is likely to be at least as high as Amazon's, with a cap ex added on top.

What if There's a Huge Traffic Spike

In the EC2 model, fronted by something like fuzed and with a bit of preparation in building a custom AMI, it'd be unlikely to take even a full minute to add a node to a cluster. And as Jeff Bezos talked about at Startup School, Animoto went from 50 to 3,500 servers in three days. You just can't do that with standard colocation practices.
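
To give a sense of how little ceremony that takes: once your app is baked into a custom AMI, adding capacity is a single API call. Here's a sketch using the boto library; the AMI id and counts are made up, and registering the new nodes with the balancer is left to something like fuzed:

    # Sketch: grow the cluster by launching instances from a prebuilt AMI.
    # The AMI id below is hypothetical.
    import boto

    # Picks up AWS credentials from the usual environment variables.
    conn = boto.connect_ec2()

    # Ask for up to ten more m1.smalls; EC2 hands back whatever it can
    # start right away.
    reservation = conn.run_instances('ami-12345678',
                                     min_count=1,
                                     max_count=10,
                                     instance_type='m1.small')

    for instance in reservation.instances:
        print("%s %s" % (instance.id, instance.state))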

In contrast, he wants to have all of his new machines running 24/7 in case of a traffic spike. When your traffic is low to normal, you're just burning cash. When the traffic is really high, you're just plain burning. A significant amount of new hardware can't even be acquired in a day even if you have a place to put it. Installation will be a pain. And when the huge spike is over, you'll just be burning cash even faster.

I Used EC2 Before and It Cost a Lot

He said he'd spent $150/mo on an idle machine in the past and felt that was an incredible cost. Can you imagine that times 70? Um, no, I can't, because you'd just be trying to give money away if you chose to run 70 idle high-CPU medium servers 24/7 for a month. Fundamentally, it's the same problem you'd have running them in-house, but that always gets calculated differently for some reason. But if $150/mo is what it takes to run his app, then his new cluster will pay for itself in... a bit over thirteen years.

At low-traffic times, he can probably survive on, say, three machines ($210/mo minimum). At high-traffic times, let's imagine he needs up to 70 servers at peak ($7/hr during the peak hours). That's all that needs to go on.

But imagine for a moment he did need 70 servers running all the time. That's right around $5k/mo for small instances (which I'm guessing is what this purchased hardware is, or is at least equivalent to). That means it'd take about five months for the new hardware to pay for itself, and that's if electricity, bandwidth, and maintenance were completely free.

But, again, the reality is that he'd likely be paying closer to $500/mo, for long enough that by the time he'd run up $24k worth of EC2 bills, the hardware would be completely obsolete.
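
To make the arithmetic concrete, here's the back-of-envelope math in a few lines of Python. The instance rate is the 2008 on-demand list price; the traffic shape in the elastic scenario is my guess, not a measurement:

    # Back-of-envelope break-even math using the numbers from this post.
    hardware_cost = 24000.0    # what the two racks cost him
    small_rate = 0.10          # $/hr for a small instance, 2008 list price
    hours_per_month = 24 * 30

    # Scenario 1: all 70 instances running 24/7.
    always_on = 70 * small_rate * hours_per_month
    print("70 smalls 24/7: $%.0f/mo" % always_on)                   # ~$5,040
    print("break-even: %.1f months" % (hardware_cost / always_on))  # ~4.8

    # Scenario 2: elastic. A three-machine floor, bursting to 70 for
    # four peak hours on ten busy days a month (a made-up shape).
    floor = 3 * small_rate * hours_per_month                        # ~$216/mo
    bursts = 67 * small_rate * 4 * 10                               # ~$268/mo
    elastic = floor + bursts
    print("elastic: $%.0f/mo" % elastic)                            # ~$484
    print("break-even: %.1f months" % (hardware_cost / elastic))    # ~49.6

Always-on, the racks pay for themselves in under five months; with anything shaped like his actual traffic, it takes about four years, by which point the hardware is worthless anyway.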

But We're Not Concerned About Cost

Maybe not, but you're still trying to walk a fine line between having enough servers to handle the day everyone on digg and slashdot starts using your product, and having few enough that you can burn more money on making a good and fast product and less on playing with old computers.

With 68 machines, you'll probably have what appears to be a surplus today, so you can afford to be sloppy with your coding. You might find yourself running at 60% or so of capacity earlier than you would if you were generally aware of cost. Now, given that cost isn't an issue, that in itself isn't a problem. The real problem is what it's going to take to grow beyond this.

In Conclusion

EC2 isn't right for everyone, but better, faster, and cheaper solutions make better, cheaper, and faster companies. If someone else will do some of the hard parts that fall outside of your core competency, you'd better have a really good reason to do them yourself.

What he's doing will work. I've seen terrible things work when I really just wish they wouldn't. It'll just be a lot harder and a lot more expensive than it needs to be. What are you going to do, though?

As for me, I run most of my software on old crappy computers at my house where I've got really bad connectivity. If you can even see this post, consider yourself lucky. Seeing as how I'm the only one who finds any of this interesting, it doesn't matter, though. :)
