And… you said “relational”? Facebook and others do a lot of denormalization, they don’t ever use JOIN, they’d rather do several consequent requests and build intermediate results on a webserver (when you have 20 times more webservers than DBs it’s obviously good to move some load there). They treat good old MySQL as object storage with very fast B+ tree indexes. Finally, the resulting database is not a relational one. One thousand of MySQLs is just a distributed object storage with simple fast indexes and a bunch of hand-written code in php/ruby/python/whatever around it.
I’ve come upon this sort of idea several time recently (and the above was written a couple months ago). I’m warming up to the idea. Without piles of cash and able systems-type folks, scaling databases out is a really nasty problem. Even then, my reading is there are definite bounds for how far you can go.
Assembling datasets on the more easily scaled app server is appealing. It sounds fun (hey, real programming!) and is interesting to think about. But I wonder if it leads to having to figure out consistency in your application. From where I sit, its the hardest part of ACID to reason with.