Many developers, especially of the younger generation, dislike relational databases and their business partner, SQL. Some regard SQL as the new assembly language. With all this distaste going around, how did it gain such a strong foothold in industry?
I offer you two answers: ACID and surface area.
Atomicity, consistency, isolation and durability. It’s not something most folks want to think about. To a rookie developer, it’s overwhelming. They’re not yet familiar with the semantics of the systems their programs run upon. Is fread thread-safe? “How should I know, I just learned C last semester and about fread’s parameters last week!”
The promises of a modern relational database include a compelling bullet point: your data is safe with us. Use our APIs, don’t break the rules, and we will make sure you never blow away some data and get a call at 3 AM. Rather, the DBA will, but what do you care about that guy?
So I submit to you that most programmers don’t use databases because they’re great. Rather, they have come to rely upon them because the canonical tome on transactions is heavy enough to maim small mammals and rife with formalisms. So they skip the nine-hundred page textbook and pick up the six-hundred page O’Reilly book.
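To make the atomicity promise concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the names in it, and the transfer helper are all hypothetical, chosen only for illustration. The second transfer fails halfway through, and the database undoes the half that had already been applied:

```python
import sqlite3

# In-memory database; the schema and rows are made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply or neither does."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # An overdraft violates the CHECK constraint and raises here,
            # after the credit above has already been applied.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # rolled back as a whole
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

Writing the equivalent guarantee by hand on top of plain files is exactly the work the heavy textbook describes.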
Most programs that people will pay you to write involve side-effects. Further, many of those side-effects have to do with saving data off so you can perform further side-effects on it in the future.
The rookie developer typically leans first to files. Files are familiar and pervasive. But files leave a lot to said rookie. How should I structure my data? How do I load and save the data? How do I manipulate the data once it’s in memory? Even in scripting languages, with their simplified APIs, this means the rookie is faced with APIs like this:
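A minimal sketch of that kind of file handling, in Python. The one-record-per-line layout and the names load_scores and save_scores are assumptions made for illustration, not the listing from the original post:

```python
import os
import tempfile


def load_scores(path):
    """Read every "name,score" line into a dict; no file means no data yet."""
    scores = {}
    if not os.path.exists(path):
        return scores
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            name, score = line.split(",", 1)
            scores[name] = int(score)
    return scores


def save_scores(path, scores):
    """Rewrite the whole file; a crash mid-write can leave it truncated."""
    with open(path, "w", encoding="utf-8") as f:
        for name, score in sorted(scores.items()):
            f.write(f"{name},{score}\n")


# Hypothetical usage: every change is a load, an in-memory edit, and a save.
path = os.path.join(tempfile.mkdtemp(), "scores.txt")
scores = load_scores(path)
scores["alice"] = 10
scores["bob"] = 7
save_scores(path, scores)

scores = load_scores(path)
scores["alice"] += 5
del scores["bob"]
save_scores(path, scores)
```

Even in this small case the programmer has to invent the format, the parsing, and the in-memory representation, and nothing protects the file if the program dies partway through a save.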
When I was but a wee lad o’ programming, I found this Gordian knot difficult to cut. But then, one day, I was told by a programmer of greater wisdom to use a database. That API looked like this:
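A comparable sketch using Python's built-in sqlite3 module. The scores table and its columns are made up for illustration, and this is not the listing from the original post:

```python
import sqlite3

# connect, execute, commit and close are Python calls;
# everything inside the quoted strings is SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT PRIMARY KEY, score INTEGER)")

conn.execute("INSERT INTO scores (name, score) VALUES (?, ?)", ("alice", 10))
conn.execute("INSERT INTO scores (name, score) VALUES (?, ?)", ("bob", 7))
conn.execute("UPDATE scores SET score = score + 5 WHERE name = ?", ("alice",))
conn.execute("DELETE FROM scores WHERE name = ?", ("bob",))
conn.commit()

rows = conn.execute("SELECT name, score FROM scores ORDER BY name").fetchall()
conn.close()
```

The Python side shrinks to a handful of calls, and the structure of the data, the updates, and the lookups are all expressed in four SQL verbs: INSERT, UPDATE, DELETE and SELECT.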
It was a lot easier to understand, even though the last four are a completely different language.
So, I submit to you, that SQL also won because it was easier to understand how one might structure their programs, make them work and, if they’re lucky, get them to run quickly.
I’d wager that five years from now, the generation of developers who are now upcoming won’t take the database tier for granted. Key-value stores, distributed file systems and document databases will all play into the question of “what do we do with the important data?” Sometimes, relational databases will prove useful. But increasingly, other things will too.
In the end, there are two ways to look at this: either we will soon throw off the shackles of our relational overlords, or you should prepare yourself for the database renaissance in programming fashion that will occur in a decade or so.
13 thoughts on “How did SQL get so popular?”
Great article Adam! I think you are spot on.
When I read the title, I was thinking more on SQL vs CouchDB or Persevere or db4o rather than SQL vs flat-files…
“So I submit to you that most programmers don’t use databases because they’re great. Rather, they have come to rely upon them because the canonical tome on transactions is heavy enough to maim small mammals and rife with formalisms. So they skip the nine-hundred page textbook and pick up the six-hundred page O’Reilly book.”
What do you mean by the above paragraph? So we use databases because the manuals for databases are complex and huge? Or are you saying that transactions are complex to implement therefore we, by default, use a system that has already implemented the transactions?
@Hendy yeah, “How did relational databases get so popular” would have been a better title. But, most people strongly couple SQL to relational databases anyway. Also, I arbitrarily picked file-based storage because that’s what was available when I learned about databases. Someone coming up to speed today would be faced with even more choices.
@Jeremy I’m saying that the manuals for databases are easier to read than the books on implementing transactional, i.e. ACID, systems. So, people learn those rather than implement their own datastore that provides one or more of the ACID guarantees that a typical relational database does.
I don’t understand point #1. “ACID” and “relational” are completely orthogonal. I’ve used non-ACID implementations of SQL, and I’ve used ACID non-relational databases. In fact, MyISAM was one of the most popular ways to use SQL for a while (and maybe still is), and it’s not ACID.
I think #2 is part of it, but the answer is really much simpler: there’s a standard SQL with multiple competing implementations, and (in the past 10-15 years) many of them free. As a developer, this is a sign that (a) it’s not going away, and (b) they’re actually going to try to compete.
I’ve used OODBs which were awesome, but I know of only a couple OODBs, and I can only name one which implements (a small part of) OQL. OODBs are awesome to use (if you use their native API), but as soon as there were 2 RDBMSs that used SQL, network effects took over and it became advantageous for everybody to use that.
How about: the relational model is a fundamentally superior way of thinking about data when compared to the network and hierarchical models, and SQL’s success is due to this key fact.
Modern (web-developer) obsession with easy super-scalability and cheapness over data integrity and flexibility has led to a devaluing of what the relational model provides.
@Ken, those are all excellent points. Thanks for adding some nuance for those who read this far.
@JKF I wouldn’t go so far as to say it’s fundamentally superior. But it does have its place. As to the current relational backlash and value judgements, well, you probably shouldn’t go work for Facebook anytime soon ;)
The point of the article seems to be that SQL is better for database manipulations than pure C. Well, duh! It is. SQL was designed as a representation of relational algebra and as such can be used to effectively describe relations.
However, it is very inconvenient for programming work. The level of abstraction it presents does not appear to be a good fit for software development. This is the main reason for the dislike of SQL among developers.
But we have nothing else. Essentially, we use SQL because we need to use relational databases.
@Simon I originally used verbatim C APIs, but changed it to look like a scripting language to reflect modern practice. Even then, amongst Ruby, Python and PHP, only Ruby gives you a better abstraction than C. Last time I used the others, you had to drag file descriptors (or something like them) around.
It’s an excellent point that SQL is valuable because it maps well to the relational model. As was pointed out on Hacker News, SQL also abstracts the programmer away from nasty things like references and sorting.
1) SQL is NOT relational…it was originally designed around the Relational Model, but it is not relational (e.g., it allows duplicate rows)
2) Most programmers these days are taught and accustomed to an imperative style of programming. SQL, being declarative, comes off as more difficult. However, this is due to the shortsightedness of not realizing that declarative code is less prone to errors and more robust.
@Chris, I think many folks also struggle with thinking in sets, which is pretty crucial to getting down with SQL.
Quote from the SQLite home page:
“Think of SQLite not as a replacement for Oracle but as a replacement for fopen().”
Great find, Stephan.