For the last few days I’ve been struggling to bend Rails to my will regarding the proper way to assure data consistency. Today I made some progress. This builds upon some research I did a few months ago, and hopefully this is a more or less complete solution to the problem of making Rails work the way I want it to regarding test databases.
DHH has clearly stated that he does not like a smart database. This is common among application developers, particularly in the agile methods camp, in that they generally appear not to understand relational set theory, or if they do, they believe that it is inherently inferior to object oriented methods (which lack a theoretical basis, as Fabian Pascal will happily shout at anyone who will listen). I gather from DHH’s statements that he merely is trying to practice Don’t Repeat Yourself (a.k.a. DRY, one of the most important values of Rails). I gather from Rails itself that he either respects the need of some folks to disagree with him enough to provide hooks to bypass ActiveRecord, or that he at least agreed with someone else’s patch. By this I mean that there are ways around the ORM features of ActiveRecord, to do raw SQL and to execute raw DDL at database creation time, which implies that he isn’t trying to force his opinions on others, but rather to make it easier to do things his way than to do them a different way.
Fair enough. Rails is opinionated software, as DHH often says, and I have found several cases where letting go of my particular way of doing things has been fine, given that Rails has a different but equally valid way of doing things that is made super easy by the framework.
However, I disagree with his decision to keep the DB stupid, for two reasons.
First, I prefer to put logic where it belongs, rather than gathering it all in one place. AJAX, and in particular Google Maps, is a good example of presentation logic going where it belongs, making the whole application work better. SQL RDBMSs have features that can be abused, and in some cases these features are there because a wrong-thinking but wealthy client demanded them, but most of the advanced features that a “Real Database” has are there so that you can protect yourself against data loss or data corruption. The database is in a unique position to let you declare rules for things that must always be true, and then to trust that the database will never violate those rules. Older versions of MySQL were notably lacking in these features and their absence was justified by MySQL staff who basically said “you don’t need that, and if you want it, you’re confused.” Rails has inherited some of these damaged assumptions from MySQL, leaving basic relational features like foreign keys out of the framework(!). Fortunately Rails allows plugins, and there is a set of foreign key plugins that overturn this decision. But in general, if the database belongs to your application, that’s not an excuse to move database functionality into application code. By calling it your application’s database (as opposed to an Integration Database) you imply that it is part of your application, and therefore any rules or procedural code in it is necessarily also part of your application. You can’t monopolize the database and say that no one else has any business using it, while at the same time holding it at arm’s length and saying it’s not a valid part of the application. It is. No, business rules probably don’t belong in the database, but basic data consistency maintenance (in rule or procedural form) does.
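To make the foreign key point concrete, here is a minimal sketch of a database refusing an orphaned row no matter what the application code does. It uses Python and an in-memory SQLite database purely as a stand-in; the `customers` and `orders` tables and the `try_orphan_order` helper are hypothetical names invented for this illustration, not part of Rails or any plugin mentioned above.

```python
import sqlite3

# Hypothetical schema for illustration only; names are assumptions.
conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (1, 1500)")


def try_orphan_order(connection):
    """Try to insert an order that points at a customer that does not exist."""
    try:
        connection.execute(
            "INSERT INTO orders (customer_id, total_cents) VALUES (999, 100)"
        )
    except sqlite3.IntegrityError as exc:
        # The database, not the application, caught the inconsistency.
        return f"rejected: {exc}"
    return "accepted"


result = try_orphan_order(conn)
print(result)
```

The rule is declared once in the schema, so every client that writes to these tables is held to it, whether or not that client remembered to validate anything.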
I’m being charitable here, but my experience with individual practitioners of the Stupid Database Method invariably ends with me finding out that they don’t really understand databases at all (hence the desire to abstract the database away entirely with a driver plugin architecture topped by an ORM layer, lest they have to understand how a specific database product works), and would rather remain ignorant and reinvent the same functionality in the application layer or in the ORM layer. (It’s a case of “when all you have is a hammer, everything looks like a nail”, where the hammer is a general-purpose programming language, and you’re looking at a problem of high performance concurrent transactional programming.)
Not surprisingly, the database of choice for these folks is the least featureful, lowest cost, easiest to install one available. Because naturally it’s much more agile to write and debug new multithreaded transactional code in a high level dynamic language than it is to get the same functionality for free in a thoroughly tested product that’s written in C. Right? Perhaps DHH is not one of these people. I assume he is not, again based on what he has said and coded. But nevertheless, the folks I’ve talked to personally who agree with his point of view are all coming from a point of view of willful ignorance.
Secondly, I prefer to employ defense in depth against data errors. Transient errors can have workarounds, but data errors are permanent, and that means that if your data is valuable, the damage done can be irreversible. Just because it’s possible for correct application code to avoid race conditions, improper escaping, etc. doesn’t mean that you should put all your eggs in that basket. When the price of data corruption is high (i.e. if you value the data in your database) then it’s worth the duplication of effort: test the application code, but also put a constraint in the database that will catch things the application code missed.
This is the same sort of thinking that leads to using automated unit tests, then functional tests, then integration tests, and then some manual QA, all overlapping. Duplication of effort? Yes. Worth it? Yes. Database bugs are arguably the worst kind of bugs to find in production, so they merit extra code that maybe isn’t absolutely necessary for the application to work, but is nice to have since you’d like to sleep at night.
So, I feel justified in wanting to put CHECK constraints and triggers in my database.
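As a rough illustration of that defense in depth (not the Rails implementation, and again using Python with in-memory SQLite as a stand-in), the sketch below pairs a deliberately careless application function with a CHECK constraint and an audit trigger. The `accounts` and `account_audit` tables and the `withdraw` function are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    -- Rule that must always hold: a balance may never go negative.
    balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0)
);
CREATE TABLE account_audit (
    account_id INTEGER NOT NULL,
    old_balance INTEGER NOT NULL,
    new_balance INTEGER NOT NULL
);
-- Trigger: record every successful balance change.
CREATE TRIGGER accounts_audit AFTER UPDATE OF balance_cents ON accounts
BEGIN
    INSERT INTO account_audit (account_id, old_balance, new_balance)
    VALUES (OLD.id, OLD.balance_cents, NEW.balance_cents);
END;
INSERT INTO accounts (id, balance_cents) VALUES (1, 500);
""")


def withdraw(connection, account_id, amount_cents):
    """Deliberately careless application code: it never checks the balance."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE accounts SET balance_cents = balance_cents - ? WHERE id = ?",
                (amount_cents, account_id),
            )
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint caught what the code missed
    return True


ok = withdraw(conn, 1, 200)          # leaves 300, allowed
overdrawn = withdraw(conn, 1, 1000)  # would leave -700, refused by the database
balance = conn.execute(
    "SELECT balance_cents FROM accounts WHERE id = 1"
).fetchone()[0]
audit_rows = conn.execute(
    "SELECT old_balance, new_balance FROM account_audit"
).fetchall()
print(ok, overdrawn, balance, audit_rows)
```

The application bug is still a bug and still deserves a test, but the bad row never reaches the table, and only the successful change is audited.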
The implementation details are discussed in part 2.
When your programming experience consists of web applications with a simple database back-end, the Rails model almost makes sense. The typical Rails developer apparently has little experience with large-scale databases that serve data to multiple applications. In addition to the reasons cited in the article, putting data integrity enforcement and business logic into the application instead of into the database eventually violates the DRY principle because that same logic must be duplicated in every application that uses the database. That might almost be OK if every client application was written with Rails, but that’s not practical in many real-world applications that already have established databases and related applications.
DHH and the Rails community dismiss “legacy” databases as if they are old-fashioned and soiled and should be replaced by a new dumb database. That isn’t realistic for a lot of companies, which is why the Rails success stories are small applications and not large-scale projects. That’s OK, Rails has a niche that it’s good in, but that is not a good reason to criticize the relational database model and 30+ years of collective experience with database management and programming.
See my article on Abject-Oriented Databases for more information:
http://typicalprogrammer.com/databases/abject-oriented-databases/
I agree with the statements you make in this post but I feel like the tone comes off as pretentious and unfairly critical towards the rails developers (DHH in particular). I admit that he deserves an equal share of vim and vigor that he doles out on the regular and certainly everyone is entitled to their own tone. However, I think the excellent points you make in the last two paragraphs lose out due to your delivery.
Coming from a point of data integrity you are right on the money. I don’t think that the original design goals of Rails were geared towards this mindset but this type of engineering comes at a price both in hiring experienced DBAs and in writing the extra transactional SQL and application binding. If you are going to pay this price you certainly won’t have a problem replacing the modular ORM back-end on rails. It sounds like you are already well along this path.
I think everyone in the rails community would love to see you submit extensions that help model the types of data integrity correctness that you are passionate about.
Just don’t go crazy and put all your business logic in stored procedures, like most DB engineers would have you do.
@greg:
The Abject-oriented stuff is pretty funny. I’m reminded somewhat of mrbunny.com and the hilarious books and card game that can be found in Mr. Bunny’s universe of sarcastic software self help books.
@proj:
Unfortunately I tend to be confrontational in tone, and perhaps aiming to be less shrill than Fabian Pascal is not sufficiently amiable. But I try to distinguish between DHH (who apparently gave the issue due thought and made an informed decision that I disagree with) and people who are just defending the hill they happen to be standing on. I hope the technical point, and the human point (understand trade-offs instead of just disregarding the unfamiliar) are still clear despite my unintentionally inflammatory tone.
@asd
Paragraph 5 ends with: “No, business rules probably don’t belong in the database, but basic data consistency maintenance (in rule or procedural form) does.” DB engineers can be just as guilty as app developers of basing architectures on their current skill set instead of what the application really needs. It’s just human nature to want to work in your comfort zone; we have to learn to balance our own productivity with what the app needs in order to work properly.
One thing I wanted to nitpick was your point that having data logic in the application invariably leads to code duplication if multiple apps want to execute that logic. This is true if both apps access the database directly. But there is a growing trend of applications asking other apps for data via web services of some kind (REST, WS-*, EJB, etc.) instead of querying other apps’ databases. In this scenario, the application that *completely* manages the database is the only place where the logic is hosted. Although this may not always be the right approach, sometimes this architecture can bring huge performance benefits by leveraging in-memory caches of object structures. Also, the database, in my experience, almost always ends up being the bottleneck when scaling an application. Sometimes the dumb database approach just means more DB interaction, which is bad. But if the application can leverage caching or some other tricks, then this architecture can scale very easily (depending on the actual requirements of the app).
I’m certainly not disagreeing with what you’re saying. I guess this topic is just typical of the “it depends” answer. My experience is that the most scalable architecture is usually a combination of the DHH approach and yours.
Jamie,
This isn’t new to Rails. At least the Rails people *have* an ORM; the PHP folks were preaching “dumb databases” 5 years ago and didn’t have ActiveRecord as a crutch to make it work.
I did a little presentation on the topic for PHPCon in 2003. It’s up on my website:
http://www.powerpostgresql.com/Downloads/database_depends_public.swf
You would be amazed how common the fear of databases is in the Java world as well. Look at things like SQLJ, a product the developers I work with are now considering using in order to avoid having to deal with things like ‘functions’ and ‘triggers’ and ‘referential integrity’ which just get in the way of real work.
The java people think everything (including all sql, datatype constraints, etc) should be in the application, the database people think everything (including business logic) should be in the database. I think it largely stems from not understanding how all the components of an enterprise (and I apologize for that word) system work together.
I would theoretically be a perfect candidate for Rails fandom: I have no formal training in computers and I’ve only done web development, never any standalone apps or mainframe or anything of that sort. But I’ve sought to learn how everything works together and I firmly believe in using the best tool for the job. Let the database do what it’s good at (particularly heavy lifting written in C for speed), let the application do what it’s good at (business logic, user interface, external system interfaces).
Another note about people who fear databases, they also tend to not understand UI or separation of logic from layout, at least in my experience.
I agree that ruby fans likely go too far in their low opinions of databases, but I think you are ignoring some of the advantages of the gained agility. Changing a database model is easier when all your code is written around an ORM instead of in non-standard triggers and procedures. When the time comes to get more out of your database, rewriting parts of a functional ruby webapp in SQL is easier than switching db engines.
At least I hope that that is the case as I am close to having to rewrite my django project to make it scale.
I’d generally agree with you because I’m a database junkie myself, but there are a few points to consider:
1. Databases have lousy ways of communicating to the user. A constraint violation error from a database, if caught by the framework, usually results in an ugly error message returned to the user. You can frame an error message nicer in Rails, although I think you have to give up MVC and build logic into the presentation layer as well if you really want a user-friendly system. Actually I think the next big thing which is waiting to happen is to define these rules once and somehow replicate them to ALL layers of the system. I mean, in my dreams a foreign key constraint would mean that 1) user entry is checked by JavaScript and when the user keys in the data, the field would turn red right after the first wrong character entered, not even waiting until the user leaves the field 2) if the given controller is used on a different view that maybe has no such logic, the user would get a standard Rails-style error message 3) if a controller without such logic would violate a model constraint the same thing would happen 4) if a consumer of the web service would do it, would get an error in XML 5) if any application that tries to write to the DB would do it, would get a SQL constraint error. I think this WILL be possible. It MUST be, because until then NO application can be both DRY and usable. I think all we need is a LOT of generators that can read constraints from the DB and generate lots of different code.
2. SQL is amazingly good for reports – I feel sorry for the poor people who try to write complex reports in an ORM instead of SQL views and stored procedures. However, SQL is lousy for INSERT/UPDATE.
3. SQL is also lousy for simple reports that traverse record by one to many relationships. And they are very common – show me the total value of the orders of this given customer etc. They are the most common, actually. If the database already knows of my foreign key relationships, why do I have to write where child.parent_id = parent.id every time? Even the simplest queries tend to contain 3-4 such clauses. ORMs really shine in this?
4. I think in the long run the good combination is an ORM for basic tasks and for logic IF we can implement the above “define once, enforce at every layers” stuff for constraints, and SQL for complex reporting.
5. Database agnosticism sounds like a really stupid idea – f.e. PostGres has this amazing thing that every table is also automatically a datatype and I hate to be not allowed to use this – but the thing is, if the customer says you have to talk to this app which lives in MS SQL or Oracle it’s a lot better if your app lives there too. But Rails is lousy in talking to these apps.
6. Generally, I feel Rails is an amazing, sexy, lovely, ingenious, polished TOY – Rails both as a technology and as a culture pessimistic and cynical enough to be used in most real business scenarios. It intends to be the replacement for Java which is not pessimistic enough either. In most real world scenarios, you need to glue together a bunch of stinking legacy crap from greenscreen AS/400-s to Excel tables with horrible macros. Rails is the Mac of the web, in short.
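The “generators that can read constraints from the DB” idea from point 1 can be sketched in a few lines. This is only a rough illustration under stated assumptions: it uses Python with SQLite’s `PRAGMA table_info` and `PRAGMA foreign_key_list` to read NOT NULL and foreign key rules, and the `read_rules` and `validate` helpers plus the `customers`/`orders` schema are hypothetical names invented for the example. A real generator would also have to emit JavaScript, XML errors and so on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    note TEXT
);
INSERT INTO customers (id, name) VALUES (1, 'Alice');
""")


def read_rules(connection, table):
    """Read NOT NULL and foreign key rules for a table from the schema itself.

    The table name is interpolated into the PRAGMA, so pass trusted names only.
    """
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    required = [
        row[1]
        for row in connection.execute(f"PRAGMA table_info({table})")
        if row[3] and not row[5]
    ]
    # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    foreign_keys = {
        row[3]: (row[2], row[4])
        for row in connection.execute(f"PRAGMA foreign_key_list({table})")
    }
    return {"required": required, "foreign_keys": foreign_keys}


def validate(connection, table, record):
    """Return friendly error messages derived from the database's own rules."""
    rules = read_rules(connection, table)
    errors = []
    for column in rules["required"]:
        if record.get(column) is None:
            errors.append(f"{column} is required")
    for column, (parent, parent_col) in rules["foreign_keys"].items():
        value = record.get(column)
        if value is None:
            continue
        found = connection.execute(
            f"SELECT 1 FROM {parent} WHERE {parent_col} = ?", (value,)
        ).fetchone()
        if found is None:
            errors.append(f"{column} must reference an existing row in {parent}")
    return errors


print(validate(conn, "orders", {"customer_id": 1}))
print(validate(conn, "orders", {"customer_id": 42}))
print(validate(conn, "orders", {"note": "no customer"}))
```

The rules live in one place, the schema, and the application layer only reads them to produce nicer messages; the database still enforces them for every other writer.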
Typos:
instead of “ORMs really shine in this?” : “ORMs really shine in this!”
instead of “Rails both as a technology and as a culture pessimistic and cynical” : “Rails both as a technology and as a culture IS NOT pessimistic and cynical”
If you are familiar with the capabilities of RDBMS and relational concepts, and have realized what little awareness/appreciation the Rails community has for these things, i have just one question which continues to baffle me everytime i read articles like this:
**why are you bothering to use rails at all??**
There are superior approaches available right now, which include all the nice agile features rails has, with none of the overbearing “opinionated” ignorance of essential techniques and tools needed for proper database work. I am talking of course about ORM tools like SQLAlchemy, in conjunction with highly rails-like frameworks such as Pylons. Despite the existence of these toolsets (and others), seasoned developers continue to bang themselves over the heads with rails and all its “opinionated” limitations as though there’s nothing else out there.
you *do* know there’s alternatives, right ?
oh p.s.: hi jamie ! (i work with your code at SECV)
If you pick one database backend for your Rails app, there is *nothing* keeping you from overriding every built-in ActiveRecord method & database migration script with your own direct SQL.
Don’t think of the “Rails conventions” as limitations, think of them as a starting point in pure Ruby.
I do believe I disagree. Stored procedures in a database are almost always a bad idea. If you’re writing a thin web application, you can put the logic in your code, abstracted cleanly into a library. If you’re writing an enterprise app with many possible clients, you should go the full three-tier route, and make a proper application server.
Either way, the layer (library or application server) handles data consistency for you, in the same way that stored procedures would. You do, however, get to write it in a real language, instead of some poorly designed stored procedure language. You don’t tie yourself to one database (or at least, not as strongly). This is especially an issue if you’re writing an application used in more than one place (i.e. a product you sell to customers, as opposed to a product used by a single organization). You can also hire more general-purpose software engineers, who tend to be cheaper, smarter, and easier to find than Oracle engineers.
@mike bayer,
please! WTF, SQLAlchemy and Pylons are better than Rails, which is only for people who haven’t realized the true power of the database. God, those head bangers. sorry, the world isn’t as smart as you Mike.
@jamie,
Yeah, it’s an opinion, i’m not disagreeing, but I also don’t think it matters. You can use Rails with a database that has stored procedures, foreign keys, triggers, different transaction isolation levels. It takes a little bit of work, but you can do it. As far as the “simple database approach”? It’s the right choice for some applications, maybe not the applications you or I work on, but let bygones be bygones
Also, you seem to be determined to attack programmers who are defending the hill they happen to be standing on. I’ve seen it work both ways, very dogmatic DBAs defending Stored procedures against anything even resembling an ORM, and java programmers running around saying things like, “The database should be dumb”. Both sides of the debate are wrong, the right answer is invariably: “do whatever gets the job done with the people currently at hand”.
But, to actually take a stand, i’ll say this. Stored procedures, OLAP, what have you are great ideas if the application demands them, but more often than not a problem can be solved with the simplest of tools. The challenge is knowing when to abandon the idea of stupid databases, and I think this is the problem you’ve identified. Stupid databases make sense for the simplest of applications, but programmers tend to be dogmatic, so you end up with a whole department of hypnotized agilists holding on to the idea even in the face of conflicting evidence.
the problem isn’t stupid databases per se. the problem is dogmatic adherence to anything not founded on real experience. There are parts of agile mumbo-jumbo that make some sense, but 9 times out of 10, some 20 year old reads some Agile manifesto book and suddenly thinks that they have the 20/20 vision of a 30 year industry veteran – “Oh, we don’t need triggers or views, blah blah blah….” Where the 30 year veteran would actually end up saying…”Hmmmm….as a general rule, Stored Procs are not the first choice, but when we need them, let’s not rule them out.”
but, for the record, Rails is great. I’m sorry, but it just is.
Thank you for this article. Rails is definitely a neat technology and probably can help people build nifty applications in a short time, but ultimately if you want to build a database-backed app you really need to grok the database. These frameworks that intentionally obscure the database so that you barely know it is there will only lead to trouble in the long run as your “dumb” queries start piling up. Sprocs are a good thing!
Rails suffers from the “All Web Development is New Development” anti-pattern; see http://www.devpapers.com/article/259/1
hey tim-
we are talking here about activerecord, and how it’s a crappy product which ignores basic tenets of relational theory; whose community, when presented with these notions, instead of taking on the challenge of improving the product, or ever spending *30 seconds* to look at *anything* which is not rails, defends its mediocre design with lame ideas like using triggers in place of foreign keys and proclaiming composite primary keys as a poor design idea. None of this has anything to do with “me”; we aren’t talking about “me” here. So thanks to your response I guess we’re also now talking about the phenomenon of wagon-circling, ad-hominem diversionary bullcrap that inevitably arises in any attempt to seriously question any aspect of rails.
I think Rails is more about preventing premature optimization. There are a lot of articles about scaling a Rails app and it is obvious that you check your logs, profile your app and of course do your DB engineering! When there are bottlenecks, why not use a stored procedure? The question is, _when_ to do that. Most of the apps never get that big. And when they do, you have the scaling issue with every technology.
The productivity with Rails (because of Ruby) is so much higher that you can do optimizations when needed. When you do premature optimization by using a big upfront db design and bloated technology and methodology, chances are good that the project never gets done. In these cases it’s almost better to start with light-weight frameworks like Rails and rewrite the app or parts of the app when necessary.
Great post Jamie. I agree with it, however using the Rails approach of not worrying about data integrity makes development really, really fast. That’s a good thing, but no excuse for ignoring data integrity. Luckily, I think this fits in with the whole “Premature Optimization” idea. Build the app, then once it’s working and unit tested, optimize the heck out of it including adding all the database level integrity checks (along with ruby unit tests to verify).
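A small sketch of what “unit tests to verify” the database-level checks might look like. The comment speaks of Ruby tests; Python’s `unittest` and in-memory SQLite stand in here only for illustration, and the `orders`/`line_items` schema and the `run_suite` helper are assumptions made up for the example.

```python
import sqlite3
import unittest

# Hypothetical schema; in a real project this would come from the migrations.
SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY);
CREATE TABLE line_items (
    id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES orders(id),
    quantity INTEGER NOT NULL CHECK (quantity > 0)
);
INSERT INTO orders (id) VALUES (1);
"""


class IntegrityTests(unittest.TestCase):
    """Talk to the database directly, bypassing any application validation."""

    def setUp(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
        self.conn.executescript(SCHEMA)

    def tearDown(self):
        self.conn.close()

    def test_rejects_non_positive_quantity(self):
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute(
                "INSERT INTO line_items (order_id, quantity) VALUES (1, -5)"
            )

    def test_rejects_orphan_line_item(self):
        with self.assertRaises(sqlite3.IntegrityError):
            self.conn.execute(
                "INSERT INTO line_items (order_id, quantity) VALUES (99, 1)"
            )

    def test_accepts_valid_row(self):
        self.conn.execute(
            "INSERT INTO line_items (order_id, quantity) VALUES (1, 2)"
        )
        count = self.conn.execute("SELECT COUNT(*) FROM line_items").fetchone()[0]
        self.assertEqual(count, 1)


def run_suite():
    """Run the integrity tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IntegrityTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Tests like these fail if someone later drops a constraint from the schema, which is the point: the integrity checks become part of what the test suite protects.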
@mike bayer
Yes there are alternatives like Java and .net, kind of like democrats and republicans. If Ruby users didn’t think there was anything else they would be – Java ROFs (Religious Old Farts) – instead of the affectionately named fanboys we have been coined – hey we didn’t start the name calling.
BTW composite PKs are a great theory but in implementation they are a continuous cause of heartburn and refactoring that will haunt you until the product is retired – regardless of the language you are using. If you haven’t learned this yet you have been fortunate enough not to have had to maintain hundreds of thousands of lines of code using schemas written by someone else in your career. That being said Dr Nic has provided a plug-in to handle this in AR for legacy applications. There are also plug-ins to handle stored procs and foreign keys and other aspects that you mention – had you bothered to look inside the Ruby world instead of reaching for your favorite hammer to make everything look like a nail.
You’re right, this article wasn’t written for you, take a couple of aspirin and get over it.
Rails is designed for a particular problem space and handles that problem space very well. It’s very easy to extend it beyond its typical range when necessary but getting buy-in for that at a core level can be difficult because generally the Rails core team concerns itself with 37 Signals-sized problems rather than enterprise-sized problems.
The assumption with Rails is that you can divert requests within the community for design changes into plugin development. That’s mostly true, except that it effectively limits the size and extent of such changes. If you want to add brains to a database which is otherwise totally “normal” from Rails’ point of view, the question is whether the project is as huge and twisted as adding brains to the Frankenstein monster, or if it’s just a matter of writing some good plugins.
There’s no doubt that you will find a lot of younger Rails developers espousing good practices, and being very proud of it, without entirely understanding them. That’s actually a huge improvement from a few years ago, when the average developer disparaged or ignored good practices without understanding them. It’s true that the Rails community gets incredibly annoying sometimes, but it’s also true that Rails is a thing of beauty.
The download link is broken.