John Ashmead on 16 Dec 2005 13:05:20 -0000

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: [PhillyOnRails] Rails and Database Design Issues

I understand that most people working with RDBMSs for long periods of time
get very ingratiated with the idea of using stored procedures everywhere,
but I find that most of those same people have ignore the real purpose
of a stored procedure. (make an oft-run SQL query go faster) Other levels
of optimisation are available in almost all cases.

Actually, speaking as one of those "working with RDBMSs for long periods of time", the principle benefits of stored procedures are to improve:

1) security

2) maintainability

3) speed

*in that order*. Speed is important, but not as important as security and maintainability.

1) Security. In the case where medical or financial data is kept, all access to the data -- including reads as well as updates -- has to be regulated or Mr. CEO goes to jail (Sarbanes-Oxley). The only way to do this easily is to consolidate all access in one place, which you can then watch. This means stored procedures, which can not only do the data access, but log every access. Because stored procedures live inside the database they are very easy to monitor. You can run selects against the tables that store the stored procedure code to see all stored procedures that access a specific table or field. This is a big win when verifying access rules are being adhered to. Also, stored procedures let you control permissions in a very detailed way: a bank clerk can be given access to the "transfer_balance" stored procedure but not allowed access to the underlying account tables. And his/her transfer can be (& always is) logged by the procedure itself.

2) Maintainability. If all access to the data is consolidated, then it is easier to change fields & tables, see where performance hits are coming from, track data bugs, and so on. I've had major maintenance problems locating rogue SQL (there is such a thing!) which was hiding in frontend code. You haven't lived until you've had to break a table into two or more tables and find every single reference in the frontend code & fix each. Worst case: the SQL is built in the front end on the fly. References that don't exist until execution time are especially hard to find/fix! :)

3) Speed. The stored procedure is compiled once, then run many times. Individual SQL's have their query plans compiled every time they are run, which can be a major performance hit. Also, with stored procedures the inputs travel over the wire to the database and the outputs travel back, but all the rest of the work -- which can be thousands of lines long -- happens on the server with no wire charges. Wire charge reduction can be a second significant performance win. Note large scale data accesses will almost always be done faster by the database -- which was designed and built to do them quickly -- than by anything else. Small scale data access happen so quickly nowadays they don't count. One good model, when applicable, is to have the stored procedure doing most of the grunt work collecting data, throwing out and/or summarizing millions of rows, then pass back tens or at most hundreds of rows for further work on the frontend, letting the user apply graphic tools to them or the like.

In re speed, I'm not clear what is meant by "other levels of optimization". Most of the performance problems I've seen have been corrected by 1) fixing the join condition 2) adding indexes 3) fixing the table design 4) switching to specialized data loading tools 5) reducing network traffic. Frontend tools have usually not created performance problems, at least not in my experience. This is because the amount of data a single user can type in is minimal: the worst they can do is kick off some query that queries the entire database & that problem can be checked for and trapped.

Further comment: to a frontend, stored procedures look much like SQL, depending. A select type stored procedure gets a few variables, then returns one or more rows. Not very different from a select. A stored procedure to change data gets a few variables, then returns status. Very like an update/delete. An insert type stored procedure gets a lot of variables (for a wide table) then returns status. Very like an insert. The easiest way to think of stored procedures is as select/update/delete/insert SQL with a bit of smarts.

I have nothing against doing all database access from inside a single tool, such as Ruby, but I think -- as I observed in an earlier email -- that as an application gets larger, longer lived, more complex -- it can make sense to break parts of it out into tools specialized for the specific tasks, i.e. database, data analysis, report generation, and so on.


John (person working with RDBMSs for long periods of time) Ashmead

talk mailing list